This collection includes a variety of validated Ansible Roles to automate and manage Day-2 use cases involving NetApp Trident Protect in a Red Hat OpenShift Virtualization environment. It covers common application data management workflows: Backup and Restore, Snapshot scheduling and Restore, and Disaster Recovery (SnapMirror Replication, Failover, Reverse Resync, and Failback).
This collection uses the kubernetes.core certified Ansible collection to manage Kubernetes objects such as AppVaults, Applications, Backups, SnapshotSchedules, Snapshots, and Replication resources via the NetApp Trident Protect custom resources.
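As a rough illustration of how kubernetes.core can address these custom resources, the sketch below lists AppVault objects with the kubernetes.core.k8s_info module. It is not taken from the collection's roles. The API group/version (protect.trident.netapp.io/v1) and the trident-protect namespace are assumptions about a typical Trident Protect installation and should be checked against your cluster.

```yaml
---
# Illustrative read-only check; not part of the collection's roles.
- name: List Trident Protect AppVault resources (sketch)
  hosts: localhost
  connection: local
  gather_facts: false
  tasks:
    - name: Query AppVault custom resources
      kubernetes.core.k8s_info:
        # Assumption: API group/version and namespace of your Trident Protect install.
        api_version: protect.trident.netapp.io/v1
        kind: AppVault
        namespace: trident-protect
      register: appvaults

    - name: Show the names of the AppVaults that were found
      ansible.builtin.debug:
        msg: "{{ appvaults.resources | map(attribute='metadata.name') | list }}"
```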
This collection has been tested with ansible-core >= 2.16.0.
Before using this collection, you need to install it with the Ansible Galaxy command-line tool:
ansible-galaxy collection install netapp.trident_protect
You can also include it in a requirements.yml file and install it with ansible-galaxy collection install -r requirements.yml, using the format:
---
collections:
- name: netapp.trident_protect
Note that if you install the collection from Ansible Galaxy, it will not be upgraded automatically when you upgrade the ansible package. To upgrade the collection to the latest available version, run the following command:
ansible-galaxy collection install netapp.trident_protect --upgrade
You can also install a specific version of the collection, for example, if you need to downgrade because something is broken in the latest version (please report an issue in this repository). Use the following syntax to install a specific version, for example 1.0.0:
ansible-galaxy collection install netapp.trident_protect:==1.0.0
See Using Ansible collections for more details.
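The requirements-file format and the version pin can be combined. The following is a minimal sketch of a requirements.yml that pins the collection to 1.0.0. Listing kubernetes.core explicitly is an assumption made here for clarity; it may already be pulled in as a dependency of the collection.

```yaml
---
collections:
  # Pin to a known-good release (same version as the command-line example).
  - name: netapp.trident_protect
    version: "==1.0.0"
  # Assumption: listed explicitly in case it is not installed as a dependency.
  - name: kubernetes.core
```

Install it the same way as any other requirements file, with ansible-galaxy collection install -r requirements.yml.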
The collection ships several roles that are designed to be composed into end-to-end workflows. Each role's README documents its inputs in detail; the workflows/ diagrams below show the order in which roles are typically run.
The trident_protect_common role
creates the shared objects (Secrets, AppVaults, Application CRs, VM/PVC labels)
that the Backup/Restore and Snapshot/Restore scenarios depend on. Run it once
per cluster before the scenario roles below.
Backup and Restore workflow:
1. trident_protect_common: Create Secret, AppVault, Application, label VMs/PVCs.
2. backup_and_restore_scenario: Perform an on-demand backup and restore VMs to a different namespace.
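The two roles above could be chained in a single playbook along these lines. This is a sketch only: the fully qualified role names are derived from the collection name, running against localhost is an assumption, and the input variables each role requires are documented in its README and deliberately not shown here.

```yaml
---
- name: Backup and restore workflow (sketch)
  hosts: localhost        # assumption: roles talk to the cluster API from the control node
  connection: local
  gather_facts: false
  roles:
    # 1. Shared objects: Secret, AppVault, Application, VM/PVC labels.
    - role: netapp.trident_protect.trident_protect_common
    # 2. On-demand backup, then restore of the VMs to a different namespace.
    - role: netapp.trident_protect.backup_and_restore_scenario
```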
Snapshot and Restore workflow:
1. trident_protect_common: Create Secret, AppVault, Application, label VMs/PVCs.
2. create_snapshot_schedule: Create the periodic snapshotSchedule CR.
   - (Wait for the schedule to produce at least one snapshot.)
3. snapshot_and_restore_scenario: Restore VMs from the latest snapshot to the same namespace.
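One way to express the wait between creating the schedule and restoring is a polling task between the two roles. The sketch below is an assumption-laden illustration, not the collection's own method: app_namespace is a hypothetical variable, the Snapshot kind and protect.trident.netapp.io/v1 API version are assumed, and the check only confirms that a Snapshot object exists, not that it has completed.

```yaml
---
- name: Snapshot and restore workflow (sketch)
  hosts: localhost
  connection: local
  gather_facts: false
  vars:
    app_namespace: my-vm-namespace   # hypothetical: namespace of the protected VMs
  tasks:
    - name: Create shared objects
      ansible.builtin.include_role:
        name: netapp.trident_protect.trident_protect_common

    - name: Create the periodic snapshot schedule
      ansible.builtin.include_role:
        name: netapp.trident_protect.create_snapshot_schedule

    - name: Wait until the schedule has produced at least one snapshot
      kubernetes.core.k8s_info:
        api_version: protect.trident.netapp.io/v1   # assumption
        kind: Snapshot                              # assumption
        namespace: "{{ app_namespace }}"
      register: snapshot_list
      until: snapshot_list.resources | length > 0
      retries: 30     # poll up to 30 times
      delay: 60       # seconds between polls

    - name: Restore VMs from the latest snapshot
      ansible.builtin.include_role:
        name: netapp.trident_protect.snapshot_and_restore_scenario
```

The retries and delay values should be sized to the schedule's interval; a schedule that fires hourly needs a longer polling window than the 30 minutes shown.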
The DR roles are designed to be executed in sequence across the source and destination OpenShift clusters. The full lifecycle (replication → failover → reverse resync → failback) is:
1. dr_amr_prerequisites: Set up secrets, AppVaults, Application, and snapshots on source and destination clusters.
2. dr_amr_config: Create the AppMirrorRelationship (AMR) to start replication.
3. dr_failover: Fail over the replicated application/VMs to the destination cluster after a disaster on the source.
4. dr_reverse_resync_prerequisites: Validate/prepare the new source (original destination) cluster for reverse resync.
5. dr_reverse_resync_config: Reverse resync the AMR so replication flows back toward the original primary.
6. dr_failback_promote: Promote the AMR on the original source to initiate failback of VMs.
7. dr_failback_prepare_forward_amr: Prepare to re-establish forward replication from the original primary.
8. dr_failback_establish_forward_amr: Re-establish the forward AMR from original source to destination, completing failback.
Note on when to run each step:
- Steps 1–2 are always required to establish steady-state SnapMirror replication.
- Step 3 is executed when a disaster or unplanned failover occurs (or for a planned failover). After failover, VMs are running on the destination cluster with no active replication.
- Steps 4–5 (reverse resync) re-establish replication in the reverse direction — from the destination back toward the original source. These are the minimum steps required after a failover to restore data protection. They can be performed immediately after step 3 or deferred until convenient, as long as the customer is comfortable without active replication in the interim.
- Steps 6–8 (failback) promote workloads back to the original source cluster and re-establish forward replication. The timing of these steps is entirely at the customer's discretion:
  - Immediate failback: If downtime on the destination cluster is acceptable right away, proceed directly from step 5 to steps 6–8.
  - Deferred failback: If the VMs must keep running on the destination cluster without interruption, complete steps 4–5 now to restore data protection, then execute steps 6–8 during a planned maintenance window when downtime can be accommodated.
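Because the phases above run at different times, one possible layout is a single playbook in which each phase is selected by tag. The sketch below is an assumption about how the roles might be organized, not a layout shipped with the collection. The tag names are hypothetical, and the per-cluster connection variables each role needs are left to the role READMEs. Ansible's special never tag keeps every role from running unless its phase tag is requested explicitly, so an untagged run cannot trigger a failover by accident.

```yaml
---
- name: DR lifecycle for Trident Protect (sketch)
  hosts: localhost
  connection: local
  gather_facts: false
  roles:
    # Steps 1-2: establish steady-state replication.
    - role: netapp.trident_protect.dr_amr_prerequisites
      tags: [never, replication]
    - role: netapp.trident_protect.dr_amr_config
      tags: [never, replication]

    # Step 3: planned or unplanned failover to the destination cluster.
    - role: netapp.trident_protect.dr_failover
      tags: [never, failover]

    # Steps 4-5: restore data protection in the reverse direction.
    - role: netapp.trident_protect.dr_reverse_resync_prerequisites
      tags: [never, reverse_resync]
    - role: netapp.trident_protect.dr_reverse_resync_config
      tags: [never, reverse_resync]

    # Steps 6-8: fail back and re-establish forward replication.
    - role: netapp.trident_protect.dr_failback_promote
      tags: [never, failback]
    - role: netapp.trident_protect.dr_failback_prepare_forward_amr
      tags: [never, failback]
    - role: netapp.trident_protect.dr_failback_establish_forward_amr
      tags: [never, failback]
```

With this layout, a deferred failback would run ansible-playbook with --tags reverse_resync right after the failover and --tags failback later, during the maintenance window.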
flowchart TD
A[dr_amr_prerequisites] --> B[dr_amr_config]
B --> C[dr_failover]
C --> D[dr_reverse_resync_prerequisites]
D --> E[dr_reverse_resync_config]
E --> F[dr_failback_promote]
F --> G[dr_failback_prepare_forward_amr]
G --> H[dr_failback_establish_forward_amr]
See the changelog.