Strategy for backup and restore in a clustered environment

NOTE
If your AEM forms implementation stores additional custom data in a different database, you must implement a strategy to back up this data so that it remains in sync with the AEM forms data. Also, design the application to be robust enough to handle a scenario where the additional databases get out of sync. It is highly recommended that you perform every database operation in the context of a transaction to help maintain a consistent state.
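As an illustration of the transaction recommendation, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for the custom database. The table names (custom_data, custom_audit) and both function names are hypothetical and are not part of AEM forms; the point is only that related writes are committed together or not at all.

```python
import sqlite3


def open_demo_db() -> sqlite3.Connection:
    """Create an in-memory stand-in for a custom database (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE custom_data (form_id TEXT PRIMARY KEY, payload TEXT)")
    conn.execute("CREATE TABLE custom_audit (form_id TEXT, action TEXT)")
    return conn


def record_submission(conn: sqlite3.Connection, form_id: str, payload: str) -> None:
    """Write the audit row and the data row together, or neither."""
    # "with conn" commits if the block succeeds and rolls back if it raises.
    with conn:
        conn.execute(
            "INSERT INTO custom_audit (form_id, action) VALUES (?, ?)",
            (form_id, "created"),
        )
        conn.execute(
            "INSERT INTO custom_data (form_id, payload) VALUES (?, ?)",
            (form_id, payload),
        )
```

If the second insert fails, for example on a duplicate form_id, the audit row written just before it is rolled back as well, so the two tables cannot drift apart.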

You need to back up the following parts of the AEM forms system to recover from any error:

  • Database used by AEM forms
  • GDS that has long-lived data and other persistent documents
  • AEM database (crx-repository)
NOTE
You need to back up any other data that is being used by your AEM forms setup, such as customer fonts, connectors data, and so on.

Back up a clustered environment

This topic discusses the following strategies to back up any AEM forms clustered environment:

  • Offline backup with downtime
  • Offline backup with no downtime (backup of a secondary node that is shut down)
  • Online backup with no downtime but delay in response
  • Back up the Bootstrap properties file

Offline backup with downtime

  1. Shut down the entire cluster and related services. (see Starting and stopping services)

  2. On any node, back up the database, GDS, and Connectors. (see Files to back up and recover)

  3. To back up AEM repository offline, perform the following steps:

    1. For each cluster node, back up the file that contains the cluster node id.
    2. Back up all files of any secondary cluster node, including subdirectories.
    3. Back up repository/system.id of each cluster node separately.

    For detailed steps, see Backup and Restore.

  4. Back up any other data, such as customer fonts.

  5. Start the cluster again.
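The repository part of this procedure (step 3) could be scripted roughly as follows. This is a minimal sketch, not an Adobe-provided tool: it assumes the cluster is already shut down, that each node's repository directory is reachable as a local path, and that the ID files sit at repository/cluster_node.id and repository/system.id below that path. The function name and the layout of the backup folder are hypothetical.

```python
import shutil
from pathlib import Path

# Per-node ID files, relative to each node's directory (assumed layout).
ID_FILES = ("repository/cluster_node.id", "repository/system.id")


def backup_repository_offline(nodes: dict[str, Path], secondary: str, dest: Path) -> list[Path]:
    """Copy every node's ID files, then all files of one secondary node."""
    copied = []
    for name, root in nodes.items():
        for rel in ID_FILES:
            src = root / rel
            if src.is_file():
                target = dest / "ids" / name / rel
                target.parent.mkdir(parents=True, exist_ok=True)
                shutil.copy2(src, target)
                copied.append(target)

    # Full copy of one secondary node, including subdirectories.
    full = dest / "full" / secondary
    shutil.copytree(nodes[secondary], full)
    copied.append(full)
    return copied
```

The ID files are kept in a separate folder per node because they differ between nodes, whereas the full copy is taken from a single secondary node only.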

Offline backup with no downtime

  1. Enter the rolling backup mode. (see Entering the backup modes)

    Leave the rolling backup mode after a recovery.

  2. Shut down any one of the AEM secondary nodes of the cluster. (see Starting and stopping services)

  3. On any node, back up the database, GDS, and Connectors. (see Files to back up and recover)

  4. To back up AEM repository offline, perform the following steps:

    1. For each cluster node, back up the file that contains the cluster node id.
    2. Back up all files of any secondary cluster node, including subdirectories.
    3. Back up repository/system.id of each cluster node separately.

    For detailed steps, see Backup and Restore.

  5. Back up any other data, such as customer fonts.

  6. Start the cluster again.
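Because only one secondary node is taken offline here, a script that drives these steps should bring that node back even when a backup step fails. The sketch below shows one way to guarantee that with a context manager. The stop and start callables are hypothetical placeholders for whatever mechanism your application server uses; nothing here calls an AEM forms API.

```python
from contextlib import contextmanager
from typing import Callable, Iterable, Iterator


@contextmanager
def node_stopped(stop: Callable[[], None], start: Callable[[], None]) -> Iterator[None]:
    """Stop a secondary node, run the body, then always start the node again."""
    stop()
    try:
        yield
    finally:
        start()


def run_offline_backup(
    stop: Callable[[], None],
    start: Callable[[], None],
    backup_steps: Iterable[Callable[[], None]],
) -> None:
    """Run the backup steps in order while the secondary node is down."""
    with node_stopped(stop, start):
        for step in backup_steps:
            step()
```

An exception raised by any step still propagates to the caller, but only after the node has been started again.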

Online backup with no downtime but delay in response

  1. Enter the rolling backup mode. (see Entering the backup modes)

    Leave the rolling backup mode after a recovery.

  2. Shut down any one of the AEM secondary nodes of the cluster. (see Starting and stopping services)

  3. On any node, back up the database, GDS, and Connectors. (see Files to back up and recover)

  4. To back up AEM repository online, perform the following steps:

    1. For each cluster node, back up the file that contains the cluster_node.id.
    2. Back up repository/system.id of each cluster node separately.
    3. On any secondary node, take an online backup of the repository. For detailed steps, see Online backup.

  5. Back up any other data, such as customer fonts.

  6. Start the cluster again.

Back up the Bootstrap properties file

When you create an AEM cluster, a properties file is created on the application server for all secondary nodes. It is recommended that you back up this Bootstrap properties file. You can find the file at the following location on your application server:

  • JBoss®: in the BIN directory
  • WebLogic: in the domain directory
  • WebSphere®: in the profile directory

Back up this file so that you can recover an AEM secondary node in a disaster recovery scenario. If the node is restored, put the file back at the same location on the application server.
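A small helper can make this backup repeatable. The sketch below is only an illustration: the function name is hypothetical, the actual directory (BIN, domain, or profile) depends on your installation and must be passed in, and the file name is also a parameter because it is not stated here.

```python
import shutil
from pathlib import Path

# Where the file is expected for each application server, per the list above.
LOCATION_HINT = {
    "jboss": "BIN directory",
    "weblogic": "domain directory",
    "websphere": "profile directory",
}


def backup_bootstrap_file(server: str, location: Path, file_name: str, dest: Path) -> Path:
    """Copy the Bootstrap properties file from its server directory into dest."""
    key = server.lower()
    if key not in LOCATION_HINT:
        raise ValueError(f"unknown application server: {server}")
    src = location / file_name
    if not src.is_file():
        raise FileNotFoundError(f"{file_name} not found in the {LOCATION_HINT[key]}: {src}")
    dest.mkdir(parents=True, exist_ok=True)
    target = dest / f"{key}-{file_name}"
    shutil.copy2(src, target)
    return target
```

Restoring is the reverse copy: place the saved file back into the same directory on the rebuilt secondary node under its original name.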

Recovery in a clustered environment

If there is any failure of the entire cluster or a single node, restore it using the backup.

For a single node recovery, shut down the single node and run the single node recovery procedure.

If the entire cluster fails, for example because of a database crash, follow the procedure for restoring the entire cluster. The restoration steps depend on the backup method that was used.

Restoring a single node

  1. Stop the corrupted node.

    NOTE
    If the corrupted node is an AEM primary node, shut down the entire cluster.
  2. Re-create the physical system from a system image.

  3. Apply patches or updates to AEM forms that were applied since the image was made. This information was recorded during the backup procedure. AEM forms must be recovered to the same patch level as it was when the system was backed up.

  4. (Optional) If all other nodes are working fine, it is possible that the AEM repository is also corrupted. In this case, you will see a repository unsync message in the error.log file of the AEM repository.

    To restore the repository, perform the following steps.

    NOTE
    If a zipped crx-repository backup was taken online, unzip it at any location and follow the offline restoration process.
    1. Delete the repository, shared, version, and workspaces directories in the clusterNode directory of the node.
    2. Restore the backup of the cluster node (including subdirectories) to the node.
    3. Delete the file clusterNode/revision.log on the node.
    4. Delete the .lock file on the node, if it exists.
    5. Delete the repository/system.id file on the node, if it exists.
    6. Delete the **/listener.properties files on the node, if they exist.
    7. Restore repository/cluster_node.id for individual cluster nodes.
NOTE
Consider the following points:
  • If the failed node was an AEM primary node, copy all the content from the secondary repository folder (crx-repository\crx.0000, where 0000 can be any digits) to the crx-repository\repository folder and delete the secondary repository folder.
  • Before restarting any cluster node, ensure that you delete repository/clustered.txt from the primary node.
  • Ensure that the primary node is started first and after it is up, start other nodes.
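The repository restore steps for a single node (steps 1 to 7 under step 4) can be sketched as one function. This is a minimal sketch under stated assumptions: the node is already stopped, the backup has been unzipped to a plain directory if it was taken online, and the .lock file sits directly in the clusterNode directory. The function name and its parameters are hypothetical.

```python
import shutil
from pathlib import Path

# Directories removed before the backup is copied in (step 1).
STALE_DIRS = ("repository", "shared", "version", "workspaces")


def restore_single_node(cluster_node: Path, backup: Path, node_id_file: Path) -> list[Path]:
    """Apply steps 1-7 to one stopped node; returns the stale files removed."""
    # Step 1: drop the old repository state.
    for name in STALE_DIRS:
        shutil.rmtree(cluster_node / name, ignore_errors=True)

    # Step 2: copy the backed-up node, including subdirectories.
    shutil.copytree(backup, cluster_node, dirs_exist_ok=True)

    # Steps 3-6: remove files that must not survive the restore.
    # Assumption: .lock sits directly in the clusterNode directory.
    candidates = [
        cluster_node / "revision.log",
        cluster_node / ".lock",
        cluster_node / "repository" / "system.id",
    ]
    candidates.extend(cluster_node.rglob("listener.properties"))
    removed = []
    for path in candidates:
        if path.is_file():
            path.unlink()
            removed.append(path)

    # Step 7: put back this node's own cluster_node.id.
    target = cluster_node / "repository" / "cluster_node.id"
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(node_id_file, target)
    return removed
```

The node's own cluster_node.id is written last so that an ID copied in from another node's backup never stays in place.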

Restoring the entire cluster

  1. Stop all the cluster nodes.

  2. Recreate the physical system from a system image.

  3. Apply patches or updates to AEM forms that were applied since the image was made. This information was recorded in step 1 of the backup procedure. AEM forms must be recovered to the same patch level as it was when the system was backed up.

  4. Restore the database, GDS, and Connectors.

  5. Do the following to recover the AEM repository offline:

    NOTE
    If a zipped crx-repository backup was taken online, unzip it at any location and follow the offline restoration process.
    1. On all cluster nodes, delete the repository, shared, version, and workspaces directories in the clusterNode directory.
    2. Delete all files and directories in the shared directory.
    3. Restore the backup of the cluster node (including subdirectories) to one cluster node.
    4. Copy all files of the restored cluster node to all other cluster nodes. Once done, each cluster node contains the same data.
    5. Delete the file clusterNode/revision.log on all cluster nodes.
    6. Delete the .lock file on all cluster nodes, if it exists.
    7. Delete the repository/system.id file on all cluster nodes, if it exists.
    8. Delete the **/listener.properties files on all cluster nodes, if they exist.
    9. Restore repository/cluster_node.id for individual cluster nodes.
NOTE
Consider the following points:
  • If the failed node was an AEM primary node, copy all the content from the secondary repository folder (it looks like crx-repository\crx.0000, where 0000 can be any digits) to the crx-repository\repository folder.
  • Before restarting any cluster node, ensure that you delete repository/clustered.txt from the primary node.
  • Ensure that the primary node is started first and after it is up, start other nodes.
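The first point in the note above, copying the secondary repository folder into the repository folder, can be sketched as follows. This is a hedged illustration: the function name is hypothetical, the folder is matched by the pattern "crx." followed by digits, and the sketch refuses to guess when it finds no such folder or more than one. Deleting the source folder is optional and off by default.

```python
import re
import shutil
from pathlib import Path

# Secondary repository folders look like crx.0000, where 0000 can be any digits.
SECONDARY_DIR = re.compile(r"crx\.\d+")


def promote_secondary_repository(crx_repository: Path, delete_source: bool = False) -> Path:
    """Copy the content of the crx.NNNN folder into the repository folder."""
    matches = sorted(
        p for p in crx_repository.iterdir()
        if p.is_dir() and SECONDARY_DIR.fullmatch(p.name)
    )
    if len(matches) != 1:
        raise ValueError(f"expected exactly one crx.NNNN folder, found {len(matches)}")

    source = matches[0]
    target = crx_repository / "repository"
    # Merge into the target; files with the same name are overwritten.
    shutil.copytree(source, target, dirs_exist_ok=True)
    if delete_source:
        shutil.rmtree(source)
    return source
```

Failing on an ambiguous match is a deliberate choice here, because copying from the wrong folder would silently restore the wrong repository state.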

Back up and restore Correspondence Management Solution publish node

The publisher node does not have any primary-secondary relationship in a clustered environment. You can back up any publisher node by following Backup and Restore.

Recover a single publisher node

  1. Shut down the node that must be recovered, and do not perform any publish activity until the node is up again.
  2. Restore the Publish node using Restoring the Backup.

Recover a cluster

  1. Shut down the cluster.
  2. Restore the Publish node using Restoring the Backup.
  3. Start the primary node followed by the secondary node of the author cluster.