Backup and Restore
There are two ways to back up and restore repository content in AEM:
- You can create an external backup of the repository and store it in a safe location. If the repository breaks down, you can restore it to the previous state.
- You can create internal versions of the repository content. These versions are stored in the repository along with the content, so you can quickly restore nodes and trees you have changed or deleted.
General
The approach described here applies for system backup and recovery.
If you need to back up or recover only a small amount of lost content, a full system recovery is not necessarily required:
- either fetch the data from another system via a package,
- or restore the backup on a temporary system, create a content package, and deploy it on the system where the content is missing.
For details, see Package Backup below.
Timing
Do not run backup in parallel with the datastore garbage collection, as it might harm the results of both processes.
Offline Backup
You can always do an offline backup. This requires a downtime of AEM, but can be quite efficient in terms of required time compared to an online backup.
In most cases you will use a filesystem snapshot to create a read-only copy of the storage at that time. To create an offline backup, perform these steps:
- stop the application
- make a snapshot backup
- start the application
As the snapshot backup usually takes only a few seconds, the entire downtime is less than a few minutes.
Online Backup
This backup method creates a backup of the entire repository, including any applications deployed under it, such as AEM. The backup includes content, version history, configuration, software, hotfixes, custom applications, log files, search indexes, and so on. If you are using clustering and if the shared folder is a subdirectory of crx-quickstart (either physically, or using a softlink), the shared directory is also backed up.
You can restore the entire repository (and any applications) at a later point.
This method operates as a “hot” or “online” backup so it can be performed while the repository is running. Therefore the repository is usable while the backup is running. This method works for the default, Tar storage based, repository instances.
When creating a backup, you have the following options:
- Backing up to a directory using AEM’s integrated backup tool.
- Backing up to a directory using a filesystem snapshot
In either case, the backup creates an image (or snapshot) of the repository. The system's backup agent should then transfer this image to a dedicated backup system (for example, a tape drive).
"crx-quickstart" directory and back up the datastore separately.
AEM Online Backup
An online backup of your repository lets you create, download, and delete backup files. It is a “hot” or “online” backup feature, so it can be executed while the repository is being used normally in read-write mode.
When starting a backup you can specify a Target Path and/or a Delay.
Target Path: The backup files are usually saved in the parent folder of the folder holding the quickstart jar file (.jar). For example, if you have the AEM jar file located under /InstallationKits/AEM, then the backup will be generated under /InstallationKits. You can also specify a target location of your choice.
If the TargetPath is a directory, the image of the repository is created in this directory. If the same directory is used repeatedly (or always) for storing backups:
- modified files in the repository are modified accordingly in the TargetPath
- deleted files in the repository are deleted in the TargetPath
- created files in the repository are created in the TargetPath
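If the TargetPath is a file name with the .zip extension, the repository is first backed up to a temporary directory, which is then compressed into the zip file. This approach has the following drawbacks:
- It requires additional disk storage during the backup process (temporary directory plus the zip file).
- The compression is done by the repository and might affect its performance.
- It delays the backup process.
- Up to and including Java 1.6, Java can only create ZIP files of up to 4 gigabytes.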
Delay: Indicates a time delay (in milliseconds), so that repository performance is not affected. By default, the repository backup runs at full speed. You can slow down the creation of an online backup, so that it does not slow down other tasks.
When using a very large delay, ensure that the online backup does not take more than 24 hours. If it does, discard this backup, as it may not contain all binaries.
A delay of 1 millisecond typically results in 10% CPU usage, and a delay of 10 milliseconds usually results in less than 3% CPU usage. The total delay in seconds can be estimated as follows: Repository size in MB, multiplied by delay in milliseconds, divided by 2 (if the zip option is used), or divided by 4 (when backing up to a directory). That means a backup to a directory of a 200 MB repository with 1 ms delay increases the backup time by about 50 seconds.
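The estimate above can be written as a small helper. This is only a sketch of the rule of thumb stated in the text; the function name and parameters are illustrative and are not part of AEM.

```python
def estimated_extra_seconds(repo_size_mb: float, delay_ms: float, zip_target: bool) -> float:
    """Estimate the time (in seconds) that the Delay setting adds to a backup.

    Rule of thumb from the text: size in MB * delay in ms,
    divided by 2 for a zip target or by 4 for a directory target.
    """
    divisor = 2 if zip_target else 4
    return repo_size_mb * delay_ms / divisor


# The example from the text: 200 MB repository, 1 ms delay, directory target.
print(estimated_extra_seconds(200, 1, zip_target=False))  # 50.0
```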
To create a backup:
- Log in to AEM as the administrator.
- Go to Tools - Operations - Backup.
- Click Create. The backup console will open.
- On the backup console, specify the Target Path and Delay.
NOTE
The backup console is also available using: https://<hostname>:<port-number>/libs/granite/backup/content/admin.html
- Click Save. A progress bar will indicate the progress of the backup.
NOTE
You can Cancel a running backup at any time.
- When the backup is complete, the zip files are listed in the backup window.
NOTE
Backup files that are no longer needed can be removed using the console. Select the backup file in the left pane then click Delete.
NOTE
If you have backed up to a directory: after the backup process is finished AEM will not write to the target directory.
Automating AEM Online Backup
If possible, the online backup should be run when there is little load on the system, for example in the morning.
Backups can be automated using the wget or curl HTTP clients. The following examples show how to automate backups using curl.
Backing up to the default Target Directory
The parameters of the following curl command might need to be configured for your instance; for example, the hostname (localhost), port (4502), admin credentials (admin:admin) and file name (backup.zip).
curl -u admin:admin -X POST "http://localhost:4502/system/console/jmx/com.adobe.granite:type=Repository/op/startBackup/java.lang.String?target=backup.zip"
The backup file/directory is created on the server in the parent folder of the folder containing the crx-quickstart folder (the same as if you were creating the backup using the browser). For example, if you have installed AEM in the directory /InstallationKits/crx-quickstart/, then the backup is created in the /InstallationKits directory.
The curl command returns immediately, so you must monitor this directory to see when the zip file is ready. While the backup is being created, a temporary directory (with a name based on that of the final zip file) is visible; at the end, this directory is zipped. For example:
- name of resulting zip file: backup.zip
- name of temporary directory: backup.f4d5.temp
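Because the call returns immediately, a script has to watch the backup directory itself. The sketch below polls until the zip file exists and no temporary directory is left. The glob pattern backup.*.temp is an assumption inferred from the example names above, and the function names, polling interval, and timeout are illustrative.

```python
import time
from pathlib import Path


def zip_backup_finished(parent_dir: str, zip_name: str = "backup.zip",
                        temp_pattern: str = "backup.*.temp") -> bool:
    """Return True when the zip file exists and no temporary backup directory remains.

    temp_pattern is an assumption based on the example name backup.f4d5.temp.
    """
    parent = Path(parent_dir)
    temp_dirs = [p for p in parent.glob(temp_pattern) if p.is_dir()]
    return (parent / zip_name).is_file() and not temp_dirs


def wait_for_zip_backup(parent_dir: str, zip_name: str = "backup.zip",
                        poll_seconds: float = 30.0,
                        timeout_seconds: float = 24 * 3600) -> bool:
    """Poll the directory until the backup looks finished or the timeout expires."""
    deadline = time.monotonic() + timeout_seconds
    while time.monotonic() < deadline:
        if zip_backup_finished(parent_dir, zip_name):
            return True
        time.sleep(poll_seconds)
    return False
```

For the installation used in the example, this could be called as wait_for_zip_backup("/InstallationKits").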
Backing up to a non-default Target Directory
Usually the backup file/directory is created on the server in the parent folder of the folder containing the crx-quickstart folder.
If you want to save your backup (of either sort) to a different location, you can set the target parameter in the curl command to an absolute path.
For example, to generate backupJune.zip in the directory /Backups/2012:
curl -u admin:admin -X POST "http://localhost:4502/system/console/jmx/com.adobe.granite:type=Repository/op/startBackup/java.lang.String?target=/Backups/2012/backupJune.zip"
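When the backup is triggered from a script rather than a shell, the same request can be built with Python's standard library. This is a minimal sketch that mirrors the curl commands above; the function name and default values are illustrative, and the host, port, and credentials are the placeholder values from the examples.

```python
import base64
import urllib.parse
import urllib.request

# JMX operation path taken from the curl examples above.
JMX_START_BACKUP = ("/system/console/jmx/com.adobe.granite:type=Repository"
                    "/op/startBackup/java.lang.String")


def build_backup_request(target: str, host: str = "localhost", port: int = 4502,
                         user: str = "admin", password: str = "admin") -> urllib.request.Request:
    """Build (but do not send) the POST request that starts an online backup."""
    # Keep "/" unescaped so absolute target paths look the same as in the curl example.
    query = urllib.parse.urlencode({"target": target}, safe="/")
    url = f"http://{host}:{port}{JMX_START_BACKUP}?{query}"
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(url, method="POST")
    request.add_header("Authorization", f"Basic {token}")
    return request


request = build_backup_request("/Backups/2012/backupJune.zip")
print(request.get_method(), request.full_url)
# To actually start the backup: urllib.request.urlopen(request)
```

Building the request separately from sending it makes it possible to log or inspect the URL before anything is triggered on the server.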
Filesystem Snapshot Backup
The process described here is especially suited for large repositories.
- Do a snapshot of the filesystem AEM is deployed on.
- Mount the filesystem snapshot.
- Perform a backup and unmount the snapshot.
How AEM Online Backup Works
AEM Online Backup consists of a series of internal actions that ensure the integrity of the data being backed up and of the backup file(s) being created. These are listed below for those interested.
The online backup uses the following algorithm:
- When creating a backup, the first step is to create or locate the target directory.
  - If backing up to a zip file, a temporary directory is created. The directory name starts with backup. and ends with .temp; for example backup.f4d3.temp.
  - If backing up to a directory, the name specified in the target path is used. An existing directory can be used, otherwise a new directory will be created. An empty file named backupInProgress.txt is created in the target directory when the backup starts. This file is deleted when the backup is finished.
- The files are copied from the source directory to the target directory (or temporary directory when creating a zip file). The segmentstore is copied before the datastore to avoid repository corruption. The index and cache data are omitted when creating the backup. As a result, data from crx-quickstart/repository/cache and crx-quickstart/repository/index is not included in the backup. The progress indicator runs from 0% to 70% when creating a zip file, or from 0% to 100% if no zip file is created.
- If the backup is being made to a pre-existing directory, then “old” files in the target directory are deleted. Old files are files that do not exist in the source directory.
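The clean-up of “old” files can be illustrated with a short sketch. This is not AEM's implementation, only an illustration of the rule stated above (a file in the target with no counterpart in the source is removed); the function name and the keep parameter are hypothetical.

```python
from pathlib import Path


def delete_old_files(source_dir: str, target_dir: str,
                     keep: tuple = ("backupInProgress.txt",)) -> list:
    """Delete files in target_dir that no longer exist in source_dir.

    Returns the relative paths of the removed files. Names listed in
    `keep` (an assumption: the in-progress marker) are never removed.
    """
    source, target = Path(source_dir), Path(target_dir)
    removed = []
    # sorted() materializes the listing before any file is deleted.
    for path in sorted(target.rglob("*")):
        if not path.is_file() or path.name in keep:
            continue
        relative = path.relative_to(target)
        if not (source / relative).exists():
            path.unlink()
            removed.append(relative.as_posix())
    return removed
```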
The files are copied to the target directory in four stages:
- In the first copy stage (progress indicator 0% - 63% when creating a zip file or 0% - 90% if no zip file is created), all files are copied while the repository is running normally. The process has two phases:
  - Phase A - everything is copied except for the datastore (with delay).
  - Phase B - only the datastore is copied (with delay).
- In the second copy stage (progress indicator 63% - 65.8% when creating a zip file or 90% - 94% if no zip file is created) only files that were created or modified in the source directory since the first copy stage was started are copied. Depending on the activity of the repository, this might range from no files at all, up to a significant number of files (because the first file copy stage usually takes a lot of time). The copy process is similar to the first stage (Phase A and Phase B with delay).
- In the third copy stage (progress indicator 65.8% - 68.6% when creating a zip file or 94% - 98% if no zip file is created) only files that were created or modified in the source directory since the second copy stage was started are copied. Depending on the activity of the repository, there might be no files to copy, or a very small number of files (because the second file copy stage is usually fast). The copy process is similar to the second stage (Phase A and Phase B), but without delay.
- File copy stages one to three are all done concurrently while the repository is running. In the fourth copy stage (progress indicator 68.6% - 70% when creating a zip file or 98% - 100% if no zip file is created) only files that were created or modified in the source directory since the third copy stage was started are copied. Depending on the activity of the repository, there might be no files to copy, or a very, very small number of files (because the third file copy stage is usually very fast). The copy process is similar to the third stage.
Finally, depending on the target:
- If a zip file was specified, this is now created from the temporary directory. Progress indicator 70% - 100%. The temporary directory is then deleted.
- If the target was a directory, the empty file named backupInProgress.txt is deleted to indicate that the backup is finished.
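The idea behind the repeated copy stages (each pass only picks up what changed since the previous pass started) can be sketched as follows. This is a simplified illustration, not AEM's code: it uses file modification times to detect changes, which is an assumption, and it leaves out the Phase A/Phase B split, the delay, and the progress reporting. All names are hypothetical.

```python
import shutil
import time
from pathlib import Path


def copy_changed_since(source_dir: str, target_dir: str, since: float) -> int:
    """Copy every file whose modification time is >= `since`; return the number copied."""
    source, target = Path(source_dir), Path(target_dir)
    copied = 0
    for path in source.rglob("*"):
        if path.is_file() and path.stat().st_mtime >= since:
            destination = target / path.relative_to(source)
            destination.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(path, destination)
            copied += 1
    return copied


def staged_copy(source_dir: str, target_dir: str, stages: int = 4) -> list:
    """Run several copy passes; each pass copies only files changed since the previous pass began."""
    counts = []
    since = 0.0  # the first pass copies everything
    for _ in range(stages):
        stage_start = time.time()
        counts.append(copy_changed_since(source_dir, target_dir, since))
        since = stage_start
    return counts
```

Each pass is shorter than the one before it, so the window in which new changes can accumulate shrinks with every stage.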
Restoring the Backup
You can restore a backup as follows:
- If you performed a Filesystem Snapshot Backup, you can restore an image of the system.
- If you created the backup as a zip file, unzip the contents into a new folder and start AEM from that location.
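For a zip backup, the unzip step might be scripted as below. This is a minimal sketch using Python's standard zipfile module; the function name and paths are illustrative. Note that zipfile does not restore file permissions, so start scripts may need to be made executable again afterwards.

```python
import zipfile
from pathlib import Path


def unzip_backup(zip_path: str, restore_dir: str) -> Path:
    """Extract a backup zip into a new (or empty) folder and return that folder."""
    destination = Path(restore_dir)
    # Refuse to restore on top of existing files; the text asks for a new folder.
    if destination.exists() and any(destination.iterdir()):
        raise FileExistsError(f"{destination} is not empty")
    destination.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(destination)
    return destination
```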
Package Backup
To back up and restore content, you can use the Package Manager, which uses the Content Package format. The Package Manager provides more flexibility in defining and managing packages.
For details on the features and tradeoffs of content packages, see How to Work With Packages.
Scope of Backup
When you back up nodes using either the Package Manager or the Content Zipper, CRX saves the following information:
- The repository content below the tree you have selected.
- The Node type definitions that are used for the content you back up.
- The Namespace definitions that are used for the content you back up.
When backing up, AEM loses the following information:
- The version history.