When migrating assets into AEM, there are several steps to consider. Extracting assets and metadata out of their current home is outside the scope of this document as it varies widely between implementations. Instead, this document describes how to bring these assets into AEM, apply their metadata, generate renditions, and activate or publish the assets.
Before performing any of the steps described below, review and implement the guidance in Assets performance tuning tips. Many steps, such as configuring maximum concurrent jobs, enhance the server’s stability and performance under load. Other steps, such as File Data Store configuration, are difficult to perform after the system has been loaded with assets.
The following asset migration tools are not part of Adobe Experience Manager. Adobe Customer Support does not support these tools.
This software are open source and covered by the Apache v2 License. To ask a question or report an issue, visit the respective GitHub Issues for ACS Experience Manager Tools and ACS Experience Manager Commons.
Migrating assets to Experience Manager requires several steps and should be viewed as a phased process. The phases of the migration are as follows:
Before you start a migration, disable the launchers for the
DAM Update Asset workflow. It is best to ingest all assets into the system and then run the workflows in batches. If you are already live while the migration is taking place, you can schedule these activities to execute during off-hours.
You may already have a tag taxonomy in place that you are applying to your images. Tools such as the CSV Asset Importer and the metadata profiles functionality can help automate application of tags to assets. Before this, add the tags in Experience Manager. The ACS Experience Manager Tools Tag Maker feature lets you populate tags by using a Microsoft Excel spreadsheet that is loaded into the system.
Performance and stability are important concerns when ingesting assets into the system. When loading a lot of data in Experience Manager, ensure that the system performs well. This minimized the time required to add the data and helps to avoid overloading the system. This helps prevent system crash, especially in systems that already are in production.
There are two approaches to loading the assets into the system: a push-based approach using HTTP or a pull-based approach using the JCR APIs.
Adobe’s Managed Services team uses a tool called Glutton to load data into customer environments. Glutton is a small Java application that loads all assets from one directory into another directory on an Experience Manager instance. Instead of Glutton, you could also use tools such as Perl scripts to post the assets into the repository.
There are two main downsides to using the approach of pushing through https:
The other approach to ingesting assets is to pull assets from the local file system. However, if you cannot get an external drive or network share mounted to the server to perform a pull-based approach, posting the assets over HTTP is the best option.
The ACS Experience Manager Tools CSV Asset Importer pulls assets from the file system and asset metadata from a CSV file for the asset import. The Experience Manager Asset Manager API is used to import the assets into the system and apply the configured metadata properties. Ideally, assets are mounted on the server via a network file mount or through an external drive.
When assets are not transmitted over a network the overall performance improves a lot. This method is usually the most efficient method to load assets into the repository. Additionally, you can import all assets and metadata in a single step as the tool supports metadata ingestion. No other step is required to apply the metadata, say using a separate tool.
After you load the assets into the system, you need to process them through the DAM Update Asset workflow to extract metadata and generate renditions. Before performing this step, you need to duplicate and modify the DAM Update Asset workflow to fit your needs. Some steps in the default workflow may not be necessary for you, such as Dynamic Media Classic PTIFF generation or InDesign server integration.
After you have configured the workflow according to your needs, you have two options to execute it:
For deployments that have a publish tier, you need to activate the assets out to the publish farm. While Adobe recommends running more than a single publish instance, it is most efficient to replicate all of the assets to a single publish instance and then clone that instance. When activating large numbers of assets, after triggering a tree activation, you may need to intervene. Here’s why: When firing off activations, items are added to the Sling jobs/event queue. After the size of this queue begins to exceed approximately 40,000 items, processing slows dramatically. After the size of this queue exceeds 100,000 items, system stability starts to suffer.
To work around this issue, you can use the Fast Action Manager to manage asset replication. This works without using the Sling queues, lowering overhead, while throttling the workload to prevent the server from becoming overloaded. An example of using FAM to manage replication is shown on the feature’s documentation page.
Other options for getting assets to the publish farm include using vlt-rcp or oak-run, which are provided as tools as part of Jackrabbit. Another option is to use an open-sourced tool for your Experience Manager infrastructure called Grabbit, which claims to have faster performance than vlt.
For any of these approaches, the caveat is that the assets on the author instance do not show as having been activated. To handle flagging these assets with correct activation status, you need to also run a script to mark the assets as activated.
Adobe does not maintain or support Grabbit.
After the assets have been activated, you can clone your publish instance to create as many copies as are necessary for the deployment. Cloning a server is fairly straightforward, but there are some important steps to remember. To clone publish:
sling.id. Delete this file.
crx-quickstart/launchpad/config/org/apache/jackrabbit/oak/plugins/blob/datastore/FileDataStore.configto point to the location of the datastore on the new environment.
Once we have completed migration, the launchers for the DAM Update Asset workflows should be re-enabled to support rendition generation and metadata extraction for ongoing day-to-day system usage.
While not nearly as common, sometimes you need to migrate large amounts of data from one Experience Manager instance to another; for example, when you perform an Experience Manager upgrade, upgrade your hardware, or migrate to a new datacenter, such as with an AMS migration.
In this case, your assets are already populated with metadata and renditions are already generated. You can simply focus on moving assets from one instance to another. When migrating between Experience Manager instances, you perform the following steps:
Disable workflows: Because you are migrating renditions along with our assets, you want to disable the workflow launchers for DAM Update Asset.
Migrate tags: Because you already have tags loaded in the source Experience Manager instance, you can build them in a content package and install the package on the target instance.
Migrate assets: There are two tools that are recommended for moving assets from one Experience Manager instance to another:
vlt rcp, allows you to use vlt across a network. You can specify a source and destination directory and vlt downloads all repository data from one instance and loads it into the other. Vlt rcp is documented at https://jackrabbit.apache.org/filevault/rcp.html
Activate assets: Follow the instructions for activating assets documented for the initial migration to AEM.
Clone publish: As with a new migration, loading a single publish instance and cloning it is more efficient than activating the content on both nodes. See Cloning Publish.
Enabling workflows: After you have completed migration, re-enable the launchers for the DAM Update Asset workflows to support rendition generation and metadata extraction for ongoing day-to-day system usage.