When migrating assets into Adobe Experience Manager, there are several steps to consider. Extracting assets and metadata out of their current home is outside the scope of this document as it varies widely between implementations, but this document describes how to bring these assets into Experience Manager, apply their metadata, generate renditions, and activate them to publish instances.
Before actually performing any of the steps in this methodology, review and implement the guidance in Assets performance tuning tips. Many of the steps, such as configuring maximum concurrent jobs, greatly enhance the server’s stability and performance under load. Other steps, such as configuring a File Data Store, are much more difficult to perform after the system has been loaded with assets.
Several of the asset migration tools referenced in this document, such as the ACS AEM Tools and ACS AEM Commons features, Glutton, and Grabbit, are not part of Experience Manager and are not supported by Adobe.
Migrating assets to Experience Manager requires several steps and should be viewed as a phased process: disabling the DAM Update Asset workflow launchers, loading tags, ingesting assets, processing the ingested assets through workflow, activating the assets to the publish tier, and re-enabling the launchers.
Before starting your migration, disable your launchers for the DAM Update Asset workflow. It is best to ingest all of the assets into the system and then run the workflows in batches. If you are already live while the migration is taking place, you can schedule these activities to run on off-hours.
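Launchers can be disabled from the workflow console, or scripted. The following is a minimal sketch of toggling them through the JCR API; the launcher configuration path and workflow model path shown are the defaults on recent Experience Manager versions (older versions use /etc/workflow/launcher/config and /etc/workflow/models/dam/update_asset/jcr:content/model), so verify them on your instance, and obtaining an administrative session is left to the caller.

```java
import javax.jcr.Node;
import javax.jcr.NodeIterator;
import javax.jcr.Session;

// Sketch: disable (or re-enable) the DAM Update Asset launchers by toggling
// the "enabled" property on their configuration nodes. Paths are the defaults
// on recent AEM versions; verify them on your instance.
public class LauncherToggle {

    private static final String LAUNCHER_CONFIG = "/conf/global/settings/workflow/launcher/config";
    private static final String DAM_UPDATE_ASSET = "/var/workflow/models/dam/update_asset";

    public static void setDamUpdateAssetLaunchers(Session session, boolean enabled) throws Exception {
        Node configRoot = session.getNode(LAUNCHER_CONFIG);
        NodeIterator launchers = configRoot.getNodes();
        while (launchers.hasNext()) {
            Node launcher = launchers.nextNode();
            // Only touch launchers that trigger the DAM Update Asset model.
            if (launcher.hasProperty("workflow")
                    && DAM_UPDATE_ASSET.equals(launcher.getProperty("workflow").getString())) {
                launcher.setProperty("enabled", enabled);
            }
        }
        session.save();
    }
}
```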
You may already have a tag taxonomy in place that you apply to your images. While tools like the CSV Asset Importer and Experience Manager's support for metadata profiles can automate the process of applying tags to assets, the tags must first be loaded into the system. The ACS AEM Tools Tag Maker feature lets you populate tags from a Microsoft Excel spreadsheet that is loaded into the system.
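Under the hood, tag creation goes through the TagManager API. As a minimal sketch of what Tag Maker automates from a spreadsheet, the following creates a couple of illustrative tags; the namespace and tag IDs are placeholders, and obtaining the ResourceResolver is left to the caller.

```java
import com.day.cq.tagging.Tag;
import com.day.cq.tagging.TagManager;
import org.apache.sling.api.resource.ResourceResolver;

// Sketch: create tags programmatically with the TagManager API. The
// "assets" namespace and the tag IDs below are illustrative examples only.
public class TagLoader {

    public static void createSampleTaxonomy(ResourceResolver resolver) throws Exception {
        TagManager tagManager = resolver.adaptTo(TagManager.class);
        // createTag creates any missing intermediate tags along the path.
        Tag region = tagManager.createTag("assets:region/north-america", "North America", "Assets for the NA region");
        Tag format = tagManager.createTag("assets:format/print", "Print", "Print-ready assets");
        resolver.commit();
    }
}
```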
Performance and stability are important concerns when ingesting assets into the system. Because you are loading a large amount of data, you want the system to perform as well as it can, both to minimize the time required and to avoid an overload, which can lead to a crash, especially on systems that are already in production.
There are two approaches to loading the assets into the system: a push-based approach using HTTP or a pull-based approach using the JCR APIs.
Adobe’s Managed Services team uses a tool called Glutton to load data into customer environments. Glutton is a small Java application that loads all assets from one directory into another directory on an Experience Manager deployment. Instead of Glutton, you could also use tools such as Perl scripts to post the assets into the repository.
The main downside to pushing assets through HTTP is that every asset must be transmitted over the network, which adds considerable time and overhead to the ingestion process.
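As an illustration of the push-based approach, the following is a minimal sketch that uploads a file through the Assets HTTP API, where a POST to a path under /api/assets creates an asset from the request body. The host, credentials, target folder, and MIME type are placeholders; a production loader would add batching, retries, and error handling.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.file.Path;
import java.util.Base64;

// Sketch: push one asset over HTTP via the Assets HTTP API. Host,
// credentials, and the "migrated" target folder are placeholders.
public class HttpAssetPush {

    public static void uploadAsset(Path file, String mimeType) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        String auth = Base64.getEncoder().encodeToString("admin:admin".getBytes());
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:4502/api/assets/migrated/" + file.getFileName()))
                .header("Authorization", "Basic " + auth)
                .header("Content-Type", mimeType)
                .POST(HttpRequest.BodyPublishers.ofFile(file))
                .build();
        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
        // 201 Created indicates the asset node was created successfully.
        if (response.statusCode() != 201) {
            throw new IllegalStateException("Upload failed: " + response.statusCode());
        }
    }
}
```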
The other approach to ingesting assets is to pull assets from the local file system. However, if you cannot get an external drive or network share mounted to the server to perform a pull-based approach, posting the assets over HTTP is the best option.
The ACS AEM Tools CSV Asset Importer pulls assets from the filesystem and asset metadata from a CSV file for the asset import. The Experience Manager Asset Manager API is used to import the assets into the system and apply the configured metadata properties. Ideally, assets are mounted on the server via a network file mount or through an external drive.
Because assets do not need to be transmitted over a network, overall performance improves dramatically, and this method is generally considered the most efficient way to load assets into the repository. Additionally, because the tool supports metadata ingestion, you can import all assets and metadata in a single step rather than applying the metadata in a second pass with a separate tool.
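For context, the following is a minimal sketch of a pull-based import using the same AssetManager API the CSV Asset Importer builds on: the binary is read from a locally mounted path, the asset is created, and a metadata property (standing in for a CSV column) is applied in the same pass. The source path, target DAM path, MIME type, and metadata field are illustrative, and obtaining the ResourceResolver is left to the caller.

```java
import com.day.cq.dam.api.Asset;
import com.day.cq.dam.api.AssetManager;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.jcr.Node;
import javax.jcr.Session;
import org.apache.sling.api.resource.ResourceResolver;

// Sketch: pull one asset from a mounted filesystem path and apply a
// metadata property in the same pass. Target path and field are examples.
public class PullImport {

    public static void importAsset(ResourceResolver resolver, Path sourceFile, String title) throws Exception {
        AssetManager assetManager = resolver.adaptTo(AssetManager.class);
        try (InputStream in = Files.newInputStream(sourceFile)) {
            Asset asset = assetManager.createAsset(
                    "/content/dam/migrated/" + sourceFile.getFileName(), in, "image/jpeg", true);
            // Apply metadata from the CSV row to the asset's metadata node.
            Node metadata = resolver.adaptTo(Session.class)
                    .getNode(asset.getPath() + "/jcr:content/metadata");
            metadata.setProperty("dc:title", title);
        }
        resolver.adaptTo(Session.class).save();
    }
}
```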
After you load the assets into the system, you need to process them through the DAM Update Asset workflow to extract metadata and generate renditions. Before performing this step, you need to duplicate and modify the DAM Update Asset workflow to fit your needs. The out-of-the-box workflow contains many steps that may not be necessary for you, such as Dynamic Media PTIFF generation or InDesign Server integration.
After you have configured the workflow according to your needs, you have two options for executing it: trigger it in batches from the workflow console, or start it programmatically against lists of asset paths, as in the sketch below.
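The following is a minimal sketch of the programmatic option, using the Granite workflow API. The model path shown is the default location of DAM Update Asset on recent versions; point it at your duplicated model instead. Batching and throttling between runs are left to the caller, and a purpose-built tool such as the ACS AEM Commons Bulk Workflow Manager is an alternative worth considering.

```java
import com.adobe.granite.workflow.WorkflowSession;
import com.adobe.granite.workflow.exec.WorkflowData;
import com.adobe.granite.workflow.model.WorkflowModel;
import java.util.List;
import org.apache.sling.api.resource.ResourceResolver;

// Sketch: start a workflow instance per asset path. Replace the model path
// with the path of your duplicated, trimmed-down DAM Update Asset model.
public class BatchWorkflowRunner {

    public static void processBatch(ResourceResolver resolver, List<String> assetPaths) throws Exception {
        WorkflowSession wfSession = resolver.adaptTo(WorkflowSession.class);
        WorkflowModel model = wfSession.getModel("/var/workflow/models/dam/update_asset");
        for (String path : assetPaths) {
            // Each asset's path becomes the payload of one workflow instance.
            WorkflowData data = wfSession.newWorkflowData("JCR_PATH", path);
            wfSession.startWorkflow(model, data);
        }
    }
}
```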
For deployments that have a publish tier, you need to activate the assets out to the publish farm. While Adobe recommends running more than a single publish instance, it is most efficient to replicate all of the assets to a single publish instance and then clone that instance. When activating large numbers of assets, you may need to intervene after triggering a tree activation. The reason is that each activation adds an item to the Sling job queue; once the queue grows beyond approximately 40,000 items, processing slows dramatically, and beyond 100,000 items, system stability starts to suffer.
To work around this issue, you can use the ACS AEM Commons Fast Action Manager (FAM) to manage asset replication. FAM works without using the Sling queues, lowering overhead, while throttling the workload to prevent the server from becoming overloaded. An example of using FAM to manage replication is shown on the feature's documentation page.
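FAM is the recommended tool here. Purely to illustrate the underlying idea, the following is a simplified sketch that activates a list of asset paths directly through the Replicator API, pausing between batches so the queues are never flooded the way an unthrottled tree activation can flood them. The batch size and pause are arbitrary placeholders, and the Replicator service and JCR session are assumed to be supplied by the caller.

```java
import com.day.cq.replication.ReplicationActionType;
import com.day.cq.replication.Replicator;
import java.util.List;
import javax.jcr.Session;

// Sketch: throttled activation via the Replicator API. Batch size and
// pause duration are placeholders to tune against your environment.
public class ThrottledActivation {

    public static void activateInBatches(Replicator replicator, Session session,
            List<String> assetPaths) throws Exception {
        int batchSize = 100;
        for (int i = 0; i < assetPaths.size(); i++) {
            replicator.replicate(session, ReplicationActionType.ACTIVATE, assetPaths.get(i));
            if ((i + 1) % batchSize == 0) {
                // Give the replication agents time to drain before continuing.
                Thread.sleep(5_000L);
            }
        }
    }
}
```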
Other options for getting assets to the publish farm include vlt rcp, provided as part of Apache Jackrabbit FileVault, and oak-run, provided as part of Apache Jackrabbit Oak. Another option is Grabbit, an open-source tool for Experience Manager infrastructure that claims faster performance than vlt.
The caveat with any of these approaches is that the assets on the author instance do not show as having been activated. To flag these assets with the correct activation status, you need to also run a script that marks them as activated; a sketch follows the note below.
Adobe does not maintain or support Grabbit.
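A minimal sketch of such a script, assuming the standard cq:ReplicationStatus properties are what your author UI reads: it stamps cq:lastReplicated, cq:lastReplicatedBy, and cq:lastReplicationAction onto the asset's jcr:content node, adding the mixin first where it is missing. Iterating over the migrated asset paths is left to the caller.

```java
import java.util.Calendar;
import javax.jcr.Node;
import javax.jcr.Session;

// Sketch: mark one asset as activated by writing the standard replication
// status properties, so the author UI shows it as published.
public class MarkActivated {

    public static void markAsActivated(Session session, String assetPath) throws Exception {
        Node content = session.getNode(assetPath + "/jcr:content");
        if (!content.isNodeType("cq:ReplicationStatus")) {
            content.addMixin("cq:ReplicationStatus");
        }
        content.setProperty("cq:lastReplicated", Calendar.getInstance());
        content.setProperty("cq:lastReplicatedBy", session.getUserID());
        content.setProperty("cq:lastReplicationAction", "Activate");
        session.save();
    }
}
```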
After the assets have been activated, you can clone your publish instance to create as many copies as are necessary for the deployment. Cloning a server is fairly straightforward, but there are some important steps to remember. To clone publish:

1. Stop the publish instance and copy its installation folder to the new environment.
2. Search the copied installation for a file named sling.id. Delete this file. The instance generates a new Sling ID the next time it starts, which ensures the clone is not mistaken for the original instance.
3. If you use a File Data Store, update crx-quickstart/launchpad/config/org/apache/jackrabbit/oak/plugins/blob/datastore/FileDataStore.config to point to the location of the datastore on the new environment.
4. Start the cloned instance.
Once you have completed the migration, re-enable the launchers for the DAM Update Asset workflow to support rendition generation and metadata extraction for ongoing day-to-day system usage.
While not nearly as common, sometimes you need to migrate large amounts of data from one Experience Manager deployment to another; for example, when you upgrade Experience Manager, upgrade your hardware, or move to a new datacenter, such as in a migration to Adobe Managed Services (AMS).
In this case, your assets are already populated with metadata and renditions have already been generated, so you can focus on moving the assets from one instance to another. When migrating between Experience Manager deployments, you perform the following steps:
Disable workflows: Because you are migrating renditions along with your assets, disable the workflow launchers for the DAM Update Asset workflow.
Migrate tags: Because you already have tags loaded in the source Experience Manager deployment, you can build them in a content package and install the package on the target instance.
Migrate assets: Two tools are recommended for moving assets from one Experience Manager deployment to another: vlt rcp and Grabbit, both discussed earlier.
Activate assets: Follow the instructions for activating assets documented for the initial migration to Experience Manager.
Clone publish: As with a new migration, loading a single publish instance and cloning it is more efficient than activating the content on both nodes. See Cloning Publish.
Enable workflows: After you have completed migration, re-enable the launchers for the DAM Update Asset workflow to support rendition generation and metadata extraction for ongoing day-to-day system usage.