Common asset ingestion issues and solutions

This article describes the most common AEM Assets ingestion issues and how to analyze them, including high processing load, high volume, large DAM repositories, and many concurrent authors.

Description

Environment

Adobe Experience Manager (AEM)

Issue

The most common AEM Assets ingestion issues fall into the following scenarios:

  • High processing
  • High volume
  • Large DAM repositories
  • Many concurrent authors

Resolution

Ingestion Scenarios and Solutions

Scenario 1: High processing

Bulk imports, such as 2,000 images at once, cause high CPU and memory usage on author instances.

Solution

Offload jobs to another AEM instance. You can offload entire workflows or only a few heavy steps by connecting a processing instance to the primary author instance via DAM proxy workers. The primary author instance thereby remains free to serve other users. DAM proxy workers supervise the remote tasks, gather the results, and feed them back to the local workflow execution.
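The division of labor described above can be sketched in a few lines of Python. This is a conceptual model only, not AEM's proxy worker API: the `ProxyWorker` class, the `render_renditions` step, and the thread pool standing in for the processing instance are all hypothetical names chosen for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def render_renditions(asset_name: str) -> str:
    """Stand-in for a heavy step (for example rendition generation)."""
    return f"{asset_name}: renditions ready"


class ProxyWorker:
    """Submits heavy steps to a remote pool and collects the results."""

    def __init__(self, remote_pool: ThreadPoolExecutor) -> None:
        self.remote_pool = remote_pool

    def run_remote(self, step, assets):
        # Hand every asset to the remote pool, then gather results in order.
        futures = [self.remote_pool.submit(step, asset) for asset in assets]
        return [future.result() for future in futures]


def ingest(assets, proxy: ProxyWorker):
    """Local workflow: the cheap step runs here, the heavy step is offloaded."""
    validated = [a.strip() for a in assets if a.strip()]  # cheap local step
    return proxy.run_remote(render_renditions, validated)


# Assumption: the thread pool stands in for a separate processing instance.
with ThreadPoolExecutor(max_workers=4) as processing_instance:
    results = ingest(["a.jpg", " b.png ", ""], ProxyWorker(processing_instance))
```

In this sketch the local workflow only validates input and waits for results, which mirrors how the primary author instance stays free while the heavy work runs elsewhere.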

Scenario 2: High volume

This occurs with, for example, a database of a few million products that receives 12,000 modifications per day. The repository becomes the bottleneck in such scenarios: while writes are in progress, reads are blocked for consistency.

Solution

To prevent this situation, segregate the import process on a dedicated author instance with its own repository. At completion, replicate a full delta to the author environment, with chained replication to the publish environment, if necessary. Use a reserved replication queue to avoid delaying important editorial changes from publication.
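The effect of a reserved replication queue can be illustrated with a small simulation. It is a sketch under stated assumptions, not a model of AEM's replication agents: the item names, the 1,000-item delta, and the 0.5 seconds per replicated item are all hypothetical values.

```python
from collections import deque


def drain(queue: deque, seconds_per_item: float) -> dict:
    """Process a FIFO queue and return each item's completion time in seconds."""
    finished, clock = {}, 0.0
    while queue:
        item = queue.popleft()
        clock += seconds_per_item
        finished[item] = clock
    return finished


BULK_ITEMS = [f"product-{i}" for i in range(1000)]  # hypothetical import delta
EDITORIAL = "homepage-fix"                          # hypothetical urgent change
SECONDS_PER_ITEM = 0.5                              # assumed replication cost

# Shared queue: the editorial change waits behind the whole import delta.
shared = drain(deque(BULK_ITEMS + [EDITORIAL]), SECONDS_PER_ITEM)

# Reserved queue: the import drains separately, so editorial work is not blocked.
bulk_only = drain(deque(BULK_ITEMS), SECONDS_PER_ITEM)
editorial_only = drain(deque([EDITORIAL]), SECONDS_PER_ITEM)

shared_delay = shared[EDITORIAL]            # 500.5 seconds
reserved_delay = editorial_only[EDITORIAL]  # 0.5 seconds
```

With these assumed numbers, the editorial change is published after 0.5 seconds on its own queue instead of 500.5 seconds behind the import.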

Scenario 3: Large DAM repositories

This happens with very large repositories, for example more than 7 million assets, 20 million nodes, and 15 TB on disk. A repository of this size degrades instance performance.

Solution

Separate the persistent store from the data store, which is optimized for handling large binaries. The persistent store requires very low-latency I/O, so local storage works best. For the data store, higher latency is acceptable.
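The following sketch shows the idea behind the split: small values stay in the low-latency store, while large binaries go to a separate store and are referenced by a content hash. It is a conceptual model, not the repository's actual implementation. The `SplitStore` class, the 4096-byte threshold, and the sample paths are assumptions made for illustration.

```python
import hashlib

MIN_RECORD_LENGTH = 4096  # assumed threshold in bytes, not a recommended value


class SplitStore:
    """Keeps small values inline and moves large binaries to a data store."""

    def __init__(self) -> None:
        self.node_store = {}  # path -> ("inline", bytes) or ("blob", digest)
        self.data_store = {}  # sha256 digest -> binary payload

    def put(self, path: str, payload: bytes) -> None:
        if len(payload) < MIN_RECORD_LENGTH:
            self.node_store[path] = ("inline", payload)
            return
        digest = hashlib.sha256(payload).hexdigest()
        # Identical binaries share one data store record.
        self.data_store.setdefault(digest, payload)
        self.node_store[path] = ("blob", digest)

    def get(self, path: str) -> bytes:
        kind, value = self.node_store[path]
        return value if kind == "inline" else self.data_store[value]


store = SplitStore()
image = b"x" * 10_000  # placeholder for a large binary
store.put("/content/dam/a.jpg", image)
store.put("/content/dam/copy-of-a.jpg", image)
store.put("/content/dam/a.jpg/title", b"Hero image")
```

Because the node store holds only short references for large assets, it stays small enough for fast local storage, while the data store can live on slower, higher-capacity storage.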

Scenario 4: Many concurrent authors

A large number of concurrent authors can degrade instance performance and slow down asset processing.

Solution

Concurrent authors are users who are actively working on the system. Logged-in but inactive authors do not place additional load on the system. Operations such as editing, uploading assets, triggering workflows, searching for and downloading assets, and modifying metadata consume CPU and memory.

Forming a cluster of author instances with a dispatcher in front helps distribute the CPU load evenly. With a large number of authors in active production, it is recommended to spin off each project into a separate author instance or environment in which the work in progress takes place. This technique is called content partitioning.
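Content partitioning can be pictured as a mapping from project paths to dedicated author instances. The sketch below is illustrative only: the paths, host names, and the `author_for` helper are hypothetical and do not correspond to any AEM or dispatcher configuration.

```python
# Hypothetical project roots and the author instance dedicated to each one.
PARTITIONS = {
    "/content/dam/campaign-a": "author-campaign-a.example.com",
    "/content/dam/campaign-b": "author-campaign-b.example.com",
}
DEFAULT_AUTHOR = "author-main.example.com"  # hypothetical shared instance


def author_for(path: str) -> str:
    """Return the author instance that owns the longest matching project root."""
    matches = [
        root for root in PARTITIONS
        if path == root or path.startswith(root + "/")
    ]
    return PARTITIONS[max(matches, key=len)] if matches else DEFAULT_AUTHOR
```

For example, `author_for("/content/dam/campaign-a/hero.jpg")` resolves to the instance reserved for that project, while any path outside a partitioned project falls back to the shared instance.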

