Performance Guidelines performance-guidelines
This page provides general guidelines on how to optimize the performance of your AEM deployment. If you are new to AEM, please go over the following pages before you start reading the performance guidelines:
Illustrated below are the deployment options available for AEM (scroll to view all the options):
When to Use the Performance Guidelines when-to-use-the-performance-guidelines
You should use the performance guidelines in the following situations:
- First-time deployment: When planning to deploy AEM Sites or Assets for the first time, it is important to understand the options available when configuring the Micro Kernel, Node Store, and Data Store (compared to the default settings). For example, changing the TarMK Data Store from its default setting to a File Data Store.
- Upgrading to a new version: When upgrading to a new version, it is important to understand the performance differences compared to the running environment. For example, upgrading from AEM 6.1 to 6.2, or from AEM 6.0 CRX2 to 6.2 OAK.
- Response time is slow: When the selected Node Store architecture is not meeting your requirements, it is important to understand the performance differences compared to other topology options. For example, deploying TarMK instead of MongoMK, or using a File Data Store instead of an Amazon S3 or Microsoft Azure Data Store.
- Adding more authors: When the recommended TarMK topology is not meeting the performance requirements and upsizing the Author node has reached the maximum capacity available, it is important to understand the performance differences compared to using MongoMK with three or more Author nodes. For example, deploying MongoMK instead of TarMK.
- Adding more content: When the recommended Data Store architecture is not meeting your requirements, it is important to understand the performance differences compared to other Data Store options. For example, using the Amazon S3 or Microsoft Azure Data Store instead of a File Data Store.
Introduction introduction
This chapter gives a general overview of the AEM architecture and its most important components. It also provides development guidelines and describes the testing scenarios used in the TarMK and MongoMK benchmark tests.
The AEM Platform the-aem-platform
The AEM platform consists of the following components:
For more information on the AEM platform, see What is AEM.
The AEM Architecture the-aem-architecture
An AEM deployment has three important building blocks. The Author instance is used by content authors, editors, and approvers to create and review content. When the content is approved, it is published to a second instance type, the Publish instance, from which end users access it. The third building block is the Dispatcher, a module installed on the web server that handles caching and URL filtering. For additional information about the AEM architecture, see Typical Deployment Scenarios.
Micro Kernels micro-kernels
Micro Kernels act as persistence managers in AEM. There are three types of Micro Kernels used with AEM: TarMK, MongoDB, and Relational Database (under restricted support). Choosing one to fit your needs depends on the purpose of your instance and the deployment type you are considering. For additional information about Micro Kernels, see the Recommended Deployments page.
Nodestore nodestore
In AEM, binary data can be stored independently from content nodes. The location where the binary data is stored is referred to as the Data Store, while the location of the content nodes and properties is called the Node Store.
Data Store data-store
When dealing with a large number of binaries, it is recommended that an external data store be used instead of the default node stores in order to maximize performance. For example, if your project requires a large number of media assets, storing them in the File or Azure/S3 Data Store will make accessing them faster than storing them directly inside MongoDB.
For further details on the available configuration options, see Configuring Node and Data Stores.
Search search-features
Listed in this section are the custom index providers used with AEM. To know more about indexing, see Oak Queries and Indexing.
Development Guidelines development-guidelines
You should develop for AEM aiming for performance and scalability. Presented below are a number of best practices that you can follow:
DO
- Apply separation of presentation, logic, and content
- Use existing AEM APIs (for example, Sling) and tooling (for example, Replication)
- Develop in the context of actual content
- Develop for optimum cacheability
- Minimize the number of saves (for example, by using transient workflows)
- Make sure all HTTP endpoints are RESTful
- Restrict the scope of JCR observation
- Be mindful of asynchronous threads
DON’T
- Don’t use JCR APIs directly if you can avoid it
- Don’t change /libs, but rather use overlays
- Avoid queries wherever possible
- Don’t use Sling Bindings to get OSGi services in Java code, but rather use:
  - @Reference in a DS component
  - @Inject in a Sling Model
  - sling.getService() in a Sightly Use Class
  - sling.getService() in a JSP
  - a ServiceTracker
  - direct access to the OSGi service registry
For further details about developing on AEM, read Developing - The Basics. For additional best practices, see Development Best Practices.
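The "minimize the number of saves" guideline in the DO list can be illustrated with a small, self-contained sketch. It does not show transient workflows (the example the list names); it shows plain batching instead, where several changes are accumulated and persisted with one save. `ContentSession` and `CountingSession` are hypothetical stand-ins invented for this illustration and are not the JCR or Sling API; the property name `reviewed` and the batch size are likewise assumptions.

```java
import java.util.ArrayList;
import java.util.List;

final class BatchedSaves {
    /** Hypothetical stand-in for a repository session; NOT the JCR API. */
    interface ContentSession {
        void setProperty(String path, String name, String value);
        void save();
    }

    /** In-memory session that only counts how often save() is called. */
    static final class CountingSession implements ContentSession {
        private final List<String> pending = new ArrayList<>();
        int saves = 0;
        int persistedChanges = 0;

        @Override
        public void setProperty(String path, String name, String value) {
            pending.add(path + "/@" + name + "=" + value);
        }

        @Override
        public void save() {
            persistedChanges += pending.size();
            pending.clear();
            saves++;
        }
    }

    /** One save per change: the pattern the guideline advises against. */
    static void tagEachWithSave(ContentSession session, List<String> paths) {
        for (String path : paths) {
            session.setProperty(path, "reviewed", "true");
            session.save();
        }
    }

    /** Saves once per batchSize changes, plus once for any remainder. */
    static void tagInBatches(ContentSession session, List<String> paths, int batchSize) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be at least 1");
        }
        int unsaved = 0;
        for (String path : paths) {
            session.setProperty(path, "reviewed", "true");
            if (++unsaved == batchSize) {
                session.save();
                unsaved = 0;
            }
        }
        if (unsaved > 0) {
            session.save();
        }
    }

    public static void main(String[] args) {
        List<String> paths = new ArrayList<>();
        for (int i = 0; i < 250; i++) {
            paths.add("/content/site/page" + i); // hypothetical paths
        }
        CountingSession perChange = new CountingSession();
        tagEachWithSave(perChange, paths);
        CountingSession batched = new CountingSession();
        tagInBatches(batched, paths, 100);
        System.out.println("per-change saves: " + perChange.saves);
        System.out.println("batched saves: " + batched.saves);
    }
}
```

Both variants persist the same changes; only the number of save calls differs. In a real project the same shape would apply to whatever session or resolver object the code already uses.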
Benchmark Scenarios benchmark-scenarios
The testing scenarios detailed below are used for the benchmark sections of the TarMK, MongoMK, and TarMK vs MongoMK chapters. To see which scenario was used for a particular benchmark test, read the Scenario field from the Technical Specifications table.
Single Product Scenario
AEM Assets:
- User interactions: Browse Assets / Search Assets / Download Asset / Read Asset Metadata / Update Asset Metadata / Upload Asset / Run Upload Asset Workflow
- Execution mode: concurrent users, single interaction per user
Mix Products Scenario
AEM Sites + Assets:
- Sites user interactions: Read Article Page / Read Page / Create Paragraph / Edit Paragraph / Create Content Page / Activate Content Page / Author Search
- Assets user interactions: Browse Assets / Search Assets / Download Asset / Read Asset Metadata / Update Asset Metadata / Upload Asset / Run Upload Asset Workflow
- Execution mode: concurrent users, mixed interactions per user
Vertical Use Case Scenario
Media:
- Read Article Page (27.4%), Read Page (10.9%), Create Session (2.6%), Activate Content Page (1.7%), Create Content Page (0.4%), Create Paragraph (4.3%), Edit Paragraph (0.9%), Image Component (0.9%), Browse Assets (20%), Read Asset Metadata (8.5%), Download Asset (4.2%), Search Asset (0.2%), Update Asset Metadata (2.4%), Upload Asset (1.2%), Browse Project (4.9%), Read Project (6.6%), Project Add Asset (1.2%), Project Add Site (1.2%), Create Project (0.1%), Author Search (0.4%)
- Execution mode: concurrent users, mixed interactions per user
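To make the Media interaction mix concrete, the following sketch shows one way a load script could choose the next interaction for a simulated user so that the long-run shares match the percentages listed above. It is an illustration only, not the harness used for the published benchmarks; the class and method names are hypothetical. Weights are stored in tenths of a percent so that the arithmetic stays in integers.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Random;

/** Weighted pick over the Media scenario mix; names are hypothetical. */
final class MediaScenarioMix {
    // Shares copied from the scenario list, multiplied by 10 (27.4% -> 274).
    static final Map<String, Integer> WEIGHTS = new LinkedHashMap<>();
    static {
        WEIGHTS.put("Read Article Page", 274);
        WEIGHTS.put("Read Page", 109);
        WEIGHTS.put("Create Session", 26);
        WEIGHTS.put("Activate Content Page", 17);
        WEIGHTS.put("Create Content Page", 4);
        WEIGHTS.put("Create Paragraph", 43);
        WEIGHTS.put("Edit Paragraph", 9);
        WEIGHTS.put("Image Component", 9);
        WEIGHTS.put("Browse Assets", 200);
        WEIGHTS.put("Read Asset Metadata", 85);
        WEIGHTS.put("Download Asset", 42);
        WEIGHTS.put("Search Asset", 2);
        WEIGHTS.put("Update Asset Metadata", 24);
        WEIGHTS.put("Upload Asset", 12);
        WEIGHTS.put("Browse Project", 49);
        WEIGHTS.put("Read Project", 66);
        WEIGHTS.put("Project Add Asset", 12);
        WEIGHTS.put("Project Add Site", 12);
        WEIGHTS.put("Create Project", 1);
        WEIGHTS.put("Author Search", 4);
    }

    /** Sum of all weights, in tenths of a percent. */
    static int totalWeight() {
        int total = 0;
        for (int weight : WEIGHTS.values()) {
            total += weight;
        }
        return total;
    }

    /** Maps a roll in [0, totalWeight()) to the interaction whose band contains it. */
    static String pick(int roll) {
        if (roll < 0 || roll >= totalWeight()) {
            throw new IllegalArgumentException("roll out of range: " + roll);
        }
        int upperBound = 0;
        for (Map.Entry<String, Integer> entry : WEIGHTS.entrySet()) {
            upperBound += entry.getValue();
            if (roll < upperBound) {
                return entry.getKey();
            }
        }
        throw new IllegalStateException("weights do not cover the roll");
    }

    public static void main(String[] args) {
        Random random = new Random();
        for (int i = 0; i < 5; i++) {
            System.out.println(pick(random.nextInt(totalWeight())));
        }
    }
}
```

The listed percentages add up to exactly 100%, so `totalWeight()` is 1000 and every roll maps to exactly one interaction.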
TarMK tarmk
This chapter gives general performance guidelines for TarMK, specifying the minimum architecture requirements and the settings configuration. Benchmark tests are also provided for further clarification.
Adobe recommends TarMK to be the default persistence technology used by customers in all deployment scenarios, for both the AEM Author and Publish instances.
For more information about TarMK, see Deployment Scenarios and Tar Storage.
TarMK Minimum Architecture Guidelines tarmk-minimum-architecture-guidelines
To establish good performance when using TarMK, you should start from the following architecture:
- One Author instance
- Two Publish instances
- Two Dispatchers
Illustrated below are the architecture guidelines for AEM Sites and AEM Assets.
Tar Architecture Guidelines for AEM Sites
Tar Architecture Guidelines for AEM Assets
TarMK Settings Guideline tarmk-settings-guideline
For good performance, you should follow the settings guidelines presented below. For instructions on how to change the settings, see this page.
TarMK Performance Benchmark tarmk-performance-benchmark
Technical Specifications technical-specifications
The benchmark tests were performed on the following specifications:
Performance Benchmark Results performance-bechmark-results
MongoMK mongomk
The primary reason for choosing the MongoMK persistence backend over TarMK is to scale the instances horizontally. This means having two or more active author instances running at all times and using MongoDB as the persistence storage system. The need to run more than one author instance results generally from the fact that the CPU and memory capacity of a single server, supporting all concurrent authoring activities, is no longer sustainable.
For more information about MongoMK, see Deployment Scenarios and Mongo Storage.
MongoMK Minimum Architecture Guidelines mongomk-minimum-architecture-guidelines
To establish good performance when using MongoMK, you should start from the following architecture:
- Three Author instances
- Two Publish instances
- Three MongoDB instances
- Two Dispatchers
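The minimum architectures listed for TarMK (one Author, two Publish, two Dispatchers) and for MongoMK (three Authors, two Publish, three MongoDB instances, two Dispatchers) can be captured as a simple pre-deployment check. The sketch below is a hypothetical helper, not part of AEM: it compares planned instance counts against those minimums and reports any shortfall.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical planning helper; the minimums are the ones stated on this page. */
final class MinimumTopology {
    enum Persistence { TARMK, MONGOMK }

    /** Returns one message per tier that is below the stated minimum. */
    static List<String> shortfalls(Persistence persistence, int authors, int publishes,
                                   int dispatchers, int mongoInstances) {
        boolean mongo = persistence == Persistence.MONGOMK;
        List<String> gaps = new ArrayList<>();
        check(gaps, "Author instances", authors, mongo ? 3 : 1);
        check(gaps, "Publish instances", publishes, 2);
        check(gaps, "Dispatchers", dispatchers, 2);
        if (mongo) {
            // MongoDB instances only apply to a MongoMK deployment.
            check(gaps, "MongoDB instances", mongoInstances, 3);
        }
        return gaps;
    }

    private static void check(List<String> gaps, String tier, int have, int need) {
        if (have < need) {
            gaps.add(tier + ": have " + have + ", need at least " + need);
        }
    }

    public static void main(String[] args) {
        // A MongoMK plan with too few Authors and Dispatchers.
        for (String gap : shortfalls(Persistence.MONGOMK, 2, 2, 1, 3)) {
            System.out.println(gap);
        }
    }
}
```

An empty result means the plan meets the starting architecture; it says nothing about whether that architecture is large enough for a given load.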
MongoMK Settings Guidelines mongomk-settings-guidelines
For good performance, you should follow the settings guidelines presented below. For instructions on how to change the settings, see this page.
MongoMK Performance Benchmark mongomk-performance-benchmark
Technical Specifications technical-specifications-1
The benchmark tests were performed on the following specifications:
Performance Benchmark Results performance-benchmark-results
TarMK vs MongoMK tarmk-vs-mongomk
The basic rule that needs to be taken into account when choosing between the two is that TarMK is designed for performance, while MongoMK is used for scalability. Adobe recommends TarMK to be the default persistence technology used by customers in all deployment scenarios, for both the AEM Author and Publish instances.
The primary reason for choosing the MongoMK persistence backend over TarMK is to scale the instances horizontally. This means having two or more active author instances running at all times and using MongoDB as the persistence storage system. The need to run more than one author instance generally results from the fact that the CPU and memory capacity of a single server, supporting all concurrent authoring activities, is no longer sustainable.
For further details on TarMK vs MongoMK, see Recommended Deployments.
TarMK vs MongoMK Guidelines tarmk-vs-mongomk-guidelines
Benefits of TarMK
- Purpose-built for content management applications
- Files are always consistent and can be backed up using any file-based backup tool
- Provides a failover mechanism - see Cold Standby for more details
- Provides high performance and reliable data storage with minimal operational overhead
- Lower TCO (total cost of ownership)
Criteria for choosing MongoMK
- Number of named users connected in a day: in the thousands or more
- Number of concurrent users: in the hundreds or more
- Volume of asset ingestions per day: in the hundreds of thousands or more
- Volume of page edits per day: in the hundreds of thousands or more
- Volume of searches per day: in the tens of thousands or more
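The criteria above are given as orders of magnitude rather than exact numbers. As a rough sketch, the helper below turns each one into a numeric threshold and reports which criteria a given workload reaches. The threshold values are assumptions: each is simply the lower bound of the stated range (for example, 1,000 for "in the thousands"). The page does not say how many criteria must be met, so the helper only lists them and leaves the decision to the reader.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper; thresholds are assumed lower bounds, not official limits. */
final class MongoMkCriteria {
    static final long NAMED_USERS_PER_DAY = 1_000;        // "in the thousands"
    static final long CONCURRENT_USERS = 100;             // "in the hundreds"
    static final long ASSET_INGESTIONS_PER_DAY = 100_000; // "in the hundreds of thousands"
    static final long PAGE_EDITS_PER_DAY = 100_000;       // "in the hundreds of thousands"
    static final long SEARCHES_PER_DAY = 10_000;          // "in the tens of thousands"

    /** Returns the names of the criteria whose assumed threshold is reached. */
    static List<String> criteriaMet(long namedUsersPerDay, long concurrentUsers,
                                    long assetIngestionsPerDay, long pageEditsPerDay,
                                    long searchesPerDay) {
        List<String> met = new ArrayList<>();
        if (namedUsersPerDay >= NAMED_USERS_PER_DAY) {
            met.add("named users per day");
        }
        if (concurrentUsers >= CONCURRENT_USERS) {
            met.add("concurrent users");
        }
        if (assetIngestionsPerDay >= ASSET_INGESTIONS_PER_DAY) {
            met.add("asset ingestions per day");
        }
        if (pageEditsPerDay >= PAGE_EDITS_PER_DAY) {
            met.add("page edits per day");
        }
        if (searchesPerDay >= SEARCHES_PER_DAY) {
            met.add("searches per day");
        }
        return met;
    }

    public static void main(String[] args) {
        // Example workload with made-up numbers.
        System.out.println(criteriaMet(5_000, 50, 2_000, 1_500, 30_000));
    }
}
```

A workload that reaches none of these thresholds is a candidate for the recommended TarMK default.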
TarMK vs MongoMK Benchmarks tarmk-vs-mongomk-benchmarks
Scenario 1 Technical Specifications scenario-technical-specifications
Scenario 1 Performance Benchmark Results scenario-performance-benchmark-results
Scenario 2 Technical Specifications scenario-technical-specifications-1
Scenario 2 Performance Benchmark Results scenario-performance-benchmark-results-1
Architecture Scalability Guidelines For AEM Sites and Assets architecture-scalability-guidelines-for-aem-sites-and-assets
Summary of Performance Guidelines summary-of-performance-guidelines
The guidelines presented on this page can be summarized as follows:
- TarMK with File Datastore is the recommended architecture for most customers:
  - Minimum topology: one Author instance, two Publish instances, two Dispatchers
  - Binary-less replication turned on if the File Datastore is shared
- MongoMK with File Datastore is the recommended architecture for horizontal scalability of the Author tier:
  - Minimum topology: three Author instances, three MongoDB instances, two Publish instances, two Dispatchers
  - Binary-less replication turned on if the File Datastore is shared
- The Node Store should be stored on the local disk, not on network-attached storage (NAS)
- When using Amazon S3:
  - The Amazon S3 datastore is shared between the Author and Publish tiers
  - Binary-less replication must be turned on
  - Datastore Garbage Collection requires a first run on all Author and Publish nodes, then a second run on Author
- Custom indexes should be created in addition to the out-of-the-box indexes, based on the most common searches:
  - Lucene indexes should be used for the custom indexes
- Customizing workflows can substantially improve performance, for example, by removing the video step in the “Update Asset” workflow or disabling listeners that are not used
For more details, also read the Recommended Deployments page.