Analyzing AEM Indexing Issues
The article provides guidance on analyzing slow queries, reindexing asynchronous indexes, and troubleshooting indexing issues in Adobe Experience Manager.
Description
Environment
Adobe Experience Manager
Issue/Symptoms
How to troubleshoot all the common indexing issues?
Resolution
Analyzing Slow Queries
Going to Tools >
Operations >
Dashboard >
Diagnosis >
Query Performance will show the list of slow queries.
See this document for details on analyzing the slow queries.
In non-production environments, the Adobe Consulting Services (ACS) AEM tools package can be installed to provide an explain query tool for further debugging.
Reindexing Asynchronous Indexes
Note: Re-indexing typically does not resolve query issues, such as queries not returning the right results. Also, reindexing can take a long time Reindexing is to be avoided unless covered by a reason listed there.
How to reindex an asynchronous index
-
Login to your AEM instance
-
Open http://aemhost:port/crx/de/index.jsp
-
Browse to the index definition under /oak:index
-
Set a Boolean property reindex=true on the index to reindex, then save. When indexing starts you will see a log message like this:
23.06.2015 14:26:23.070 *INFO* [ FelixStartLevel] org.apache.jackrabbit.oak.plugins.index.IndexUpdate Reindexing will be performed for following indexes: [ /oak:index/cqAcUUID, /oak:index/nodetype, /oak:index/deviceIdentificationMode, /oak:index/campaignpath, /oak:index/active, /oak:index/jcrFrozenMixinTypes]
-
While reindexing is in progress the status attribute of IndexStatsMBean would have value running.
23.06.2015 14:26:23.517 *INFO* [ FelixStartLevel] org.apache.jackrabbit.oak.plugins.index.IndexUpdate Reindexing Traversed #10000 /jcr:system/jcr:versionStorage/c8/5f... 23.06.2015 14:28:51.999 *INFO* [ pool-8-thread-1] org.apache.jackrabbit.oak.plugins.index.IndexUpdate Indexing report - /oak:index/counter (708) - /oak:index/authorizables (159) - /oak:index/cqPageLucene (1913) - /oak:index/ntBaseLucene (444) - /oak:index/cqTagLucene (512) - /oak:index/workflowDataLucene*(116)
-
You can confirm that indexing completed by checking your index definition node, the property reindexCount should have been incremented and the reindex boolean should be set to false.
In the log file, if indexing completed successfully, the following entry will be displayed:
23.06.2015 14:28:52.009 *INFO* [ pool-8-thread-1] org.apache.jackrabbit.oak.plugins.index.AsyncIndexUpdate Reindexing (async) completed for indexes: [ /oak:index/counter*(708), /oak:index/authorizables*(159), /oak:index/cqPageLucene*(1913), /oak:index/ntBaseLucene*(444), /oak:index/cqTagLucene*(512), /oak:index/workflowDataLucene*(116)] in 30.36 s
-
If reindexing doesn’t complete and has gone into a loop then see the section below titled Analyzing Failed Reindexing.
-
The asynchronous index activity can be identified at INFO level by grepping on:
org.apache.jackrabbit.plugins.index.IndexUpdate, and org.apache.jackrabbit.plugins.index.AsyncIndexUpdate
Async Indexing Performance Issues
-
Enable logging to validate timing:
TRACE Level (only in Oak 1.0.17 or later):
org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndexEditorContext.perf
DEBUG Level:
org.apache.jackrabbit.oak.plugins.index.AsyncIndexUpdate org.apache.jackrabbit.oak.plugins.index.IndexUpdate org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndexEditor org.apache.jackrabbit.oak.plugins.index.lucene.IndexTracker org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndexEditorContext org.apache.jackrabbit.oak.plugins.index.lucene.IndexCopier
Logs like these would be generated showing the timing of indexing:
09.07.2015 08:13:38.401 *TRACE* [ 192.168.193.1 [ 1436444018387] POST /content/dam/site/test.createasset.html HTTP/1.1] org.apache.jackrabbit.oak.jcr.operations.writes [ session-101777] Adding node [ /content/dam/site/test/jackrabbit-oak7.png] 09.07.2015 08:13:38.583 *TRACE* [ 192.168.193.1 [ 1436444018387] POST /content/dam/site/test.createasset.html HTTP/1.1] org.apache.jackrabbit.oak.jcr.operations.writes [ session-101777] save 09.07.2015 08:13:42.823 *DEBUG* [ pool-9-thread-1] org.apache.jackrabbit.oak.plugins.index.AsyncIndexUpdate Running background index task async 09.07.2015 08:13:42.963 *TRACE* [ pool-9-thread-1] org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndexEditor [ /oak:index/lucene] Indexed document for /content/dam/site/test/jackrabbit-oak7.png/jcr:content/related is Document<stored,indexed,omitNorms,indexOptions=DOCS_ONLY<:path:/content/dam/site/test/jackrabbit-oak7.png/jcr:content/related> indexed,tokenized<:fulltext:related>> 09.07.2015 08:13:43.579 *DEBUG* [ pool-9-thread-1] org.apache.jackrabbit.oak.plugins.index.IndexUpdate Indexing report - /oak:index/siteDamIndex(2) - /oak:index/lucene(15) 09.07.2015 08:13:43.779 *TRACE* [ oak-lucene-0] org.apache.jackrabbit.oak.plugins.index.lucene.IndexTracker.perf [ /oak:index/siteDamIndex] Index found to be updated. Reopening the IndexNode [ took 150ms] 09.07.2015 08:13:45.248 *TRACE* [ oak-lucene-0] org.apache.jackrabbit.oak.plugins.index.lucene.IndexCopier.perf [ /oak:index/lucene] Copied 0 files totaling 0 B [ took 1465ms] 09.07.2015 08:13:45.248 *TRACE* [ oak-lucene-0] org.apache.jackrabbit.oak.plugins.index.lucene.IndexCopier [ /oak:index/lucene] opening remote only file segments.gen 09.07.2015 08:13:45.361 *TRACE* [ oak-lucene-0] org.apache.jackrabbit.oak.plugins.index.lucene.IndexTracker.perf [ /oak:index/lucene] Index found to be updated. Reopening the IndexNode [ took 1581ms]
-
Take a series of thread dumps and search for the thread containing AsyncIndexUpdate in the stack to see where most of the indexing time is spent or if the indexing is waiting on some other threads. To Take thread dumps see this article.
-
Set the Boolean property saveDirectoryListing=true (only if using Oak 1.0.16, if later version then not needed)and set property excludedPaths=
[
/var, /etc/workflow/instances, /jcr:system]
on the /oak:index/lucene OOTB index to optimize it. -
Click Save All.
-
When you create custom lucene property indexes, make sure to set saveDirectoryListing=true (Boolean) if using Oak 1.0.16 and set includedPaths (String
[
]
) property to restrict your index to only index certain paths. -
Go to this URL
/system/console/configMgr/org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndexProviderService
Copy on Read, Copy on Write(only safe to use in Oak 1.0.18 and later) and “Prefetch Index Files" (only in Oak 1.0.18 and later)
Analyzing Failed Reindexing
-
When async indexing fails then when you check the Async Indexer stats JMX UI (link below), you would see that LastIndexedTime is an old date and time:
http://localhost:4502/system/console/jmx/org.apache.jackrabbit.oak%3Aid%3D11%2Cname%3D"async"%2Ctype%3D"IndexStats"
-
When reindexing fails it goes into a loop like this:
08.01.2015 01:22:04.474 *INFO* [ pool-9-thread-2] org.apache.jackrabbit.oak.plugins.index.IndexUpdate Reindexing will be performed for following indexes [ /oak:index/damFileSize, /oak:index/lucene, /oak:index/cqLastModified] 08.01.2015 01:52:08.365 *INFO* [ pool-9-thread-5] org.apache.jackrabbit.oak.plugins.index.IndexUpdate Reindexing will be performed for following indexes [ /oak:index/damFileSize, /oak:index/lucene, /oak:index/cqLastModified] 08.01.2015 09:33:23.306 *INFO* [ pool-9-thread-5] org.apache.jackrabbit.oak.plugins.index.IndexUpdate Reindexing will be performed for following indexes [ /oak:index/damFileSize, /oak:index/lucene, /oak:index/cqLastModified]
-
If you see that then you should enable debug level logging for
org.apache.jackrabbit.oak.plugins.index
-
The next time the indexing loops with the log message above a debug log message above it would give more details as to why indexing is failing. Then you can address the issue, whether it is a corrupt node, missing blob or other problem.