Analyzing AEM Indexing Issues

Description

The objective of this article is to have details on how to troubleshoot all the common indexing issues.

Resolution



Analyzing Slow Queries



 


Going to Tools - Operations - Dashboard - Diagnosis - Query Performance will show the list of slow queries.

See this document for details on analyzing the slow queries.

In non-production environments, ACS AEM tools package can be installed to provide an explain query tool for further debugging.




 

Reindexing Async Indexes






Note:

Re-indexing typically does not resolve query issues, such as queries not returning the right results. Also, reindexing can take a long time Reindexing is to be avoided unless covered by a reason listed there.




 


How to reindex an asynchronous index

  • Login to your AEM instance
  • Open http://aemhost:port/crx/de/index.jsp
  • Browse to the index definition under /oak:index
  • Set a Boolean property reindex=true on the index to reindex, then save. When indexing starts you would see a log message like this:
23.06.2015 14:26:23.070 \*INFO\* FelixStartLevel org.apache.jackrabbit.oak.plugins.index.IndexUpdate Reindexing will be performed for following indexes: /oak:index/cqAcUUID, /oak:index/nodetype, /oak:index/deviceIdentificationMode, /oak:index/campaignpath, /oak:index/active, /oak:index/jcrFrozenMixinTypes
  • While reindexing is in progress the ‘status’ attribute of IndexStatsMBean would have value ‘running’.
 23.06.2015 14:26:23.517 \*INFO\* FelixStartLevel org.apache.jackrabbit.oak.plugins.index.IndexUpdate Reindexing Traversed #10000 /jcr:system/jcr:versionStorage/c8/5f...23.06.2015 14:28:51.999 \*INFO\* pool-8-thread-1 org.apache.jackrabbit.oak.plugins.index.IndexUpdate Indexing report    - /oak:index/counter\*(708)    - /oak:index/authorizables\*(159)    - /oak:index/cqPageLucene\*(1913)    - /oak:index/ntBaseLucene\*(444)    - /oak:index/cqTagLucene\*(512)    - /oak:index/workflowDataLucene\*(116)
  • You can confirm that indexing completed by checking your index definition node, the property reindexCount should have been incremented and the reindex boolean should be set to “false

    In the log file, if indexing completed successfully, the following entry will be displayed

23.06.2015 14:28:52.009 \*INFO\* pool-8-thread-1 org.apache.jackrabbit.oak.plugins.index.AsyncIndexUpdate Reindexing (async) completed for indexes: /oak:index/counter\*(708), /oak:index/authorizables\*(159), /oak:index/cqPageLucene\*(1913), /oak:index/ntBaseLucene\*(444), /oak:index/cqTagLucene\*(512), /oak:index/workflowDataLucene\*(116) in 30.36 s
  • If reindexing doesn’t complete and has gone into a loop then see the section below.

The asynchronous index activity can be identified at INFO level by grepping on org.apache.jackrabbit.plugins.index.IndexUpdate and org.apache.jackrabbit.plugins.index.AsyncIndexUpdate.
 

Async Indexing Performance Issues



  • Enable logging to validate timing:

TRACE Level (only in Oak 1.0.17 or later):

org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndexEditorContext.perf

DEBUG Level:

org.apache.jackrabbit.oak.plugins.index.AsyncIndexUpdate

org.apache.jackrabbit.oak.plugins.index.IndexUpdate

org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndexEditor

org.apache.jackrabbit.oak.plugins.index.lucene.IndexTracker

org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndexEditorContext

org.apache.jackrabbit.oak.plugins.index.lucene.IndexCopier

Logs like this would be generated showing timing of indexing:

09.07.2015 08:13:38.401 \*TRACE\* 192.168.193.1 1436444018387 POST /content/dam/site/test.createasset.html HTTP/1.1 org.apache.jackrabbit.oak.jcr.operations.writes session-101777 Adding node /content/dam/site/test/jackrabbit-oak7.png
09.07.2015 08:13:38.583 \*TRACE\* 192.168.193.1 1436444018387 POST /content/dam/site/test.createasset.html HTTP/1.1 org.apache.jackrabbit.oak.jcr.operations.writes session-101777 save
09.07.2015 08:13:42.823 \*DEBUG\* pool-9-thread-1 org.apache.jackrabbit.oak.plugins.index.AsyncIndexUpdate Running background index task async
09.07.2015 08:13:42.963 \*TRACE\* pool-9-thread-1 org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndexEditor /oak:index/lucene Indexed document for /content/dam/site/test/jackrabbit-oak7.png/jcr:content/related is Documentstored,indexed,omitNorms,indexOptions=DOCS_ONLY:path:/content/dam/site/test/jackrabbit-oak7.png/jcr:content/related indexed,tokenized:fulltext:related
09.07.2015 08:13:43.579 \*DEBUG\* pool-9-thread-1 org.apache.jackrabbit.oak.plugins.index.IndexUpdate Indexing report
  • /oak:index/siteDamIndex(2)
  • /oak:index/lucene(15)
09.07.2015 08:13:43.779 \*TRACE\* oak-lucene-0 org.apache.jackrabbit.oak.plugins.index.lucene.IndexTracker.perf /oak:index/siteDamIndex Index found to be updated. Reopening the IndexNode took 150ms
09.07.2015 08:13:45.248 \*TRACE\* oak-lucene-0 org.apache.jackrabbit.oak.plugins.index.lucene.IndexCopier.perf /oak:index/lucene Copied 0 files totaling 0 B took 1465ms
09.07.2015 08:13:45.248 \*TRACE\* oak-lucene-0 org.apache.jackrabbit.oak.plugins.index.lucene.IndexCopier /oak:index/lucene opening remote only file segments.gen
09.07.2015 08:13:45.361 \*TRACE\* oak-lucene-0 org.apache.jackrabbit.oak.plugins.index.lucene.IndexTracker.perf /oak:index/lucene Index found to be updated. Reopening the IndexNode took 1581ms
  • Take a series of thread dumps and search for the thread containing AsyncIndexUpdate in the stack to see where most of the indexing time is spent or if the indexing is waiting on some other threads. To Take thread dumps see this article.
  • Set the Boolean property saveDirectoryListing=true (only if using Oak 1.0.16, if later version then not needed)and set property excludedPaths=/var, /etc/workflow/instances, /jcr:system on the /oak:index/lucene OOTB index to optimize it.
  • Click “Save All”
  • When you create custom lucene property indexes, make sure to set saveDirectoryListing=true (Boolean) if using Oak 1.0.16 and set includedPaths (String) property to restrict your index to only index certain paths.

Go to this URL /system/console/configMgr/org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndexProviderService “Copy on Read”, “Copy on Write” (only safe to use in Oak 1.0.18 and later) and “Prefetch Index Files" (only in Oak 1.0.18 and later)



Analyzing Failed Reindexing



  • When async indexing fails then when you check the Async Indexer stats JMX UI (link below), you would see that LastIndexedTime is an old date and time:
http://localhost:4502/system/console/jmx/org.apache.jackrabbit.oak%3Aid%3D11%2Cname%3D"async"%2Ctype%3D"IndexStats"
  • Also, when reindexing fails it goes into a loop like this:
08.01.2015 01:22:04.474 \*INFO\* pool-9-thread-2 org.apache.jackrabbit.oak.plugins.index.IndexUpdate Reindexing will be performed for following indexes /oak:index/damFileSize, /oak:index/lucene, /oak:index/cqLastModified08.01.2015 01:52:08.365 \*INFO\* pool-9-thread-5 org.apache.jackrabbit.oak.plugins.index.IndexUpdate Reindexing will be performed for following indexes /oak:index/damFileSize, /oak:index/lucene, /oak:index/cqLastModified08.01.2015 09:33:23.306 \*INFO\* pool-9-thread-5 org.apache.jackrabbit.oak.plugins.index.IndexUpdate Reindexing will be performed for following indexes /oak:index/damFileSize, /oak:index/lucene, /oak:index/cqLastModified
  • If you see that then you should enable debug level logging for org.apache.jackrabbit.oak.plugins.index

  • The next time the indexing loops with the log message above a debug log message above it would give more details as to why indexing is failing.  Then you can address the issue, whether it is a corrupt node, missing blob or other problem.

On this page