Analyzing AEM Indexing Issues

This article explains how to analyze slow queries, reindex asynchronous indexes and how to troubleshoot AEM indexing issues.

Description description

Environment

Experience Manager

Issue/Symptoms

How to troubleshoot all the common indexing issues?

Resolution resolution

Analyzing Slow Queries

Going to Tools  > Operations  > Dashboard  > Diagnosis  > Query Performance will show the list of slow queries.

See this document for details on analyzing the slow queries.

In non-production environments, ACS AEM tools package can be installed to provide an explain query tool for further debugging.

Reindexing Async Indexes

Note: Re-indexing typically does not resolve query issues, such as queries not returning the right results. Also, reindexing can take a long time Reindexing is to be avoided unless covered by a reason listed there.

How to reindex an asynchronous index

  • Login to your AEM instance

  • Open http://aemhost:port/crx/de/index.jsp

  • Browse to the index definition under /oak:index

  • Set a Boolean property reindex=true on the index to reindex, then save. When indexing starts you would see a log message like this:

    code language-none
    23.06.2015 14:26:23.070 *INFO* [ FelixStartLevel]
    org.apache.jackrabbit.oak.plugins.index.IndexUpdate Reindexing will be performed for following indexes:
    [ /oak:index/cqAcUUID, /oak:index/nodetype, /oak:index/deviceIdentificationMode, /oak:index/campaignpath, /oak:index/active, /oak:index/jcrFrozenMixinTypes]
    
  • While reindexing is in progress the ‘status’ attribute of IndexStatsMBean would have value ‘running’.

    code language-none
    23.06.2015 14:26:23.517 *INFO* [ FelixStartLevel]
    org.apache.jackrabbit.oak.plugins.index.IndexUpdate Reindexing Traversed #10000 /jcr:system/jcr:versionStorage/c8/5f...
    23.06.2015 14:28:51.999 *INFO* [ pool-8-thread-1]  org.apache.jackrabbit.oak.plugins.index.IndexUpdate Indexing
    report    - /oak:index/counter
    (708)    - /oak:index/authorizables
    (159)    - /oak:index/cqPageLucene
    (1913)    - /oak:index/ntBaseLucene
    (444)    - /oak:index/cqTagLucene
    (512)    - /oak:index/workflowDataLucene*(116)
    
  • You can confirm that indexing completed by checking your index definition node, the property reindexCount should have been incremented and the reindex boolean should be set to “false”.

    In the log file, if indexing completed successfully, the following entry will be displayed:

    code language-none
    23.06.2015 14:28:52.009 *INFO* [ pool-8-thread-1]
    org.apache.jackrabbit.oak.plugins.index.AsyncIndexUpdate Reindexing (async) completed for indexes:
    [ /oak:index/counter*(708), /oak:index/authorizables*(159),
    /oak:index/cqPageLucene*(1913), /oak:index/ntBaseLucene*(444),
    /oak:index/cqTagLucene*(512), /oak:index/workflowDataLucene*(116)]
    in 30.36 s
    
  • If reindexing doesn’t complete and has gone into a loop then see the section below titled Analyzing Failed Reindexing.

  • The asynchronous index activity can be identified at INFO level by grepping on:

    code language-none
    org.apache.jackrabbit.plugins.index.IndexUpdate, and
    org.apache.jackrabbit.plugins.index.AsyncIndexUpdate
    

Async Indexing Performance Issues

  1. Enable logging to validate timing:

    TRACE Level (only in Oak 1.0.17 or later):

    code language-none
    org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndexEditorContext.perf
    

    DEBUG Level:

    code language-none
    org.apache.jackrabbit.oak.plugins.index.AsyncIndexUpdate
    org.apache.jackrabbit.oak.plugins.index.IndexUpdate
    org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndexEditor
    org.apache.jackrabbit.oak.plugins.index.lucene.IndexTracker
    org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndexEditorContext
    org.apache.jackrabbit.oak.plugins.index.lucene.IndexCopier
    

    Logs like these would be generated showing timing of indexing:

    code language-none
    09.07.2015 08:13:38.401 *TRACE* [ 192.168.193.1 [ 1436444018387]  POST /content/dam/site/test.createasset.html HTTP/1.1]  org.apache.jackrabbit.oak.jcr.operations.writes [ session-101777]  Adding node [ /content/dam/site/test/jackrabbit-oak7.png]
    
    
    09.07.2015 08:13:38.583 *TRACE* [ 192.168.193.1 [ 1436444018387]  POST /content/dam/site/test.createasset.html HTTP/1.1]  org.apache.jackrabbit.oak.jcr.operations.writes [ session-101777]  save
    
    
    09.07.2015 08:13:42.823 *DEBUG* [ pool-9-thread-1]  org.apache.jackrabbit.oak.plugins.index.AsyncIndexUpdate Running background index task async
    
    
    09.07.2015 08:13:42.963 *TRACE* [ pool-9-thread-1]  org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndexEditor [ /oak:index/lucene]  Indexed document for /content/dam/site/test/jackrabbit-oak7.png/jcr:content/related is Document<stored,indexed,omitNorms,indexOptions=DOCS_ONLY<:path:/content/dam/site/test/jackrabbit-oak7.png/jcr:content/related> indexed,tokenized<:fulltext:related>>
    
    
    09.07.2015 08:13:43.579 *DEBUG* [ pool-9-thread-1]  org.apache.jackrabbit.oak.plugins.index.IndexUpdate Indexing report
    - /oak:index/siteDamIndex(2)
    - /oak:index/lucene(15)
    
    
    09.07.2015 08:13:43.779 *TRACE* [ oak-lucene-0]  org.apache.jackrabbit.oak.plugins.index.lucene.IndexTracker.perf [ /oak:index/siteDamIndex]  Index found to be updated. Reopening the IndexNode [ took 150ms]
    
    
    09.07.2015 08:13:45.248 *TRACE* [ oak-lucene-0]  org.apache.jackrabbit.oak.plugins.index.lucene.IndexCopier.perf [ /oak:index/lucene]  Copied 0 files totaling 0 B [ took 1465ms]
    
    
    09.07.2015 08:13:45.248 *TRACE* [ oak-lucene-0]  org.apache.jackrabbit.oak.plugins.index.lucene.IndexCopier [ /oak:index/lucene]  opening remote only file segments.gen
    
    
    09.07.2015 08:13:45.361 *TRACE* [ oak-lucene-0]  org.apache.jackrabbit.oak.plugins.index.lucene.IndexTracker.perf [ /oak:index/lucene]  Index found to be updated. Reopening the IndexNode [ took 1581ms]
    
  2. Take a series of thread dumps and search for the thread containing AsyncIndexUpdate in the stack to see where most of the indexing time is spent or if the indexing is waiting on some other threads. To Take thread dumps see this article.

  3. Set the Boolean property saveDirectoryListing=true (only if using Oak 1.0.16, if later version then not needed)and set property excludedPaths=[ /var, /etc/workflow/instances, /jcr:system] on the /oak:index/lucene OOTB index to optimize it.

  4. Click Save All.

  5. When you create custom lucene property indexes, make sure to set saveDirectoryListing=true (Boolean) if using Oak 1.0.16 and set includedPaths (String[ ] ) property to restrict your index to only index certain paths.

  6. Go to this URL /system/console/configMgr/org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndexProviderService “Copy on Read”, “Copy on Write” (only safe to use in Oak 1.0.18 and later) and “Prefetch Index Files" (only in Oak 1.0.18 and later)

Analyzing Failed Reindexing

  1. When async indexing fails then when you check the Async Indexer stats JMX UI (link below), you would see that LastIndexedTime is an old date and time: http://localhost:4502/system/console/jmx/org.apache.jackrabbit.oak%3Aid%3D11%2Cname%3D"async"%2Ctype%3D"IndexStats"

  2. When reindexing fails it goes into a loop like this:

    code language-none
    08.01.2015 01:22:04.474 *INFO* [ pool-9-thread-2]
    org.apache.jackrabbit.oak.plugins.index.IndexUpdate Reindexing will be performed for following indexes
    [ /oak:index/damFileSize, /oak:index/lucene, /oak:index/cqLastModified] 08.01.2015 01:52:08.365 *INFO* [ pool-9-thread-5]
    org.apache.jackrabbit.oak.plugins.index.IndexUpdate Reindexing will be performed for following indexes
    [ /oak:index/damFileSize, /oak:index/lucene, /oak:index/cqLastModified] 08.01.2015 09:33:23.306 *INFO* [ pool-9-thread-5]
    org.apache.jackrabbit.oak.plugins.index.IndexUpdate Reindexing will be performed for following indexes
    [ /oak:index/damFileSize, /oak:index/lucene, /oak:index/cqLastModified]
    
  3. If you see that then you should enable debug level logging for org.apache.jackrabbit.oak.plugins.index

  4. The next time the indexing loops with the log message above a debug log message above it would give more details as to why indexing is failing. Then you can address the issue, whether it is a corrupt node, missing blob or other problem.

recommendation-more-help
3d58f420-19b5-47a0-a122-5c9dab55ec7f