Disabling full-text search by mime type with custom Tika configuration in AEM

This article explains how to customize the Tika configuration to disable full-text search for specific file MIME types.

Description

Environment

  • Adobe Experience Manager 6.1
  • Adobe Experience Manager 6.2
  • Adobe Experience Manager 6.3
  • Adobe Experience Manager 6.4

Issue/Symptoms

You want to disable full-text search for certain file MIME types by using a custom Tika configuration in Adobe Experience Manager (AEM).

Resolution

Adobe recommends disabling full-text search for binary files through the Tika configuration of the index.

For more details on Adobe’s recommendation and how to optimize asset performance, refer to the asset performance tuning Helpx article.

Solution 1:

To apply Adobe’s recommendation, follow these steps:

  1. Install the package that is provided.

  2. Navigate to the following locations using CRX/DE:

    • /oak:index/lucene/tika/config.xml
    • /oak:index/damAssetLucene/tika/config.xml
  3. In each config.xml, add a <mime> entry for the file MIME type that you want to exclude from full-text search, for example:

    • <mime>application/zip</mime>
  4. Save the changes.

  5. Set the boolean property refresh=true for these nodes using CRX/DE and save the changes:

    • /oak:index/lucene

    • /oak:index/damAssetLucene

  6. Wait for the change to take effect. The index definitions are refreshed on the next indexing cycle.
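
The edit in step 3 can be pictured as follows. This is a minimal sketch that assumes config.xml follows the standard Apache Tika configuration layout, in which MIME types listed under the EmptyParser entry are not parsed and therefore contribute no text to the full-text index. The application/x-archive entry is an illustrative assumption; keep whatever entries your installed file already contains and only add the new line.

```xml
<properties>
  <parsers>
    <!-- Assumed default: all other types go through Tika's normal parsers -->
    <parser class="org.apache.tika.parser.DefaultParser"/>
    <!-- Types listed here are handled by EmptyParser, which extracts no text -->
    <parser class="org.apache.tika.parser.EmptyParser">
      <!-- Hypothetical existing entry; your file may list different types -->
      <mime>application/x-archive</mime>
      <!-- New entry added in step 3 -->
      <mime>application/zip</mime>
    </parser>
  </parsers>
</properties>
```

Placing the new entry next to the existing <mime> lines, rather than in a new parser element, keeps the file close to its shipped form and makes later comparisons easier.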

Solution 2:

Alternatively, edit the Tika configuration inside the Oak Lucene bundle:

  1. Search for Oak-Lucene in the AEM web console and note the bundle number.
  2. Shut down the AEM instance.
  3. Navigate to the /crx-quickstart/launchpad/felix/bundlexxx directory, where xxx is the bundle number noted in step 1.
  4. Change to the versionX.Y subdirectory, such as felix/bundle102/version0.2, using the cd command.
  5. Extract the tika-config.xml file from the jar file:
    • jar -xvf bundle.jar org/apache/jackrabbit/oak/plugins/index/lucene/tika-config.xml
  6. Edit the tika-config.xml file:
    • vi org/apache/jackrabbit/oak/plugins/index/lucene/tika-config.xml
  7. Add the file MIME type that you want to disable, for instance:
    • <mime>application/zip</mime>
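  8. Save the file and write the change back into bundle.jar, for example with the jar tool's update option:
    • jar -uvf bundle.jar org/apache/jackrabbit/oak/plugins/index/lucene/tika-config.xml
  9. Restart AEM and verify the change by searching for zip file assets.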