Disabling full-text search by MIME type with custom Tika configuration in AEM
This article explains how to customize the Tika configuration to disable full-text search for specific file MIME types.
Description
Environment
- Adobe Experience Manager 6.1
- Adobe Experience Manager 6.2
- Adobe Experience Manager 6.3
- Adobe Experience Manager 6.4
Issue/Symptoms
How to disable full-text search by file MIME type with a custom Tika configuration in Adobe Experience Manager (AEM).
Resolution
Adobe recommends disabling full-text search for binary files through the Tika configuration of the index.
For more details on Adobe’s recommendation and how to optimize asset performance, refer to the asset performance tuning Helpx article.
Solution 1:
To apply Adobe’s recommendation, follow these steps:
- Install the package that is provided.
- Navigate to the following locations using CRX/DE:
/oak:index/lucene/tika/config.xml
/oak:index/damAssetLucene/tika/config.xml
- Add the file MIME type that you want to disable:
<mime>application/zip</mime>
- Save the changes.
- Set the boolean property refresh=true for these nodes using CRX/DE and save the changes:
  - /oak:index/lucene
  - /oak:index/damAssetLucene
- Wait for the changes to take effect.
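The refresh step above can also be scripted instead of being done by hand in CRX/DE. The following is a minimal sketch and not part of Adobe's procedure: it assumes the Sling POST servlet is reachable on the instance, and the AEM_URL and AEM_CRED defaults (http://localhost:4502 and admin:admin) are placeholder values to replace with your own. It only prints the curl commands so that they can be reviewed before being run.

```shell
#!/bin/sh
# Sketch only: AEM_URL and AEM_CRED are assumed placeholder values,
# not taken from the article. Adjust them for your instance.
AEM_URL="${AEM_URL:-http://localhost:4502}"
AEM_CRED="${AEM_CRED:-admin:admin}"

# Print the curl command that would set refresh=true (typed as Boolean)
# on one index definition node through the Sling POST servlet.
refresh_command() {
  printf 'curl -u %s -F refresh=true -F refresh@TypeHint=Boolean %s%s\n' \
    "$AEM_CRED" "$AEM_URL" "$1"
}

# The two index nodes named in the steps above.
for index in /oak:index/lucene /oak:index/damAssetLucene; do
  refresh_command "$index"
done
```

Printing instead of executing keeps the sketch safe to run anywhere; copy a printed command into a terminal once the URL and credentials are correct.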
Solution 2:
Alternatively, edit the Tika configuration inside the Oak-Lucene bundle:
- Search for Oak-Lucene in the AEM web console and note the bundle number.
- Shut down the AEM instance.
- Navigate to the /crx-quickstart/launchpad/felix/bundlexxx directory.
- Go to the subdirectory labeled versionX.Y, such as felix/bundle102/version0.2, using the cd command.
- Extract the tika-config.xml file from the jar file:
jar -xvf bundle.jar org/apache/jackrabbit/oak/plugins/index/lucene/tika-config.xml
- Edit the tika-config.xml file:
vi org/apache/jackrabbit/oak/plugins/index/lucene/tika-config.xml
- For instance, add the file mime type that you want to disable:
<mime>application/zip</mime>
- Save the changes and write the edited file back into bundle.jar.
- Restart AEM and verify changes by searching for zip file assets.
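The extract, edit, and repackage steps of Solution 2 could be wrapped in a few shell functions. This is a minimal sketch under stated assumptions: it is meant to be run from the bundle's versionX.Y directory while AEM is stopped, the function names are hypothetical, and the jar update option (-u) is the standard JDK way to write a file back into an archive rather than a command quoted from the article.

```shell
#!/bin/sh
# Sketch only: run from the bundle's versionX.Y directory with AEM stopped.
# CONFIG is the path given in the article; the function names are hypothetical.
CONFIG="org/apache/jackrabbit/oak/plugins/index/lucene/tika-config.xml"

# Extract only the Tika configuration from the bundle jar.
extract_config() {
  jar -xvf bundle.jar "$CONFIG"
}

# Succeed if FILE ($1) contains a <mime> entry for TYPE ($2).
has_mime() {
  grep -qF "<mime>$2</mime>" "$1"
}

# Write the edited configuration back into the bundle jar.
update_bundle() {
  jar -uvf bundle.jar "$CONFIG"
}
```

A possible sequence is extract_config, then editing the file at $CONFIG, then has_mime "$CONFIG" application/zip && update_bundle, so the jar is only rewritten once the new entry is actually present in the file.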