Queries and Indexing in AEM

Kinjal P Darji

May 3, 2022

In Jackrabbit 2, all content was indexed by default. Oak, introduced with AEM 6, does not index content on its own; an index definition has to be present or created under the oak:index node.
Queries can often be avoided while rendering components. There are two alternatives: traversing nodes and prefetching results.
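To illustrate the traversal alternative, here is a minimal sketch of a bounded tree walk that collects nodes of one type below a known parent, with no query involved. The Node class is a hypothetical stand-in written for this example, not an AEM or Sling type; in a real component the same walk would run over the repository's own resource or node objects.

```java
import java.util.ArrayList;
import java.util.List;

final class TraversalSketch {

    /** Hypothetical stand-in for a repository node; not an AEM or Sling type. */
    static final class Node {
        final String path;
        final String type;
        final List<Node> children = new ArrayList<>();

        Node(String path, String type) {
            this.path = path;
            this.type = type;
        }

        Node add(Node child) {
            children.add(child);
            return this;
        }
    }

    /** Collects paths of nodes of the given type, descending at most maxDepth levels below root. */
    static List<String> collect(Node root, String type, int maxDepth) {
        List<String> found = new ArrayList<>();
        walk(root, type, maxDepth, found);
        return found;
    }

    private static void walk(Node node, String type, int depthLeft, List<String> found) {
        if (type.equals(node.type)) {
            found.add(node.path);
        }
        if (depthLeft == 0) {
            return; // bound the walk so it stays cheap on large trees
        }
        for (Node child : node.children) {
            walk(child, type, depthLeft - 1, found);
        }
    }

    public static void main(String[] args) {
        Node root = new Node("/content/site", "cq:Page")
            .add(new Node("/content/site/en", "cq:Page")
                .add(new Node("/content/site/en/news", "cq:Page")))
            .add(new Node("/content/site/assets", "sling:Folder"));
        // Only the root and its direct children are visited with maxDepth = 1.
        System.out.println(collect(root, "cq:Page", 1));
    }
}
```

The depth limit is the important design choice: traversal only beats a query when the part of the tree being walked is small and its location is known in advance.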
AEM allows querying in three ways: 1. via the QueryBuilder API (recommended), 2. XPath (recommended), and 3. SQL2.
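As a rough comparison of the three forms, the sketch below expresses one search (pages under a path that use a given template) as a QueryBuilder predicate map, an XPath string and a SQL2 string. It only builds the query text using the standard library; the class name, the example path and the example template are assumptions made for illustration, and executing the queries through the AEM or JCR APIs is left out.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class QueryForms {

    /** Doubles single quotes so a value can sit inside a quoted query literal. */
    static String escape(String value) {
        return value.replace("'", "''");
    }

    /** Predicates in the key/value form that QueryBuilder accepts. */
    static Map<String, String> queryBuilderPredicates(String path, String template) {
        Map<String, String> predicates = new LinkedHashMap<>();
        predicates.put("path", path);
        predicates.put("type", "cq:Page");
        predicates.put("property", "jcr:content/cq:template");
        predicates.put("property.value", template);
        predicates.put("p.limit", "20"); // keep the result set small
        return predicates;
    }

    /** The same search written as XPath. */
    static String xpath(String path, String template) {
        return "/jcr:root" + path + "//element(*, cq:Page)"
            + "[jcr:content/@cq:template = '" + escape(template) + "']";
    }

    /** The same search written as SQL2. */
    static String sql2(String path, String template) {
        return "SELECT * FROM [cq:Page] AS p"
            + " WHERE ISDESCENDANTNODE(p, '" + escape(path) + "')"
            + " AND p.[jcr:content/cq:template] = '" + escape(template) + "'";
    }

    public static void main(String[] args) {
        String path = "/content/site";                 // example path, assumed
        String template = "/apps/site/templates/page"; // example template, assumed
        System.out.println(queryBuilderPredicates(path, template));
        System.out.println(xpath(path, template));
        System.out.println(sql2(path, template));
    }
}
```

Whichever form is used, the query still needs a matching index to run efficiently.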
DEBUG logging can be enabled for these three packages to get additional information: org.apache.jackrabbit.oak.plugins.index, org.apache.jackrabbit.oak.query and com.day.cq.search.
To see the index statistics, log in to the JMX console and search for Lucene Index Statistics. Also look for the IndexStats MBean.
Set low thresholds for oak.queryLimitInMemory (e.g. 10000) and oak.queryLimitReads (e.g. 5000), so that queries that read or hold too many nodes fail early instead of exhausting resources.
Oak offers three types of indexing: Lucene indexing, property indexing and Solr indexing.
Lucene indexing is asynchronous, and it is slow while data is being written or modified. Lucene cannot enforce uniqueness constraints.
Solr should be considered only if the AEM instance does not have enough CPU capacity to handle the queries in a search-driven solution. Solr can also be configured to crawl the data when there are multiple sites hosted on different platforms and data has to be aggregated from all of them.
CopyOnRead can be configured when the NodeStore is stored remotely; it copies the remote index to the local file system when it is read, which can speed up queries.
Before removing an index, it is preferable to disable it first.
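As a sketch of what an index definition and its disabled form might look like, the code below models the properties of a property-index node under oak:index as a plain map, then derives a disabled copy by setting type to "disabled", which is the value Oak treats as a disabled index. The class and method names are hypothetical; in practice the node would be created and changed through the JCR API or a content package, not through a map.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class IndexDefinitionSketch {

    /** Properties of a property-index definition node placed under /oak:index. */
    static Map<String, Object> propertyIndex(String... propertyNames) {
        Map<String, Object> definition = new LinkedHashMap<>();
        definition.put("jcr:primaryType", "oak:QueryIndexDefinition");
        definition.put("type", "property");
        definition.put("propertyNames", propertyNames.clone());
        return definition;
    }

    /** Returns a copy of the definition with the index switched off but still present. */
    static Map<String, Object> disabled(Map<String, Object> definition) {
        Map<String, Object> copy = new LinkedHashMap<>(definition);
        copy.put("type", "disabled");
        return copy;
    }

    public static void main(String[] args) {
        Map<String, Object> active = propertyIndex("cq:template");
        Map<String, Object> off = disabled(active);
        System.out.println(active.get("type") + " -> " + off.get("type"));
    }
}
```

Disabling first keeps the definition in place, so the index can be switched back on if queries turn out to depend on it; the node can be deleted once nothing is affected.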
Oak re-indexing should be avoided and done only if the Oak index configuration changes, the Lucene index binary is missing, or the index is corrupted.
