AEM Revision Cleanup Quick Guide
Each update to the repository creates a new content revision, so the repository grows with every update. To avoid uncontrolled growth, old revisions need to be cleaned up. This process is called revision cleanup.
There are two versions of this process: online and offline. Offline cleanup requires the AEM instance to be shut down. Online cleanup, which runs while the instance is up, is supported since AEM 6.3.
Revision cleanup has three phases: estimation, compaction, and cleanup. The estimation phase decides whether the second phase, compaction, should run. In the compaction phase, the tar files are rewritten, leaving out any unused content. In the cleanup phase, the old segments are removed and the garbage is collected.
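The three phases can be pictured with a small Python sketch. This is a toy model, not Oak's implementation: the `Segment` class, the `reachable` flag, and the 25% garbage threshold are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    name: str
    size: int
    reachable: bool  # assumption: True if current content still uses this segment


def estimate(segments, threshold=0.25):
    """Estimation: is there enough garbage to justify compaction? (threshold is hypothetical)"""
    total = sum(s.size for s in segments)
    garbage = sum(s.size for s in segments if not s.reachable)
    return total > 0 and garbage / total >= threshold


def compact(segments):
    """Compaction: rewrite only the content that is still in use."""
    return [Segment(s.name + "-compacted", s.size, True) for s in segments if s.reachable]


def cleanup(old_segments, compacted):
    """Cleanup: drop the old segments and report how much space was reclaimed."""
    reclaimed = sum(s.size for s in old_segments) - sum(s.size for s in compacted)
    return compacted, reclaimed


def revision_cleanup(segments):
    """Run the three phases in order; skip phases two and three if estimation says no."""
    if not estimate(segments):
        return segments, 0
    return cleanup(segments, compact(segments))
```

Calling `revision_cleanup` on a list where most of the size is unreachable returns the rewritten segments and the reclaimed size. A list with little garbage comes back untouched.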
Offline cleanup frees more space than online cleanup, because online cleanup has to account for the running instance's working set, which may keep additional segments from being collected.
Offline revision cleanup should be done only in exceptional cases, such as upgrading to a different storage format or when support advises it.
Online cleanup runs once a day on both author and publish instances. You can configure the daily and weekly maintenance windows under Tools > Operations > Dashboard > Maintenance.
Online cleanup supports Oak Segment Tar, the storage format used in AEM 6.3.
Online revision cleanup reclaims old revisions by generation. A fresh generation is created each time revision cleanup runs, so after the first run only one new generation exists and there is nothing to reclaim yet. Only content that is at least two generations old is reclaimed.
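A short Python sketch makes the generation rule easier to follow. It is a simplified model under two assumptions of mine: generations are numbered with plain integers, and the content that exists before the first run counts as generation 0.

```python
def run_cleanup(generations):
    """Simulate one cleanup run over a list of generation numbers.

    Each run creates a fresh generation, then reclaims every generation
    that is two or more generations older than the newest one.
    """
    newest = max(generations) + 1
    all_generations = generations + [newest]
    kept = [g for g in all_generations if newest - g < 2]
    reclaimed = [g for g in all_generations if newest - g >= 2]
    return kept, reclaimed


def simulate(runs):
    """Return (kept, reclaimed) after each of `runs` consecutive cleanup runs."""
    generations = [0]  # assumption: pre-existing content is generation 0
    history = []
    for _ in range(runs):
        generations, reclaimed = run_cleanup(generations)
        history.append((generations, reclaimed))
    return history
```

In this model the first run keeps generations 0 and 1 and reclaims nothing. The second run reclaims generation 0, the third reclaims generation 1, and so on.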
Maintenance windows should be scheduled outside the main production times, such as peak customer hours.
It is recommended to provision disk space two to three times larger than the expected repository size.
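The sizing rule is simple arithmetic, and a few lines of Python can turn it into a check. The function names and the default factor of 2 are my own choices for illustration. `shutil.disk_usage` is the standard-library call that reports the total size of the volume holding a path.

```python
import shutil


def disk_is_large_enough(expected_repo_bytes, disk_total_bytes, factor=2):
    """Return True if the disk is at least `factor` times the expected repository size."""
    return disk_total_bytes >= factor * expected_repo_bytes


def check_volume(path, expected_repo_bytes, factor=2):
    """Apply the same rule to the volume that holds `path`."""
    total = shutil.disk_usage(path).total
    return disk_is_large_enough(expected_repo_bytes, total, factor)
```

For example, a repository expected to reach 100 GB passes on a 200 GB disk with the default factor, but needs 300 GB when `factor=3`.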
Revisions must be at least 24 hours old to be collected.
If compaction cannot keep up with concurrent writes after several retry cycles, the system goes into forceCompact mode, as explained in more detail in the Oak documentation. During force compact, an exclusive write lock is acquired in order to finally commit the changes without any concurrent writes interfering.
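The retry-then-force idea can be sketched in Python as follows. This is a toy simulation, not Oak code: the `Store` class, the `writes_per_cycle` callback, and the `retry_count` value are hypothetical, and it assumes that compaction retries a few times before it forces the commit.

```python
import threading


class Store:
    """Toy repository head guarded by a write lock (hypothetical, not an Oak class)."""

    def __init__(self):
        self.head = 0
        self.write_lock = threading.Lock()

    def commit(self):
        # A normal writer takes the lock briefly to advance the head.
        with self.write_lock:
            self.head += 1


def compact(store, writes_per_cycle, retry_count=5):
    """Try to commit compacted state; force it under the write lock if retries run out.

    writes_per_cycle(cycle) returns how many concurrent commits land during that cycle.
    retry_count is an illustrative value, not a documented default.
    """
    for cycle in range(retry_count + 1):
        base = store.head
        # Compaction work would happen here while writers keep committing.
        for _ in range(writes_per_cycle(cycle)):
            store.commit()
        if store.head == base:
            return "compacted", cycle  # nothing changed during this cycle
    # Force compact: hold the lock so no writer can interfere with the final commit.
    with store.write_lock:
        return "force-compacted", retry_count + 1
```

On a quiet store the first cycle succeeds. On a store that receives writes in every cycle, the retries run out and the final commit happens while writers are blocked, which is why a busy instance can appear unresponsive for a moment.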
In a cold standby setup, only the primary instance needs to run revision cleanup. The standby instance does not need its own maintenance window for it. The standby instance's cleanup happens after the primary instance's cleanup finishes.
A repository integrity check is not required.
Move the maintenance window if traffic is high at the configured time. Extend the maintenance window if revision cleanup cannot finish within it. Provide more memory if required. If online revision cleanup has not completed successfully within a week, plan an offline revision cleanup.
To run offline revision cleanup, use the oak-run tool.
Increase the performance of offline cleanup by setting the following runtime arguments:
- -Dtar.memoryMapped
- -Dupdate.limit
- -Dcompress-interval
- -Dcompaction-progress-log
- -Dtar.PersistCompactionMap
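As a rough illustration, the Python helper below assembles an oak-run compaction command with options like the ones above passed as JVM system properties. The jar file name, the segment store path, and the option values in the example are placeholders I have assumed, so check them against your own installation and the oak-run version that matches your Oak release. The helper only builds the argument list and does not run anything.

```python
def build_compact_command(oak_run_jar, segmentstore, jvm_options=None):
    """Build the argument list for an offline compaction run with oak-run.

    jvm_options maps a system property name to its value, for example
    {"tar.memoryMapped": "true"}. The values are the caller's choice.
    """
    cmd = ["java"]
    # -D system properties must come before -jar to reach the JVM.
    for name, value in (jvm_options or {}).items():
        cmd.append(f"-D{name}={value}")
    cmd += ["-jar", oak_run_jar, "compact", segmentstore]
    return cmd


# Placeholder jar name and path; the AEM instance must be shut down before running this.
example = build_compact_command(
    "oak-run.jar",
    "crx-quickstart/repository/segmentstore",
    {"tar.memoryMapped": "true"},
)
```

The resulting list can be handed to `subprocess.run(example, check=True)` once the instance is stopped.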