Content tracking with configurable txn batch size #427

AFaust · 2023-06-09T09:24:37Z

This pull request adds two optional configurations to the content tracking process that allow users / customers to

- set a transaction ID lookup offset instead of relying on a hard-coded 500 offset
- set a transaction ID processing batch size for collecting documents to be content-indexed

The purpose of these configurations is to allow optimisations for (re-)indexation processes on Alfresco systems with extremely sparse transaction / content update distributions. This affects e.g. systems undergoing many fine-grained updates, where transactions with indexable content updates may be spread far apart. In such systems, the costly getDocsWithUncleanContent operation may be invoked often while yielding only a single- to low double-digit number of content-containing nodes to be indexed. This may significantly prolong content indexation, as the phases of concurrent content indexation are very short and may not even be able to use all allowed concurrent threads in the fork-join pool.
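To illustrate the effect described above, here is a minimal, self-contained sketch. It is not the tracker's actual code: the class `SparseTxnDemo`, the method `docsPerWindow`, and all numbers are hypothetical, and it assumes a simplified model in which exactly one transaction in every `contentEvery` carries indexable content. It walks a transaction ID range in fixed-size windows and counts how many content-bearing transactions each window yields.

```java
import java.util.ArrayList;
import java.util.List;

public class SparseTxnDemo {

    /**
     * Walks txn ids [1, maxTxnId] in windows of windowSize and returns, per window,
     * the number of txns assumed to carry indexable content (every contentEvery-th id).
     * Each list entry stands in for one costly lookup call in this simplified model.
     */
    static List<Integer> docsPerWindow(long maxTxnId, long contentEvery, long windowSize) {
        List<Integer> counts = new ArrayList<>();
        for (long start = 1; start <= maxTxnId; start += windowSize) {
            long end = Math.min(start + windowSize - 1, maxTxnId);
            int docs = 0;
            for (long txn = start; txn <= end; txn++) {
                if (txn % contentEvery == 0) {
                    docs++;
                }
            }
            counts.add(docs);
        }
        return counts;
    }

    public static void main(String[] args) {
        long maxTxnId = 100_000;   // hypothetical size of the txn id range
        long contentEvery = 250;   // hypothetical sparsity: 1 content txn per 250 txns
        for (long window : new long[] {500, 5_000}) {
            List<Integer> counts = docsPerWindow(maxTxnId, contentEvery, window);
            System.out.println("window=" + window
                    + " lookups=" + counts.size()
                    + " docsInFirstWindow=" + counts.get(0));
        }
    }
}
```

Under these assumed numbers, a window of 500 needs 200 lookups that each yield 2 documents, while a window of 5,000 needs 20 lookups that each yield 20. The larger window gives each concurrent indexing phase more work per lookup, which is the optimisation the new options are meant to enable.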

As for default values, both options use the previously hard-coded 500 txn offset, so there is no difference in behaviour from previous versions. Users / customers with sparse transaction / content update distributions may configure substantially higher values as needed. I personally would recommend that Alfresco consider setting a default value for alfresco.content.txnIdLookupBatchSize that is maybe an order of magnitude larger than the one for alfresco.content.txnProcessingBatchSize. Due to backwards consistency concerns, I have not included that in the PR.
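The defaulting behaviour described above can be sketched as follows. This only illustrates the "fall back to 500 unless configured" semantics and is not the code in this PR: the class `ContentTrackingConfigDemo` and its helpers are hypothetical, the value 5000 is an arbitrary example, and which properties file the two keys belong in is not stated here.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

public class ContentTrackingConfigDemo {

    // Matches the previously hard-coded 500 txn offset mentioned in the PR description.
    static final int DEFAULT_BATCH_SIZE = 500;

    /** Parses properties-format text; wraps the checked IOException for brevity. */
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    /** Returns the configured integer value, or the default of 500 if the key is absent. */
    static int readBatchSize(Properties props, String key) {
        String raw = props.getProperty(key, String.valueOf(DEFAULT_BATCH_SIZE));
        return Integer.parseInt(raw.trim());
    }

    public static void main(String[] args) {
        // Hypothetical configuration: only the lookup batch size is overridden.
        Properties props = parse("alfresco.content.txnIdLookupBatchSize=5000\n");

        System.out.println(readBatchSize(props, "alfresco.content.txnIdLookupBatchSize"));
        System.out.println(readBatchSize(props, "alfresco.content.txnProcessingBatchSize"));
    }
}
```

In this sketch the overridden lookup batch size is read as 5000, while the unset processing batch size falls back to 500, so an installation that sets neither key behaves as before.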

CLAassistant · 2023-06-09T09:24:42Z

All committers have signed the CLA.

aitseitz · 2023-06-13T10:18:26Z

Linked to https://alfresco.atlassian.net/browse/MNT-23759

Content tracking with configurable txn batch size

67b4607
