fix(cli): org:search:dump issues when dump is fetched across multiple queries #1474

fbeaudoincoveo · 2024-06-19T14:20:28Z

Proposed changes

Context:

The org:search:dump command executes queries under the hood. When we execute this command, it's possible that we hit the Search API maximum response size. The likelihood of hitting that limit is increased in the HIPAA environment, as the HIPAA maximum response size is 4x smaller than in PROD.

When the dump is executed across multiple queries, we ensure that results are fetched from the same index by passing the indexToken of the initial query to each subsequent query. This is, by definition, incompatible with using the index cache, as stated in the indexToken parameter documentation:

The Base64 encoded identifier of the index mirror to forward the request to. See also the index parameter.
If you do not specify an indexToken (or index) value, any index mirror could be used.
Note: Passing an indexToken (or index) value has no effect when the results of a specific request can be returned from cache (see the maximumAge parameter).
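To make the interaction between indexToken and the cache concrete, here is a minimal sketch of how the parameters for each dump query could be built. The `DumpQueryParams` interface, the `buildDumpQueryParams` helper, the `@rowid` filter and sort syntax, and the sample token value are assumptions made for illustration; they are not taken from the CLI code. Setting `maximumAge` to 0 is the fix this PR describes, so that a cached response never overrides the requested index mirror.

```typescript
// Hypothetical parameter shape for illustration; not the CLI's actual types.
interface DumpQueryParams {
  aq?: string;
  sortCriteria: string;
  indexToken?: string;
  maximumAge: number;
}

// Builds the parameters for one query of a dump.
// `lastRowId` and `indexToken` are undefined for the first query.
function buildDumpQueryParams(
  lastRowId?: string,
  indexToken?: string
): DumpQueryParams {
  const params: DumpQueryParams = {
    // Assumed sort expression: ascending rowid, as described in this PR.
    sortCriteria: '@rowid ascending',
    // Never serve a dump query from cache, so that indexToken takes effect.
    maximumAge: 0,
  };
  if (lastRowId !== undefined) {
    // Only request results after the last one returned by the previous query.
    params.aq = `@rowid>${lastRowId}`;
  }
  if (indexToken !== undefined) {
    // Stick to the index mirror that answered the first query.
    params.indexToken = indexToken;
  }
  return params;
}

const first = buildDumpQueryParams();
// The token below is a made-up Base64 string, not a real index token.
const next = buildDumpQueryParams('9007199254740993', 'aW5kZXgtbWlycm9yLTE=');
console.log(first, next);
```

Passing the rowid as a string keeps it out of JavaScript number arithmetic entirely, which matters for the precision issue described next.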

Another issue concerns how we decide which result to start from in each query after the first: we request results ordered by ascending rowid value, and we request only results whose rowid is greater than the rowid of the last result returned by the previous query. However, the rowid value is returned as a very large integer, greater than Number.MAX_SAFE_INTEGER (2^53 - 1), the largest integer JavaScript can represent without losing precision. Therefore, the expression rowid>ROW_ID_OF_LAST_RESULT typically behaves like rowid>=ROW_ID_OF_LAST_RESULT, because JavaScript rounds the rowid during the JSON.parse conversion of the response body.
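The rounding can be reproduced in isolation. The snippet below is a small sketch, assuming a made-up response body and rowid value (2^53 + 1); it shows that the filter built from the parsed number no longer excludes the last result.

```typescript
// Made-up response body; the rowid is 2^53 + 1, just above the safe range.
const realRowId = '9007199254740993';
const body = `{"rowid": ${realRowId}}`;

// JSON.parse converts the integer to a double, which cannot hold 2^53 + 1.
const parsed = JSON.parse(body) as {rowid: number};
const roundedRowId = String(parsed.rowid);
const isSafe = Number.isSafeInteger(parsed.rowid);

// The filter is built from the rounded value, which is smaller than the
// real rowid, so the last result of the previous query matches again.
const filter = `@rowid>${roundedRowId}`;

console.log(roundedRowId); // 9007199254740992
console.log(isSafe); // false
console.log(filter); // @rowid>9007199254740992
```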

Fix:

Setting maximumAge: 0 on every query performed for fetching source dumps trivially addresses the index cache issue.
Adding a response handler to the platform client that uses the new context parameter of the JSON.parse reviver (see https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/JSON/parse#parameters) lets us avoid losing precision when converting rowid values. However, this new parameter is only available from Node 21, so we have to polyfill it for now. EDIT: After discussion with the DX team, this change was made directly in the platform-client project. See fix(success response handler): parse unsafe integers as strings platform-client#834.
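As an illustration of the reviver-based idea, here is a minimal sketch that returns the original source text for integers outside the safe range. The names `ReviverWithContext`, `unsafeIntegerReviver`, `parseKeepingUnsafeIntegers`, and `supportsReviverContext` are hypothetical, and this is not necessarily how platform-client#834 implements it. On runtimes without the context parameter, the sketch falls back to the default (lossy) behavior rather than polyfilling.

```typescript
// The third reviver argument is not present on older runtimes, so it is
// typed as optional here. These type names are illustrative only.
type ReviverContext = {source?: string};
type ReviverWithContext = (
  key: string,
  value: unknown,
  context?: ReviverContext
) => unknown;

// Returns the raw source text for integers that cannot be represented
// exactly as a JavaScript number; leaves every other value untouched.
const unsafeIntegerReviver: ReviverWithContext = (_key, value, context) => {
  if (
    typeof value === 'number' &&
    Number.isInteger(value) &&
    !Number.isSafeInteger(value) &&
    context?.source !== undefined
  ) {
    return context.source;
  }
  return value;
};

function parseKeepingUnsafeIntegers(text: string): unknown {
  return JSON.parse(text, unsafeIntegerReviver);
}

// Detects whether the runtime passes the context argument to revivers.
function supportsReviverContext(): boolean {
  let seen = false;
  const probe: ReviverWithContext = (_key, value, context) => {
    if (context?.source !== undefined) {
      seen = true;
    }
    return value;
  };
  JSON.parse('1', probe);
  return seen;
}

const result = parseKeepingUnsafeIntegers(
  '{"rowid": 9007199254740993, "count": 5}'
) as {rowid: string | number; count: number};
console.log(supportsReviverContext(), result);
```

When the context parameter is supported, `result.rowid` is the string `"9007199254740993"`; otherwise it is the rounded number, which is why a polyfill or a library-level fix is needed on older Node versions.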

Breaking changes

None

Testing

Unit Tests: Yes
Functional Tests: No, unit tests are sufficient to ensure that maximumAge is set to 0
Manual Tests: Tested the org:search:dump command locally before and after the platform-client update against a problematic HIPAA test organization. The duplicate / inconsistent dump result issue is no longer present after the fix.

github-actions · 2024-06-19T14:21:58Z

Pull Request Report

PR Title

✅ Title follows the conventional commit spec.

fbeaudoincoveo · 2024-06-20T14:01:19Z

I moved the bulk of the changes to the platform-client project: coveo/platform-client#834

So basically all I'm doing in this PR now is setting maximumAge: 0 on dump queries.

louis-bompart

LGTM, after the release of the platform-client and its update in our project ofc

github-actions · 2024-06-21T12:05:57Z

Sorry @fbeaudoincoveo, this PR does not satisfy all conditions of mergeability:

Required status check "Lint" is in progress.
You're not authorized to push to this branch.
Visit https://docs.github.com/repositories/configuring-branches-and-merges-in-your-repository/managing-protected-branches/about-protected-branches for more information.

Modify your pull request to satisfy them and check the box below to try again:

Merge!

You can also reach out to a maintainer if you think your contribution should be merged regardless of the reported issues.

Fix org:search:dump issues when dump is fetched across multiple queries

440a20d

fbeaudoincoveo requested a review from a team as a code owner June 19, 2024 14:20

fbeaudoincoveo requested review from olamothe, y-lakhdar and louis-bompart and removed request for a team June 19, 2024 14:20

fbeaudoincoveo changed the title ~~Fix org:search:dump issues when dump is fetched across multiple queries~~ fix(cli): org:search:dump issues when dump is fetched across multiple queries Jun 19, 2024

y-lakhdar approved these changes Jun 19, 2024

packages/cli/commons/src/platform/authenticatedClient.ts (outdated, resolved)

olamothe reviewed Jun 19, 2024

packages/cli/commons/src/platform/authenticatedClient.ts (outdated, resolved)

fbeaudoincoveo added 2 commits June 19, 2024 11:04

Always return context.source when value is an unsafe integer

ca0e6cf

Destructure / spread default response handlers

2421095

louis-bompart reviewed Jun 19, 2024

packages/cli/commons/src/platform/authenticatedClient.ts (outdated, resolved)

fbeaudoincoveo added 4 commits June 20, 2024 09:58

Revert "Destructure / spread default response handlers"

9edad41

This reverts commit 2421095.

Revert "Always return context.source when value is an unsafe integer"

e40dc56

This reverts commit ca0e6cf.

Revert "Fix org:search:dump issues when dump is fetched across multip…

5e4ee50

…le queries" This reverts commit 440a20d.

Set maximumAge to 0 on dump queries

cbfe676

Adjust tests

0d8fb72

louis-bompart approved these changes Jun 20, 2024

Update @coveo/platform-client

a55b3a6

fbeaudoincoveo merged commit 2ef81a1 into master Jun 21, 2024
57 checks passed

fbeaudoincoveo deleted the KIT-3321 branch June 21, 2024 12:09
