Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: compaction support pick by max_seq #1041

Merged
merged 8 commits into from
Jul 5, 2023

Conversation

jiacai2050
Copy link
Contributor

@jiacai2050 jiacai2050 commented Jun 29, 2023

Rationale

Part of #987.

Current implementation will compact by file size, max_seq is not considered, this may cause data corruption
in corner case, eg:

  • sst1, max_seq:10, PK1=10
  • sst2, max_seq:11, PK1=9
  • sst3, max_seq:12, no PK1

If compact pick sst1 and sst3, and output sst4, its max_seq will be 12, now PK1 exists in two files:

  • sst2, max_seq:11, PK1=9
  • sst4, max_seq:12, PK1=10

That's to say, PK1's value is 10 now, which is wrong value(9 is right).

Detailed Changes

When do compaction, first sort sst by max_seq desc, then only pick adjacent ssts, the original issue is fixed in this way.
At the same time picked ssts are ensured to meet other requirements such as min_threshold, max_threshold, max_input_size.

Test Plan

UT and manually.

@jiacai2050 jiacai2050 changed the title add pick by seq fix: size tired compaction support pick by max_seq Jun 29, 2023
@jiacai2050 jiacai2050 changed the title fix: size tired compaction support pick by max_seq WIP fix: size tired compaction support pick by max_seq Jun 29, 2023
@jiacai2050 jiacai2050 force-pushed the feat-compact-by-seq branch 2 times, most recently from bed176a to 6f47274 Compare July 3, 2023 03:02
@jiacai2050 jiacai2050 force-pushed the feat-compact-by-seq branch from 6f47274 to 54c9175 Compare July 3, 2023 03:04
@jiacai2050 jiacai2050 changed the title WIP fix: size tired compaction support pick by max_seq fix: size tired compaction support pick by max_seq Jul 3, 2023
@jiacai2050 jiacai2050 force-pushed the feat-compact-by-seq branch from fae4492 to c9f1a9c Compare July 3, 2023 07:09
@jiacai2050 jiacai2050 force-pushed the feat-compact-by-seq branch from c9f1a9c to bddc27d Compare July 3, 2023 07:31
@jiacai2050 jiacai2050 changed the title fix: size tired compaction support pick by max_seq fix: compaction support pick by max_seq Jul 3, 2023
analytic_engine/src/compaction/picker.rs Outdated Show resolved Hide resolved
analytic_engine/src/compaction/picker.rs Show resolved Hide resolved
analytic_engine/src/compaction/picker.rs Show resolved Hide resolved
analytic_engine/src/compaction/picker.rs Show resolved Hide resolved
analytic_engine/src/compaction/picker.rs Show resolved Hide resolved
analytic_engine/src/compaction/picker.rs Outdated Show resolved Hide resolved
Copy link
Contributor

@Rachelint Rachelint left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@jiacai2050 jiacai2050 merged commit ab1b509 into apache:main Jul 5, 2023
@jiacai2050 jiacai2050 deleted the feat-compact-by-seq branch November 28, 2024 13:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants