Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[flink] Add merge-engine check when executing row level batch update and delete #3181

Merged
merged 2 commits into from
Apr 15, 2024

Conversation

yuzelin
Copy link
Contributor

@yuzelin yuzelin commented Apr 9, 2024

Purpose

  1. partial-update and aggregate should not accept batch update because there is no UB and the aggregate result will be wrong.

  2. It's not very clear about how to handle row level batch delete with partial update currently. Doris just delete the record. But it's not easy to delete the record because [core] Refactor: merge table-level ignore-delete options and filter delete data in writer layer #3128 and [core] Fix that ignore-delete option is not compatible with old delete records and LocalMergeOperator #3139 .

Linked issue: close #xxx

Tests

API and Format

Support batch update merge engines: deduplicate and first-row.
Support batch delete merge engines: deduplicate.

Documentation


FIRST_ROW("first-row", "De-duplicate and keep the first row.");
FIRST_ROW("first-row", "De-duplicate and keep the first row.", true, false);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

first-row does not need to support update


PARTIAL_UPDATE("partial-update", "Partial update non-null fields."),
PARTIAL_UPDATE("partial-update", "Partial update non-null fields.", false, false),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When using partial-update and no col contains agg, update is theoretically supported


private final String value;
private final String description;

MergeEngine(String value, String description) {
private final boolean supportBatchUpdate;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please do not modify this, this is public api, we don't need to expose here.

@JingsongLi
Copy link
Contributor

+1

@JingsongLi JingsongLi merged commit a5d1b3c into apache:master Apr 15, 2024
10 checks passed
@yuzelin yuzelin deleted the merge_engine_check branch May 30, 2024 06:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants