Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[core] throw exception if table is recreated when it still being read #4445

Merged
merged 4 commits into from
Nov 7, 2024

Conversation

LsomeYeah
Copy link
Contributor

Purpose

Linked issue: close #xxx

A table which is read in streaming read, will not throw exception even if the table is recreated currently. For example, the expected snapshot id is 10001, but the latest snapshot of the recreated table might be null or much less than 10001.

This is because when try to get the next snapshot, if the next snapshot is not exists in filesystem and its id is greater than earliest snapshot id, it will be treat as the next snapshot is still in generating progress.

This feature adds a check in NextSnapshotFetcher#getNextSnapshot. If next expected snapshot id is greater than the latest snapshot id plus one, it will throw an exception.

Tests

PrimaryKeyFileStoreTableITCase#testRecreateTableWithException

API and Format

Documentation

"The next expected snapshot is too big! Most possible cause might be the table had been recreated."
+ "The next snapshot id is %d, while the latest snapshot id is %s",
nextSnapshotId,
latestSnapshotId == null ? "null" : latestSnapshotId));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
latestSnapshotId == null ? "null" : latestSnapshotId));
latestSnapshotId));

Comment on lines 383 to 384
// wait the read job to read the current table
Thread.sleep(10000);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assert that all snapshots are read by checking the result of SELECT. Don't sleep for this long!

Comment on lines 349 to 353
"CREATE TABLE t ( pt INT, k INT, v INT, PRIMARY KEY (pt, k) NOT ENFORCED ) "
+ "PARTITIONED BY (pt) "
+ "WITH ("
+ " 'bucket' = '2'\n"
+ ")");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Set continuous.discovery-interval to a smaller value (for example 1s or 500ms) so this test can run faster. No need to wait for 10 seconds.

Comment on lines 410 to 417
assertThatCode(
() -> {
while (true) {
if (it.hasNext()) {
it.next();
}
}
})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
assertThatCode(
() -> {
while (true) {
if (it.hasNext()) {
it.next();
}
}
})
assertThatCode(() -> it.next())

@tsreaper tsreaper merged commit c4f1273 into apache:master Nov 7, 2024
12 checks passed
hang8929201 pushed a commit to hang8929201/paimon that referenced this pull request Nov 7, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants