[core] Support the rollback and orphan file clean with changelog deco… #3144
taggedSnapshot.schemaId(),
taggedSnapshot.baseManifestList(),
taggedSnapshot.deltaManifestList(),
null,
We need to 'mute' the changelogManifestList here if the changelog and snapshot no longer exist. Otherwise, the scanner may try to read a non-existent manifest list. WDYT? @JingsongLi
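To make the 'mute' idea concrete, here is a minimal sketch. It assumes a simplified, hypothetical `SnapshotSketch` class and a set of manifest list names known to exist; neither is Paimon's real `Snapshot` API, and the actual change in this PR may look different.

```java
import java.util.Set;

final class ChangelogMuteSketch {

    // Hypothetical stand-in for a snapshot; NOT Paimon's real Snapshot class.
    static final class SnapshotSketch {
        final long id;
        final String baseManifestList;
        final String deltaManifestList;
        final String changelogManifestList; // may be null

        SnapshotSketch(long id, String base, String delta, String changelog) {
            this.id = id;
            this.baseManifestList = base;
            this.deltaManifestList = delta;
            this.changelogManifestList = changelog;
        }
    }

    // Returns a copy with changelogManifestList set to null when the
    // referenced manifest list is no longer present, so that a scanner
    // never tries to open a missing file. Otherwise returns the input.
    static SnapshotSketch muteMissingChangelog(
            SnapshotSketch snapshot, Set<String> existingManifestLists) {
        String changelog = snapshot.changelogManifestList;
        if (changelog != null && !existingManifestLists.contains(changelog)) {
            return new SnapshotSketch(
                    snapshot.id,
                    snapshot.baseManifestList,
                    snapshot.deltaManifestList,
                    null);
        }
        return snapshot;
    }
}
```

The sketch only drops the reference; whether a later changelog read should then fail fast or return nothing is a separate design question.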
Can you show the exception stack?
I see testRollbackWithChangelog, but the rolled-back table has no changelog. Why do you want to read the changelog?
Maybe it is a separate issue, unrelated to the changelog decoupling.
> I see testRollbackWithChangelog, but rollbacked table has no changelog, why you want to read changelog?
But there is no check for this now. After a rollback, the user can still specify a starting point for a stream read; if the id points to a rolled-back snapshot, the read fails with the above exception.
On another note, if we support decoupling the changelog lifecycle, suppose I have a table with 2 days of changelog,
2024-04-01
2024-04-02
and I then roll back the table to 2024-04-02 00:00. I can still stream read from 2024-04-01 to 2024-04-02 00:00, but we cannot stream read before 2024-04-01.
So my main point is to prevent the user from hitting an exception when reading non-existent changelog data.
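The kind of check described in this comment could be sketched as follows. The class and method names are hypothetical and this is not Paimon code; it assumes the readable window runs from the earliest retained changelog up to and including the rollback point.

```java
import java.time.LocalDateTime;

final class StreamStartCheckSketch {

    // True when the requested start lies inside the retained window:
    // not before the earliest changelog and not after the rollback point.
    // Treating both bounds as inclusive is an assumption of this sketch.
    static boolean isReadable(
            LocalDateTime start,
            LocalDateTime earliestChangelog,
            LocalDateTime rollbackPoint) {
        return !start.isBefore(earliestChangelog) && !start.isAfter(rollbackPoint);
    }

    // Fails early with a descriptive message instead of letting a scanner
    // fail later on a manifest list that no longer exists.
    static void checkStart(
            LocalDateTime start,
            LocalDateTime earliestChangelog,
            LocalDateTime rollbackPoint) {
        if (!isReadable(start, earliestChangelog, rollbackPoint)) {
            throw new IllegalArgumentException(
                    "Cannot stream read from " + start
                            + ": changelog is only available between "
                            + earliestChangelog + " and " + rollbackPoint);
        }
    }
}
```

With the dates from the example, a start of 2024-04-01 12:00 passes the check, while a start on 2024-03-31, or one after the rollback point of 2024-04-02 00:00, is rejected up front.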
After discussing with @JingsongLi, it's a separate issue that we can solve in another PR, so this change is reverted.
Is the CI failure related to the recent commit?
rebase latest master
5e5f33d to e8b996a
9ee6be6 to bff560e
+1
…uple
Purpose
The third step of #2899
Handle the orphan file cleaner with the changelog metadata
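As a rough illustration of what handling changelog metadata in an orphan file cleaner can mean, the sketch below treats a file as orphaned only if neither the snapshots nor the decoupled changelogs reference it. All names are hypothetical; this is not the cleaner implemented in this PR, and a real cleaner would likely need further safeguards that are not modelled here.

```java
import java.util.Set;
import java.util.TreeSet;

final class OrphanFileSketch {

    // Files that exist in storage but are referenced by neither snapshot
    // metadata nor changelog metadata. Ignoring the changelog set would
    // wrongly mark files as orphans once changelogs outlive snapshots.
    static Set<String> findOrphans(
            Set<String> allFiles,
            Set<String> usedBySnapshots,
            Set<String> usedByChangelogs) {
        Set<String> orphans = new TreeSet<>(allFiles);
        orphans.removeAll(usedBySnapshots);
        orphans.removeAll(usedByChangelogs);
        return orphans;
    }
}
```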
Tests
API and Format
Documentation