[core] Introduce ExpireChangelogImpl to decouple the changelog lifecycle #3110
After discussing with @JingsongLi, we decided not to support decoupling the delta log in the first version. The reason is that it's not easy to determine whether a |
 */
public class Changelog extends Snapshot {

    private static final int CURRENT_VERSION = 3;
The current version is set to 3, equal to the snapshot version. Otherwise, with a value such as 1, when the Changelog is passed as a Snapshot to org.apache.paimon.operation.AbstractFileStoreScan#readManifests(), it will be recognized as table_store_02_version.
Set the version to 1 now and refactored the logic in AbstractFileStoreScan#readManifests().
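The version concern in this thread can be sketched with simplified stand-in classes. Everything below is hypothetical (SketchSnapshot, SketchChangelog, SketchScan, and the LEGACY_VERSION constant are illustrative names, not the real Paimon classes), and it assumes the scan rejects version 1 as the legacy table-store format, which is how the comment above describes it.

```java
import java.util.List;

// Hypothetical stand-ins for illustration only; these are NOT the real Paimon classes.
class SketchSnapshot {
    static final int LEGACY_VERSION = 1;  // assumed marker for the old table-store format
    static final int CURRENT_VERSION = 3; // snapshot version quoted in the review

    final int version;
    final List<String> changelogManifests;

    SketchSnapshot(int version, List<String> changelogManifests) {
        this.version = version;
        this.changelogManifests = changelogManifests;
    }
}

// A changelog reuses the snapshot layout, so it inherits the version check too.
class SketchChangelog extends SketchSnapshot {
    SketchChangelog(int version, List<String> changelogManifests) {
        super(version, changelogManifests);
    }
}

class SketchScan {
    // Returns the changelog manifests, rejecting anything that looks like the legacy format.
    static List<String> readChangelogManifests(SketchSnapshot snapshot) {
        if (snapshot.version <= SketchSnapshot.LEGACY_VERSION) {
            throw new UnsupportedOperationException(
                    "Unsupported snapshot version: " + snapshot.version);
        }
        return snapshot.changelogManifests;
    }
}
```

Under these assumptions, a SketchChangelog created with version 3 passes through the scan unchanged, while one created with version 1 is rejected unless the scan adds a special case for it.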
 * changelog of the table can outlive the snapshot's lifecycle. A table's changelog can come from
 * two source:
 * <li>The changelog file. Eg: from the changelog-producer = 'input'
 * <li>The delta files in the APPEND commits when the changelog-producer = 'none'
If this feature does not work yet, we should not document it now.
Removed
 */
public class Changelog extends Snapshot {

    private static final int CURRENT_VERSION = 1;
Why is its version 1 when the snapshot version is 3?
We can set it to Snapshot.CURRENT_VERSION.
@@ -386,6 +387,10 @@ private List<ManifestFileMeta> readManifests(Snapshot snapshot) {
    case DELTA:
        return snapshot.deltaManifests(manifestList);
    case CHANGELOG:
        if (snapshot instanceof Changelog) {
We can just set the version to 3.
snapshotDeletion.cleanUnusedManifests(snapshot, skippingSet);

// delete snapshot last
if (changelogDecoupled) {
Maybe you should write the changelog first? Imagine there is a failover here.
Maybe you can add a test in FileStoreExpireTestBase to make sure the changelog files are generated.
The snapshot path is deleted after the changelog files are generated. The test org.apache.paimon.operation.ExpireSnapshotsTest#testChangelogOutLivedSnapshot will verify that the changelog metadata is generated.
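The ordering discussed in this thread (persist the changelog metadata first, delete the snapshot last) can be sketched as below. This is a minimal illustration on the local file system, not the PR's implementation: the class ExpireOrderSketch, the method expireSnapshot, and the snapshot/snapshot-<id> and changelog/changelog-<id> layout are assumptions made for the example.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch; the directory layout and all names are assumptions.
class ExpireOrderSketch {

    // Expires one snapshot while keeping its metadata available as a changelog.
    static void expireSnapshot(Path tableDir, long id) throws IOException {
        Path snapshot = tableDir.resolve("snapshot").resolve("snapshot-" + id);
        Path changelogDir = tableDir.resolve("changelog");
        Path changelog = changelogDir.resolve("changelog-" + id);

        // Step 1: write the changelog metadata first. If the job fails right
        // after this step, both files exist and a retry simply skips the copy.
        Files.createDirectories(changelogDir);
        if (!Files.exists(changelog)) {
            Files.copy(snapshot, changelog);
        }

        // Step 2: delete the snapshot last, so there is no point in time
        // where neither the snapshot nor the changelog metadata exists.
        Files.deleteIfExists(snapshot);
    }
}
```

With this order, rerunning the expiration after a failover is harmless: the worst case is a changelog file and a snapshot file that both exist until the retry removes the snapshot. A real implementation would also want the changelog write itself to be atomic, which this sketch does not attempt.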
if (changelog.changelogManifestList() != null) {
    snapshotDeletion.deleteAddedDataFiles(changelog.changelogManifestList());
}
if (changelog.deltaManifestList() != null) {
Let's not consider deltaManifest now.
@@ -33,6 +34,7 @@
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.SortedMap;
It's not only rollback; you need to consider orphan file cleaning too.
Yes, I think this can be done in another PR?
If the workload is not significant, you can consider completing it here.
OK
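The orphan file point from this thread can be sketched as follows: once changelogs outlive snapshots, files referenced only by a changelog must count as used. This is an illustrative sketch with hypothetical names (OrphanSketch, findOrphans) and plain file-name strings, not the project's orphan file cleaner.

```java
import java.util.Collection;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch; real cleaners work on manifests and paths, not plain strings.
class OrphanSketch {

    // Returns the files that neither a live snapshot nor a retained changelog references.
    static Set<String> findOrphans(
            Collection<String> allFiles,
            Collection<Set<String>> snapshotRefs,
            Collection<Set<String>> changelogRefs) {
        Set<String> used = new HashSet<>();
        for (Set<String> refs : snapshotRefs) {
            used.addAll(refs);
        }
        // Without this loop, files kept alive only by a changelog would be deleted.
        for (Set<String> refs : changelogRefs) {
            used.addAll(refs);
        }
        Set<String> orphans = new TreeSet<>(allFiles);
        orphans.removeAll(used);
        return orphans;
    }
}
```

The same union of references would apply to rollback: anything that decides which files are safe to delete has to look at both the snapshot side and the changelog side.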
Looks good to me!
Purpose
This PR is the first step for #2899. It mainly introduces the following:
- changelog-related options to define the changelog lifecycle.
- ExpireChangelogImpl to handle changelog expiration.
Tests
API and Format
Documentation