-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WIP] Add Manifest Stats in snapshot summary. #13
Conversation
void addedManifestStats(ManifestFile manifest) { | ||
switch (manifest.content()) { | ||
case DATA: | ||
this.totalDataManifestFiles++; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why not increment these in addedManifest
method itself? why we need a separate method?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems addedManifest is being used with Spark import action only.
Ref: https://github.com/apache/iceberg/blob/01bc864b8eb8c4ca4240af85f00f4e67c78c466e/spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/SparkTableUtil.java#L573
.count(); | ||
assertThat(table.currentSnapshot().summary().get(TOTAL_DATA_MANIFEST_FILES)) | ||
.isEqualTo(String.valueOf(dataManifestCount)); | ||
int deletedManifestCount = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
int deletedManifestCount = | |
int deleteManifestCount = |
@@ -275,4 +287,22 @@ private void runAddedDeleteFileSequenceNumberTest( | |||
.as("File sequence number mismatch") | |||
.isEqualTo(expectedSequenceNumber); | |||
} | |||
|
|||
public static void testManifestStats(Table table) { | |||
List<ManifestFile> manifestFiles = table.currentSnapshot().allManifests(table.io()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
since it is a test case, if we use, currentSnapshot().dataManifests and currentSnapshot().deleteManifests, it can look simpler.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, allManifests is filling the dataManifests
and deleteManifests
fields if it is null. So, with one IO we can get both the fields.
@@ -201,7 +203,7 @@ public void testBinPackPartitionedTable() { | |||
|
|||
shouldHaveFiles(table, 4); | |||
List<Object[]> actualRecords = currentData(); | |||
|
|||
testManifestStats(table); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: it should be placed before computing actualRecords
, check all the places.
Also, keep the new line as before for all the places.
also fixed the bug of missing jmh in the 1.19 module.
The angle brackets were without any escapes so docs renderer treated them as HTML. The resulting text on the website looked like an unfinished sentence: The unified partition type looks like Struct. Putting the angle brackets in backticks prevent them from being interpreted as HTML. Surrounding names like spec#0, field#1 are also put inside backticks for consistence.
…che#10888) This is a follow-up to apache#10881
Scheduled for removal in 1.7.0.
No description provided.