Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Flink] Support analysis statistics in flink #4280

Merged
merged 6 commits into from
Oct 10, 2024

Conversation

herefree
Copy link
Contributor

Purpose

support statistics in flink

Linked issue: close #xxx

Tests

API and Format

Documentation

@herefree herefree closed this Sep 30, 2024
@herefree herefree reopened this Sep 30, 2024
try {
Table table = catalog.getTable(toIdentifier(tablePath));
Preconditions.checkArgument(
table instanceof FileStoreTable, "Can't analyze system table.");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change "Can't analyze system table." to "Now only support analyze FileStoreTable."

Because lots of table implement Table.

Preconditions.checkArgument(
table instanceof FileStoreTable, "Can't analyze system table.");
if (!table.latestSnapshotId().isPresent()) {
return;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a log?

TableStatsUtil.createTableStats((FileStoreTable) table, tableStatistics);

FileStoreCommit commit =
((FileStoreTable) table).store().newCommit(UUID.randomUUID().toString());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use CoreOptions#createCommitUser instead.

FileStoreCommit commit =
((FileStoreTable) table).store().newCommit(UUID.randomUUID().toString());
commit.commitStatistics(tableStats, BatchWriteBuilder.COMMIT_IDENTIFIER);

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Delete the excess blank line.

if (!ignoreIfNotExists) {
throw new TableNotExistException(getName(), tablePath);
}
}
}

@Override
public final void alterTableColumnStatistics(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1、Same suggests with this method and alterTableStatistics.
2、Abstract a new method named alterTableStatisticsInternal, because alterTableStatistics and alterTableColumnStatistics has lots of same code.

sql("ANALYZE TABLE T COMPUTE STATISTICS");
Statistics stats = paimonTable("T").statistics().get();

Assertions.assertEquals(2L, stats.mergedRecordCount().getAsLong());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

'OptionalLong.getAsLong()' without 'isPresent()' check


@Test
public void testAnalyzeTableColumn() throws Catalog.TableNotExistException {

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Delete the excess blank line.


sql("ANALYZE TABLE T COMPUTE STATISTICS FOR ALL COLUMNS");

Statistics stats = paimonTable("T").statistics().get();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

'Optional.get()' without 'isPresent()' check


Statistics stats = paimonTable("T").statistics().get();

Assertions.assertEquals(4L, stats.mergedRecordCount().getAsLong());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

'OptionalLong.getAsLong()' without 'isPresent()' check

colStats.get("bytes_col"));

Assertions.assertEquals(
ColStats.newColStats(3, 2L, new Integer(1), new Integer(4), 0L, null, null),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why add two unnecessary boxing here?

@herefree
Copy link
Contributor Author

herefree commented Oct 3, 2024

@wwj6591812 thanks for your review,I have completed the modification.

@herefree herefree force-pushed the support-flink-statistics branch from b5f1983 to 10bebab Compare October 9, 2024 02:53
@wwj6591812
Copy link
Contributor

+1

Copy link
Contributor

@JingsongLi JingsongLi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @herefree , looks good to me!

@JingsongLi JingsongLi changed the title [Flink] Support statistics in flink [Flink] Support analysis statistics in flink Oct 10, 2024
@JingsongLi JingsongLi merged commit cafb215 into apache:master Oct 10, 2024
10 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants