Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[flink] Support to set the table option by dynamic global job configu… #2104

Merged
merged 4 commits into from
Oct 24, 2023

Conversation

Aitozi
Copy link
Contributor

@Aitozi Aitozi commented Oct 10, 2023

…ration

Purpose

Linked issue: close #2037

Tests

API and Format

Documentation

* @param context The table factory context.
* @return The dynamic options of this target table.
*/
static Map<String, String> getDynamicTableConfigOptions(DynamicTableFactory.Context context) {
Copy link
Contributor

@JingsongLi JingsongLi Oct 10, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you also document this in engine/flink page?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated

conf.keySet()
.forEach(
(key) -> {
Matcher matcher = pattern.matcher(key);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pattern.matcher is very slow, can we just check starts with paimon. first?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's only happened in client planner phase and the pattern is not complex

Copy link
Contributor

@JingsongLi JingsongLi Oct 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can do a little benchmark for these two methods, and I think there may be hundreds options need to be check.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, will do a test

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OpenJDK 64-Bit Server VM 1.8.0_292-b10 on Mac OS X 10.16
Apple M1 Pro
test-pattern: Best/Avg Time(ms) Row Rate(M/s) Per Row(ns) Relative

OPERATORTEST_test-pattern_pattern 12530 / 12779 0.0 41767.0 1.0X
OPERATORTEST_test-pattern_startswith 500 / 538 0.6 1666.2 25.1X

Per row will run for 100 key. so the max cost for 100 key match is 41767 ns. The startsWith is much faster, but I think both of the cost is acceptable, and the startWith check is not flexible when we only want to set for the specific table

SELECT * FROM T;

-- set scan.timestamp-millis=1697018249000 for the table default.T in any catalog
SET 'paimon.*.default.T.scan.timestamp-millis' = '1697018249000';
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we just support paimon.{key} for all tables?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for all table may cause unexpected behavior, if we use two table but only want to set option to one of them

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, there is unexpected behavior, but there is also user case to define options for all tables.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe not scan.timestamp-millis, but some options about execution.

Copy link
Contributor Author

@Aitozi Aitozi Oct 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, there is unexpected behavior, but there is also user case to define options for all tables.

if user want to define for all tables they can set paimon.*.*.*.config_key=value. This means all tables will take this config_key. if user want to set options for table in a database, they can set paimon.*.db.*.config_key=value. They just have to keep in mind that these three part controls the catalog, database, and table scope.

@Aitozi
Copy link
Contributor Author

Aitozi commented Oct 19, 2023

Updated, Please take a look again @JingsongLi

Copy link
Contributor

@JingsongLi JingsongLi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

@JingsongLi JingsongLi merged commit 5c001b4 into apache:master Oct 24, 2023
9 checks passed
pongandnoon pushed a commit to tongcheng-elong/incubator-paimon that referenced this pull request Nov 1, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Feature] Support to set the table option by dynamic global job configuration
2 participants