Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[core] Optizime IN filter pushdown to snapshot/tag/schema system tables #4436

Merged
merged 10 commits into from
Nov 6, 2024

Conversation

xuzifu666
Copy link
Member

Purpose

Linked issue: close #xxx

  1. Currently IN filter cannot pushdown to snapshot/tag/schema system tables and IN would tramsform to OR(contians all Equal LeafPredicate), if user query with IN filter would cost more unnessary IO in a large number of snapshots/tag/schema;
  2. TagTable predicate specified as LeafPredicate which cannot resolve other kind of Predicate, this pr had improved it.
  3. Add APIs for get snapshots/tags/schemas which get multiple numbers for a list argument.

such like sql: select snapshot_id, schema_id, commit_user from T$snapshots where snapshot_id in (1, 3);
before the pr would query all snapshots files firstly, then filter with 1 and 3;
after the pr would query only snapshots file with 1 and 3, not query all snapshots file, which can reduce unnessary IO.
note: if filter contains other fields would handle regress as before( like: select snapshot_id, schema_id, commit_user from T$snapshots where snapshot_id in (1, 3) or/and commit_user='xxx'; ), just work only snapshot_id IN filter.

Tests

API and Format

Documentation

@xuzifu666 xuzifu666 changed the title System opt in [core] Optizeme IN filter pushdown to snapshot/tag/schema system tables Nov 4, 2024
@xuzifu666 xuzifu666 changed the title [core] Optizeme IN filter pushdown to snapshot/tag/schema system tables [core] Optizime IN filter pushdown to snapshot/tag/schema system tables Nov 4, 2024
@xuzifu666 xuzifu666 closed this Nov 4, 2024
@xuzifu666 xuzifu666 reopened this Nov 4, 2024
@xuzifu666 xuzifu666 closed this Nov 4, 2024
@xuzifu666 xuzifu666 reopened this Nov 4, 2024
@xuzifu666 xuzifu666 closed this Nov 4, 2024
@xuzifu666 xuzifu666 reopened this Nov 4, 2024
@@ -128,6 +128,10 @@ public List<TableSchema> listAll() {
return listAllIds().stream().map(this::schema).collect(Collectors.toList());
}

public List<TableSchema> schemasWithIds(List<Long> schemaIds) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

schemasWithId

@@ -203,6 +205,7 @@ private class SchemasRead implements InnerTableRead {

private Optional<Long> optionalFilterSchemaIdMax = Optional.empty();
private Optional<Long> optionalFilterSchemaIdMin = Optional.empty();
private List<Long> schemaIds = new ArrayList<>();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

final

@@ -417,6 +417,13 @@ public List<Path> snapshotPaths(Predicate<Long> predicate) throws IOException {
.collect(Collectors.toList());
}

public Iterator<Snapshot> snapshotsWithIds(List<Long> snapshotIds) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

snapshotsWithId

@@ -206,6 +208,7 @@ private class SnapshotsRead implements InnerTableRead {
private RowType readType;
private Optional<Long> optionalFilterSnapshotIdMax = Optional.empty();
private Optional<Long> optionalFilterSnapshotIdMin = Optional.empty();
private List<Long> snapshotIds = new ArrayList<>();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

final

@@ -226,6 +229,22 @@ public InnerTableRead withFilter(Predicate predicate) {
handleLeafPredicate(leaf, leafName);
}
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

List children = compoundPredicate.children();
if ((compoundPredicate.function()) instanceof And) {
xxx;
} else if (if ((compoundPredicate.function()) instanceof And) {) {
xxx;
} else {
throw xxx;
}

TagManager tagManager = new TagManager(fileIO, location, branch);
String leafName = "tag_name";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

first : private static final TAG_NAME = "tag_name";
second : replace all "tag_name" to TAG_NAME in this class

!= null) {
String equalValue =
((LeafPredicate) leaf).literals().get(0).toString();
predicateMap.put(equalValue, tagManager.tag(equalValue));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if (tagManager.tagExists(equalValue)) {
predicateMap.put(equalValue, tagManager.tag(equalValue));
}

TagManager tagManager = new TagManager(fileIO, location, branch);
String leafName = "tag_name";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

first : private static final TAG_NAME = "tag_name";
second : replace all "tag_name" to TAG_NAME in this class

List<Predicate> children = compoundPredicate.children();
for (Predicate leaf : children) {
if (leaf instanceof LeafPredicate
&& (((LeafPredicate) leaf).function() instanceof Equal)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why here not judge:
((LeafPredicate) predicate).literals().get(0) instanceof BinaryString
?

!= null) {
String equalValue =
((LeafPredicate) leaf).literals().get(0).toString();
predicateMap.put(equalValue, tagManager.tag(equalValue));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if (tagManager.tagExists(equalValue)) {
predicateMap.put(equalValue, tagManager.tag(equalValue));
}

@xuzifu666
Copy link
Member Author

@wwj6591812 Thanks review,had addressed.

@wwj6591812
Copy link
Contributor

+1

1 similar comment
@LinMingQiang
Copy link
Contributor

+1

@@ -223,6 +226,22 @@ public InnerTableRead withFilter(Predicate predicate) {
handleLeafPredicate(leaf, leafName);
}
}

// optimize for IN filter
if ((compoundPredicate.function()) instanceof Or) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should extract a class to reduce duplicate codes, this can be done in a new PR.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK,would refactor it as suggest,Thanks @JingsongLi

Copy link
Contributor

@JingsongLi JingsongLi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

@JingsongLi JingsongLi merged commit 321c8ef into apache:master Nov 6, 2024
12 checks passed
hang8929201 pushed a commit to hang8929201/paimon that referenced this pull request Nov 7, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants