Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[core] Enable file index for DV table #4310

Merged
merged 1 commit into from
Nov 4, 2024

Conversation

leaves12138
Copy link
Contributor

@leaves12138 leaves12138 commented Oct 11, 2024

Purpose

File index is only support Append Only tables yet. But it is able to support primary-key table like Deletion Vector Table.

This pull request, enable File Index in DV table.

Tests

API and Format

Documentation

@@ -121,10 +134,33 @@ protected boolean filterByStats(ManifestEntry entry) {
}

return filter.test(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add "long schemaId = entry.file().schemaId();" in KeyValueFileStoreScan#filterByStats, and then use schemaId in KeyValueFileStoreScan#filterByStats and KeyValueFileStoreScan#testFileIndex.

new FileIndexPredicate(embeddedIndexBytes, dataRowType)) {
return predicate.testPredicate(dataPredicate);
} catch (IOException e) {
throw new RuntimeException("Exception happens while checking predicate.", e);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Exception happens while checking fileIndex predicate.


private boolean testFileIndex(@Nullable byte[] embeddedIndexBytes, ManifestEntry entry) {
if (embeddedIndexBytes == null) {
return true;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If fileIndex does not exist in embeddedIndex but in a separate file, what logic of filter by FileIndex in pk table with DV?

@@ -238,7 +238,8 @@ private KeyValueFileStoreScan newScan(boolean forWrite) {
options.scanManifestParallelism(),
options.deletionVectorsEnabled(),
options.mergeEngine(),
options.changelogProducer());
options.changelogProducer(),
options.fileIndexReadEnabled() && options.deletionVectorsEnabled());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add "File index support Append Only tables and pk table with DV" in fileIndex doc

Copy link
Contributor

@JingsongLi JingsongLi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

@JingsongLi JingsongLi merged commit d5040b5 into apache:master Nov 4, 2024
11 of 12 checks passed
hang8929201 pushed a commit to hang8929201/paimon that referenced this pull request Nov 7, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants