Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix tracing existing entries when there are deletes #1046

Merged
merged 1 commit into from
Aug 13, 2024

Conversation

Fokko
Copy link
Contributor

@Fokko Fokko commented Aug 12, 2024

Resolves #1044

cc @sungwy I'm afraid that we also want to include this in 0.7.1 :(

@sungwy sungwy added this to the PyIceberg 0.7.1 release milestone Aug 12, 2024
Copy link
Collaborator

@sungwy sungwy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

# Based on the metadata, it is unsure to say if the file can be deleted
partial_rewrites_needed = True
# Based on the metadata, we cannot determine if it can be deleted
existing_entries.append(_copy_with_new_status(entry, ManifestEntryStatus.EXISTING))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, so even when we are rewriting the data partially, we still need to add the new manifestentries as "existing" entries in order to track the new data files that are re-written.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, these files are unaffected by the delete and should be kept in the manifest as an existing entry. I should have tested more extensively 😱

@sungwy
Copy link
Collaborator

sungwy commented Aug 12, 2024

cc @sungwy I'm afraid that we also want to include this in 0.7.1 :(

No worries @Fokko - this was such a large release, and I'm quite glad that all these issues are being reported as soon as we went live with the minor release :) It's creating a great opportunity for us to vamp up our test suite.

I've just canceled the VOTE so that this fix can be included with this patch release.

@Fokko
Copy link
Contributor Author

Fokko commented Aug 12, 2024

@sungwy Thanks for the quick follow-up, appreciate it 👍

@sungwy sungwy merged commit e891bcd into apache:main Aug 13, 2024
7 checks passed
sungwy pushed a commit to sungwy/iceberg-python that referenced this pull request Dec 7, 2024
sungwy pushed a commit to sungwy/iceberg-python that referenced this pull request Dec 7, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Deleting with condition in partitioned tables is buggy
2 participants