Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug]: Newly created pyiceberg table does not detect commit conflicts #10062

Closed
HaraldVanWoerkom opened this issue Dec 9, 2024 · 4 comments · Fixed by #10064
Closed

[Bug]: Newly created pyiceberg table does not detect commit conflicts #10062

HaraldVanWoerkom opened this issue Dec 9, 2024 · 4 comments · Fixed by #10064

Comments

@HaraldVanWoerkom
Copy link

What happened

Using pyiceberg and Nessie 0.99, I create a table.
Then I start multiple processes that write to that table in parallel. When these processes write at the same time, conflicts may occur, which is expected. However, for the first write, this conflict detection is not operational.
I asked about this with the pyiceberg people and they pointed me to the Nessie REST catalog implementation (see apache/iceberg-python#1366). Apparently there is an interface misinterpretation:

  • When a snapshot update is accompanied with an assert-ref-snapshot-id with an id set to None, this should be interpreted as "only do this commit if there is no snapshot", as is specified in the interface code documentation.
  • However, it seems that Nessie skips this assert and also applies the change if a snapshot is available at the time of commit
  • This overwrites the existing snapshot, thereby effectively deleting the existing information from the table

How to reproduce it

  1. Use pyiceberg and Nessie
  2. Create a table
  3. Launch 2 processes
  4. Let those processes write a lot of data (several MB). Let them write in two batches. Let them retry on conflict (e.g. use tenacity)
  5. After they are done check that all data from the two processes has been written
  6. You will find that the second batch will always be written, but the first batch from one of the processes will often be dropped (if you write enough data this is >90% reproduceable)

Nessie server type (docker/uber-jar/built from source) and version

build from source, 0.99.1

Client type (Ex: UI/Spark/pynessie ...) and version

pyiceberg

Additional information

No response

snazy added a commit to snazy/nessie that referenced this issue Dec 9, 2024
…ent properly

The current implementation of the `assert-ref-snapshot-id` update requirement neglects the `null` requirement case, which means "there must be no current snapshot". The fix is simple. Also added a check that the ref-name in that requirement-check is `main`.

Also updated a few cases to eliminate IDE warnings in the same class.

Fixes projectnessie#10062
snazy added a commit to snazy/nessie that referenced this issue Dec 9, 2024
…ent properly

The current implementation of the `assert-ref-snapshot-id` update requirement neglects the `null` requirement case, which means "there must be no current snapshot". The fix is simple. Also added a check that the ref-name in that requirement-check is `main`.

Also updated a few cases to eliminate IDE warnings in the same class.

Fixes projectnessie#10062
snazy added a commit that referenced this issue Dec 9, 2024
…ent properly (#10064)

The current implementation of the `assert-ref-snapshot-id` update requirement neglects the `null` requirement case, which means "there must be no current snapshot". The fix is simple. Also added a check that the ref-name in that requirement-check is `main`.

Also updated a few cases to eliminate IDE warnings in the same class.

Fixes #10062
@snazy
Copy link
Member

snazy commented Dec 9, 2024

Thanks for the bug report and analysis!

We'll publish a Nessie patch release with a fix shortly.

@HaraldVanWoerkom
Copy link
Author

Thank you for the swift response! We'll test it once we've integrated.

@HaraldVanWoerkom
Copy link
Author

We ran the reproduction test on the latest code and we can confirm that this solves the issue. Thank you.

@snazy
Copy link
Member

snazy commented Dec 10, 2024

We ran the reproduction test on the latest code and we can confirm that this solves the issue. Thank you.

Thanks for checking! Glad that it works for you now!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants