[Bug] Branch can only read data in latest schema #4407

Closed
1 of 2 tasks
gmdfalk opened this issue Oct 30, 2024 · 1 comment · Fixed by #4454
Labels
bug Something isn't working

Comments

gmdfalk commented Oct 30, 2024

Search before asking

  • I searched in the issues and found nothing similar.

Paimon version

1.0-SNAPSHOT

Compute Engine

Flink

Minimal reproduce step

  1. Create table with schema version 1
  2. Insert some values
  3. Alter table to create schema version 2 (but don't insert any new values)
  4. Create tag & branch
  5. Select * from branch

The SELECT job fails because the branch contains only schema version 2, yet the existing data was written with schema version 1 and has to be read with it.
Example stacktrace: https://gist.github.com/gmdfalk/802eb18c912a4d85e17f206820a0c55a

What doesn't meet your expectations?

I expect the branch to be able to read all of its data, not just data written in the latest schema.
Currently the branch only knows schema 2 and cannot read entries written in schema 1.

Anything else?

No response

Are you willing to submit a PR?

  • I'm willing to submit a PR!

gmdfalk added the bug (Something isn't working) label Oct 30, 2024

liming30 (Contributor) commented Nov 4, 2024

@gmdfalk Thanks for your report! Here are more detailed reproduction steps:

CREATE TABLE T (
    pt INT,
    k INT,
    v STRING,
    PRIMARY KEY (pt, k) NOT ENFORCED
) PARTITIONED BY (pt) WITH (
    'bucket' = '2'
)
  1. INSERT INTO T VALUES (1, 10, 'apple'), (1, 20, 'banana')
  2. ALTER TABLE T ADD (v2 INT)
  3. INSERT INTO T VALUES (2, 10, 'cat', 2), (2, 20, 'dog', 2)
  4. CALL sys.create_tag('default.T', 'tag1', 2)
  5. CALL sys.create_branch('default.T', 'test', 'tag1')
  6. SELECT * FROM T$branch_test

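The cause of this bug is that creating a branch copies only a single schema: the tag's schema, or the latest schema. Data written under an earlier schema therefore cannot be read from the branch.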
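
To make the failure mode concrete, here is a toy Python model of it. This is a sketch under stated assumptions and not Paimon code: the function names, the `copy_all_earlier` flag, and the dict-based "schema directory" are all hypothetical, and the schema ids 1 and 2 simply follow the numbering used in the original report. Copying every schema up to the tag's schema is shown as one possible remedy; it is not necessarily what the fix in #4454 does.

```python
# Toy model only: plain dicts stand in for the schema directory and data files.
# None of these names are Paimon APIs.

def create_branch(main_schemas, tag_schema_id, copy_all_earlier):
    """Return the schema-id -> field-list mapping visible to the new branch."""
    if copy_all_earlier:
        # Possible remedy: copy every schema up to and including the tag's schema.
        return {sid: fields for sid, fields in main_schemas.items()
                if sid <= tag_schema_id}
    # Behavior described above: only the tag's schema is copied.
    return {tag_schema_id: main_schemas[tag_schema_id]}


def read_branch(branch_schemas, data_files):
    """Read all rows, resolving each file against the schema that wrote it."""
    latest = branch_schemas[max(branch_schemas)]
    rows = []
    for schema_id, file_rows in data_files:
        if schema_id not in branch_schemas:
            raise LookupError(f"schema {schema_id} is missing from the branch")
        written = branch_schemas[schema_id]
        for row in file_rows:
            values = dict(zip(written, row))
            # Fields added after the file was written are read as None.
            rows.append(tuple(values.get(name) for name in latest))
    return rows


# Mirrors the reproduction steps: two rows per schema version.
main_schemas = {1: ["pt", "k", "v"], 2: ["pt", "k", "v", "v2"]}
data_files = [
    (1, [(1, 10, "apple"), (1, 20, "banana")]),
    (2, [(2, 10, "cat", 2), (2, 20, "dog", 2)]),
]
```

In this model, `read_branch(create_branch(main_schemas, 2, copy_all_earlier=False), data_files)` raises `LookupError` on the first file, which corresponds to the failing `SELECT * FROM T$branch_test`. With `copy_all_earlier=True` the same read succeeds and the older rows come back with `None` for `v2`.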
