Search before asking
Paimon version
0.8-snapshot
Compute Engine
spark 3.1
flink-1.17.2
Minimal reproduce step
A large production table with frequent data writes and updates, containing roughly 600 million rows, is synchronized via CDC into the payment table. After running for a while, the table contains rows with duplicate primary keys (queries through both SparkSQL and FlinkSQL return the duplicated data). After a compaction is executed, rows with the same primary key are merged.
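The duplication described above can be confirmed with a simple aggregation query. This is a sketch only — it assumes the primary key column is named `id` and the table is named `payment`; substitute the actual key column(s) of the affected table:

```sql
-- Hypothetical check: list primary keys that appear more than once.
-- Assumes the primary key column is `id`; adjust to the real schema.
SELECT id, COUNT(*) AS cnt
FROM payment
GROUP BY id
HAVING COUNT(*) > 1;
```

If this query returns rows before compaction but none after, it matches the behavior reported here (duplicates visible at read time, merged away by compact).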
What doesn't meet your expectations?
The query result does not have duplicate primary keys
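For reference, the compaction step mentioned in the reproduce step can be triggered from Flink SQL via Paimon's compact procedure. This is a sketch under assumptions — the database/table names are placeholders, and the exact procedure signature may differ across Paimon versions:

```sql
-- Hypothetical example: manually trigger a full compaction on the
-- affected table (database and table names are placeholders).
CALL sys.compact(`table` => 'my_db.payment');
```

After this runs, rows sharing a primary key are merged, which is why the duplicates disappear post-compact.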
Anything else?
No response
Are you willing to submit a PR?
I'm willing to submit a PR!