Search before asking
Paimon version
0.8-snapshot
Compute Engine
spark 3.1
flink-1.17.2
Minimal reproduce step
A large production table with frequent data writes and updates, containing roughly 600 million rows, is synchronized via CDC into the payment table. After running for a while, the table contains rows with duplicate primary keys (queries through both SparkSQL and FlinkSQL return the duplicated data). After a compaction is executed, rows with the same primary key are merged.
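The duplication described above can be confirmed with a simple aggregation query. This is a sketch only — it assumes the primary key column is named `id` and the table is named `payment`; substitute the actual key column(s) of the affected table:

```sql
-- Hypothetical check: list primary keys that appear more than once.
-- Assumes the primary key column is `id`; adjust to the real schema.
SELECT id, COUNT(*) AS cnt
FROM payment
GROUP BY id
HAVING COUNT(*) > 1;
```

If this query returns rows before compaction but none after, it matches the behavior reported here (duplicates visible at read time, merged away by compact).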
What doesn't meet your expectations?
The query result does not have duplicate primary keys
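For reference, the compaction step mentioned in the reproduce step can be triggered from Flink SQL via Paimon's compact procedure. This is a sketch under assumptions — the database/table names are placeholders, and the exact procedure signature may differ across Paimon versions:

```sql
-- Hypothetical example: manually trigger a full compaction on the
-- affected table (database and table names are placeholders).
CALL sys.compact(`table` => 'my_db.payment');
```

After this runs, rows sharing a primary key are merged, which is why the duplicates disappear post-compact.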
Anything else?
No response
Are you willing to submit a PR?
I'm willing to submit a PR!