Search before asking
I searched in the issues and found nothing similar.
Paimon version
0.9
Compute Engine
spark-3.2
Minimal reproduce step
Spark SQL:
Create a bucket-unaware table without primary keys, as shown below:
create table paimon.paimon_test.test_compact(id int, data string) TBLPROPERTIES('bucket' = '-1');
Add a few records into table paimon.paimon_test.test_compact:
insert into paimon.paimon_test.test_compact values(1, 'data01'), (2, 'data02'),(3, 'data03'),(4, 'data04');
Call the compact procedure:
CALL paimon.sys.compact(table => 'paimon_test.test_compact', order_strategy => 'order', order_by => 'id');
At the same time, insert a new record from another spark-sql CLI:
insert into paimon.paimon_test.test_compact values(666, 'data666')
The insert succeeds, but the row (666, 'data666') is missing after the compaction finishes.
What doesn't meet your expectations?
The inserted row (666, 'data666') should not be lost. It would be better to throw a compaction error than to silently lose data.
Anything else?
I checked the code: the compact action is implemented the same way as the overwrite action, and the system always uses the latest snapshot to mark all files as deleted, instead of the snapshot on which the compact action was started. As a result, compact/overwrite currently only provides SNAPSHOT isolation rather than SERIALIZABLE isolation. I am not sure whether this is intended, but the current implementation of compaction is very dangerous in our scenario.
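To make the failure mode concrete, here is a minimal, self-contained Java sketch. This is not Paimon's actual commit code and every class, field, and method name below is made up for illustration only. It contrasts a commit that marks every file in the latest snapshot as deleted (the behavior described above) with a commit that refuses to proceed when files it never read have appeared since the base snapshot, which is the behavior this issue is asking for.

// Hypothetical sketch only; NOT Paimon's real commit code, all names are invented.
import java.util.ArrayList;
import java.util.List;

public class CompactionCommitSketch {

    // A snapshot is modeled as just the list of data files visible at a point in time.
    record Snapshot(long id, List<String> dataFiles) {}

    // What the compaction job produced: the snapshot it planned on, the files it read,
    // and the compacted files it wrote.
    record CompactionResult(long baseSnapshotId, List<String> filesRead, List<String> filesWritten) {}

    // Behavior described in this issue: every file in the *latest* snapshot is marked
    // deleted, so a file added by a concurrent insert after planning is silently dropped.
    static Snapshot commitAgainstLatestSnapshot(Snapshot latest, CompactionResult result) {
        return new Snapshot(latest.id() + 1, new ArrayList<>(result.filesWritten()));
    }

    // Behavior this issue asks for: if the latest snapshot contains files the compaction
    // never read (concurrent writes landed after the base snapshot), fail the commit
    // instead of losing them.
    static Snapshot commitWithConflictCheck(Snapshot latest, CompactionResult result) {
        List<String> addedConcurrently = new ArrayList<>(latest.dataFiles());
        addedConcurrently.removeAll(result.filesRead());
        if (!addedConcurrently.isEmpty()) {
            throw new IllegalStateException("Concurrent writes since snapshot "
                    + result.baseSnapshotId() + "; refusing to drop files " + addedConcurrently);
        }
        return new Snapshot(latest.id() + 1, new ArrayList<>(result.filesWritten()));
    }

    public static void main(String[] args) {
        // Compaction planned on snapshot 1, which had files f1 and f2.
        CompactionResult result = new CompactionResult(1, List.of("f1", "f2"), List.of("compacted-1"));
        // Meanwhile the concurrent insert (the row 666) committed snapshot 2 with a new file f3.
        Snapshot latest = new Snapshot(2, List.of("f1", "f2", "f3"));

        System.out.println(commitAgainstLatestSnapshot(latest, result).dataFiles()); // f3 (row 666) is gone
        try {
            commitWithConflictCheck(latest, result);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage()); // the compaction fails loudly instead
        }
    }
}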
Are you willing to submit a PR?
I'm willing to submit a PR!