
[Bug] delete sql will fail when creating a table without explicitly declaring the number of buckets. #2956

Closed
1 of 2 tasks
Pandas886 opened this issue Mar 6, 2024 · 1 comment · Fixed by #3214
Labels
bug Something isn't working

Comments

@Pandas886
Contributor

Search before asking

  • I searched in the issues and found nothing similar.

Paimon version

0.7

Compute Engine

flink-1.17.2

Minimal reproduce step

CREATE TABLE IF NOT EXISTS use_be_hours_4 (
    user_id BIGINT,
    item_id BIGINT,
    behavior STRING,
    dt STRING,
    hh STRING,
    PRIMARY KEY (user_id) NOT ENFORCED
) 
with (
  'manifest.format'='orc',
  'changelog-producer' = 'lookup',
  'incremental-between-scan-mode'='changelog',
  'changelog-producer.row-deduplicate'='true'
  
);


INSERT INTO use_be_hours_4
VALUES
    (1, 1, 'watch', '2022-01-01', '10'),
    (2, 2, 'like', '2022-01-01', '10'),
    (0, 0, 'watch', '2022-01-01', '10');

-- THE ERROR BELOW OCCURS
DELETE FROM use_be_hours_4 WHERE user_id = 2;

ERROR MSG:

Caused by: java.lang.RuntimeException: java.lang.IllegalArgumentException: Can't extract bucket from row in dynamic bucket mode, you should use 'TableWrite.write(InternalRow row, int bucket)' method.
	at org.apache.paimon.table.TableUtils.deleteWhere(TableUtils.java:74)
	at org.apache.paimon.flink.sink.SupportsRowLevelOperationFlinkTableSink.executeDeletion(SupportsRowLevelOperationFlinkTableSink.java:175)
	at org.apache.flink.table.api.internal.TableEnvironmentImpl.executeInternal(TableEnvironmentImpl.java:889)
	at org.apache.flink.table.api.internal.TableEnvironmentImpl.executeInternal(TableEnvironmentImpl.java:874)
	at org.apache.flink.table.api.internal.TableEnvironmentImpl.executeInternal(TableEnvironmentImpl.java:991)
	at org.apache.flink.table.api.internal.TableEnvironmentImpl.executeSql(TableEnvironmentImpl.java:765)
	at org.dinky.executor.DefaultTableEnvironment.executeSql(DefaultTableEnvironment.java:300)
	at org.dinky.executor.Executor.executeSql(Executor.java:208)
	at org.dinky.job.builder.JobDDLBuilder.run(JobDDLBuilder.java:47)
	at org.dinky.job.JobManager.executeSql(JobManager.java:339)
	... 136 more
Caused by: java.lang.IllegalArgumentException: Can't extract bucket from row in dynamic bucket mode, you should use 'TableWrite.write(InternalRow row, int bucket)' method.
	at org.apache.paimon.table.sink.DynamicBucketRowKeyExtractor.bucket(DynamicBucketRowKeyExtractor.java:44)
	at org.apache.paimon.table.sink.TableWriteImpl.toSinkRecord(TableWriteImpl.java:148)
	at org.apache.paimon.table.sink.TableWriteImpl.writeAndReturn(TableWriteImpl.java:125)
	at org.apache.paimon.table.sink.TableWriteImpl.write(TableWriteImpl.java:116)
	at org.apache.paimon.table.TableUtils.deleteWhere(TableUtils.java:67)
	... 145 more

However, if I explicitly declare the number of buckets when creating the table, as in the statement below, the issue does not occur.

CREATE TABLE IF NOT EXISTS use_be_hours_4 (
    user_id BIGINT,
    item_id BIGINT,
    behavior STRING,
    dt STRING,
    hh STRING,
    PRIMARY KEY (user_id) NOT ENFORCED
) 
with (
  'manifest.format'='orc',
  'changelog-producer' = 'lookup',
  'incremental-between-scan-mode'='changelog',
  'changelog-producer.row-deduplicate'='true',
  'bucket' = '1'
  
);

What doesn't meet your expectations?

I expected the DELETE to run successfully regardless of whether I explicitly declare the number of buckets when creating the table (I assumed it defaults to one bucket if not declared).

Anything else?

No response

Are you willing to submit a PR?

  • I'm willing to submit a PR!
@Pandas886 added the bug label on Mar 6, 2024
@kingpluspk

Hi @Pandas886, the default value of 'bucket' is -1, which means dynamic bucket mode. So its value is -1, not 1, if you don't explicitly declare the number of buckets.
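
For an existing table created without a 'bucket' option, a possible workaround until the fix in #3214 is available is to switch the table to fixed-bucket mode. This is an untested sketch: it assumes Paimon's `ALTER TABLE ... SET` syntax for table options and that existing data must be rewritten with `INSERT OVERWRITE` after changing the bucket number.

```sql
-- Switch from dynamic bucket mode (bucket = -1, the default)
-- to fixed bucket mode with a single bucket.
ALTER TABLE use_be_hours_4 SET ('bucket' = '1');

-- Changing the bucket number generally requires rewriting existing data.
INSERT OVERWRITE use_be_hours_4 SELECT * FROM use_be_hours_4;

-- The DELETE from the reproduce step should now take the
-- fixed-bucket code path instead of failing in
-- DynamicBucketRowKeyExtractor.bucket().
DELETE FROM use_be_hours_4 WHERE user_id = 2;
```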
