Search before asking
I searched in the issues and found nothing similar.
Paimon version
0.9.0
Compute Engine
flink 1.17.2
Minimal reproduce step
CREATE TABLE IF NOT EXISTS a_one_billion_table (
  id STRING,
  ... ...
  PRIMARY KEY (id) NOT ENFORCED
) WITH ('bucket' = '-1');

-- Make sure you have already inserted more than 1 billion rows into
-- `a_one_billion_table`, then run the new Flink job below. `source_table`
-- may have only 10,000 records, yet the checkpoint will fail.
INSERT INTO a_one_billion_table
SELECT * FROM source_table;
What doesn't meet your expectations?
The checkpoint fails with the following error:
2024-07-18 16:09:32,895 WARN org.apache.flink.runtime.taskmanager.Task [] - dynamic-bucket-assigner (1/1)#0 switched from RUNNING to FAILED with failure cause:
java.lang.IllegalArgumentException: Too large (1466616922 expected elements with load factor 0.75)
at org.apache.paimon.shade.it.unimi.dsi.fastutil.HashCommon.arraySize(HashCommon.java:208)
at org.apache.paimon.shade.it.unimi.dsi.fastutil.ints.Int2ShortOpenHashMap.<init>(Int2ShortOpenHashMap.java:103)
at org.apache.paimon.shade.it.unimi.dsi.fastutil.ints.Int2ShortOpenHashMap.<init>(Int2ShortOpenHashMap.java:116)
at org.apache.paimon.utils.Int2ShortHashMap.<init>(Int2ShortHashMap.java:35)
at org.apache.paimon.utils.Int2ShortHashMap$Builder.build(Int2ShortHashMap.java:70)
at org.apache.paimon.index.PartitionIndex.loadIndex(PartitionIndex.java:138)
at org.apache.paimon.index.HashBucketAssigner.loadIndex(HashBucketAssigner.java:166)
at org.apache.paimon.index.HashBucketAssigner.assign(HashBucketAssigner.java:83)
at org.apache.paimon.flink.sink.HashBucketAssignerOperator.processElement(HashBucketAssignerOperator.java:98)
at org.apache.flink.streaming.runtime.tasks.OneInputStreamTask$StreamTaskNetworkOutput.emitRecord(OneInputStreamTask.java:246)
at org.apache.flink.streaming.runtime.io.AbstractStreamTaskNetworkInput.processElement(AbstractStreamTaskNetworkInput.java:217)
at org.apache.flink.streaming.runtime.io.AbstractStreamTaskNetworkInput.emitNext(AbstractStreamTaskNetworkInput.java:169)
at org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:68)
at org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:616)
at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:231)
at org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:1080)
at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:1029)
at org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:959)
at org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:938)
at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:751)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:567)
at java.lang.Thread.run(Thread.java:879) [?:1.8.0_372]
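For context, the "Too large" message is raised by the shaded fastutil code at the top of the trace. The sketch below is not Paimon or fastutil source; it mirrors, in simplified form, the capacity check that HashCommon.arraySize applies: Int2ShortOpenHashMap backs its entries with a power-of-two array capped at 2^30 slots, so expected / loadFactor must fit under that cap.

// Illustrative sketch only -- simplified from the shaded fastutil logic in
// the stack trace, not Paimon source.
public class ArraySizeSketch {

    static int arraySize(int expected, float loadFactor) {
        long needed = Math.max(2L, nextPowerOfTwo((long) Math.ceil(expected / loadFactor)));
        if (needed > (1L << 30)) {
            throw new IllegalArgumentException(
                    "Too large (" + expected + " expected elements with load factor " + loadFactor + ")");
        }
        return (int) needed;
    }

    static long nextPowerOfTwo(long x) {
        return x <= 1 ? 1L : Long.highestOneBit(x - 1) << 1;
    }

    public static void main(String[] args) {
        // ceil(1_466_616_922 / 0.75) is about 1.96e9, which rounds up to
        // 2^31 > 2^30, reproducing the exception seen in the bucket assigner.
        System.out.println(arraySize(1_466_616_922, 0.75f));
    }
}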
Anything else?
No response
Are you willing to submit a PR?
I'm willing to submit a PR!
Please reopen this ticket; the issue is still not resolved. Increasing parallelism is only a workaround. Ideally, the resource allocation for the task should match the data flow, rather than the total amount of data at rest, and the current behavior does not meet that ideal. Is there any plan for optimization?
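As a back-of-the-envelope illustration of why raising parallelism works around the error (this assumes key hashes spread roughly evenly across assigner subtasks, which is an assumption here, not verified Paimon behavior; the 2^30 slot cap and 0.75 load factor come from the error above):

// Illustrative arithmetic only, not a Paimon API.
public class AssignerParallelismSketch {
    public static void main(String[] args) {
        long totalKeys = 1_466_616_922L;                      // from the error message
        long maxKeysPerSubtask = (long) (0.75 * (1L << 30));  // ~805,306,368
        long minParallelism = (totalKeys + maxKeysPerSubtask - 1) / maxKeysPerSubtask;
        // Prints 2: splitting the index across two or more subtasks keeps
        // each per-subtask map under the cap.
        System.out.println(minParallelism);
    }
}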