
[core] Support parallel close writers #4297

Closed
wants to merge 9 commits

Conversation

neuyilan
Member

@neuyilan neuyilan commented Oct 9, 2024

Support parallel close of writers.

Currently, when generating snapshots, the files of each bucket under each partition are closed sequentially, which takes a long time. The purpose of this PR is to enable parallel close of these files.

Documentation

Users only need to modify the table properties to adjust the number of threads used to close files concurrently.

-- flink sql
ALTER TABLE my_table SET (
    'table.close-writers-thread-number' = '16' 
);
-- spark sql
ALTER TABLE my_table SET TBLPROPERTIES (
    'table.close-writers-thread-number' = '16' 
);
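The mechanism described above can be sketched in plain Java. This is a minimal illustration, not Paimon's actual code: the `Writer` interface and `closeAll` helper are hypothetical names, standing in for the per-bucket writers that are submitted to a fixed-size thread pool instead of being closed one by one.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelCloser {

    // Hypothetical stand-in for a per-bucket file writer.
    interface Writer {
        void close() throws Exception;
    }

    // Close all writers concurrently on a pool sized by the table property
    // (e.g. 'table.close-writers-thread-number' = '16'), then wait for all
    // of them to finish so the snapshot only commits fully closed files.
    static void closeAll(List<Writer> writers, int threads) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<?>> futures = new ArrayList<>();
            for (Writer w : writers) {
                futures.add(pool.submit(() -> {
                    w.close();
                    return null;
                }));
            }
            for (Future<?> f : futures) {
                f.get(); // rethrows the first close failure, if any
            }
        } finally {
            pool.shutdown();
        }
    }
}
```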

@neuyilan neuyilan closed this Oct 14, 2024
@neuyilan neuyilan reopened this Oct 14, 2024
@JingsongLi
Contributor

It looks like the test failed.

@neuyilan neuyilan closed this Oct 16, 2024
@neuyilan neuyilan reopened this Oct 16, 2024
@neuyilan neuyilan marked this pull request as draft October 17, 2024 00:36
@neuyilan neuyilan marked this pull request as ready for review October 18, 2024 06:14
@neuyilan
Member Author

It looks like the test failed.

Hi Jingsong, thanks for your review.

The logic of this PR is relatively straightforward: it concurrently flushes and closes the files of the buckets under each partition during the precommit phase. With this modification, all close operations for the writers of those buckets are handled by a thread pool during precommit.

However, this PR is encountering the following error during IT testing [1]. Could you offer some suggestions? I'm not clear on the cause of this error.

[screenshot of the failing test]

[1] https://github.com/apache/paimon/actions/runs/11399443830/job/31718241423

@JingsongLi
Contributor

JingsongLi commented Oct 31, 2024

You should set the thread context classloader in the async thread.
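This suggestion can be sketched as a wrapper that propagates the submitting thread's context classloader to the pool's worker thread (otherwise the worker keeps the pool's default classloader, which can break `Class.forName` or `ServiceLoader` lookups in a Flink runtime). The wrapper below is a generic illustration, not Paimon's actual fix.

```java
public class ContextClassLoaderRunnable {

    // Capture the caller's context classloader at submission time and
    // install it on the worker thread for the duration of the task,
    // restoring the previous classloader afterwards.
    static Runnable withContextClassLoader(Runnable task) {
        ClassLoader captured = Thread.currentThread().getContextClassLoader();
        return () -> {
            Thread worker = Thread.currentThread();
            ClassLoader previous = worker.getContextClassLoader();
            worker.setContextClassLoader(captured);
            try {
                task.run();
            } finally {
                worker.setContextClassLoader(previous);
            }
        };
    }
}
```

Tasks submitted to the close-writer pool would then be wrapped, e.g. `pool.submit(withContextClassLoader(closeTask))`.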

@JingsongLi
Contributor

But overall, I don't think this PR should optimize things this way; in this situation, you should try to minimize the number of partitions as much as possible (I guess there are too many partitions, causing too many writers)?

@JingsongLi JingsongLi closed this Oct 31, 2024
@neuyilan
Member Author

But overall, I don't think this PR should optimize things this way; in this situation, you should try to minimize the number of partitions as much as possible (I guess there are too many partitions, causing too many writers)?

In fact, there are not many partitions or buckets. In our scenario, the Flink job consumes data from Talos (a Kafka-like system), so within one checkpoint interval the stream may span many partitions and buckets. As a result, at each checkpoint the bucket data of every partition is flushed sequentially, leading to long processing times.

@neuyilan
Member Author

neuyilan commented Nov 5, 2024

@JingsongLi What do you think? I think it's common sense.
