[flink] Bypass operator should never block checkpoint #4039

JingsongLi · 2024-08-22T14:15:21Z

Purpose

The previous AppendBypassCoordinateOperator spent a lot of time generating compact tasks when there were many files, leading to significant blocking of normal threads and even resulting in checkpoint delays.

Changes:

Scan in a separate thread.
Send compact tasks downstream only when there are no messages in the mailbox.
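
The two changes listed above can be sketched in plain Java, without Flink on the classpath. This is only an illustration of the idea, not the code in this PR: the class `BypassCoordinatorSketch`, the method `emitWhileIdle`, and the `mailboxHasMail` callback are hypothetical names, and how the real operator inspects the mailbox is not shown in this conversation. The scan runs on a separate planner thread and buffers its result in a queue; the task thread drains that queue only while the mailbox reports no pending mail, so other mail is never kept waiting behind compact-task emission.

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.BooleanSupplier;
import java.util.function.Consumer;
import java.util.function.Supplier;

/** Hypothetical stand-in for the bypass coordinator; not the Paimon class. */
final class BypassCoordinatorSketch<T> implements AutoCloseable {

    // Tasks produced by the planner thread and consumed by the task thread.
    private final BlockingQueue<T> compactTasks = new LinkedBlockingQueue<>();

    // Single daemon thread that runs the potentially slow scan.
    private final ExecutorService planner =
            Executors.newSingleThreadExecutor(
                    r -> {
                        Thread t = new Thread(r, "async-plan-sketch");
                        t.setDaemon(true);
                        return t;
                    });

    /** Runs the scan off the task thread and buffers whatever it returns. */
    CompletableFuture<Void> asyncPlan(Supplier<List<T>> scan) {
        return CompletableFuture.runAsync(() -> compactTasks.addAll(scan.get()), planner);
    }

    /**
     * Called from the task thread, for example from a timer callback.
     * Emits buffered tasks one at a time and stops as soon as the mailbox
     * has pending mail or the buffer is empty. Returns the number emitted.
     */
    int emitWhileIdle(BooleanSupplier mailboxHasMail, Consumer<T> output) {
        int emitted = 0;
        while (!mailboxHasMail.getAsBoolean()) {
            T task = compactTasks.poll(); // non-blocking; null when the buffer is empty
            if (task == null) {
                break;
            }
            output.accept(task);
            emitted++;
        }
        return emitted;
    }

    @Override
    public void close() {
        planner.shutdownNow();
    }
}
```

In this sketch the mailbox check is re-evaluated before every emitted task, so a long backlog of compact tasks yields to newly arrived mail after at most one emission.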

Tests

API and Format

Documentation

wwj6591812 · 2024-08-22T15:28:13Z

...link-common/src/main/java/org/apache/paimon/flink/source/AppendBypassCoordinateOperator.java

-                output.collect(new StreamRecord<>(Either.Right(task)));
-            }
-
+            compactTasks.addAll(tasks);


Should we first check whether tasks.isEmpty(), and call compactTasks.addAll(tasks) only when it is not empty?

If tasks is empty, there is no need to call compactTasks.addAll(tasks).

So what is the benefit?
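
As background for the exchange above, here is a small sketch of what `addAll` does when given an empty collection. It assumes `compactTasks` is a `java.util.concurrent.LinkedBlockingQueue`, which the diff does not show; the class name `EmptyAddAllSketch` and the helper `buffer` are hypothetical. For a standard JDK queue, adding an empty collection leaves the queue unchanged and returns `false`, so an `isEmpty()` guard does not change the result.

```java
import java.util.Collections;
import java.util.List;
import java.util.concurrent.LinkedBlockingQueue;

/** Hypothetical demo class; not part of the PR. */
final class EmptyAddAllSketch {

    /** Adds tasks to the queue and reports whether the queue changed. */
    static boolean buffer(LinkedBlockingQueue<String> compactTasks, List<String> tasks) {
        return compactTasks.addAll(tasks);
    }

    public static void main(String[] args) {
        LinkedBlockingQueue<String> compactTasks = new LinkedBlockingQueue<>();
        // An empty batch is a no-op: nothing is added and addAll returns false.
        System.out.println(buffer(compactTasks, Collections.emptyList())); // false
        System.out.println(compactTasks.size()); // 0
    }
}
```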

wwj6591812 · 2024-08-22T15:33:51Z

...link-common/src/main/java/org/apache/paimon/flink/source/AppendBypassCoordinateOperator.java

-    @Override
-    public void onProcessingTime(long time) {
-        while (true) {
+    public void asyncPlan(UnawareAppendTableCompactionCoordinator coordinator) {


wwj6591812 · 2024-08-22T15:56:35Z

Hi, thanks for this, @JingsongLi.

I have a question: you say that generating compact tasks blocks normal threads and even delays checkpoints.
In my test, however, checkpoints are normal. The problem is that the first operator's parallelism is 1024 while the Compact Coordinator's parallelism is 1, which causes back pressure.

So, should we support both this new topology and the old union topology simultaneously?

JingsongLi · 2024-08-23T01:32:45Z

@wwj6591812

What is your point? Do you mean that the job still has back pressure with the new code?
If not, please validate this first.

wwj6591812 · 2024-08-23T01:37:17Z

@wwj6591812

What is your point? Do you mean that the job still has back pressure with the new code? If not, please validate this first.

OK, I will test it soon after the merge.

leaves12138

+1

[flink] Bypass operator should never block checkpoint (commit 7445695)

wwj6591812 reviewed Aug 22, 2024

fix (commit 1b9f78d)

leaves12138 approved these changes Aug 23, 2024

leaves12138 merged commit 16cf881 into apache:master Aug 23, 2024
9 of 10 checks passed
