[flink] Bypass operator should never block checkpoint #4039

Merged: 2 commits, Aug 23, 2024

JingsongLi
Contributor

Purpose

The previous AppendBypassCoordinateOperator spent a lot of time generating compact tasks when there were many files. This significantly blocked the operator's normal threads and could even delay checkpoints.

This PR changes it as follows:

  1. Scan for compact tasks in a separate thread.
  2. Send compact tasks downstream only after making sure there is no message in the mailbox.
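
A minimal sketch of the two steps above, written against plain `java.util.concurrent` rather than Flink's operator and mailbox APIs. Every name here (`AsyncPlanSketch`, `startPlanning`, `emitWhileIdle`, `mailboxHasMail`) is hypothetical and is not taken from the PR; the mailbox check is modeled as a `BooleanSupplier` because the way the real operator queries the mailbox is not shown in this conversation. The scan here runs once, which may differ from the actual operator.

```java
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;
import java.util.function.Consumer;
import java.util.function.Supplier;

/** Hypothetical stand-in for the coordinator operator; not the Paimon class. */
class AsyncPlanSketch<T> {
    // Thread-safe hand-off between the planner thread and the task thread.
    private final ConcurrentLinkedQueue<T> pendingTasks = new ConcurrentLinkedQueue<>();

    // Single daemon thread that runs the potentially slow scan.
    private final ExecutorService planner =
            Executors.newSingleThreadExecutor(
                    r -> {
                        Thread t = new Thread(r, "compaction-planner");
                        t.setDaemon(true);
                        return t;
                    });

    /** Step 1: run the scan off the task thread and queue whatever it finds. */
    void startPlanning(Supplier<List<T>> scan) {
        planner.execute(() -> pendingTasks.addAll(scan.get()));
    }

    /**
     * Step 2: called on the task thread. Emits queued tasks until the queue is
     * empty or the mailbox reports pending mail, then returns the number emitted.
     */
    int emitWhileIdle(BooleanSupplier mailboxHasMail, Consumer<T> output) {
        int emitted = 0;
        while (!mailboxHasMail.getAsBoolean()) {
            T task = pendingTasks.poll();
            if (task == null) {
                break; // nothing planned yet; try again on the next timer tick
            }
            output.accept(task);
            emitted++;
        }
        return emitted;
    }

    /** Stops the planner and waits for the submitted scan; true if it finished in time. */
    boolean awaitPlanning() {
        planner.shutdown();
        try {
            return planner.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

The point of the design is that the task thread never waits on the scan: it only drains results that are already available, and it yields as soon as other mail (such as a checkpoint trigger) is pending.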

Tests

API and Format

Documentation

output.collect(new StreamRecord<>(Either.Right(task)));
}

compactTasks.addAll(tasks);
Contributor

Should we check tasks.isEmpty() first, and call compactTasks.addAll(tasks) only if it is not empty?

Contributor Author

Why?

Contributor

Because if tasks.isEmpty() is true, there is no need to call compactTasks.addAll(tasks).

Contributor Author

JingsongLi, Aug 23, 2024

So what is the benefit?
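
For reference on this thread, a minimal sketch comparing the two variants. It assumes compactTasks is a standard `java.util` collection; its actual type is not shown in this conversation, so `ArrayDeque` is used here purely for illustration, and the class and method names are hypothetical. For the standard collections, `addAll` with an empty argument leaves the target unchanged and returns `false`, so both variants end in the same state.

```java
import java.util.List;
import java.util.Queue;

/** Hypothetical helper for illustration; not part of the PR. */
class AddAllEmptySketch {
    /** Unconditional variant, as in the diff under review. */
    static <T> boolean addUnconditionally(Queue<T> compactTasks, List<T> tasks) {
        return compactTasks.addAll(tasks);
    }

    /** Guarded variant, as suggested in the review comment. */
    static <T> boolean addIfNotEmpty(Queue<T> compactTasks, List<T> tasks) {
        if (!tasks.isEmpty()) {
            return compactTasks.addAll(tasks);
        }
        return false;
    }
}
```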

@Override
public void onProcessingTime(long time) {
while (true) {
public void asyncPlan(UnawareAppendTableCompactionCoordinator coordinator) {
Contributor

private

@wwj6591812
Contributor

wwj6591812 commented Aug 22, 2024

Hi, thanks for this, @JingsongLi.

I have a question about the claim that generating compact tasks blocks normal threads and even delays checkpoints.
In my test, checkpoints were normal. The problem I saw is that the first operator has a parallelism of 1024 while the Compact Coordinator has a parallelism of 1, which causes back pressure.

So, should we support both this new topology and the old union topology?

(Two images attached.)

@JingsongLi
Contributor Author

JingsongLi commented Aug 23, 2024

@wwj6591812

What is your point? Do you mean that you are using the new code and the job still has back pressure?
If not, please validate this first.

@wwj6591812
Contributor

@wwj6591812

What is your point? Do you mean that you are using the new code and the job still has back pressure? If not, please validate this first.

OK, I will test it soon after the merge.

Contributor

leaves12138 left a comment

+1

leaves12138 merged commit 16cf881 into apache:master on Aug 23, 2024
9 of 10 checks passed
3 participants