[flink] When reading DataSplit, calculate fetch event time lag with earliest file creation time instead of latest #3952

Merged
merged 1 commit into from
Aug 13, 2024

tsreaper
Contributor

Purpose

Currently, when reading a DataSplit, we calculate currentFetchEventTimeLag from the latest file creation time in the split. This is incorrect: fetch event time lag should be defined by the oldest data in the split, so using the latest creation time under-reports the lag.

This PR changes this behavior by using the earliest file creation time to calculate fetch event time lag.
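
To illustrate the difference, here is a minimal sketch of the calculation, not the actual Paimon code. The class `FetchLagSketch` and the method `fetchEventTimeLagMillis` are hypothetical names chosen for this example, and it assumes the lag is simply the fetch time minus a file creation time, both in epoch milliseconds.

```java
import java.util.Arrays;
import java.util.OptionalLong;

// Hypothetical illustration only; names do not come from the Paimon code base.
class FetchLagSketch {

    /**
     * Lag between the fetch time and the oldest file in a split.
     * Returns empty when the split has no files, since no lag can be defined.
     */
    static OptionalLong fetchEventTimeLagMillis(long fetchTimeMillis, long... fileCreationTimesMillis) {
        // Use the earliest (minimum) creation time: the oldest data defines the lag.
        OptionalLong earliest = Arrays.stream(fileCreationTimesMillis).min();
        return earliest.isPresent()
                ? OptionalLong.of(fetchTimeMillis - earliest.getAsLong())
                : OptionalLong.empty();
    }

    public static void main(String[] args) {
        long fetchTime = 10_000L;
        long[] creationTimes = {1_000L, 4_000L, 2_500L};

        // Earliest file was created at 1000, so the lag is 10000 - 1000.
        System.out.println(fetchEventTimeLagMillis(fetchTime, creationTimes).getAsLong()); // prints 9000

        // The previous behavior (latest file, created at 4000) would report 10000 - 4000.
        long latest = Arrays.stream(creationTimes).max().getAsLong();
        System.out.println(fetchTime - latest); // prints 6000
    }
}
```

With files created at 1000, 4000 and 2500 and a fetch at 10000, the earliest-based lag is 9000, while the latest-based value of 6000 hides the fact that some data in the split is older.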

Tests

Tested by hand.

API and Format

No format changes.

Documentation

No new feature.

…arliest file creation time instead of latest
@JingsongLi
Contributor

+1

@JingsongLi JingsongLi merged commit 7dd2edb into apache:master Aug 13, 2024
9 of 10 checks passed
2 participants