Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix timestamp inaccuracy during domain scan #349

Merged
merged 6 commits into from
Apr 18, 2024

Conversation

mattjala
Copy link
Contributor

getNow() introduces inter-node inaccuracies which lead to issue #348 on POSIX systems. If it's available with precision, time.time() is better, so we use that instead.

Resolves #348.

@mattjala mattjala requested a review from jreadey April 16, 2024 21:58
@mattjala mattjala self-assigned this Apr 16, 2024
@jreadey
Copy link
Member

jreadey commented Apr 17, 2024

Is it still possible to get stuck in a loop for system without a precise time.time()?

@mattjala
Copy link
Contributor Author

Is it still possible to get stuck in a loop for system without a precise time.time()?

I don't think so - it hasn't cropped up on a Windows test runner yet.

@mattjala
Copy link
Contributor Author

This would be a slightly separate issue, but depending on inter-node timestamp accuracy for scans is a fragility issue in general. Would it be wise to modify the DN to respond to scan requests with 202 (Accepted) and 102 (In Progress) until the scan is complete? This would require users doing scans to re-poll a provided endpoint, so it'd be a breaking API change, but it would make scans more robust and might eliminate similar issues in the future.

@jreadey
Copy link
Member

jreadey commented Apr 18, 2024

If it's not causing a problem in the test runners, I'd leave the scan request stuff as is for now.
I'm planning to create a general workflow for async tasks (see https://github.com/HDFGroup/hsds/blob/master/docs/design/async_tasks/async_tasks.md). Once that's in place we can move the scan logic there and that should be a better solution.

Copy link
Member

@jreadey jreadey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm!

@mattjala mattjala merged commit 8e395e6 into HDFGroup:master Apr 18, 2024
24 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

MAX_WAIT_TIME for rescan should be a config option
2 participants