-
Notifications
You must be signed in to change notification settings - Fork 217
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Ensure tests are deterministic and faster #77
Open
gonzalo-bulnes
wants to merge
4
commits into
m-lundberg:master
Choose a base branch
from
gonzalo-bulnes:ensure-tests-are-deterministic
base: master
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
Ensure tests are deterministic and faster #77
gonzalo-bulnes
wants to merge
4
commits into
m-lundberg:master
from
gonzalo-bulnes:ensure-tests-are-deterministic
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
The inaccuracy is small, but it becomes visible when operating within tight margins of the `sample_time`, like for example, when working with a simulated clock in a test environment. Specifically, `dt` cannot be exactly zero because it is used as a divider. When it would be zero, it is set to a very small, conventional value instead, that I've renamed MIN_DT for clarity. When `dt == MIN_DT` it is really representing the case when it would be zero. Not taking that value into consideration when comparing the elapsed time since last computation does effectively increase `sample_time` by `MIN_DT`. It is not a lot, but it is not accurate either. By correcting the comparison, very precise use of `sample_time` becomes possible. One concrete use case is using simulated time for testing.
Unless I'm misunderstanding something, the time necessary for the system to converge should be stable, for any given configuration of the PID and a given runtime environment. I notice that the values 120s and 12s are close enough that 120 might have been a typo? Either way, my testing suggests that 10s is sufficient for the system to converge on my machine. However, the reliance on wall time for these tests means that how fast the code is being executed does affect that value. Before we fix that, I think that the 20% margin provided by the existing value of 12s is reasonnable. On my machine, this change reduces the overall time to run the tests suite by a factor of five (from ~2m14 to ~26s).
gonzalo-bulnes
force-pushed
the
ensure-tests-are-deterministic
branch
from
October 2, 2023 12:00
d841e8e
to
d4f55f5
Compare
This is ready for review on my end, let me know if you'd like any additions / removals / editions @m-lundberg, or generally if that's what you had in mind with #70. 🙂 Edit: I read my own docs, and used less |
gonzalo-bulnes
force-pushed
the
ensure-tests-are-deterministic
branch
from
October 2, 2023 13:30
1757e9f
to
a64b783
Compare
This has a few advantages: - Since time is frozen unless explicit resets or ticks happen, how long the test code takes to execute does not affect the test results. Specifically, this means that the `sample_time` can be matched precisely, and that makes for tests that are easier to reason about. - Since time is controlled, it can flow at any speed we want. Two consecutive calls to `stime.tick()` will make two seconds pass as fast as your machine can execute the two statements. The first advantage makes this test suite deterministic (because the PID configuration is stable in convergence tests). The second speeds up the overall execution of the test suite by a factor of a hundred on my machine (from ~26s to ~200ms).
The `dt` option was added when `time_fn` was not available to initialize PID instances. Now that `time_fn` allows the time function to be injected as a dependency, the `dt` option is redundant. It could be kept around for backward compatibility, but I'd personally remove it because IMHO the simpler the code the better.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Overview
This PR does two related things in three steps:
It also suggests the removal of some code, because the
time_fn
option makes thedt
option redundant (as demonstrated by the use of simulated time in the test suite). Context in #10. (:warning: That's very much YMMV, admittedly, and I can't judge if removing it breaks backwards-compatibility, it's your definition that matters, not mine. See commit message for more details.)🔮 There are extensive comments in the commit messages.
Fixes #70
Implementation
I happen to have written a drop-in replacement for
time.time()
andtime.monotonic()
some time ago, that provides simulated time for testing purposes. (stime on GitHub, and PyPi)The introduction of the
time_fn
option to inject the time function enables the best case scenario to introduce stime.This is the first time I actually use this little package, and I'm pretty happy with how it fits this project. (Fun fact: I wrote as a simple pretext to learn how to write a Python package, at a time I thought I'd need it eventually when implementing a PID, before I found this package. So the projects were somehow related in my mind from the start. Needless to say I never ended up writing that PID package.)
The general idea is:
import stime
0
is as good a choice as any other Unix epoch) withstime.reset(0)
.stime.tick()
or any arbitrary valuestime.tick(0.1)
, or reset it at will.That's it.
Outcome
Unless I'm misunderstanding something about how the PID works, the current test suite should now be deterministic.
The overall speed of the test suite has increased significantly. For reference, an entire test suite run on my machine went from 2m14, to 26s, to 200ms. That's a roughly 600x speed-up, of which I'd consider the last 100x factor meaningful.
I also think that tests with simulated time are easier to reason about, because the test code itself doesn't affect the timings. (That's another effect of the same property that enables writing deterministic tests.)
Checklist