Commit
Fix reproducibility issues in tests (#63)
* Fix reproducibility issues in tests
  Signed-off-by: Fabrice Normandin <[email protected]>
* Fix creation of `lightning_logs` dir in tests
  Signed-off-by: Fabrice Normandin <[email protected]>
* Re-add regression files to git index
  Signed-off-by: Fabrice Normandin <[email protected]>
* Try to fix issues with hf_example_test.py
  Signed-off-by: Fabrice Normandin <[email protected]>
* re-enable --slow tests in last CI step
* use rye run, not pdm run
* Don't skip if files are missing
  Signed-off-by: Fabrice Normandin <[email protected]>
* Run full regression tests on dev machine
  Signed-off-by: Fabrice Normandin <[email protected]>
* Remove GPU name from regression files
  Signed-off-by: Fabrice Normandin <[email protected]>
* Remove code to select tests based on duration
  Signed-off-by: Fabrice Normandin <[email protected]>
* Tweak incremental testing annotation
  Signed-off-by: Fabrice Normandin <[email protected]>
* Tweak the `rye sync` in local integration tests
  Signed-off-by: Fabrice Normandin <[email protected]>
* Try disabling rye cache in local_integration_tests
  Signed-off-by: Fabrice Normandin <[email protected]>
* Update tensor_regression and precision in files
  Signed-off-by: Fabrice Normandin <[email protected]>
* Revert "Try disabling rye cache in local_integration_tests"
  This reverts commit 8fd85a4.
* Revert "Tweak the `rye sync` in local integration tests"
  This reverts commit 54c1d55.
* Fix hash function used in regression tests
  Signed-off-by: Fabrice Normandin <[email protected]>
* Don't include the tensor hash in regression files
  Signed-off-by: Fabrice Normandin <[email protected]>
* Remove hashes from existing regression files
  Signed-off-by: Fabrice Normandin <[email protected]>
* Show installed packages in slurm integration tests
  Signed-off-by: Fabrice Normandin <[email protected]>
* Add an xfail on a specific regression test
  Signed-off-by: Fabrice Normandin <[email protected]>

---------

Signed-off-by: Fabrice Normandin <[email protected]>
Showing 65 changed files with 8,445 additions and 179 deletions.
94 changes: 94 additions & 0 deletions
...algorithms/example_test/test_backward_pass_is_reproducible/cpu/fcnet_cifar10_example.yaml
@@ -0,0 +1,94 @@
batch.0:
  device: cpu
  max: '2.126e+00'
  mean: '-6.179e-03'
  min: '-1.989e+00'
  shape:
  - 128
  - 3
  - 32
  - 32
  sum: '-2.43e+03'
batch.1:
  device: cpu
  max: 9
  mean: '4.555e+00'
  min: 0
  shape:
  - 128
  sum: 583
grads.network.0.1.bias:
  device: cpu
  max: '6.107e-03'
  mean: '1.775e-04'
  min: '-5.292e-03'
  shape:
  - 128
  sum: '2.272e-02'
grads.network.0.1.weight:
  device: cpu
  max: '1.307e-02'
  mean: '4.693e-05'
  min: '-1.141e-02'
  shape:
  - 128
  - 3072
  sum: '1.845e+01'
grads.network.1.0.bias:
  device: cpu
  max: '1.041e-02'
  mean: '6.975e-04'
  min: '-8.782e-03'
  shape:
  - 128
  sum: '8.928e-02'
grads.network.1.0.weight:
  device: cpu
  max: '1.584e-02'
  mean: '1.481e-04'
  min: '-1.507e-02'
  shape:
  - 128
  - 128
  sum: '2.426e+00'
grads.network.2.0.bias:
  device: cpu
  max: '3.282e-02'
  mean: '-1.956e-09'
  min: '-2.134e-02'
  shape:
  - 10
  sum: '-1.956e-08'
grads.network.2.0.weight:
  device: cpu
  max: '2.200e-02'
  mean: '-2.874e-10'
  min: '-5.831e-02'
  shape:
  - 10
  - 128
  sum: '-3.679e-07'
outputs.logits:
  device: cpu
  max: '7.036e-01'
  mean: '-8.651e-03'
  min: '-8.180e-01'
  shape:
  - 128
  - 10
  sum: '-1.107e+01'
outputs.loss:
  device: cpu
  max: '2.316e+00'
  mean: '2.316e+00'
  min: '2.316e+00'
  shape: []
  sum: '2.316e+00'
outputs.y:
  device: cpu
  max: 9
  mean: '4.555e+00'
  min: 0
  shape:
  - 128
  sum: 583
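Each entry in the regression file above stores a tensor's device, shape, min, max, mean, and sum rather than its full contents. The following is a minimal sketch of how such a summary could be produced. It works on plain nested Python lists so that it runs without PyTorch, and the names `flatten`, `shape_of`, `fmt`, and `summarize`, as well as the fixed four-significant-digit formatting, are assumptions for illustration only, not the tensor_regression package's API.

```python
def flatten(values):
    """Recursively flatten a nested list of numbers into a flat list."""
    if isinstance(values, (int, float)):
        return [values]
    flat = []
    for item in values:
        flat.extend(flatten(item))
    return flat


def shape_of(values):
    """Shape of a regular (non-ragged) nested list; [] for a scalar."""
    shape = []
    while isinstance(values, list):
        shape.append(len(values))
        if not values:
            break
        values = values[0]
    return shape


def fmt(x):
    """Keep ints as-is; write floats as strings with 4 significant digits (assumed format)."""
    return x if isinstance(x, int) else f"{x:.3e}"


def summarize(values, device="cpu"):
    """Build a summary dict with the same keys as the regression entries."""
    flat = flatten(values)
    total = sum(flat)
    return {
        "device": device,
        "max": fmt(max(flat)),
        "mean": fmt(total / len(flat)),
        "min": fmt(min(flat)),
        "shape": shape_of(values),
        "sum": fmt(total),
    }


# Integer labels keep integer min/max/sum, while the mean becomes a formatted float.
labels_summary = summarize([[0, 9], [4, 5]])
```

Under this assumed formatting, a label batch of 128 integers summing to 583 would get the mean `'4.555e+00'`, which agrees with the `batch.1` entry above.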
94 changes: 94 additions & 0 deletions
...thms/example_test/test_backward_pass_is_reproducible/cpu/fcnet_fashion_mnist_example.yaml
@@ -0,0 +1,94 @@
batch.0:
  device: cpu
  max: '2.821e+00'
  mean: '4.822e-01'
  min: '-4.242e-01'
  shape:
  - 128
  - 1
  - 28
  - 28
  sum: '4.839e+04'
batch.1:
  device: cpu
  max: 9
  mean: '4.555e+00'
  min: 0
  shape:
  - 128
  sum: 583
grads.network.0.1.bias:
  device: cpu
  max: '6.875e-03'
  mean: '2.096e-04'
  min: '-8.370e-03'
  shape:
  - 128
  sum: '2.683e-02'
grads.network.0.1.weight:
  device: cpu
  max: '1.948e-02'
  mean: '2.916e-04'
  min: '-2.213e-02'
  shape:
  - 128
  - 784
  sum: '2.926e+01'
grads.network.1.0.bias:
  device: cpu
  max: '1.109e-02'
  mean: '2.213e-04'
  min: '-1.267e-02'
  shape:
  - 128
  sum: '2.832e-02'
grads.network.1.0.weight:
  device: cpu
  max: '2.374e-02'
  mean: '9.326e-05'
  min: '-2.32e-02'
  shape:
  - 128
  - 128
  sum: '1.528e+00'
grads.network.2.0.bias:
  device: cpu
  max: '3.847e-02'
  mean: '-3.353e-09'
  min: '-4.706e-02'
  shape:
  - 10
  sum: '-3.353e-08'
grads.network.2.0.weight:
  device: cpu
  max: '5.741e-02'
  mean: '-4.195e-10'
  min: '-6.431e-02'
  shape:
  - 10
  - 128
  sum: '-5.369e-07'
outputs.logits:
  device: cpu
  max: '9.872e-01'
  mean: '-1.288e-02'
  min: '-7.225e-01'
  shape:
  - 128
  - 10
  sum: '-1.648e+01'
outputs.loss:
  device: cpu
  max: '2.311e+00'
  mean: '2.311e+00'
  min: '2.311e+00'
  shape: []
  sum: '2.311e+00'
outputs.y:
  device: cpu
  max: 9
  mean: '4.555e+00'
  min: 0
  shape:
  - 128
  sum: 583
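Regression files like the one above are only useful if a later run can be checked against them. The following is a minimal sketch of such a check, assuming a comparison with a relative tolerance plus a small absolute tolerance for near-zero values. The function `matches` and both tolerance defaults are hypothetical and are not taken from the tensor_regression package or from this commit.

```python
import math


def matches(expected, actual, rel_tol=1e-3, abs_tol=1e-6):
    """Compare two summary entries (dicts with device, shape, max, mean, min, sum).

    Values may be ints or strings such as '2.311e+00', so both sides go through float().
    The absolute tolerance keeps sums that are zero up to rounding noise
    (for example '-3.353e-08') from failing on tiny differences.
    """
    if expected["device"] != actual["device"]:
        return False
    if expected["shape"] != actual["shape"]:
        return False
    return all(
        math.isclose(float(expected[key]), float(actual[key]), rel_tol=rel_tol, abs_tol=abs_tol)
        for key in ("max", "mean", "min", "sum")
    )


# The recorded outputs.loss entry from the Fashion-MNIST file above.
recorded_loss = {
    "device": "cpu",
    "max": "2.311e+00",
    "mean": "2.311e+00",
    "min": "2.311e+00",
    "shape": [],
    "sum": "2.311e+00",
}

# A hypothetical new run whose loss differs only in digits beyond the recorded precision.
new_loss = dict(recorded_loss, max=2.3112, mean=2.3112, min=2.3112, sum=2.3112)
same = matches(recorded_loss, new_loss)
```

Storing only a few rounded statistics makes the files short and easy to diff, at the cost of missing changes that leave min, max, mean, and sum unchanged.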
94 changes: 94 additions & 0 deletions
...t/algorithms/example_test/test_backward_pass_is_reproducible/cpu/fcnet_mnist_example.yaml
@@ -0,0 +1,94 @@
batch.0:
  device: cpu
  max: '2.821e+00'
  mean: '1.432e-02'
  min: '-4.242e-01'
  shape:
  - 128
  - 1
  - 28
  - 28
  sum: '1.437e+03'
batch.1:
  device: cpu
  max: 9
  mean: '4.242e+00'
  min: 0
  shape:
  - 128
  sum: 543
grads.network.0.1.bias:
  device: cpu
  max: '1.075e-02'
  mean: '2.421e-04'
  min: '-7.844e-03'
  shape:
  - 128
  sum: '3.099e-02'
grads.network.0.1.weight:
  device: cpu
  max: '2.006e-02'
  mean: '5.258e-05'
  min: '-1.844e-02'
  shape:
  - 128
  - 784
  sum: '5.277e+00'
grads.network.1.0.bias:
  device: cpu
  max: '1.169e-02'
  mean: '4.285e-04'
  min: '-1.152e-02'
  shape:
  - 128
  sum: '5.485e-02'
grads.network.1.0.weight:
  device: cpu
  max: '1.753e-02'
  mean: '1.016e-04'
  min: '-2.219e-02'
  shape:
  - 128
  - 128
  sum: '1.665e+00'
grads.network.2.0.bias:
  device: cpu
  max: '3.969e-02'
  mean: '-1.304e-09'
  min: '-7.979e-02'
  shape:
  - 10
  sum: '-1.304e-08'
grads.network.2.0.weight:
  device: cpu
  max: '3.221e-02'
  mean: '-1.306e-10'
  min: '-6.755e-02'
  shape:
  - 10
  - 128
  sum: '-1.672e-07'
outputs.logits:
  device: cpu
  max: '7.029e-01'
  mean: '-3.564e-02'
  min: '-7.781e-01'
  shape:
  - 128
  - 10
  sum: '-4.562e+01'
outputs.loss:
  device: cpu
  max: '2.304e+00'
  mean: '2.304e+00'
  min: '2.304e+00'
  shape: []
  sum: '2.304e+00'
outputs.y:
  device: cpu
  max: 9
  mean: '4.242e+00'
  min: 0
  shape:
  - 128
  sum: 543
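Two patterns in the recorded values can be cross-checked by hand, assuming the loss is a mean softmax cross-entropy over 10 classes. The source does not state this, although the `[128, 10]` logits shape is consistent with it. Under that assumption, a classifier predicting uniformly has loss ln 10 ≈ 2.303, close to the recorded 2.304 to 2.316, and the gradient with respect to the logits is `softmax - one_hot`, which sums to zero over the classes, so the last layer's bias gradient should sum to roughly zero, as the `grads.network.2.0.bias` entries do. The sketch below illustrates both points in plain Python; the helper names are hypothetical.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    largest = max(logits)
    exps = [math.exp(z - largest) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy_grad(logits, label):
    """Gradient of -log(softmax(logits)[label]) w.r.t. the logits: softmax - one_hot."""
    probs = softmax(logits)
    return [p - (1.0 if i == label else 0.0) for i, p in enumerate(probs)]


# Loss of a uniform prediction over 10 classes equals ln(10).
uniform_loss = -math.log(softmax([0.0] * 10)[3])

# Losses as recorded in the three regression files of this commit.
recorded_losses = {"cifar10": 2.316, "fashion_mnist": 2.311, "mnist": 2.304}

# The per-sample logit gradient sums to zero up to float rounding,
# so the batch-averaged bias gradient of the final layer does too.
rng = random.Random(0)
logits = [rng.uniform(-1.0, 1.0) for _ in range(10)]
grad_sum = sum(cross_entropy_grad(logits, label=3))
```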