[Developer Issue]: Investigate the requirements for FIMS reproducibility #649

Open
2 tasks
Bai-Li-NOAA opened this issue Jul 10, 2024 · 3 comments
Labels
status: triage_needed This is not approved for this milestone, do not work on it yet
Comments

@Bai-Li-NOAA
Contributor

Description

A mismatch between FIMS results from local runs and GHA was identified, even when using the same FIMS model, data, and seed (see notes here). The mismatch was due to different versions of tools used locally and on GHA. To ensure FIMS results are reproducible, similar to Stan, we need to ensure the following components are identical: FIMS version, R interface, included library versions, operating system version, computer hardware, and C++ compiler.

Tasks:

  • @msupernaw suggests narrowing down the cause of the mismatch to one particular component.
  • Document the requirements for FIMS reproducibility in a vignette.
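
As a starting point for narrowing down the cause, the components listed in the description could be recorded next to each set of results so that a local run and a GHA run can be compared item by item. The following is a minimal sketch only: `capture_run_environment()` is a hypothetical helper that does not exist in FIMS, and the default package list is an assumption that can be extended with any other libraries of interest.

```r
# Hypothetical helper (not part of FIMS): collect the components named in the
# description so they can be saved with model output and compared across runs.
capture_run_environment <- function(pkgs = c("FIMS")) {
  # Installed version of each package, or NA if the package is not installed
  pkg_versions <- vapply(
    pkgs,
    function(p) {
      if (requireNamespace(p, quietly = TRUE)) {
        as.character(utils::packageVersion(p))
      } else {
        NA_character_
      }
    },
    character(1)
  )

  # C++ compiler that R is configured to use; NA if it cannot be queried
  cxx <- tryCatch(
    paste(
      system2(
        file.path(R.home("bin"), "R"),
        c("CMD", "config", "CXX"),
        stdout = TRUE, stderr = FALSE
      ),
      collapse = " "
    ),
    error = function(e) NA_character_,
    warning = function(w) NA_character_
  )

  info <- Sys.info()
  list(
    r_version = R.version.string,
    os = unname(info[["sysname"]]),
    os_release = unname(info[["release"]]),
    hardware = unname(info[["machine"]]),
    cxx_compiler = cxx,
    packages = pkg_versions
  )
}

env <- capture_run_environment()
str(env)
```

Saving this list from both environments (for example with `saveRDS()`) and comparing the two element by element would show which components actually differ before any of them is varied in isolation.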
@Bai-Li-NOAA Bai-Li-NOAA added the status: triage_needed This is not approved for this milestone, do not work on it yet label Jul 10, 2024
@iantaylor-NOAA
Contributor

@Bai-Li-NOAA, thanks for catching this and posting the issue.

In addition to documenting the requirements for reproducibility, I think it would be helpful to document the extent of the differences that can be expected when the components aren't identical.

Taking the ratio of the expected and actual values from this line of the GHA log https://github.com/NOAA-FIMS/FIMS/actions/runs/9762956877/job/26947645590#step:7:274 gives a range of 0.9999947 to 1.0000058. I think that's plenty of precision for any fisheries stock assessment model. SS3 results have always differed to a similar extent among operating systems and it's never been an issue for the production assessments.

Having said that, I understand the problem this poses for our testing framework and for reproducibility in general.

Perhaps the User Guide could include language like "Differences in R interface, included library versions, operating system version, computer hardware, and C++ compiler may lead to differences in results on the order of 1e-5." Perhaps referencing the Stan page makes sense as well. I don't think we should speak to the differences in results between FIMS versions because those might be more extensive depending on what we're changing.

# code to calculate ratio
expected <- c(974415.508459565, 855922.366697973, 665636.943841043, 559681.933274521, 417469.882961596, 364389.958969985, 313543.539361582, 194952.972377838, 166776.416177042)
actual <- c(974417.319788301, 855922.751294695, 665639.068928898, 559682.547638791, 417471.354292516, 364390.381762609, 313544.580739862, 194951.845447807, 166777.297021839)
range(expected/actual)
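
For the testing framework, one option would be to compare against a relative tolerance instead of requiring an exact match. The sketch below reuses the values above; the tolerance of 1e-5 is an assumption taken from the order of magnitude suggested for the User Guide, not an agreed FIMS threshold.

```r
# Values copied from the ratio calculation above
expected <- c(974415.508459565, 855922.366697973, 665636.943841043,
              559681.933274521, 417469.882961596, 364389.958969985,
              313543.539361582, 194952.972377838, 166776.416177042)
actual <- c(974417.319788301, 855922.751294695, 665639.068928898,
            559682.547638791, 417471.354292516, 364390.381762609,
            313544.580739862, 194951.845447807, 166777.297021839)

# Assumed tolerance, based on the "order of 1e-5" wording suggested above
tol <- 1e-5

# Largest element-wise relative deviation of actual from expected
max_rel_diff <- max(abs(actual - expected) / abs(expected))
within_tol <- max_rel_diff < tol

# all.equal() compares the mean relative difference with its tolerance
strict <- isTRUE(all.equal(actual, expected))                  # default tolerance
loose <- isTRUE(all.equal(actual, expected, tolerance = tol))  # relaxed tolerance

print(c(within_tol = within_tol, strict = strict, loose = loose))
```

With these values the default comparison fails while the relaxed one passes, which matches the mismatch seen on GHA. Note that `all.equal()` tests the mean relative difference, so the element-wise maximum is the stricter of the two checks.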

@k-doering-NOAA
Member

Just adding that this issue popped up on occasion when running regression tests for SS3, usually due to OS differences (but it does seem logical that all the other components mentioned could have an effect!). For unstable models, the differences could sometimes be large, because the optimization would end up at a very different solution.

@kellijohnson-NOAA
Contributor

Are the differences present before or after optimization? It might be worth trying to compare before. If the differences are small, they might come from the transfer of values from R to the C++ interface, where decimal places could be cropped.
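
To get a feel for how much cropping would be needed to produce differences of this size, the relative error introduced by keeping only a given number of significant digits can be computed directly. This is an illustrative sketch, not a test of the actual R to C++ transfer: `signif()` stands in for whatever cropping might occur, and the input values are the expected values quoted earlier in this thread.

```r
# Expected values quoted earlier in this thread
expected <- c(974415.508459565, 855922.366697973, 665636.943841043,
              559681.933274521, 417469.882961596, 364389.958969985,
              313543.539361582, 194952.972377838, 166776.416177042)

# Numbers of significant digits to try (arbitrary choice for illustration)
digits <- 5:8

# Largest relative error caused by rounding to d significant digits
max_rel_error <- vapply(
  digits,
  function(d) max(abs(signif(expected, d) - expected) / abs(expected)),
  numeric(1)
)

# Rounding to d significant digits changes a value by at most 5 * 10^(-d)
# in relative terms
bound <- 5 * 10^(-digits)

print(data.frame(digits = digits, max_rel_error = max_rel_error, bound = bound))
```

Comparing these errors with the observed ratios gives a rough idea of how many digits would have to be lost in the inputs, keeping in mind that the model itself could amplify or dampen an input difference.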

@kellijohnson-NOAA kellijohnson-NOAA added this to the Q2 milestone Jul 16, 2024
4 participants