Scale Microgrid Test Error #16

Paulm991 · 2024-07-10T18:24:13Z

Running the ScaleMicrogrid test on the 'develop' branch fails. The output is as follows:

"ScaleMicrogrid" start time: Jul 10 14:10 EDT
Output:
Test the Relative Error
Test with Nsize = 2 passes!
Test the Relative Error
Test with Nsize = 4 passes!
Test the Relative Error
Test with Nsize = 8 fails!
Some tests fail!!

Test time = 3.82 sec
Test Failed.
"ScaleMicrogrid" end time: Jul 10 14:10 EDT
"ScaleMicrogrid" time elapsed: 00:00:03
This was performed with the coin-or/Ipopt#12 release of GridKit, Sundials 6.7.0, Ipopt 3.14.16, and SuiteSparse 5.10.1.

reid-g · 2024-09-16T20:22:16Z

Some updates on this. I have verified that all model parameters and RHS outputs are correct, so these are not the cause.

The "true" solution vectors I generate come from a MATLAB-based ODE form of the model, since no analytical solution to the model is available. No Jacobian is used. This is done with extreme tolerances: relative tolerance 1e-14 and absolute tolerance 1e-14. I have computed both ode15s and ode23tb solutions, and I have also used the new MATLAB-integrated IDA interface. They all match, with only minor round-off differences between them.

The same error also appears in hardwired setups (no GridKit), in both the ODE and DAE forms.

Still actively looking into what the issue is.
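
The cross-checking idea described above (trust a reference solution only when independent integrators agree) can be illustrated with a small sketch. This is not the MATLAB setup used here: the toy ODE, the rate `LAMBDA`, the step count, and the two fixed-step methods (Heun and classical RK4) are all assumptions chosen to keep the example self-contained.

```python
import math

LAMBDA = 50.0  # hypothetical rate constant, not a GridKit model parameter


def rhs(t, y):
    """Toy ODE y' = -LAMBDA * (y - cos t); a stand-in for the real model."""
    return -LAMBDA * (y - math.cos(t))


def heun_step(f, t, y, h):
    """One step of Heun's method (second order)."""
    k1 = f(t, y)
    k2 = f(t + h, y + h * k1)
    return y + 0.5 * h * (k1 + k2)


def rk4_step(f, t, y, h):
    """One step of the classical fourth-order Runge-Kutta method."""
    k1 = f(t, y)
    k2 = f(t + 0.5 * h, y + 0.5 * h * k1)
    k3 = f(t + 0.5 * h, y + 0.5 * h * k2)
    k4 = f(t + h, y + h * k3)
    return y + h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0


def integrate(step, f, y0, t_end, n_steps):
    """Advance y from t = 0 to t_end with a fixed step size."""
    h = t_end / n_steps
    y = y0
    for i in range(n_steps):
        y = step(f, i * h, y, h)
    return y


# Two independent integrators on the same problem.
y_heun = integrate(heun_step, rhs, 0.0, 1.0, 1000)
y_rk4 = integrate(rk4_step, rhs, 0.0, 1.0, 1000)

# Relative disagreement between the two candidate reference values.
disagreement = abs(y_heun - y_rk4) / abs(y_rk4)
print(f"relative disagreement: {disagreement:.3e}")
```

A small disagreement only shows that the two methods are consistent with each other; it does not rule out an error shared by both, such as a wrong parameter.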

pelesh · 2024-12-12T14:54:11Z

Unfortunately, #30 did not fix this issue. See this log.

nkoukpaizan · 2024-12-12T15:37:42Z

The test failure seems to be non-deterministic. Different versions of SUNDIALS and/or different machines produce different error norms.

For example, from my runs on Frontier:

Test the Relative Error for N = 2
2-Norm Relative Error: 1.85946e-06
Test with Nsize = 2 passes!
Test the Relative Error for N = 4
2-Norm Relative Error: 7.05324e-06
Test with Nsize = 4 passes!
Test the Relative Error for N = 8
2-Norm Relative Error: 6.77802e-06
Test with Nsize = 8 passes!
All tests pass!!

But from GitHub Actions:

12: Test the Relative Error for N = 2
12: 2-Norm Relative Error: 2.80821e-06
12: Test with Nsize = 2 passes!
12: Test the Relative Error for N = 4
12: 2-Norm Relative Error: 2.25243e-06
12: Test with Nsize = 4 passes!
12: Test the Relative Error for N = 8
12: 2-Norm Relative Error: 0.000147299
12: Test with Nsize = 8 fails!

Both of these are with SUNDIALS v7.1.1.
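
For readers unfamiliar with the metric in these logs, a minimal sketch of a 2-norm relative error check follows. The function names and the pass threshold `ERROR_TOL` are hypothetical; the threshold GridKit actually uses is not stated in this thread, and 1e-5 was picked only because it is consistent with the pass/fail results reported above.

```python
import math


def relative_error_2norm(computed, reference):
    """Return ||computed - reference||_2 / ||reference||_2."""
    if len(computed) != len(reference):
        raise ValueError("vectors must have the same length")
    diff_norm = math.sqrt(sum((c - r) ** 2 for c, r in zip(computed, reference)))
    ref_norm = math.sqrt(sum(r ** 2 for r in reference))
    if ref_norm == 0.0:
        raise ValueError("reference vector has zero norm")
    return diff_norm / ref_norm


# Hypothetical threshold (assumption); not taken from the GridKit source.
ERROR_TOL = 1e-5


def passes(rel_err, tol=ERROR_TOL):
    """A test passes when the relative error does not exceed the threshold."""
    return rel_err <= tol


# N = 8 error norms reported in the two logs above.
frontier_n8 = 6.77802e-06
actions_n8 = 0.000147299

print("Frontier N = 8 passes:", passes(frontier_n8))
print("GitHub Actions N = 8 passes:", passes(actions_n8))
```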

reid-g · 2024-12-12T20:40:24Z

This appears to be due to compiler optimizations and how rounding is handled. I can replicate the GitHub Actions results exactly (every digit) on my machine simply by changing the optimization flag from -O0 to -O1. I did not catch this earlier because I was working at -O0. My results did not change when going from SUNDIALS v6.6.0 to v7.2.0 at optimization level -O0.

The DG component uses trigonometric functions, and I suspect compiler optimizations handle them differently. In addition, the time scales are quite small at the start of the simulation. The errors are worse for N=8 than for N=4 and N=2.
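
As a loose analogy for why the precision of intermediate values matters, the sketch below sums the same numbers twice: once rounding to double precision after every addition, and once with `math.fsum`, which tracks the low-order bits and rounds only at the end. This is an illustration only, not the GridKit code path, and which specific transformation the compiler applies at -O1 has not been established in this thread.

```python
import math

values = [0.1] * 10

# Round to double precision after every single addition.
naive = 0.0
for v in values:
    naive += v

# Keep track of the lost low-order bits and round once at the end.
careful = math.fsum(values)

print(repr(naive))    # 0.9999999999999999
print(repr(careful))  # 1.0
print(naive == careful)
```

The inputs and the arithmetic are identical in both cases; only the handling of intermediate rounding differs, and that alone changes the last digit. In a stiff simulation with tight tolerances, such differences can be amplified over many steps.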

I have found two routes (at least for my machine) to handle the problem.

1. Reduce the SCALE_MICROGRID_REL_TOL and SCALE_MICROGRID_ABS_TOL to 1e-6. This gives consistent results between O0 and O1. I am in favor of this option as we are already "bottoming out".
2. Add the compiler flag -mfpmath=387 to increase temporary float precision. This prevents the large increase in error in N=8 at the current tolerances. However, this would only be useful for this problem and may slow performance.

pelesh · 2024-12-13T17:35:50Z

I suggest you make a PR with your solution to the nicholson/buildsystem branch. That way we can test it right away. Please document your reasons for selecting specific tolerances.

I suggest building the code as RelWithDebInfo in CMake. With GCC, this sets the flags to -g -O2. I would also test with -O3.

Paulm991 changed the title ~~Trying to run the ScaleMicrogrid test is throwing an error on the 'develop' branch. The error output is as follows:~~ Scale Microgrid Test Error Jul 10, 2024

Paulm991 assigned Paulm991 and reid-g and unassigned Paulm991 Jul 10, 2024

superwhiskers assigned reid-g and unassigned reid-g Jul 10, 2024

Paulm991 linked a pull request Jul 11, 2024 that will close this issue

GENROU With Jacobian #18

Closed

Paulm991 mentioned this issue Aug 6, 2024

Fix #17 #19

Merged

Paulm991 added the bug label Aug 8, 2024

reid-g mentioned this issue Dec 2, 2024

BugFix: Scale Microgrid Error #30

Merged

reid-g linked a pull request Dec 17, 2024 that will close this issue

Fixed IDA Interface Final Copy #37

Merged

pelesh closed this as completed in #37 Dec 18, 2024
