Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Test new latest intel compiler on GitHub Action #990

Merged
merged 35 commits into from
Dec 1, 2023
Merged

Conversation

carlocamilloni
Copy link
Member

This enables back the intel compilers testing and switches it to the new clang based version, cf. #976

@carlocamilloni
Copy link
Member Author

@valsson @invemichele multiple VES and OPES regtest fail using the new intel compilers, could you please check if there is anything relevant?

@carlocamilloni
Copy link
Member Author

@GiovanniBussi there are also tests that fail with a segfault like rt-make-threads and other that fails of +-nan, didn't we already solved the +- issue?

@invemichele
Copy link
Contributor

invemichele commented Nov 28, 2023

I just checked, for OPES it fails every time I rely on NaNs, it seems they are interpreted as zeros. In particular std::isnan seems to always return false.
Maybe it's related to this? https://stackoverflow.com/questions/32195949/why-does-nan-nan-0-0-with-the-intel-c-compiler https://stackoverflow.com/questions/47703436/isnan-does-not-work-correctly-with-ofast-flags

@codecov-commenter
Copy link

codecov-commenter commented Nov 28, 2023

Codecov Report

Attention: 7 lines in your changes are missing coverage. Please review.

Comparison is base (a0973e3) 84.08% compared to head (647440c) 84.07%.
Report is 1 commits behind head on master.

❗ Current head 647440c differs from pull request most recent head 2e7b680. Consider uploading reports for the commit 2e7b680 to get more accurate results

Files Patch % Lines
src/tools/NeighborList.cpp 89.23% 7 Missing ⚠️

❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #990      +/-   ##
==========================================
- Coverage   84.08%   84.07%   -0.02%     
==========================================
  Files         602      602              
  Lines       56197    56228      +31     
==========================================
+ Hits        47255    47274      +19     
- Misses       8942     8954      +12     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@GiovanniBussi
Copy link
Member

Interesting. I am afraid @invemichele identified the point with NaN, which might affect other tests. Regarding the rt-make-threads I will try to check locally (it seems we have icpx 2021 installed, hopefully it can reproduce the bug).

@carlocamilloni was the problem present also with (old) icpc? Or maybe the new suites does not include icpc, and we are forced to switch to icpx?

@carlocamilloni
Copy link
Member Author

no we are not force to icpx yet but I think it is worth moving to it because icpc is deprecated. We can possibly overcome to nan issue by not using the default fast-math behaviour as mentioned in the post above, but at the same time I am not sure this is good because then people using the compiler will still get the error

@Miniland1333
Copy link

Miniland1333 commented Nov 28, 2023

Based on this stackoverflow article, perhaps check if the issue also goes away if you use -O2 instead of -O3. If this is the case, the stackoverflow article recommends perhaps using -fp-model=precise with -O3.

@invemichele
Copy link
Contributor

On my side, using NaNs is a convenient way to do the parsing, but I could easily add a workaround for the case isnan(nan)==false so that the code is more robust

@carlocamilloni
Copy link
Member Author

I think that generally speaking is not worth trading performances to parsing so it would be better to make the parsing more robust, so if a solution exist for the parsing I would say is better, @GiovanniBussi @maxbonomi @gtribello

@GiovanniBussi
Copy link
Member

I agree with latest @carlocamilloni 's comment.

Regarding the problem on rt-make-threads, it's misleading now. Due to an incorrect coding (that I am going to fix), the code crashes. The underlying error is just numeric. I guess it will be sufficient to change a little bit the trajectory in the test to make it work with optimizing icpx

GiovanniBussi added a commit that referenced this pull request Nov 29, 2023
In the previous version, an error leading to an exception thrown in a thread
was reported as a crash.

With this fix, errors (per thread) are stored and reported in the output.
This makes the possible test failure easier to interpret.
In particular, the failure discussed in #990 in rt-make-threads
should be now visible as a numerical error.
@valsson
Copy link
Contributor

valsson commented Nov 29, 2023

Hello,

From what I can see, there are two failures in the VES regtests:

  • ves/rt-bf-combined
  • ves/rt-bf-legendre
    and both seem to be due to numerics. Reducing the precision of the regtests should fix that.

Changing FORMAT_VALUES_DERIVS=%13.6f to FORMAT_VALUES_DERIVS=%13.5f would reduce the precision of the relevant files.

Will you do this change? Or should I do that via a PR on the master branch?

@carlocamilloni
Copy link
Member Author

@valsson thanks, I have just pushed the change into this pull request

@valsson
Copy link
Contributor

valsson commented Nov 29, 2023

It seems that this commit has some unintended consequences on other files, so now it failing due to other issues. It is related to the bf*.targetdist-averages.data.reference. @carlocamilloni did you do that change on a Apple computer? I think it is related to issues with Apple clang compiler that I still needed to fix (see #950). I think that reverting all the bf*.targetdist-averages.data.reference files to the ones previous to this commit should fix the issue on GH action.

If you revert this commit, I can do a PR on the master branch where I only commit the relevant files related to FORMAT_VALUES_DERIVS=%13.5f

@carlocamilloni
Copy link
Member Author

@valsson indeed, reverted, thanks

@invemichele
Copy link
Contributor

I make the relevant changes here #992
Notice that there are other parts of the code that rely on std::isnan, e.g.

if(std::isnan(getMass(i))) {
probably it's worth checking for possible issues not caught by the regtests

@carlocamilloni
Copy link
Member Author

carlocamilloni commented Nov 29, 2023

tests still failing: @invemichele @GiovanniBussi @valsson (this is to keep track of progresses)

  • ERROR in test basic/rt-ermsd/
  • ERROR in test basic/rt-lepton-asmjit/
  • ERROR in test basic/rt-lepton/
  • ERROR in test basic/rt-make-fortran/
  • ERROR in test basic/rt-make-threads/
  • ERROR in test basic/rt16/
  • ERROR in test crystallization/rt-averaged-q6-spAspB/
  • ERROR in test drr/rt-divergence/
  • ERROR in test drr/rt-restart/
  • ERROR in test opes/rt-ecv_linear-bis/
  • ERROR in test opes/rt-ecv_linear-tris/
  • ERROR in test opes/rt-ecv_multithermal_multibaric-bis/
  • ERROR in test opes/rt-ecv_multithermal_multibaric-custom-restart/
  • ERROR in test opes/rt-ecv_multithermal_multibaric-custom/
  • ERROR in test opes/rt-ecv_multithermal_multibaric-tris/
  • ERROR in test opes/rt-ecv_multithermal_multibaric/
  • ERROR in test opes/rt-opes_expanded-bis/
  • ERROR in test ves/rt-bf-combined/
  • ERROR in test ves/rt-bf-legendre/

btw how are they so noisy?
I am afraid that they are almost useless  now
fortran intel compiler is not yet well configured
@invemichele
Copy link
Contributor

for my code I switched to a much simpler solution that does not use NaNs and works the same in practice, see #993

@carlocamilloni
Copy link
Member Author

@valsson could you please decrease it by one more digit?

@carlocamilloni carlocamilloni merged commit e158c05 into master Dec 1, 2023
32 checks passed
@carlocamilloni carlocamilloni deleted the test_intel branch December 1, 2023 09:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants