Update chicoma-cpu modules #112
This is just a draft so far. I'm having no luck with either gnu or intel on Chicoma-CPU, and I haven't tried anything else yet.
I've contacted LANL IC about the trouble I'm having with gnu:
While it seems clear that there's an RPATH being set to
On the intel side, it's not finding NetCDF-C or -Fortran, even though we're passing a
The gnu issue seems similar to E3SM-Project#6677
With the commits I just pushed, I was able to successfully build and run:
So I think at this point we can say we support gnu on Chicoma. I'll poke around at intel as well.
<CCSM_CPRNC>/usr/projects/climate/SHARED_CLIMATE/software/badger/cprnc</CCSM_CPRNC>
<CCSM_CPRNC>/usr/projects/e3sm/software/chicoma-cpu/cprnc</CCSM_CPRNC>
Oh, I bet I know what happened here. I deleted this thinking that it was old and no longer used. In my defense, it has `badger` in the path...
And it no longer exists! Not just the machine but that file. But I think if that line is missing, it forces each test to try to build cprnc itself.
> And it no longer exists!

That's what I was saying. I think I deleted it while trying to free up space in `/usr/projects/climate` because I couldn't imagine we were still using software built for Badger.
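A quick way to avoid pointing `CCSM_CPRNC` at a path that has since been cleaned up is to check the candidates on the machine before editing the config. A minimal sketch: `check_cprnc` is a hypothetical helper, and the two paths are simply the ones quoted in the diff lines above.

```shell
#!/bin/sh
# Hypothetical helper (not part of this PR): report whether a cprnc path exists.
check_cprnc() {
    if [ -e "$1" ]; then
        echo "ok: $1"
    else
        echo "missing: $1"
    fi
}

# The old and new values of <CCSM_CPRNC> quoted in this review thread.
check_cprnc /usr/projects/climate/SHARED_CLIMATE/software/badger/cprnc
check_cprnc /usr/projects/e3sm/software/chicoma-cpu/cprnc
```

Running this on a Chicoma login node before committing a path change would show right away whether the configured location is still present.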
<command name="unload">PrgEnv-gnu</command>
<command name="unload">PrgEnv-intel</command>
<command name="unload">PrgEnv-nvidia</command>
<command name="unload">PrgEnv-cray</command>
<command name="unload">PrgEnv-aocc</command>
I found that these needed to be unloaded after their corresponding compiler modules or there would be an error about an undefined environment variable name.
Force-pushed from df21910 to 9c0b308.
@jonbob, I'm trying to run a test:
This looks to be what you ran successfully. But for me it just seems to be hanging. It hasn't got to ocean time stepping yet and there's very little output in the [...]. Could you have a quick look and let me know if you see anything obvious?
@xylar -- let me take a peek
@xylar - it seems to be struggling with the atm data? That doesn't make much sense.
In the meantime, I'm trying an optimized run to see how that goes.
I tested
I realize it's not a high priority for us but
<command name="load">PrgEnv-nvidia/8.5.0</command>
<command name="load">nvidia/24.7</command>
I successfully tested these updated modules as well.
This should be set to:
```
export GNU_CRAY_LDFLAGS="-Wl,--enable-new-dtags"
```
on Chicoma-CPU with gcc.
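For context, `--enable-new-dtags` tells the GNU linker to record run-time library search paths as `DT_RUNPATH` entries rather than `DT_RPATH`. A minimal sketch for checking which entry a built executable carries, using `readelf -d` from binutils. The helper name `dtag_kind` and the convention of passing the executable path as the first argument are illustrative assumptions, not part of this PR.

```shell
#!/bin/sh
# Hypothetical helper: classify `readelf -d` output read from stdin.
# Prints RUNPATH, RPATH, or none (RUNPATH wins if both are present).
dtag_kind() {
    awk '
        /\(RUNPATH\)/ { found = "RUNPATH" }
        /\(RPATH\)/   { if (found == "") found = "RPATH" }
        END           { if (found == "") found = "none"; print found }
    '
}

# Assumed usage: the first argument is the path to an executable you built.
exe="${1:-}"
if [ -n "$exe" ] && command -v readelf >/dev/null 2>&1; then
    readelf -d "$exe" | dtag_kind
fi
```

If the flag took effect, a build linked with an embedded search path would be expected to report `RUNPATH` here.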
@@ -396,11 +396,11 @@ gnu-cray:
 "FFLAGS_OPT = -O3 -m64 -ffree-line-length-none -fconvert=big-endian -ffree-form -ffpe-summary=none $${EXTRA_FFLAGS}" \
 "CFLAGS_OPT = -O3 -m64" \
 "CXXFLAGS_OPT = -O3 -m64" \
-"LDFLAGS_OPT = -O3 -m64" \
+"LDFLAGS_OPT = -O3 -m64 $(GNU_CRAY_LDFLAGS)" \
@matthewhoffman, this environment variable (or argument to `make`) needs to be set to:
export GNU_CRAY_LDFLAGS="-Wl,--enable-new-dtags"
on Chicoma for now. I'll make sure Compass and Polaris do this. If someone is building for Chicoma outside of Compass or Polaris (good luck!), they would need to set this manually.
Are you okay with this fix? I don't want to put anything into the Makefile that tries to detect the machine or anything crazy like that.
@xylar, this seems like the best solution given the circumstances.
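For anyone building by hand, the two ways of supplying the flag discussed in this thread (environment variable or `make` argument) might look like the sketch below. The `gnu-cray` target name is taken from the Makefile hunk; the `make` invocations are left commented out so the snippet runs anywhere, and the final `echo` only mimics how `$(GNU_CRAY_LDFLAGS)` is appended, it is not the Makefile's own logic.

```shell
#!/bin/sh
# Option 1: set the variable in the environment, then build.
export GNU_CRAY_LDFLAGS="-Wl,--enable-new-dtags"
# make gnu-cray

# Option 2: pass it as a command-line variable to make instead.
# make gnu-cray GNU_CRAY_LDFLAGS="-Wl,--enable-new-dtags"

# Illustration only: the resulting value of LDFLAGS_OPT after expansion.
LDFLAGS_OPT="-O3 -m64 ${GNU_CRAY_LDFLAGS:-}"
echo "LDFLAGS_OPT = ${LDFLAGS_OPT}"
```

Leaving the variable unset keeps the previous link flags, which is why the change does not need machine detection in the Makefile.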
Force-pushed from 870b287 to d379003.
@jonbob, at the risk of delaying this further, I think we probably want to follow what Noel is doing on Perlmutter:
Force-pushed from 871a4b7 to b33833c.
This is to match proposed updates to Perlmutter-CPU E3SM-Project#6702
Force-pushed from b33833c to c34336e.
<command name="unload">cray-parallel-netcdf</command>
<command name="unload">cray-netcdf</command>
<command name="unload">cray-hdf5</command>
I tried removing `cpe` like Noel did here:
https://github.com/E3SM-Project/E3SM/blob/bdcc2f551cfae2fca53bd8aa4ec604601ddf1c68/cime_config/machines/config_machines.xml#L193
But I got a nasty error like:
Lmod has detected the following error: These module(s) or extension(s) exist
but cannot be loaded as requested: "git", "cmake/3.27.7"
Try: "module spider git cmake/3.27.7" to see how to load the module(s).
I have
The following both passed:
Closed in favor of E3SM-Project#6705
Following the recent DST, this merge updates the module files and environment variables on Chicoma-CPU. We note that these updates work well for the `gnu` and `nvidia` compilers but not yet for `intel`, which we are continuing to work on. A separate update will be needed to address Chicoma-GPU as well.