add aurora machine to e3sm #6117
<SAVE_TIMING_DIR>/lus/gecko/CSC249ADSE15_CNDA/performance_archive</SAVE_TIMING_DIR>
<SAVE_TIMING_DIR_PROJECTS>.*</SAVE_TIMING_DIR_PROJECTS>
<CIME_OUTPUT_ROOT>/lus/gecko/projects/CSC249ADSE15_CNDA/$USER/scratch</CIME_OUTPUT_ROOT>
<DIN_LOC_ROOT>/lus/gecko/projects/CSC249ADSE15_CNDA/inputdata</DIN_LOC_ROOT>
What project is "CSC249ADSE15_CNDA"?
This is the ECP Aurora early-access project; it is valid through Apr. 2024.
@xyuan, does this setup build and run on Aurora, for both the GPU and CPU setups?
Also update to CMake-friendly vars and rm diag queue.
Add ALCF Aurora to E3SM machines. [BFB]
Re-merge to next to update PATH on Aurora
Pushed a lot of updates to the branch to get
<env name="ONEAPI_DEVICE_SELECTOR">level_zero:gpu</env>
<env name="ONEAPI_MPICH_GPU">NO_GPU</env>
<env name="MPIR_CVAR_ENABLE_GPU">0</env>
<env name="romio_cb_read">disable</env>
Are these flags (romio_cb_*) still causing issues on Aurora?
<env name="FI_CXI_DEFAULT_CQ_SIZE">131072</env>
<env name="FI_CXI_CQ_FILL_PERCENT">20</env>
It would be good to add context for these flags and explain why we had to steer away from the defaults for these Slingshot network variables.
<env name="romio_cb_write">disable</env>
<env name="SYCL_CACHE_PERSISTENT">1</env>
<env name="GATOR_INITIAL_MB">4000MB</env>
<env name="GATOR_DISABLE">0</env>
Isn't GATOR_DISABLE 0 by default? Perhaps this was set as 1 while debugging?
</environment_variables>
<environment_variables compiler="oneapi-ifxgpu">
<env name="ONEAPI_DEVICE_SELECTOR">level_zero:gpu</env>
<env name="ONEAPI_MPICH_GPU">NO_GPU</env>
Does this disable GPU-to-GPU MPI?
<env name="NETCDF_FORTRAN_PATH">/lus/gecko/projects/CSC249ADSE15_CNDA/software/netcdf-fortran/4.6.1/oneapi.eng.2023.05.15.007</env>
<env name="PNETCDF_PATH">/lus/gecko/projects/CSC249ADSE15_CNDA/software/pnetcdf/1.12.3/oneapi.eng.2023.05.15.007</env>
<env name="LD_LIBRARY_PATH">/lus/gecko/projects/CSC249ADSE15_CNDA/software/pnetcdf/1.12.3/oneapi.eng.2023.05.15.007/lib:/lus/gecko/projects/CSC249ADSE15_CNDA/software/netcdf-fortran/4.6.1/oneapi.eng.2023.05.15.007/lib:/lus/gecko/projects/CSC249ADSE15_CNDA/software/netcdf-c/4.9.2/oneapi.eng.2023.05.15.007/lib:$ENV{LD_LIBRARY_PATH}</env>
<env name="PATH">/lus/gecko/projects/CSC249ADSE15_CNDA/software/pnetcdf/1.12.3/oneapi.eng.2023.05.15.007/bin:/lus/gecko/projects/CSC249ADSE15_CNDA/software/netcdf-fortran/4.6.1/oneapi.eng.2023.05.15.007/bin:/lus/gecko/projects/CSC249ADSE15_CNDA/software/netcdf-c/4.9.2/oneapi.eng.2023.05.15.007/bin:$ENV{PATH}</env>
Are these blocks ("modules", "env variables", ...) executed in the order they appear in the file? If so, does it make sense to append env variables before modules are loaded?
It may be user error, but I am in a situation where a module is loaded and presumably modifies PATH, and then, I think, the PATH setting above "erases" that module's path, perhaps because the $ENV{PATH} value in use is from before the module was loaded.
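The ordering concern in this comment can be illustrated with a small Python sketch. It is only a model of the suspected failure mode, not a description of how CIME actually processes the machine file; the function names and directory paths are hypothetical. It contrasts prepending to the PATH that exists after a module load with prepending to a PATH value captured before the load.

```python
def prepend(current_path: str, entry: str) -> str:
    """Mimic '<entry>:$ENV{PATH}': put entry in front of the current PATH."""
    return f"{entry}:{current_path}" if current_path else entry


def module_then_env(initial_path: str, module_dir: str, env_dir: str) -> str:
    """Module load runs first; the env entry expands PATH afterwards."""
    after_module = prepend(initial_path, module_dir)
    return prepend(after_module, env_dir)


def env_from_stale_snapshot(initial_path: str, module_dir: str, env_dir: str) -> str:
    """Suspected failure mode (an assumption): PATH is expanded from a value
    captured before the module load, then assigned after it."""
    snapshot = initial_path                      # captured too early
    prepend(initial_path, module_dir)            # module load; its result is overwritten
    return prepend(snapshot, env_dir)            # module_dir is lost


if __name__ == "__main__":
    base = "/usr/bin"
    mod = "/opt/hypothetical-module/bin"         # hypothetical module directory
    env = "/lus/hypothetical/pnetcdf/bin"        # hypothetical project directory
    print(module_then_env(base, mod, env))
    print(env_from_stale_snapshot(base, mod, env))
```

If the second pattern matches what is happening on Aurora, the module's directory would be missing from the final PATH, which is consistent with the symptom described above. Checking the actual evaluation order in CIME would confirm or rule this out.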