[WIP] Add admin.list files to warn users of known issues #197
base: main
In general I like this approach, and it's sort of what we had in mind for keeping track of known issues, but we should also consider whether these "nag" messages on load will be too alarming. For example, if a module is loaded indirectly as a dependency, should we emit a message? Maybe the known issues list should also indicate whether a message should be emitted when the corresponding module is loaded; in some cases it may only make sense to emit the message when the module is loaded directly (not as a dependency). We should definitely also point to the "Known issues" page in the EESSI documentation, where people can get more information. We have a page like that, but it doesn't currently list the known issues included in the YAML file: http://www.eessi.io/docs/known_issues/eessi-2023.06
My preference would be not to warn users if the module is loaded as a dependency, mostly because that could get very verbose if the modules with warnings are very common dependencies. It might also be very alarming in situations where there is no real cause for alarm... I haven't tried it yet, but there is a good chance the messages are triggered even for modules loaded as dependencies.
I agree, and @bedroge suggests having an entry in the known issues YAML file that determines whether the message gets displayed. With some refactoring of the file that we discussed (details below), adding this is simple, and the Python script can easily parse it to determine if the module gets added to the
Absolutely, it would be good to automatically parse the YAML file and add the information in an easy-to-read format on the "Known Issues" page. Maybe it could also be added to the installed software list by adding the issue information to the relevant pages, but that might be more complicated than it's worth.

We discussed the best way to handle converting the easyconfig name to the module name and agreed to go with @casparvl's suggestion of revamping the YAML file. The reworked file would have explicit fields for architecture, module name, version, toolchain, link to the relevant GH issue, a short description of the problem, and whether the warning message should be displayed. We would have to enforce this formatting, and there might be corner cases that we didn't anticipate, but this approach is fairly general and would let sites that use other module naming schemes grab this information and adapt the

I will do a semi-manual conversion of the current YAML file and paste it here to propose a change. We could then also add some CI checks that make sure new items contain at least the required fields.
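As a rough illustration of the CI check mentioned above, here is a minimal sketch. It assumes each known issue has already been loaded into a flat Python dict; the field names in `REQUIRED_FIELDS` and the helper `check_entries` are hypothetical and only mirror the fields listed in this comment, not an agreed schema.

```python
# Hypothetical field names, mirroring the fields discussed in this thread.
REQUIRED_FIELDS = {
    "architecture", "module", "version", "toolchain", "issue", "info", "warn",
}


def check_entries(entries):
    """Return a list of human-readable errors, empty if all entries are valid."""
    errors = []
    for index, entry in enumerate(entries):
        missing = sorted(REQUIRED_FIELDS - set(entry))
        if missing:
            errors.append(f"entry {index}: missing {', '.join(missing)}")
        elif not isinstance(entry["warn"], bool):
            # The warn flag decides whether a message is shown, so it must be a boolean.
            errors.append(f"entry {index}: 'warn' must be true or false")
    return errors


if __name__ == "__main__":
    complete = {
        "architecture": "aarch64/a64x",
        "module": "SciPy-bundle",
        "version": "2023.07",
        "toolchain": "gfbf-2023a",
        "issue": "https://github.com/EESSI/software-layer/issues/318",
        "info": "4 failing tests (vs 54407 passed) in scipy test suite",
        "warn": True,
    }
    incomplete = {k: v for k, v in complete.items() if k not in ("issue", "warn")}
    for error in check_entries([complete, incomplete]):
        print(error)
```

A CI job could fail whenever the returned list is non-empty, which keeps the check independent of how the file ends up being structured.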
A proposal of what the yml file could contain:

- aarch64/a64x:
  - SciPy-bundle/2023.07-gfbf-2023a:
    - software_name: SciPy-bundle
    - software_version: 2023.07
    - toolchain: gfbf
    - toolchain_version: 2023a
    - issue: https://github.com/EESSI/software-layer/issues/318
    - info: "4 failing tests (vs 54407 passed) in scipy test suite"
    - warn: true
  - SciPy-bundle/2023.11-gfbf-2023b:
    - software_name: SciPy-bundle
    - software_version: 2023.11
    - toolchain: gfbf
    - toolchain_version: 2023b
    - issue: https://github.com/EESSI/software-layer/issues/318
    - info: "3 failing tests (vs 54875 passed) in scipy test suite"

The

Edit: Add some of the changes from support meeting (so I don't forget :) )
@Neves-P I would also keep track of software name, and then maybe this is better:

- aarch64/a64x:
  - SciPy-bundle/2023.07-gfbf-2023a:
    - software:
      - name: SciPy-bundle
      - version: 2023.07
    - toolchain:
      - name: gfbf
      - version: 2023a
    - issue: https://github.com/EESSI/software-layer/issues/318
    - info: "4 failing tests (vs 54407 passed) in scipy test suite"
    - warn: true

Maybe the
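To make the effect of the `warn` field concrete, here is a small sketch of how a script might pick out the modules that should trigger a message. It assumes the file has already been loaded into nested Python lists of single-key mappings, which is roughly what a YAML loader would return for the layout proposed above; the helper names `merge` and `modules_to_warn` are made up for illustration, and an absent `warn` field is treated as "do not warn", which is an assumption rather than something agreed in this thread.

```python
def merge(items):
    """Collapse a list of single-key mappings into one dict, recursively."""
    merged = {}
    for item in items:
        for key, value in item.items():
            merged[key] = merge(value) if isinstance(value, list) else value
    return merged


def modules_to_warn(known_issues):
    """Return (architecture, module) pairs whose entry sets warn to true."""
    selected = []
    for arch, modules in merge(known_issues).items():
        for module, fields in modules.items():
            # Assumption: a missing 'warn' field means no message is shown.
            if fields.get("warn", False):
                selected.append((arch, module))
    return selected


# Hand-written stand-in for the loaded file; versions are kept as strings here.
KNOWN_ISSUES = [
    {"aarch64/a64x": [
        {"SciPy-bundle/2023.07-gfbf-2023a": [
            {"software": [{"name": "SciPy-bundle"}, {"version": "2023.07"}]},
            {"toolchain": [{"name": "gfbf"}, {"version": "2023a"}]},
            {"issue": "https://github.com/EESSI/software-layer/issues/318"},
            {"info": "4 failing tests (vs 54407 passed) in scipy test suite"},
            {"warn": True},
        ]},
        {"SciPy-bundle/2023.11-gfbf-2023b": [
            {"software": [{"name": "SciPy-bundle"}, {"version": "2023.11"}]},
            {"toolchain": [{"name": "gfbf"}, {"version": "2023b"}]},
            {"issue": "https://github.com/EESSI/software-layer/issues/318"},
            {"info": "3 failing tests (vs 54875 passed) in scipy test suite"},
        ]},
    ]},
]

if __name__ == "__main__":
    for arch, module in modules_to_warn(KNOWN_ISSUES):
        print(arch, module)
```

Note that this only decides which entries are eligible for a message; whether a module was loaded directly or as a dependency is not something this sketch can see.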
A small number of software packages installed on EESSI have issues in specific contexts that users should be made aware of (see support ticket #79).
This WIP PR adds admin.list files per architecture to use Lmod's module deprecation feature in order to display a message to users when they load a module with known issues. The information about the known issues comes from the YAML file(s) in the root of the software-layer repository: eessi-2023.06-known-issues.yml

The warnings should appear only in the context where they apply, i.e., a user on zen4 CPUs shouldn't be warned about a few failing tests for SciPy in neoverse_v1.

The admin.list files should have the following format:

The first line can be a module name and version, but also the full path to a module. The full path is preferred, since it ensures that we are picking up the relevant module in the right context, and that we are not displaying unintended warnings should EESSI be mounted at a site that has duplicate local installations of the same module.
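As a sketch of how a full module path could be assembled per CPU target, the snippet below joins a modules root, a CPU target, and a module name. The root is taken from the example paths in this PR; the names `MODULES_ROOT` and `module_path` are hypothetical, and the sketch does not deal with any file extension the module file itself may carry.

```python
from pathlib import PurePosixPath

# Root taken from the example paths in this PR description.
MODULES_ROOT = PurePosixPath(
    "/cvmfs/software.eessi.io/versions/2023.06/software/linux/modules/all"
)


def module_path(cpu_target, module):
    """Build the full path for a module under a given CPU target.

    cpu_target is e.g. "aarch64/neoverse_v1"; module is "name/version".
    """
    return str(MODULES_ROOT / cpu_target / module)


if __name__ == "__main__":
    print(module_path("aarch64/neoverse_v1", "ESPResSo/4.2.1-foss-2023a"))
```

Because the CPU target is part of the path, an entry built this way can only match the installation for that target, which is the context-specific behaviour described above.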
This PR is marked as WIP because there is still an issue to solve. Parsing the known_issues.yml file yields almost the correct path, but the module directory is incorrect because the file records the easyconfig name rather than the module name. Compare (note the dash between module name and version):

Expected - /cvmfs/software.eessi.io/versions/2023.06/software/linux/modules/all/aarch64/neoverse_v1/ESPResSo/4.2.1-foss-2023a
Obtained - /cvmfs/software.eessi.io/versions/2023.06/software/linux/modules/all/aarch64/neoverse_v1/ESPResSo-4.2.1-foss-2023a
Converting between easyconfig name and module name is more complicated than I initially thought, because module names vary widely and can be composed of an arbitrary number of dash-separated words.
I see two options:
known_issues.yml file to match the expected output.
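For reference, the easyconfig-to-module conversion described above can be approximated with a heuristic: split at the first dash that is immediately followed by a digit. This is only a sketch, not the approach taken in this PR; the function name `easyconfig_to_module` is hypothetical, and the heuristic gives wrong results for any software name that itself contains a dash followed by a digit.

```python
import re

# Shortest possible name, then a dash, then a version that starts with a digit.
_EASYCONFIG_RE = re.compile(r"^(?P<name>.+?)-(?P<version>\d.*)$")


def easyconfig_to_module(easyconfig_name):
    """Heuristically turn 'Name-Version-Toolchain' into 'Name/Version-Toolchain'."""
    match = _EASYCONFIG_RE.match(easyconfig_name)
    if match is None:
        raise ValueError(f"cannot split easyconfig name: {easyconfig_name!r}")
    return f"{match['name']}/{match['version']}"


if __name__ == "__main__":
    for name in ("ESPResSo-4.2.1-foss-2023a", "SciPy-bundle-2023.07-gfbf-2023a"):
        print(name, "->", easyconfig_to_module(name))
```

The fragility of this kind of guess is one argument for recording the software name and version as explicit fields instead of deriving them from the easyconfig name.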