Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Refactor] CI job generator: define in the version.py which backend combinations should be tested #2099

Merged

Conversation

SimeonEhrig
Copy link
Member

@SimeonEhrig SimeonEhrig commented Aug 28, 2023

Before, the backends was enabled depending on the host and device compiler.
For each host device compiler combination only one backend combination was possible.
Now, the backend combinations are defined in the versions.py and each combination can be combined with a host device compiler combination, if no filter rule forbid it.

Is preparation for enabling GCC and Clang CPU tests in the GitLab CI.
Should solve also issue #639 -> no, see here

@SimeonEhrig
Copy link
Member Author

The failed TSAN job is independent of this PR, see #2101.

@SimeonEhrig
Copy link
Member Author

SimeonEhrig commented Aug 29, 2023

There is no rule, which requires an enabled OpenMP2 backend for nvcc as device compiler. Therefore both backend combinations are tested.

# nvcc
[
ALPAKA_ACC_CPU_B_SEQ_T_SEQ_ENABLE,
ALPAKA_ACC_CPU_B_SEQ_T_THREADS_ENABLE,
ALPAKA_ACC_CPU_B_OMP2_T_SEQ_ENABLE,
ALPAKA_ACC_CPU_B_SEQ_T_OMP2_ENABLE,
ALPAKA_ACC_GPU_CUDA_ENABLE,
],
# clang-cuda
# OpenMP is not supported for clang as cuda compiler
# https://github.com/alpaka-group/alpaka/issues/639
# therefore a extra combination is required
[
ALPAKA_ACC_CPU_B_SEQ_T_SEQ_ENABLE,
ALPAKA_ACC_CPU_B_SEQ_T_THREADS_ENABLE,
ALPAKA_ACC_GPU_CUDA_ENABLE,
],

@psychocoderHPC Any opinion about the behavior?

@SimeonEhrig SimeonEhrig marked this pull request as ready for review August 29, 2023 05:21
@SimeonEhrig SimeonEhrig requested a review from j-stephan August 29, 2023 05:21
@psychocoderHPC
Copy link
Member

There is no rule, which requires an enabled OpenMP2 backend for nvcc as device compiler. Therefore both backend combinations are tested.

# nvcc
[
ALPAKA_ACC_CPU_B_SEQ_T_SEQ_ENABLE,
ALPAKA_ACC_CPU_B_SEQ_T_THREADS_ENABLE,
ALPAKA_ACC_CPU_B_OMP2_T_SEQ_ENABLE,
ALPAKA_ACC_CPU_B_SEQ_T_OMP2_ENABLE,
ALPAKA_ACC_GPU_CUDA_ENABLE,
],
# clang-cuda
# OpenMP is not supported for clang as cuda compiler
# https://github.com/alpaka-group/alpaka/issues/639
# therefore a extra combination is required
[
ALPAKA_ACC_CPU_B_SEQ_T_SEQ_ENABLE,
ALPAKA_ACC_CPU_B_SEQ_T_THREADS_ENABLE,
ALPAKA_ACC_GPU_CUDA_ENABLE,
],

@psychocoderHPC Any opinion about the behavior?

@SimeonEhrig looks good for me.

psychocoderHPC
psychocoderHPC previously approved these changes Aug 29, 2023
# https://github.com/alpaka-group/alpaka/issues/639
# therefore a extra combination is required
[
ALPAKA_ACC_CPU_B_SEQ_T_SEQ_ENABLE,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How can I see that this is for clang-cuda except by the comment? So I am wondering how the generator know that for clang-cuda this configuration should be used and not the first entry in BACKENDS?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You cannot see it directly in code, because it is decided by filter rules. So clang-cuda requires ALPAKA_ACC_GPU_CUDA_ENABLE and does not allow the OMP2 backends:

# OpenMP is not supported for clang as cuda compiler
# https://github.com/alpaka-group/alpaka/issues/639
if row_check_name(row, DEVICE_COMPILER, "==", CLANG_CUDA) and (
row_check_backend_version(row, ALPAKA_ACC_CPU_B_OMP2_T_SEQ_ENABLE, "==", ON_VER)
or row_check_backend_version(
row, ALPAKA_ACC_CPU_B_SEQ_T_OMP2_ENABLE, "==", ON_VER
)
):
return False

You can check if a compiler backend passes the filter with ajc-validate tool (except the alpaka filter, I need to implement support for this 🤔 ).

Copy link
Member

@j-stephan j-stephan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a few suggestions:

script/job_generator/generate_job_yaml.py Outdated Show resolved Hide resolved
script/job_generator/generate_job_yaml.py Outdated Show resolved Hide resolved
script/job_generator/generate_job_yaml.py Outdated Show resolved Hide resolved
script/job_generator/versions.py Outdated Show resolved Hide resolved
script/job_generator/versions.py Outdated Show resolved Hide resolved
script/job_generator/versions.py Outdated Show resolved Hide resolved
script/job_generator/versions.py Outdated Show resolved Hide resolved
script/job_generator/versions.py Outdated Show resolved Hide resolved
@SimeonEhrig SimeonEhrig force-pushed the refactorBackendDefinitonCI branch 2 times, most recently from 2c6e995 to 8391949 Compare August 29, 2023 11:26
j-stephan
j-stephan previously approved these changes Aug 29, 2023
@SimeonEhrig SimeonEhrig marked this pull request as draft August 29, 2023 12:04
@SimeonEhrig
Copy link
Member Author

I stopped the merge process, because I found a bug in #2102 which already exist in this PR. The bug is simply not triggered, because we don't use a specific combination of enabled backends.

… should be tested

- Before, the backends was enabled depending on the host and device compiler.
- For each host device compiler combination only one backend combination was possible.
- Now, the backend combinations are defined in the versions.py and each combination can be combined with a host device compiler combination, if no filter rule forbid it.
- Add filter rule to disable OpenMP for Clang-CUDA.
@SimeonEhrig SimeonEhrig force-pushed the refactorBackendDefinitonCI branch from 8391949 to 52a4892 Compare August 30, 2023 11:17
@SimeonEhrig SimeonEhrig marked this pull request as ready for review August 31, 2023 07:23
@SimeonEhrig
Copy link
Member Author

I stopped the merge process, because I found a bug in #2102 which already exist in this PR. The bug is simply not triggered, because we don't use a specific combination of enabled backends.

@j-stephan The bug is fixed in the alpaka-job-coverage library 1.5.3. and tests passed: #2102

Therefore the PR is ready to merge. Since your last review I only increased alpaka-job-coverage library version.

@j-stephan j-stephan merged commit af4e30b into alpaka-group:develop Aug 31, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants