{devel}[foss/2024a] Triton v3.1.0 w/ CUDA 12.6.0 #22064

Open: wants to merge 15 commits into base: develop

ThomasHoffmann77
Contributor

ThomasHoffmann77 commented Dec 19, 2024

(created using eb --new-pr)
requires:

…1.6_10dc3a8-GCCcore-13.3.0-CUDA-12.6.0.eb and patches: Triton-3.1.0_5fe38ff_eb_env_python_build.patch

github-actions bot commented Dec 19, 2024

Updated software Triton-3.1.0-foss-2024a-CUDA-12.6.0.eb

Diff against Triton-2.1.0-foss-2023a-CUDA-12.1.1.eb

easybuild/easyconfigs/t/Triton/Triton-2.1.0-foss-2023a-CUDA-12.1.1.eb

diff --git a/easybuild/easyconfigs/t/Triton/Triton-2.1.0-foss-2023a-CUDA-12.1.1.eb b/easybuild/easyconfigs/t/Triton/Triton-3.1.0-foss-2024a-CUDA-12.6.0.eb
index a1cdbe87df..91576a9307 100644
--- a/easybuild/easyconfigs/t/Triton/Triton-2.1.0-foss-2023a-CUDA-12.1.1.eb
+++ b/easybuild/easyconfigs/t/Triton/Triton-3.1.0-foss-2024a-CUDA-12.6.0.eb
@@ -1,8 +1,14 @@
-easyblock = 'PythonPackage'
+# Update 3.1.0: Thomas Hoffmann, EMBL Heidelberg, [email protected], 2024/12
+
+easyblock = 'PythonBundle'
 
 name = 'Triton'
-version = '2.1.0'
+
+version = '3.1.0'
 versionsuffix = '-CUDA-%(cudaver)s'
+# There is no 3.1 in pypi and no 3.1-tag at github. However, 5fe38ffd is version bump 3.1 in the release_3.1.x branch:
+_commit = '5fe38ffd73c2ac6ed6323b554205186696631c6f'
+_clang_commit = '10dc3a8e916d73291269e5e2b82dd22681489aa1'  # acc. to cmake/llvm-hash.txt; 2024/05/23
 
 homepage = 'https://triton-lang.org/'
 
@@ -10,47 +16,116 @@ description = """Triton is a language and compiler for parallel programming. It
 Python-based programming environment for productively writing custom DNN compute
 kernels capable of running at maximal throughput on modern GPU hardware."""
 
-toolchain = {'name': 'foss', 'version': '2023a'}
+toolchain = {'name': 'foss', 'version': '2024a'}
 
 github_account = 'openai'
-source_urls = [GITHUB_LOWER_SOURCE]
-sources = ['v%(version)s.tar.gz']
-patches = [
-    '%(name)s-%(version)s-disable_rocm_support.patch',
-    '%(name)s-%(version)s-use_eb_env_python_build.patch',
-]
-checksums = [
-    {'v2.1.0.tar.gz': '4338ca0e80a059aec2671f02bfc9320119b051f378449cf5f56a1273597a3d99'},
-    {'Triton-2.1.0-disable_rocm_support.patch': 'e4d7c0947c3287b3f0871a004e8b483963f637c9fa3ef6212ac3a34660de2a7c'},
-    {'Triton-2.1.0-use_eb_env_python_build.patch': 'd68bf766c699ad6a778d9449d3bccdadc2f20f1f86ba13e1359ad297b12fbf7c'},
-]
 
 builddependencies = [
-    ('Clang', '17.0.0_20230515', versionsuffix),  # this is the exact commit that would be downloaded by Triton
-    ('CMake', '3.26.3'),
+    ('CMake', '3.29.3'),
+    ('Ninja', '1.12.1'),
+    ('pybind11', '2.13.6'),
+    ('poetry', '1.8.3'),
+    ('nlohmann_json', '3.11.3'),
+    ('googletest', '1.15.2'),
 ]
 
 dependencies = [
-    ('CUDA', '12.1.1', '', SYSTEM),
-    ('Python', '3.11.3'),
-    ('PyTorch', '2.1.2', versionsuffix),
+    ('CUDA', '12.6.0', '', SYSTEM),
+    ('Python', '3.12.3'),
+    ('Z3', '4.13.0'),
+]
+
+_llvm_confopts = [
+    # acc. to: 
+    # https://github.com/triton-lang/triton?tab=readme-ov-file#building-with-a-custom-llvm
+    '-DLLVM_ENABLE_ASSERTIONS=ON',
+    '-DLLVM_ENABLE_PROJECTS="mlir;llvm"',
+    '-DLLVM_TARGETS_TO_BUILD="X86;NVPTX"',
+]
+
+components = [
+    ('LLVM', _clang_commit, {
+        'easyblock': 'CMakeNinja',
+        'source_urls': ['https://github.com/llvm/llvm-project/archive/'],
+        'sources': [{
+            'download_filename': '%(version)s.tar.gz',
+            'filename': 'llvm-project-%(version)s.tar.gz',
+        }],
+        'checksums': [
+            {'llvm-project-10dc3a8e916d73291269e5e2b82dd22681489aa1.tar.gz':
+             '6ee5e0f9a49d41b5f48ebc4613ce3371f686bf70fcece9f849aba3c37bdeb3e8'},
+        ],
+        'start_dir': 'llvm-project-%(version)s',
+        'configopts': ' '.join(_llvm_confopts),
+        'srcdir': 'llvm',
+        'skipsteps': ['install']
+    })
 ]
 
 use_pip = True
-download_dep_fail = True
 
-start_dir = 'python'
+_tr_start_dir = 'python'
+
+_tr_preinstallopts = 'export PYBIND11_SYSPATH=$EBROOTPYBIND11 && '
+_tr_preinstallopts += 'export JSON_SYSPATH=$EBROOTNLOHMANN_JSON && '
+# use LLVM component in builddir:
+_tr_preinstallopts += 'export PATH=%(builddir)s/easybuild_obj/bin:$PATH && '
+_tr_preinstallopts += 'export LLVM_INCLUDE_DIRS=%(builddir)s/easybuild_obj/include && '
+_tr_preinstallopts += 'export LLVM_LIBRARY_DIR=%(builddir)s/easybuild_obj/lib && '
+_tr_preinstallopts += 'export LLVM_SYSPATH=%(builddir)s/easybuild_obj/ && '
 
-preinstallopts = 'export LLVM_INCLUDE_DIRS=$EBROOTCLANG/include && '
-preinstallopts += 'export LLVM_LIBRARY_DIR=$EBROOTCLANG/lib && '
-preinstallopts += 'export LLVM_SYSPATH=$EBROOTCLANG && '
-preinstallopts += 'export TRITON_BUILD_WITH_CLANG_LLD=1 && '
+_tr_preinstallopts += 'export TRITON_BUILD_WITH_CLANG_LLD=false && '
+_tr_preinstallopts += 'export TRITON_HOME=%(builddir)s && '
 
-# make pip print output of cmake
-installopts = "-v "
+_tr_installopts = "-v "
+
+exts_list = [
+    (name, version, {
+        'installopts': _tr_installopts,
+        'patches': [
+            'Triton-3.1.0_5fe38ff_eb_env_python_build.patch',
+            'Triton-3.1.0_5fe38ff_CUDA-12.6_ptx.patch',
+        ],
+        # ensure that libdevice.10.bc from $EBROOTCUDA/nvvm/libdevice is used:
+        'postinstallcmds': [
+            'rm -rf %(installdir)s/lib/python%(pyshortver)s/site-packages/triton/backends/nvidia/lib/libdevice.10.bc'
+        ],
+        'preinstallopts': _tr_preinstallopts,
+        'source_urls': ['https://github.com/triton-lang/triton/archive/'],
+        'sources': [{
+            'filename': 'v%%(version)s-%s.tar.gz' % _commit,
+            'download_filename': '%s.tar.gz' % _commit}],
+        'start_dir': 'python',
+        'checksums': [
+            {'v3.1.0-5fe38ffd73c2ac6ed6323b554205186696631c6f.tar.gz':
+             '933babc32b69872efbce05fe8be61129fecf52c724fadea42d8c7b2d10e16ad9'},
+            {'Triton-3.1.0_5fe38ff_eb_env_python_build.patch':
+             '6b46064b4892c7df340b6afd7ffb4abb2ea4486df9406626cd9b2c92a748705d'},
+            {'Triton-3.1.0_5fe38ff_CUDA-12.6_ptx.patch':
+             '2be8609141375ee381364ef74d74c12af598fc0b06357689c9f32d9f2514eff4'},
+        ],
+    }),
+    ('filelock', '3.15.1', {
+        'checksums': ['58a2549afdf9e02e10720eaa4d4470f56386d7a6f72edd7d0596337af8ed7ad8'],
+    }),
+]
 
 sanity_pip_check = True
 
-modluafooter = 'setenv("TRITON_PTXAS_PATH", pathJoin(os.getenv("CUDA_HOME"), "bin", "ptxas"))'
+modluafooter = """
+setenv("TRITON_PTXAS_PATH", pathJoin(os.getenv("CUDA_HOME"), "bin", "ptxas"))
+"""
+# ensure that libdevice.10.bc from $EBROOTCUDA/nvvm/libdevice is used:
+modluafooter += """
+setenv("TRITON_LIBDEVICE_PATH", pathJoin(os.getenv("CUDA_HOME"), "nvvm", "libdevice", "libdevice.10.bc"))
+"""
+
+modtclfooter = """
+setenv TRITON_PTXAS_PATH $::env(CUDA_HOME)/bin/ptxas
+"""
+# ensure that libdevice.10.bc from $EBROOTCUDA/nvvm/libdevice is used:
+modtclfooter += """
+setenv TRITON_LIBDEVICE_PATH $::env(CUDA_HOME)/nvvm/libdevice/libdevice.10.bc
+"""
 
 moduleclass = 'devel'
Diff against Triton-1.1.1-foss-2022a-CUDA-11.7.0.eb

easybuild/easyconfigs/t/Triton/Triton-1.1.1-foss-2022a-CUDA-11.7.0.eb

diff --git a/easybuild/easyconfigs/t/Triton/Triton-1.1.1-foss-2022a-CUDA-11.7.0.eb b/easybuild/easyconfigs/t/Triton/Triton-3.1.0-foss-2024a-CUDA-12.6.0.eb
index c7b10ad68a..91576a9307 100644
--- a/easybuild/easyconfigs/t/Triton/Triton-1.1.1-foss-2022a-CUDA-11.7.0.eb
+++ b/easybuild/easyconfigs/t/Triton/Triton-3.1.0-foss-2024a-CUDA-12.6.0.eb
@@ -1,8 +1,14 @@
-easyblock = 'PythonPackage'
+# Update 3.1.0: Thomas Hoffmann, EMBL Heidelberg, [email protected], 2024/12
+
+easyblock = 'PythonBundle'
 
 name = 'Triton'
-version = '1.1.1'
+
+version = '3.1.0'
 versionsuffix = '-CUDA-%(cudaver)s'
+# There is no 3.1 in pypi and no 3.1-tag at github. However, 5fe38ffd is version bump 3.1 in the release_3.1.x branch:
+_commit = '5fe38ffd73c2ac6ed6323b554205186696631c6f'
+_clang_commit = '10dc3a8e916d73291269e5e2b82dd22681489aa1'  # acc. to cmake/llvm-hash.txt; 2024/05/23
 
 homepage = 'https://triton-lang.org/'
 
@@ -10,40 +16,116 @@ description = """Triton is a language and compiler for parallel programming. It
 Python-based programming environment for productively writing custom DNN compute
 kernels capable of running at maximal throughput on modern GPU hardware."""
 
-toolchain = {'name': 'foss', 'version': '2022a'}
+toolchain = {'name': 'foss', 'version': '2024a'}
 
 github_account = 'openai'
-source_urls = [GITHUB_LOWER_SOURCE]
-sources = ['v%(version)s.tar.gz']
-patches = [
-    'Triton-%(version)s-disable_rocm_support.patch',
-    'Triton-%(version)s-use_eb_env_python_build.patch',
-]
-checksums = [
-    {'v1.1.1.tar.gz': '6b0e4a4375068938f7045819987b51299762abf0b1f39948f839d069ed9366bc'},
-    {'Triton-1.1.1-disable_rocm_support.patch': 'abdd50246c668d7fe9889bbe4e8ca84ea4b1b762e814f099919bcbee7c037c62'},
-    {'Triton-1.1.1-use_eb_env_python_build.patch': '428a86da560b5f4353e956452f495ec022dcfbb51aa283dab50551369d7838b4'},
-]
 
 builddependencies = [
-    ('Clang', '13.0.1', versionsuffix),
-    ('CMake', '3.23.1'),
+    ('CMake', '3.29.3'),
+    ('Ninja', '1.12.1'),
+    ('pybind11', '2.13.6'),
+    ('poetry', '1.8.3'),
+    ('nlohmann_json', '3.11.3'),
+    ('googletest', '1.15.2'),
 ]
 
 dependencies = [
-    ('CUDA', '11.7.0', '', SYSTEM),
-    ('Python', '3.10.4'),
-    ('PyTorch', '1.12.0', versionsuffix),
+    ('CUDA', '12.6.0', '', SYSTEM),
+    ('Python', '3.12.3'),
+    ('Z3', '4.13.0'),
+]
+
+_llvm_confopts = [
+    # acc. to: 
+    # https://github.com/triton-lang/triton?tab=readme-ov-file#building-with-a-custom-llvm
+    '-DLLVM_ENABLE_ASSERTIONS=ON',
+    '-DLLVM_ENABLE_PROJECTS="mlir;llvm"',
+    '-DLLVM_TARGETS_TO_BUILD="X86;NVPTX"',
+]
+
+components = [
+    ('LLVM', _clang_commit, {
+        'easyblock': 'CMakeNinja',
+        'source_urls': ['https://github.com/llvm/llvm-project/archive/'],
+        'sources': [{
+            'download_filename': '%(version)s.tar.gz',
+            'filename': 'llvm-project-%(version)s.tar.gz',
+        }],
+        'checksums': [
+            {'llvm-project-10dc3a8e916d73291269e5e2b82dd22681489aa1.tar.gz':
+             '6ee5e0f9a49d41b5f48ebc4613ce3371f686bf70fcece9f849aba3c37bdeb3e8'},
+        ],
+        'start_dir': 'llvm-project-%(version)s',
+        'configopts': ' '.join(_llvm_confopts),
+        'srcdir': 'llvm',
+        'skipsteps': ['install']
+    })
 ]
 
 use_pip = True
-download_dep_fail = True
 
-start_dir = 'python'
+_tr_start_dir = 'python'
+
+_tr_preinstallopts = 'export PYBIND11_SYSPATH=$EBROOTPYBIND11 && '
+_tr_preinstallopts += 'export JSON_SYSPATH=$EBROOTNLOHMANN_JSON && '
+# use LLVM component in builddir:
+_tr_preinstallopts += 'export PATH=%(builddir)s/easybuild_obj/bin:$PATH && '
+_tr_preinstallopts += 'export LLVM_INCLUDE_DIRS=%(builddir)s/easybuild_obj/include && '
+_tr_preinstallopts += 'export LLVM_LIBRARY_DIR=%(builddir)s/easybuild_obj/lib && '
+_tr_preinstallopts += 'export LLVM_SYSPATH=%(builddir)s/easybuild_obj/ && '
 
-# make pip print output of cmake
-installopts = "-v "
+_tr_preinstallopts += 'export TRITON_BUILD_WITH_CLANG_LLD=false && '
+_tr_preinstallopts += 'export TRITON_HOME=%(builddir)s && '
+
+_tr_installopts = "-v "
+
+exts_list = [
+    (name, version, {
+        'installopts': _tr_installopts,
+        'patches': [
+            'Triton-3.1.0_5fe38ff_eb_env_python_build.patch',
+            'Triton-3.1.0_5fe38ff_CUDA-12.6_ptx.patch',
+        ],
+        # ensure that libdevice.10.bc from $EBROOTCUDA/nvvm/libdevice is used:
+        'postinstallcmds': [
+            'rm -rf %(installdir)s/lib/python%(pyshortver)s/site-packages/triton/backends/nvidia/lib/libdevice.10.bc'
+        ],
+        'preinstallopts': _tr_preinstallopts,
+        'source_urls': ['https://github.com/triton-lang/triton/archive/'],
+        'sources': [{
+            'filename': 'v%%(version)s-%s.tar.gz' % _commit,
+            'download_filename': '%s.tar.gz' % _commit}],
+        'start_dir': 'python',
+        'checksums': [
+            {'v3.1.0-5fe38ffd73c2ac6ed6323b554205186696631c6f.tar.gz':
+             '933babc32b69872efbce05fe8be61129fecf52c724fadea42d8c7b2d10e16ad9'},
+            {'Triton-3.1.0_5fe38ff_eb_env_python_build.patch':
+             '6b46064b4892c7df340b6afd7ffb4abb2ea4486df9406626cd9b2c92a748705d'},
+            {'Triton-3.1.0_5fe38ff_CUDA-12.6_ptx.patch':
+             '2be8609141375ee381364ef74d74c12af598fc0b06357689c9f32d9f2514eff4'},
+        ],
+    }),
+    ('filelock', '3.15.1', {
+        'checksums': ['58a2549afdf9e02e10720eaa4d4470f56386d7a6f72edd7d0596337af8ed7ad8'],
+    }),
+]
 
 sanity_pip_check = True
 
+modluafooter = """
+setenv("TRITON_PTXAS_PATH", pathJoin(os.getenv("CUDA_HOME"), "bin", "ptxas"))
+"""
+# ensure that libdevice.10.bc from $EBROOTCUDA/nvvm/libdevice is used:
+modluafooter += """
+setenv("TRITON_LIBDEVICE_PATH", pathJoin(os.getenv("CUDA_HOME"), "nvvm", "libdevice", "libdevice.10.bc"))
+"""
+
+modtclfooter = """
+setenv TRITON_PTXAS_PATH $::env(CUDA_HOME)/bin/ptxas
+"""
+# ensure that libdevice.10.bc from $EBROOTCUDA/nvvm/libdevice is used:
+modtclfooter += """
+setenv TRITON_LIBDEVICE_PATH $::env(CUDA_HOME)/nvvm/libdevice/libdevice.10.bc
+"""
+
 moduleclass = 'devel'
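
The module footers in the easyconfig above export `TRITON_PTXAS_PATH` and `TRITON_LIBDEVICE_PATH` relative to `CUDA_HOME`. A minimal sketch for checking, after loading the generated module, that both variables hold the intended paths and that the files exist. The helper names `expected_triton_paths` and `check_triton_env` are hypothetical and not part of EasyBuild or Triton.

```python
import os
from pathlib import Path


def expected_triton_paths(cuda_home):
    """Paths the module footers are meant to export, derived from CUDA_HOME."""
    root = Path(cuda_home)
    return {
        "TRITON_PTXAS_PATH": root / "bin" / "ptxas",
        "TRITON_LIBDEVICE_PATH": root / "nvvm" / "libdevice" / "libdevice.10.bc",
    }


def check_triton_env(environ=None):
    """Return a list of human-readable problems; an empty list means all good."""
    environ = os.environ if environ is None else environ
    cuda_home = environ.get("CUDA_HOME")
    if not cuda_home:
        return ["CUDA_HOME is not set"]

    problems = []
    for name, expected in expected_triton_paths(cuda_home).items():
        actual = environ.get(name)
        if actual is None:
            problems.append(f"{name} is not set")
        elif Path(actual) != expected:
            # Catches a misspelled directory or a path built from the wrong variable.
            problems.append(f"{name} is {actual}, expected {expected}")
        elif not expected.is_file():
            problems.append(f"{name} points to a missing file: {expected}")
    return problems


if __name__ == "__main__":
    for problem in check_triton_env():
        print(problem)
```

Running this once under the Lua module and once under the Tcl module would show whether the two footers agree.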

downgrade pybind11 dependency to v2.12.0
…to Clang-19.0.0_10dc3a8-GCCcore-13.3.0-CUDA-12.6.0.eb
change Clang version acc. to ./llvm/utils/gn/secondary/llvm/version.gni
ThomasHoffmann77 changed the title from "{compiler,devel}[GCCcore/13.3.0,foss/2024a] Triton v3.1.0, Clang v18.1.6_10dc3a8 w/ CUDA 12.6.0" to "{compiler,devel}[GCCcore/13.3.0,foss/2024a] Triton v3.1.0, Clang v19.0.0_10dc3a8 w/ CUDA 12.6.0" on Dec 20, 2024
@smoors
Contributor

smoors commented Dec 25, 2024

@boegelbot: please test @ generoso

@boegelbot
Collaborator

@smoors: Request for testing this PR well received on login1

PR test command 'EB_PR=22064 EB_ARGS= EB_CONTAINER= EB_REPO=easybuild-easyconfigs /opt/software/slurm/bin/sbatch --job-name test_PR_22064 --ntasks=4 ~/boegelbot/eb_from_pr_upload_generoso.sh' executed!

  • exit code: 0
  • output:
Submitted batch job 14902

Test results coming soon (I hope)...

- notification for comment with ID 2561743004 processed

Message to humans: this is just bookkeeping information for me,
it is of no use to you (unless you think I have a bug, which I don't).

@smoors
Contributor

smoors commented Dec 25, 2024

@boegelbot: please test @ generoso

@boegelbot
Collaborator

@smoors: Request for testing this PR well received on login1

PR test command 'EB_PR=22064 EB_ARGS= EB_CONTAINER= EB_REPO=easybuild-easyconfigs /opt/software/slurm/bin/sbatch --job-name test_PR_22064 --ntasks=4 ~/boegelbot/eb_from_pr_upload_generoso.sh' executed!

  • exit code: 0
  • output:
Submitted batch job 14903

Test results coming soon (I hope)...

- notification for comment with ID 2561991710 processed

Message to humans: this is just bookkeeping information for me,
it is of no use to you (unless you think I have a bug, which I don't).

@boegelbot
Collaborator

Test report by @boegelbot
FAILED
Build succeeded for 0 out of 2 (2 easyconfigs in total)
cns1 - Linux Rocky Linux 8.9, x86_64, Intel(R) Xeon(R) CPU E5-2667 v3 @ 3.20GHz (haswell), Python 3.6.8
See https://gist.github.com/boegelbot/eb092c5a0dfdd4648bb1b864f82cfae8 for a full test report.

@smoors
Contributor

smoors commented Dec 25, 2024

@boegelbot please test @ jsc-zen3

@boegelbot
Collaborator

@smoors: Request for testing this PR well received on jsczen3l1.int.jsc-zen3.fz-juelich.de

PR test command 'if [[ develop != 'develop' ]]; then EB_BRANCH=develop ./easybuild_develop.sh 2> /dev/null 1>&2; EB_PREFIX=/home/boegelbot/easybuild/develop source init_env_easybuild_develop.sh; fi; EB_PR=22064 EB_ARGS= EB_CONTAINER= EB_REPO=easybuild-easyconfigs EB_BRANCH=develop /opt/software/slurm/bin/sbatch --job-name test_PR_22064 --ntasks=8 ~/boegelbot/eb_from_pr_upload_jsc-zen3.sh' executed!

  • exit code: 0
  • output:
Submitted batch job 5479

Test results coming soon (I hope)...

- notification for comment with ID 2561992640 processed

Message to humans: this is just bookkeeping information for me,
it is of no use to you (unless you think I have a bug, which I don't).

@boegelbot
Collaborator

Test report by @boegelbot
FAILED
Build succeeded for 0 out of 2 (2 easyconfigs in total)
jsczen3c1.int.jsc-zen3.fz-juelich.de - Linux Rocky Linux 9.5, x86_64, AMD EPYC-Milan Processor (zen3), Python 3.9.19
See https://gist.github.com/boegelbot/45cd9bde45669911766b1f70f9ea3052 for a full test report.

@smoors
Contributor

smoors commented Dec 26, 2024

Sanity check failed: no file found at 'lib/libomp.so' in /project/def-maintainers/boegelbot/rocky9/zen3/software/Clang/19.0.0_10dc3a8-GCCcore-13.3.0-CUDA-12.6.0
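
This failure means the build itself finished, but a file that the sanity check expects was not present under the installation prefix. A rough sketch of that kind of check, under the assumption that it amounts to testing a list of relative paths for existence. This is not EasyBuild's actual implementation, and `missing_files` is a hypothetical helper.

```python
from pathlib import Path


def missing_files(install_dir, relative_paths):
    """Return the relative paths that do not exist as files under install_dir."""
    prefix = Path(install_dir)
    return [rel for rel in relative_paths if not (prefix / rel).is_file()]
```

Calling `missing_files(prefix, ["lib/libomp.so"])` on the Clang installation directory would show whether the library is absent or only installed somewhere other than `lib/`.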

@ThomasHoffmann77
Contributor Author

Could we use Clang 18.1.8 instead? Triton 3.1.0 uses ConstantEnumCase and llvm::max_element, so would that need a patch?

ThomasHoffmann77 changed the title from "{compiler,devel}[GCCcore/13.3.0,foss/2024a] Triton v3.1.0, Clang v19.0.0_10dc3a8 w/ CUDA 12.6.0" to "{compiler,devel}[GCCcore/13.3.0,foss/2024a] Triton v3.1.0 w/ CUDA 12.6.0" on Jan 9, 2025
@ThomasHoffmann77
Contributor Author

@smoors I removed the Clang 19.0.0 easyconfig because it would probably require adapting EasyBuild's sanity check. Instead, I now build LLVM+MLIR as a component.
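
With LLVM built as a component, the easyconfig points the Triton build at the component's build tree through a chain of `export` statements in `preinstallopts`. A compact sketch of how that chain could be generated instead of concatenated by hand. The function name `llvm_env_prefix` is hypothetical, and the `easybuild_obj` directory name is taken from the easyconfig in this PR.

```python
def llvm_env_prefix(builddir, objdir="easybuild_obj"):
    """Build the 'export VAR=value && ' chain that points Triton at an LLVM build tree."""
    root = f"{builddir}/{objdir}"
    exports = [
        ("PATH", f"{root}/bin:$PATH"),
        ("LLVM_INCLUDE_DIRS", f"{root}/include"),
        ("LLVM_LIBRARY_DIR", f"{root}/lib"),
        ("LLVM_SYSPATH", f"{root}/"),
    ]
    return "".join(f"export {name}={value} && " for name, value in exports)


if __name__ == "__main__":
    # In an easyconfig, the EasyBuild template %(builddir)s would be passed through unchanged.
    print(llvm_env_prefix("%(builddir)s"))
```

Keeping the variable names in one list makes it harder for the include, library, and syspath entries to drift apart when the build directory layout changes.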

ThomasHoffmann77 changed the title from "{compiler,devel}[GCCcore/13.3.0,foss/2024a] Triton v3.1.0 w/ CUDA 12.6.0" to "{devel}[foss/2024a] Triton v3.1.0 w/ CUDA 12.6.0" on Jan 9, 2025
4 participants