Pre-build Legion library #1042

Merged on Oct 23, 2023 (118 commits)
Commits
eb04a78
add optional flag for building legion only
DerrickYLJ Jul 24, 2023
35f02a7
added build path and legion-only flag
DerrickYLJ Jul 26, 2023
8ef6513
bug fix
DerrickYLJ Jul 26, 2023
f458e69
pass new variable with config file
DerrickYLJ Jul 27, 2023
0971852
move nccl
DerrickYLJ Jul 27, 2023
f372b36
bug fix
DerrickYLJ Jul 28, 2023
0acef9e
add cuda_arch list
DerrickYLJ Jul 28, 2023
a6cd35e
export position move
DerrickYLJ Jul 28, 2023
e139c2c
cd into legion
DerrickYLJ Jul 29, 2023
b9d7e03
quick fix
DerrickYLJ Jul 29, 2023
d8b2aa8
retrieve os version and cd directory
DerrickYLJ Jul 29, 2023
e808556
using ubuntu
DerrickYLJ Jul 30, 2023
e24c17a
directory fix
DerrickYLJ Jul 30, 2023
9d282d9
Merge branch 'master' into prebuild_legion
DerrickYLJ Jul 30, 2023
f537d24
bug fix
DerrickYLJ Jul 30, 2023
9683e76
Merge branch 'prebuild_legion' of https://github.com/flexflow/FlexFlo…
DerrickYLJ Jul 30, 2023
a9befef
add touch
DerrickYLJ Jul 30, 2023
2102875
create the release to flexflow-third-party
DerrickYLJ Aug 3, 2023
55f5d30
bug fix
DerrickYLJ Aug 3, 2023
cb474d4
bug fix
DerrickYLJ Aug 4, 2023
8540c88
fix indentation
goliaro Aug 4, 2023
e8f8d7b
fix
goliaro Aug 4, 2023
06727f9
bash launching
DerrickYLJ Aug 9, 2023
6a36405
bug fix
DerrickYLJ Aug 10, 2023
7681d29
bug fix
DerrickYLJ Aug 10, 2023
50181a0
extract tar file`
DerrickYLJ Aug 10, 2023
6458325
bug fix
DerrickYLJ Aug 10, 2023
dc47bcb
add parameter
DerrickYLJ Aug 10, 2023
5926b11
bash fix
DerrickYLJ Aug 14, 2023
f319038
python version
DerrickYLJ Aug 15, 2023
3a16d77
bug fix
DerrickYLJ Aug 15, 2023
cc4a414
bug fix
DerrickYLJ Aug 16, 2023
1e4de15
bug fix
DerrickYLJ Aug 16, 2023
ca2a79d
bug fix
DerrickYLJ Aug 16, 2023
904b77c
bug fix
DerrickYLJ Aug 16, 2023
2696277
build bash
DerrickYLJ Aug 16, 2023
5e69cc0
bug fix
DerrickYLJ Aug 17, 2023
c9f8aa0
bug fix
DerrickYLJ Aug 17, 2023
7853c6c
bug fix
DerrickYLJ Aug 17, 2023
6906849
bug fix
DerrickYLJ Aug 17, 2023
f474462
bug fix
DerrickYLJ Aug 17, 2023
1dab5ea
auto running docker container
DerrickYLJ Aug 17, 2023
37323fd
renew bash script
DerrickYLJ Aug 17, 2023
3066860
bug fix
DerrickYLJ Aug 17, 2023
d1f5fe5
bug fix
DerrickYLJ Aug 17, 2023
58a08f8
bug fix
DerrickYLJ Aug 17, 2023
befa1e2
non-running container
DerrickYLJ Aug 24, 2023
ed93547
bug fix
DerrickYLJ Aug 24, 2023
1100582
Merge branch 'master' into prebuild_legion
goliaro Aug 26, 2023
ee2eae0
make it easier to switch between inference and master branch
goliaro Aug 26, 2023
b5b70b4
multiple fixes
goliaro Aug 27, 2023
b036805
bug fix
DerrickYLJ Aug 27, 2023
46b37fb
bug fix
DerrickYLJ Aug 28, 2023
d653602
add python version
DerrickYLJ Aug 28, 2023
df5a9ff
bug fix
DerrickYLJ Aug 28, 2023
4b48970
restore
DerrickYLJ Aug 28, 2023
6fab6a6
enable building docker images for different hip versions
goliaro Aug 29, 2023
9c7c489
ignore shellcheck error code
goliaro Aug 29, 2023
812ad0b
support hip compilation in inference cmake files
goliaro Aug 29, 2023
c67e02c
fix
goliaro Aug 29, 2023
66610b0
workflow and hardcode
DerrickYLJ Aug 29, 2023
a8bb766
bug fix
DerrickYLJ Aug 29, 2023
9ab7dbd
fix
goliaro Aug 29, 2023
569a11d
cmake fix
goliaro Aug 29, 2023
d027861
python versions
DerrickYLJ Aug 29, 2023
e43f039
cmake fixes
goliaro Aug 29, 2023
ddcc5d1
cmake fixes
goliaro Aug 30, 2023
bad07f9
move install
DerrickYLJ Aug 30, 2023
eec6ea9
order
DerrickYLJ Aug 30, 2023
7b75fd1
bug fix
DerrickYLJ Aug 30, 2023
9edc2c2
nested if condition fix
goliaro Aug 30, 2023
c7b98f9
update docker workflow and config scripts
goliaro Aug 30, 2023
1d5d9c2
update scripts
goliaro Aug 30, 2023
ea2cdb0
fix
goliaro Aug 30, 2023
3bb76b3
fix
goliaro Aug 30, 2023
4933181
cleanup
goliaro Aug 30, 2023
796eddd
rocm 5.6 by default in workflow
goliaro Aug 30, 2023
9396cb8
move outside
DerrickYLJ Aug 30, 2023
7fb98b0
update workflow
goliaro Aug 30, 2023
c53062a
incorp install.sh
DerrickYLJ Aug 30, 2023
7942b0a
bug fix
DerrickYLJ Aug 30, 2023
63ad8f7
fix
goliaro Aug 30, 2023
c6a3b87
fix
goliaro Aug 30, 2023
f41e49b
fix
goliaro Aug 30, 2023
61af1c1
bg fix
DerrickYLJ Aug 30, 2023
188d0f1
fix permissions
goliaro Aug 30, 2023
64c9fca
bug fix
DerrickYLJ Aug 30, 2023
15860d8
bug fix
DerrickYLJ Aug 30, 2023
1c58371
bug fix
DerrickYLJ Aug 30, 2023
60e2a0e
bug fix
DerrickYLJ Aug 30, 2023
a548e8c
updated
DerrickYLJ Aug 30, 2023
0a6497e
bug fix
DerrickYLJ Aug 30, 2023
fe06e88
fix workflow
goliaro Aug 30, 2023
05fbb4c
check
DerrickYLJ Aug 30, 2023
66355aa
check
DerrickYLJ Aug 30, 2023
6a53d16
bug fix
DerrickYLJ Aug 30, 2023
49a0231
fix
goliaro Aug 30, 2023
7e81ec2
add python env
DerrickYLJ Aug 30, 2023
d483ed2
fix
goliaro Aug 30, 2023
8873c49
Merge branch 'amd_docker' into legion_workflow
goliaro Aug 31, 2023
65ccec3
cleanup
goliaro Aug 31, 2023
fab5f30
update workflow
goliaro Aug 31, 2023
bf943fa
Merge branch 'inference' into legion_workflow
goliaro Aug 31, 2023
a021f96
newline
goliaro Aug 31, 2023
345aeb9
added runner
DerrickYLJ Sep 10, 2023
254b160
added endif
DerrickYLJ Sep 18, 2023
d0688ee
Merge branch 'inference' into legion_workflow
goliaro Sep 21, 2023
98416b7
Code Cleanup
DerrickYLJ Oct 5, 2023
ce170cc
restore to self-hosted
DerrickYLJ Oct 5, 2023
3a1379f
Merge branch 'inference' into legion_workflow
goliaro Oct 5, 2023
8a99a23
bug fix
DerrickYLJ Oct 6, 2023
242ce2e
Merge branch 'inference' into legion_workflow
goliaro Oct 22, 2023
fe4dba2
fix
goliaro Oct 22, 2023
4d03f7b
fix
goliaro Oct 22, 2023
c284631
update workflow
goliaro Oct 22, 2023
5559374
fixes
goliaro Oct 22, 2023
69cb57e
Merge branch 'inference' into legion_workflow
goliaro Oct 22, 2023
8c0c5ae
fix cmake for hip rocm
goliaro Oct 22, 2023
75 changes: 75 additions & 0 deletions .github/workflows/helpers/prebuild_legion.sh
@@ -0,0 +1,75 @@
#! /usr/bin/env bash
set -euo pipefail

# Parse input params
python_version=${python_version:-"empty"}
gpu_backend=${gpu_backend:-"empty"}
gpu_backend_version=${gpu_backend_version:-"empty"}

if [[ "${gpu_backend}" != @(cuda|hip_cuda|hip_rocm|intel) ]]; then
  echo "Error, value of gpu_backend (${gpu_backend}) is invalid. Pick between 'cuda', 'hip_cuda', 'hip_rocm' or 'intel'."
  exit 1
else
  echo "Pre-building Legion with GPU backend: ${gpu_backend}"
fi

# Test gpu_backend on both sides: FF_GPU_BACKEND is only exported further down,
# so referencing it here would abort under `set -u` when it is unset.
if [[ "${gpu_backend}" == "cuda" || "${gpu_backend}" == "hip_cuda" ]]; then
  # Check that the CUDA version is supported. Versions above 12.0 are not supported because we don't publish docker images for them yet.
  if [[ "$gpu_backend_version" != @(11.1|11.2|11.3|11.4|11.5|11.6|11.7|11.8|12.0) ]]; then
    echo "cuda_version is not supported, please choose among {11.1, 11.2, 11.3, 11.4, 11.5, 11.6, 11.7, 11.8, 12.0}"
    exit 1
  fi
  export cuda_version="$gpu_backend_version"
elif [[ "${gpu_backend}" == "hip_rocm" ]]; then
  # Check that the HIP version is supported
  if [[ "$gpu_backend_version" != @(5.3|5.4|5.5|5.6) ]]; then
    echo "hip_version is not supported, please choose among {5.3, 5.4, 5.5, 5.6}"
    exit 1
  fi
  export hip_version="$gpu_backend_version"
else
  echo "gpu backend: ${gpu_backend} and gpu_backend_version: ${gpu_backend_version} not yet supported."
  exit 1
fi

# Cd into directory holding this script
cd "${BASH_SOURCE[0]%/*}"

export FF_GPU_BACKEND="${gpu_backend}"
export FF_CUDA_ARCH=all
export FF_HIP_ARCH=all
export BUILD_LEGION_ONLY=ON
export INSTALL_DIR="/usr/legion"
export python_version="${python_version}"

# Build Docker Flexflow Container
echo "building docker"
../../../docker/build.sh flexflow

# Cleanup any existing container with the same name
docker rm prelegion || true

# Create container to be able to copy data from the image
docker create --name prelegion flexflow-"${gpu_backend}"-"${gpu_backend_version}":latest

# Copy legion libraries to host
echo "extract legion library assets"
mkdir -p ../../../prebuilt_legion_assets
rm -rf ../../../prebuilt_legion_assets/tmp || true
docker cp "prelegion:${INSTALL_DIR}" ../../../prebuilt_legion_assets/tmp


# Create the tarball file
cd ../../../prebuilt_legion_assets/tmp
export LEGION_TARBALL="legion_ubuntu-20.04_${gpu_backend}-${gpu_backend_version}_py${python_version}.tar.gz"

echo "Creating archive $LEGION_TARBALL"
tar -zcvf "../$LEGION_TARBALL" ./
cd ..
echo "Checking the size of the Legion tarball..."
du -h "$LEGION_TARBALL"


# Cleanup
rm -rf tmp/*
docker rm prelegion
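
Review note: the script validates the backend/version pair and derives the tarball name inline, and the workflow's upload step has to repeat the same naming pattern. A minimal sketch of how both could live in small helpers is shown below. The function names `is_supported_combo` and `legion_tarball_name` are hypothetical and are not part of this PR; the accepted versions and the name format are copied from the script above.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper (not in the PR): succeeds only for backend/version
# pairs that prebuild_legion.sh accepts. A plain `case` is used here so the
# sketch does not rely on extended globbing.
is_supported_combo() {
  local backend="$1" version="$2"
  case "${backend}:${version}" in
    cuda:11.[1-8] | cuda:12.0 | hip_cuda:11.[1-8] | hip_cuda:12.0) return 0 ;;
    hip_rocm:5.[3-6]) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical helper (not in the PR): builds the archive name in the same
# format as LEGION_TARBALL in the script above.
legion_tarball_name() {
  local backend="$1" version="$2" py="$3"
  printf 'legion_ubuntu-20.04_%s-%s_py%s.tar.gz\n' "${backend}" "${version}" "${py}"
}

# Usage: print the archive name only for a supported pair.
if is_supported_combo cuda 11.8; then
  legion_tarball_name cuda 11.8 3.11
fi
```

Keeping the name in one place would make it harder for the script and the workflow's `upload-artifact` path to drift apart, at the cost of sourcing a shared file from both.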
84 changes: 84 additions & 0 deletions .github/workflows/prebuild-legion.yml
@@ -0,0 +1,84 @@
name: "prebuild-legion"
on:
  push:
    branches:
      - "inference"
    paths:
      - "cmake/**"
      - "config/**"
      - "deps/legion/**"
      - ".github/workflows/helpers/install_dependencies.sh"
  workflow_dispatch:
concurrency:
  group: prebuild-legion-${{ github.head_ref || github.run_id }}
  cancel-in-progress: true

jobs:
  prebuild-legion:
    name: Prebuild Legion with CMake
    runs-on: ubuntu-20.04
    defaults:
      run:
        shell: bash -l {0} # required to use an activated conda environment
    strategy:
      matrix:
        gpu_backend: ["cuda", "hip_rocm"]
        gpu_backend_version: ["11.8", "5.6"]
        python_version: ["3.11"]
        exclude:
          - gpu_backend: "cuda"
            gpu_backend_version: "5.6"
          - gpu_backend: "hip_rocm"
            gpu_backend_version: "11.8"
      fail-fast: false
    steps:
      - name: Checkout Git Repository
        uses: actions/checkout@v3
        with:
          submodules: recursive

      - name: Free additional space on runner
        run: .github/workflows/helpers/free_space_on_runner.sh

      - name: Build Legion
        env:
          FF_GPU_BACKEND: ${{ matrix.gpu_backend }}
          # prebuild_legion.sh reads these three variables; without them it falls back to "empty" and exits
          gpu_backend: ${{ matrix.gpu_backend }}
          gpu_backend_version: ${{ matrix.gpu_backend_version }}
          python_version: ${{ matrix.python_version }}
        run: .github/workflows/helpers/prebuild_legion.sh

      - name: Archive compiled Legion library (CUDA)
        env:
          FF_GPU_BACKEND: ${{ matrix.gpu_backend }}
        uses: actions/upload-artifact@v3
        with:
          name: legion_ubuntu-20.04_${{ matrix.gpu_backend }}-${{ matrix.gpu_backend_version }}_py${{ matrix.python_version }}
          path: prebuilt_legion_assets/legion_ubuntu-20.04_${{ matrix.gpu_backend }}-${{ matrix.gpu_backend_version }}_py${{ matrix.python_version }}.tar.gz

  create-release:
    name: Create new release
    runs-on: ubuntu-20.04
    needs: prebuild-legion
    steps:
      - name: Checkout Git Repository
        uses: actions/checkout@v3
      - name: Free additional space on runner
        run: .github/workflows/helpers/free_space_on_runner.sh
      - name: Create folder for artifacts
        run: mkdir artifacts unwrapped_artifacts
      - name: Download artifacts
        uses: actions/download-artifact@v3
        with:
          path: ./artifacts
      - name: Display structure of downloaded files
        working-directory: ./artifacts
        run: ls -R
      - name: Unwrap all artifacts
        working-directory: ./artifacts
        run: find . -maxdepth 2 -mindepth 2 -type f -name "*.tar.gz" -exec mv {} ../unwrapped_artifacts/ \;
      - name: Get datetime
        run: echo "RELEASE_DATETIME=$(date '+%Y-%m-%dT%H-%M-%S')" >> "$GITHUB_ENV"
      - name: Release
        env:
          NAME: ${{ env.RELEASE_DATETIME }}
          TAG_NAME: ${{ env.RELEASE_DATETIME }}
          GITHUB_TOKEN: ${{ secrets.FLEXFLOW_TOKEN }}
        run: gh release create "$TAG_NAME" ./unwrapped_artifacts/*.tar.gz --repo flexflow/flexflow-third-party
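
Review note: the "Unwrap all artifacts" step relies on `actions/download-artifact@v3` placing each artifact in its own sub-directory when no `name` input is given, which is why the `find` call is pinned to depth 2. The sketch below reproduces that layout in a temporary directory and runs the same `find` command, so the flattening can be checked locally. The temporary paths and the empty placeholder tarballs are assumptions made for illustration; only the `find` invocation is taken from the workflow.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Scratch area standing in for the runner workspace (illustrative only).
workdir="$(mktemp -d)"
trap 'rm -rf "${workdir}"' EXIT
mkdir -p "${workdir}/artifacts" "${workdir}/unwrapped_artifacts"

# Simulate the download layout: one sub-directory per artifact, each holding
# a tarball of the same name. The files are empty placeholders.
for name in legion_ubuntu-20.04_cuda-11.8_py3.11 legion_ubuntu-20.04_hip_rocm-5.6_py3.11; do
  mkdir -p "${workdir}/artifacts/${name}"
  : > "${workdir}/artifacts/${name}/${name}.tar.gz"
done

# Same command as the workflow step, run from inside ./artifacts.
(
  cd "${workdir}/artifacts"
  find . -maxdepth 2 -mindepth 2 -type f -name "*.tar.gz" -exec mv {} ../unwrapped_artifacts/ \;
)

# Count what ended up in the flat directory that `gh release create` globs.
unwrapped_count="$(find "${workdir}/unwrapped_artifacts" -type f -name '*.tar.gz' | wc -l | tr -d ' ')"
echo "unwrapped tarballs: ${unwrapped_count}"
```

With the two matrix combinations that survive the `exclude` rules, two tarballs should land in `unwrapped_artifacts`, matching what the release step uploads.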