Update README.md - typo correction #58

Open · wants to merge 17 commits into base: master
2 changes: 1 addition & 1 deletion .gitmodules
Original file line number Diff line number Diff line change
@@ -1,3 +1,3 @@
[submodule "third_party/oneCCL"]
path = third_party/oneCCL
url = https://github.com/intel-innersource/libraries.performance.communication.oneccl.git
url = https://github.com/oneapi-src/oneCCL.git
8 changes: 0 additions & 8 deletions CMakeLists.txt
@@ -6,14 +6,6 @@ set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -Wformat")
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -Werror=cpp")
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -Wformat-security")
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -fstack-protector")
# Since 2016 Debian start using RUNPATH instead of normally RPATH, which gave the annoy effect that
# allow LD_LIBRARY_PATH to override dynamic linking path. Depends on intention of linking priority,
# change below for best outcome: disable, using RPATH, enable, using RUNPATH
if (ENABLE_LINKER_RUNPATH)
set(CMAKE_SHARED_LINKER_FLAGS "${CMAKE_SHARED_LINKER_FLAGS} -Wl,--enable-new-dtags")
else()
set(CMAKE_SHARED_LINKER_FLAGS "${CMAKE_SHARED_LINKER_FLAGS} -Wl,--disable-new-dtags")
endif()

set(LINUX TRUE)
set(CMAKE_EXPORT_COMPILE_COMMANDS ON)
94 changes: 63 additions & 31 deletions README.md
@@ -6,7 +6,7 @@ This repository holds PyTorch bindings maintained by Intel for the Intel® oneAP

[PyTorch](https://github.com/pytorch/pytorch) is an open-source machine learning framework.

[Intel® oneCCL](https://github.com/oneapi-src/oneCCL) (collective communications library) is a library for efficient distributed deep learning training implementing such collectives like `allreduce`, `allgather`, `alltoall`. For more information on oneCCL, please refer to the [oneCCL documentation](https://spec.oneapi.com/versions/latest/elements/oneCCL/source/index.html) and [oneCCL specification](https://spec.oneapi.com/versions/latest/elements/oneCCL/source/index.html).
[Intel® oneCCL](https://github.com/oneapi-src/oneCCL) (collective communications library) is a library for efficient distributed deep learning training, implementing collectives such as `allreduce`, `allgather`, and `alltoall`. For more information on oneCCL, please refer to the [oneCCL documentation](https://spec.oneapi.com/versions/latest/elements/oneCCL/source/index.html).

The `oneccl_bindings_for_pytorch` module implements the PyTorch C10D ProcessGroup API, can be dynamically loaded as an external ProcessGroup, and currently works only on Linux.
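
As a sketch of what the dynamically loaded ProcessGroup looks like in practice (the rendezvous address/port, the `PMI_*` fallbacks, and the helper names are illustrative assumptions, not part of this repository's API):

```python
import os


def rendezvous_env(rank: int, world_size: int) -> dict:
    """Illustrative single-node rendezvous settings (address/port are placeholders)."""
    return {
        "MASTER_ADDR": "127.0.0.1",
        "MASTER_PORT": "29500",
        "RANK": str(rank),
        "WORLD_SIZE": str(world_size),
    }


def run() -> None:
    import torch
    import torch.distributed as dist
    # Importing the bindings registers the "ccl" backend as an external ProcessGroup.
    import oneccl_bindings_for_pytorch  # noqa: F401

    # Fall back to MPI-style PMI variables when launched via mpirun.
    for key, value in rendezvous_env(int(os.environ.get("PMI_RANK", 0)),
                                     int(os.environ.get("PMI_SIZE", 1))).items():
        os.environ.setdefault(key, value)

    dist.init_process_group(backend="ccl")
    x = torch.ones(4)
    dist.all_reduce(x)  # sums the tensor across all ranks via oneCCL
    dist.destroy_process_group()

# Each rank would call run(), e.g. launched via `mpirun -n 2 python example.py`.
```

After the `all_reduce`, every rank holds the element-wise sum of all ranks' tensors.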

@@ -16,26 +16,29 @@ The table below shows which functions are available for use with CPU / Intel dGP

| | CPU | GPU |
| :--------------- | :---: | :---: |
| `send` | × | × |
| `recv` | × | × |
| `send` | × | |
| `recv` | × | |
| `broadcast` | √ | √ |
| `all_reduce` | √ | √ |
| `reduce` | √ | √ |
| `all_gather` | √ | √ |
| `gather` | √ | √ |
| `scatter` | × | × |
| `reduce_scatter` | × | × |
| `reduce_scatter` | | |
| `all_to_all` | √ | √ |
| `barrier` | √ | √ |


## PyTorch API Alignment

We recommend Anaconda as Python package management system. The following is the corresponding branches (tags) of `oneccl_bindings_for_pytorch` and supported Pytorch.
We recommend using Anaconda as the Python package management system. The following are the corresponding branches (tags) of `oneccl_bindings_for_pytorch` and the supported PyTorch versions.

| `torch` | `oneccl_bindings_for_pytorch` |
| :-------------------------------------------------------------: | :-----------------------------------------------------------------------: |
| `master` | `master` |
| [v2.2.0](https://github.com/pytorch/pytorch/tree/v2.2.0) | [ccl_torch2.2.0+cpu](https://github.com/intel/torch-ccl/tree/ccl_torch2.2.0%2Bcpu) |
| [v2.1.0](https://github.com/pytorch/pytorch/tree/v2.1.0) | [ccl_torch2.1.0+cpu](https://github.com/intel/torch-ccl/tree/ccl_torch2.1.0%2Bcpu) |
| [v2.0.1](https://github.com/pytorch/pytorch/tree/v2.0.1) | [ccl_torch2.0.100](https://github.com/intel/torch-ccl/tree/ccl_torch2.0.100) |
| [v1.13](https://github.com/pytorch/pytorch/tree/v1.13) | [ccl_torch1.13](https://github.com/intel/torch-ccl/tree/ccl_torch1.13) |
| [v1.12.1](https://github.com/pytorch/pytorch/tree/v1.12.1) | [ccl_torch1.12.100](https://github.com/intel/torch-ccl/tree/ccl_torch1.12.100) |
| [v1.12.0](https://github.com/pytorch/pytorch/tree/v1.12.0) | [ccl_torch1.12](https://github.com/intel/torch-ccl/tree/ccl_torch1.12) |
@@ -45,33 +48,34 @@ We recommend Anaconda as Python package management system. The following is the
| [v1.8.1](https://github.com/pytorch/pytorch/tree/v1.8.1) | [ccl_torch1.8](https://github.com/intel/torch-ccl/tree/ccl_torch1.8) |
| [v1.7.1](https://github.com/pytorch/pytorch/tree/v1.7.1) | [ccl_torch1.7](https://github.com/intel/torch-ccl/tree/ccl_torch1.7) |
| [v1.6.0](https://github.com/pytorch/pytorch/tree/v1.6.0) | [ccl_torch1.6](https://github.com/intel/torch-ccl/tree/ccl_torch1.6) |
| [v1.5-rc3](https://github.com/pytorch/pytorch/tree/v1.5.0-rc3) | [beta09](https://github.com/intel/torch-ccl/tree/beta09) |
| [v1.5-rc3](https://github.com/pytorch/pytorch/tree/v1.5.0-rc3) | [beta09](https://github.com/intel/torch-ccl/tree/beta09) |

Usage details can be found in the README of the corresponding branch. The following section describes usage of the v1.9 tag; if you want to use another version of torch-ccl, please check out that branch (tag). For pytorch-1.5.0-rc3, [#PR28068](https://github.com/pytorch/pytorch/pull/28068) and [#PR32361](https://github.com/pytorch/pytorch/pull/32361) are needed to dynamically register the external ProcessGroup and enable the `alltoall` collective communication primitive. The patch file for these two PRs is in the `patches` directory and can be applied directly.

## Requirements

- Python 3.6 or later and a C++17 compiler
- Python 3.8 or later and a C++17 compiler

- PyTorch v1.13.0
- PyTorch v2.2.0

## Build Option List

The following build options are supported in Intel® oneCCL Bindings for PyTorch*.

| Build Option | Default Value | Description |
| :---------------------------------: | :------------: | :-------------------------------------------------------------------------------------------------: |
| COMPUTE_BACKEND | | Set oneCCL `COMPUTE_BACKEDN`,set to `dpcpp` and use DPC++ Compiler to enable support for Intel XPU |
| :---------------------------------- | :------------- | :-------------------------------------------------------------------------------------------------- |
| COMPUTE_BACKEND                     |                | Set oneCCL `COMPUTE_BACKEND`; set to `dpcpp` to use the DPC++ compiler and enable support for Intel XPU |
| USE_SYSTEM_ONECCL                   | OFF            | Use the oneCCL library installed in the system |
| CCL_PACKAGE_NAME                    | oneccl-bind-pt | Set the wheel name |
| ONECCL_BINDINGS_FOR_PYTORCH_BACKEND | cpu            | Set the backend |
| CCL_SHA_VERSION | False |add git head sha version to Wheel name |
| CCL_SHA_VERSION                     | False          | Add the git HEAD SHA to the wheel name |
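
For illustration, a hypothetical invocation combining several of these options (the chosen values are examples, not defaults):

```bash
# Build a CPU wheel named oneccl-bind-pt with the git HEAD SHA in its version tag
CCL_PACKAGE_NAME=oneccl-bind-pt \
CCL_SHA_VERSION=True \
ONECCL_BINDINGS_FOR_PYTORCH_BACKEND=cpu \
python setup.py bdist_wheel
```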

## Lunch Option List
## Launch Option List

The following lunch options are supported in Intel® oneCCL Bindings for PyTorch*.
The following launch options are supported in Intel® oneCCL Bindings for PyTorch*.

| Lunch Option | Default Value | Description |
| :--------------------------------------: | :-----------: | :-------------------------------------------------------------------: |
| Launch Option | Default Value | Description |
| :--------------------------------------- | :------------ | :-------------------------------------------------------------------- |
| ONECCL_BINDINGS_FOR_PYTORCH_ENV_VERBOSE  | 0             | Set the verbose level of ONECCL_BINDINGS_FOR_PYTORCH |
| ONECCL_BINDINGS_FOR_PYTORCH_ENV_WAIT_GDB | 0             | Set to 1 to force oneccl_bindings_for_pytorch to wait for GDB to attach |
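
For example, a hypothetical launch enabling verbose logging (the script name `example.py` and the `mpirun` arguments are placeholders):

```bash
# Print bindings-level diagnostics during the run
export ONECCL_BINDINGS_FOR_PYTORCH_ENV_VERBOSE=1
# Uncomment to pause each rank until a debugger attaches
# export ONECCL_BINDINGS_FOR_PYTORCH_ENV_WAIT_GDB=1
mpirun -n 2 python example.py
```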

@@ -96,23 +100,50 @@ The following lunch options are supported in Intel® oneCCL Bindings for PyTorch
# build with oneCCL from third party
COMPUTE_BACKEND=dpcpp python setup.py install
# build without oneCCL
BUILD_NO_ONECCL_PACKAGE=ON COMPUTE_BACKEND=dpcpp python setup.py install
export INTELONEAPIROOT=${HOME}/intel/oneapi
USE_SYSTEM_ONECCL=ON COMPUTE_BACKEND=dpcpp python setup.py install
```

### Install PreBuilt Wheel

Wheel files are available for the following Python versions.

| Extension Version | Python 3.6 | Python 3.7 | Python 3.8 | Python 3.9 | Python 3.10 |
| :---------------: | :--------: | :--------: | :--------: | :--------: | :---------: |
| 1.13 | | √ | √ | √ | √ |
| 1.12.100 | | √ | √ | √ | √ |
| 1.12.0 | | √ | √ | √ | √ |
| 1.11.0 | | √ | √ | √ | √ |
| 1.10.0 | √ | √ | √ | √ | |
| Extension Version | Python 3.6 | Python 3.7 | Python 3.8 | Python 3.9 | Python 3.10 | Python 3.11 |
| :---------------: | :--------: | :--------: | :--------: | :--------: | :---------: | :---------: |
| 2.2.0 | | | √ | √ | √ | √ |
| 2.1.0 | | | √ | √ | √ | √ |
| 2.0.100 | | | √ | √ | √ | √ |
| 1.13 | | √ | √ | √ | √ | |
| 1.12.100 | | √ | √ | √ | √ | |
| 1.12.0 | | √ | √ | √ | √ | |
| 1.11.0 | | √ | √ | √ | √ | |
| 1.10.0 | √ | √ | √ | √ | | |

```bash
python -m pip install oneccl_bind_pt==1.13 -f https://developer.intel.com/ipex-whl-stable-cpu
python -m pip install oneccl_bind_pt==2.0.100 -f https://developer.intel.com/ipex-whl-stable-xpu
```

### Runtime Dynamic Linking

- If oneccl_bindings_for_pytorch is built without oneCCL and uses the oneCCL in the system, dynamically link oneCCL from the oneAPI Base Toolkit (recommended usage):

```bash
source $basekit_root/ccl/latest/env/vars.sh
```

Note: Make sure you have installed the [Base Toolkit](https://www.intel.com/content/www/us/en/developer/tools/oneapi/toolkits.html#base-kit) when using Intel® oneCCL Bindings for PyTorch\* on Intel® GPUs.

- If oneccl_bindings_for_pytorch is built with oneCCL from the third-party submodule or installed from a prebuilt wheel, dynamically link the oneCCL and Intel MPI libraries:

```bash
source $(python -c "import oneccl_bindings_for_pytorch as torch_ccl;print(torch_ccl.cwd)")/env/setvars.sh
```

Dynamically link oneCCL only (without Intel MPI):

```bash
source $(python -c "import oneccl_bindings_for_pytorch as torch_ccl;print(torch_ccl.cwd)")/env/vars.sh
```

## Usage
@@ -145,15 +176,12 @@ model = torch.nn.parallel.DistributedDataParallel(model, ...)
...
```

(oneccl_bindings_for_pytorch is installed along with the MPI tool set.)
(When oneccl_bindings_for_pytorch is built without oneCCL, use the oneCCL and MPI (if needed) installed in the system.)

```bash

source <oneccl_bindings_for_pytorch_path>/env/setvars.sh

# eg:
# $ oneccl_bindings_for_pytorch_path=$(python -c "from oneccl_bindings_for_pytorch import cwd; print(cwd)")
# $ source $oneccl_bindings_for_pytorch_path/env/setvars.sh
source $basekit_root/ccl/latest/env/vars.sh
source $basekit_root/mpi/latest/env/vars.sh
```

```bash
mpirun -n <N> -ppn <PPN> -f <hostfile> python example.py
```
@@ -226,6 +254,10 @@ mpirun -n 2 -l python profiling.py

```

## Known Issues

For point-to-point communication, calling `dist.send`/`dist.recv` directly after initializing the process group in a launch script triggers a runtime error: in the current implementation, all ranks of the group are expected to participate in the call that creates the communicators, while `dist.send`/`dist.recv` involves only a pair of ranks. As a result, `dist.send`/`dist.recv` should be used after a collective call, which ensures the participation of all ranks. A solution that supports calling `dist.send`/`dist.recv` directly after initializing the process group is still under investigation.
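
A minimal sketch of the workaround described above (the rendezvous address/port, `PMI_*` fallbacks, and the pairing helper are illustrative assumptions): a collective such as `barrier` is issued first so that all ranks participate and the communicators exist before any point-to-point call.

```python
import os


def peer_of(rank: int) -> int:
    """Partner rank for a pairwise exchange: even ranks pair with the next odd rank."""
    return rank + 1 if rank % 2 == 0 else rank - 1


def run() -> None:
    # Imports kept local so the sketch reads top-to-bottom; in a real script
    # they would sit at module level.
    import torch
    import torch.distributed as dist
    import oneccl_bindings_for_pytorch  # noqa: F401  -- registers the "ccl" backend

    os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
    os.environ.setdefault("MASTER_PORT", "29500")
    dist.init_process_group(
        "ccl",
        rank=int(os.environ.get("PMI_RANK", 0)),
        world_size=int(os.environ.get("PMI_SIZE", 2)),
    )

    # Workaround: issue a collective first so that every rank participates
    # and the communicators are created before any point-to-point call.
    dist.barrier()

    rank = dist.get_rank()
    t = torch.full((4,), float(rank))
    if rank % 2 == 0:
        dist.send(t, dst=peer_of(rank))
    else:
        dist.recv(t, src=peer_of(rank))
    dist.destroy_process_group()

# Each rank would call run(), e.g. launched via `mpirun -n 2 python example.py`.
```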

## License

[BSD License](https://github.com/intel/torch-ccl/blob/master/LICENSE)
5 changes: 0 additions & 5 deletions setup.py
@@ -102,16 +102,11 @@ def build_cmake(self, extension: CMakeExtension):

if _check_env_flag('DEBUG'):
build_type = 'Debug'
run_path = 'OFF'
if _check_env_flag('RUNPATH'):
run_path = 'ON'

build_options = {
'CMAKE_BUILD_TYPE': build_type,
# The value cannot be easily obtained in CMakeLists.txt.
'CMAKE_PREFIX_PATH': torch.utils.cmake_prefix_path,
# Enable the RPATH of the oneCCL and torchCCL
'ENABLE_LINKER_RUNPATH': run_path,
# skip the example and test code in oneCCL
'BUILD_EXAMPLES': 'OFF',
'BUILD_CONFIG': 'OFF',
1 change: 1 addition & 0 deletions src/CMakeLists.txt
@@ -10,6 +10,7 @@ target_compile_options(oneccl_bindings_for_pytorch PUBLIC -Wall

if(COMPUTE_BACKEND STREQUAL "dpcpp")
add_subdirectory(./gpu)
add_definitions (-DUSE_GPU)
endif()

target_include_directories(oneccl_bindings_for_pytorch PUBLIC ./)