FAQ
1. What is UCC?
2. What are the important components of UCC reference implementation?
3. How can I participate?
4. How to compile and run UCC with Open MPI?
5. How to compile and run UCC with PyTorch?
6. What is TL scoring and how to select a certain TL?
7. What are the dependencies for UCC?
8. How to compile all TLs?
9. How to compile a specific TL?
10. How to compile and run UCC with OpenSHMEM Applications?
11. How to implement a new TL for UCC?
12. Where can I find a simple UCC example?
13. How to configure UCC components with configuration file and priority?
14. Where can I find more details about the API and more UCC documentation?
UCC is a collective communication operations API and library that is flexible, complete, and feature-rich for current and emerging programming models and runtimes.
Please refer to the component diagram: https://github.com/openucx/ucc/blob/master/docs/images/ucc_components.png
- Propose features, discuss issues, review design and code on GitHub
- Participate in the weekly working group meetings
- Mailing list: https://elist.ornl.gov/mailman/listinfo/ucx-group
Please refer to: https://github.com/openucx/ucc#open-mpi-and-ucc-collectives
UCC is available as an internal ProcessGroup backend starting with the PyTorch 2.0 release. Please refer to PyTorch ProcessGroup UCC backend for details on how to use UCC with earlier releases of PyTorch.
Env var pattern: UCC_<TL/CL>_<NAME>_TUNE=token1#token2#...#tokenn - a '#' separated list of tokens,
where token=coll_type:msg_range:mem_type:team_size:score:alg - a ':' separated list of qualifiers.
Each qualifier is optional. The only requirement is that either "score" or "alg" is provided.
Qualifiers:
- coll_type = coll_type_1,coll_type_2,...,coll_type_n - a ',' separated list of coll_types
- msg_range = m_start_1-m_end_1,m_start_2-m_end_2,...,m_start_n-m_end_n - a ',' separated list of msg ranges, where each range is represented by "start" and "end" values separated by "-". Values can be numbers with size suffixes, e.g. 128, 256b, 4K, 1M. The special value "inf" means the MAX msg size.
- mem_type = m1,m2,...,mn - a ',' separated list of memory types
- team_size = [t_start_1-t_end_1,t_start_2-t_end_2,...,t_start_n-t_end_n] - a ',' separated list of team size ranges enclosed in [].
- score = <value> - an int value from 0 to "inf"
- alg = @<value|str> - the character @ followed by either an int number or a string representing the collective algorithm.
Examples:
- UCC_TL_NCCL_TUNE=0 - disable all the NCCL collectives (score 0 is applied to ALL collectives since qualifier is not specified, similarly to ALL memory types, to default [0-inf] msg range and [0-inf] team size).
- UCC_TL_NCCL_TUNE=allreduce:cuda:inf#alltoall:0 - force NCCL allreduce for "cuda" buffers and disable alltoall
- UCC_TL_UCP_TUNE=bcast:0-4K:cuda:0#bcast:65k-1M:[25-100]:cuda:inf - disable UCP bcast on cuda buffers for msg sizes 0-4K and force UCP bcast on cuda buffers for msg sizes 65K-1M only for teams with 25-100 ranks
- UCC_TL_UCP_TUNE=allreduce:0-4K:@0#allreduce:4K-inf:@sra_knomial - for TL_UCP set allreduce algorithm to 0 for msg range 0-4K and to 1 (sra_knomial) for 4k-inf.
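The token grammar above can be sketched as a small string builder. This is only an illustration of the format described in this answer: `tune_token` and `tune_value` are hypothetical helper names, not part of UCC, and the builder always emits qualifiers in the documented order (coll_type, msg_range, mem_type, team_size, score, alg).

```python
from typing import Optional, Sequence, Union


def tune_token(
    coll_types: Sequence[str] = (),
    msg_ranges: Sequence[str] = (),
    mem_types: Sequence[str] = (),
    team_sizes: Sequence[str] = (),
    score: Optional[Union[int, str]] = None,
    alg: Optional[Union[int, str]] = None,
) -> str:
    """Build one ':' separated token (hypothetical helper, not a UCC API)."""
    # The only requirement stated above: "score" or "alg" must be present.
    if score is None and alg is None:
        raise ValueError('either "score" or "alg" must be provided')
    parts = []
    if coll_types:
        parts.append(",".join(coll_types))
    if msg_ranges:
        parts.append(",".join(msg_ranges))
    if mem_types:
        parts.append(",".join(mem_types))
    if team_sizes:
        # Team size ranges are enclosed in [].
        parts.append("[" + ",".join(team_sizes) + "]")
    if score is not None:
        parts.append(str(score))
    if alg is not None:
        # The algorithm is an int or a name prefixed with '@'.
        parts.append("@" + str(alg))
    return ":".join(parts)


def tune_value(*tokens: str) -> str:
    """Join tokens with '#' to form the value of a ..._TUNE variable."""
    return "#".join(tokens)


# Rebuild two of the examples listed above.
nccl = tune_value(
    tune_token(coll_types=["allreduce"], mem_types=["cuda"], score="inf"),
    tune_token(coll_types=["alltoall"], score=0),
)
ucp = tune_value(
    tune_token(coll_types=["allreduce"], msg_ranges=["0-4K"], alg=0),
    tune_token(coll_types=["allreduce"], msg_ranges=["4K-inf"], alg="sra_knomial"),
)
print("UCC_TL_NCCL_TUNE=" + nccl)
print("UCC_TL_UCP_TUNE=" + ucp)
```

Building the string from named qualifiers makes it harder to drop a separator or forget the brackets around team size ranges.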
It depends on the system configuration, the workload that uses UCC, and TLs/CLs the user wants to enable.
- UCX
- NCCL
- Doxygen
All available TLs are compiled by default (--with-tls=all).
Users can specify a list of TLs to compile, e.g. --with-tls=ucp builds only the "ucp" TL; --with-tls=sharp,nccl builds tl/sharp and tl/nccl.
For compilation instructions using OSHMEM with Open MPI, please refer to: https://github.com/openucx/ucc#open-mpi-and-ucc-collectives
To run OpenSHMEM applications:
$ oshrun -np 2 --mca scoll_ucc_enable 1 --mca scoll_ucc_priority 100 ./my_openshmem_app
To run OpenSHMEM applications with one-sided collectives (i.e., Alltoall):
$ oshrun -np 2 --mca scoll_ucc_enable 1 --mca scoll_ucc_priority 100 -x UCC_TL_UCP_TUNE=alltoall:0-inf:@onesided ./my_openshmem_app
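When these launches are scripted, the two command lines above can be assembled from one helper. The sketch below uses only the flags shown in this answer; `oshrun_command` is a hypothetical name, and the command is printed rather than executed.

```python
import shlex
from typing import List, Optional


def oshrun_command(
    app: str, nprocs: int = 2, ucp_tune: Optional[str] = None
) -> List[str]:
    """Assemble the oshrun argument list (hypothetical helper)."""
    cmd = [
        "oshrun", "-np", str(nprocs),
        "--mca", "scoll_ucc_enable", "1",
        "--mca", "scoll_ucc_priority", "100",
    ]
    if ucp_tune is not None:
        # -x exports the variable to the launched processes.
        cmd += ["-x", "UCC_TL_UCP_TUNE=" + ucp_tune]
    cmd.append(app)
    return cmd


default_run = oshrun_command("./my_openshmem_app")
onesided_run = oshrun_command(
    "./my_openshmem_app", ucp_tune="alltoall:0-inf:@onesided"
)
print(shlex.join(default_run))
print(shlex.join(onesided_run))
```

The resulting list can be passed to `subprocess.run` unchanged on a system where `oshrun` is installed.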
The UCC configuration file (ucc.conf) provides a unified way of tailoring the behavior of UCC components - CLs, TLs, and ECs. The configuration file can contain any UCC variables of the format VAR = VALUE
Example
Selecting a hierarchy CL
UCC_CLS=hier
Selecting a UCP TL
UCC_TLS=ucp
Selecting an algorithm
UCC_TL_SHARP_TUNE=allreduce:inf
Log info
UCC_TL_UCP_LOG_LEVEL=INFO
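To check a configuration file before deploying it, the VAR = VALUE format can be read with a few lines of Python. This is a minimal sketch and not UCC's own parser: `parse_ucc_conf` is a hypothetical helper, and skipping blank lines and lines that start with '#' is an assumption made here for illustration.

```python
from typing import Dict


def parse_ucc_conf(text: str) -> Dict[str, str]:
    """Read VAR = VALUE lines into a dict (hypothetical helper)."""
    settings: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Assumption: blank lines and lines starting with '#' are skipped.
        if not line or line.startswith("#"):
            continue
        # Split on the first '=' only. A '#' inside the value is kept,
        # because TUNE values use '#' as their token separator.
        name, sep, value = line.partition("=")
        if not sep:
            raise ValueError("expected VAR = VALUE, got: " + line)
        settings[name.strip()] = value.strip()
    return settings


example = """
UCC_CLS=hier
UCC_TLS=ucp
UCC_TL_SHARP_TUNE=allreduce:inf
UCC_TL_UCP_LOG_LEVEL = INFO
"""
conf = parse_ucc_conf(example)
for name, value in conf.items():
    print(name, "=", value)
```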
When multiple configuration files are found in the runtime environment, the priority is as follows:
- The file available via the environment variable UCC_CONFIG_FILE
- ucc.conf file in the $HOME
- The ucc.conf file in the install directory: <ucc_install_dir>/share/ucc.conf
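The lookup order above can be modeled as follows. The sketch only mirrors the listed ordering; whether UCC reads a single file or combines settings from several is not covered here. `candidate_config_files` and `pick_config_file` are hypothetical names, and POSIX-style paths are assumed.

```python
import os
import posixpath
from typing import Callable, List, Mapping, Optional


def candidate_config_files(env: Mapping[str, str], install_dir: str) -> List[str]:
    """List possible config file locations, highest priority first."""
    candidates = []
    if env.get("UCC_CONFIG_FILE"):
        candidates.append(env["UCC_CONFIG_FILE"])
    if env.get("HOME"):
        candidates.append(posixpath.join(env["HOME"], "ucc.conf"))
    candidates.append(posixpath.join(install_dir, "share", "ucc.conf"))
    return candidates


def pick_config_file(
    env: Mapping[str, str],
    install_dir: str,
    exists: Callable[[str], bool] = os.path.isfile,
) -> Optional[str]:
    """Return the first candidate that exists, or None if there is none."""
    for path in candidate_config_files(env, install_dir):
        if exists(path):
            return path
    return None


# Example with made-up paths and a fake file system.
fake_env = {"UCC_CONFIG_FILE": "/tmp/my_ucc.conf", "HOME": "/home/user"}
on_disk = {"/home/user/ucc.conf", "/opt/ucc/share/ucc.conf"}
chosen = pick_config_file(fake_env, "/opt/ucc", exists=on_disk.__contains__)
print(chosen)
```

Passing `exists` as a parameter keeps the function testable without touching the real file system.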