[Docs] hint to enable use of GPU performance counters in profiling tools
For multi-node distributed serving, there is a helper [script](https://github.com/vllm-project/vllm/tree/main/examples/run_cluster.sh) to start the cluster. However, this script understandably launches docker without administrative privileges, which at times curtails GPU profiling and tracing. This documentation change tells users that if they want to use profiling tools (e.g. NVIDIA Nsight) inside the container, they can add `CAP_SYS_ADMIN` to the docker container with the `--cap-add` option of the docker run command.
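
The option described above can be sketched as a dry run that only assembles and prints a `docker run` command line, without starting a container. This is an illustration under stated assumptions, not the contents of `run_cluster.sh`: the `IMAGE` default, the `ENABLE_PROFILING` toggle, and the other flags shown are placeholders chosen for the example.

```shell
#!/bin/sh
# Dry-run sketch: assemble a docker run command and print it instead of running it.
# IMAGE and ENABLE_PROFILING are hypothetical names for this example only;
# they are not options of run_cluster.sh.
IMAGE="${IMAGE:-vllm/vllm-openai}"
ENABLE_PROFILING="${ENABLE_PROFILING:-1}"

DOCKER_CMD="docker run --rm --gpus all"

# Grant CAP_SYS_ADMIN only when profiling is requested, since the capability
# widens what the container is allowed to do on the host.
if [ "$ENABLE_PROFILING" = "1" ]; then
    DOCKER_CMD="$DOCKER_CMD --cap-add=CAP_SYS_ADMIN"
fi

DOCKER_CMD="$DOCKER_CMD $IMAGE"

# Print the assembled command for inspection.
echo "$DOCKER_CMD"
```

Keeping the capability behind a toggle leaves the default launch unprivileged, which matches the script's current behavior. Docker's own documentation usually writes the capability in its short form, `--cap-add=SYS_ADMIN`.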
bk-TurbaAI authored Dec 16, 2024
1 parent 2ca830d commit e1f8f66
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion docs/source/serving/distributed_serving.rst
@@ -54,7 +54,7 @@ Multi-Node Inference and Serving

If a single node does not have enough GPUs to hold the model, you can run the model using multiple nodes. It is important to make sure the execution environment is the same on all nodes, including the model path, the Python environment. The recommended way is to use docker images to ensure the same environment, and hide the heterogeneity of the host machines via mapping them into the same docker configuration.

-The first step, is to start containers and organize them into a cluster. We have provided a helper `script <https://github.com/vllm-project/vllm/tree/main/examples/run_cluster.sh>`_ to start the cluster.
+The first step, is to start containers and organize them into a cluster. We have provided a helper `script <https://github.com/vllm-project/vllm/tree/main/examples/run_cluster.sh>`_ to start the cluster. Please note, this script launches docker without administrative privileges that would be required to access GPU performance counters when running profiling and tracing tools. For that purpose, the script can have `CAP_SYS_ADMIN` to the docker container by using the `--cap-add` option in the docker run command.

Pick a node as the head node, and run the following command:
