Commit 9307bfa: fix format
Signed-off-by: Hanzhi Zhou <[email protected]>
hanzhi713 committed Nov 7, 2024
Parent: d89f510 · Commit: 9307bfa
Showing 1 changed file with 1 addition and 1 deletion.
vllm/distributed/device_communicators/custom_all_reduce.py

@@ -278,7 +278,7 @@ def custom_all_reduce(self, input: torch.Tensor) -> Optional[torch.Tensor]:
                 return torch.empty_like(input)
         else:
             # Note: outside of cuda graph context, custom allreduce incurs a
-            # cost of cudaMemcpy, which should be small (<=1% of overall
+            # cost of cudaMemcpy, which should be small (<=1% of overall
             # latency) compared to the performance gain of using custom kernels
             return self.all_reduce(input, registered=False)
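The removed and added comment lines render identically here; consistent with the commit message "fix format", the change is presumably whitespace-only. For broader context, below is a hedged sketch of how `custom_all_reduce` might dispatch around the branch shown in the hunk. Only the final `else` branch and the `torch.empty_like` line come from the diff itself; the `_IS_CAPTURING` flag, the `torch.cuda.is_current_stream_capturing()` check, and the `all_reduce` stub are assumptions about code outside this hunk.

```python
# Sketch only: reconstructs the dispatch around the hunk above. Everything
# outside the final else-branch is assumed, not taken from this commit.
from typing import Optional

import torch


class CustomAllreduce:
    # Assumed flag: set while capturing or warming up a cuda graph.
    _IS_CAPTURING = False

    def all_reduce(self, input: torch.Tensor, registered: bool) -> torch.Tensor:
        # Placeholder for the custom CUDA kernel path. Per the comment in the
        # diff, the unregistered path pays an extra cudaMemcpy to stage the
        # tensor into a registered buffer.
        raise NotImplementedError

    def custom_all_reduce(self, input: torch.Tensor) -> Optional[torch.Tensor]:
        if self._IS_CAPTURING:
            if torch.cuda.is_current_stream_capturing():
                # Inside cuda graph capture: buffers are pre-registered.
                return self.all_reduce(input, registered=True)
            # Warmup pass: mimic the allocation pattern without running the
            # kernel, since custom allreduce is out-of-place.
            return torch.empty_like(input)
        else:
            # Note: outside of cuda graph context, custom allreduce incurs a
            # cost of cudaMemcpy, which should be small (<=1% of overall
            # latency) compared to the performance gain of using custom kernels
            return self.all_reduce(input, registered=False)
```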
