[V1] Add RayExecutor support for AsyncLLM (api server) #11712
Conversation
👋 Hi! Thank you for contributing to the vLLM project. Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging. To run CI, PR reviewers can do one of these:
🚀
Signed-off-by: Kunshang Ji <[email protected]>
@@ -150,7 +151,11 @@ def _get_executor_cls(cls, vllm_config: VllmConfig) -> Type[Executor]:
         executor_class: Type[Executor]
         distributed_executor_backend = (
             vllm_config.parallel_config.distributed_executor_backend)
-        if distributed_executor_backend == "mp":
+        if distributed_executor_backend == "ray":
Instead of repeating this logic in both LLMEngine and AsyncLLM, can we instead move _get_executor_cls into the abstract Executor? That way, we don't need to repeat logic across implementations. This is not a @classmethod anyway.
Good catch. Yeah we should unify them. cc @ruisearch42
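A rough sketch of what that consolidation could look like (the method name Executor.get_class, the imported module paths, and the branch layout are assumptions, not the code merged in this PR):

```python
# Illustrative sketch only: hoist backend selection into the abstract
# Executor so both engines share one code path. Module paths assumed.
from typing import Type

from vllm.config import VllmConfig


class Executor:
    """Abstract base for v1 executors (illustrative stub)."""

    @staticmethod
    def get_class(vllm_config: VllmConfig) -> Type["Executor"]:
        backend = vllm_config.parallel_config.distributed_executor_backend
        if backend == "ray":
            # Lazy import keeps Ray an optional dependency.
            from vllm.v1.executor.ray_executor import RayExecutor  # assumed path
            return RayExecutor
        if backend == "mp":
            from vllm.v1.executor.multiproc_executor import MultiprocExecutor
            return MultiprocExecutor
        raise ValueError(
            f"Unsupported distributed_executor_backend: {backend!r}")
```

LLMEngine and AsyncLLM would then both call Executor.get_class(vllm_config) instead of each carrying its own _get_executor_cls.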
@@ -23,6 +23,7 @@
 from vllm.v1.engine.detokenizer import Detokenizer
 from vllm.v1.engine.processor import Processor
 from vllm.v1.executor.abstract import Executor
+from vllm.v1.executor.ray_utils import initialize_ray_cluster
This should be a lazy import, right? Doesn't this import ray?
This will import Ray but won't error out if Ray is not installed:

try:
    import ray
    ...
except ImportError:
    ray = None

It errors out only when initialize_ray_cluster is called:

def initialize_ray_cluster():
    assert_ray_available()
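For completeness, a minimal self-contained sketch of that guard pattern (the error message and the body of assert_ray_available are assumptions; the real helpers live in vllm.v1.executor.ray_utils):

```python
# Optional-dependency guard: importing this module never fails,
# but Ray-specific entry points check availability explicitly.
try:
    import ray
except ImportError:
    ray = None  # Ray not installed; only a problem if the "ray" backend is used.


def assert_ray_available() -> None:
    # Raise only when Ray functionality is actually requested.
    if ray is None:
        raise ValueError(
            "Ray is not installed. Install it with `pip install ray` to use "
            "the 'ray' distributed executor backend.")


def initialize_ray_cluster(parallel_config=None) -> None:
    assert_ray_available()
    # Connect to an existing cluster or start a local one (details elided).
    ray.init(ignore_reinit_error=True)
```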
@@ -150,7 +151,11 @@ def _get_executor_cls(cls, vllm_config: VllmConfig) -> Type[Executor]:
         executor_class: Type[Executor]
         distributed_executor_backend = (
             vllm_config.parallel_config.distributed_executor_backend)
-        if distributed_executor_backend == "mp":
+        if distributed_executor_backend == "ray":
+            initialize_ray_cluster(vllm_config.parallel_config)
This is called in the RayExecutor constructor. No need to have it here.
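A sketch of why the engine-side call is redundant (the constructor body below is assumed for illustration, not the actual RayExecutor code):

```python
from vllm.config import VllmConfig
from vllm.v1.executor.abstract import Executor
from vllm.v1.executor.ray_utils import initialize_ray_cluster


class RayExecutor(Executor):
    """Illustrative constructor shape only; internals are assumptions."""

    def __init__(self, vllm_config: VllmConfig) -> None:
        super().__init__(vllm_config)
        # Cluster setup happens here, inside the executor, so callers
        # (LLMEngine / AsyncLLM) don't need to call initialize_ray_cluster
        # again before constructing the executor.
        initialize_ray_cluster(vllm_config.parallel_config)
```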
Sorry, forgot to disable auto-merge... @ruisearch42 could you submit a follow-up PR when you get a chance? Thanks