[V1] Add RayExecutor support for AsyncLLM (api server) #11712
Conversation
👋 Hi! Thank you for contributing to the vLLM project. Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging. To run CI, PR reviewers can do one of these:
🚀
Signed-off-by: Kunshang Ji <[email protected]>
@@ -150,7 +151,11 @@ def _get_executor_cls(cls, vllm_config: VllmConfig) -> Type[Executor]:
         executor_class: Type[Executor]
         distributed_executor_backend = (
             vllm_config.parallel_config.distributed_executor_backend)
-        if distributed_executor_backend == "mp":
+        if distributed_executor_backend == "ray":
Instead of repeating this logic in both LLMEngine and AsyncLLM, can we instead move _get_executor_cls into the abstract Executor? That way, we don't need to repeat logic across implementations. This is not a @classmethod anyway.
Good catch. Yeah we should unify them. cc @ruisearch42
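A rough sketch of what that consolidation could look like (the method name Executor.get_class, the imported module paths, and the branch layout are assumptions, not the code merged in this PR):

```python
# Illustrative sketch only: hoist backend selection into the abstract
# Executor so both engines share one code path. Module paths assumed.
from typing import Type

from vllm.config import VllmConfig


class Executor:
    """Abstract base for v1 executors (illustrative stub)."""

    @staticmethod
    def get_class(vllm_config: VllmConfig) -> Type["Executor"]:
        backend = vllm_config.parallel_config.distributed_executor_backend
        if backend == "ray":
            # Lazy import keeps Ray an optional dependency.
            from vllm.v1.executor.ray_executor import RayExecutor  # assumed path
            return RayExecutor
        if backend == "mp":
            from vllm.v1.executor.multiproc_executor import MultiprocExecutor
            return MultiprocExecutor
        raise ValueError(
            f"Unsupported distributed_executor_backend: {backend!r}")
```

LLMEngine and AsyncLLM would then both call Executor.get_class(vllm_config) instead of each carrying its own _get_executor_cls.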
@@ -23,6 +23,7 @@
 from vllm.v1.engine.detokenizer import Detokenizer
 from vllm.v1.engine.processor import Processor
 from vllm.v1.executor.abstract import Executor
+from vllm.v1.executor.ray_utils import initialize_ray_cluster
This should be a lazy import, right? Doesn't this import ray?
This will import Ray but won't error out if Ray is not installed:

try:
    import ray
    ...
except ImportError:
    ray = None

It errors out only when initialize_ray_cluster is called:

def initialize_ray_cluster():
    assert_ray_available()
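For completeness, a minimal self-contained sketch of that guard pattern (the error message and the body of assert_ray_available are assumptions; the real helpers live in vllm.v1.executor.ray_utils):

```python
# Optional-dependency guard: importing this module never fails,
# but Ray-specific entry points check availability explicitly.
try:
    import ray
except ImportError:
    ray = None  # Ray not installed; only a problem if the "ray" backend is used.


def assert_ray_available() -> None:
    # Raise only when Ray functionality is actually requested.
    if ray is None:
        raise ValueError(
            "Ray is not installed. Install it with `pip install ray` to use "
            "the 'ray' distributed executor backend.")


def initialize_ray_cluster(parallel_config=None) -> None:
    assert_ray_available()
    # Connect to an existing cluster or start a local one (details elided).
    ray.init(ignore_reinit_error=True)
```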
@@ -150,7 +151,11 @@ def _get_executor_cls(cls, vllm_config: VllmConfig) -> Type[Executor]:
         executor_class: Type[Executor]
         distributed_executor_backend = (
             vllm_config.parallel_config.distributed_executor_backend)
-        if distributed_executor_backend == "mp":
+        if distributed_executor_backend == "ray":
+            initialize_ray_cluster(vllm_config.parallel_config)
This is called in the RayExecutor constructor. No need to have it here.
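A sketch of why the engine-side call is redundant (the constructor body below is assumed for illustration, not the actual RayExecutor code):

```python
from vllm.config import VllmConfig
from vllm.v1.executor.abstract import Executor
from vllm.v1.executor.ray_utils import initialize_ray_cluster


class RayExecutor(Executor):
    """Illustrative constructor shape only; internals are assumptions."""

    def __init__(self, vllm_config: VllmConfig) -> None:
        super().__init__(vllm_config)
        # Cluster setup happens here, inside the executor, so callers
        # (LLMEngine / AsyncLLM) don't need to call initialize_ray_cluster
        # again before constructing the executor.
        initialize_ray_cluster(vllm_config.parallel_config)
```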
Sorry, forgot to disable auto-merge... @ruisearch42 could you submit a follow-up PR when you get a chance? Thanks