Loading models from an S3 location instead of local path #3090
Similar to what @ikalista mentioned in the original discussion, IMO a better approach is to mount model storage into the container for model loading, unless we want to rewrite the model loader to "stream" directly from S3 into GPU buffers, as Anyscale did.
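To make the two options concrete, here is a minimal sketch of the "stream from S3 into memory" idea, bypassing local disk entirely. This is illustrative only: `read_s3_object` and `iter_chunks` are hypothetical helper names, not part of vLLM's loader, and the real Anyscale approach streams into pinned GPU buffers rather than a `BytesIO`.

```python
import io

def iter_chunks(stream, chunk_size=8 * 1024 * 1024):
    """Yield fixed-size chunks from a file-like object (e.g. an S3 body)."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        yield chunk

def read_s3_object(bucket: str, key: str) -> bytes:
    """Read one S3 object fully into memory without staging it on disk.

    Hypothetical helper: requires boto3 and AWS credentials in the
    environment; in a real loader each chunk would be copied into a
    pre-allocated (GPU) buffer instead of an in-memory BytesIO.
    """
    import boto3  # assumed available at runtime

    body = boto3.client("s3").get_object(Bucket=bucket, Key=key)["Body"]
    buf = io.BytesIO()
    for chunk in iter_chunks(body):
        buf.write(chunk)
    return buf.getvalue()
```

The mount-based alternative needs no code change in vLLM at all: the platform mounts the bucket (e.g. via a CSI driver or FUSE filesystem) and the model path looks like any local directory.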
Sorry to bump an old issue here, but does this mean that
@ywang96 is anybody working on the direct model loading? Do we have a benchmark comparing mounting against loading directly into memory? Happy to work on this if nobody else is.
Not to my knowledge. Feel free to work on this, and thanks for your interest!
@ashvinnihalani are you still working on this? This would also be helpful for loading large models in environments where disk space isn't sufficient. The issue with mounting object storage is that it requires the platform operator to provide it: in certain K8s setups, the user deploying vLLM may not have the permissions required to mount object storage into their container. That's why this would be a very valuable feature.
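Until something like this lands, a common workaround in environments without mount permissions is to download the model from S3 to local (or tmpfs) storage at startup and point vLLM at that directory. A minimal sketch, assuming boto3 and valid AWS credentials; `parse_s3_uri` and `download_model` are hypothetical helper names, not vLLM APIs, and this still requires enough local space for the weights:

```python
import os
from urllib.parse import urlparse

def parse_s3_uri(uri: str) -> tuple[str, str]:
    """Split 's3://bucket/prefix' into (bucket, prefix)."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3":
        raise ValueError(f"not an s3:// URI: {uri}")
    return parsed.netloc, parsed.path.lstrip("/")

def download_model(s3_uri: str, dest: str) -> str:
    """Copy every object under the prefix into dest, preserving layout.

    Hypothetical helper: requires boto3 and AWS credentials.
    """
    import boto3  # assumed available at runtime

    bucket, prefix = parse_s3_uri(s3_uri)
    s3 = boto3.client("s3")
    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        for obj in page.get("Contents", []):
            rel = obj["Key"][len(prefix):].lstrip("/")
            if not rel or rel.endswith("/"):
                continue  # skip the prefix itself and folder markers
            target = os.path.join(dest, rel)
            os.makedirs(os.path.dirname(target) or dest, exist_ok=True)
            s3.download_file(bucket, obj["Key"], target)
    return dest
```

The returned directory can then be passed to vLLM as the model path, e.g. `LLM(model=download_model("s3://my-bucket/my-model", "/tmp/model"))` (bucket and paths here are made-up examples).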
Hey, the Streamer gives two main advantages:
You can read more in the whitepaper: https://pages.run.ai/hubfs/PDFs/White%20Papers/Model-Streamer-Performance-Benchmarks.pdf We have proposed a way to integrate it into vLLM.
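For a sense of what such an integration could look like from the user's side, here is a sketch of a CLI invocation. The flag and load-format names follow vLLM's documented Run:ai Model Streamer support, but treat the bucket path as a made-up example and check your vLLM version's docs before relying on the exact spelling:

```shell
# Assumes: pip install vllm runai-model-streamer
# Streams weights from S3 at startup instead of staging them on disk;
# AWS credentials must be available in the environment.
vllm serve s3://my-bucket/my-model/ --load-format runai_streamer
```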
Discussed in #3072
Originally posted by petrosbaltzis February 28, 2024
Hello,
The vLLM library gives the ability to load the model and the tokenizer either from a local folder or directly from Hugging Face.
I wonder whether this functionality can be extended to support S3 locations, so that when we initialize the API server we can pass the proper S3 location.
Petros