Does vllm support the Mac/Metal/MPS? #1441
Hi @Phil-U-U, vLLM does not support MPS backend at the moment as its main target scenario is to be deployed as a high-throughput server running on powerful accelerators like NVIDIA A100/H100. |
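For context on what "MPS backend" means here: PyTorch exposes runtime checks for each accelerator backend. A minimal sketch of how an application can probe for CUDA vs. MPS (the fallback order and device names are illustrative, not vLLM's actual startup logic):

```python
import torch

# Pick the best available backend: CUDA on NVIDIA hardware,
# MPS on Apple silicon, CPU otherwise. vLLM requires a supported
# accelerator backend, which MPS currently is not.
if torch.cuda.is_available():
    device = torch.device("cuda")
elif torch.backends.mps.is_available():
    device = torch.device("mps")
else:
    device = torch.device("cpu")

print(f"Selected device: {device}")
```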
That's understandable. Still, with the Mac Studio, many people and companies are starting to use Macs as LLM servers. I get very good t/s on a Mac Studio M2 Ultra using llama.cpp, but something like vLLM on the Mac would be a game changer. I hope the dev team considers this. |
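For comparison, llama.cpp's server exposes an OpenAI-compatible HTTP API, so it can stand in for a vLLM endpoint on a Mac. A minimal sketch of querying a locally running llama-server instance (the port, model file, and prompt are assumptions for illustration):

```python
import requests

# Query a llama.cpp server started locally, e.g. with:
#   llama-server -m model.gguf --port 8080
# The endpoint below is llama.cpp's OpenAI-compatible chat API.
resp = requests.post(
    "http://localhost:8080/v1/chat/completions",
    json={
        "model": "local-model",  # llama.cpp serves whatever model it loaded
        "messages": [{"role": "user", "content": "Hello from a Mac Studio!"}],
        "max_tokens": 64,
    },
    timeout=60,
)
print(resp.json()["choices"][0]["message"]["content"])
```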
I'm with @ibehnam; a Mac Studio would be great for vLLM. |
+1 |
5 similar comments
+1 |
+1 |
+1 |
+1 |
+1 |
Right, many developers are on Macs while testing out architectures. Yes, we'll commonly deploy to Linux, but if we can't quickly test and experiment locally on our daily machines, we'll most likely look elsewhere for an inference engine. It would be great if this could be supported, even if performance suffers on Apple Metal. |
+1 |
Does this issue still exist? |
@evolu8 Other solutions such as llama.cpp, MLC-LLM, Apple MLX, etc.? |
Does anyone know about any alternatives to vLLM until this is supported? |
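Of the alternatives mentioned, Apple's MLX has a convenient Python entry point via the mlx-lm package. A minimal sketch (the model repo is an example; any MLX-converted checkpoint should work, and the generate signature may vary across mlx-lm versions):

```python
# pip install mlx-lm  (Apple-silicon only)
from mlx_lm import load, generate

# Load an MLX-converted model from the Hugging Face Hub.
model, tokenizer = load("mlx-community/Mistral-7B-Instruct-v0.2-4bit")

# Generate a completion in the Mac's unified memory, no CUDA required.
text = generate(model, tokenizer,
                prompt="Why use unified memory for LLMs?",
                max_tokens=100)
print(text)
```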
same here please +1 |
+1 (via email, quoting Ather Shu: "+1, pip install vllm failed on my Mac M1 Pro", with an attached screenshot) |
+1 |
2 similar comments
+1 |
+1 |
+1 I use a MacBook and a Mac Studio. With the M3 architecture's unified memory, these are very capable and cost-effective systems for development and testing. |
I'm willing to sponsor that effort, because what we'd pay would be more than offset by the hardware savings of running on a Mac. |
+1 |
+10000 |
+1 It would be very useful |
It is not possible for vLLM to support the Mac while Triton (https://github.com/openai/triton) does not support the Mac. |
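That dependency is easy to verify for yourself. A minimal sketch of the import gate that fails on macOS, since Triton ships no macOS wheels (illustrative check only, not vLLM's actual code):

```python
# Triton backs vLLM's GPU kernels and has no macOS distribution,
# so this import fails on a Mac.
try:
    import triton
    print(f"Triton {triton.__version__} available")
except ImportError:
    print("Triton not installed; vLLM's GPU path cannot run here")
```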
+1 mac support would be very useful for experimentation |
I was about to try it with Snowflake Arctic on my Mac M2 Max hosted in AWS, and I couldn't. :/ |
More reason to support Mac. Best in class $/token: https://x.com/awnihannun/status/1786069640948719956 |
Guys, please add a "like" reaction to the main comment. |
okay |
+1 |
2 similar comments
+1 |
+1 |
I want to install vLLM on an M1 Mac Pro. How can I install it? |
Not supported :( |
If you're on Apple silicon, have a look at Exo Platform (open source), which supports MLX (using Python) and allows LLM inference with sharding across multiple Macs: https://github.com/exoplatform |
Yeah, this is common. In my experience running many models locally, a CUDA GPU setup is often worse than a simple M1 Pro. |
+1 |
+1 |
3 similar comments
+1 |
+1 |
+1 |
+1 |
+1 Would love to have vLLM support for Mac. |
+1 |
1 similar comment
+1 |
+1 |
+1 |
1 similar comment
+1 |
+10086 |
+10087 |
+10088 |
+1111111 |
I think #9228 should solve it now. You can build and run vLLM on macOS. That's not for performance, though; it's mainly for development. |
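Assuming such a development (CPU) build succeeds, a minimal smoke test with vLLM's offline API might look like this (the model is a small example chosen to keep a CPU-only build testable; throughput will be poor):

```python
from vllm import LLM, SamplingParams

# A tiny model keeps the CPU-only development build practical to test.
llm = LLM(model="facebook/opt-125m")
params = SamplingParams(temperature=0.8, max_tokens=32)

outputs = llm.generate(["The Mac Studio is"], params)
for out in outputs:
    print(out.outputs[0].text)
```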
+1 |
Compilation failed on my Mac M3. There are a few problems I found:
|
I ran into this error when running pip install vllm on a Mac:
RuntimeError: Cannot find CUDA_HOME. CUDA must be available to build the package. |
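That error comes from the build looking for a CUDA toolchain that macOS doesn't have. A minimal diagnostic sketch (what it prints is environment-dependent):

```python
# vLLM's default build target is CUDA, so its setup looks for a toolkit
# via PyTorch's extension utilities; on macOS this resolves to None,
# which is what triggers the "Cannot find CUDA_HOME" failure.
from torch.utils.cpp_extension import CUDA_HOME

print("CUDA_HOME:", CUDA_HOME)  # None on a Mac
```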