
Does vllm support the Mac/Metal/MPS? #1441

Closed
Phil-U-U opened this issue Oct 21, 2023 · 75 comments

Comments

@Phil-U-U

I ran into this error when running pip install vllm on a Mac:
RuntimeError: Cannot find CUDA_HOME. CUDA must be available to build the package.
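For context, this error comes from a build-time check that requires the CUDA toolkit to be present. A minimal sketch of that kind of guard, using a hypothetical find_cuda_home helper (this is not vLLM's actual setup.py):

```python
import os

def find_cuda_home() -> str:
    """Locate the CUDA toolkit root, mirroring the kind of build-time
    guard that produces the error above. Checks the CUDA_HOME and
    CUDA_PATH environment variables and raises if neither is set."""
    cuda_home = os.environ.get("CUDA_HOME") or os.environ.get("CUDA_PATH")
    if cuda_home is None:
        raise RuntimeError(
            "Cannot find CUDA_HOME. CUDA must be available to build the package."
        )
    return cuda_home
```

On a Mac there is no CUDA toolkit at all, so any build path that runs a check like this fails before compilation even starts.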

@WoosukKwon
Collaborator

Hi @Phil-U-U, vLLM does not support MPS backend at the moment as its main target scenario is to be deployed as a high-throughput server running on powerful accelerators like NVIDIA A100/H100.

@ibehnam

ibehnam commented Dec 12, 2023

> Hi @Phil-U-U, vLLM does not support MPS backend at the moment as its main target scenario is to be deployed as a high-throughput server running on powerful accelerators like NVIDIA A100/H100.

That's understandable. However, with the Mac Studio, many people and companies are starting to use Macs as LLM servers. I get very good t/s on a Mac Studio M2 Ultra using llama.cpp, but something like vLLM on the Mac would be a game changer. I hope the dev team considers this.

@willtejeda

I'm with @ibehnam, a Mac Studio would be great for vLLM.

@oushu1zhangxiangxuan1
Contributor

+1

5 similar comments
@joeywen

joeywen commented Dec 20, 2023

+1

@vincent-pli

+1

@jtoy

jtoy commented Dec 30, 2023

+1

@csmac3144

+1

@N8python

+1

@evolu8

evolu8 commented Jan 16, 2024

Right, many developers are on Macs while testing out architectures. Yes, we'll commonly deploy to Linux, but if we can't quickly test and experiment locally on our daily machines, we'll most likely look elsewhere for an inference engine. It would be great if this could be supported, even if performance suffers on Apple Metal.

@anisingh1

+1

@antahiap

Does this issue still exist?

@ibehnam

ibehnam commented Jan 21, 2024

@evolu8 Other solutions such as llama.cpp, LLM-MLC, Apple MLX, etc.?

@sislam-provenir

Does anyone know of any alternatives to vLLM until this is supported?

@bluenevus

same here please +1

@shushenghong

+1
pip install vllm failed on my Mac M1 Pro
[screenshot of the failure]

@jtoy

jtoy commented Feb 20, 2024 via email

@bluenevus

+1

2 similar comments
@WH2zzZ

WH2zzZ commented Feb 25, 2024

+1

@emitrokhin

+1

@csawtelle

+1 I use a MacBook and a Mac Studio. With the M3 architecture's unified memory, these are very capable and cost-effective systems for development and testing.

@bluenevus

I'm willing to sponsor that effort, because what we'd pay would be offset by the hardware savings of running on Macs.

@acodercat

+1

@chen-bowen

+10000

@NicolaDonelli

+1 It would be very useful

@youkaichao
Member

It is not possible for vLLM to support macOS while Triton (https://github.com/openai/triton) does not support macOS.
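That dependency can be checked without installing anything else; a hedged sketch using a hypothetical has_triton helper:

```python
import importlib.util

def has_triton() -> bool:
    """Return True if the Triton compiler is importable in this
    environment. Triton does not ship macOS wheels, so on a Mac
    this typically returns False, which in turn blocks any
    Triton-based kernel backend from running there."""
    return importlib.util.find_spec("triton") is not None
```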

@balasaajay

balasaajay commented Apr 21, 2024

+1 mac support would be very useful for experimentation

@AllanOricil

I was about to try it with Snowflake Arctic on my Mac M2 Max hosted in AWS, and I couldn't :/

@KE7

KE7 commented May 2, 2024

More reason to support Mac. Best in class $/token: https://x.com/awnihannun/status/1786069640948719956

@AllanOricil

Guys, please add a "like" to the main comment.

@kauabh

kauabh commented Jul 26, 2024

okay

@manbax

manbax commented Aug 5, 2024

+1

2 similar comments
@sd3ntato

sd3ntato commented Aug 5, 2024

+1

@atbe

atbe commented Aug 15, 2024

+1

@cderfvc

cderfvc commented Aug 21, 2024

Support for Mac M1, M2 ➕

I wish to install vLLM on an M1 Mac Pro.

How do I install it?

@mentisXdev

> Support for Mac M1, M2 ➕
>
> I wish to install vLLM on an M1 Mac Pro.
>
> How do I install it?

Not supported :(

@stephanj

If you're on Apple silicon, have a look at the Exo platform (OSS), which supports MLX (using Python) and allows inference with LLM sharding across multiple Macs: https://github.com/exoplatform

@clearsitedesigns

Yeah, this is common. In my experience of running many models locally, CUDA GPUs can be far worse than a simple M1 Pro.

@nmadhire

+1
This is needed to test out basic models locally before deploying.

@sokoloveai

+1

3 similar comments
@webboty

webboty commented Sep 28, 2024

+1

@dzlabs

dzlabs commented Oct 3, 2024

+1

@ExtraTon618

+1

@willtejeda

willtejeda commented Oct 6, 2024 via email

@kevin1193

+1, would love to have vLLM support for Mac.

@achrafmam2

+1

1 similar comment
@x-0D

x-0D commented Oct 30, 2024

+1

@willtejeda

willtejeda commented Oct 30, 2024 via email

@Jhonnyr97

Jhonnyr97 commented Nov 4, 2024

+1

1 similar comment
@lemmaa

lemmaa commented Nov 6, 2024

+1

@stormstone

+10086

@zuozuo

zuozuo commented Nov 13, 2024

+10087

@white-spring

+10088

@valenradovich

+1111111

@youkaichao
Member

I think #9228 should solve it now. You can build and run on macOS.

That's not for performance though; it's mainly for development.

@mjtechguy

+1

@wangxiyuan
Contributor

wangxiyuan commented Dec 9, 2024

> I think #9228 should solve it now. You can build and run on macOS.
>
> That's not for performance though; it's mainly for development.

Compilation failed on a Mac M3. There are a few problems I found:

  1. a hard-coded restriction to Linux systems: https://github.com/vllm-project/vllm/blob/main/setup.py#L36
  2. a hard-coded read of the Linux-only file /proc/cpuinfo: https://github.com/vllm-project/vllm/blob/main/cmake/cpu_extension.cmake#L29
  3. some other C code that does not work on macOS; for example, numa.h is missing there.
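The /proc/cpuinfo problem is the kind of thing a platform dispatch could address. A hypothetical sketch of reading CPU information portably (this is not the actual cmake fix, and the cpu_info helper is invented for illustration):

```python
import platform
import subprocess

def cpu_info() -> str:
    """Return raw CPU information using the native mechanism on each
    platform: /proc/cpuinfo on Linux, sysctl on macOS. A hard-coded
    read of /proc/cpuinfo would fail on macOS, where that file
    does not exist."""
    system = platform.system()
    if system == "Linux":
        with open("/proc/cpuinfo") as f:
            return f.read()
    if system == "Darwin":
        # machdep.cpu.brand_string holds the CPU model name on macOS.
        return subprocess.check_output(
            ["sysctl", "-n", "machdep.cpu.brand_string"], text=True
        )
    raise OSError(f"unsupported platform: {system}")
```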
