Does vllm support the Mac/Metal/MPS? #1441
Hi @Phil-U-U, vLLM does not support MPS backend at the moment as its main target scenario is to be deployed as a high-throughput server running on powerful accelerators like NVIDIA A100/H100. |
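For context on what "MPS backend" means here: PyTorch exposes runtime checks for each accelerator backend. A minimal sketch of how an application can probe for CUDA vs. MPS (the fallback order and device names are illustrative, not vLLM's actual startup logic):

```python
import torch

# Pick the best available backend: CUDA on NVIDIA hardware,
# MPS on Apple silicon, CPU otherwise. vLLM requires a supported
# accelerator backend, which MPS currently is not.
if torch.cuda.is_available():
    device = torch.device("cuda")
elif torch.backends.mps.is_available():
    device = torch.device("mps")
else:
    device = torch.device("cpu")

print(f"Selected device: {device}")
```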
That's understandable. Still, with the Mac Studio, many people and companies are starting to use Macs as LLM servers. I get very good t/s on a Mac Studio M2 Ultra using llama.cpp, but something like vLLM on the Mac would be a game changer. I hope the dev team considers this. |
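For comparison, llama.cpp's server exposes an OpenAI-compatible HTTP API, so it can stand in for a vLLM endpoint on a Mac. A minimal sketch of querying a locally running llama-server instance (the port, model file, and prompt are assumptions for illustration):

```python
import requests

# Query a llama.cpp server started locally, e.g. with:
#   llama-server -m model.gguf --port 8080
# The endpoint below is llama.cpp's OpenAI-compatible chat API.
resp = requests.post(
    "http://localhost:8080/v1/chat/completions",
    json={
        "model": "local-model",  # llama.cpp serves whatever model it loaded
        "messages": [{"role": "user", "content": "Hello from a Mac Studio!"}],
        "max_tokens": 64,
    },
    timeout=60,
)
print(resp.json()["choices"][0]["message"]["content"])
```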
I'm with @ibehnam; a Mac Studio would be great for vLLM. |
+1 |
5 similar comments
+1 |
+1 |
+1 |
+1 |
+1 |
Right, many developers are on Macs while testing out architectures. Yes, we'll commonly deploy to Linux, but if we can't quickly test and experiment locally on our daily machines, we'll most likely look elsewhere for an inference engine. It would be great if this could be supported, even if performance suffers on Apple Metal. |
+1 |
Does this issue still exist? |
@evolu8 Other solutions such as llama.cpp, MLC-LLM, Apple MLX, etc.? |
Does anyone know about any alternatives to vLLM until this is supported? |
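Of the alternatives mentioned, Apple's MLX has a convenient Python entry point via the mlx-lm package. A minimal sketch (the model repo is an example; any MLX-converted checkpoint should work, and the generate signature may vary across mlx-lm versions):

```python
# pip install mlx-lm  (Apple-silicon only)
from mlx_lm import load, generate

# Load an MLX-converted model from the Hugging Face Hub.
model, tokenizer = load("mlx-community/Mistral-7B-Instruct-v0.2-4bit")

# Generate a completion in the Mac's unified memory, no CUDA required.
text = generate(model, tokenizer,
                prompt="Why use unified memory for LLMs?",
                max_tokens=100)
print(text)
```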
same here please +1 |
+1 (via email, quoting Ather Shu: "+1, pip install vllm failed on my Mac M1 Pro", with an attached screenshot) |
+1 |
2 similar comments
+1 |
+1 |
+1 I use a MacBook and a Mac Studio. With the M3 architecture's unified memory, these are very capable and cost-effective systems for development and testing. |
I'm willing to sponsor that effort, because what we'd pay would be more than offset by the hardware savings of running on a Mac. |
+1 |
+10000 |
+1 It would be very useful |
It is not possible for vLLM to support the Mac while Triton (https://github.com/openai/triton) does not support the Mac. |
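That dependency is easy to verify for yourself. A minimal sketch of the import gate that fails on macOS, since Triton ships no macOS wheels (illustrative check only, not vLLM's actual code):

```python
# Triton backs vLLM's GPU kernels and has no macOS distribution,
# so this import fails on a Mac.
try:
    import triton
    print(f"Triton {triton.__version__} available")
except ImportError:
    print("Triton not installed; vLLM's GPU path cannot run here")
```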
+1 mac support would be very useful for experimentation |
I was about to try it with Snowflake Arctic on my Mac M2 Max hosted in AWS, and I couldn't. :/ |
More reason to support Mac. Best in class $/token: https://x.com/awnihannun/status/1786069640948719956 |
Guys, please add a "like" reaction to the main comment. |
okay |
+1 |
2 similar comments
+1 |
+1 |
I want to install vLLM on an M1 Mac Pro. How can I install it? |
Not supported :( |
If you're on Apple silicon, have a look at Exo Platform (open source), which supports MLX (using Python) and allows LLM inference with sharding across multiple Macs: https://github.com/exoplatform |
Yeah, this is common. In my experience running many models locally, a CUDA GPU setup is often worse than a simple M1 Pro. |
+1 |
+1 |
3 similar comments
+1 |
+1 |
+1 |
+1 |
+1 Would love to have vLLM support for Mac. |
+1 |
1 similar comment
+1 |
+1 |
+1 |
1 similar comment
+1 |
+10086 |
+10087 |
+10088 |
+1111111 |
I think #9228 should solve it now. You can build and run vLLM on macOS. That's not for performance, though; it's mainly for development. |
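Assuming such a development (CPU) build succeeds, a minimal smoke test with vLLM's offline API might look like this (the model is a small example chosen to keep a CPU-only build testable; throughput will be poor):

```python
from vllm import LLM, SamplingParams

# A tiny model keeps the CPU-only development build practical to test.
llm = LLM(model="facebook/opt-125m")
params = SamplingParams(temperature=0.8, max_tokens=32)

outputs = llm.generate(["The Mac Studio is"], params)
for out in outputs:
    print(out.outputs[0].text)
```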
+1 |
Compilation failed on my Mac M3. There are a few problems I found:
|
I ran into this error when running pip install vllm on a Mac:
RuntimeError: Cannot find CUDA_HOME. CUDA must be available to build the package. |
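That error comes from the build looking for a CUDA toolchain that macOS doesn't have. A minimal diagnostic sketch (what it prints is environment-dependent):

```python
# vLLM's default build target is CUDA, so its setup looks for a toolkit
# via PyTorch's extension utilities; on macOS this resolves to None,
# which is what triggers the "Cannot find CUDA_HOME" failure.
from torch.utils.cpp_extension import CUDA_HOME

print("CUDA_HOME:", CUDA_HOME)  # None on a Mac
```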