This repository has been archived by the owner on Dec 1, 2024. It is now read-only.

Pull requests: FMInference/FlexLLMGen

Open pull requests

Add support for Llama and Qwen models
#135 opened Mar 29, 2024 by marswen

[Feature] Intel dGPU/SYCL support
#125 opened Sep 25, 2023 by abhilash1910

fix torchrun inference
#112 opened Apr 25, 2023 by fsx950223

Allow FlexGen to use locally downloaded models
#111 opened Apr 24, 2023 by Vinkle-hzt

Add SkyPilot example for running benchmarks
#96 opened Mar 9, 2023 by Michaelvll

CPU and M1/M2 GPU platform support
#80 opened Mar 1, 2023 by xiezhq-hermann