Pull requests: HabanaAI/vllm-hpu-extension


Update calibration readme link (#73, opened Jan 10, 2025 by michalkuligowski)
Add flag to enable running softmax in fp32 (#71, opened Jan 10, 2025 by madamczykhabana)
Remove repeat KV cache (#69, opened Jan 8, 2025 by iboiko-habana)
Pad to bmin if value is less (#67, opened Jan 2, 2025 by mfylcek)
allow lm_head quantization in calibration process (#65, opened Dec 23, 2024 by nirda7)
Add exponential bucketing PoC (#61, opened Dec 17, 2024 by kzawora-intel) [Draft]
vLLM-Ext: Full enabling of ALiBi (#60, opened Dec 17, 2024 by tannervoas742)
Remove vllm.logger.init_logger dependency (#53, opened Dec 9, 2024 by kzawora-intel)
Add AWQ class (#29, opened Nov 8, 2024 by maktukmak)
Add GPTQ class (#28, opened Nov 8, 2024 by maktukmak)