Pull requests: sgl-project/sglang
add config to switch from vllm custom allreduce to sgl_kernel custom allreduce
#2981
opened Jan 19, 2025 by
yizhang2077
4 tasks
[MOE] try to optimize cu kernel single block execution - distribute cumsum workload from thread 0 to other threads
#2970
opened Jan 19, 2025 by
yiakwy-xpu-ml-framework-team
3 of 4 tasks
[EAGLE] Fix some boundary situations when retracting reqs and a req's max token = 1
#2939
opened Jan 17, 2025 by
josephydu
[#2812] Make the decode status dict capacity adjustable by a CLI param
#2839
opened Jan 11, 2025 by
seungduk-yanolja
2 of 3 tasks
Support distributed tensor when updating weights
#2831
opened Jan 10, 2025 by
fzyzcjy
3 tasks done
Support custom device mesh for tensor parallel workers
#2827
opened Jan 10, 2025 by
fzyzcjy
3 tasks done
Use CUDA_VISIBLE_DEVICES instead of gpu_id variables everywhere.
#2824
opened Jan 10, 2025 by
heiner
1 task done
Improve the mixed chunk prefill by launching two kernels
#2811
opened Jan 9, 2025 by
libratiger
Draft
1 of 3 tasks
Add endpoint for file support, purely to speed up processing of input_embeds.
#2797
opened Jan 8, 2025 by
RinRin-32
2 of 3 tasks
Speculative decoding with lookahead
enhancement
high priority
#2790
opened Jan 8, 2025 by
jjjjohnson
3 tasks done
[Feature] Support regex as a stopping condition
#2699
opened Jan 2, 2025 by
mickqian
3 tasks done