v0.4.0 #18

XkunW · 2024-11-28T18:18:16Z

PR Type

[Release]
v0.4.0

Short Description

Onboarded various new models and two new model types: text embedding models and reward reasoning models.
Added a metrics command that streams performance metrics for the inference server.
Enabled more launch command options: --max-num-seqs, --model-weights-parent-dir, --pipeline-parallelism, --enforce-eager.
Improved support for launching custom models.
Improved command response time.
Improved visuals for list command.
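
For readers who want a concrete picture of the four new launch options, here is a minimal sketch of how such flags could be parsed. It uses Python's standard argparse module purely for illustration and is not the project's actual CLI code. The program name `launch`, the positional `model_name`, the helper `str_to_bool`, and all argument types and defaults are assumptions; only the four flag names come from this PR.

```python
import argparse


def str_to_bool(value: str) -> bool:
    """Parse a 'true'/'false' style string (hypothetical helper)."""
    lowered = value.strip().lower()
    if lowered in {"true", "1", "yes"}:
        return True
    if lowered in {"false", "0", "no"}:
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")


def build_launch_parser() -> argparse.ArgumentParser:
    """Build a parser covering only the options named in this PR.

    Types and defaults are assumptions made for this sketch.
    """
    parser = argparse.ArgumentParser(prog="launch")
    # Hypothetical positional argument identifying the model to launch.
    parser.add_argument("model_name")
    # Assumed to be an integer cap on concurrently processed sequences.
    parser.add_argument("--max-num-seqs", type=int, default=None)
    # Assumed to be a directory that contains the model weights folder.
    parser.add_argument("--model-weights-parent-dir", type=str, default=None)
    # None means "not specified", so the caller can pick its own default.
    parser.add_argument("--pipeline-parallelism", type=str_to_bool, default=None)
    # Assumed to be a simple on/off switch.
    parser.add_argument("--enforce-eager", action="store_true")
    return parser


if __name__ == "__main__":
    args = build_launch_parser().parse_args(
        ["my-custom-model", "--max-num-seqs", "256", "--enforce-eager"]
    )
    print(args)
```

Leaving --pipeline-parallelism unset as None, rather than defaulting to False, lets the calling code tell "not specified" apart from "explicitly disabled", which matters if a default is chosen per job type.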


XkunW added 28 commits October 29, 2024 10:27

Update pyproject.toml and Dockerfile for the latest vllm installation

6f43ae4

Fix wrong config for Mistral large, use PP by default for all multi node jobs, update vLLM container path

73ce125

Remove PP option in models csv

34f59d2

Update PP to be default for multinode jobs

8951a0f

Add back pipeline parallelism as an argument, default to True for all models, added Llama 3.2 and Llama 3.1 Nemotron

1e7eeeb

Add Qwen2.5 and Pixtral models

b2bfdfb

Fix typo

e7ed931

Fix PP error, added metrics command

f5d2260

Add model type field to models, added 2 text embedding models, updated list command based on model type, updated READMEs, removed debugging code

c89f198

Update color

5d580a0

Updated max model length for text embedding models, update error handling for when errors that don't affect server launching show up in err logs

5d9c603

Fix wrong command syntax

6733d9b

Update screenshot

c5d7ae7

Enable default values for common params, added max_num_seqs param

487aef8

Black formatting

d1fac58

Update README

c26a1d0

Add command to support remote model launching via SSH

d578a8a

Update default params, remove default for qos and walltime as they overwrite the values from models.csv

66dfa8e

Bugfix for metrics, update image for metrics command

657d43a

Add model weights parent directory parameter

6f43e54

Update description for --model-weights-parent-dir, add instructions for launching custom models

e0b194c

Added enforce eager option, added support for reward modeling models, refactors based on mypy

e7b6871

Add support for enforce eager, remove URL file creation

97178a8

Added new models

33e61e7

Update README

f9dad2a

Add QwQ 32B preview

2d1ba53

Bump version, update dependencies

60e6ddb

Remove bge-multilingual-gemma2 for now as vllm doesn't recognize it as a text embedding model

f74c4f6

XkunW merged commit d221dae into main Nov 28, 2024
1 check failed
