Commit
* chore(ci): remove TGI_VERSION argument from workflow
  The Dockerfile has a default value, it is easier to only maintain that.
* feat(TGI): update to v3.0.0
  Update to TGI 3.0.0, using a simplified Cargo.toml. This is based on the work done on optimum-neuron: huggingface/optimum-neuron#748
* fix(tgi): add merge_lora kwarg to download_weights
* fix(tgi): return the correct FinishReason on stop string
* fix(tgi): set max_batch_prefill_tokens
  Starting from TGI 2.4.1, the evaluation of the default value for max_batch_prefill_tokens in the TGI launcher has changed, leading it to be set to a default value of 4096 on tpu, while it was previously set to max_batch_size * max_input_tokens. This is now fixed in the entrypoint, pending a fix in the launcher.
* review(docker): unset MAX_INPUT_LENGTH when set
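The max_batch_prefill_tokens fix described above restores the earlier default of max_batch_size * max_input_tokens. The following is only a minimal sketch of that arithmetic, not the entrypoint code from this commit. The function name is hypothetical, and the use of the MAX_BATCH_SIZE, MAX_INPUT_TOKENS and MAX_BATCH_PREFILL_TOKENS environment variables is an assumption about how the values reach the entrypoint.

```python
import os


def default_max_batch_prefill_tokens(env):
    """Hypothetical helper: return the value to export, or None to change nothing.

    Assumption: the settings arrive as the environment variables
    MAX_BATCH_SIZE, MAX_INPUT_TOKENS and MAX_BATCH_PREFILL_TOKENS.
    """
    # An explicit user setting always wins; do not override it.
    if env.get("MAX_BATCH_PREFILL_TOKENS"):
        return None
    batch_size = env.get("MAX_BATCH_SIZE")
    input_tokens = env.get("MAX_INPUT_TOKENS")
    # Without both factors the product cannot be computed.
    if not batch_size or not input_tokens:
        return None
    # The default described in the commit message:
    # max_batch_size * max_input_tokens.
    return int(batch_size) * int(input_tokens)


if __name__ == "__main__":
    value = default_max_batch_prefill_tokens(os.environ)
    if value is not None:
        os.environ["MAX_BATCH_PREFILL_TOKENS"] = str(value)
    print(os.environ.get("MAX_BATCH_PREFILL_TOKENS"))
```

For example, a batch size of 4 with 1024 input tokens gives 4096 prefill tokens, while a batch size of 8 gives 8192, which a fixed default of 4096 would not cover.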
1 parent 0b9cfd2, commit 20772b8
Showing 10 changed files with 125 additions and 32 deletions.
@@ -0,0 +1,47 @@
+[workspace]
+members = [
+    "backends/v2",
+    "backends/grpc-metadata",
+    "launcher",
+    "router"
+]
+default-members = [
+    "backends/v2",
+    "backends/grpc-metadata",
+    "launcher",
+    "router"
+]
+resolver = "2"
+
+[workspace.package]
+version = "3.0.0"
+edition = "2021"
+authors = ["Olivier Dehaene"]
+homepage = "https://github.com/huggingface/text-generation-inference"
+
+[workspace.dependencies]
+base64 = "0.22.0"
+tokenizers = { version = "0.20.0", features = ["http"] }
+hf-hub = { version = "0.3.1", features = ["tokio"] }
+metrics = { version = "0.23.0" }
+metrics-exporter-prometheus = { version = "0.15.1", features = [] }
+minijinja = { version = "2.2.0", features = ["json"] }
+minijinja-contrib = { version = "2.0.2", features = ["pycompat"] }
+pyo3 = { version = "0.22.2", features = ["auto-initialize"] }
+
+[profile.release]
+incremental = true
+
+[profile.release-binary]
+inherits = "release"
+debug = 1
+incremental = true
+panic = "abort"
+
+[profile.release-opt]
+inherits = "release"
+debug = 0
+incremental = false
+lto = "fat"
+opt-level = 3
+codegen-units = 1
@@ -1,3 +1,3 @@
 build
-grpcio-tools==1.62.1
-mypy-protobuf==3.2.0
+grpcio-tools==1.53.0
+mypy-protobuf