
⬆️ Update dependencies #120

Merged
merged 13 commits into main from update-deps on Nov 29, 2024
Conversation

@tengomucho tengomucho commented Nov 27, 2024

What does this PR do?

This PR contains several dependency updates:

  • Jetstream v0.2.4
  • TGI v2.4.1
  • Transformers v4.46.3
  • Accelerate v1.1.1
  • PyTorch and PyTorch XLA v2.5.1
  • Safetensors v0.4.5

All of these are updates to newer versions; the required changes (some revealed by tests) were made accordingly.

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

A warning appeared on an old version of torch xla (2.3.0), but that version is not supported anymore.
When building the docker container, a "too many open files" error sometimes occurs. Increasing the ulimit makes the error disappear.
Also align the Dockerfile with TGI's.
The config object for these models is used by the Jetstream code, but it does not completely match HF's config definitions. This adds a class that inherits from both config classes and makes the adjustments necessary to avoid errors.
It is still possible to import it through the modeling module, but this reduces the chance of importing transformers and torch xla before torch_xla2.
Scores tensors are converted from JAX to torch and back when calling the logits processors. This will be required in newer versions of transformers.
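
For illustration, a minimal sketch of such a round-trip; the helper name, the choice of processor, and the NumPy-based conversion are assumptions rather than the PR's actual code:

```python
import jax.numpy as jnp
import numpy as np
import torch
from transformers import LogitsProcessorList, TemperatureLogitsWarper


def process_scores(input_ids_jax, scores_jax):
    # Hypothetical round-trip: JAX scores -> torch for the HF logits
    # processors (newer transformers expects torch tensors) -> back to JAX.
    processors = LogitsProcessorList([TemperatureLogitsWarper(0.8)])

    # JAX -> torch, via host NumPy copies (the PR may use different helpers).
    input_ids = torch.tensor(np.asarray(input_ids_jax))
    scores = torch.tensor(np.asarray(scores_jax))

    scores = processors(input_ids, scores)

    # torch -> JAX for the rest of the Jetstream pipeline.
    return jnp.asarray(scores.numpy())
```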
This keeps dependencies coherent with accelerate and updates to a newer version.
@tengomucho tengomucho marked this pull request as ready for review November 28, 2024 16:58
@@ -4,6 +4,20 @@
from transformers import GenerationConfig, GenerationMixin, MixtralConfig


class MixtralConfigHf(MixtralConfig, mixtral_config.ModelArgs):

This is a bit brittle, as setting one argument will not update its aliased value. What is the benefit of adding this?

Collaborator Author
The Transformer class from Jetstream is the model class for Mistral, and it has a config variable that used to be an instance of ModelArgs. When I use the model in TGI and want to use the GenerationMixin methods for sampling, it ends up using the config, expecting it to be a transformers MixtralConfig instance, and it fails because it is not. This is why I came up with this solution. I could try using composition instead of inheritance to see if I can find a cleaner solution, though.
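
For illustration, a rough sketch of that inheritance idea; the ModelArgs import path and the mirrored field names are assumptions, not necessarily the PR's actual code:

```python
from transformers import MixtralConfig
# Jetstream's Mixtral ModelArgs; this import path is an assumption.
from jetstream_pt.third_party.mixtral import config as mixtral_config


class MixtralConfigHf(MixtralConfig, mixtral_config.ModelArgs):
    """A config that is both a transformers MixtralConfig (so GenerationMixin
    sampling accepts it) and a Jetstream ModelArgs (so the Transformer model
    can keep reading its usual fields)."""

    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        # Mirror the HF fields onto ModelArgs-style names (illustrative names).
        self.dim = self.hidden_size
        self.n_layer = self.num_hidden_layers
        self.n_head = self.num_attention_heads
```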

Comment on lines 29 to 31
# args = GemmaConfigHf(**config.to_dict())
# args.device = device
# super().__init__(args, env)

Suggested change (remove these commented-out lines)
# args = GemmaConfigHf(**config.to_dict())
# args.device = device
# super().__init__(args, env)

Collaborator Author

This never happened 🧙‍♂️

Instead of assigning separate variables for Jetstream's config class, properties are added, so the same data is accessed and ambiguity is avoided.
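
A minimal sketch of that properties-based approach (attribute names are illustrative, not necessarily the ones the PR uses):

```python
from transformers import MixtralConfig


class MixtralConfigHf(MixtralConfig):
    # Expose ModelArgs-style names as read-only views over the HF fields,
    # so there is a single source of truth and no stale aliased copies.

    @property
    def dim(self):
        return self.hidden_size

    @property
    def n_layer(self):
        return self.num_hidden_layers

    @property
    def n_head(self):
        return self.num_attention_heads
```

This also addresses the earlier review concern: since the Jetstream-style names are computed from the HF fields, setting an argument can no longer leave an aliased copy out of sync.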
@tengomucho tengomucho merged commit a1919c2 into main Nov 29, 2024
5 checks passed
@tengomucho tengomucho deleted the update-deps branch November 29, 2024 14:57