Skip to content

Commit

Permalink
[docs] Cache implementations (huggingface#34325)
Browse files Browse the repository at this point in the history
cache
  • Loading branch information
stevhliu authored Oct 25, 2024
1 parent 6a62a6d commit 1d06379
Showing 1 changed file with 9 additions and 1 deletion.
10 changes: 9 additions & 1 deletion src/transformers/generation/configuration_utils.py
Original file line number Diff line number Diff line change
Expand Up @@ -172,7 +172,15 @@ class GenerationConfig(PushToHubMixin):
speed up decoding.
cache_implementation (`str`, *optional*, default to `None`):
Name of the cache class that will be instantiated in `generate`, for faster decoding. Possible values are:
{ALL_CACHE_IMPLEMENTATIONS}. We support other cache types, but they must be manually instantiated and
- `"static"`: [`StaticCache`]
- `"offloaded_static"`: [`OffloadedStaticCache`]
- `"sliding_window"`: [`SlidingWindowCache`]
- `"hybrid"`: [`HybridCache`]
- `"mamba"`: [`MambaCache`]
- `"quantized"`: [`QuantizedCache`]
We support other cache types, but they must be manually instantiated and
passed to `generate` through the `past_key_values` argument. See our
[cache documentation](https://huggingface.co/docs/transformers/en/kv_cache) for further information.
cache_config (`CacheConfig` or `dict`, *optional*, default to `None`):
Expand Down

0 comments on commit 1d06379

Please sign in to comment.