forked from EleutherAI/DeeperSpeed
Commit
KV Cache Improved Flexibility (microsoft#4668)
This KV-cache change adds the foundation for supporting two key KV-cache improvements:

1. Delineation between local and dense KV caches/models at the cache level, in addition to the attention module level.
2. Support for multiple types of disjoint KV caches (such as the alternating local + dense attention in GPT-Neo).

Follow-up item: determine appropriate statistics for weighting local and dense KV block ratios when both are present.

Co-authored-by: Olatunji Ruwase <[email protected]>
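To make the local/dense distinction concrete, here is a minimal sketch of how per-group block requirements could differ for a single sequence. It is not the DeepSpeed implementation: `CacheGroup`, `blocks_needed`, and `blocks_per_group` are hypothetical names, and the sketch assumes a local cache only ever retains the most recent `window` tokens while a dense cache retains all of them.

```python
from dataclasses import dataclass
from math import ceil
from typing import Dict, List, Optional


@dataclass(frozen=True)
class CacheGroup:
    """Hypothetical description of one disjoint KV cache group."""
    name: str
    block_size: int                 # tokens stored per KV block
    window: Optional[int] = None    # None => dense; int => local attention window


def blocks_needed(group: CacheGroup, seq_len: int) -> int:
    """Blocks one sequence occupies in this group (simplified assumption)."""
    if seq_len < 0:
        raise ValueError("seq_len must be non-negative")
    # Dense caches grow with the sequence; local caches are capped at the window.
    tokens = seq_len if group.window is None else min(seq_len, group.window)
    return ceil(tokens / group.block_size)


def blocks_per_group(groups: List[CacheGroup], seq_len: int) -> Dict[str, int]:
    """Block count for every cache group, keyed by group name."""
    return {g.name: blocks_needed(g, seq_len) for g in groups}


# Example: a model alternating dense and local attention layers.
groups = [
    CacheGroup(name="dense", block_size=64),
    CacheGroup(name="local", block_size=64, window=256),
]
usage = blocks_per_group(groups, seq_len=1000)
```

Under these assumptions the dense group's demand keeps growing with sequence length while the local group's demand plateaus at the window size, which is why the ratio of blocks allotted to each group is left as a follow-up question.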
Showing 10 changed files with 272 additions and 181 deletions.
deepspeed/inference/v2/model_implementations/common_architectures/__init__.py
4 changes: 0 additions & 4 deletions
This file was deleted.