Commit
Refactor the busy loop for the continuous batching
The previous host-side implementation of continuous batching was a busy loop that polled for incoming requests, which could starve other host-side operations. This commit refactors it to use two threads per model method: one for prefill/insert and one for generate.

PiperOrigin-RevId: 608734099
Change-Id: I6c07e51a1db23243d2b48c3ad9323a27aef2e453
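The diff itself is not shown on this page, so the following is only a minimal sketch of the general pattern the message describes, not the code from this commit. It assumes a fixed number of cache slots and a fixed number of decode steps per request. Every name in it (`TwoThreadBatcher`, `submit`, `close`, the queue attributes) is hypothetical. The prefill/insert thread blocks on a queue instead of polling, and the generate thread blocks only when no request is active.

```python
import queue
import threading

_STOP = object()  # sentinel that shuts both threads down


class TwoThreadBatcher:
    """Hypothetical sketch: one prefill/insert thread, one generate thread."""

    def __init__(self, num_slots=2, max_steps=3):
        self.requests = queue.Queue()    # incoming requests
        self.free_slots = queue.Queue()  # indices of unused cache slots
        self.inserted = queue.Queue()    # (slot, request) pairs ready to decode
        self.results = queue.Queue()     # (request, steps) for finished requests
        self.max_steps = max_steps       # assumed fixed decode length
        for slot in range(num_slots):
            self.free_slots.put(slot)
        self._threads = [
            threading.Thread(target=self._prefill_loop, daemon=True),
            threading.Thread(target=self._generate_loop, daemon=True),
        ]
        for thread in self._threads:
            thread.start()

    def submit(self, request):
        self.requests.put(request)

    def close(self):
        self.requests.put(_STOP)
        for thread in self._threads:
            thread.join()

    def _prefill_loop(self):
        while True:
            request = self.requests.get()  # blocks; no busy polling
            if request is _STOP:
                self.inserted.put(_STOP)   # forward shutdown to the generate thread
                return
            slot = self.free_slots.get()   # blocks until a slot is released
            # A real system would run prefill here and insert the result
            # into the decode state at `slot`.
            self.inserted.put((slot, request))

    def _generate_loop(self):
        active = {}  # slot -> [request, steps_done]
        stopping = False
        while not (stopping and not active):
            block = not active  # sleep only when there is nothing to decode
            try:
                while True:
                    item = self.inserted.get(block=block)
                    block = False  # after the first item, just drain the queue
                    if item is _STOP:
                        stopping = True
                    else:
                        slot, request = item
                        active[slot] = [request, 0]
            except queue.Empty:
                pass
            # One decode step across every active slot.
            for slot in list(active):
                active[slot][1] += 1
                if active[slot][1] >= self.max_steps:
                    request, steps = active.pop(slot)
                    self.results.put((request, steps))
                    self.free_slots.put(slot)  # lets the prefill thread proceed
```

In this sketch, backpressure comes from `free_slots`: when every slot is in use, the prefill/insert thread sleeps inside `get()` until the generate thread releases one, so neither thread spins while idle.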
1 parent f817103, commit ab36fc0
Showing 1 changed file with 142 additions and 64 deletions.