Commit

Use MAX_LIVE_BATCHES for prefill method
PiperOrigin-RevId: 621347176
Change-Id: I339e204fe9b9d6bde6105daea9b62da139911497
changlan authored and copybara-github committed Apr 3, 2024
1 parent 50beb83 commit 3ac3f3d
Showing 1 changed file with 1 addition and 3 deletions.
4 changes: 1 addition & 3 deletions saxml/server/model_service_base.py
@@ -1633,9 +1633,7 @@ def preprocess_rpc_tasks(
 preprocess_fn=functools.partial(
 preprocess_rpc_tasks, prefill=True
 ),
-# Each batch has `batch_size` requests and we should have at least
-# `num_cache_slots` * 2 live requests
-max_live_batches=method.num_cache_slots * 2 // method.batch_size,
+max_live_batches=method.max_live_batches,
 batching_wait_secs=method.batching_wait_secs,
 )
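
The effect of the change can be sketched in isolation. The snippet below is a minimal illustration, not code from saxml: `PrefillMethodConfig` and both helper functions are hypothetical stand-ins, and only the attribute names (`batch_size`, `num_cache_slots`, `max_live_batches`) and the removed expression are taken from the diff above.

```python
from dataclasses import dataclass


@dataclass
class PrefillMethodConfig:
    """Hypothetical stand-in for the `method` object in the diff (an assumption)."""

    batch_size: int
    num_cache_slots: int
    max_live_batches: int


def derived_max_live_batches(method: PrefillMethodConfig) -> int:
    # Removed behavior: derive the limit so that roughly
    # `num_cache_slots * 2` requests can be live at once.
    return method.num_cache_slots * 2 // method.batch_size


def configured_max_live_batches(method: PrefillMethodConfig) -> int:
    # New behavior: use the value configured on the method as-is.
    return method.max_live_batches


method = PrefillMethodConfig(batch_size=4, num_cache_slots=16, max_live_batches=6)
old_value = derived_max_live_batches(method)     # 16 * 2 // 4
new_value = configured_max_live_batches(method)  # taken from the config

# Integer division floors, so the removed expression evaluates to 0
# whenever batch_size exceeds num_cache_slots * 2.
small = PrefillMethodConfig(batch_size=8, num_cache_slots=3, max_live_batches=1)
floored_value = derived_max_live_batches(small)
```

With these illustrative numbers the derived limit is 8 while the configured limit is 6, so after this commit the prefill method follows whatever `max_live_batches` the method specifies rather than a value computed from the cache size.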
