
Commit
tweak and clarify
Signed-off-by: Roger Wang <[email protected]>
ywang96 committed Dec 30, 2024
1 parent bbde414 commit ea928c6
Showing 1 changed file with 9 additions and 2 deletions.
11 changes: 9 additions & 2 deletions vllm/v1/worker/gpu_model_runner.py
@@ -647,11 +647,18 @@ def profile_run(self) -> None:
             self.mm_registry.get_max_tokens_per_item_by_modality(
                 self.model_config).values())
 
-        max_num_mm_items = min(
+        max_num_mm_items_encoder_budget = min(
             self.max_num_encoder_input_tokens,
             self.encoder_cache_size) // max_tokens_per_mm_item
 
-        max_num_mm_items = min(self.max_num_reqs, max_num_mm_items)
+        max_mm_items_per_req = max(
+            self.mm_registry.get_mm_limits_per_prompt(
+                self.model_config).values())
+        max_num_mm_items_decoder_budget = self.max_num_reqs * \
+            max_mm_items_per_req
+
+        max_num_mm_items = min(max_num_mm_items_encoder_budget,
+                               max_num_mm_items_decoder_budget)
 
         # Dummy data definition in V0 may contain multiple multimodal items
         # (e.g, multiple images) for a single request, therefore here we
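The change caps the number of dummy multimodal items used during profiling by two budgets: an encoder-side budget (how many items fit in the encoder token/cache budget) and a decoder-side budget (max requests times max items per request). A minimal standalone sketch of that computation, with the function name and sample values made up for illustration (the real inputs come from vLLM's model config and multimodal registry):

```python
def compute_max_num_mm_items(
    max_num_encoder_input_tokens: int,
    encoder_cache_size: int,
    max_tokens_per_mm_item: int,
    max_num_reqs: int,
    max_mm_items_per_req: int,
) -> int:
    # Encoder budget: how many items fit in the smaller of the encoder
    # input-token budget and the encoder cache, at the worst-case
    # (largest) per-item token count.
    encoder_budget = min(max_num_encoder_input_tokens,
                         encoder_cache_size) // max_tokens_per_mm_item
    # Decoder budget: no batch can contain more items than the maximum
    # number of requests times the per-request multimodal item limit.
    decoder_budget = max_num_reqs * max_mm_items_per_req
    # Profile with the tighter of the two budgets.
    return min(encoder_budget, decoder_budget)


# Hypothetical values: 8192-token encoder budget/cache, 2048 tokens per
# image, up to 256 requests with 1 image each.
print(compute_max_num_mm_items(8192, 8192, 2048, 256, 1))  # → 4
```

With these sample numbers the encoder budget (8192 // 2048 = 4) is tighter than the decoder budget (256), so profiling would use 4 dummy items.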
