rebuilding site Wed Jan 10 09:44:22 EST 2024
insujang committed Jan 10, 2024
1 parent 26c03ee commit 1cc87d7
Showing 1 changed file with 1 addition and 1 deletion.
@@ -589,7 +589,7 @@ <h2 id="preemption-with-page-miss" class="relative group">Preemption with Page M
<span class="k">else</span><span class="p">:</span>
<span class="c1"># Decoding run.</span>
<span class="o">...</span>
-</code></pre></div><p>Because query, key, and cache arguments include a batched input, all inputs should be either prompt or decode, and cannot be coalesced.
+</code></pre></div><p>Because query, key, and value arguments include a batched input, all inputs should be either prompt or decode, and cannot be coalesced.
This is also verified in <a href="https://github.com/vllm-project/vllm/blob/v0.2.7/vllm/worker/model_runner.py#L331-L340" target="_blank">Model Runner</a>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-python" data-lang="python"><span class="k">def</span> <span class="nf">prpare_input_tensors</span><span class="p">(</span><span class="o">...</span><span class="p">):</span>
<span class="c1"># NOTE: We assume that all sequences in the group are all prompts or</span>