how to achieve the Guidance acceleration #515

Answered by slundberg
xujli asked this question in Q&A
It works by sending tokens to the compute backend in batches whenever we can. How this batching happens is specific to the backend, but the if statement that detects whether the next token is forced (and so can be batched) is at:

if is_forced:
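
To make the idea concrete, here is a toy sketch of the batching, not Guidance's actual code. The names `CountingBackend`, `run_template`, and the list-based template format are all hypothetical, invented for illustration. Forced tokens are queued without calling the backend, and the whole queue is flushed in a single call only when the model really has to choose a token.

```python
from typing import List, Optional


class CountingBackend:
    """Hypothetical stand-in for a compute backend; it only counts calls."""

    def __init__(self) -> None:
        self.calls = 0

    def forward(self, context: List[str], new_tokens: List[str]) -> str:
        """Consume new_tokens in one batched call and return a dummy next token."""
        self.calls += 1
        context.extend(new_tokens)
        return "<gen>"


def run_template(template: List[Optional[str]], backend: CountingBackend) -> List[str]:
    """Walk a template where str = forced token and None = model-chosen token."""
    context: List[str] = []
    pending: List[str] = []  # tokens not yet sent to the backend
    for slot in template:
        is_forced = slot is not None
        if is_forced:
            # The token is already known, so no backend call is needed yet.
            pending.append(slot)
        else:
            # Flush every queued token in one batched call, then queue the
            # token the backend picked for this free slot.
            sampled = backend.forward(context, pending)
            pending = [sampled]
    # Trailing tokens need no prediction, so they are appended without a call.
    context.extend(pending)
    return context


backend = CountingBackend()
template = ["The", "answer", "is", None, ".", "Next", ":", None]
tokens = run_template(template, backend)
print(len(tokens), "tokens,", backend.calls, "backend calls")
```

In this sketch the eight-token template costs two backend calls, one per free slot, where a token-by-token loop would make a call for every token. A real backend would also reuse cached attention state between calls, which this toy does not model.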

hope that helps!

Answer selected by slundberg
This discussion was converted from issue #503 on December 07, 2023 18:15.