Replies: 4 comments
-
I support this. Perhaps you can attempt to fork?
-
Is there anyone who wants to team up to attempt adding a context-free grammar constraint language to vLLM?
-
https://github.com/noamgat/lm-format-enforcer/tree/main
https://github.com/noamgat/lm-format-enforcer/blob/main/samples/colab_vllm_integration.ipynb
-
There are several existing approaches, such as guidance and LMQL, but they are not compatible with vLLM.
Others, like GGML BNF in llama.cpp, also look promising for constraining model output to a specific format. Without a good output-constraint method, LLM output can be unpredictable and unusable in many cases, for example producing non-JSON text when we expect the LLM to return a JSON response.
The output-constraint component is therefore very important for the development of autonomous agents and other LLM applications.
Most constraint approaches need direct access to the model to perform grammar sampling on each generated token, so it is best to embed the output-constraint component into vLLM itself.
Maybe we can gather enough people to discuss this idea and implement it in vLLM?
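To make the grammar-sampling idea concrete, here is a minimal, self-contained sketch of the core mechanism: before each decoding step, the grammar determines which token ids are legal, and the logits of all other tokens are masked to negative infinity so they can never be sampled. The toy vocabulary, the `allowed` set, and the function names here are all illustrative assumptions, not vLLM or llama.cpp APIs.

```python
import math

def mask_logits(logits, allowed_ids):
    """Set the logit of every disallowed token to -inf so it can never be
    sampled. In a real integration, allowed_ids would come from the grammar
    engine's current parse state (illustrative sketch, not a vLLM API)."""
    return [x if i in allowed_ids else -math.inf
            for i, x in enumerate(logits)]

def greedy_pick(logits):
    """One greedy decoding step: pick the highest-scoring token id."""
    return max(range(len(logits)), key=lambda i: logits[i])

# Toy vocabulary (hypothetical): 0='{', 1='}', 2='"', 3='hello'
logits = [0.1, 2.5, 0.3, 1.0]

# Unconstrained, the model would greedily pick token 1 ('}'),
# which is not valid JSON at the start of an object.
assert greedy_pick(logits) == 1

# Suppose a JSON grammar says the first token must be '{' (id 0).
masked = mask_logits(logits, {0})
assert greedy_pick(masked) == 0  # the model is forced to open the object
```

This is why the post argues the component belongs inside the inference engine: the mask must be applied to the raw logits on every step, which an external wrapper cannot do efficiently without engine support.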