Replies: 1 comment
The stateless executor does not track state between prompts. However, it does still do out-of-context handling, so that if a single prompt plus its response is larger than the context size it can handle it.
Long term, we want to replace the executors with something more flexible that allows things such as context extension to be plugged in more easily. Short term, it looks like …
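
For reference, the out-of-context handling mentioned above follows the "infinite text generation via context shifting" idea from the llama.cpp main example that the comments in `HandleRunOutOfContext` point at: protect the first n_keep tokens (e.g. the system prompt) and discard half of everything after them, so generation can continue with the most recent history intact. A minimal sketch of that bookkeeping, with made-up names and a plain token list rather than LLamaSharp's actual internals:

```csharp
// Illustrative sketch only (hypothetical names, plain token list) - this shows the shape of
// the llama.cpp "context shift" trick, not the actual LLamaSharp implementation.
using System.Collections.Generic;

static class ContextShift
{
    // Keep the first nKeep tokens (e.g. the system prompt), drop half of what follows,
    // and continue generating with the most recent half of the history still in place.
    public static List<int> Shift(List<int> evaluatedTokens, int nKeep)
    {
        int nLeft = evaluatedTokens.Count - nKeep;
        int nDiscard = nLeft / 2;

        var kept = new List<int>(evaluatedTokens.Count - nDiscard);
        kept.AddRange(evaluatedTokens.GetRange(0, nKeep));                            // protected prefix
        kept.AddRange(evaluatedTokens.GetRange(nKeep + nDiscard, nLeft - nDiscard));  // most recent tokens
        return kept;
    }
}
```

In the llama.cpp example the same effect is applied to the KV cache itself (removing the discarded cells and shifting the rest) so the kept tokens do not need to be re-evaluated; that KV-cache side of it is the part the interactive/instruct executors would need.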
When using the Interactive/Instruct executors with chat history, once the context limit is reached the error "llama_decode failed: 'NoKvSlot'" is thrown. This is discussed in issue #660.
Digging through the code, I noticed that the method "HandleRunOutOfContext" in "LlamaExecutorBase" has some comments taken from an example in llama.cpp, but does not implement the context switching or self-extension. The stateless executor does implement some of that logic, even though it is not supposed to keep track of state. Is that correct?
Are there any plans to add this functionality (infinite context/self-extension) to the interactive/instruct executors?
Is there a way to extend the KV cache with these executors?
How are we supposed to handle context overflow with these executors?
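
As a stop-gap, the overflow can at least be avoided from the application side: keep the transcript yourself and, once it no longer fits, drop the oldest turns and rebuild the context and executor, replaying the trimmed history as a fresh prompt. A rough sketch along those lines; the model path, prompt format and head-room margin are placeholders, and rebuilding this way re-evaluates the kept history from scratch:

```csharp
// Workaround sketch, not an official LLamaSharp feature: trim the chat transcript and
// rebuild the executor before the prompt outgrows the context.
using System;
using System.Collections.Generic;
using System.Text;
using System.Threading.Tasks;
using LLama;
using LLama.Common;

var parameters = new ModelParams("models/your-model.gguf") { ContextSize = 2048 };
using var weights = LLamaWeights.LoadFromFile(parameters);

var turns = new List<string>();                    // full chat transcript, oldest turn first
var context = weights.CreateContext(parameters);
var executor = new InteractiveExecutor(context);
var inferenceParams = new InferenceParams
{
    MaxTokens = 256,
    AntiPrompts = new List<string> { "User:" }
};

Console.WriteLine(await ChatAsync("Hello, what can you do?"));

async Task<string> ChatAsync(string userInput)
{
    turns.Add($"User: {userInput}\nAssistant:");
    var prompt = turns[^1];                        // normally only the new turn is fed in

    // If the whole transcript (plus room for the reply) no longer fits, trim and restart.
    int reserve = inferenceParams.MaxTokens + 64;
    if (context.Tokenize(string.Join("\n", turns)).Length + reserve > context.ContextSize)
    {
        while (turns.Count > 1 &&
               context.Tokenize(string.Join("\n", turns)).Length + reserve > context.ContextSize)
            turns.RemoveAt(0);                     // drop the oldest turn

        context.Dispose();                         // discard the old KV cache entirely
        context = weights.CreateContext(parameters);
        executor = new InteractiveExecutor(context);
        prompt = string.Join("\n", turns);         // re-evaluate the trimmed history
    }

    var reply = new StringBuilder();
    await foreach (var token in executor.InferAsync(prompt, inferenceParams))
        reply.Append(token);

    turns[^1] += reply.ToString();                 // keep the answer in the transcript
    return reply.ToString();
}
```

This throws away whatever state the interactive executor held, so it is only a workaround until proper context shifting or self-extension is available in the executors themselves.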