Hi,
I'm developing a client application with LLamaSharp and .NET 8 and found behavior that does not meet my expectations.
When I run a session with two identical questions, I always get the same answers for both questions. This is what I expect when I use the same parameters and, in particular, the same seed.
But when I first ask question 1, save the session state, create a new session, load the saved state, and then ask question 2, I do not get the same answer as in the first step. I would expect to get the same result after saving/restoring the session, given the same questions and the same parameters.
I modified the LoadAndSaveState sample to show this behavior.
I used llama2 7b q5km chat and llama2 13b q5km chat for my tests.
Regards
Tomo
Reproduction Steps
using LLama.Common;

namespace LLama.Examples.Examples
{
    // This example shows how to save/load the state of the executor.
    public class LoadAndSaveState
    {
        public static async Task Run()
        {
            string modelPath = @"...llama-2-7b-chat.Q5_K_M.gguf";
            var promptSea = "What is the color of the sea?";
            var promptSky = "What is the color of the sky?";

            var parameters = new ModelParams(modelPath)
            {
                Seed = 1337,
                GpuLayerCount = 20
            };
            var inferenceParams = new InferenceParams() { Temperature = 0.6f, AntiPrompts = new List<string> { "User:" } };

            // Both questions in one session.
            using (var model1 = await LLamaWeights.LoadFromFileAsync(parameters))
            {
                using (var context1 = model1.CreateContext(parameters))
                {
                    var ex1 = new InteractiveExecutor(context1);

                    Console.WriteLine(promptSea);
                    await foreach (var text in ex1.InferAsync(promptSea, inferenceParams))
                    {
                        Console.Write(text);
                    }
                    Console.WriteLine();

                    Console.WriteLine(promptSky);
                    // Answer for question 2.
                    await foreach (var text in ex1.InferAsync(promptSky, inferenceParams))
                    {
                        Console.Write(text);
                    }
                    Console.WriteLine();
                }
            }

            var modelStatePath = @"...modelState.bin";
            var executorStatePath = @"...executorState.bin";

            // Only question 1, then save the state.
            using (var model2 = await LLamaWeights.LoadFromFileAsync(parameters))
            {
                using (var context2 = model2.CreateContext(parameters))
                {
                    var ex2 = new InteractiveExecutor(context2);

                    Console.WriteLine(promptSea);
                    await foreach (var text in ex2.InferAsync(promptSea, inferenceParams))
                    {
                        Console.Write(text);
                    }
                    Console.WriteLine();

                    ex2.Context.SaveState(modelStatePath);
                    await ex2.SaveState(executorStatePath);
                }
            }

            // Load the state and ask question 2. The answer is not the same
            // as when both questions ran in one session.
            using (var model3 = await LLamaWeights.LoadFromFileAsync(parameters))
            {
                using (var context3 = model3.CreateContext(parameters))
                {
                    var ex3 = new InteractiveExecutor(context3);
                    var ctx3 = ex3.Context;
                    ctx3.LoadState(modelStatePath);
                    ex3 = new InteractiveExecutor(ctx3);
                    await ex3.LoadState(executorStatePath);

                    Console.WriteLine(promptSky);
                    // Answer for question 2.
                    await foreach (var text in ex3.InferAsync(promptSky, inferenceParams))
                    {
                        Console.Write(text);
                    }
                }
            }
        }
    }
}
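A guess on my side, not verified against the LLamaSharp internals: if SaveState/LoadState capture the KV cache and executor history but the sampling RNG is reseeded when the new context is created, the restored session would sample from the same position as a fresh session rather than continuing where session 1 left off. The effect can be illustrated with plain .NET, independent of LLamaSharp:

using System;
using System.Diagnostics;

// Illustration only: a seeded RNG whose position is NOT part of the saved state.
// "Session 1" draws twice from one Random instance; the "restored" session
// creates a fresh Random with the same seed, so its first draw matches
// session 1's FIRST draw, not the draw that would have come next.
var session1 = new Random(1337);
int answer1 = session1.Next(); // consumed while answering question 1
int answer2 = session1.Next(); // answer to question 2 in the same session

var restored = new Random(1337); // "restore" by reseeding only
int answer2AfterRestore = restored.Next();

Debug.Assert(answer2AfterRestore == answer1); // restored RNG starts over
Console.WriteLine(answer2 == answer2AfterRestore
    ? "same answer after restore"
    : "different answer after restore"); // prints "different answer after restore"

If that is the cause, identical seeds and parameters would not be enough for the restored session to reproduce session 1's second answer.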
Environment & Configuration
Operating system: Windows 10
.NET runtime version: 8.0
LLamaSharp version: 0.14.0
CUDA version (if you are using cuda backend): 12.3
Known Workarounds
None