
Add instructions for using this app on a system with no GPU #19

Open

jmatthiesen opened this issue Jul 10, 2024 · 2 comments

jmatthiesen (Contributor) commented Jul 10, 2024

This sample works best on a system with a GPU, but it can run on a system without one if necessary. As we work on setup instructions for this app, we should include details on getting it working on GPU-less systems. The following are the steps I've had to take so far; I'll keep adding to this list until all the sample use cases work:

1. Update the Ollama bootstrapping code so it does not use a GPU. In the AppHost project's Program.cs, change this code (a fuller sketch of the file follows the snippet):
var chatCompletion = builder.AddOllama("chatcompletion").WithDataVolume();

to

var chatCompletion = builder.AddOllama("chatcompletion", enableGpu: false).WithDataVolume();
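For context, here is a minimal sketch of how the full AppHost Program.cs might look after this change; the builder boilerplate around the AddOllama call is an assumption based on the usual Aspire AppHost template, not copied from the sample:

var builder = DistributedApplication.CreateBuilder(args);

// enableGpu: false keeps the Ollama container from requesting GPU devices
// from Docker, so it can start on a host with no (supported) GPU.
var chatCompletion = builder.AddOllama("chatcompletion", enableGpu: false)
    .WithDataVolume();

builder.Build().Run();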
2. In the PythonInference project, change requirements.txt to use versions of the Torch libraries without CUDA (a quick way to verify the install follows the snippet). Change it from:
--extra-index-url https://download.pytorch.org/whl/cu118
torch==2.3.1+cu118
torchaudio==2.3.1+cu118
torchvision==0.18.1+cu118

to:

torch==2.3.1
torchaudio==2.3.1
torchvision==0.18.1
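After reinstalling the requirements, it's worth confirming that the CPU-only wheels are actually the ones in use. A quick check, assuming it runs inside the PythonInference project's environment:

import torch

# CPU-only wheels report a plain version string (no "+cu118" suffix),
# ship without a compiled-in CUDA runtime, and see no GPU.
print(torch.__version__)          # expected: 2.3.1, not 2.3.1+cu118
print(torch.version.cuda)         # expected: None on CPU-only builds
print(torch.cuda.is_available())  # expected: False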
3. Change the PythonInference project so that it does not use CUDA when loading models (an explicit-CPU variant follows the snippet). In the routers/classifier.py file, change this line from:
classifier = pipeline('zero-shot-classification', model='cross-encoder/nli-MiniLM2-L6-H768', device='cuda')

to:

classifier = pipeline('zero-shot-classification', model='cross-encoder/nli-MiniLM2-L6-H768')
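Leaving device unset lets transformers fall back to the CPU. If you'd rather make that explicit, the pipeline API also accepts device=-1 (or device='cpu') to pin inference to the CPU; a sketch of that variant:

from transformers import pipeline

# device=-1 is the transformers convention for CPU; on a machine with
# no usable GPU this is equivalent to omitting the argument.
classifier = pipeline(
    'zero-shot-classification',
    model='cross-encoder/nli-MiniLM2-L6-H768',
    device=-1,
)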
4. When running without a GPU, responses from the models will be slower, and the default timeout settings are not sufficient. I had to update the Extensions.cs file in my ServiceDefaults project to increase the standard resilience timeouts. The following worked for me, though some systems may need different values (a sketch of the surrounding file follows the snippet). In Extensions.AddServiceDefaults, change the StandardResilienceHandler registration from:
http.AddStandardResilienceHandler();

to:

http.AddStandardResilienceHandler(options =>
{
    options.AttemptTimeout = new HttpTimeoutStrategyOptions
    {
        Timeout = TimeSpan.FromMinutes(10)
    };
    options.TotalRequestTimeout = new HttpTimeoutStrategyOptions
    {
        Timeout = TimeSpan.FromMinutes(10)
    };
    options.CircuitBreaker.SamplingDuration = TimeSpan.FromMinutes(20);
});
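For reference, here is a sketch of where that registration sits in Extensions.cs; the surrounding method shape is an assumption based on the standard Aspire service-defaults template, not copied from the sample:

using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Hosting;
using Microsoft.Extensions.Http.Resilience;

public static class Extensions
{
    public static IHostApplicationBuilder AddServiceDefaults(this IHostApplicationBuilder builder)
    {
        builder.Services.ConfigureHttpClientDefaults(http =>
        {
            http.AddStandardResilienceHandler(options =>
            {
                // CPU-only inference can take several minutes per request.
                options.AttemptTimeout = new HttpTimeoutStrategyOptions
                {
                    Timeout = TimeSpan.FromMinutes(10)
                };
                options.TotalRequestTimeout = new HttpTimeoutStrategyOptions
                {
                    Timeout = TimeSpan.FromMinutes(10)
                };

                // The options are validated so that the circuit breaker's
                // sampling window is at least twice the attempt timeout,
                // hence 20 minutes here.
                options.CircuitBreaker.SamplingDuration = TimeSpan.FromMinutes(20);
            });
        });

        return builder;
    }
}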
@kannan-cidc commented

@jmatthiesen thanks for this solution; it helped a lot.

@PureKrome commented

The README lists an NVIDIA GPU as a prerequisite. What about AMD GPUs, like the Radeon RX 7700S? Are AMD GPUs supported?
