Understand the Phi-3 Speed of Inference #20

Open
gordonwatts opened this issue Jun 29, 2024 · 2 comments
Comments

@gordonwatts
Owner

Phi-3 inference runs very slowly and the GPU is not used efficiently. Why? How can we fix that?

@gordonwatts gordonwatts added the enhancement New feature or request label Jun 29, 2024
@gordonwatts gordonwatts self-assigned this Jun 29, 2024
@gordonwatts
Owner Author

On the 3080 at home, things run very fast and the GPU is "fully" used. On the laptop, which also has a GPU, things do not run quickly.
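To put numbers on the difference between the two machines, a small timing harness could help. The sketch below is only an illustration: `tokens_per_second` and `fake_generate` are hypothetical names, not part of this project, and `fake_generate` is a stand-in that should be replaced by a wrapper around the real Phi-3 call that returns the list of newly generated token ids (fetching the ids on the CPU side also means any pending GPU work has finished before the clock stops).

```python
import time
from typing import Callable, Sequence


def tokens_per_second(
    generate: Callable[[str], Sequence],
    prompt: str,
    warmup: int = 1,
    runs: int = 3,
) -> float:
    """Average number of generated tokens per second over `runs` calls.

    `generate` is assumed to take a prompt and return the sequence of
    newly generated tokens. Warm-up calls are not timed, so one-off costs
    such as loading weights do not distort the result.
    """
    for _ in range(warmup):
        generate(prompt)

    total_tokens = 0
    start = time.perf_counter()
    for _ in range(runs):
        total_tokens += len(generate(prompt))
    elapsed = time.perf_counter() - start
    return total_tokens / elapsed


def fake_generate(prompt: str) -> list:
    """Hypothetical stand-in for the model: 20 tokens after a short delay."""
    time.sleep(0.01)
    return list(range(20))


if __name__ == "__main__":
    rate = tokens_per_second(fake_generate, "Hello")
    print(f"{rate:.1f} tokens/s")
```

Running the same script with the same prompt on the 3080 and on the laptop would give two directly comparable numbers to record in this issue.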

@gordonwatts
Owner Author

The other thing to do is check whether there are other ways to load the weights, and whether that helps (the GGUF format, for example).
