
feat: Add llama-cpp-python model #666

Merged: 4 commits merged into UKGovernmentBEIS:main on Nov 3, 2024

Conversation

domdomegg
Contributor

This PR contains:

  • New features
  • Changes to dev-tools e.g. CI config / github tooling
  • Docs
  • Bug fixes
  • Code refactor

What is the current behavior? (You can also link to an open issue here)

I wanted to test out scoring using logprob outputs for implementing MMLU. Unfortunately I couldn't do this with any of the local model providers I was testing, so I ended up using the top token rather than the logprobs (see https://github.com/UKGovernmentBEIS/inspect_evals/pull/21/files#diff-5b259e4149ade4c897cf11f70e74fd670ea0e71b6c62449c52808ef440c424f4R14-R18).
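
For context, this is the kind of logprob-based scoring I had in mind: instead of taking whichever answer token the model happens to sample, compare the log-probabilities it assigned to each answer letter. A rough sketch against any OpenAI-compatible chat endpoint (not the actual inspect_evals scorer; the base URL, model name, and question are placeholders):

```python
# Hypothetical sketch: score a multiple-choice question from token logprobs
# rather than from the single sampled token. All names here are placeholders.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")

response = client.chat.completions.create(
    model="default",
    messages=[{"role": "user", "content": "Example question here. Answer with A, B, C or D."}],
    max_tokens=1,
    logprobs=True,
    top_logprobs=20,
)

# Top-token approach: just read whatever the model sampled.
sampled = response.choices[0].message.content.strip()

# Logprob approach: compare the probabilities assigned to each answer letter.
# Note: logprobs may be None if the provider doesn't return them, which is
# exactly the gap discussed in this PR.
content = response.choices[0].logprobs.content if response.choices[0].logprobs else None
letters = {}
if content:
    letters = {
        t.token.strip(): t.logprob
        for t in content[0].top_logprobs
        if t.token.strip() in ("A", "B", "C", "D")
    }
best = max(letters, key=letters.get) if letters else sampled
print(sampled, letters, best)
```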

I realised there's no support for logprobs from any locally run LLM. This is because:

  • Ollama doesn't support logprobs (Provide logits or logprobs in the API ollama/ollama#2415)
  • the Inspect vLLM integration doesn't provide logprobs, although I think it might be possible to add this later given that vLLM supports it. (I'm not very familiar with vLLM and haven't looked into the integration deeply, so I might be wrong.)

What is the new behavior?

I added support for the llama-cpp-python OpenAI-compatible server. This is a popular (7.9k stars) library that offers an OpenAI-compatible server. It can run models locally and, unlike Ollama, can provide logprobs in its responses.*

This was fairly straightforward because llama-cpp-python follows the OpenAI spec, so the provider is essentially another thin wrapper around the OpenAI client. I added tests and was able to get them passing locally.*
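
To illustrate the end-to-end flow (a sketch only: the model path and model name are placeholders, and the "llama-cpp-python/" provider prefix follows this PR's naming, which may still change in review):

```python
# Hypothetical usage sketch. Start the OpenAI-compatible server first (shell):
#   pip install 'llama-cpp-python[server]'
#   python -m llama_cpp.server --model ./models/your-model.gguf
# Then point Inspect at it. The provider prefix and model name below follow
# this PR and may differ if the provider is renamed during review.
import asyncio

from inspect_ai.model import get_model


async def main() -> None:
    model = get_model("llama-cpp-python/default")
    output = await model.generate("Reply with the single word: hello")
    print(output.completion)


asyncio.run(main())
```

The CLI equivalent would be roughly `inspect eval <task> --model llama-cpp-python/default`, with the llama-cpp-python server running on its default port.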

Does this PR introduce a breaking change? (What changes might users need to make in their application due to this PR?)

I don't think this introduces a breaking change.

Other information:

*Unfortunately, while llama-cpp-python supports logprobs, it only does so for the (legacy) completions API, not the chat completions API that Inspect uses. This means logprob support isn't fully working yet.
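
Concretely, the gap is in which endpoint returns logprobs. A hypothetical sketch against a local llama-cpp-python server as things stand (base URL and model name are placeholders):

```python
# Sketch of the limitation described above; names are placeholders.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")

# Legacy completions endpoint: llama-cpp-python does return logprobs here.
completion = client.completions.create(
    model="default",
    prompt="The capital of France is",
    max_tokens=1,
    logprobs=5,  # ask for the top-5 alternatives per token
)
print(completion.choices[0].logprobs)

# Chat completions endpoint (what Inspect uses): without the upstream fix,
# no logprobs come back even when they are requested.
chat = client.chat.completions.create(
    model="default",
    messages=[{"role": "user", "content": "The capital of France is"}],
    max_tokens=1,
    logprobs=True,
    top_logprobs=5,
)
print(chat.choices[0].logprobs)  # None until the upstream change lands
```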

I have raised a pull request to fix this upstream: abetlen/llama-cpp-python#1788

In the meantime, this PR works fine for using llama-cpp-python, except that the logprobs functionality (the reason I created it 😅) doesn't work yet. If you check out and run the version of llama-cpp-python from my branch, everything works fully. For the sake of not confusing people, though, it might be worth keeping this in draft until the upstream PR gets merged and released.

jjallaire-aisi
Collaborator


Thank you! Just a small request to change the name to llama-cpp and everything else looks great. I will leave it up to you to decide whether we should merge now w/o logprobs (I'm fine with that) or stay in draft and wait for the upstream.

CHANGELOG.md (review comment resolved)
src/inspect_ai/_cli/eval.py (review comment resolved)
domdomegg marked this pull request as ready for review, November 2, 2024 20:50
jjallaire self-requested a review, November 3, 2024 12:44
jjallaire merged commit f1ca17f into UKGovernmentBEIS:main, Nov 3, 2024
10 checks passed