
Add support for users to specify custom request settings, model and optionally provider specific #14535

Open · wants to merge 11 commits into master
Conversation

@JonasHelming (Contributor) commented Nov 26, 2024

What it does

Add support for users to specify custom request settings, per model and optionally per provider.
The reason for making them provider specific is that providers have different options and sometimes even different names for the same setting (see the example below).
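
For illustration only, a provider-side lookup for such settings could look roughly like the sketch below. This is a hypothetical sketch, not code from this PR: the interface and function names are made up; only the entry shape (modelId, requestSettings, optional providerId) follows the example in "How to test".

    // Hypothetical sketch, not the actual Theia implementation: resolve the
    // request settings for a given model, preferring an entry that also
    // matches the provider over a provider-agnostic one.
    interface RequestSettingEntry {
        modelId: string;
        requestSettings: Record<string, unknown>;
        providerId?: string;
    }

    function resolveRequestSettings(
        entries: RequestSettingEntry[],
        modelId: string,
        providerId: string
    ): Record<string, unknown> {
        const providerSpecific = entries.find(e => e.modelId === modelId && e.providerId === providerId);
        const generic = entries.find(e => e.modelId === modelId && e.providerId === undefined);
        return (providerSpecific ?? generic)?.requestSettings ?? {};
    }

    // With the example settings below, resolveRequestSettings(entries, 'gemma2', 'ollama')
    // would return { num_predict: 1024, stop: ['<|endoftext|>'] }.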

How to test

Add the settings below and adapt them to the models you have.
Qwen/Qwen2.5-Coder-32B-Instruct is always "warm" on serverless Hugging Face.
StarCoder2.3B can be downloaded here: https://huggingface.co/Mozilla/starcoder2-llamafile/tree/main
gemma2 can be downloaded directly with Ollama (ollama serve and ollama run gemma2)

Two good test cases:

  • set the maximum length setting (for example max_new_tokens) to something small and observe the model stopping early
  • set a stop token and ask the model to say it :-)
"ai-features.modelSettings.requestSettings": [
        {
            "modelId": "Qwen/Qwen2.5-Coder-32B-Instruct",
            "requestSettings": {
                "max_new_tokens": 2048,
                "stop": [
                    "<|im_end|>",
                ]
            },
            "providerId": "huggingface"
        },
        {
            "modelId": "gemma2",
            "requestSettings": {
                "num_predict": 1024,
                "stop": [
                    "<|endoftext|>"
                ]
            },
            "providerId": "ollama"
        },
        {
            "modelId": "StarCoder2.3B",
            "requestSettings": {
                "n_predict": 200,
                "stream": true,
                "stop": [
                    "<file_sep>",
                    "<|endoftext|>"
                ],
                "cache_prompt": true,
            },
            "providerId": "llamafile"
        },
        {
            "modelId": "gpt-4o-2024-05-13",
            "requestSettings": {
                "max_tokens": 10,
                "stop": [
                    "<|im_end|>"
                ]
            },
            "providerId": "openai"
        }
    ]
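
Note how the example illustrates the naming differences mentioned above: the setting that bounds the number of generated tokens is called max_new_tokens for Hugging Face, num_predict for Ollama, n_predict for llamafile, and max_tokens for OpenAI.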

Follow-ups

This should be the last thing we add to the provider layer before we refactor it all together:

  • Make the settings more consistent
  • Remove a lot of code duplication

Review checklist

Reminder for reviewers

@JonasHelming (Contributor, Author)

@dhuebner Could you check the Ollama adaptations please?

@JonasHelming changed the title from "Gh 14526" to "Add support for users to specify custom request settings, model and optionally provider specific" on Nov 26, 2024
@JonasHelming (Contributor, Author)

See this for the documentation: eclipse-theia/theia-website#662

@dhuebner (Member)

@JonasHelming
Sure, will do.

@dhuebner (Member) commented Nov 27, 2024

@JonasHelming
Ollama works as expected!

@sdirix
I have a question on another topic: would it be possible to mark a LanguageModelRequest with an explicit flag that tells whether it is a chat or a completion request? Having this information, model implementors could decide whether to use a stream or a text response.
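
For illustration, such a flag could look roughly like the following hypothetical sketch (not part of this PR; the names are made up):

    // Hypothetical sketch only: an optional purpose flag on the request so that
    // model implementors can distinguish chat from completion requests and pick
    // a streamed or a plain text response accordingly.
    type LanguageModelRequestPurpose = 'chat' | 'completion';

    interface LanguageModelRequestWithPurpose {
        // ...existing LanguageModelRequest properties...
        purpose?: LanguageModelRequestPurpose;
    }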

@JonasHelming (Contributor, Author)

@sdirix
I have a question on another topic: would it be possible to mark a LanguageModelRequest with an explicit flag that tells whether it is a chat or a completion request? Having this information, model implementors could decide whether to use a stream or a text response.

@dhuebner I have also thought about this already; would you mind creating a new ticket and mentioning me there?

@sdirix (Member) left a comment

Thanks for the work ❤️ ! I found some inconsistencies which should be fixed before we merge.

@JonasHelming (Contributor, Author)

Thank you for the great review. I tried to address all comments (in individual commits) and tested all providers again with the final state.

Labels: None yet
Project status: Waiting on author
3 participants