
Feature Implementation for issue #136: supports VLM requests for /chat/completions API #154

Merged: 13 commits merged into main on Oct 9, 2024

Conversation

zhycheng614 (Collaborator) commented Oct 8, 2024

  • Updates the existing /chat/completions API to be compatible with the OpenAI API
  • Supports streaming
  • Supports running a model from a local path, via the --local_path and --model_type flags
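Since the endpoint follows the OpenAI chat-completions shape, a VLM request would carry a mixed text/image content list. A minimal sketch of such a request body follows; the model name and image URL are placeholders, not names from this repository:

```python
import json

def build_vlm_request(model: str, prompt: str, image_url: str, stream: bool = False) -> str:
    """Build an OpenAI-style /chat/completions request body with an image part.

    Sketch only: field names follow the OpenAI chat-completions convention;
    the concrete model name accepted by the server is an assumption.
    """
    payload = {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }
        ],
        "stream": stream,
    }
    return json.dumps(payload)

# Placeholder model name and URL for illustration.
body = build_vlm_request("some-vlm-model", "What is in this image?", "https://example.com/cat.png")
```

The resulting JSON string would be POSTed to the /chat/completions endpoint; with `stream=True`, the server would return incremental chunks instead of a single response.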

@zhycheng614 zhycheng614 marked this pull request as ready for review October 8, 2024 19:29
@zhycheng614 zhycheng614 requested a review from zhiyuan8 October 8, 2024 19:29

```json
{
  "model": "anything",
```

Contributor:

Please use an actual model name; developers may want to copy this example and have it work as-is.

@@ -15,13 +15,41 @@ def run_ggml_inference(args):

```python
    if model_type:
        run_type = ModelType[model_type].value

    def choose_file(local_path, file_type):
        """Helper function for multimodal inference only: select the model and projector ggufs from the local_path."""
```
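The PR does not show the body of `choose_file`, but one plausible sketch of such a helper, assuming the common convention that projector ggufs carry "proj" in their filename, is:

```python
from pathlib import Path

def choose_file(local_path, file_type):
    """Hypothetical sketch: pick the model or projector gguf from local_path.

    file_type is assumed to be "model" or "projector"; the naming convention
    ("proj" in the projector filename) is an assumption, not taken from the PR.
    """
    ggufs = sorted(Path(local_path).glob("*.gguf"))
    projectors = [p for p in ggufs if "proj" in p.name.lower()]
    models = [p for p in ggufs if "proj" not in p.name.lower()]
    if file_type == "projector":
        return projectors[0] if projectors else None
    return models[0] if models else None
```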
Contributor:

Why don't we split the projector and model files into two arguments? We might want to add a "--proj" or "-p" flag for the projector. What if I mistakenly pass the projector in the first position?
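The reviewer's suggestion could be sketched with argparse; the flag names "--proj"/"-p" come from the comment above, while "--local_path" echoes the PR description (the exact CLI wiring in the repo may differ):

```python
import argparse

# Separate flags make it impossible to confuse the projector path
# with the model path by positional order.
parser = argparse.ArgumentParser(description="Sketch of the proposed CLI")
parser.add_argument("--local_path", help="path to the model gguf")
parser.add_argument("-p", "--proj", help="path to the projector gguf")

# Example invocation with placeholder file names.
args = parser.parse_args(["--local_path", "model.gguf", "--proj", "proj.gguf"])
```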

Contributor:

Please confirm this UX with product @alanzhuly.

@zhiyuan8 merged commit 43450f5 into main on Oct 9, 2024. 2 checks passed.