refined nexa convert logic to make it more user-friendly #226

Merged: 11 commits, Nov 10, 2024
32 changes: 19 additions & 13 deletions CLI.md
@@ -31,7 +31,7 @@ options:

### List Local Models

- List all models on your local computer.
+ List all models on your local computer. You can use `nexa run <model_name>` to run any model shown in the list.

```
nexa list
```
@@ -96,6 +96,8 @@ Run a model on your local computer. If the model file is not yet downloaded, it will be downloaded automatically.

By default, `nexa` will run gguf models. To run onnx models, use `nexa onnx MODEL_PATH`

+ You can run any model shown by the `nexa list` command.
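
For example, list your local models and then run one by name. The model name below is illustrative (it appears in Nexa's own examples); use any name from your own list:

```
nexa list
nexa run omnivision
```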

#### Run Text-Generation Model

@@ -109,9 +111,9 @@ options:

```
-h, --help show this help message and exit
-pf, --profiling Enable profiling logs for the inference process
-st, --streamlit Run the inference in Streamlit UI, can be used with -lp or -hf
- -lp, --local_path Indicate that the model path provided is the local path, must be used with -mt
+ -lp, --local_path Indicate that the model path provided is the local path
-mt, --model_type Indicate the model running type, must be used with -lp or -hf, choose from [NLP, COMPUTER_VISION, MULTIMODAL, AUDIO]
- -hf, --huggingface Load model from Hugging Face Hub, must be used with -mt
+ -hf, --huggingface Load model from Hugging Face Hub

Text generation options:
-t, --temperature TEMPERATURE
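```

For example, a minimal sketch of running a local GGUF text model with a lower sampling temperature. The file path is hypothetical; `-mt NLP` is included because `-mt` is documented as requiring `-lp` or `-hf`:

```
nexa run -lp -mt NLP ~/models/my-model.gguf -t 0.7
```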
@@ -143,9 +145,9 @@ positional arguments:

```
options:
-h, --help show this help message and exit
-st, --streamlit Run the inference in Streamlit UI, can be used with -lp or -hf
- -lp, --local_path Indicate that the model path provided is the local path, must be used with -mt
+ -lp, --local_path Indicate that the model path provided is the local path
-mt, --model_type Indicate the model running type, must be used with -lp or -hf, choose from [NLP, COMPUTER_VISION, MULTIMODAL, AUDIO]
- -hf, --huggingface Load model from Hugging Face Hub, must be used with -mt
+ -hf, --huggingface Load model from Hugging Face Hub

Image generation options:
-i2i, --img2img Whether to run image-to-image generation
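```

For example, a sketch of launching an image-generation model in the Streamlit UI. `sd1-4` is an assumed registry name; substitute any image model from `nexa list`:

```
nexa run sd1-4 -st
```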
@@ -189,9 +191,9 @@ options:

```
-h, --help show this help message and exit
-pf, --profiling Enable profiling logs for the inference process
-st, --streamlit Run the inference in Streamlit UI, can be used with -lp or -hf
- -lp, --local_path Indicate that the model path provided is the local path, must be used with -mt
+ -lp, --local_path Indicate that the model path provided is the local path
-mt, --model_type Indicate the model running type, must be used with -lp or -hf, choose from [NLP, COMPUTER_VISION, MULTIMODAL, AUDIO]
- -hf, --huggingface Load model from Hugging Face Hub, must be used with -mt
+ -hf, --huggingface Load model from Hugging Face Hub

VLM generation options:
-t, --temperature TEMPERATURE
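```

For example, a sketch of running a local multimodal model by path. The path is hypothetical, and `-mt MULTIMODAL` pairs with `-lp` as documented above:

```
nexa run -lp -mt MULTIMODAL ~/models/my-vlm.gguf
```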
@@ -223,9 +225,9 @@ positional arguments:

```
options:
-h, --help show this help message and exit
-st, --streamlit Run the inference in Streamlit UI, can be used with -lp or -hf
- -lp, --local_path Indicate that the model path provided is the local path, must be used with -mt
+ -lp, --local_path Indicate that the model path provided is the local path
-mt, --model_type Indicate the model running type, must be used with -lp or -hf, choose from [NLP, COMPUTER_VISION, MULTIMODAL, AUDIO]
- -hf, --huggingface Load model from Hugging Face Hub, must be used with -mt
+ -hf, --huggingface Load model from Hugging Face Hub

Automatic Speech Recognition options:
-b, --beam_size BEAM_SIZE
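```

For example, a sketch of transcribing with a wider beam. `faster-whisper-tiny` is an assumed model name; any ASR model from `nexa list` works the same way:

```
nexa run faster-whisper-tiny -b 5
```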
@@ -257,8 +259,8 @@ positional arguments:

```

options:
-h, --help show this help message and exit
- -lp, --local_path Indicate that the model path provided is the local path, must be used with -mt
- -hf, --huggingface Load model from Hugging Face Hub, must be used with -mt
+ -lp, --local_path Indicate that the model path provided is the local path
+ -hf, --huggingface Load model from Hugging Face Hub
-n, --normalize Normalize the embeddings
-nt, --no_truncate Do not truncate the embeddings
```
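
For example, a sketch that generates a normalized embedding with the `-n` flag, reusing the sentence-transformers GGUF model tag from this section:

```
nexa embed sentence-transformers/all-MiniLM-L6-v2:gguf-fp16 "I love Nexa AI." -n
```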
@@ -274,6 +276,10 @@ nexa embed sentence-transformers/all-MiniLM-L6-v2:gguf-fp16 "I love Nexa AI." >> …

### Convert and quantize a Hugging Face Model to GGUF

+ The additional package `nexa-gguf` is required to run this command.
+
+ You can install it with `pip install "nexaai[convert]"` or `pip install nexa-gguf`.

```
nexa convert HF_MODEL_PATH [ftype] [output_file]
usage: nexa convert [-h] [-t NTHREAD] [--convert_type CONVERT_TYPE] [--bigendian] [--use_temp_file] [--no_lazy]
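```

For example, a sketch of installing the extra and quantizing a Hugging Face model. The repo id is illustrative, and `q4_0` assumes llama.cpp-style ftype names:

```
pip install "nexaai[convert]"
nexa convert meta-llama/Llama-3.2-1B-Instruct q4_0
```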
@@ -342,9 +348,9 @@ positional arguments:

```

options:
-h, --help show this help message and exit
- -lp, --local_path Indicate that the model path provided is the local path, must be used with -mt
+ -lp, --local_path Indicate that the model path provided is the local path
-mt, --model_type Indicate the model running type, must be used with -lp or -hf, choose from [NLP, COMPUTER_VISION, MULTIMODAL, AUDIO]
- -hf, --huggingface Load model from Hugging Face Hub, must be used with -mt
+ -hf, --huggingface Load model from Hugging Face Hub
--host HOST Host to bind the server to
--port PORT Port to bind the server to
--reload Enable automatic reloading on code changes
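```

For example, a sketch of serving a registry model on a custom host and port. The model name is illustrative:

```
nexa server llama3.2 --host 127.0.0.1 --port 8000
```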
8 changes: 5 additions & 3 deletions README.md
@@ -4,7 +4,7 @@

[![MacOS][MacOS-image]][release-url] [![Linux][Linux-image]][release-url] [![Windows][Windows-image]][release-url]

[![GitHub Release](https://img.shields.io/github/v/release/NexaAI/nexa-sdk)](https://github.com/NexaAI/nexa-sdk/releases/latest) [![Build workflow](https://img.shields.io/github/actions/workflow/status/NexaAI/nexa-sdk/ci.yaml?label=CI&logo=github)](https://github.com/NexaAI/nexa-sdk/actions/workflows/ci.yaml?query=branch%3Amain) ![GitHub License](https://img.shields.io/github/license/NexaAI/nexa-sdk)

[![](https://img.shields.io/endpoint?url=https%3A%2F%2Fswiftpackageindex.com%2Fapi%2Fpackages%2FNexaAI%2Fnexa-sdk%2Fbadge%3Ftype%3Dswift-versions)](https://swiftpackageindex.com/NexaAI/nexa-sdk) [![](https://img.shields.io/endpoint?url=https%3A%2F%2Fswiftpackageindex.com%2Fapi%2Fpackages%2FNexaAI%2Fnexa-sdk%2Fbadge%3Ftype%3Dplatforms)](https://swiftpackageindex.com/NexaAI/nexa-sdk)

@@ -26,6 +26,7 @@ Nexa SDK is a comprehensive toolkit for supporting **ONNX** and **GGML** models.
<video src="https://user-images.githubusercontent.com/assets/375570dc-0e7a-4a99-840d-c1ef6502e5aa.mp4" autoplay muted loop playsinline style="max-width: 100%;"></video>

## Latest News 🔥

- [2024/11] Support Nexa AI's own vision language model (0.9B parameters): `nexa run omnivision` and audio language model (2.9B): `nexa run omniaudio`
- [2024/11] Support audio language model: `nexa run qwen2audio`, **we are the first open-source toolkit to support audio language models with the GGML tensor library.**
- [2024/10] Support embedding model: `nexa embed <model_path> <prompt>`
@@ -84,8 +85,9 @@ We have released pre-built wheels for various Python versions, platforms, and backends.
> [!NOTE]
>
> 1. If you want to use <strong>ONNX model</strong>, just replace `pip install nexaai` with `pip install "nexaai[onnx]"` in provided commands.
- > 2. If you want to convert and quantize huggingface models to GGUF models, just replace `pip install nexaai` with `pip install "nexaai[nexa-gguf]"`.
- > 3. For Chinese developers, we recommend you to use <strong>Tsinghua Open Source Mirror</strong> as extra index url, just replace `--extra-index-url https://pypi.org/simple` with `--extra-index-url https://pypi.tuna.tsinghua.edu.cn/simple` in provided commands.
+ > 2. If you want to <strong>run benchmark evaluation</strong>, just replace `pip install nexaai` with `pip install "nexaai[eval]"` in provided commands.
+ > 3. If you want to <strong>convert and quantize huggingface models to GGUF models</strong>, just replace `pip install nexaai` with `pip install "nexaai[nexa-gguf]"` in provided commands.
+ > 4. For Chinese developers, we recommend you to use <strong>Tsinghua Open Source Mirror</strong> as extra index url, just replace `--extra-index-url https://pypi.org/simple` with `--extra-index-url https://pypi.tuna.tsinghua.edu.cn/simple` in provided commands.
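
For example, combining points 3 and 4 of the note above into one command (a sketch; add any backend-specific index URL your platform needs):

```
pip install "nexaai[nexa-gguf]" --extra-index-url https://pypi.tuna.tsinghua.edu.cn/simple
```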

#### CPU

4 changes: 2 additions & 2 deletions SERVER.md
@@ -8,9 +8,9 @@ usage: nexa server [-h] [--host HOST] [--port PORT] [--reload] model_path

### Options:

- - `-lp, --local_path`: Indicate that the model path provided is the local path, must be used with -mt
+ - `-lp, --local_path`: Indicate that the model path provided is the local path
- `-mt, --model_type`: Indicate the model running type, must be used with -lp or -hf, choose from [NLP, COMPUTER_VISION, MULTIMODAL, AUDIO]
- - `-hf, --huggingface`: Load model from Hugging Face Hub, must be used with -mt
+ - `-hf, --huggingface`: Load model from Hugging Face Hub
- `--host`: Host to bind the server to
- `--port`: Port to bind the server to
- `--reload`: Enable automatic reloading on code changes
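
For example, a sketch of serving a local GGUF model (hypothetical path) with auto-reload enabled:

```
nexa server -lp -mt NLP ~/models/my-model.gguf --host 0.0.0.0 --port 8000 --reload
```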