Merge pull request #226 from NexaAI/perry/convert-and-quantize
refined nexa convert logic to make it more user friendly
zhiyuan8 authored Nov 10, 2024
2 parents 568107f + 44d2aa5 commit 4580133
Showing 6 changed files with 152 additions and 62 deletions.
32 changes: 19 additions & 13 deletions CLI.md
@@ -31,7 +31,7 @@ options:

### List Local Models

- List all models on your local computer.
+ List all models on your local computer. You can use `nexa run <model_name>` to run any model shown in the list.

```
nexa list
@@ -96,6 +96,8 @@ Run a model on your local computer. If the model file is not yet downloaded, it

By default, `nexa` will run gguf models. To run onnx models, use `nexa onnx MODEL_PATH`

+ You can run any model shown in the `nexa list` command.
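For example (the model name here is hypothetical; any name that `nexa list` prints should work the same way):

```
nexa list          # shows local models, e.g. llama3.2
nexa run llama3.2  # run a model straight from that list
```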

#### Run Text-Generation Model

```
@@ -109,9 +111,9 @@ options:
-h, --help show this help message and exit
-pf, --profiling Enable profiling logs for the inference process
-st, --streamlit Run the inference in Streamlit UI, can be used with -lp or -hf
- -lp, --local_path Indicate that the model path provided is the local path, must be used with -mt
+ -lp, --local_path Indicate that the model path provided is the local path
-mt, --model_type Indicate the model running type, must be used with -lp or -hf, choose from [NLP, COMPUTER_VISION, MULTIMODAL, AUDIO]
- -hf, --huggingface Load model from Hugging Face Hub, must be used with -mt
+ -hf, --huggingface Load model from Hugging Face Hub
Text generation options:
-t, --temperature TEMPERATURE
@@ -143,9 +145,9 @@ positional arguments:
options:
-h, --help show this help message and exit
-st, --streamlit Run the inference in Streamlit UI, can be used with -lp or -hf
- -lp, --local_path Indicate that the model path provided is the local path, must be used with -mt
+ -lp, --local_path Indicate that the model path provided is the local path
-mt, --model_type Indicate the model running type, must be used with -lp or -hf, choose from [NLP, COMPUTER_VISION, MULTIMODAL, AUDIO]
- -hf, --huggingface Load model from Hugging Face Hub, must be used with -mt
+ -hf, --huggingface Load model from Hugging Face Hub
Image generation options:
-i2i, --img2img Whether to run image-to-image generation
@@ -189,9 +191,9 @@ options:
-h, --help show this help message and exit
-pf, --profiling Enable profiling logs for the inference process
-st, --streamlit Run the inference in Streamlit UI, can be used with -lp or -hf
- -lp, --local_path Indicate that the model path provided is the local path, must be used with -mt
+ -lp, --local_path Indicate that the model path provided is the local path
-mt, --model_type Indicate the model running type, must be used with -lp or -hf, choose from [NLP, COMPUTER_VISION, MULTIMODAL, AUDIO]
- -hf, --huggingface Load model from Hugging Face Hub, must be used with -mt
+ -hf, --huggingface Load model from Hugging Face Hub
VLM generation options:
-t, --temperature TEMPERATURE
@@ -223,9 +225,9 @@ positional arguments:
options:
-h, --help show this help message and exit
-st, --streamlit Run the inference in Streamlit UI, can be used with -lp or -hf
- -lp, --local_path Indicate that the model path provided is the local path, must be used with -mt
+ -lp, --local_path Indicate that the model path provided is the local path
-mt, --model_type Indicate the model running type, must be used with -lp or -hf, choose from [NLP, COMPUTER_VISION, MULTIMODAL, AUDIO]
- -hf, --huggingface Load model from Hugging Face Hub, must be used with -mt
+ -hf, --huggingface Load model from Hugging Face Hub
Automatic Speech Recognition options:
-b, --beam_size BEAM_SIZE
@@ -257,8 +259,8 @@ positional arguments:
options:
-h, --help show this help message and exit
- -lp, --local_path Indicate that the model path provided is the local path, must be used with -mt
- -hf, --huggingface Load model from Hugging Face Hub, must be used with -mt
+ -lp, --local_path Indicate that the model path provided is the local path
+ -hf, --huggingface Load model from Hugging Face Hub
-n, --normalize Normalize the embeddings
-nt, --no_truncate Do not truncate the embeddings
```
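For instance, the `-lp` and `-hf` flags documented above could be exercised like this (the file path and repo ID are hypothetical, a sketch rather than verified CLI output):

```
# embed with a local GGUF file
nexa embed -lp /path/to/all-MiniLM-L6-v2.gguf "I love Nexa AI."

# embed with a model loaded from the Hugging Face Hub
nexa embed -hf sentence-transformers/all-MiniLM-L6-v2 "I love Nexa AI."
```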
@@ -274,6 +276,10 @@ nexa embed sentence-transformers/all-MiniLM-L6-v2:gguf-fp16 "I love Nexa AI." >>

### Convert and quantize a Hugging Face Model to GGUF

+ The additional package `nexa-gguf` is required to run this command.

+ You can install it with `pip install "nexaai[convert]"` or `pip install nexa-gguf`.
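A typical end-to-end invocation might look roughly like this (the model ID, quant type, and output filename are illustrative, not taken from this commit):

```
pip install "nexaai[convert]"
nexa convert Qwen/Qwen2.5-0.5B-Instruct q4_0 qwen2.5-0.5b-q4_0.gguf
```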

```
nexa convert HF_MODEL_PATH [ftype] [output_file]
usage: nexa convert [-h] [-t NTHREAD] [--convert_type CONVERT_TYPE] [--bigendian] [--use_temp_file] [--no_lazy]
@@ -342,9 +348,9 @@ positional arguments:
options:
-h, --help show this help message and exit
- -lp, --local_path Indicate that the model path provided is the local path, must be used with -mt
+ -lp, --local_path Indicate that the model path provided is the local path
-mt, --model_type Indicate the model running type, must be used with -lp or -hf, choose from [NLP, COMPUTER_VISION, MULTIMODAL, AUDIO]
- -hf, --huggingface Load model from Hugging Face Hub, must be used with -mt
+ -hf, --huggingface Load model from Hugging Face Hub
--host HOST Host to bind the server to
--port PORT Port to bind the server to
--reload Enable automatic reloading on code changes
8 changes: 5 additions & 3 deletions README.md
@@ -4,7 +4,7 @@

[![MacOS][MacOS-image]][release-url] [![Linux][Linux-image]][release-url] [![Windows][Windows-image]][release-url]

- [![GitHub Release](https://img.shields.io/github/v/release/NexaAI/nexa-sdk)](https://github.com/NexaAI/nexa-sdk/releases/latest) [![Build workflow](https://img.shields.io/github/actions/workflow/status/NexaAI/nexa-sdk/ci.yaml?label=CI&logo=github)](https://github.com/NexaAI/nexa-sdk/actions/workflows/ci.yaml?query=branch%3Amain) ![GitHub License](https://img.shields.io/github/license/NexaAI/nexa-sdk)
+ [![GitHub Release](https://img.shields.io/github/v/release/NexaAI/nexa-sdk)](https://github.com/NexaAI/nexa-sdk/releases/latest) [![Build workflow](https://img.shields.io/github/actions/workflow/status/NexaAI/nexa-sdk/ci.yaml?label=CI&logo=github)](https://github.com/NexaAI/nexa-sdk/actions/workflows/ci.yaml?query=branch%3Amain) ![GitHub License](https://img.shields.io/github/license/NexaAI/nexa-sdk)

[![](https://img.shields.io/endpoint?url=https%3A%2F%2Fswiftpackageindex.com%2Fapi%2Fpackages%2FNexaAI%2Fnexa-sdk%2Fbadge%3Ftype%3Dswift-versions)](https://swiftpackageindex.com/NexaAI/nexa-sdk) [![](https://img.shields.io/endpoint?url=https%3A%2F%2Fswiftpackageindex.com%2Fapi%2Fpackages%2FNexaAI%2Fnexa-sdk%2Fbadge%3Ftype%3Dplatforms)](https://swiftpackageindex.com/NexaAI/nexa-sdk)

@@ -26,6 +26,7 @@ Nexa SDK is a comprehensive toolkit for supporting **ONNX** and **GGML** models.
<video src="https://user-images.githubusercontent.com/assets/375570dc-0e7a-4a99-840d-c1ef6502e5aa.mp4" autoplay muted loop playsinline style="max-width: 100%;"></video>

## Latest News 🔥

- [2024/11] Support Nexa AI's own vision language model (0.9B parameters): `nexa run omnivision` and audio language model (2.9B): `nexa run omniaudio`
- [2024/11] Support audio language model: `nexa run qwen2audio`, **we are the first open-source toolkit to support audio language model with GGML tensor library.**
- [2024/10] Support embedding model: `nexa embed <model_path> <prompt>`
@@ -84,8 +85,9 @@ We have released pre-built wheels for various Python versions, platforms, and ba
> [!NOTE]
>
> 1. If you want to use <strong>ONNX model</strong>, just replace `pip install nexaai` with `pip install "nexaai[onnx]"` in provided commands.
- > 2. If you want to convert and quantize huggingface models to GGUF models, just replace `pip install nexaai` with `pip install "nexaai[nexa-gguf]"`.
- > 3. For Chinese developers, we recommend you to use <strong>Tsinghua Open Source Mirror</strong> as extra index url, just replace `--extra-index-url https://pypi.org/simple` with `--extra-index-url https://pypi.tuna.tsinghua.edu.cn/simple` in provided commands.
+ > 2. If you want to <strong>run benchmark evaluation</strong>, just replace `pip install nexaai` with `pip install "nexaai[eval]"` in provided commands.
+ > 3. If you want to <strong>convert and quantize huggingface models to GGUF models</strong>, just replace `pip install nexaai` with `pip install "nexaai[nexa-gguf]"` in provided commands.
+ > 4. For Chinese developers, we recommend you to use <strong>Tsinghua Open Source Mirror</strong> as extra index url, just replace `--extra-index-url https://pypi.org/simple` with `--extra-index-url https://pypi.tuna.tsinghua.edu.cn/simple` in provided commands.
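Combining notes 3 and 4, a GGUF-capable install through the Tsinghua mirror would look roughly like this (the backend-specific wheel index from the install matrix may also be needed):

```
pip install "nexaai[nexa-gguf]" --extra-index-url https://pypi.tuna.tsinghua.edu.cn/simple
```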
#### CPU

4 changes: 2 additions & 2 deletions SERVER.md
@@ -8,9 +8,9 @@ usage: nexa server [-h] [--host HOST] [--port PORT] [--reload] model_path

### Options:

- - `-lp, --local_path`: Indicate that the model path provided is the local path, must be used with -mt
+ - `-lp, --local_path`: Indicate that the model path provided is the local path
- `-mt, --model_type`: Indicate the model running type, must be used with -lp or -hf, choose from [NLP, COMPUTER_VISION, MULTIMODAL, AUDIO]
- - `-hf, --huggingface`: Load model from Hugging Face Hub, must be used with -mt
+ - `-hf, --huggingface`: Load model from Hugging Face Hub
- `--host`: Host to bind the server to
- `--port`: Port to bind the server to
- `--reload`: Enable automatic reloading on code changes
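A minimal sketch of starting the server with these options (the local path and model type are hypothetical):

```
nexa server -lp -mt NLP /path/to/llama3.2-1b-q4_0.gguf --host 127.0.0.1 --port 8000
```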