diff --git a/README.md b/README.md
index 330fe9ee..0e6b50db 100644
--- a/README.md
+++ b/README.md
@@ -20,6 +20,7 @@
 - **Local UI:** Streamlit for interactive model deployment and testing
 
 ## Latest News 🔥
+
 - Support Nexa AI's own vision language model (0.9B parameters): `nexa run omnivision`, and audio language model (2.9B parameters): `nexa run omniaudio`
 - Support audio language models: `nexa run qwen2audio`; **we are the first open-source toolkit to support audio language models with the GGML tensor library.**
 - Support iOS Swift bindings for local inference on **iOS mobile** devices.
@@ -32,13 +33,13 @@ Welcome to submit your requests through [issues](https://github.com/NexaAI/nexa-
 
 ## Install Option 1: Executable Installer

-
+    macOS Installer

-
+    Windows Installer

@@ -205,18 +206,18 @@ pip install -e .
 
 Below is our differentiation from other similar tools:
 
-| **Feature** | **[Nexa SDK](https://github.com/NexaAI/nexa-sdk)** | **[ollama](https://github.com/ollama/ollama)** | **[Optimum](https://github.com/huggingface/optimum)** | **[LM Studio](https://github.com/lmstudio-ai)** |
-| -------------------------- | :------------------------------------------------: | :--------------------------------------------: | :---------------------------------------------------: | :---------------------------------------------: |
-| **GGML Support** | ✅ | ✅ | ❌ | ✅ |
-| **ONNX Support** | ✅ | ❌ | ✅ | ❌ |
-| **Text Generation** | ✅ | ✅ | ✅ | ✅ |
-| **Image Generation** | ✅ | ❌ | ❌ | ❌ |
-| **Vision-Language Models** | ✅ | ✅ | ✅ | ✅ |
-| **Audio-Language Models** | ✅ | ❌ | ❌ | ❌ |
-| **Text-to-Speech** | ✅ | ❌ | ✅ | ❌ |
-| **Server Capability** | ✅ | ✅ | ✅ | ✅ |
-| **User Interface** | ✅ | ❌ | ❌ | ✅ |
-| **Executable Installation** | ✅ | ✅ | ❌ | ✅ |
+| **Feature** | **[Nexa SDK](https://github.com/NexaAI/nexa-sdk)** | **[ollama](https://github.com/ollama/ollama)** | **[Optimum](https://github.com/huggingface/optimum)** | **[LM Studio](https://github.com/lmstudio-ai)** |
+| --------------------------- | :------------------------------------------------: | :--------------------------------------------: | :---------------------------------------------------: | :---------------------------------------------: |
+| **GGML Support** | ✅ | ✅ | ❌ | ✅ |
+| **ONNX Support** | ✅ | ❌ | ✅ | ❌ |
+| **Text Generation** | ✅ | ✅ | ✅ | ✅ |
+| **Image Generation** | ✅ | ❌ | ❌ | ❌ |
+| **Vision-Language Models** | ✅ | ✅ | ✅ | ✅ |
+| **Audio-Language Models** | ✅ | ❌ | ❌ | ❌ |
+| **Text-to-Speech** | ✅ | ❌ | ✅ | ❌ |
+| **Server Capability** | ✅ | ✅ | ✅ | ✅ |
+| **User Interface** | ✅ | ❌ | ❌ | ✅ |
+| **Executable Installation** | ✅ | ✅ | ❌ | ✅ |
 
 ## Supported Models & Model Hub
@@ -257,25 +258,37 @@ Supported model examples (full list at [Model Hub](https://nexa.ai/models)):
 | [bark-small](https://nexa.ai/suno/bark-small/gguf-fp16/readme) | Text-to-Speech | GGUF | `nexa run bark-small:fp16` |
 
 ## Run Models from 🤗 HuggingFace or 🤖 ModelScope
+
 You can pull, convert (to .gguf), quantize, and run [llama.cpp supported](https://github.com/ggerganov/llama.cpp#description) text generation models from HF or MS with Nexa SDK.
+
 ### Run .gguf File
+
 Use `nexa run -hf <model_id>` or `nexa run -ms <model_id>` to run models with provided .gguf files:
+
 ```bash
 nexa run -hf Qwen/Qwen2.5-Coder-7B-Instruct-GGUF
 ```
+
 ```bash
 nexa run -ms Qwen/Qwen2.5-Coder-7B-Instruct-GGUF
 ```
+
 > **Note:** You will be prompted to select a single .gguf file. If your desired quantization version has multiple split files (like fp16-00001-of-00004), please use Nexa's conversion tool (see below) to convert and quantize the model locally.
+
 ### Convert .safetensors Files
+
 Install the [Nexa Python package](https://github.com/NexaAI/nexa-sdk?tab=readme-ov-file#install-option-2-python-package) and the Nexa conversion tool with `pip install "nexaai[convert]"`, then convert models from HuggingFace with `nexa convert <model_id>`:
+
 ```bash
 nexa convert HuggingFaceTB/SmolLM2-135M-Instruct
 ```
+
 Or you can convert models from ModelScope with `nexa convert -ms <model_id>`:
+
 ```bash
 nexa convert -ms Qwen/Qwen2.5-7B-Instruct
 ```
+
 > **Note:** Check our [leaderboard](https://nexa.ai/leaderboard) for performance benchmarks of different quantized versions of mainstream language models, and the [HuggingFace docs](https://huggingface.co/docs/optimum/en/concept_guides/quantization) to learn about quantization options.
 
 📋 You can view downloaded and converted models with `nexa list`.
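
Taken together, the commands documented in this diff compose into a simple pull, convert, and run workflow. The sketch below strings together only commands that appear verbatim above; it assumes the Nexa CLI is already installed and reuses the same example model IDs, so treat it as illustrative rather than an official recipe.

```bash
# Illustrative workflow assembled from commands shown in this README diff.
# Assumes the Nexa CLI is installed; model IDs are the examples used above.

# Install the Nexa conversion tool (an extra of the Python package)
pip install "nexaai[convert]"

# Convert a .safetensors model from HuggingFace to a local .gguf
nexa convert HuggingFaceTB/SmolLM2-135M-Instruct

# Confirm the converted model appears alongside downloaded ones
nexa list

# Alternatively, run a model that already ships .gguf files, straight from HuggingFace
nexa run -hf Qwen/Qwen2.5-Coder-7B-Instruct-GGUF
```

Converting locally is also the path the note above recommends when a quantization is split across multiple .gguf files, since `nexa run -hf` prompts you to select a single file.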