refined nexa convert logic to make it more user-friendly #226

Merged: 11 commits, Nov 10, 2024
32 changes: 19 additions & 13 deletions CLI.md
@@ -31,7 +31,7 @@ options:

### List Local Models

- List all models on your local computer.
+ List all models on your local computer. You can use `nexa run <model_name>` to run any model shown in the list.

```
nexa list
```
@@ -96,6 +96,8 @@ Run a model on your local computer. If the model file is not yet downloaded, it will be downloaded automatically.

By default, `nexa` will run gguf models. To run onnx models, use `nexa onnx MODEL_PATH`

+ You can run any model shown by the `nexa list` command.
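
For example, list your local models and then run one by name. The model name below is illustrative (it appears in Nexa's own examples); use any name from your own list:

```
nexa list
nexa run omnivision
```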

#### Run Text-Generation Model

@@ -109,9 +111,9 @@ options:

```
-h, --help show this help message and exit
-pf, --profiling Enable profiling logs for the inference process
-st, --streamlit Run the inference in Streamlit UI, can be used with -lp or -hf
- -lp, --local_path Indicate that the model path provided is the local path, must be used with -mt
+ -lp, --local_path Indicate that the model path provided is the local path
-mt, --model_type Indicate the model running type, must be used with -lp or -hf, choose from [NLP, COMPUTER_VISION, MULTIMODAL, AUDIO]
- -hf, --huggingface Load model from Hugging Face Hub, must be used with -mt
+ -hf, --huggingface Load model from Hugging Face Hub

Text generation options:
-t, --temperature TEMPERATURE
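```

For example, a minimal sketch of running a local GGUF text model with a lower sampling temperature. The file path is hypothetical; `-mt NLP` is included because `-mt` is documented as requiring `-lp` or `-hf`:

```
nexa run -lp -mt NLP ~/models/my-model.gguf -t 0.7
```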
@@ -143,9 +145,9 @@ positional arguments:

```
options:
-h, --help show this help message and exit
-st, --streamlit Run the inference in Streamlit UI, can be used with -lp or -hf
- -lp, --local_path Indicate that the model path provided is the local path, must be used with -mt
+ -lp, --local_path Indicate that the model path provided is the local path
-mt, --model_type Indicate the model running type, must be used with -lp or -hf, choose from [NLP, COMPUTER_VISION, MULTIMODAL, AUDIO]
- -hf, --huggingface Load model from Hugging Face Hub, must be used with -mt
+ -hf, --huggingface Load model from Hugging Face Hub

Image generation options:
-i2i, --img2img Whether to run image-to-image generation
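```

For example, a sketch of launching an image-generation model in the Streamlit UI. `sd1-4` is an assumed registry name; substitute any image model from `nexa list`:

```
nexa run sd1-4 -st
```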
@@ -189,9 +191,9 @@ options:

```
-h, --help show this help message and exit
-pf, --profiling Enable profiling logs for the inference process
-st, --streamlit Run the inference in Streamlit UI, can be used with -lp or -hf
- -lp, --local_path Indicate that the model path provided is the local path, must be used with -mt
+ -lp, --local_path Indicate that the model path provided is the local path
-mt, --model_type Indicate the model running type, must be used with -lp or -hf, choose from [NLP, COMPUTER_VISION, MULTIMODAL, AUDIO]
- -hf, --huggingface Load model from Hugging Face Hub, must be used with -mt
+ -hf, --huggingface Load model from Hugging Face Hub

VLM generation options:
-t, --temperature TEMPERATURE
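```

For example, a sketch of running a local multimodal model by path. The path is hypothetical, and `-mt MULTIMODAL` pairs with `-lp` as documented above:

```
nexa run -lp -mt MULTIMODAL ~/models/my-vlm.gguf
```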
@@ -223,9 +225,9 @@ positional arguments:

```
options:
-h, --help show this help message and exit
-st, --streamlit Run the inference in Streamlit UI, can be used with -lp or -hf
- -lp, --local_path Indicate that the model path provided is the local path, must be used with -mt
+ -lp, --local_path Indicate that the model path provided is the local path
-mt, --model_type Indicate the model running type, must be used with -lp or -hf, choose from [NLP, COMPUTER_VISION, MULTIMODAL, AUDIO]
- -hf, --huggingface Load model from Hugging Face Hub, must be used with -mt
+ -hf, --huggingface Load model from Hugging Face Hub

Automatic Speech Recognition options:
-b, --beam_size BEAM_SIZE
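```

For example, a sketch of transcribing with a wider beam. `faster-whisper-tiny` is an assumed model name; any ASR model from `nexa list` works the same way:

```
nexa run faster-whisper-tiny -b 5
```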
@@ -257,8 +259,8 @@ positional arguments:

```

options:
-h, --help show this help message and exit
- -lp, --local_path Indicate that the model path provided is the local path, must be used with -mt
- -hf, --huggingface Load model from Hugging Face Hub, must be used with -mt
+ -lp, --local_path Indicate that the model path provided is the local path
+ -hf, --huggingface Load model from Hugging Face Hub
-n, --normalize Normalize the embeddings
-nt, --no_truncate Do not truncate the embeddings
```
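
For example, a sketch that generates a normalized embedding with the `-n` flag, reusing the sentence-transformers GGUF model tag from this section:

```
nexa embed sentence-transformers/all-MiniLM-L6-v2:gguf-fp16 "I love Nexa AI." -n
```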
@@ -274,6 +276,10 @@ nexa embed sentence-transformers/all-MiniLM-L6-v2:gguf-fp16 "I love Nexa AI." >> …

### Convert and quantize a Hugging Face Model to GGUF

+ The additional package `nexa-gguf` is required to run this command.
+
+ You can install it with `pip install "nexaai[convert]"` or `pip install nexa-gguf`.

```
nexa convert HF_MODEL_PATH [ftype] [output_file]
usage: nexa convert [-h] [-t NTHREAD] [--convert_type CONVERT_TYPE] [--bigendian] [--use_temp_file] [--no_lazy]
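```

For example, a sketch of installing the extra and quantizing a Hugging Face model. The repo id is illustrative, and `q4_0` assumes llama.cpp-style ftype names:

```
pip install "nexaai[convert]"
nexa convert meta-llama/Llama-3.2-1B-Instruct q4_0
```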
@@ -342,9 +348,9 @@ positional arguments:

```

options:
-h, --help show this help message and exit
- -lp, --local_path Indicate that the model path provided is the local path, must be used with -mt
+ -lp, --local_path Indicate that the model path provided is the local path
-mt, --model_type Indicate the model running type, must be used with -lp or -hf, choose from [NLP, COMPUTER_VISION, MULTIMODAL, AUDIO]
- -hf, --huggingface Load model from Hugging Face Hub, must be used with -mt
+ -hf, --huggingface Load model from Hugging Face Hub
--host HOST Host to bind the server to
--port PORT Port to bind the server to
--reload Enable automatic reloading on code changes
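```

For example, a sketch of serving a registry model on a custom host and port. The model name is illustrative:

```
nexa server llama3.2 --host 127.0.0.1 --port 8000
```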
8 changes: 5 additions & 3 deletions README.md
@@ -4,7 +4,7 @@

[![MacOS][MacOS-image]][release-url] [![Linux][Linux-image]][release-url] [![Windows][Windows-image]][release-url]

[![GitHub Release](https://img.shields.io/github/v/release/NexaAI/nexa-sdk)](https://github.com/NexaAI/nexa-sdk/releases/latest) [![Build workflow](https://img.shields.io/github/actions/workflow/status/NexaAI/nexa-sdk/ci.yaml?label=CI&logo=github)](https://github.com/NexaAI/nexa-sdk/actions/workflows/ci.yaml?query=branch%3Amain) ![GitHub License](https://img.shields.io/github/license/NexaAI/nexa-sdk)

[![](https://img.shields.io/endpoint?url=https%3A%2F%2Fswiftpackageindex.com%2Fapi%2Fpackages%2FNexaAI%2Fnexa-sdk%2Fbadge%3Ftype%3Dswift-versions)](https://swiftpackageindex.com/NexaAI/nexa-sdk) [![](https://img.shields.io/endpoint?url=https%3A%2F%2Fswiftpackageindex.com%2Fapi%2Fpackages%2FNexaAI%2Fnexa-sdk%2Fbadge%3Ftype%3Dplatforms)](https://swiftpackageindex.com/NexaAI/nexa-sdk)

@@ -26,6 +26,7 @@ Nexa SDK is a comprehensive toolkit for supporting **ONNX** and **GGML** models.
<video src="https://user-images.githubusercontent.com/assets/375570dc-0e7a-4a99-840d-c1ef6502e5aa.mp4" autoplay muted loop playsinline style="max-width: 100%;"></video>

## Latest News 🔥

- [2024/11] Support Nexa AI's own vision language model (0.9B parameters): `nexa run omnivision` and audio language model (2.9B): `nexa run omniaudio`
- [2024/11] Support audio language model: `nexa run qwen2audio`, **we are the first open-source toolkit to support audio language models with the GGML tensor library.**
- [2024/10] Support embedding model: `nexa embed <model_path> <prompt>`
@@ -84,8 +85,9 @@ We have released pre-built wheels for various Python versions, platforms, and backends.
> [!NOTE]
>
> 1. If you want to use <strong>ONNX model</strong>, just replace `pip install nexaai` with `pip install "nexaai[onnx]"` in provided commands.
- > 2. If you want to convert and quantize huggingface models to GGUF models, just replace `pip install nexaai` with `pip install "nexaai[nexa-gguf]"`.
- > 3. For Chinese developers, we recommend you to use <strong>Tsinghua Open Source Mirror</strong> as extra index url, just replace `--extra-index-url https://pypi.org/simple` with `--extra-index-url https://pypi.tuna.tsinghua.edu.cn/simple` in provided commands.
+ > 2. If you want to <strong>run benchmark evaluation</strong>, just replace `pip install nexaai` with `pip install "nexaai[eval]"` in provided commands.
+ > 3. If you want to <strong>convert and quantize huggingface models to GGUF models</strong>, just replace `pip install nexaai` with `pip install "nexaai[nexa-gguf]"` in provided commands.
+ > 4. For Chinese developers, we recommend you to use <strong>Tsinghua Open Source Mirror</strong> as extra index url, just replace `--extra-index-url https://pypi.org/simple` with `--extra-index-url https://pypi.tuna.tsinghua.edu.cn/simple` in provided commands.
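
For example, combining points 3 and 4 of the note above into one command (a sketch; add any backend-specific index URL your platform needs):

```
pip install "nexaai[nexa-gguf]" --extra-index-url https://pypi.tuna.tsinghua.edu.cn/simple
```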

#### CPU

4 changes: 2 additions & 2 deletions SERVER.md
@@ -8,9 +8,9 @@ usage: nexa server [-h] [--host HOST] [--port PORT] [--reload] model_path

### Options:

- - `-lp, --local_path`: Indicate that the model path provided is the local path, must be used with -mt
+ - `-lp, --local_path`: Indicate that the model path provided is the local path
- `-mt, --model_type`: Indicate the model running type, must be used with -lp or -hf, choose from [NLP, COMPUTER_VISION, MULTIMODAL, AUDIO]
- - `-hf, --huggingface`: Load model from Hugging Face Hub, must be used with -mt
+ - `-hf, --huggingface`: Load model from Hugging Face Hub
- `--host`: Host to bind the server to
- `--port`: Port to bind the server to
- `--reload`: Enable automatic reloading on code changes
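
For example, a sketch of serving a local GGUF model (hypothetical path) with auto-reload enabled:

```
nexa server -lp -mt NLP ~/models/my-model.gguf --host 0.0.0.0 --port 8000 --reload
```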