Model launch problem #2563

sdlssq · 2024-11-21T03:31:54Z

System Info

Python version: 3.10.15
xinference version: 0.16.2
llama_cpp_python: 0.3.1

Running Xinference with Docker?

docker
pip install
installation from source

Version info

xinference version: 0.16.2

The command used to start Xinference

xinference-local --host 0.0.0.0 --port 9997
The service starts normally.

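Reproduction

Xinference itself starts fine. I downloaded the 7B qwen2.5-instruct model from ModelScope, specifically the ggufv2 model file with q5_k_m quantization, and placed it in the qwen2_5-instruct-ggufv2-7b directory under the cache directory (I created this directory myself). Through the web UI, I selected qwen2.5-instruct, set the model parameters and the model path, and the model launched successfully. When I launch from the command line, however, it reports that the model file needs to be downloaded. Because I am offline, the download fails and the launch errors out; it ignores the model file already in my directory.
My launch command was: xinference launch --model-path /opt/..../inference/cache/qwen2_5-instruct-ggufv2-7b/ --model-engine llama.cpp --model-name qwen2.5-instruct --model-format ggufv2 -s 7 -q q5_k_m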

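Expected behavior

I noticed that a model launched through the UI disappears once Xinference is restarted, and I don't know whether there is a way to avoid this. Since I don't know of a fix, my workaround is a script that relaunches the model from the command line automatically after Xinference restarts, so that the model comes back online. But that launch failed.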
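A relaunch script like the one described has to wait until the restarted service is reachable before it issues any launch command. Below is a minimal sketch of that wait step, using only the Python standard library. The helper name `wait_for_port` and its timeout values are illustrative assumptions, not part of Xinference; port 9997 is taken from the start command above. A successful TCP connection only shows that something is listening, not that the service is fully initialized.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 60.0,
                  interval: float = 1.0) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout.

    Hypothetical helper: it only checks that the port accepts connections.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # create_connection raises OSError while nothing is listening yet
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            time.sleep(interval)
    return False
```

A script could call `wait_for_port("127.0.0.1", 9997)` after restarting the service and run the launch command only if it returns `True`.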

sdlssq · 2024-11-21T08:02:49Z

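Found the cause: the model path parameter in the launch command should be model_path, not model-path. The model launches normally now. (I don't understand why it is designed this way: every other parameter is model-xxx, but the path one is model_path. Still, the problem is solved, which is what matters. Just a small gripe... and careless me.)
For reference, the correct launch command: xinference launch --model_path /opt/..../inference/cache/qwen2_5-instruct-ggufv2-7b/ --model-engine llama.cpp --model-name qwen2.5-instruct --model-format ggufv2 -s 7 -q q5_k_m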
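For a relaunch script, building the command as an argument list keeps the underscore in `--model_path` in one place and avoids shell-quoting mistakes. This is a rough sketch that assumes the flags behave exactly as in the working command above. The function names `build_launch_command` and `relaunch` are hypothetical, and the model directory is passed in by the caller rather than guessed here.

```python
import shlex
import subprocess


def build_launch_command(model_dir: str) -> list[str]:
    """Assemble the launch command from this thread as an argv list."""
    return [
        "xinference", "launch",
        "--model_path", model_dir,   # underscore here; the other flags use hyphens
        "--model-engine", "llama.cpp",
        "--model-name", "qwen2.5-instruct",
        "--model-format", "ggufv2",
        "-s", "7",
        "-q", "q5_k_m",
    ]


def relaunch(model_dir: str, dry_run: bool = True) -> int:
    """Print the command (dry run) or execute it and return its exit code."""
    cmd = build_launch_command(model_dir)
    if dry_run:
        print(shlex.join(cmd))
        return 0
    return subprocess.run(cmd, check=False).returncode
```

Calling `relaunch(model_dir)` with the default `dry_run=True` only prints the command, which makes it easy to check the flags before letting the script run it with `dry_run=False`.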

XprobeBot added this to the v0.16 milestone Nov 21, 2024

XprobeBot modified the milestones: v0.16, v1.x Nov 25, 2024
