diff --git a/README.md b/README.md
index 71e390daa..77d1b79de 100644
--- a/README.md
+++ b/README.md
@@ -226,7 +226,7 @@ This is the user interface that users will interact with.
 By following these steps, you will be able to serve your models using the web UI. You can open your browser and chat with a model now.
 If the models do not show up, try to reboot the gradio web server.
 
-#### (Optional): Advanced Features
+#### (Optional): Advanced Features, Scalability
 - You can register multiple model workers to a single controller, which can be used for serving a single model with higher throughput or serving multiple models at the same time. When doing so, please allocate different GPUs and ports for different model workers.
 ```
 # worker 0
@@ -240,6 +240,29 @@ python3 -m fastchat.serve.gradio_web_server_multi
 ```
 - The default model worker based on huggingface/transformers has great compatibility but can be slow. If you want high-throughput batched serving, you can try [vLLM integration](docs/vllm_integration.md).
 
+#### (Optional): Advanced Features, Third Party UI
+- If you want to host the models behind your own UI or a third-party UI, launch the OpenAI-compatible API server, expose it with a hosting service such as ngrok, and enter the credentials appropriately in the UI (see the example commands at the start of the API section below). Compatible third-party UIs include:
+  - https://github.com/WongSaang/chatgpt-ui
+  - https://github.com/mckaywrigley/chatbot-ui
+- Note that some third-party UIs only offer the standard `gpt-3.5-turbo`, `gpt-4`, etc. model names, so you will have to add your own custom model name inside their code. [Here is an example of a modification that creates a UI with any custom model name](https://github.com/ztjhz/BetterChatGPT/pull/461).
+
 ## API
 ### OpenAI-Compatible RESTful APIs & SDK
 FastChat provides OpenAI-compatible APIs for its supported models, so you can use FastChat as a local drop-in replacement for OpenAI APIs.
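+
+For reference, here is a minimal sketch of the hosting flow described in the third-party UI section above; the model path, ports, and the ngrok invocation are illustrative, not required values:
+```
+# launch the controller, a model worker, and the OpenAI-compatible API server
+python3 -m fastchat.serve.controller
+python3 -m fastchat.serve.model_worker --model-path lmsys/vicuna-7b-v1.5
+python3 -m fastchat.serve.openai_api_server --host localhost --port 8000
+# expose the local API server publicly with a tunneling service such as ngrok
+ngrok http 8000
+```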
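+
+Once the server is up, any OpenAI-style client can talk to it. Note that the model name you pass is whatever you are serving, not a fixed `gpt-*` identifier; the URL and model name below are placeholders:
+```
+curl http://localhost:8000/v1/chat/completions \
+  -H "Content-Type: application/json" \
+  -d '{"model": "vicuna-7b-v1.5", "messages": [{"role": "user", "content": "Hello!"}]}'
+```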