
English | 简体中文



Important

Starting with v0.7.0, the configuration format changes significantly and is incompatible with earlier versions. Configuring through the UI is more convenient, and more powerful configuration options are provided.

OpenAI-Forward is an efficient forwarding service designed for large language models. Its core features include user request rate control, token rate limits, intelligent prediction caching, log management, and API key management, aiming to provide a fast and convenient model forwarding service. It can proxy both local language models (e.g., LocalAI) and cloud-based ones (e.g., OpenAI) with equal ease. Built on uvicorn, aiohttp, and asyncio, OpenAI-Forward achieves impressive asynchronous performance.

News

Key Features

OpenAI-Forward offers the following capabilities:

  • Universal Forwarding: Supports forwarding of almost all types of requests.
  • Performance First: Boasts outstanding asynchronous performance.
  • Cache AI Predictions: Caches AI predictions, accelerating service access and saving costs.
  • User Traffic Control: Customize request and Token rates.
  • Real-time Response Logs: Enhances observability of the call chain.
  • Custom Secret Keys: Replaces the original API keys.
  • Black and White List: IP-based black and white list restrictions can be implemented.
  • Multi-target Routing: Forwards to multiple upstream service addresses, each mapped to its own route under a single service.
  • Automatic Retries: Automatically retries failed requests to keep the service stable.
  • Quick Deployment: Supports fast deployment locally or on the cloud via pip and docker.

Proxy services set up by this project include:

Note: The proxy services deployed here are for personal learning and research purposes only and should not be used for any commercial purposes.

Deployment Guide

👉 Deployment Documentation

User Guide

Quick Start

Installation

pip install openai-forward 

# Or install the `webui` version (currently in alpha):
git clone https://github.com/KenyonY/openai-forward.git
cd openai-forward
pip install -e .[webui]

Starting the Service

aifd run
# or
aifd run --webui

If a .env configuration file is found at the root path, it is read and you will see startup information like the following.

โฏ aifd run
โ•ญโ”€โ”€โ”€โ”€โ”€โ”€ ๐Ÿค— openai-forward is ready to serve!  โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ•ฎ
โ”‚                                                    โ”‚
โ”‚  base url         https://api.openai.com           โ”‚
โ”‚  route prefix     /                                โ”‚
โ”‚  api keys         False                            โ”‚
โ”‚  forward keys     False                            โ”‚
โ”‚  cache_backend    MEMORY                           โ”‚
โ•ฐโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ•ฏ
โ•ญโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€ โฑ๏ธ Rate Limit configuration โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ•ฎ
โ”‚                                                    โ”‚
โ”‚  backend                memory                     โ”‚
โ”‚  strategy               moving-window              โ”‚
โ”‚  global rate limit      100/minute (req)           โ”‚
โ”‚  /v1/chat/completions   100/2minutes (req)         โ”‚
โ”‚  /v1/completions        60/minute;600/hour (req)   โ”‚
โ”‚  /v1/chat/completions   60/second (token)          โ”‚
โ”‚  /v1/completions        60/second (token)          โ”‚
โ•ฐโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ•ฏ
INFO:     Started server process [191471]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO:     Uvicorn running on http://0.0.0.0:8000 (Press CTRL+C to quit)
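The rate-limit panel above names a moving-window strategy. As a rough illustration only (not openai-forward's actual implementation, which delegates to its configured rate-limit backend), a moving-window request limiter can be sketched in a few lines of Python:

```python
import time
from collections import deque


class MovingWindowLimiter:
    """Allow at most `limit` events per `window` seconds, over a moving window."""

    def __init__(self, limit, window):
        self.limit = limit
        self.window = window
        self.events = deque()  # timestamps of accepted events

    def allow(self, now=None):
        now = time.monotonic() if now is None else now
        # Evict timestamps that have slid out of the window.
        while self.events and now - self.events[0] >= self.window:
            self.events.popleft()
        if len(self.events) < self.limit:
            self.events.append(now)
            return True
        return False
```

With limit=100 and window=60, this corresponds to the `100/minute (req)` global limit shown above: unlike a fixed-window counter, a burst straddling a minute boundary cannot exceed the limit.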

Proxy OpenAI Model:

The default option for aifd run is to proxy https://api.openai.com.

The following examples use the deployed service address https://api.openai-forward.com.

Python

  from openai import OpenAI  # pip install openai>=1.0.0

  client = OpenAI(
      base_url="https://api.openai-forward.com/v1",
      api_key="sk-******"
  )

Use in Third-party Applications

Integrate within the open-source project ChatGPT-Next-Web: Replace the BASE_URL in the Docker startup command with the address of your self-hosted proxy service.

docker run -d \
    -p 3000:3000 \
    -e OPENAI_API_KEY="sk-******" \
    -e BASE_URL="https://api.openai-forward.com" \
    -e CODE="******" \
    yidadaa/chatgpt-next-web 

Integrate within Code


Image Generation (DALL-E)

curl --location 'https://api.openai-forward.com/v1/images/generations' \
--header 'Authorization: Bearer sk-******' \
--header 'Content-Type: application/json' \
--data '{
    "prompt": "A photo of a cat",
    "n": 1,
    "size": "512x512"
}'

Proxy Local Model


Proxy Other Cloud Models

  • Applicable scenarios: For instance, through LiteLLM, you can convert the API format of many cloud models to the OpenAI API format and then use this service as a proxy.


Configuration

Execute aifd run --webui to enter the configuration page (default service address http://localhost:8001).

Detailed configuration descriptions can be found in the .env.example file. (To be completed)

Caching

After caching is enabled, the content of specified routes is cached. Forwarding types are divided into openai and general, and their caching behavior differs slightly: with general forwarding, identical requests are by default all answered from the cache; with openai forwarding, caching can additionally be controlled per request through OpenAI's extra_body parameter, for example:

Python

  from openai import OpenAI

  client = OpenAI(
      base_url="https://smart.openai-forward.com/v1",
      api_key="sk-******"
  )
  completion = client.chat.completions.create(
      model="gpt-3.5-turbo",
      messages=[
          {"role": "user", "content": "Hello!"}
      ],
      extra_body={"caching": True}
  )

Curl

curl https://smart.openai-forward.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-******" \
  -d '{
    "model": "gpt-3.5-turbo",
    "messages": [{"role": "user", "content": "Hello!"}],
    "caching": true
  }'

Custom API Keys


Use case:

  import openai

  openai.api_base = "https://api.openai-forward.com/v1"
  openai.api_key = "fk-******"  # use the custom forward key instead of the original sk- key

Multi-Target Service Forwarding

Supports forwarding services from different addresses to different routes under the same port. Refer to the .env.example for examples.

Conversation Logs

Chat logs are not recorded by default. If you wish to enable it, set the LOG_CHAT=true environment variable.
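For example, the variable can go in the .env file at the root path (read at startup, as shown above), or be exported in the shell before running `aifd run`:

```
LOG_CHAT=true
```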


Logs are saved in the current directory under Log/openai/chat/chat.log. The recording format is:

{'messages': [{'role': 'user', 'content': 'hi'}], 'model': 'gpt-3.5-turbo', 'stream': True, 'max_tokens': None, 'n': 1, 'temperature': 1, 'top_p': 1, 'logit_bias': None, 'frequency_penalty': 0, 'presence_penalty': 0, 'stop': None, 'user': None, 'ip': '127.0.0.1', 'uid': '2155fe1580e6aed626aa1ad74c1ce54e', 'datetime': '2023-10-17 15:27:12'}
{'assistant': 'Hello! How can I assist you today?', 'is_function_call': False, 'uid': '2155fe1580e6aed626aa1ad74c1ce54e'}

To convert to json format:

aifd convert

You'll get chat_openai.json:

[
  {
    "datetime": "2023-10-17 15:27:12",
    "ip": "127.0.0.1",
    "model": "gpt-3.5-turbo",
    "temperature": 1,
    "messages": [
      {
        "user": "hi"
      }
    ],
    "tools": null,
    "is_tool_calls": false,
    "assistant": "Hello! How can I assist you today?"
  }
]
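The pairing step performed by `aifd convert` can be approximated as follows. This is a hedged sketch, not the actual implementation: it joins each response record to its request record via the shared `uid` and keeps only a subset of the fields shown above.

```python
import ast


def merge_chat_log(lines):
    """Merge request/response log records (one Python-literal dict per line) by uid."""
    requests, merged = {}, []
    for line in lines:
        record = ast.literal_eval(line)   # records are Python literals, not JSON
        if "messages" in record:          # request record
            requests[record["uid"]] = record
        else:                             # response record: join with its request
            req = requests.pop(record["uid"], {})
            merged.append({
                "datetime": req.get("datetime"),
                "ip": req.get("ip"),
                "model": req.get("model"),
                "temperature": req.get("temperature"),
                "messages": req.get("messages"),
                "assistant": record.get("assistant"),
            })
    return merged
```

Feeding the two sample log lines above through this function yields a single merged entry, matching the shape of chat_openai.json.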

Contributions

Contributions are welcome: submit pull requests or raise issues in the repository.

License

OpenAI-Forward is licensed under the MIT license.