Translation script stops after a few minutes #5

Open · stefangrotz opened this issue Mar 22, 2023 · 7 comments
@stefangrotz commented Mar 22, 2023

After 1-4%, the translation script's progress counter stops changing and the output looks like this:

[screenshot: stalled progress output]

I experimented with lower numbers of parallel calls, but nothing really seems to help.

Any ideas? Could this be related to the rate limits? Is it just a visual issue?

@sungkim11 commented

I have not used this script, but OpenAI has been having issues with their API service for the last few days, including today.

@raihan0824 commented

> After 1-4%, the translation script's progress counter stops changing and the output looks like this:
>
> [screenshot: stalled progress output]
>
> I experimented with lower numbers of parallel calls, but nothing really seems to help.
>
> Any ideas? Could this be related to the rate limits? Is it just a visual issue?

Did you manage to solve this problem?

@stefangrotz (Author) commented

No, for now I'm waiting until AlpacaDataCleaned progresses further; then I'll try again.

@Aciid commented May 14, 2023

Unhandled exceptions blocking task execution are the most probable cause of this issue. The rate limits are per account and typically quite small, e.g. 90,000 tokens per minute. Start with 10 threads. I had this problem myself and traced it.

You can add simple error handling to translate_text. I'm not sure whether rate-limited requests are billed by OpenAI's API, so I suggest testing the translation script with a smaller chunk first, e.g. entries 0-100.

Here is what you need to catch openai.error.RateLimitError:

def translate_text(value):
    try:
        response = openai.ChatCompletion.create(
            model="gpt-3.5-turbo",
            messages=[
                    {"role": "system", "content": "You are a helpful assistant."},
                    {"role": "user", "content": f"Translate the following text to Finnish: '{value}'"},
                ],
            max_tokens=1024,
            temperature=0,
            )
        return response.choices[0]["message"]["content"].strip()
    except openai.error.RateLimitError as e:
        print("Rate limit error: ", e)
        return None  # no retry here, so the caller receives None

This will lead to None being returned instead of the translated string, so you may prefer rewriting the code to retry the request, or to skip the whole prompt from being appended if one of its fields is None.
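
If you go the skip route, here is a minimal sketch of the filtering (hypothetical; it builds on the translate_text above, and the variable names are illustrative):

# Hypothetical sketch: drop entries whose translation came back as None
# because translate_text swallowed a RateLimitError.
results = []
for value in values:  # `values` is your list of strings to translate
    translated = translate_text(value)
    if translated is None:
        print(f"Skipping rate-limited entry: {value[:50]}...")
        continue
    results.append(translated)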

But please check OpenAI's billing terms for how failed requests are handled before adding retries; otherwise this may become very expensive.


Retrying the request with exponential backoff on exception, as per OpenAI's official documentation, is what I've implemented and am using now:

import backoff
...
@backoff.on_exception(backoff.expo, openai.error.RateLimitError)
def translate_text(value):
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[
                {"role": "system", "content": "You are a helpful assistant."},
                {"role": "user", "content": f"Translate the following text to Finnish: '{value}'"},
            ],
        max_tokens=1024,
        temperature=0,
        )
    return response.choices[0]["message"]["content"].strip()


I'm being rate-limited so much that even with only 10 threads it takes ~5 minutes to translate 350 complete instruction prompts (system, user, and output). Insane.

Br;

@stefangrotz (Author) commented

Thanks for the analysis, very interesting. Maybe just paying for the Google Translate or DeepL API instead could be a better option?

@Aciid commented May 14, 2023

> Thanks for the analysis, very interesting. Maybe just paying for the Google Translate or DeepL API instead could be a better option?

Hi, I'm unfamiliar with these other translation services, but I'll surely look them up. If any of them have better default rate limits, they may be worth it if priced accordingly. For me, 1,000 alpaca-lora cleaned prompts cost around $0.75-1 with OpenAI while debugging this code, but the rate limit is killing the productivity that threads would otherwise give.

I created a complete rewrite of the code that now translates all string values of the prompt object, but not the keys, and is tuned for performance. It is still slow (100-150 prompts/minute) due to spurious OpenAI rate limits ("model busy"/"rate limit reached") that trigger the exponential backoff timer.


import openai
import json
from concurrent.futures import ThreadPoolExecutor, as_completed
from threading import Lock

# from tqdm import tqdm
import backoff

# Replace "your-key-here" with your actual API key
openai.api_key = "your-key-here"


def backoff_hdlr(details):
    print(
        "Backing off {wait:0.1f} seconds after {tries} tries "
        "calling function {target} with args {args} and kwargs "
        "{kwargs}".format(**details)
    )


@backoff.on_exception(
    backoff.expo, openai.error.RateLimitError, max_tries=50, on_backoff=backoff_hdlr
)
def translate_text(value):
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[
            {
                "role": "system",
                "content": "Translate the following JSON objects string values into Finnish, do not translate string "
                           "keys. No english string values are accepted.",
            },
            {
                "role": "user",
                "content": f"{json.dumps(value)}'",
            },
        ],
        max_tokens=1024,
        temperature=0,
    )
    return response.choices[0]["message"]["content"].strip()


def translate_item(item):
    translated_item = translate_text(item)
    return translated_item


# Maximum number of parallel requests
MAX_PARALLEL_REQUESTS = 25

# Load the input JSON (the cleaned Alpaca dataset)
with open("../data/alpaca_data_cleaned_archive.json", "r") as f:
    data = json.load(f)

lock = Lock()
start = 0
end = 51785
translated_data = []
data = data[start:end]
tasks_completed = 0


# simple progress indicator callback function
def progress_indicator(future):
    global lock, tasks_completed
    # obtain the lock
    with lock:
        try:
            translated_data.append(json.loads(future.result()))
            # update the counter
            tasks_completed += 1
            # report progress
            print(
                f"{tasks_completed}/{len(futures)} completed, {len(futures) - tasks_completed} remain."
            )

            # if tasks_completed is divisible by 1000, save a snapshot
            if tasks_completed % 1000 == 0:
                # Save the translated data so far to a snapshot JSON file
                with open(f"translated_data_up_to_{start}_to_{end}.json", "w") as f:
                    json.dump(translated_data, f, ensure_ascii=False, indent=4)

                print(
                    f"Translation snapshot saved in 'translated_data_up_to_{start}_to_{end}.json'"
                )
        except Exception as e:
            print(f"exception: {e}")
            pass


with ThreadPoolExecutor(max_workers=MAX_PARALLEL_REQUESTS) as executor:
    futures = {executor.submit(translate_item, item): item for item in data}

    for future in futures:
        future.add_done_callback(progress_indicator)

# Finally, save the translated data to a JSON file
with open(f"translated_data_up_to_{start}_to_{end}.json", "w") as f:
    json.dump(translated_data, f, ensure_ascii=False, indent=4)

print(
    f"Translation complete. The translated data is saved in 'translated_data_up_to_{start}_to_{end}.json'"
)
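
One caveat with the callback approach above: translated_data ends up in completion order, not input order, and any entry whose model output is not valid JSON is silently dropped by the except in progress_indicator. A hypothetical variant that preserves input order, using the as_completed import already pulled in above (the index bookkeeping is illustrative, not part of the original script):

# Illustrative only: key each future by its input index, then place
# results back into their original positions as they complete.
with ThreadPoolExecutor(max_workers=MAX_PARALLEL_REQUESTS) as executor:
    futures = {executor.submit(translate_item, item): i for i, item in enumerate(data)}
    results = [None] * len(data)
    for future in as_completed(futures):
        i = futures[future]
        try:
            results[i] = json.loads(future.result())
        except Exception as e:
            print(f"entry {i} failed: {e}")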

@Aciid commented May 14, 2023

DeepL API https://www.deepl.com/pro#developer

  • Pay as you go (€20.00 per 1,000,000 characters)

--
➜ wc -c alpaca_data_cleaned_archive.json
22,680,910 alpaca_data_cleaned_archive.json

This is actually much more expensive : D
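
Back-of-the-envelope, using the two numbers above (a rough sketch; it prices one pass over the raw file, ignoring that only the string values need translating):

# Approximate cost of one DeepL pass over the dataset
chars = 22_680_910            # `wc -c` output above
rate_eur_per_million = 20.00  # DeepL pay-as-you-go rate
print(f"~EUR {chars / 1_000_000 * rate_eur_per_million:.0f}")  # -> ~EUR 454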
