⚡️ Speed up _get_prompt() by 48% in libs/langchain/langchain/smith/evaluation/runner_utils.py #25

Open. Wants to merge 1 commit into base: master

codeflash-ai bot commented on Feb 16, 2024

📄 _get_prompt() in libs/langchain/langchain/smith/evaluation/runner_utils.py

📈 Performance went up by 48% (about 1.48x the original speed)

⏱️ Runtime went down from 29.80μs to 20.20μs
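
As a quick check of the figures above, the speedup can be recomputed from the two reported runtimes. This is a minimal sketch: only the two timings come from the report, and the variable names are illustrative.

```python
# Runtimes reported above, in microseconds.
original_us = 29.80
optimized_us = 20.20

# Speedup ratio: how many times faster the optimized version runs.
speedup = original_us / optimized_us

# Relative performance gain and relative runtime reduction, as fractions.
performance_gain = speedup - 1
runtime_reduction = 1 - optimized_us / original_us

print(f"speedup: {speedup:.2f}x")
print(f"performance gain: {performance_gain:.0%}")
print(f"runtime reduction: {runtime_reduction:.0%}")
```

The two percentages describe the same change from different angles: a performance gain of roughly 48% corresponds to a runtime reduction of roughly 32%.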

Explanation and details

The original function spends much of its runtime validating input: it checks the type of each candidate value, raises an exception when a type does not match the expected one, and only then processes the data. Reducing the number of conditional statements and type checks, and simplifying the control flow, makes it faster.

The rewritten version of the function produces the same results more efficiently by reordering some checks and using the early return pattern.

The function short-circuits: it returns as soon as it has found a valid prompt, which in many cases reduces the number of checks performed and streamlines the control flow. It also simplifies the handling of 'prompt' and 'prompts' by checking validity and returning the result in the same statement.
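
The early-return pattern described above can be sketched as follows. This is an illustration under stated assumptions, not the code from this PR: the name `get_prompt_sketch`, the helper functions, the local `InputFormatError` class, and the exact validation rules and messages are all hypothetical, chosen to reflect a function that accepts a single string prompt or a list holding exactly one string.

```python
from typing import Any, Dict, List


class InputFormatError(Exception):
    """Hypothetical stand-in for the library's exception class."""


def _is_str_list(value: Any) -> bool:
    """Return True if value is a list containing only strings."""
    return isinstance(value, list) and all(isinstance(v, str) for v in value)


def _single(prompts: List[str]) -> str:
    """Return the only prompt in the list, or raise if there is not exactly one."""
    if len(prompts) != 1:
        raise InputFormatError(f"Expected a single prompt, got {len(prompts)}.")
    return prompts[0]


def get_prompt_sketch(inputs: Dict[str, Any]) -> str:
    """Return the prompt as soon as a valid one is found (assumed behavior)."""
    if not inputs:
        raise InputFormatError("Inputs should not be empty.")

    if "prompt" in inputs:
        prompt = inputs["prompt"]
        if isinstance(prompt, str):
            return prompt  # early return: no further checks run
        raise InputFormatError(
            f"Expected string for 'prompt', got {type(prompt).__name__}."
        )

    if "prompts" in inputs:
        prompts = inputs["prompts"]
        if _is_str_list(prompts):
            return _single(prompts)
        raise InputFormatError(
            f"Expected list of strings for 'prompts', got {type(prompts).__name__}."
        )

    if len(inputs) == 1:
        # Fall back to the only value when neither known key is present.
        value = next(iter(inputs.values()))
        if isinstance(value, str):
            return value
        if _is_str_list(value):
            return _single(value)

    raise InputFormatError(f"Expected 'prompt' or 'prompts' in inputs, got {inputs}.")
```

Each branch either returns or raises, so no intermediate list has to be built and inspected at the end; in this sketch, an input such as `{"prompt": "hi"}` is resolved after two membership and type checks.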

Correctness verification

The new optimized code was tested for correctness. The results are listed below.

✅ 0 Passed − ⚙️ Existing Unit Tests

✅ 0 Passed − 🎨 Inspired Regression Tests

✅ 7 Passed − 🌀 Generated Regression Tests

Generated tests:
# imports
import pytest  # used for our unit tests

# Import the exception from the module under test (this assumes runner_utils
# exposes InputFormatError). A locally defined class with the same name is a
# different type, so pytest.raises would never match the exception that
# _get_prompt raises.
from langchain.smith.evaluation.runner_utils import InputFormatError, _get_prompt

# unit tests

# Test valid inputs
def test_valid_single_prompt():
    assert _get_prompt({"prompt": "What is the capital of France?"}) == "What is the capital of France?"

def test_valid_multiple_prompts():
    with pytest.raises(InputFormatError):
        _get_prompt({"prompts": ["What is the capital of France?", "What is 2 + 2?"]})

def test_valid_single_key_value_pair_string():
    assert _get_prompt({"question": "What is the capital of France?"}) == "What is the capital of France?"

def test_valid_single_key_value_pair_list():
    with pytest.raises(InputFormatError):
        _get_prompt({"questions": ["What is the capital of France?", "What is 2 + 2?"]})

# Test invalid inputs
def test_empty_inputs():
    with pytest.raises(InputFormatError):
        _get_prompt({})

def test_prompt_with_non_string():
    with pytest.raises(InputFormatError):
        _get_prompt({"prompt": 123})

def test_prompts_with_non_list():
    with pytest.raises(InputFormatError):
        _get_prompt({"prompts": "This should be a list"})

def test_prompts_with_list_of_non_strings():
    with pytest.raises(InputFormatError):
        _get_prompt({"prompts": ["Valid string", 123, "Another valid string"]})

def test_multiple_key_value_pairs():
    with pytest.raises(InputFormatError):
        _get_prompt({"question": "What is the capital of France?", "answer": "Paris"})

def test_single_key_value_pair_invalid_value():
    with pytest.raises(InputFormatError):
        _get_prompt({"data": [123, 456]})

# Test edge cases
def test_prompts_with_empty_list():
    with pytest.raises(InputFormatError):
        _get_prompt({"prompts": []})

def test_prompts_with_list_of_empty_strings():
    with pytest.raises(InputFormatError):
        _get_prompt({"prompts": ["", ""]})

def test_prompts_with_list_of_one_string():
    assert _get_prompt({"prompts": ["What is the capital of France?"]}) == "What is the capital of France?"

def test_single_key_value_pair_empty_string():
    with pytest.raises(InputFormatError):
        _get_prompt({"question": ""})

def test_single_key_value_pair_empty_list():
    with pytest.raises(InputFormatError):
        _get_prompt({"questions": []})

def test_inputs_with_prompt_and_prompts_keys():
    with pytest.raises(InputFormatError):
        _get_prompt({"prompt": "What is the capital of France?", "prompts": ["What is 2 + 2?"]})

# Test rare or unexpected edge cases
def test_non_string_keys():
    with pytest.raises(InputFormatError):
        _get_prompt({99: "What is the capital of France?"})

def test_nested_structures_as_values():
    with pytest.raises(InputFormatError):
        _get_prompt({"prompt": {"question": "What is the capital of France?"}})

def test_mixing_valid_and_invalid_types_in_list():
    with pytest.raises(InputFormatError):
        _get_prompt({"prompts": ["What is the capital of France?", 42, "What is 2 + 2?", None]})

def test_special_string_types():
    assert _get_prompt({"prompt": "¿Cuál es la capital de Francia?"}) == "¿Cuál es la capital de Francia?"
    assert _get_prompt({"prompt": "What is the capital of 🇫🇷?"}) == "What is the capital of 🇫🇷?"

def test_prompts_with_control_characters():
    assert _get_prompt({"prompt": "What is the capital\nof France?"}) == "What is the capital\nof France?"

def test_extremely_large_inputs():
    large_string = "A" * 1000000
    assert _get_prompt({"prompt": large_string}) == large_string

def test_incorrect_data_types():
    with pytest.raises(InputFormatError):
        _get_prompt({"prompt": 3.14159})

codeflash-ai bot added the ⚡️ codeflash Optimization PR opened by CodeFlash AI label on Feb 16, 2024
codeflash-ai bot requested a review from aphexcx on February 16, 2024 at 09:22