Enhancement: Support for HuggingFace Self-Served TGI Serving with Updated Dependencies and Improved AI Handler #472
Conversation
PR Analysis (review updated until commit 5d9c767)

PR Feedback

/improve --pr_code_suggestions.summarize=true
PR Code Suggestions

💡 Suggestion: Avoid hardcoding values directly in the code. Instead, consider defining them as constants at the top of your file or in a separate config file. This will make it easier to manage and modify these values in the future.

File: pr_agent/algo/ai_handler.py (74-79)

Existing code:

```python
@retry(
    exceptions=(APIError, TryAgain, AttributeError, RateLimitError),
    tries=OPENAI_RETRIES,
    delay=2,
    backoff=2,
    jitter=(1, 3),
)
```

Improved code:

```python
OPENAI_RETRY_DELAY = 2
OPENAI_RETRY_BACKOFF = 2
OPENAI_RETRY_JITTER = (1, 3)

@retry(
    exceptions=(APIError, TryAgain, AttributeError, RateLimitError),
    tries=OPENAI_RETRIES,
    delay=OPENAI_RETRY_DELAY,
    backoff=OPENAI_RETRY_BACKOFF,
    jitter=OPENAI_RETRY_JITTER,
)
```

💡 Suggestion: It's a good practice to use f-strings for formatting strings in Python. They are more readable and efficient than the traditional string formatting methods.

File: pr_agent/algo/ai_handler.py (135-142)

Existing code:

```python
get_logger().info(
    "AI response",
    response=resp,
    messages=messages,
    finish_reason=finish_reason,
    model=model,
    usage=usage,
)
```

Improved code:

```python
get_logger().info(
    f"AI response: {resp}, messages: {messages}, finish_reason: {finish_reason}, model: {model}, usage: {usage}"
)
```

💡 Suggestion: It's better to use the dict.get() method instead of dict[key] for dictionary access. This will prevent KeyError exceptions if the key is not present in the dictionary.

File: pr_agent/algo/utils.py (373-373)

Existing code:

```python
max_tokens_model = MAX_TOKENS[model]
```

Improved code:

```python
max_tokens_model = MAX_TOKENS.get(model, 4000)
```

💡 Suggestion: It's recommended to pin the versions of your dependencies. This can help to avoid unexpected breakage due to updates in the dependencies.

File: requirements.txt (25-25)

Existing code:

```
litellm @ git+https://github.com/Codium-ai/litellm.git
```

Improved code:

```
litellm @ git+https://github.com/Codium-ai/litellm.git@<commit-hash>
```
/add_docs

Persistent review updated to latest commit 5d9c767

PR Description updated to latest commit (5d9c767)
PR Code Suggestions

💡 Suggestion: Instead of calling the get_settings() function multiple times, store the result in a local variable and reuse it.

File: pr_agent/algo/ai_handler.py (30-60)

Existing code:

```python
if get_settings().get("OPENAI.KEY", None):
    openai.api_key = get_settings().openai.key
    litellm.openai_key = get_settings().openai.key
if get_settings().get("HUGGINGFACE.KEY", None):
    litellm.huggingface_key = get_settings().huggingface.key
if get_settings().get("HUGGINGFACE.API_BASE", None) and 'huggingface' in get_settings().config.model:
    litellm.api_base = get_settings().huggingface.api_base
    self.api_base = get_settings().huggingface.api_base
```

Improved code:

```python
settings = get_settings()
if settings.get("OPENAI.KEY", None):
    openai.api_key = settings.openai.key
    litellm.openai_key = settings.openai.key
if settings.get("HUGGINGFACE.KEY", None):
    litellm.huggingface_key = settings.huggingface.key
if settings.get("HUGGINGFACE.API_BASE", None) and 'huggingface' in settings.config.model:
    litellm.api_base = settings.huggingface.api_base
    self.api_base = settings.huggingface.api_base
```

💡 Suggestion: Instead of concatenating strings using '+', consider using f-strings for better readability.

File: pr_agent/algo/ai_handler.py (112-115)

Existing code:

```python
system = get_settings().get("CONFIG.MODEL_SYSTEM_PREFIX", "") + system + \
         get_settings().get("CONFIG.MODEL_SYSTEM_SUFFIX", "")
user = get_settings().get("CONFIG.MODEL_USER_PREFIX", "") + user + \
       get_settings().get("CONFIG.MODEL_USER_SUFFIX", "")
```

Improved code:

```python
system = f"{get_settings().get('CONFIG.MODEL_SYSTEM_PREFIX', '')}{system}{get_settings().get('CONFIG.MODEL_SYSTEM_SUFFIX', '')}"
user = f"{get_settings().get('CONFIG.MODEL_USER_PREFIX', '')}{user}{get_settings().get('CONFIG.MODEL_USER_SUFFIX', '')}"
```

💡 Suggestion: Instead of calling the get_settings() function inline, reuse a cached settings variable when clamping the model's token limit.

File: pr_agent/algo/utils.py (378-380)

Existing code:

```python
settings = get_settings()
max_tokens_model = MAX_TOKENS.get(model, 4000)
if settings.config.max_model_tokens:
    max_tokens_model = min(settings.config.max_model_tokens, max_tokens_model)
```

Improved code:

```python
settings = get_settings()
max_tokens_model = MAX_TOKENS.get(model, 4000)
if settings.config.max_model_tokens:
    max_tokens_model = min(settings.config.max_model_tokens, max_tokens_model)
```

💡 Suggestion: Keep the dependencies in the requirements.txt file sorted alphabetically. This makes it easier to find a dependency and see its version at a glance.

File: requirements.txt (5-25)

Existing code:

```
openai==1.3.5
...
litellm==1.7.1
```

Improved code:

```
litellm==1.7.1
...
openai==1.3.5
```
PR Description updated to latest commit (5d9c767)
PR Code Suggestions

💡 [enhancement] It seems that the self.api_base attribute may be redundant, since litellm.api_base is already set during initialization; consider removing it.

File: pr_agent/algo/ai_handler.py (29-60)

Existing code:

```python
self.api_base = None
...
if get_settings().get("HUGGINGFACE.API_BASE", None) and 'huggingface' in get_settings().config.model:
    litellm.api_base = get_settings().huggingface.api_base
    self.api_base = get_settings().huggingface.api_base
```

Improved code:

```python
if get_settings().get("HUGGINGFACE.API_BASE", None) and 'huggingface' in get_settings().config.model:
    litellm.api_base = get_settings().huggingface.api_base
```

💡 [maintainability] The chat_completion method handles several concerns at once; consider extracting smaller helper methods.

File: pr_agent/algo/ai_handler.py (81-147)

Existing code:

```python
async def chat_completion(
    self, model: str, system: str, user: str, temperature: float = 0.2
):
    ...
```

Improved code:

```python
async def chat_completion(
    self, model: str, system: str, user: str, temperature: float = 0.2
):
    system, user = self._add_prefix_suffix(system, user)
    messages = self._create_messages(system, user)
    response = await self._get_completion_response(model, messages, temperature)
    self._log_response(response, messages, model)
    return response["choices"][0]["message"]["content"], response["choices"][0]["finish_reason"]
```

💡 [enhancement] The get_max_tokens function falls back to a default silently; consider logging a warning when the model is not found in the MAX_TOKENS dictionary.

File: pr_agent/algo/utils.py (378-378)

Existing code:

```python
max_tokens_model = MAX_TOKENS.get(model, 4000)
```

Improved code:

```python
if model not in MAX_TOKENS:
    get_logger().warning(f"Model {model} not found in MAX_TOKENS dictionary. Defaulting to 4000 tokens.")
max_tokens_model = MAX_TOKENS.get(model, 4000)
```
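To make the chat_completion refactoring suggested above concrete, here is a rough sketch of what the extracted helpers might look like. The helper names come from the bot's suggestion; the bodies below are assumptions assembled from other excerpts in this thread (the prefix/suffix logic, the structured log call, and the api_base plumbing), not pr_agent's actual implementation:

```python
from litellm import acompletion

from pr_agent.config_loader import get_settings
from pr_agent.log import get_logger


class AIHandler:
    # ... __init__ and the settings wiring shown in the excerpts above ...

    def _add_prefix_suffix(self, system: str, user: str) -> tuple[str, str]:
        # Wrap messages with model-specific prompt markers from configuration.
        settings = get_settings()
        system = f"{settings.get('CONFIG.MODEL_SYSTEM_PREFIX', '')}{system}{settings.get('CONFIG.MODEL_SYSTEM_SUFFIX', '')}"
        user = f"{settings.get('CONFIG.MODEL_USER_PREFIX', '')}{user}{settings.get('CONFIG.MODEL_USER_SUFFIX', '')}"
        return system, user

    def _create_messages(self, system: str, user: str) -> list[dict]:
        return [
            {"role": "system", "content": system},
            {"role": "user", "content": user},
        ]

    async def _get_completion_response(self, model: str, messages: list, temperature: float):
        # api_base routes the request to the self-served TGI endpoint when set.
        return await acompletion(
            model=model,
            messages=messages,
            temperature=temperature,
            api_base=self.api_base,
        )

    def _log_response(self, response, messages, model):
        get_logger().info("AI response", response=response, messages=messages, model=model)
```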
/improve
PR Code Suggestions

💡 Suggestion: Instead of calling the get_settings() function multiple times, store the result in a local variable and reuse it.

File: pr_agent/algo/ai_handler.py (30-60)

Existing code:

```python
if get_settings().get("OPENAI.KEY", None):
    openai.api_key = get_settings().openai.key
    litellm.openai_key = get_settings().openai.key
if get_settings().get("HUGGINGFACE.KEY", None):
    litellm.huggingface_key = get_settings().huggingface.key
if get_settings().get("HUGGINGFACE.API_BASE", None) and 'huggingface' in get_settings().config.model:
    litellm.api_base = get_settings().huggingface.api_base
    self.api_base = get_settings().huggingface.api_base
```

Improved code:

```python
settings = get_settings()
if settings.get("OPENAI.KEY", None):
    openai.api_key = settings.openai.key
    litellm.openai_key = settings.openai.key
if settings.get("HUGGINGFACE.KEY", None):
    litellm.huggingface_key = settings.huggingface.key
if settings.get("HUGGINGFACE.API_BASE", None) and 'huggingface' in settings.config.model:
    litellm.api_base = settings.huggingface.api_base
    self.api_base = settings.huggingface.api_base
```

💡 Suggestion: Consider using f-strings for string concatenation for better readability.

File: pr_agent/algo/ai_handler.py (112-115)

Existing code:

```python
system = get_settings().get("CONFIG.MODEL_SYSTEM_PREFIX", "") + system + \
         get_settings().get("CONFIG.MODEL_SYSTEM_SUFFIX", "")
user = get_settings().get("CONFIG.MODEL_USER_PREFIX", "") + user + \
       get_settings().get("CONFIG.MODEL_USER_SUFFIX", "")
```

Improved code:

```python
system = f"{get_settings().get('CONFIG.MODEL_SYSTEM_PREFIX', '')}{system}{get_settings().get('CONFIG.MODEL_SYSTEM_SUFFIX', '')}"
user = f"{get_settings().get('CONFIG.MODEL_USER_PREFIX', '')}{user}{get_settings().get('CONFIG.MODEL_USER_SUFFIX', '')}"
```

💡 Suggestion: Consider using the dict.get() method with a default value instead of direct key access, to avoid a KeyError for unknown models.

File: pr_agent/algo/utils.py (378-378)

Existing code:

```python
max_tokens_model = MAX_TOKENS[model]
```

Improved code:

```python
max_tokens_model = MAX_TOKENS.get(model, 4000)
```

💡 Suggestion: Consider maintaining the order of dependencies in the requirements file. This can help in identifying the changes made to the dependencies more easily.

File: requirements.txt (5-25)

Existing code:

```
openai==1.3.5
...
litellm==1.7.1
```

Improved code:

```
openai==1.3.5
litellm==1.7.1
...
```
```python
except Exception as e:
    get_logger().error("Unknown error during OpenAI inference: ", e)
    raise TryAgain from e
```
Suggestion: Consider using a more specific exception instead of the generic `Exception` in the `chat_completion` method. This will make the error handling more precise and prevent catching and handling unexpected exceptions that might need different handling logic. [best practice]

Suggested change:

```python
except SpecificException as e:  # Replace SpecificException with the specific exception you want to catch
    get_logger().error("Specific error during OpenAI inference: ", e)
    raise TryAgain from e
```
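As a hedged illustration of the reviewer's point, assuming the openai 1.x exception classes already imported for the retry decorator are the ones worth retrying, the handler could distinguish retryable provider errors from everything else. The `_inference` wrapper below is hypothetical; in the PR this logic lives inside `AIHandler.chat_completion`:

```python
from litellm import acompletion
from openai import APIError, RateLimitError
from tenacity import TryAgain

from pr_agent.log import get_logger


async def _inference(model, messages, temperature):
    try:
        return await acompletion(model=model, messages=messages, temperature=temperature)
    except (APIError, RateLimitError) as e:
        # Provider-side failures worth another attempt: surface as TryAgain so
        # the @retry decorator on the caller schedules a retry.
        get_logger().error("Retryable error during OpenAI inference: ", e)
        raise TryAgain from e
```

Anything outside the named exceptions then propagates unchanged, so genuinely unexpected bugs are not silently retried.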
```python
@retry(
    exceptions=(APIError, TryAgain, AttributeError, RateLimitError),
    tries=OPENAI_RETRIES,
    delay=2,
    backoff=2,
    jitter=(1, 3),
)
```
Suggestion: It's a good practice to avoid using magic numbers directly in the code. The numbers used in the `retry` decorator in the `chat_completion` method (like `2`, `1`, `3`) could be replaced with well-named constants to improve code readability and maintainability. [best practice]

Suggested change:

```python
DELAY = 2
BACKOFF = 2
JITTER = (1, 3)

@retry(
    exceptions=(APIError, TryAgain, AttributeError, RateLimitError),
    tries=OPENAI_RETRIES,
    delay=DELAY,
    backoff=BACKOFF,
    jitter=JITTER,
)
```
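For a sense of what those numbers mean at runtime: assuming the decorator is the `retry` package, whose delay grows as `delay = delay * backoff + jitter` after each failed attempt, and an assumed `OPENAI_RETRIES` of 5, the sleep schedule can be previewed like this:

```python
import random

tries = 5  # assumed value for OPENAI_RETRIES; the real constant is not shown here
delay, backoff, jitter = 2, 2, (1, 3)

wait = delay
for attempt in range(1, tries):
    print(f"after failed attempt {attempt}: sleep ~{wait:.1f}s")
    # The retry package multiplies by backoff and adds uniform jitter each round.
    wait = wait * backoff + random.uniform(*jitter)
```

So with these values, waits start around 2s and roughly double each time, with 1-3s of randomness to avoid synchronized retries.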
```diff
@@ -375,7 +375,7 @@ def get_user_labels(current_labels: List[str] = None):

 def get_max_tokens(model):
     settings = get_settings()
-    max_tokens_model = MAX_TOKENS[model]
+    max_tokens_model = MAX_TOKENS.get(model, 4000)
```
Suggestion: Consider adding a default value for `MAX_TOKENS` in the `get_max_tokens` function. This will make the function more robust in case the `MAX_TOKENS` dictionary doesn't contain the model as a key. [enhancement]

Suggested change:

```python
DEFAULT_MAX_TOKENS = 4000
max_tokens_model = MAX_TOKENS.get(model, DEFAULT_MAX_TOKENS)
```
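Putting the diff and the suggestion together, the whole function might read as follows. This is a sketch assembled from the excerpts in this thread (the clamping lines appear in an earlier utils.py excerpt), not a verbatim copy of the file, and the import paths are assumptions about pr_agent's layout:

```python
from pr_agent.algo import MAX_TOKENS
from pr_agent.config_loader import get_settings

DEFAULT_MAX_TOKENS = 4000  # named default, per the suggestion above


def get_max_tokens(model):
    settings = get_settings()
    # Fall back to the default for models missing from the MAX_TOKENS table.
    max_tokens_model = MAX_TOKENS.get(model, DEFAULT_MAX_TOKENS)
    if settings.config.max_model_tokens:
        # Never exceed a user-configured cap when one is set.
        max_tokens_model = min(settings.config.max_model_tokens, max_tokens_model)
    return max_tokens_model
```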
requirements.txt (Outdated)

```diff
@@ -23,3 +22,4 @@ pinecone-client
 pinecone-datasets @ git+https://github.com/mrT23/pinecone-datasets.git@main
 loguru==0.7.2
 google-cloud-aiplatform==1.35.0
+litellm==1.7.1
```
Suggestion: It's a good practice to pin the versions of your dependencies to ensure your project does not break due to updates in the dependencies. Consider pinning the version of `litellm` in your `requirements.txt` file. [best practice]

Suggested change:

```
litellm==1.7.1  # Pin the version to ensure compatibility
```
/analyze
PR Analysis

Here is a components analysis of the PR, as well as a change summary for each component:
PTAL
Type

Enhancement

Description

This PR introduces support for HuggingFace self-served TGI serving and makes several enhancements to the AI Handler. The main changes include (a configuration sketch follows this list):

- Addition of a new attribute `api_base` to the `AIHandler` class.
- Update of the `chat_completion` method to include `api_base` in the `acompletion` call.
- Modification of the `chat_completion` method to add prefixes and suffixes to system and user messages based on the configuration.
- Handling of the case where the model is not found in the `MAX_TOKENS` dictionary in the `get_max_tokens` function.
- Update of the `openai` version from `0.27.8` to `1.3.5` and the `litellm` version from `0.12.5` to `1.7.1` in the `requirements.txt` file.
- Replacement of the import of `TryAgain` from `openai.error` with `tenacity`.
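As the promised sketch, here is a rough end-to-end example of calling a self-served TGI endpoint through litellm, mirroring what `AIHandler` does with the `HUGGINGFACE.*` settings above. The model id, token, and URL are placeholders, not values from the PR:

```python
import asyncio

import litellm
from litellm import acompletion


async def main():
    litellm.huggingface_key = "hf_your_token_here"  # placeholder for HUGGINGFACE.KEY
    response = await acompletion(
        model="huggingface/bigcode/starcoder",        # placeholder model id
        messages=[{"role": "user", "content": "Say hello"}],
        temperature=0.2,
        api_base="http://localhost:8080",             # placeholder HUGGINGFACE.API_BASE (TGI server)
    )
    # Response access pattern matches the one used in chat_completion above.
    print(response["choices"][0]["message"]["content"])


asyncio.run(main())
```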
PR changes walkthrough

Enhancement:

- ai_handler.py (pr_agent/algo/ai_handler.py): The `AIHandler` class has been updated to include a new attribute `api_base`. The `chat_completion` method has been modified to include `api_base` in the `acompletion` call and to add prefixes and suffixes to system and user messages based on the configuration. The import of `TryAgain` has been replaced from `openai.error` with `tenacity`.
- utils.py (pr_agent/algo/utils.py): The `get_max_tokens` function has been updated to handle the case where the model is not found in the `MAX_TOKENS` dictionary.
- requirements.txt: The `openai` version has been updated from `0.27.8` to `1.3.5` and the `litellm` version from `0.12.5` to `1.7.1`.