feat: add support for Falcon LLM
At the moment, however, the Falcon LLM seems to hallucinate quite often; we should support custom prompts for each LLM.
gventuri committed Jun 8, 2023
1 parent c0097e2 commit 53a17fe
Showing 8 changed files with 138 additions and 30 deletions.
2 changes: 1 addition & 1 deletion Notebooks/Getting Started.ipynb

Large diffs are not rendered by default.

5 changes: 4 additions & 1 deletion README.md
@@ -163,7 +163,7 @@ Options:

- **-d, --dataset**: The file path to the dataset.
- **-t, --token**: Your HuggingFace or OpenAI API token. If no token is provided, pai will pull it from the `.env` file.
- **-m, --model**: Choice of LLM, either `openai`, `open-assistant`, `starcoder`, `falcon`, `azure-openai` or `google-palm`.
- **-p, --prompt**: Prompt that PandasAI will run.

To view a full list of available options and their descriptions, run the following command:
@@ -205,6 +205,9 @@ llm = OpenAI(api_token="YOUR_API_KEY")

# Starcoder
llm = Starcoder(api_token="YOUR_HF_API_KEY")

# Falcon
llm = Falcon(api_token="YOUR_HF_API_KEY")
```

## License
8 changes: 8 additions & 0 deletions docs/API/llms.md
@@ -33,6 +33,14 @@ Starcoder wrapper extended through Base HuggingFace Class
options:
show_root_heading: true

### Falcon

Falcon wrapper extended through Base HuggingFace Class

::: pandasai.llm.falcon
options:
show_root_heading: true

### Azure OpenAI

OpenAI API through Azure Platform wrapper
26 changes: 19 additions & 7 deletions docs/pai_cli.md
@@ -1,35 +1,47 @@
# Command-Line Tool

[Pai] is the command-line tool designed to provide a convenient way to interact with `pandasai` through a command-line interface (CLI). To access the CLI tool, create a virtualenv for testing purposes and install the project dependencies in your local virtual environment using `pip` by running the following command:

```
pip install -e .
```

Alternatively, you can use `poetry` to create and activate the virtual environment by running the following command:

```
poetry shell
```

Inside the activated virtual environment, install the project dependencies by running the following command:

```
poetry install
```

By following these steps, you will now have the necessary environment to access the CLI tool.

```
pai [OPTIONS]
```

Options:

- **-d, --dataset**: The file path to the dataset.
- **-t, --token**: Your HuggingFace or OpenAI API token. If no token is provided, pai will pull it from the `.env` file.
- **-m, --model**: Choice of LLM, either `openai`, `open-assistant`, `starcoder`, `falcon`, `azure-openai` or `google-palm`.
- **-p, --prompt**: Prompt that PandasAI will run.

To view a full list of available options and their descriptions, run the following command:

```
pai --help
```

> For example,
>
> ```
> pai -d "~/pandasai/example/data/Loan payments data.csv" -m "openai" -p "How many loans are from men and have been paid off?"
> ```
>
> Should result in the same output as the `from_csv.py` example.
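Under the hood, the `--model` option above maps each accepted name to an LLM class in `pai/__main__.py`. A table-driven dispatch is a compact way to picture (and extend) that mapping; the classes below are local stand-ins for illustration, not the real `pandasai` imports:

```python
# Stand-ins for the pandasai LLM classes; the real CLI imports them
# from pandasai.llm.* instead.
class OpenAI: ...
class OpenAssistant: ...
class Starcoder: ...
class Falcon: ...
class GooglePalm: ...

# Mirrors the choices accepted by `pai -m` in this commit.
MODELS = {
    "openai": OpenAI,
    "open-assistant": OpenAssistant,
    "starcoder": Starcoder,
    "falcon": Falcon,
    "palm": GooglePalm,
}

def pick_model(name: str):
    """Return the LLM class registered for `name`, or fail loudly."""
    try:
        return MODELS[name]
    except KeyError:
        raise ValueError(f"Unknown model: {name}") from None
```

Registering each model in one table also avoids the copy-paste risk of a growing `if`/`elif` chain.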
52 changes: 35 additions & 17 deletions pai/__main__.py
@@ -10,7 +10,7 @@
- **-d, --dataset**: The file path to the dataset.
- **-t, --token**: Your HuggingFace or OpenAI API token. If no token is provided, pai will pull
it from the `.env` file.
- **-m, --model**: Choice of LLM, either `openai`, `open-assistant`, `starcoder`, `falcon`, `azure-openai` or `google-palm`.
- **-p, --prompt**: Prompt that PandasAI will run.
To view a full list of available options and their descriptions, run the following command:
@@ -31,20 +31,35 @@
"""
import os

import click
import pandas as pd

from pandasai import PandasAI
from pandasai.llm.falcon import Falcon
from pandasai.llm.google_palm import GooglePalm
from pandasai.llm.open_assistant import OpenAssistant
from pandasai.llm.openai import OpenAI
from pandasai.llm.starcoder import Starcoder


@click.command()
@click.option("-d", "--dataset", type=str, required=True, help="The dataset to use.")
@click.option(
"-t",
"--token",
type=str,
required=False,
default=None,
help="The API token to use.",
)
@click.option(
"-m",
"--model",
type=click.Choice(["openai", "open-assistant", "starcoder", "falcon", "palm"]),
required=True,
help="The type of model to use.",
)
@click.option("-p", "--prompt", type=str, required=True, help="The prompt to use.")
def main(dataset: str, token: str, model: str, prompt: str) -> None:
"""Main logic for the command line interface tool."""

@@ -80,31 +95,34 @@ def main(dataset: str, token: str, model: str, prompt: str) -> None:
".xml": pd.read_xml,
}
if ext in file_format:
df = file_format[ext](dataset) # pylint: disable=C0103
else:
print("Unsupported file format.")
return

except Exception as e: # pylint: disable=W0718 disable=C0103
print(e)
return

if model == "openai":
llm = OpenAI(api_token=token)

elif model == "open-assistant":
llm = OpenAssistant(api_token=token)

elif model == "starcoder":
llm = Starcoder(api_token=token)

elif model == "falcon":
llm = Falcon(api_token=token)

elif model == "palm":
llm = GooglePalm(api_key=token)

try:
pandas_ai = PandasAI(llm, verbose=True)
response = pandas_ai(df, prompt)
print(response)

except Exception as e: # pylint: disable=W0718 disable=C0103
print(e)
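The `file_format` dictionary in `main()` above dispatches on the dataset's file extension. A self-contained sketch of the same pattern (the reader functions here are stand-ins for the `pandas` readers the CLI actually uses):

```python
from pathlib import Path

# Stand-ins for pd.read_csv / pd.read_excel used by the real CLI.
def read_csv(path: str) -> str:
    return f"csv:{path}"

def read_excel(path: str) -> str:
    return f"xlsx:{path}"

FILE_FORMAT = {
    ".csv": read_csv,
    ".xlsx": read_excel,
}

def load_dataset(dataset: str):
    """Pick a reader by extension, mirroring the lookup in main()."""
    ext = Path(dataset).suffix
    if ext not in FILE_FORMAT:
        raise ValueError(f"Unsupported file format: {ext}")
    return FILE_FORMAT[ext](dataset)
```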
51 changes: 51 additions & 0 deletions pandasai/llm/falcon.py
@@ -0,0 +1,51 @@
""" Falcon LLM
This module is to run the Falcon API hosted and maintained by HuggingFace.co.
To generate HF_TOKEN go to https://huggingface.co/settings/tokens after creating Account
on the platform.
Example:
Use below example to call Falcon Model
>>> from pandasai.llm.falcon import Falcon
"""


import os
from typing import Optional

from dotenv import load_dotenv

from ..exceptions import APIKeyNotFoundError
from .base import HuggingFaceLLM

load_dotenv()


class Falcon(HuggingFaceLLM):

"""Falcon LLM API
The base HuggingFaceLLM class is extended to use the Falcon model.
"""

api_token: str
_api_url: str = (
"https://api-inference.huggingface.co/models/tiiuae/falcon-7b-instruct"
)
_max_retries: int = 5

def __init__(self, api_token: Optional[str] = None):
"""
__init__ method of Falcon Class
Args:
api_token (str): API token from Huggingface platform
"""

self.api_token = api_token or os.getenv("HUGGINGFACE_API_KEY") or None
if self.api_token is None:
raise APIKeyNotFoundError("HuggingFace Hub API key is required")

@property
def type(self) -> str:
return "falcon"
7 changes: 3 additions & 4 deletions pandasai/llm/open_assistant.py
Expand Up @@ -5,7 +5,7 @@
the platform.
Example:
Use the example below to call the OpenAssistant model:
>>> from pandasai.llm.open_assistant import OpenAssistant
"""
@@ -23,8 +23,8 @@

class OpenAssistant(HuggingFaceLLM):
"""Open Assistant LLM
A base HuggingFaceLLM class is extended to use OpenAssistant Model via its API.
Currently `oasst-sft-1-pythia-12b` is supported via this module.
"""

api_token: str
@@ -35,7 +35,6 @@ class OpenAssistant(HuggingFaceLLM):
_max_retries: int = 10

def __init__(self, api_token: Optional[str] = None):

"""
__init__ method of OpenAssistant Class
17 changes: 17 additions & 0 deletions tests/llms/test_falcon.py
@@ -0,0 +1,17 @@
"""Unit tests for the falcon LLM class"""

import pytest

from pandasai.exceptions import APIKeyNotFoundError
from pandasai.llm.falcon import Falcon


class TestFalconLLM:
"""Unit tests for the Falcon LLM class"""

def test_type(self):
assert Falcon(api_token="test").type == "falcon"

def test_init(self):
with pytest.raises(APIKeyNotFoundError):
Falcon()
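Note that `test_init` only passes when `HUGGINGFACE_API_KEY` is absent from the environment, because `Falcon.__init__` falls back to it. A self-contained sketch of that interaction; the `Falcon` class here is a minimal stand-in mirroring the committed constructor, not the real import:

```python
import os

class APIKeyNotFoundError(Exception):
    """Stand-in for pandasai.exceptions.APIKeyNotFoundError."""

class Falcon:
    """Minimal stand-in for the committed Falcon constructor logic."""
    def __init__(self, api_token=None):
        self.api_token = api_token or os.getenv("HUGGINGFACE_API_KEY")
        if self.api_token is None:
            raise APIKeyNotFoundError("HuggingFace Hub API key is required")

# Clearing the variable makes the missing-key behavior deterministic;
# in pytest, monkeypatch.delenv("HUGGINGFACE_API_KEY", raising=False)
# achieves the same inside a test.
os.environ.pop("HUGGINGFACE_API_KEY", None)
```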
