Commit

Update the documentation for various AI providers and summarization

InAnYan committed Aug 1, 2024
1 parent 527e1e3 commit 3f3cd2a

Showing 4 changed files with 126 additions and 22 deletions.
148 changes: 126 additions & 22 deletions en/ai.md
# AI functionality in JabRef

## AI summary tab

We have made a new entry editor tab: "AI Summary", where AI will generate for you a quick overview of the paper.

![AI summary tab screenshot](/img/AiSummary.png)

The AI summarizes the main objectives of the research, the methods used, the key findings, and the conclusions.

## AI chat tab

We have made a new entry editor tab: "AI chat", where all the chatting happens.
The next new entry editor tab is "AI chat", where all the question and answering (Q&A) happens.

![AI chat tab screenshot](img/AiIntro.png)
![AI chat tab screenshot](/img/AiChat.png)

In this window you can see the following elements:

- Chat history with your messages
- A prompt field for sending messages
- A button for clearing the chat history (just in case)

## How does the AI functionality work?

In the background, JabRef analyses the linked PDF files of library entries. The information gathered during indexing is then supplied to the AI, which, to be precise, is in our case a Large Language Model (LLM). The LLM is currently not stored on your computer. Instead, we have integrations with several AI providers (OpenAI, Mistral AI, Hugging Face), so you can choose the one you like the most. These AI providers are available only remotely via the internet. In short: we send chunks of text to the AI service and then receive processed responses. In order to use this functionality, you need to configure JabRef with your [API](https://en.wikipedia.org/wiki/API) key.
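For the curious, here is a rough sketch of what one such exchange looks like. This is not JabRef's actual code (JabRef is written in Java); it is a minimal Python illustration, assuming an OpenAI-style chat endpoint and a made-up PDF chunk:

```python
# A simplified sketch of one round trip with an AI provider: a chunk of text
# from a linked PDF is sent along with the API key, and the LLM's answer comes back.
import requests

API_KEY = "<YOUR_API_KEY>"  # the key you configure in JabRef's preferences
chunk = "...a chunk of text extracted from a linked PDF..."

response = requests.post(
    "https://api.openai.com/v1/chat/completions",  # OpenAI's chat endpoint
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "model": "gpt-4o-mini",
        "messages": [
            {"role": "system", "content": f"Answer using only this context:\n{chunk}"},
            {"role": "user", "content": "What are the key findings of this paper?"},
        ],
    },
)
print(response.json()["choices"][0]["message"]["content"])
```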

## What is an AI provider?

An AI provider is a company or service that gives you the ability to send requests to and receive responses from an LLM. In order to get a response, you also need to send an API key to authenticate and manage billing.

Here is the list of AI providers we currently support: OpenAI, Mistral AI, and Hugging Face. Other providers exist (Google Vertex AI, Microsoft Azure OpenAI, Anthropic, etc.), but they are not yet supported.

## What is an API key?

An API key or API token is like a password that lets an app or program access information or services from another
app or website, such as an LLM service. It ensures only authorized users or applications can use
the service. For example, when an app uses an LLM service to generate text or answer questions, it includes its
unique API key in the request. The LLM service checks this key to make sure the request is legitimate before
providing the response. This process keeps the data secure and helps track how the service is being used.
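To make this concrete, here is a toy sketch of that handshake. Neither side is real provider code: the keys, the lookup table, and the "LLM" are all made up for illustration:

```python
# A toy illustration of API-key authentication: the service checks the key
# before answering, and records usage per account for billing and tracking.
AUTHORIZED_KEYS = {"sk-alice-123": "alice", "sk-bob-456": "bob"}

def handle_request(headers: dict, prompt: str) -> str:
    # The service extracts the key from the request and validates it first...
    key = headers.get("Authorization", "").removeprefix("Bearer ")
    user = AUTHORIZED_KEYS.get(key)
    if user is None:
        return "401 Unauthorized: unknown API key"
    # ...and only then does the (here, pretend) LLM produce a response.
    print(f"recording one request for {user}'s bill")
    return f"(pretend LLM answer to {prompt!r})"

print(handle_request({"Authorization": "Bearer sk-alice-123"}, "Hi!"))
```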

## Which AI provider should I use?

We recommend choosing OpenAI.

For Mistral AI you need a subscription, while for OpenAI you can make a one-time payment.

Hugging Face gives you free access to numerous models. However, it can take a very long time for Hugging Face to allocate free computing resources for you, so the response time will also be long.

## How to get an API key?

### How to get an OpenAI API key?

To get an OpenAI API key you need to perform these steps:

1. Log in or create an account on the [OpenAI website](https://platform.openai.com/login?launch)
2. Go to "API" section
3. Go to "Dashboard" (upper-right corner)
4. Go to "API keys" (left menu)
5. Click "Create new secret key"
6. Click "Create secret key"
7. OpenAI will show you the key

### How to get a Mistral AI API key?

1. Log in or create an account on the [Mistral AI website](https://auth.mistral.ai/ui/login)
2. Go to the [dashboard -> API keys](https://console.mistral.ai/api-keys/)
3. There you will find a button "Create new key". Click on it
4. Optionally, set a name and an expiration date for the API key
5. After creation, you will see "Your key is:" followed by a string of random characters

### How to get a Hugging Face API key?

Hugging Face calls an API key an "Access Token". This makes no practical difference: you can use "API key", "API token", and "access token" interchangeably.

1. [Log in](https://huggingface.co/login) or [create an account](https://huggingface.co/join) on Hugging Face
2. Go to [create access token](https://huggingface.co/settings/tokens/new?)
3. Set "Token Type" to "Read"
4. Give the token a name
5. After you click "Create token", a popup will show the API key

## What should I do with the API key and how can I enter it in JabRef?

Do not share the key with anyone: it is a secret created only for your account. Do not enter the key into unknown or unverified services.

Now you need to copy and paste it into the JabRef preferences. To do this:

1. Launch JabRef
2. Go to "File" -> "Preferences" -> "AI" (a new tab!)
3. Check "Enable AI functionality in JabRef"
4. Paste the key into "API token"
5. Click "Save"

If you have some money on your credit balance, you can chat with your library!

## How to increase the credit balance for your API key?

### OpenAI

In order to increase your credit balance on OpenAI, do this:

1. Add a payment method on the [payment methods page](https://platform.openai.com/settings/organization/billing/payment-methods).
2. Add credit balance on the [billing overview page](https://platform.openai.com/settings/organization/billing/overview).

### Mistral AI

Subscribe on [their website](https://console.mistral.ai/billing/subscribe/).

### Hugging Face

You don't have to pay a cent to send requests to LLMs via Hugging Face. However, the responses are very slow.

## AI preferences

Here are some new options in the JabRef preferences.

![AI preferences](../img/AiPreferences.png)

- "Enable chatting with PDFs": by default chatting is turned off, so you need to check this option, if you want to use the new AI features
- "OpenAI token": here you page your API token
- "Expert settings": here you can change the parameters that affect how AI will generate your answers. If you don't understand the meaning of those settings, don't worry! We have experimented a lot and found the best parameters for you!
- "Enable AI functionality in JabRef": by default it's turned off, so you need to check this option, if you want to use the new AI features
- "AI provider": you can choose either OpenAI, Mistral AI, or Hugging Face
- "Chat model": choose the model you like (for OpenAI we recommend `gpt-4o-mini`, as it the cheapest and fastest)
- "API token": here you write your API token
- "Expert settings": here you can change the parameters that affect how AI will generate your answers. If you don't understand the meaning of those settings, don't worry! We have experimented a lot and found the best parameters for you! But if you are curious, then you can refer to [user documentation]()


## AI expert settings

### API base URL

**Type**: string

**Requirements**: valid URL address

The "API base URL" is a setting that tells your application where to find the language model's online service. Think of it as the main address or starting point for all communications with the language model. By specifying this URL, your application knows exactly where to send its requests to get responses from the language model.

You don't have to set this parameter manually or remember all the addresses: JabRef automatically fills in the address for you when you select an AI provider.
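As a sketch of why this single setting is enough to switch providers, consider how a client builds its request URLs. The function below is illustrative, not JabRef's code, and the Mistral AI base URL is given as an example of an OpenAI-compatible endpoint:

```python
# The client glues the endpoint path onto the base URL, so changing the
# base URL redirects every request to a different (compatible) service.
import requests

def chat(base_url: str, api_key: str, model: str, prompt: str) -> str:
    response = requests.post(
        f"{base_url}/chat/completions",
        headers={"Authorization": f"Bearer {api_key}"},
        json={"model": model, "messages": [{"role": "user", "content": prompt}]},
    )
    return response.json()["choices"][0]["message"]["content"]

# Same code, different providers -- only the base URL and model name change:
# chat("https://api.openai.com/v1", key, "gpt-4o-mini", "Hello!")
# chat("https://api.mistral.ai/v1", key, "mistral-small-latest", "Hello!")
```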

### Embedding model

### Retrieval augmented generation: minimum score
The "Retrieval augmented generation: minimum score" parameter sets the threshold for relevance when retrieving chunks of text for generation. It specifies the minimum score that segments must achieve to be included in the results. Any text segments scoring below this threshold are excluded from consideration in the AI's response generation process.

This parameter is crucial in ensuring that the AI model focuses on retrieving and utilizing only the most relevant information from the retrieved chunks. By filtering out segments that do not meet the specified relevance score, the AI enhances the quality and accuracy of its responses, aligning more closely with the user's needs and query context.
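If you are curious what this filtering looks like in code, here is a toy sketch. The embeddings, chunks, and threshold value are made up; a real system computes the vectors with an embedding model:

```python
# A toy sketch of the "minimum score" filter: chunks are scored by cosine
# similarity to the query, and anything below the threshold is discarded.
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

query_embedding = [0.9, 0.1, 0.3]  # pretend embedding of the user's question
chunks = {
    "Methods: we ran a randomized controlled trial...": [0.8, 0.2, 0.4],
    "Acknowledgements: we thank the funding agencies...": [0.1, 0.9, 0.2],
}

MINIMUM_SCORE = 0.5
relevant = {
    text: score
    for text, vec in chunks.items()
    if (score := cosine_similarity(query_embedding, vec)) >= MINIMUM_SCORE
}
print(relevant)  # only chunks that clear the threshold reach the LLM
```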

## BONUS: running a local LLM

Notice:

1. This tutorial is intended for expert users
2. Running a local LLM requires a lot of computational power
3. Smaller models typically perform worse than bigger ones like the OpenAI models

### General explanation

You can use any program that creates a server with an OpenAI-compatible API.

After you have started your service, you can do this:

1. The "Chat model" field in the AI preferences is editable, so you can enter any model you have downloaded
2. In the "Expert settings", set the "API base URL" field to the address of your OpenAI API-compatible server

Voilà! You can use a local LLM right away in JabRef.

### More detailed tutorial

In this section, we explain how to use `ollama` to download and run local LLMs.

1. Install `ollama` from [their website](https://ollama.com/download)
2. Select a model that you want to run. `ollama` provides [a big list of models](https://ollama.com/library) to choose from (we recommend trying [`gemma2:2b`](https://ollama.com/library/gemma2:2b), [`mistral:7b`](https://ollama.com/library/mistral), or [`tinyllama`](https://ollama.com/library/tinyllama))
3. Once you have selected your model, type `ollama pull <MODEL>:<PARAMETERS>` in your terminal. `<MODEL>` refers to the model name, like `gemma2` or `mistral`, and `<PARAMETERS>` refers to the parameter count, like `2b` or `9b`
4. `ollama` will download the model for you
5. After that, you can run `ollama serve` to start a local web server. This is the server to which you can send requests, and it will respond with LLM output. Note: the `ollama` server may already be running, so don't be alarmed by a `cannot bind` error
6. Go to JabRef "Preferences" -> "AI"
7. Set the "AI provider" to "OpenAI"
8. Set the "Chat model" to whichever model you have downloaded, in the form `<MODEL>:<PARAMETERS>`
9. Set the "API base URL" in "Expert Settings" to: `http://localhost:11434/v1/`

And now you are all set!
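If you want to double-check the setup before opening JabRef, you can query the local server directly. This sketch assumes you pulled `gemma2:2b`; substitute whatever model you downloaded:

```python
# A quick sanity check against ollama's OpenAI-compatible endpoint
# (the same URL you entered in step 9). No API key is needed locally.
import requests

response = requests.post(
    "http://localhost:11434/v1/chat/completions",
    json={
        "model": "gemma2:2b",  # must match the model you pulled
        "messages": [{"role": "user", "content": "Say hello!"}],
    },
)
print(response.json()["choices"][0]["message"]["content"])
```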

Binary file added en/img/AiChat.png
Binary file added en/img/AiPreferences.png
Binary file added en/img/AiSummary.png
