Merge branch 'main' into doc/google-tools
jzhao62 authored Dec 16, 2024
2 parents 89c7a75 + 0e402d6 commit 92dba45
Showing 5 changed files with 24 additions and 15 deletions.
22 changes: 11 additions & 11 deletions docs/docs/understanding/rag/index.md
@@ -1,13 +1,13 @@
 !!! tip
-    If you haven't, [install LlamaIndex](./installation.md) and complete the [starter tutorial](./starter_example.md) before you read this. It will help ground these steps in your experience.
+    If you haven't, [install LlamaIndex](../../getting_started/installation.md) and complete the [starter tutorial](../../getting_started/starter_example.md) before you read this. It will help ground these steps in your experience.

 LLMs are trained on enormous bodies of data but they aren't trained on **your** data. Retrieval-Augmented Generation (RAG) solves this problem by adding your data to the data LLMs already have access to. You will see references to RAG frequently in this documentation. Query engines, chat engines and agents often use RAG to complete their tasks.

 In RAG, your data is loaded and prepared for queries or "indexed". User queries act on the index, which filters your data down to the most relevant context. This context and your query then go to the LLM along with a prompt, and the LLM provides a response.

 Even if what you're building is a chatbot or an agent, you'll want to know RAG techniques for getting data into your application.

-![](../_static/getting_started/basic_rag.png)
+![](../../_static/getting_started/basic_rag.png)

 ### Stages within RAG

@@ -23,36 +23,36 @@ There are five key stages within RAG, which in turn will be a part of most large

 - **Evaluation**: a critical step in any flow is checking how effective it is relative to other strategies, or when you make changes. Evaluation provides objective measures of how accurate, faithful and fast your responses to queries are.

-![](../_static/getting_started/stages.png)
+![](../../_static/getting_started/stages.png)

 ### Important concepts within RAG

 There are also some terms you'll encounter that refer to steps within each of these stages.

 #### Loading stage

-[**Nodes and Documents**](../module_guides/loading/documents_and_nodes/index.md): A `Document` is a container around any data source - for instance, a PDF, an API output, or retrieve data from a database. A `Node` is the atomic unit of data in LlamaIndex and represents a "chunk" of a source `Document`. Nodes have metadata that relate them to the document they are in and to other nodes.
+[**Nodes and Documents**](../../module_guides/loading/documents_and_nodes/index.md): A `Document` is a container around any data source - for instance, a PDF, an API output, or retrieve data from a database. A `Node` is the atomic unit of data in LlamaIndex and represents a "chunk" of a source `Document`. Nodes have metadata that relate them to the document they are in and to other nodes.

-[**Connectors**](../module_guides/loading/connector/index.md):
+[**Connectors**](../../module_guides/loading/connector/index.md):
 A data connector (often called a `Reader`) ingests data from different data sources and data formats into `Documents` and `Nodes`.

 #### Indexing Stage

-[**Indexes**](../module_guides/indexing/index.md):
+[**Indexes**](../../module_guides/indexing/index.md):
 Once you've ingested your data, LlamaIndex will help you index the data into a structure that's easy to retrieve. This usually involves generating `vector embeddings` which are stored in a specialized database called a `vector store`. Indexes can also store a variety of metadata about your data.

-[**Embeddings**](../module_guides/models/embeddings.md): LLMs generate numerical representations of data called `embeddings`. When filtering your data for relevance, LlamaIndex will convert queries into embeddings, and your vector store will find data that is numerically similar to the embedding of your query.
+[**Embeddings**](../../module_guides/models/embeddings.md): LLMs generate numerical representations of data called `embeddings`. When filtering your data for relevance, LlamaIndex will convert queries into embeddings, and your vector store will find data that is numerically similar to the embedding of your query.

 #### Querying Stage

-[**Retrievers**](../module_guides/querying/retriever/index.md):
+[**Retrievers**](../../module_guides/querying/retriever/index.md):
 A retriever defines how to efficiently retrieve relevant context from an index when given a query. Your retrieval strategy is key to the relevancy of the data retrieved and the efficiency with which it's done.

-[**Routers**](../module_guides/querying/router/index.md):
+[**Routers**](../../module_guides/querying/router/index.md):
 A router determines which retriever will be used to retrieve relevant context from the knowledge base. More specifically, the `RouterRetriever` class, is responsible for selecting one or multiple candidate retrievers to execute a query. They use a selector to choose the best option based on each candidate's metadata and the query.

-[**Node Postprocessors**](../module_guides/querying/node_postprocessors/index.md):
+[**Node Postprocessors**](../../module_guides/querying/node_postprocessors/index.md):
 A node postprocessor takes in a set of retrieved nodes and applies transformations, filtering, or re-ranking logic to them.

-[**Response Synthesizers**](../module_guides/querying/response_synthesizers/index.md):
+[**Response Synthesizers**](../../module_guides/querying/response_synthesizers/index.md):
 A response synthesizer generates a response from an LLM, using a user query and a given set of retrieved text chunks.
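
Reviewer note: the querying concepts described in this doc (embeddings, retriever, node postprocessor, response synthesizer) can be illustrated with a toy pipeline. The sketch below is plain Python and does not use the LlamaIndex API. All function names are hypothetical, and the bag-of-words `embed` is only a stand-in for a real embedding model.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: lowercase bag-of-words counts."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def build_index(chunks):
    """Indexing stage: store each chunk next to its embedding."""
    return [(chunk, embed(chunk)) for chunk in chunks]


def retrieve(index, query, top_k=2):
    """Retriever: score every chunk against the query and keep the top_k."""
    query_vec = embed(query)
    scored = [(cosine(query_vec, vec), chunk) for chunk, vec in index]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


def postprocess(scored, min_score=0.1):
    """Node postprocessor: drop chunks below a similarity threshold."""
    return [chunk for score, chunk in scored if score >= min_score]


def build_prompt(query, context):
    """Response synthesis input: combine retrieved context with the query."""
    joined = "\n".join(f"- {chunk}" for chunk in context)
    return f"Context:\n{joined}\n\nQuestion: {query}\nAnswer:"


chunks = [
    "The warranty covers battery defects for two years.",
    "Our office is closed on public holidays.",
    "Battery replacements are free during the warranty period.",
]
query = "how long is the battery warranty"
index = build_index(chunks)
hits = postprocess(retrieve(index, query))
prompt = build_prompt(query, hits)
print(prompt)
```

In a real application the prompt would be sent to an LLM. Here it is only printed, so the example runs without any external service.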
@@ -166,7 +166,10 @@ def messages_to_anthropic_messages(
                         if "cache_control" in message.additional_kwargs
                         else TextBlockParam(text=block.text, type="text")
                     )
-                    content.append(content_)
+
+                    # avoid empty text blocks
+                    if content_["text"]:
+                        content.append(content_)
                 elif isinstance(block, ImageBlock):
                     # FUTURE: Claude does not support URLs, so we need to always convert to base64
                     img_bytes = block.resolve_image(as_base64=True).read()
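
Reviewer note: the effect of the new guard can be seen in isolation with the minimal sketch below. Plain dicts stand in for `TextBlockParam`, and the helper name `text_blocks_to_params` is hypothetical (it is not part of this commit).

```python
def text_blocks_to_params(texts):
    """Build text block dicts from raw strings, skipping empty ones."""
    content = []
    for text in texts:
        # Stand-in for TextBlockParam(text=..., type="text")
        content_ = {"type": "text", "text": text}
        # avoid empty text blocks (same truthiness check as in the diff)
        if content_["text"]:
            content.append(content_)
    return content


params = text_blocks_to_params(["Hello", "", "world"])
print(params)
```

Because the check relies on truthiness, only the empty string is dropped. A whitespace-only string such as `" "` would still be appended.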
@@ -27,7 +27,7 @@ exclude = ["**/BUILD"]
 license = "MIT"
 name = "llama-index-llms-anthropic"
 readme = "README.md"
-version = "0.6.0"
+version = "0.6.1"

 [tool.poetry.dependencies]
 python = ">=3.9,<4.0"
@@ -68,7 +68,13 @@ def to_mistral_chatmessage(
         elif m.role == MessageRole.SYSTEM:
             new_messages.append(SystemMessage(content=m.content))
         elif m.role == MessageRole.TOOL or m.role == MessageRole.FUNCTION:
-            new_messages.append(ToolMessage(content=m.content))
+            new_messages.append(
+                ToolMessage(
+                    content=m.content,
+                    tool_call_id=m.additional_kwargs.get("tool_call_id"),
+                    name=m.additional_kwargs.get("name"),
+                )
+            )
         else:
             raise ValueError(f"Unsupported message role {m.role}")
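
Reviewer note: the change forwards `tool_call_id` and `name` from `additional_kwargs` and falls back to `None` when a key is absent, because `dict.get` returns `None` for missing keys. The sketch below illustrates that behavior. `FakeChatMessage`, `FakeToolMessage` and `to_tool_message` are hypothetical stand-ins, not the real LlamaIndex or Mistral classes.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class FakeChatMessage:
    """Hypothetical stand-in for the incoming chat message."""

    content: str
    additional_kwargs: Dict[str, Any] = field(default_factory=dict)


@dataclass
class FakeToolMessage:
    """Hypothetical stand-in for the outgoing tool message."""

    content: str
    tool_call_id: Optional[str] = None
    name: Optional[str] = None


def to_tool_message(m: FakeChatMessage) -> FakeToolMessage:
    """Mirror the diff: copy tool metadata, using None when it is missing."""
    return FakeToolMessage(
        content=m.content,
        tool_call_id=m.additional_kwargs.get("tool_call_id"),
        name=m.additional_kwargs.get("name"),
    )


with_id = to_tool_message(
    FakeChatMessage("42", {"tool_call_id": "call_1", "name": "add"})
)
without_id = to_tool_message(FakeChatMessage("42"))
print(with_id)
print(without_id)
```

Messages that never carried tool metadata still convert without raising, which matches the use of `.get` rather than direct indexing in the diff.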

@@ -27,7 +27,7 @@ exclude = ["**/BUILD"]
 license = "MIT"
 name = "llama-index-llms-mistralai"
 readme = "README.md"
-version = "0.3.0"
+version = "0.3.1"

 [tool.poetry.dependencies]
 python = ">=3.9,<4.0"