docs: Add embedding section to context doc (#2375)
* docs: Add embedding section to context doc

Signed-off-by: TennyZhuang <[email protected]>

* update doc

Signed-off-by: TennyZhuang <[email protected]>

* refine doc

Signed-off-by: TennyZhuang <[email protected]>

* Apply suggestions from code review

---------

Signed-off-by: TennyZhuang <[email protected]>
Co-authored-by: Meng Zhang <[email protected]>
TennyZhuang and wsxiaoys authored Jun 9, 2024
1 parent 94d35f0 commit 02560fd
Showing 2 changed files with 31 additions and 3 deletions.
2 changes: 1 addition & 1 deletion crates/tabby-scheduler/src/doc/mod.rs
@@ -72,7 +72,7 @@ impl IndexAttributeBuilder<SourceDocument> for DocBuilder {
}

let chunk = json!({
doc::fields::CHUNK_TEXT: chunk_text,
doc::fields::CHUNK_TEXT: chunk_text,
});

yield (chunk_embedding_tokens, chunk)
32 changes: 30 additions & 2 deletions website/docs/administration/context/index.mdx
@@ -25,7 +25,7 @@ The repository context is used to connect Tabby with a source code repository fr
<img src={GitUrl} alt="Git" width="700" />

- For GitHub / GitLab, a personal access token is required to access private repositories.
* Check the instructions in the corresponding tab to create a token.
* Check the instructions in the corresponding tab to create a token.

<img src={GitHubGitLabUrl} alt="GitHub or GitLab" width="700" />

@@ -59,4 +59,32 @@ Once connected, the indexing job will start automatically. You can check the sta

Additionally, you can also visit the **Code Browser** page to view the connected repository.

<img src={CodeBrowserUrl} alt="code browser" width="800" />
<img src={CodeBrowserUrl} alt="code browser" width="800" />'

## Internal: Vector Index

When a document is added, it is converted into vectors that make it possible to find relevant context quickly. During searches or chats, queries and messages are converted into vectors as well, and these are compared against the stored vectors to locate the most similar documents.
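
As a rough illustration of the lookup described above, the sketch below ranks document vectors by cosine similarity to a query vector. It is not Tabby's actual implementation: the function names `cosine_similarity` and `most_similar` are hypothetical, the brute-force scan stands in for a real index, and the tiny three-dimensional vectors are placeholders for real embeddings.

```rust
/// Cosine similarity between two vectors of the same length.
/// Returns 0.0 if either vector has zero magnitude.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return 0.0;
    }
    dot / (norm_a * norm_b)
}

/// Index of the document vector most similar to the query,
/// or `None` when there are no documents.
/// (Hypothetical helper: a linear scan, not an index lookup.)
fn most_similar(query: &[f32], docs: &[Vec<f32>]) -> Option<usize> {
    docs.iter()
        .enumerate()
        .map(|(i, doc)| (i, cosine_similarity(query, doc)))
        .max_by(|a, b| a.1.total_cmp(&b.1))
        .map(|(i, _)| i)
}
```

For example, given the document vectors `[1.0, 0.0, 0.0]`, `[0.0, 1.0, 0.0]`, and `[0.7, 0.7, 0.0]`, calling `most_similar(&[0.9, 0.1, 0.0], &docs)` returns `Some(0)`, because the query points in nearly the same direction as the first document.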

### Using the default embedding model

The default embedding model is "Nomic-Embed-Text", which is a high-performing open embedding model with a large token context window.

Currently, "Nomic-Embed-Text" is the only supported local embedding model.

### Using a remote embedding model provider

You can also use a remote embedding model provider by adding a new section to the `~/.tabby/config.toml` file.

```toml
[model.embedding.http]
kind = "openai/embedding"
api_key = "sk-..."
model_name = "text-embedding-3-small"
```

The following embedding model providers are supported:

* `openai/embedding`
* `voyageai/embedding`
* `llama.cpp/embedding`
* `ollama/embedding`
