Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Refactoring to share rag things

Co-authored-by: Eugene Yurtsev
Showing 45 changed files with 1,979 additions and 2,188 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
| | @@ -0,0 +1 @@ |
| 1 | +*.sql |
Original file line number | Diff line number | Diff line change |
---|---|---|
| | @@ -1,8 +1,5 @@ |
1 | | -"""RAG environments.""" |
2 | | -from langchain_benchmarks.rag.evaluators import RAG_EVALUATION |
3 | | -from langchain_benchmarks.rag.environments.langchain_docs.task import ( |
4 | | -    LANGCHAIN_DOCS_TASK, |
5 | | -) |
| 1 | +from langchain_benchmarks.rag.evaluators import get_eval_config |
| 2 | +from langchain_benchmarks.rag.tasks import LANGCHAIN_DOCS_TASK |
6 | 3 | |
7 | | -# Please keep this list sorted! |
8 | | -__all__ = ["LANGCHAIN_DOCS_TASK", "RAG_EVALUATION"] |
| 4 | +# Please keep this sorted |
| 5 | +__all__ = ["get_eval_config", "LANGCHAIN_DOCS_TASK"] |
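
The "Please keep this sorted" comment in the hunk above is easy to violate without noticing. A minimal sketch of a check follows, assuming the intended order is case-insensitive (the new `__all__` lists `get_eval_config` before `LANGCHAIN_DOCS_TASK`, which a plain ASCII sort would not). `is_sorted_case_insensitive` is a hypothetical helper, not part of `langchain_benchmarks`.

```python
def is_sorted_case_insensitive(names):
    """Return True if names are in case-insensitive alphabetical order."""
    lowered = [name.lower() for name in names]
    return lowered == sorted(lowered)


# The list added in the hunk above, copied by hand for illustration.
new_all = ["get_eval_config", "LANGCHAIN_DOCS_TASK"]

# Case-insensitive order holds for this list.
print(is_sorted_case_insensitive(new_all))

# A plain sorted() compares code points, so uppercase names come first
# and the same list would be reported as out of order.
print(sorted(new_all) == new_all)
```

A check like this could live in a unit test, so that the ordering convention is enforced rather than left to the comment.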
5 changes: 0 additions & 5 deletions
langchain_benchmarks/rag/environments/langchain_docs/architectures/__init__.py
This file was deleted.
18 changes: 0 additions & 18 deletions
langchain_benchmarks/rag/environments/langchain_docs/langchain_docs_retriever/__init__.py
This file was deleted.
Binary file removed (-4.18 MB)
..._benchmarks/rag/environments/langchain_docs/langchain_docs_retriever/db_docs/docs.parquet
Binary file not shown.
47 changes: 0 additions & 47 deletions
langchain_benchmarks/rag/environments/langchain_docs/langchain_docs_retriever/download_db.py
This file was deleted.
205 changes: 0 additions & 205 deletions
langchain_benchmarks/rag/environments/langchain_docs/langchain_docs_retriever/retriever.py
This file was deleted.
Original file line number | Diff line number | Diff line change |
---|---|---|
| | @@ -0,0 +1,7 @@ |
| 1 | +from langchain_benchmarks.rag.tasks.langchain_docs.task import LANGCHAIN_DOCS_TASK |
| 2 | +from langchain_benchmarks.rag.tasks.semi_structured_earnings.task import ( |
| 3 | +    SEMI_STRUCTURED_EARNINGS_TASK, |
| 4 | +) |
| 5 | + |
| 6 | +# Please keep this sorted |
| 7 | +__all__ = ["LANGCHAIN_DOCS_TASK", "SEMI_STRUCTURED_EARNINGS_TASK"] |
6 changes: 5 additions & 1 deletion
...rag/environments/langchain_docs/README.md → ...hmarks/rag/tasks/langchain_docs/README.md
Original file line number | Diff line number | Diff line change |
---|---|---|
| | @@ -1,6 +1,10 @@ |
1 | | -# LangChain Docs Environment |
| 1 | +# LangChain Docs Task |
2 | 2 | |
3 | 3 | This code contains utilities to scrape the LangChain docs (already run) and index them |
4 | 4 | using common techniques. The docs were scraped using the code in `_ingest_docs.py` and |
5 | 5 | uploaded to gcs. To better compare retrieval techniques, we hold these constant and pull |
6 | 6 | from that cache whenever generating different indices. |
| 7 | + |
| 8 | + |
| 9 | +The content in `indexing` composes some common indexing strategies with default parameters for |
| 10 | +benchmarking on the langchain docs. |
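
The README text in this hunk describes holding the scraped docs constant and building different indices from the same cache. A minimal sketch of that idea follows, using only the standard library. The names `load_cached_docs`, `index_by_fixed_chunks`, and `index_by_paragraph`, the JSON cache layout, and the sample document are all hypothetical; this is not the repository's `indexing` module.

```python
import json
import tempfile
from pathlib import Path


def load_cached_docs(cache_path):
    """Read the frozen document snapshot; every index is built from it."""
    return json.loads(Path(cache_path).read_text(encoding="utf-8"))


def index_by_fixed_chunks(docs, chunk_size=40):
    """Split each document into fixed-size character chunks."""
    return [
        {"source": doc["source"], "text": doc["text"][i : i + chunk_size]}
        for doc in docs
        for i in range(0, len(doc["text"]), chunk_size)
    ]


def index_by_paragraph(docs):
    """Split each document on blank lines, dropping empty pieces."""
    return [
        {"source": doc["source"], "text": para.strip()}
        for doc in docs
        for para in doc["text"].split("\n\n")
        if para.strip()
    ]


# Stand-in for the scraped snapshot (hypothetical content).
snapshot = [
    {
        "source": "docs/intro",
        "text": "First paragraph.\n\nSecond paragraph, a bit longer.",
    }
]
cache_file = Path(tempfile.mkdtemp()) / "docs_cache.json"
cache_file.write_text(json.dumps(snapshot), encoding="utf-8")

# Both strategies read the same cached input, so any difference in
# retrieval quality comes from the indexing strategy, not the data.
docs = load_cached_docs(cache_file)
fixed_index = index_by_fixed_chunks(docs)
paragraph_index = index_by_paragraph(docs)
print(len(fixed_index), len(paragraph_index))
```

Keeping the snapshot fixed is what makes the comparison fair: only the chunking function changes between runs.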
Original file line number | Diff line number | Diff line change |
---|---|---|
| | @@ -0,0 +1,8 @@ |
| 1 | +from langchain_benchmarks.rag.tasks.langchain_docs import architectures, indexing |
| 2 | +from langchain_benchmarks.rag.tasks.langchain_docs.task import LANGCHAIN_DOCS_TASK |
| 3 | + |
| 4 | +DATASET_ID = ( |
| 5 | +    "452ccafc-18e1-4314-885b-edd735f17b9d"  # ID of public LangChain Docs dataset |
| 6 | +) |
| 7 | + |
| 8 | +__all__ = ["architectures", "indexing", "DATASET_ID", "LANGCHAIN_DOCS_TASK"] |