-
Notifications
You must be signed in to change notification settings - Fork 2
/
Copy pathparams.yaml
39 lines (39 loc) · 1.5 KB
/
params.yaml
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
hp:
chunk-size: 250
overlap: 75
embeddings-model: all-MiniLM-L6-v2
doc-store:
collection: eidc-data
files: data/chroma-data
files:
metadata: data/eidc_metadata.json
extracted: data/extracted_metadata.json
supporting-docs: data/supporting-docs.json
chunked: data/chunked_data.json
embeddings: data/embeddings.json
doc-store: data/chroma-data
test-set: data/eidc_rag_testset.csv
cleaned-test-set: data/cleaned_testset.csv
eval-set: data/evaluation_data.csv
metrics: data/metrics.json
results: data/results.csv
eval-plot: data/eval.png
pipeline: data/pipeline.yml
sub-sample: 0 # sample n datasets for testing (0 will use all datasets)
max-length: 0 # truncate longer texts for testing (0 will use all data)
test-set-size: 200 # reduce the size of the test set for faster testing
rag:
model: llama3.1
prompt: >-
You are part of a retrieval augmented pipeline. You will be given a question and
a context on which to base your answer.\n
Do not use your own knowledge to answer the question.\n
The context provided will be metadata from datasets contained in the Environmental
Information Data Centre (EIDC).\n
Do not refer to "context" in your answer, instead refer to the context as available
information.
If the answer to the question is not clear from the context, suggest which dataset
or datasets might be helpful in answering the question.\n
Question: {{query}}\n
Context: {% for document in documents%}\n{{ document.content }}\n{% endfor %}
Answer: