from dart.data import *
Datasets
Preset datasets so far:

Dataset | ID | Stored At |
---|---|---|
MATH | "math-train" / "math-test" | HuggingFace |
GSM8K | "gsm8k-train" / "gsm8k-test" | HuggingFace |
MWPBench/CollegeMath/Test | "mwpbench/college-math-test" | dart/data/eval-dsets |
DeepMind Mathematics | "deepmind-mathematics" | dart/data/eval-dsets |
OlympiadBench-Math | "olympiadbench/OE_TO_maths_en_COMP" | dart/data/eval-dsets |
TheoremQA | "theoremqa" | dart/data/eval-dsets |

To use other datasets, see load_query_dps for how to add them yourself.

load_query_dps
load_query_dps (dataset:str|list[str]='math-test', max_n_trials:int|list[int]=1, min_n_corrects:int|list[int]=0, prompt_template:str='alpaca')

Load dataset(s) as QueryDataPoints. If needed, add datasets here following the format of the existing datasets, or specify the path to a dataset .json file, whose stem name is used as the dataset ID.

Name | Type | Default | Details |
---|---|---|---|
dataset | str \| list[str] | math-test | (List of) dataset ID or path to a dataset of samples with “query” and “ref_ans” fields. A path does not use the other two arguments. |
max_n_trials | int \| list[int] | 1 | (List of) maximum number of raw responses to be generated for each dataset. A non-positive value or None means no limit. |
min_n_corrects | int \| list[int] | 0 | (List of) minimum number of correct responses to be generated for each dataset. A non-positive value or None means no limit. |
prompt_template | str | alpaca | ID / path of the prompt template. |
Returns | list | | QueryDataPoint objects to be passed to dart.gen.gen. |

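To make the custom-dataset option more concrete, the following is a minimal sketch of what such a .json file might look like and how its stem becomes the dataset ID. It assumes the file holds a JSON array of objects; `read_custom_dataset` is a hypothetical helper written for illustration and is not part of dart.

```python
import json
import tempfile
from pathlib import Path


def read_custom_dataset(path: str) -> tuple[str, list[dict]]:
    """Hypothetical helper: return (dataset ID, samples) for a .json dataset file."""
    p = Path(path)
    dataset_id = p.stem  # the stem name serves as the dataset ID
    samples = json.loads(p.read_text(encoding="utf-8"))
    for i, sample in enumerate(samples):
        # Every sample must carry the two required fields.
        missing = {"query", "ref_ans"} - set(sample)
        if missing:
            raise ValueError(f"sample {i} is missing fields: {sorted(missing)}")
    return dataset_id, samples


# Write a tiny example file (assumed layout: a JSON array of objects).
tmp_dir = Path(tempfile.mkdtemp())
dataset_path = tmp_dir / "my-dataset.json"
dataset_path.write_text(
    json.dumps(
        [
            {"query": "What is 1 + 1?", "ref_ans": "2"},
            {"query": "What is 3 * 4?", "ref_ans": "12"},
        ]
    ),
    encoding="utf-8",
)

dataset_id, samples = read_custom_dataset(str(dataset_path))
print(dataset_id, len(samples))
```

With the real library, the same path would instead be passed as the `dataset` argument of `load_query_dps`.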
Data Templates

We unified the data format across dart.

QueryDataPoint
QueryDataPoint (dataset:str, query:str, ref_ans:str, prompt_template:dart.utils.PromptTemplate='alpaca', n_shots:int=-1, n_trials:int=0, n_corrects:int=0, max_n_trials:int|None=None, min_n_corrects:int|None=None, **kwargs:dict[str,typing.Any])

The query-level data point on which responses are generated with vllm using sampling_params (and evaluated with evaluator).

Name | Type | Default | Details |
---|---|---|---|
dataset | str | | The name of the dataset the query belongs to, e.g. “math”. |
query | str | | Raw query, without any other prompt text. |
ref_ans | str | | The short reference answer to the query. |
prompt_template | PromptTemplate | alpaca | The prompt template object to use. |
n_shots | int | -1 | Number of examples in the few-shot prompt. A negative value means the number adapts to the dataset. |
n_trials | int | 0 | Number of raw responses already generated for the query. |
n_corrects | int | 0 | Number of correct responses already generated for the query. |
max_n_trials | int \| None | None | Maximum number of trials to generate a response, by default None. None or a negative value means no limit. |
min_n_corrects | int \| None | None | Minimum number of correct responses to generate, by default None. None or a negative value means no limit. |
kwargs | dict | | Other fields to store. |

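The interplay of n_trials, n_corrects and the two limits can be illustrated with a small stand-in class. This is a sketch, not dart's implementation: `QuerySketch`, `limit_is_active`, `is_done` and `record_response` are hypothetical names, and the stopping rule (a query is finished once either active limit is reached) is an assumption. None and non-positive values are treated as "no limit" here, which covers both the "None or negative" wording of this table and the "non-positive" wording of load_query_dps.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class QuerySketch:
    """Hypothetical stand-in holding only the counting fields of QueryDataPoint."""

    n_trials: int = 0
    n_corrects: int = 0
    max_n_trials: Optional[int] = None
    min_n_corrects: Optional[int] = None


def limit_is_active(limit: Optional[int]) -> bool:
    # Assumption: None or a non-positive value means "no limit".
    return limit is not None and limit > 0


def is_done(q: QuerySketch) -> bool:
    """Assumed stopping rule: finished once either active limit is reached."""
    hit_max_trials = limit_is_active(q.max_n_trials) and q.n_trials >= q.max_n_trials
    hit_min_corrects = (
        limit_is_active(q.min_n_corrects) and q.n_corrects >= q.min_n_corrects
    )
    return hit_max_trials or hit_min_corrects


def record_response(q: QuerySketch, correct: bool) -> None:
    """Update the counters after one sampled response has been graded."""
    q.n_trials += 1
    q.n_corrects += int(correct)


# Simulate grading outcomes for one query with both limits set.
q = QuerySketch(max_n_trials=4, min_n_corrects=2)
for correct in [False, True, True, False]:
    if is_done(q):
        break
    record_response(q, correct)
print(q.n_trials, q.n_corrects)
```

In this run the loop stops as soon as the second correct response has been recorded, before the trial budget is exhausted.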
RespSampleBase
RespSampleBase (dataset:str, query:str, ref_ans:str, resp:str, ans:str=None, correct:bool=None)
The response-level data point containing the query-level data point and other response-level information.

Name | Type | Default | Details |
---|---|---|---|
dataset | str | | The name of the dataset the query belongs to. |
query | str | | The input query to generate responses on. |
ref_ans | str | | The reference answer to the query. |
resp | str | | The generated response. |
ans | str | None | The answer in the generated response, by default None. |
correct | bool | None | Whether the generated response is correct, by default None. |

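To show how the response-level fields relate to one another, here is a minimal stand-in. `RespSketch` and `grade_exact` are hypothetical; in particular, the exact-string comparison is only a placeholder assumption for whatever evaluator dart actually uses, and the extracted answer is supplied by hand rather than parsed from resp.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class RespSketch:
    """Hypothetical stand-in with the same fields as RespSampleBase."""

    dataset: str
    query: str
    ref_ans: str
    resp: str
    ans: Optional[str] = None
    correct: Optional[bool] = None


def grade_exact(sample: RespSketch) -> RespSketch:
    """Placeholder grading: exact match after stripping whitespace."""
    if sample.ans is not None:
        sample.correct = sample.ans.strip() == sample.ref_ans.strip()
    return sample


sample = RespSketch(
    dataset="math",
    query="What is 1 + 1?",
    ref_ans="2",
    resp="Adding one and one gives 2. The answer is 2.",
)
ungraded = sample.correct  # None until an answer is extracted and graded

sample.ans = "2"  # answer extracted from resp (done by hand here)
grade_exact(sample)
print(asdict(sample))
```

Keeping ans and correct optional lets the same record type represent a response before and after evaluation.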
RespSampleVLLM
RespSampleVLLM (dataset:str, query:str, ref_ans:str, abs_tol:float=None, resp:str=None, finish_reason:str=None, stop_reason:str=None, cumulative_logprob:float=None, ans:str=None, correct:bool=None, **kwargs)

The response-level data point from a vllm model, containing extra fields such as finish_reason, stop_reason and cumulative_logprob.

Name | Type | Default | Details |
---|---|---|---|
dataset | str | | The name of the dataset the query belongs to. |
query | str | | The input query to generate responses on. |
ref_ans | str | | The reference answer to the query. |
abs_tol | float | None | The absolute tolerance of the answer. |
resp | str | None | The generated response. |
finish_reason | str | None | The reason for finishing the generation, from vllm. |
stop_reason | str | None | The reason for stopping the generation, from vllm, e.g. EoS token. |
cumulative_logprob | float | None | The cumulative log probability of the generated response. |
ans | str | None | The answer in the generated response. |
correct | bool | None | Whether the generated response is correct. |
kwargs | | | Other fields to store. |
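
The extra fields mirror attributes of vLLM's `CompletionOutput` (`text`, `finish_reason`, `stop_reason`, `cumulative_logprob`). The sketch below copies them into a plain dictionary and shows one plausible use of abs_tol for numeric answers. It is an illustration under assumptions: `to_record` and `numerically_close` are hypothetical helpers, a `SimpleNamespace` stands in for a real vLLM output so the snippet runs without vLLM installed, and how dart actually applies abs_tol is not stated on this page.

```python
import math
from types import SimpleNamespace
from typing import Any, Optional


def to_record(dataset: str, query: str, ref_ans: str, output: Any) -> dict:
    """Hypothetical helper: copy per-response fields from a vLLM-like output."""
    return {
        "dataset": dataset,
        "query": query,
        "ref_ans": ref_ans,
        "resp": output.text,
        "finish_reason": output.finish_reason,  # e.g. "stop" or "length"
        "stop_reason": output.stop_reason,  # stop string / token id, or None
        "cumulative_logprob": output.cumulative_logprob,
    }


def numerically_close(ans: str, ref_ans: str, abs_tol: Optional[float]) -> bool:
    """Assumed use of abs_tol: compare numeric answers within an absolute tolerance."""
    try:
        a, r = float(ans), float(ref_ans)
    except ValueError:
        # Non-numeric answers fall back to plain string equality.
        return ans.strip() == ref_ans.strip()
    return math.isclose(a, r, rel_tol=0.0, abs_tol=abs_tol or 0.0)


# Stand-in for one vLLM completion output (values are made up).
fake_output = SimpleNamespace(
    text="The answer is 3.14.",
    finish_reason="stop",
    stop_reason=None,
    cumulative_logprob=-12.5,
)
record = to_record("math", "What is pi to two decimals?", "3.14159", fake_output)
print(record["finish_reason"], numerically_close("3.14", "3.14159", abs_tol=0.01))
```

Setting `rel_tol=0.0` makes the comparison depend on the absolute tolerance alone, which matches the field's name; with abs_tol left as None, the sketch requires exact numeric equality.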