TrQuAD is Turkish SQuAD - Turkish translation of the SQuAD. The dataset can be downloaded from its devoted HuggingFace repo. The dataset is of version 1.1. - there are no empty answer strings.
More about the dataset preparation can be found in our research paper. For the dataset format and more information please visit the HF repo.
We adopted the original SQuAD benchmarking scripts. The benchmarking process can be fired up via run.sh
. Here's the set of hyperparameters we used for benchmarking BERTurk:
python3 run_qa.py \
--model_name_or_path dbmdz/bert-base-turkish-cased \
--dataset_name turkish-nlp-suite/TrQuAD \
--do_train \
--do_eval \
--per_device_train_batch_size 256 \
--learning_rate 5e-5 \
--num_train_epochs 3 \
--max_seq_length 128 \
--doc_stride 32 \
--output_dir berturk \
--overwrite_output_dir True
The success metrics are same with original SQuAD, exact match and F1 scores. BERTurk achieved scores of 56.09/71.56
.
Coming soon!