Evaluation tutorials

FlagOpen · Nov 21, 2024 · 7842388
1 parent b1849f8
commit 7842388
Showing 7 changed files with 1,431 additions and 1 deletion.
diff --git a/Tutorials/4_Evaluation/4.3.1_C-MTEB.ipynb b/Tutorials/4_Evaluation/4.2.3_C-MTEB.ipynb
diff --git a/...on/4.4.1_Sentence_Transformers_Eval.ipynb b/...on/4.3.1_Sentence_Transformers_Eval.ipynb
diff --git a/Tutorials/4_Evaluation/4.4.2_BEIR.ipynb b/Tutorials/4_Evaluation/4.4.1_BEIR.ipynb
@@ -49,7 +49,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "BEIR contains 18 datasets which can be downloaded from the [link](https://public.ukp.informatik.tu-darmstadt.de/thakur/BEIR/datasets/), while 4 of them are private datasets that need appropriate licences. If you want to access to those 4 datasets, take a look at their [wiki](https://github.com/beir-cellar/beir/wiki/Datasets-available) for more information. "
+    "BEIR contains 18 datasets which can be downloaded from the [link](https://public.ukp.informatik.tu-darmstadt.de/thakur/BEIR/datasets/), while 4 of them are private datasets that need appropriate licences. If you want to access to those 4 datasets, take a look at their [wiki](https://github.com/beir-cellar/beir/wiki/Datasets-available) for more information. Information collected and code adapted from the BEIR GitHub [repo](https://github.com/beir-cellar/beir)."
    ]
   },
   {

diff --git a/Tutorials/4_Evaluation/4.5.1_MIRACL.ipynb b/Tutorials/4_Evaluation/4.5.1_MIRACL.ipynb
diff --git a/Tutorials/4_Evaluation/4.5.2_MLDR.ipynb b/Tutorials/4_Evaluation/4.5.2_MLDR.ipynb
diff --git a/Tutorials/4_Evaluation/4.5.3_MKQA.ipynb b/Tutorials/4_Evaluation/4.5.3_MKQA.ipynb
@@ -0,0 +1,76 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Evaluate on MKQA"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "[MKQA](https://github.com/apple/ml-mkqa) is an open-domain question answering evaluation set comprising 10k question-answer pairs aligned across 26 typologically diverse languages."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## 0. Installation"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "First install the library we are using:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# % pip install FlagEmbedding"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## 1. Dataset"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "MKQA contains 10,000 queries sampled from the [Google Natural Questions dataset](https://github.com/google-research-datasets/natural-questions). We use the preprocessed NQ [corpus](https://huggingface.co/datasets/BeIR/nq) provided by BEIR."
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "dev",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.12.7"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 2
+}
diff --git a/docs/source/API/evaluation/mkqa.rst b/docs/source/API/evaluation/mkqa.rst
@@ -2,6 +2,7 @@ MKQA
 ====
 
 `MKQA <https://github.com/apple/ml-mkqa>`_ is an open-domain question answering evaluation set comprising 10k question-answer pairs aligned across 26 typologically diverse languages.
+The queries are sampled from the `Google Natural Questions dataset <https://github.com/google-research-datasets/natural-questions>`_.
 
 Each example in the dataset has the following structure: