Merge pull request #219 from ckosten/v2.0

New KGQA Benchmark Dataset Spider4SPARQL V2.0
KGQA · Feb 12, 2024 · fff164a
2 parents bee17a0 + 344245e
commit fff164a

Showing 4 changed files with 25 additions and 0 deletions.
diff --git a/frontend/static/entries.txt b/frontend/static/entries.txt
@@ -63,4 +63,5 @@
 /datasets/other/WC2014QA - 1 Hop
 /datasets/other/WC2014QA - 2 Hop
 /datasets/other/WC2014QA - Total
+/datasets/other/Spider4SPARQL
 /systems/
diff --git a/other/$Spider4SPARQL.md b/other/$Spider4SPARQL.md
@@ -0,0 +1,2 @@
+## References 
+<a name="myfootnote1">[1]</a> Kosten et al. “Spider4SPARQL: A Complex Benchmark for Evaluating Knowledge Graph Question Answering Systems.” IEEE BigData (2023).
diff --git a/other/Spider4SPARQL.md b/other/Spider4SPARQL.md
@@ -0,0 +1,12 @@
+---
+    name: Spider4SPARQL
+    datasetUrl: https://github.com/ckosten/Spider4SPARQL
+---
+
+|   Model / System    | Year | Accuracy | Language |                                 Reported by                                  |
+|:-------------------:|:----:|:--------:|:--------:|:----------------------------------------------------------------------------:|
+|    GPT-3.5 (10-shot)    | 2023 |    45%     |    EN    | [Kosten, 2023](https://arxiv.org/pdf/2309.16248.pdf) |
+|    T5-Base     | 2023 |    42%     |    EN    | [Kosten, 2023](https://arxiv.org/pdf/2309.16248.pdf) |
+|     ValueNet4SPARQL     | 2023 |    41%     |    EN    | [Kosten, 2023](https://arxiv.org/pdf/2309.16248.pdf) |
+|       T5-Small       | 2023 |    27%     |    EN    | [Kosten, 2023](https://arxiv.org/pdf/2309.16248.pdf) |
+|    GPT-3.5 (zero-shot)    | 2023 |    8%     |    EN    | [Kosten, 2023](https://arxiv.org/pdf/2309.16248.pdf) |
diff --git a/other/^Spider4SPARQL.md b/other/^Spider4SPARQL.md
@@ -0,0 +1,10 @@
+# Spider4SPARQL
+
+**Spider4SPARQL**<sup>[[1]](#myfootnote1)</sup> is a new SPARQL benchmark dataset featuring 9,693 manually generated natural language (NL) questions and 4,721 unique, novel SPARQL queries of varying complexity (there can be multiple NL questions per SPARQL query). In addition to the NL/SPARQL pairs, we also provide the corresponding 166 knowledge graphs and ontologies, which cover 138 different domains. This benchmark enables novel ways of evaluating the strengths and weaknesses of modern KGQA systems. For more details on how the dataset was created and evaluated, please see our [paper](https://arxiv.org/pdf/2309.16248.pdf).
+
+The full dataset consists of 166 novel knowledge graphs and 9,693 natural language/SPARQL pairs, split into 1,034 dev and 8,695 train pairs. Model performance was evaluated using execution accuracy.
+
+The dataset can be downloaded from the [Spider4SPARQL repository](https://github.com/ckosten/Spider4SPARQL).
+
+
+
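
The description above says models were scored by execution accuracy. As a rough illustration only, the sketch below treats a prediction as correct when its query returns the same rows as the gold query, ignoring row order. This comparison rule, the `run_query` callable, and the toy `FAKE_RESULTS` table are all assumptions made for the example; they are not taken from the paper or the Spider4SPARQL repository, whose exact metric definition may differ.

```python
from collections import Counter
from typing import Callable, Iterable, List, Sequence, Tuple

Row = Tuple[str, ...]


def same_results(predicted: Iterable[Row], gold: Iterable[Row]) -> bool:
    """Compare two result sets as multisets of rows (row order is ignored)."""
    return Counter(predicted) == Counter(gold)


def execution_accuracy(
    pairs: Sequence[Tuple[str, str]],
    run_query: Callable[[str], List[Row]],
) -> float:
    """Fraction of (predicted_query, gold_query) pairs with matching results.

    `run_query` is a hypothetical callable that executes a SPARQL query
    against the relevant knowledge graph and returns its rows.
    """
    if not pairs:
        return 0.0
    correct = 0
    for predicted_query, gold_query in pairs:
        gold_rows = run_query(gold_query)
        try:
            predicted_rows = run_query(predicted_query)
        except Exception:
            # Assumption: a predicted query that fails to execute is wrong.
            continue
        if same_results(predicted_rows, gold_rows):
            correct += 1
    return correct / len(pairs)


# Toy stand-in for a SPARQL endpoint: query string -> result rows.
FAKE_RESULTS = {
    "gold_a": [("Alice",), ("Bob",)],
    "pred_a": [("Bob",), ("Alice",)],  # same rows, different order
    "gold_b": [("3",)],
    "pred_b": [("4",)],                # wrong answer
}


def fake_run_query(query: str) -> List[Row]:
    return FAKE_RESULTS[query]


score = execution_accuracy(
    [("pred_a", "gold_a"), ("pred_b", "gold_b")], fake_run_query
)
print(score)  # 0.5
```

In practice `run_query` would wrap a real SPARQL engine or endpoint. Comparing as multisets rather than ordered lists is a design choice here: it would need to change for queries whose correctness depends on `ORDER BY`.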