---
title: "Evaluation ref extraction"
output: github_document
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE, message = FALSE, warning = FALSE)
```
```{r}
library(tidyverse)
library(googlesheets4)
```
### Sheet
Load sheet with evaluation results for a random sample of 100 matched references:
```{r}
bg_eval <- googlesheets4::read_sheet("https://docs.google.com/spreadsheets/d/1MF7MZBJcRe9ZEj6Id-ixYBK4t_FDXLXEa9ehVCtlrcM/edit#gid=1399287127")
```
### Indicators
The evaluation relies on three counts:

- the number of correct predictions (COR),
- the number of actual predictions (ACT),
- the number of possible predictions (POS).

**Precision** is defined as COR / ACT.

**Recall** is defined as COR / POS.

See David Nadeau, Satoshi Sekine. A survey of named entity recognition and classification. <https://nlp.cs.nyu.edu/sekine/papers/li07.pdf>
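As a quick illustration of the two definitions, the sketch below applies them to made-up counts. The data frame `toy_eval` and its values are hypothetical and are not taken from the evaluation sheet; only the column names COR, ACT and POS follow the indicators above. Counts are summed over all references before dividing.

```r
# Hypothetical toy counts for three references (not real evaluation data)
toy_eval <- data.frame(
  COR = c(5, 4, 6),  # correct predictions per reference
  ACT = c(6, 5, 6),  # predictions actually made per reference
  POS = c(6, 6, 6)   # predictions that were possible per reference
)

# Precision: share of the predictions made that are correct (15 / 17)
precision <- sum(toy_eval$COR) / sum(toy_eval$ACT)

# Recall: share of the possible predictions that were found (15 / 18)
recall <- sum(toy_eval$COR) / sum(toy_eval$POS)

round(c(precision = precision, recall = recall), 3)
```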
### Calculation
#### Precision
```{r}
# precision = COR / ACT
cor_sum <- bg_eval %>%
  pull(COR) %>%
  sum(na.rm = TRUE)
act_sum <- bg_eval %>%
  pull(ACT) %>%
  sum(na.rm = TRUE)
cor_sum / act_sum
```
#### Recall
```{r}
# recall = COR / POS
pos_sum <- bg_eval %>%
  pull(POS) %>%
  sum(na.rm = TRUE)
cor_sum / pos_sum
```
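Both values can also be computed in one step. The helper below is a minimal sketch, not part of the original analysis: `eval_metrics()` is a hypothetical function name, and dropping incomplete rows is an assumption. With `sum(na.rm = TRUE)` applied to each column separately, a row that had a COR value but a missing ACT or POS would count in the numerator but not in the denominator; restricting the sums to complete rows avoids that.

```r
# Hypothetical helper: precision and recall from a data frame with numeric
# columns COR, ACT and POS, summed over all references.
eval_metrics <- function(df) {
  # Assumption: keep only rows where all three counts are present, so that
  # numerator and denominators are summed over the same references.
  complete <- df[stats::complete.cases(df[, c("COR", "ACT", "POS")]), ]
  c(
    precision = sum(complete$COR) / sum(complete$ACT),
    recall    = sum(complete$COR) / sum(complete$POS)
  )
}

# Toy input with made-up numbers; the second row is dropped because ACT is NA
toy <- data.frame(COR = c(5, 4, 6), ACT = c(6, NA, 6), POS = c(6, 6, 6))
metrics <- eval_metrics(toy)
metrics
```

If the sheet has no partially filled rows, this should give the same values as the two chunks above when called as `eval_metrics(bg_eval)`.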
### Issues
Publication types other than journal articles:

Book chapters

- EP-2684744-B1
- EP-3205514-A1

Database entries

- EP-3162897-A1
- EP-3461498-A1
- EP-3392268-A1

Conference proceedings

- EP-3540662-A1

Leaflets

- EP-2959200-B1

Reports

- EP-3419205-A1
- EP-2151935-A4
The year and number are not found when a reference has the following structure:

```
GALLUZZO P; BOCCHETTA M: "Notch signaling in lung cancer", EXPERT REV ANTICANCER THER., vol. 11, 2011, pages 533 - 40
```

(EP-3095797-A1)
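One possible way to recover the missing fields from references of this shape is a regular expression on the trailing `vol. ..., <year>, pages ...` part. This is only a sketch using base R: the pattern is an assumption derived from the single example above, it is not the method used by the reference extraction evaluated here, and it will not cover other citation layouts.

```r
# Example reference string, copied from EP-3095797-A1 above
ref <- 'GALLUZZO P; BOCCHETTA M: "Notch signaling in lung cancer", EXPERT REV ANTICANCER THER., vol. 11, 2011, pages 533 - 40'

# Assumed layout: "vol. <volume>, <4-digit year>, pages <first> - <last>"
pattern <- "vol\\.\\s*(\\d+),\\s*(\\d{4}),\\s*pages\\s*(\\d+)\\s*-\\s*(\\d+)"

# regexec() returns the full match followed by the four capture groups
m <- regmatches(ref, regexec(pattern, ref, perl = TRUE))[[1]]

parsed <- c(
  volume     = m[2],
  year       = m[3],
  first_page = m[4],
  last_page  = m[5]
)
parsed
```

The last page is kept as written in the source ("40"), since expanding abbreviated page ranges would need an additional rule.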