TextEE

Authors: Kuan-Hao Huang, I-Hung Hsu, Tanmay Parekh, Zhiyu Xie, Zixuan Zhang, Premkumar Natarajan, Kai-Wei Chang, Nanyun Peng, Heng Ji

Introduction

TextEE is a standardized, fair, and reproducible benchmark for evaluating event extraction approaches.

Standardized data preprocessing for 10+ datasets.
Standardized data splits for reducing performance variance.
10+ implemented event extraction approaches published in recent years.
Comprehensive reevaluation results for future reference.

For more details, please see our paper, TextEE: Benchmark, Reevaluation, Reflections, and Future Challenges in Event Extraction. We will keep adding new datasets and new models!

Updates

[04/21/2024] TextEE supports two more datasets: SPEED and MUC-4.
[02/23/2024] TextEE supports the CEDAR approach now.
[12/26/2023] TextEE supports three more datasets: MLEE, Genia2011, Genia2013.
[11/15/2023] We release TextEE, a framework for reevaluating and benchmarking event extraction approaches. Feel free to contact us ([email protected]) if you want to contribute your models or datasets!

Supported Datasets

Dataset Name	Task	Paper Title	Venue
`ACE05`	E2E, ED, EAE	The Automatic Content Extraction (ACE) Program - Tasks, Data, and Evaluation	LREC 2004
`ERE`	E2E, ED, EAE	From Light to Rich ERE: Annotation of Entities, Relations, and Events	EVENTS@NAACL 2015
`MLEE`	E2E, ED, EAE	Event extraction across multiple levels of biological organization	Bioinformatics 2012
`Genia2011`	E2E, ED, EAE	Overview of Genia Event Task in BioNLP Shared Task 2011	BioNLP Shared Task 2011 Workshop
`Genia2013`	E2E, ED, EAE	The Genia Event Extraction Shared Task, 2013 Edition - Overview	BioNLP Shared Task 2013 Workshop
`M2E2`	E2E, ED, EAE	Cross-media Structured Common Space for Multimedia Event Extraction	ACL 2020
`CASIE`	E2E, ED, EAE	CASIE: Extracting Cybersecurity Event Information from Text	AAAI 2020
`PHEE`	E2E, ED, EAE	PHEE: A Dataset for Pharmacovigilance Event Extraction from Text	EMNLP 2022
`MEE`	ED	MEE: A Novel Multilingual Event Extraction Dataset	EMNLP 2022
`FewEvent`	ED	Meta-Learning with Dynamic-Memory-Based Prototypical Network for Few-Shot Event Detection	WSDM 2020
`MAVEN`	ED	MAVEN: A Massive General Domain Event Detection Dataset	EMNLP 2020
`SPEED`	ED	Event Detection from Social Media for Epidemic Prediction	NAACL 2024
`MUC-4`	EAE	Fourth Message Understanding Conference	MUC-4 1992
`RAMS`	EAE	Multi-Sentence Argument Linking	ACL 2020
`WikiEvents`	EAE	Document-Level Event Argument Extraction by Conditional Generation	NAACL 2021
`GENEVA`	EAE	GENEVA: Benchmarking Generalizability for Event Argument Extraction with Hundreds of Event Types and Argument Roles	ACL 2023

Supported Models

Model Name	Task	Paper Title	Venue
`DyGIE++`	E2E	Entity, Relation, and Event Extraction with Contextualized Span Representations	EMNLP 2019
`OneIE`	E2E	A Joint Neural Model for Information Extraction with Global Features	ACL 2020
`AMR-IE`	E2E	Abstract Meaning Representation Guided Graph Encoding and Decoding for Joint Information Extraction	NAACL 2021
`DEGREE`	E2E, ED, EAE	DEGREE: A Data-Efficient Generation-Based Event Extraction Model	NAACL 2022
`EEQA`	ED, EAE	Event Extraction by Answering (Almost) Natural Questions	EMNLP 2020
`RCEE`	ED, EAE	Event Extraction as Machine Reading Comprehension	EMNLP 2020
`Query&Extract`	ED, EAE	Query and Extract: Refining Event Extraction as Type-oriented Binary Decoding	ACL-Findings 2022
`TagPrime`	ED, EAE	TAGPRIME: A Unified Framework for Relational Structure Extraction	ACL 2023
`UniST`	ED	Unified Semantic Typing with Meaningful Label Inference	NAACL 2022
`CEDAR`	ED	GLEN: General-Purpose Event Detection for Thousands of Types	EMNLP 2023
`BART-Gen`	EAE	Document-Level Event Argument Extraction by Conditional Generation	NAACL 2021
`PAIE`	EAE	Prompt for Extraction? PAIE: Prompting Argument Interaction for Event Argument Extraction	ACL 2022
`X-Gear`	EAE	Multilingual Generative Language Models for Zero-Shot Cross-Lingual Event Argument Extraction	ACL 2022
`AMPERE`	EAE	AMPERE: AMR-Aware Prefix for Generation-Based Event Argument Extraction Model	ACL 2023

Reevaluation Results

Please check here.

Environment

Please install the following packages from both conda and pip.

conda install
  - python 3.8
  - pytorch 2.0.1
  - numpy 1.24.3
  - ipdb 0.13.13
  - tqdm 4.65.0
  - beautifulsoup4 4.11.1
  - lxml 4.9.1
  - jsonlines 3.1.0
  - jsonnet 0.20.0
  - stanza 1.5.0

pip install
  - transformers 4.30.0
  - sentencepiece 0.1.96
  - scipy 1.5.4
  - spacy 3.1.4
  - nltk 3.8.1
  - tensorboardX 2.6
  - keras-preprocessing 1.1.2
  - keras 2.4.3
  - dgl-cu111 0.6.1
  - amrlib 0.7.1
  - cached_property 1.5.2
  - typing-extensions 4.4.0
  - penman 1.2.2

Alternatively, you can create the environment from the provided env.yml with the following command.

conda env create -f env.yml

After the environment is set up, download the spaCy English model en_core_web_lg with the following command.

python -m spacy download en_core_web_lg
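
Because the environment pins many exact versions, it can help to confirm that the active environment matches them before training. The following is a minimal sketch, not part of TextEE: the names `EXPECTED`, `installed_version`, and `find_mismatches` are hypothetical, only a handful of the pins listed above are copied in, and versions are compared by exact string equality, which is an assumption.

```python
from importlib import metadata
from typing import Callable, Dict, List, Optional

# A few pins copied from the lists above; extend as needed.
EXPECTED = {
    "transformers": "4.30.0",
    "spacy": "3.1.4",
    "nltk": "3.8.1",
    "stanza": "1.5.0",
    "penman": "1.2.2",
}


def installed_version(name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(
    expected: Dict[str, str],
    lookup: Callable[[str], Optional[str]] = installed_version,
) -> List[str]:
    """List human-readable problems: missing packages or version mismatches."""
    problems = []
    for name, wanted in expected.items():
        found = lookup(name)
        if found is None:
            problems.append(f"{name}: not installed (expected {wanted})")
        elif found != wanted:
            problems.append(f"{name}: found {found}, expected {wanted}")
    return problems


if __name__ == "__main__":
    # Print one line per problem, or a single confirmation line.
    for line in find_mismatches(EXPECTED) or ["All checked packages match."]:
        print(line)
```

The `lookup` parameter exists only so the comparison logic can be exercised without touching the real environment; by default it queries the installed distributions.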

Running

Training

./scripts/train.sh [config]

Evaluation for End-to-End Model

# Evaluating End-to-End
python TextEE/evaluate_end2end.py --task E2E --data [eval_data] --model [saved_model_folder]

# Evaluating EAE
python TextEE/evaluate_end2end.py --task EAE --data [eval_data] --model [saved_model_folder]

Evaluation for Pipeline Model

# Evaluating ED
python TextEE/evaluate_pipeline.py --task ED --data [eval_data] --ed_model [saved_model_folder]

# Evaluating EAE
python TextEE/evaluate_pipeline.py --task EAE --data [eval_data] --eae_model [saved_model_folder]

# Evaluating ED+EAE
python TextEE/evaluate_pipeline.py --task E2E --data [eval_data] --ed_model [saved_model_folder] --eae_model [saved_model_folder]
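
The three pipeline commands above differ only in the task name and in which model folders they pass. As a rough sketch of how they could be assembled from a script, the hypothetical helper `pipeline_eval_command` below builds the argument list. It uses only the flags shown above; the rule that ED needs only `--ed_model`, EAE only `--eae_model`, and E2E both is inferred from those commands and is an assumption about the script, not documented behavior. The paths in the example are placeholders.

```python
import shlex
from typing import List, Optional


def pipeline_eval_command(
    task: str,
    data: str,
    ed_model: Optional[str] = None,
    eae_model: Optional[str] = None,
) -> List[str]:
    """Build the argument list for TextEE/evaluate_pipeline.py."""
    if task not in ("ED", "EAE", "E2E"):
        raise ValueError(f"unknown task: {task}")
    # Which model folders each task needs, inferred from the commands above.
    needs_ed = task in ("ED", "E2E")
    needs_eae = task in ("EAE", "E2E")
    if needs_ed and ed_model is None:
        raise ValueError(f"task {task} requires ed_model")
    if needs_eae and eae_model is None:
        raise ValueError(f"task {task} requires eae_model")

    cmd = ["python", "TextEE/evaluate_pipeline.py", "--task", task, "--data", data]
    if needs_ed:
        cmd += ["--ed_model", ed_model]
    if needs_eae:
        cmd += ["--eae_model", eae_model]
    return cmd


if __name__ == "__main__":
    # Placeholder paths for illustration only.
    cmd = pipeline_eval_command(
        "E2E", "path/to/eval_data", "path/to/ed_model", "path/to/eae_model"
    )
    print(shlex.join(cmd))
```

The returned list can be passed directly to `subprocess.run(cmd, check=True)` from the repository root, which avoids shell quoting issues with paths that contain spaces.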

Making Predictions for New Texts with End-to-End Model

# Predicting End-to-End
python TextEE/predict_end2end.py --input_file demo_input.txt --model [saved_model_folder] --output_file demo_output.json

Making Predictions for New Texts with Pipeline Model

# Predicting ED+EAE
python TextEE/predict_pipeline.py --input_file demo_input.txt --ed_model [saved_model_folder] --eae_model [saved_model_folder] --output_file demo_output.json

Citation

@article{Huang23textee,
  author       = {Kuan{-}Hao Huang and
                  I{-}Hung Hsu and
                  Tanmay Parekh and 
                  Zhiyu Xie and
                  Zixuan Zhang and
                  Premkumar Natarajan and
                  Kai{-}Wei Chang and
                  Nanyun Peng and
                  Heng Ji},
  title        = {TextEE: Benchmark, Reevaluation, Reflections, and Future Challenges in Event Extraction},
  journal      = {arXiv preprint arXiv:2311.09562},
  year         = {2023},
}
