Entifier

CS6120

Extracting quality named entity relations in textual documents for visualisation and querying in a performant graph database

Output columns of ReVerb from OpenIE:

1. filename
2. sentence number
3. arg1
4. rel
5. arg2
6. arg1 start
7. arg1 end
8. rel start
9. rel end
10. arg2 start
11. arg2 end
12. conf
13. sentence words
14. sentence pos tags
15. sentence chunk tags
16. arg1 normalized
17. rel normalized
18. arg2 normalized
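As a rough illustration of how these columns can be consumed, the sketch below reads tab-separated ReVerb output with the Python standard library and keeps only extractions above a confidence threshold. The snake_case column names, the `read_reverb` helper, and the sample row are assumptions made for this example; they are not taken from the repository's scripts.

```python
import csv
import io

# Column names mirror the ReVerb output order listed above
# (snake_case spelling is an assumption for this sketch).
REVERB_COLUMNS = [
    "filename", "sentence_number", "arg1", "rel", "arg2",
    "arg1_start", "arg1_end", "rel_start", "rel_end",
    "arg2_start", "arg2_end", "conf",
    "sentence_words", "sentence_pos_tags", "sentence_chunk_tags",
    "arg1_normalized", "rel_normalized", "arg2_normalized",
]


def read_reverb(handle, min_conf=0.0):
    """Yield one dict per extraction whose confidence is at least min_conf."""
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) != len(REVERB_COLUMNS):
            continue  # skip lines that do not have all 18 columns
        record = dict(zip(REVERB_COLUMNS, row))
        record["conf"] = float(record["conf"])
        if record["conf"] >= min_conf:
            yield record


# Hypothetical row for illustration only, not real ReVerb output.
sample = "\t".join([
    "doc.txt", "0", "Paris", "is the capital of", "France",
    "0", "1", "1", "5", "5", "6", "0.93",
    "Paris is the capital of France",
    "NNP VBZ DT NN IN NNP",
    "B-NP B-VP B-NP I-NP B-PP B-NP",
    "paris", "be the capital of", "france",
])

triples = [
    (r["arg1_normalized"], r["rel_normalized"], r["arg2_normalized"])
    for r in read_reverb(io.StringIO(sample), min_conf=0.5)
]
```

Since the project already depends on pandas, the same column list could presumably be passed to `pandas.read_csv(path, sep="\t", names=REVERB_COLUMNS, quoting=csv.QUOTE_NONE)` to get a DataFrame instead of dicts.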

Installation Instructions

This requires pandas and neo4j. Install Anaconda if in doubt.

How to run ReVerb

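To print the extractions to the terminal:

java -Xmx512m -jar reverb.jar yourfile.txt

To save the extractions to a TSV file:

java -Xmx512m -jar reverb.jar yourfile.txt > outputfile.tsv

Note that redirecting the output this way can cause parsing errors when done on Windows.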
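One way to avoid relying on shell redirection is to launch ReVerb from Python and write its standard output straight to a file as raw bytes. This is only a sketch: the helper names `build_reverb_command` and `run_reverb` are hypothetical, and the idea that the Windows parsing errors come from the shell re-encoding redirected output is an assumption that the source does not confirm.

```python
import subprocess


def build_reverb_command(input_path, jar_path="reverb.jar", heap="512m"):
    """Return the same java command line as shown above, as an argument list."""
    return ["java", f"-Xmx{heap}", "-jar", jar_path, input_path]


def run_reverb(input_path, output_path, jar_path="reverb.jar", heap="512m"):
    """Run ReVerb and write its standard output to output_path unchanged."""
    # Opening the file in binary mode means the bytes produced by the JVM
    # are stored as-is, without any re-encoding by the shell.
    with open(output_path, "wb") as out:
        subprocess.run(
            build_reverb_command(input_path, jar_path, heap),
            stdout=out,
            check=True,  # raise CalledProcessError if java exits with an error
        )


# Example call (requires java and reverb.jar in the working directory):
# run_reverb("yourfile.txt", "outputfile.tsv")
```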

How to extract Wikipedia

Use WikiExtractor: https://github.com/attardi/wikiextractor

python3 WikiExtractor.py -b 2G -o new Wikipedia-20181026150828.xml
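WikiExtractor normally writes plain-text files in which each article is wrapped in a `<doc id="..." url="..." title="...">` ... `</doc>` block. The sketch below splits such text into articles so that they can be written out as input for ReVerb. The `iter_docs` helper, the exact attribute order in the pattern, and the sample text are assumptions for illustration; check them against the files actually produced in the `new` output directory.

```python
import re

# Assumed layout of a WikiExtractor article block; the attribute order
# (id, url, title) may differ between WikiExtractor versions.
DOC_PATTERN = re.compile(
    r'<doc id="(?P<id>[^"]*)" url="(?P<url>[^"]*)" title="(?P<title>[^"]*)">\n'
    r"(?P<text>.*?)</doc>",
    re.DOTALL,
)


def iter_docs(extracted_text):
    """Yield a dict with id, url, title and text for each article block."""
    for match in DOC_PATTERN.finditer(extracted_text):
        doc = match.groupdict()
        doc["text"] = doc["text"].strip()
        yield doc


# Hypothetical extract for illustration only.
sample_extract = (
    '<doc id="1" url="https://example.org/1" title="First">\n'
    "First\n\nFirst article body.\n"
    "</doc>\n"
    '<doc id="2" url="https://example.org/2" title="Second">\n'
    "Second\n\nSecond article body.\n"
    "</doc>\n"
)

titles = [doc["title"] for doc in iter_docs(sample_extract)]
```

Each yielded `text` value could then be saved to its own file and passed to ReVerb as `yourfile.txt`.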
