Annotator for Chinese Text Corpus (under development; ideas and code contributions are welcome)
Many NLP tasks require large amounts of labelled data. Most existing annotation tools are built for English. We want to develop a Chinese annotator based on existing open-source annotators.
References:
**** IEPY
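The project is fairly complete and includes a user management system. The front end is somewhat heavy and not very user-friendly.
Code: https://github.com/machinalis/iepy
Documentation: http://iepy.readthedocs.io/en/latest/index.html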
**** DeepDive (Mindtagger)
Screenshot of Mindtagger precision task in progress
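Introduction: http://deepdive.stanford.edu/labeling
The front end is fairly simple and the user interface is friendly.
Front-end code: https://github.com/HazyResearch/mindbender
Attempts to adapt DeepDive's CoreNLP component to support Chinese: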
https://github.com/SongRb/DeepDiveChineseApps
https://github.com/qiangsiwei/DeepDive_Chinese
https://github.com/mcavdar/deepdive/commit/6882178cbd38a5bbbf4eee8b76b1e215537425b2
**** BRAT
Introduction: http://brat.nlplab.org/index.html
Online demo: http://weaver.nlplab.org/~brat/demo/latest/#/
Code: https://github.com/nlplab/brat
**** SUTDAnnotator
Uses a Python GUI rather than a web front end, but is relatively lightweight.
Code: https://github.com/jiesutd/SUTDAnnotator
Paper: https://github.com/jiesutd/SUTDAnnotator/blob/master/lrec2018.pdf
**** Snorkel
Page: https://hazyresearch.github.io/snorkel/
GitHub: https://github.com/HazyResearch/snorkel
Demo Paper: https://hazyresearch.github.io/snorkel/pdfs/snorkel_demo.pdf
**** Slate
Code: https://bitbucket.org/dainkaplan/slate/
Paper: http://www.jlcl.org/2011_Heft2/11.pdf
**** Prodigy
Made by the same company as the well-known spaCy.
Website: https://prodi.gy/docs/