Reference-Scope-Identification-for-Citances-Using-CNN

This repository contains all the files needed to reproduce the results of our paper "Reference Scope Identification for Citances Using Convolutional Neural Networks"¹.

Reference-Citance Pair Extraction:

generate_test.py, together with the test*.py scripts, parses the data set.

Stopwords removal: remove_stopwords.py

csv_convert.py writes each <RP, CP> pair into tab-separated CSV cells and assigns the corresponding binary label (1 = true, 0 = false).
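
As an illustration of the output format described above, here is a minimal sketch that writes labelled pairs as tab-separated rows. The function name `write_pairs`, the column order and the example sentences are assumptions made for this sketch; they are not taken from csv_convert.py.

```python
import csv
import io


def write_pairs(rows, handle):
    """Write (rp_text, cp_text, is_match) rows as tab-separated cells.

    The column order is an assumption for this sketch.
    """
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    for rp_text, cp_text, is_match in rows:
        # Binary label: 1 for a true pair, 0 for a false pair.
        writer.writerow([rp_text, cp_text, 1 if is_match else 0])


# Hypothetical example pairs, not taken from the data set.
pairs = [
    ("We propose a parser.", "Smith et al. propose a parser.", True),
    ("We thank the reviewers.", "Smith et al. propose a parser.", False),
]
buffer = io.StringIO()
write_pairs(pairs, buffer)
print(buffer.getvalue())
```

Passing an open file handle instead of the `io.StringIO` buffer would write the same rows to disk.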

Feature Extraction Modules:

Lexical Features: Each similarity measure takes a pair of texts as input and works by averaging over all the words of the sentence.

Word Overlap Measures:
- dice_coeff.py measures the Dice Similarity.
- cosine.py measures the Cosine similarity.
- jaccard_sim.py measures the Jaccard similarity.
- fuzz_string_matching.py measures fuzzy string similarity based on Levenshtein distance.
- sequence_matcher.py computes a sequence-matcher score based on modified Gestalt pattern matching.
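
For readers unfamiliar with the word overlap measures listed above, the following sketch computes Dice, Jaccard, cosine and a sequence-matcher ratio for a pair of sentences. It assumes lowercased whitespace tokenization and uses the standard library's `difflib.SequenceMatcher` as a stand-in for the Gestalt-style matcher; the scripts in this repository may tokenize and aggregate differently.

```python
import math
from collections import Counter
from difflib import SequenceMatcher


def tokens(text):
    # Assumption: lowercase whitespace tokenization.
    return text.lower().split()


def dice(a, b):
    """Dice coefficient over token sets: 2|A & B| / (|A| + |B|)."""
    sa, sb = set(tokens(a)), set(tokens(b))
    total = len(sa) + len(sb)
    return 2 * len(sa & sb) / total if total else 0.0


def jaccard(a, b):
    """Jaccard similarity over token sets: |A & B| / |A | B|."""
    sa, sb = set(tokens(a)), set(tokens(b))
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def cosine(a, b):
    """Cosine similarity between raw term-count vectors."""
    ca, cb = Counter(tokens(a)), Counter(tokens(b))
    dot = sum(ca[w] * cb[w] for w in ca.keys() & cb.keys())
    norm = math.sqrt(sum(v * v for v in ca.values())) * math.sqrt(
        sum(v * v for v in cb.values())
    )
    return dot / norm if norm else 0.0


def sequence_ratio(a, b):
    """Ratio of matching token runs, as computed by difflib."""
    return SequenceMatcher(None, tokens(a), tokens(b)).ratio()


# Hypothetical citance / reference sentence pair.
citance = "the parser uses lexical features"
reference = "our parser uses syntactic features"
for measure in (dice, jaccard, cosine, sequence_ratio):
    print(measure.__name__, round(measure(citance, reference), 3))
```

All four scores lie in the range 0 to 1, with 1 meaning identical token content (or, for the sequence ratio, identical token order as well).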
TF-IDF similarity: tfidf_using_cosine.py measures the TF-IDF vector cosine similarity between the citance and the reference sentence.
ROUGE measure: rouge_score.py computes the ROUGE-1, ROUGE-2 and ROUGE-L metrics.
Named entity overlap: ner_overlap.py
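
The TF-IDF similarity listed above can be sketched with the standard library alone. This example assumes whitespace tokenization, term frequency normalized by sentence length, and a smoothed IDF of the form ln((1 + N) / (1 + df)) + 1; these choices are illustrative and tfidf_using_cosine.py may use a different weighting scheme.

```python
import math
from collections import Counter


def tfidf_vectors(documents):
    """Return one sparse {term: weight} TF-IDF vector per document."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    doc_freq = Counter(term for toks in tokenized for term in set(toks))
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        vectors.append({
            term: (count / len(toks))
            * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0)
            for term, count in counts.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v[term] for term, weight in u.items() if term in v)
    norm = math.sqrt(sum(w * w for w in u.values())) * math.sqrt(
        sum(w * w for w in v.values())
    )
    return dot / norm if norm else 0.0


# Hypothetical citance followed by two candidate reference sentences.
sentences = [
    "the parser uses lexical features",
    "our parser uses lexical and syntactic features",
    "we thank the anonymous reviewers",
]
vecs = tfidf_vectors(sentences)
print(cosine(vecs[0], vecs[1]), cosine(vecs[0], vecs[2]))
```

In this sketch the IDF statistics come from the sentences passed in; in practice they would more likely be estimated from the full reference paper or corpus.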

Knowledge-based Feature: wordnet_similarity.py measures the best semantic similarity score between words in the citance and the reference sentence, taken over all the sets of cognitive synonyms (synsets) present in WordNet.
Corpus-based Feature: word2vec_similarity.py measures word2vec-based similarity between the citance and the reference sentence.
Surface Features:

surface_features.py
sentiWordNet.py measures the overall positive and negative sentiment score of the reference sentence averaged over all the words, based on the SentiWordNet 3.0 lexical resource.
yuleK.py measures lexical richness of the reference sentence based on Yule’s K index.
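
As a worked illustration of Yule's K mentioned above, the sketch below uses the common formulation K = 10^4 · (Σ i² · V_i − N) / N², where N is the number of tokens and V_i is the number of distinct words that occur exactly i times. The function name and the whitespace tokenization are assumptions of this sketch; yuleK.py may preprocess the sentence differently.

```python
from collections import Counter


def yule_k(tokens):
    """Yule's K for a list of tokens; larger values mean more repetition."""
    n = len(tokens)
    if n == 0:
        return 0.0
    word_counts = Counter(tokens)             # word -> number of occurrences
    spectrum = Counter(word_counts.values())  # i -> number of words seen i times
    s2 = sum(i * i * v_i for i, v_i in spectrum.items())
    return 1e4 * (s2 - n) / (n * n)


# Hypothetical reference sentence.
sentence = "the model scores the sentence and the citance"
print(yule_k(sentence.lower().split()))
```

A sentence in which every word is distinct gives K = 0, since Σ i² · V_i then equals N.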

Classification Algorithm:

main.py contains the 1-D CNN implementation, along with the GBC and ABC classifiers, for training and testing on the feature vectors generated above.
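
To give a sense of what a 1-D convolution does with a feature vector, here is a minimal forward-pass sketch in plain Python: one filter slides over the vector, followed by ReLU and global max pooling. The filter weights and feature values are hypothetical, and this is not the architecture in main.py, which would normally be built and trained with a deep learning library.

```python
def conv1d(signal, kernel, bias=0.0):
    """Valid (no padding) 1-D convolution as used in CNNs (cross-correlation)."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k)) + bias
        for i in range(len(signal) - k + 1)
    ]


def relu(values):
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in values]


def global_max_pool(values):
    """Keep the single strongest activation of the filter."""
    return max(values)


# Hypothetical feature vector for one <RP, CP> pair and one filter.
features = [0.2, 0.8, 0.5, 0.1, 0.9]
kernel = [1.0, -1.0, 0.5]
activation = global_max_pool(relu(conv1d(features, kernel)))
print(activation)
```

A trained network would learn many such filters and feed their pooled activations into a final classification layer.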

The original presentation can be found here.

In case of queries, feel free to reach out at [email protected].

References

¹ S. Jha, A. Chaurasia, A. Sudhakar, and A. K. Singh, “Reference scope identification for citances using convolutional neural networks,” in Proceedings of the 14th International Conference on Natural Language Processing (ICON-2017). Kolkata, India: NLP Association of India, December 2017, pp. 23–32. [Online]. Available: http://www.aclweb.org/anthology/W/W17/W17-7504
