Skip to content

NemaVatsala/Sarcasm-detection-over-Reddit-Corpus

Repository files navigation

Sarcasm-detection-over-Reddit-Corpus

Course Project for DSE 407 - Natural Language Processing

by Dr. Tanmay Basu and Dr. Jasabanta patro

The task was to develop an NLP method that could identify the sarcastic comments perfectly based on the learnings from the labelled dataset. We used bag of words model and TF-idf vectorizers and applied 3 classifiers namely, Multinomial Naive Bayes, Logistic regression and Support vector machine to train the machine to learn labelling. We applied different NLP techniques like stop-word removal, lemmatization and stemming on the dataset to test for the accuracy of prediction. Later, we used the best trained model to predict the class 'sarcastic' or 'non-sarcastic' on the given test dataset.

Khodak, Mikhail and Saunshi, Nikunj and Vodrahalli, Kiran Proceedings of the Linguistic Resource and Evaluation Conference (LREC) (2018) [A Large Self-Annotated Corpus for Sarcasm] (https://doi.org/10.48550/arXiv.1704.05579)

About

NLP Course project

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published