Sentimix2020

Abstract

On social media platforms such as Twitter, Facebook, and Reddit, people often express their opinions in code-mixed language such as Spanish-English or Hindi-English. In this paper, we describe the models we used, the external datasets used to train embeddings, and the ensembling methods for the Sentimix and OffensEval tasks. Pre-trained embeddings usually help in tasks such as sentence classification and machine translation. In these experiments, we apply our own trained code-mixed embeddings and Twitter pre-trained embeddings to the SemEval tasks. We evaluate our models on macro F1-score, precision, accuracy, and recall. We aim to show that hyper-parameter tuning and data pre-processing substantially improve the scores. In post-evaluation, we achieve a macro F1 of 0.886 on the OffensEval Greek subtask, whereas the highest score during the evaluation period was 0.852. We placed third in the Spanglish competition with our best F1-score of 0.756. Our CodaLab username is asking28.
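For readers unfamiliar with the headline metric, the following is a minimal, dependency-free sketch of macro F1: the unweighted mean of the per-class F1 scores. The function name `macro_f1` and the toy labels are illustrative assumptions and are not taken from the repository code. With scikit-learn installed, `sklearn.metrics.f1_score(y_true, y_pred, average="macro")` should give the same value.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Return the unweighted mean of per-class F1 scores.

    Illustrative helper (hypothetical name), not the repository's code.
    """
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    scores = []
    for label in labels:
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        total = precision + recall
        scores.append(2 * precision * recall / total if total else 0.0)
    return sum(scores) / len(scores)


# Toy example with made-up labels (assumption: three sentiment classes).
y_true = ["positive", "negative", "neutral", "positive", "negative", "neutral"]
y_pred = ["positive", "negative", "positive", "positive", "neutral", "neutral"]
score = macro_f1(y_true, y_pred)
print(f"macro F1 = {score:.3f}")
```

Because every class contributes equally regardless of how many examples it has, macro F1 penalizes models that neglect minority classes, which matters for imbalanced data.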

File Structure

This repository contains two folders, Sentimix and OffensEval. The Sentimix folder contains the Jupyter notebooks and corresponding Python files for Spanglish and Hinglish. The OffensEval folder contains the code for OffensEval English Tasks 1, 2, and 3 and for the Turkish, Arabic, Danish, and Greek languages.

BibTeX

@article{singh2020voice,
  title={Voice@ SRIB at SemEval-2020 Task [9, 12]: Sentiment and Offensiveness detection in Social Media},
  author={Singh, Abhishek and Parmar, Surya Pratap Singh},
  journal={arXiv preprint arXiv:2007.10021},
  year={2020}
}

External libraries

scikit-learn, NumPy, pandas, TensorFlow, Keras, Beautiful Soup
Run the pip commands below before running the files (the leading ! runs the command from a Jupyter notebook cell):
!pip install focal-loss
!pip install keras-tcn==2.8.3
!pip install keras-multi-head
!pip install keras_metrics
!pip install tqdm
!pip install keras-self-attention

References

https://github.com/SilentFlame/Named-Entity-Recognition/blob/master/Twitterdata/processedTweets.csv
https://arxiv.org/pdf/1805.11869.pdf
http://ceur-ws.org/Vol-2111/paper5.pdf
https://github.com/sahilswami96/SarcasmDetection_CodeMixed/blob/master/Dataset/Sarcasm_tweets.txt
https://github.com/sahilswami96/SarcasmDetection_CodeMixed/blob/master/Classification_system/build_feature_vector.py
https://www.aclweb.org/anthology/C18-1247.pdf (how emotional are you)
https://arxiv.org/pdf/1905.12516.pdf (multiple datasets)
