The task is to predict the sentiment of a tweet: positive, negative, or neutral.
The data comes from the Twitter US Airline Sentiment dataset, which contains over 14,000 tweets.
- dataset.csv - Dataset used for training and testing the model.
- twitter_sentiments.ipynb - Jupyter Notebook.
- emogi_sentiment.csv - CSV file for looking up the sentiment of the emojis used in tweets.
numpy
pandas
sklearn
nltk
- Remove the airline name.
- Remove the emojis or replace them with their sentiment.
- Remove punctuation and stop words.
- Apply PCA.
- Build the model.
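
The text-cleaning steps above (airline name, emojis, punctuation, stop words) can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the notebook's actual code. The names `AIRLINE_HANDLES`, `EMOJI_SENTIMENT`, `STOP_WORDS`, and `preprocess` are hypothetical: the handle list assumes airlines appear as `@mentions`, the emoji dictionary stands in for whatever `emogi_sentiment.csv` contains, and the small stop-word set is a placeholder for a fuller list such as the one shipped with nltk.

```python
import string

# Hypothetical airline handles; the real list depends on the dataset.
AIRLINE_HANDLES = ["@united", "@usairways", "@americanair",
                   "@southwestair", "@jetblue", "@virginamerica"]

# Hypothetical emoji-to-sentiment lookup, standing in for emogi_sentiment.csv.
EMOJI_SENTIMENT = {"😊": "positive", "😡": "negative"}

# Tiny illustrative stop-word set; a fuller list (e.g. from nltk) could replace it.
STOP_WORDS = {"a", "an", "the", "is", "was", "my", "to", "and", "of", "for", "i"}


def preprocess(tweet):
    """Clean one tweet: drop airline handles, map emojis to sentiment words,
    strip punctuation, and remove stop words."""
    text = tweet.lower()

    # Remove the airline name (assumed to appear as an @handle).
    for handle in AIRLINE_HANDLES:
        text = text.replace(handle, " ")

    # Replace each known emoji with its sentiment word.
    for emoji, sentiment in EMOJI_SENTIMENT.items():
        text = text.replace(emoji, f" {sentiment} ")

    # Remove ASCII punctuation.
    text = text.translate(str.maketrans("", "", string.punctuation))

    # Remove stop words and collapse whitespace.
    tokens = [tok for tok in text.split() if tok not in STOP_WORDS]
    return " ".join(tokens)


print(preprocess("@united my flight was delayed again 😡!"))
```

The cleaned strings would then be vectorized before PCA and model training; those steps are left out here because the vectorizer and classifier used in the notebook are not stated above.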
I hope to further improve the model's performance by training it on a larger dataset. I also plan to try different classifiers and observe the effect of emojis.