GitHub - athrvkk/UCI-Sentiment-Analysis: Implementation of various Machine Learning and Deep Learning models for Sentiment Analysis on the 'Sentiment Labelled Sentences Data Set' by University of California, Irvine.

Implementation of Machine Learning and Deep Learning for Sentiment Analysis

Sentiment analysis is the process of determining whether a piece of writing is positive, negative or neutral.
In this project I have demonstrated how various Machine Learning and Deep Learning models can be used for sentiment analysis.

The Dataset:

The dataset used is "Sentiment Labelled Sentences Dataset", from the UC Irvine Machine Learning Repository.
The sentences come from three different websites/fields:
- amazon.com
- imdb.com
- yelp.com
Each sentence is labelled as either 1 (for positive) or 0 (for negative).
For each website, there are 500 positive and 500 negative sentences.
This dataset was created for the paper 'From Group to Individual Labels using Deep Features', Kotzias et al., KDD 2015. (Please cite the paper if you want to use it :))
Link to the dataset is: Sentiment Labelled Sentences Data Set
The dataset is present in the Dataset folder.
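
As a minimal sketch of how the labelled sentences could be read: the UCI distribution ships each source as a tab-separated text file with one sentence and its 0/1 label per line. The helper names below are hypothetical, and the example path in the comment is an assumption about this repository's layout, not code taken from the notebooks.

```python
from pathlib import Path


def parse_labelled_sentences(text):
    """Parse 'sentence<TAB>label' lines into (sentences, labels)."""
    sentences, labels = [], []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the last tab so tabs inside a sentence are preserved.
        sentence, _, label = line.rpartition("\t")
        if label not in ("0", "1") or not sentence:
            continue  # skip malformed lines
        sentences.append(sentence.strip())
        labels.append(int(label))
    return sentences, labels


def load_labelled_file(path):
    """Read one dataset file from disk.

    Example (assumed path): load_labelled_file("Dataset/amazon_cells_labelled.txt")
    """
    return parse_labelled_sentences(Path(path).read_text(encoding="utf-8"))


# In-memory sample so the sketch runs without the dataset on disk.
sample = "Great phone, works well.\t1\nThe battery died in a week.\t0\n"
sentences, labels = parse_labelled_sentences(sample)
print(labels)
```

Splitting on the last tab rather than the first keeps the parser tolerant of stray tabs inside a review sentence.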

Machine Learning models:

I have used the following Machine Learning models:

Multinomial Naive Bayes
Random Forest
LinearSVC

The code implementing these models is in 'modules/Sentiment_Analysis_ML.ipynb'.
All the trained models are stored in 'models/ML', segregated by dataset (Amazon, IMDB, Yelp).
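
The three classifiers listed in this section are available in scikit-learn as `MultinomialNB`, `RandomForestClassifier`, and `LinearSVC`. The sketch below shows one way they could be trained on bag-of-words features. The TF-IDF vectorizer, the hyperparameters, and the toy sentences are assumptions for illustration and may differ from what the notebook actually does.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Toy data standing in for one of the three datasets (1 = positive, 0 = negative).
train_sentences = [
    "great product, works perfectly",
    "excellent quality and fast shipping",
    "loved it, highly recommend",
    "terrible, broke after one day",
    "waste of money, very disappointed",
    "awful quality, do not buy",
]
train_labels = [1, 1, 1, 0, 0, 0]

# Assumed default-ish settings; the notebook may use different ones.
classifiers = {
    "Multinomial Naive Bayes": MultinomialNB(),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "LinearSVC": LinearSVC(),
}

predictions = {}
for name, clf in classifiers.items():
    # Each pipeline vectorizes raw text and then fits the classifier.
    model = make_pipeline(TfidfVectorizer(), clf)
    model.fit(train_sentences, train_labels)
    predictions[name] = model.predict(["great quality", "very disappointed"]).tolist()
    print(name, predictions[name])
```

On the real data, a held-out test split would be needed to obtain scores comparable to those reported in the Results section.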

Deep Learning models:

I have used the following Deep Learning models:

Feed Forward Neural Network (FFNN)
Convolutional Neural Network (CNN)
Recurrent Neural Network (LSTM)

As the dataset consists of three different sets of data, I have created a separate implementation for each of them:

Amazon Product Review Dataset ('modules/Amazon_Sentiment_Analysis_DL.ipynb')
IMDB Movie Review Dataset ('modules/IMDB_Sentiment_Analysis_DL.ipynb')
Yelp Restaurant Review Dataset ('modules/Yelp_Sentiment_Analysis_DL.ipynb')

All the trained models are stored in 'models/DL', segregated by dataset (Amazon, IMDB, Yelp).

Word Embeddings:

All the Deep Learning architectures use the GloVe Word Embeddings.
To download click here (please download them before running the code.)
The 6-billion-token, 100-dimensional variant is used.
The embeddings are stored at 'Dataset/GloVe_Word_Embeddings'.
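
GloVe text files store one token per line followed by its vector components, separated by spaces. A minimal sketch of reading such a file and building an embedding matrix for a vocabulary is shown below. The file name `glove.6B.100d.txt`, the helper names, and the zero-vector fallback for unknown words are assumptions, not taken from the notebooks.

```python
import io

import numpy as np


def read_glove(handle, dim):
    """Map each token to its vector; lines with the wrong width are skipped."""
    vectors = {}
    for line in handle:
        parts = line.rstrip().split(" ")
        if len(parts) != dim + 1:
            continue
        vectors[parts[0]] = np.asarray(parts[1:], dtype=np.float32)
    return vectors


def build_embedding_matrix(word_index, vectors, dim):
    """Row i holds the vector for the word with index i; row 0 is left for padding."""
    matrix = np.zeros((len(word_index) + 1, dim), dtype=np.float32)
    for word, idx in word_index.items():
        vec = vectors.get(word)
        if vec is not None:  # unknown words keep the all-zero row (assumed fallback)
            matrix[idx] = vec
    return matrix


# Tiny in-memory stand-in with 3 dimensions. The real file would be opened with
# open("Dataset/GloVe_Word_Embeddings/glove.6B.100d.txt", encoding="utf-8")
# and dim=100 (file name assumed).
fake_file = io.StringIO("good 0.1 0.2 0.3\nbad -0.1 -0.2 -0.3\n")
vectors = read_glove(fake_file, dim=3)
matrix = build_embedding_matrix({"good": 1, "bad": 2, "unseen": 3}, vectors, dim=3)
print(matrix.shape)
```

A matrix built this way can typically be passed as the initial weights of an embedding layer, with the word indices coming from whatever tokenizer the model uses.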

Results:

After trying various machine learning and deep learning models, I got the following results.

Model	Amazon Reviews	IMDB Reviews	Yelp Reviews
Multinomial Naive Bayes	85%	85%	78%
Random Forest	80%	79%	79%
Linear SVC	84%	81.50%	80%
FFNN	81.50%	84%	82%
CNN	87%	85.50%	82.50%
LSTM	87%	85%	83%
