Sentiment-Analysis and EDA on the IMDB Movie Review Dataset

This work explores and studies different approaches used for Sentiment Analysis (e.g. Bag of Words, TF-IDF, Word Embeddings). It also applies and compares different classification algorithms for Sentiment Analysis tasks in Natural Language Processing (e.g. tree-based algorithms, linear models and Support Vector Machines).

Author: Nikolas Petrou, MSc in Data Science

Technical-Report and Code Availability

The technical report and analysis of the work are available in the EDA-and-Sentiment-Analysis-on IMDB-Dataset.pdf file.
The implementation and code of the project are located in the Implementation-Python Files folder.

Overview

The goal of this work is to explore and study different approaches used for Sentiment Analysis (e.g. Bag of Words, TF-IDF, Word Embeddings). In addition, the work applies and compares different classification algorithms for Sentiment Analysis tasks in Natural Language Processing (e.g. tree-based algorithms, linear models and Support Vector Machines).
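To make the first two representations concrete, the following is a minimal pure-Python sketch of Bag of Words counts and an unsmoothed TF-IDF weighting. It is an illustration only, not the code from the Implementation-Python Files folder: the function names `bag_of_words` and `tf_idf` and the toy documents are assumptions, and library implementations typically use smoothed or normalized variants of the formula.

```python
import math
from collections import Counter


def bag_of_words(docs):
    """Return (vocabulary, count vectors) for a list of tokenized documents."""
    vocab = sorted({tok for doc in docs for tok in doc})
    index = {tok: i for i, tok in enumerate(vocab)}
    vectors = []
    for doc in docs:
        vec = [0] * len(vocab)
        for tok, n in Counter(doc).items():
            vec[index[tok]] = n
        vectors.append(vec)
    return vocab, vectors


def tf_idf(docs):
    """Return (vocabulary, TF-IDF vectors) using tf = count / doc length
    and idf = log(N / document frequency), without smoothing."""
    vocab, counts = bag_of_words(docs)
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    df = [sum(1 for vec in counts if vec[j] > 0) for j in range(len(vocab))]
    idf = [math.log(n_docs / d) for d in df]
    weights = []
    for doc, vec in zip(docs, counts):
        length = len(doc) or 1  # avoid division by zero for empty documents
        weights.append([(c / length) * w for c, w in zip(vec, idf)])
    return vocab, weights


# Hypothetical toy reviews, already tokenized.
docs = [["great", "movie"], ["terrible", "movie"]]
vocab, counts = bag_of_words(docs)
_, weights = tf_idf(docs)
```

In this toy example the word "movie" appears in every document, so its TF-IDF weight is zero, while "great" and "terrible" keep a positive weight. This is the property that makes TF-IDF emphasize terms that discriminate between reviews.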

Dataset

For this work, a large dataset of movie reviews was used: the publicly available Internet Movie Database (IMDB) review dataset.

The data can be obtained from Kaggle or directly from Stanford.
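As a rough sketch of how the reviews could be read once downloaded, the snippet below assumes the Stanford archive has been unpacked so that each review is a text file under `<root>/<split>/pos` or `<root>/<split>/neg`. The function name `load_reviews` and the label encoding (1 for positive, 0 for negative) are assumptions made for illustration; the Kaggle copy is distributed in a different format and would need a different loader.

```python
from pathlib import Path


def load_reviews(root, split="train"):
    """Return (texts, labels) read from <root>/<split>/{pos,neg}/*.txt.

    Labels are 1 for positive and 0 for negative reviews (assumed encoding).
    """
    texts, labels = [], []
    for label_name, label in (("pos", 1), ("neg", 0)):
        folder = Path(root) / split / label_name
        # Sort the files so the order is reproducible across runs.
        for path in sorted(folder.glob("*.txt")):
            texts.append(path.read_text(encoding="utf-8"))
            labels.append(label)
    return texts, labels
```

A call such as `load_reviews("aclImdb", "train")` would then return two parallel lists, where `"aclImdb"` stands for whatever directory the archive was extracted to.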

Methodology

An abstract methodology scheme of the work is illustrated in the methodology scheme.png figure.

To summarize, the initial questions were first set with respect to the dataset used. Subsequently, data scraping and data collection were performed. After the data preprocessing steps, different data analytics and analyses were employed in order to better understand the data. Finally, during the final analysis, different methodologies and models were used to classify the textual data based on sentiment. It is important to note that the whole process followed a cyclical scheme.
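The preprocessing steps are not listed here, so the following is only a minimal sketch of what such a step commonly looks like for this dataset: removing HTML line-break tags that appear in the raw reviews, lowercasing, and splitting into word tokens. The function name `preprocess` and the exact tokenization rule are assumptions, not the procedure used in the technical report.

```python
import re

# Matches HTML tags such as "<br />", which occur in the raw review text.
TAG_RE = re.compile(r"<[^>]+>")
# Matches lowercase words, optionally with one internal apostrophe (e.g. "don't").
TOKEN_RE = re.compile(r"[a-z]+(?:'[a-z]+)?")


def preprocess(review):
    """Strip HTML tags, lowercase the text and return a list of word tokens."""
    text = TAG_RE.sub(" ", review)
    return TOKEN_RE.findall(text.lower())


tokens = preprocess("Great movie!<br /><br />Don't miss it.")
```

The resulting token lists can be fed directly to a Bag of Words or TF-IDF representation before training a classifier.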

