In this project we combine several datasets to gather as much data as we can, so that sentiment analysis with Logistic Regression can be done more effectively.
The datasets used in this project are from Kaggle:
->Yelp
->Amazon
->Twitter
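Combining the three sources requires bringing them to a common shape first. The following is a minimal sketch of that idea in plain Python, not the project's actual loading code: the column names (`review`, `stars`, `body`, `rating`, `tweet`, `sentiment`), the sample rows, and the rule that any value outside the positive set counts as negative are all assumptions made for illustration.

```python
import csv
import io


def normalize_rows(csv_text, text_col, label_col, positive_values):
    """Yield (text, label) pairs, with label 1 for positive and 0 otherwise."""
    reader = csv.DictReader(io.StringIO(csv_text))
    for row in reader:
        text = row[text_col].strip()
        if not text:
            continue  # skip rows with an empty review text
        label = 1 if row[label_col].strip() in positive_values else 0
        yield text, label


# Tiny in-memory stand-ins for the real files (hypothetical columns and rows).
yelp_csv = "review,stars\nGreat food,5\nTerrible service,1\n"
amazon_csv = "body,rating\nWorks well,4\nBroke after a day,2\n"
twitter_csv = "tweet,sentiment\nLove it,positive\nHate this,negative\n"

# One list of (text, label) pairs regardless of where a row came from.
combined = []
combined.extend(normalize_rows(yelp_csv, "review", "stars", {"4", "5"}))
combined.extend(normalize_rows(amazon_csv, "body", "rating", {"4", "5"}))
combined.extend(normalize_rows(twitter_csv, "tweet", "sentiment", {"positive"}))
```

Once every source is reduced to the same `(text, label)` pairs, the combined data can be fed to a single model.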
I also incorporated a Hadoop data cluster and implemented pipeline processing on it using PySpark, the Spark Python API.
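As a rough sketch of what such a pipeline can look like with the Spark ML API, the function below chains tokenization, term hashing, IDF weighting and a Logistic Regression classifier. The choice of stages, the column names `text` and `label`, and the parameter values are assumptions for illustration and are not taken from the project's code.

```python
def build_sentiment_pipeline(num_features=1 << 16, max_iter=20):
    """Return an unfitted Spark ML pipeline: tokenize, hash, weight, classify."""
    # Imported here so the file can be loaded on a machine without Spark;
    # an active SparkSession must exist before this function is called.
    from pyspark.ml import Pipeline
    from pyspark.ml.classification import LogisticRegression
    from pyspark.ml.feature import HashingTF, IDF, Tokenizer

    # Split each review into lowercase words.
    tokenizer = Tokenizer(inputCol="text", outputCol="words")
    # Map words to a fixed-size term-frequency vector.
    term_freq = HashingTF(inputCol="words", outputCol="tf",
                          numFeatures=num_features)
    # Down-weight words that appear in many reviews.
    idf = IDF(inputCol="tf", outputCol="features")
    # Binary classifier on the weighted features.
    classifier = LogisticRegression(featuresCol="features", labelCol="label",
                                    maxIter=max_iter)
    return Pipeline(stages=[tokenizer, term_freq, idf, classifier])
```

Given a DataFrame `train_df` with `text` and `label` columns, `build_sentiment_pipeline().fit(train_df)` returns a fitted model whose `transform` method adds a `prediction` column.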
We can use different approaches for sentiment analysis, such as:
->Reinforcement Learning
->Recurrent Neural Networks (Deep Learning approach)
and many more.
But in this project we will use Logistic Regression to perform sentiment analysis. As mentioned above, we use a Hadoop data cluster for this project and PySpark for processing. The cluster consists of 3 different data nodes and a single name node.
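To make the choice of model concrete, here is a small, self-contained illustration of Logistic Regression itself: a weighted sum of features is passed through a sigmoid to give the probability of a positive review, and the weights are fitted by gradient descent on the log loss. This is only a toy sketch in plain Python, not how Spark trains the model; the two features (counts of the words "good" and "bad") and the four samples are invented for the example.

```python
import math


def sigmoid(z):
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(x, weights, bias):
    """Probability that feature vector x belongs to the positive class."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def train(samples, labels, lr=0.5, epochs=200):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n = len(samples)
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(samples, labels):
            error = predict_proba(x, weights, bias) - y
            for j, xj in enumerate(x):
                grad_w[j] += error * xj
            grad_b += error
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


# Toy features: [count of "good", count of "bad"]; label 1 = positive review.
X = [[1, 0], [2, 0], [0, 1], [0, 2]]
y = [1, 1, 0, 0]
weights, bias = train(X, y)
```

After training, `predict_proba([1, 0], weights, bias)` is above 0.5 and `predict_proba([0, 1], weights, bias)` is below it, so a threshold of 0.5 separates positive from negative reviews in this toy data.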
Software versions:
Hadoop-3.1.2
Spark-2.4.3
Python-3.4
Link to the dataset: https://drive.google.com/drive/folders/1Hw52s40bgwFywk1wUafiKZKsbyCxeamA
nimish931999/Sentiment-Analysis
About
In this project we will be performing Sentiment Analysis and predicting whether the review is negative or positive using Logistic Regression.