The goal of this project is to test a hypothesis using data. In our case, we will use Twitter to fetch real-time data.
We will choose two campaigns: one should show negative emotions in the language of its tweets, and the other should show positive ones. Here we will use Manchester United and FC Barcelona as our campaigns.
So why did I choose Manchester United and Barcelona?
Whether or not you follow the news, you have probably seen on social media that Cristiano Ronaldo left Juventus and rejoined his old team, Manchester United. Most people are very happy about this. On the other hand, Messi recently left Barcelona, the club he had been with for over 20 years. Most people are sad or negative about it, and the team has been performing badly without him.
Natural Language Processing, or NLP, is a branch of Artificial Intelligence that deals with enabling machines to understand humans in their natural language. Natural language can take the form of text or speech, which humans use to communicate with each other. NLP lets humans communicate with machines in a natural way.
Text classification is the process at the core of sentiment analysis: it assigns people's opinions or expressions to different sentiment categories. Depending on the task, these categories can be Positive, Neutral, and Negative, review ratings, or Happy and Sad. Sentiment analysis can be applied in many consumer-centered industries to analyse people's opinions on a particular product or subject.
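To make the idea of sorting text into sentiment categories concrete, here is a minimal sketch that labels a sentence by counting words from two small word lists. The word lists and the function name classify_sentiment are made up for illustration; this is not the model built in this project, only a toy showing what "Positive, Neutral, Negative" classification means.

```python
import re

# Hypothetical, hand-picked word lists for illustration only.
POSITIVE_WORDS = {"happy", "great", "love", "welcome", "amazing"}
NEGATIVE_WORDS = {"sad", "bad", "terrible", "miss", "awful"}


def classify_sentiment(text):
    """Label text as Positive, Negative, or Neutral by counting lexicon hits."""
    # Lowercase the text and split it into simple word tokens.
    tokens = re.findall(r"[a-z']+", text.lower())
    positive_hits = sum(token in POSITIVE_WORDS for token in tokens)
    negative_hits = sum(token in NEGATIVE_WORDS for token in tokens)
    score = positive_hits - negative_hits
    if score > 0:
        return "Positive"
    if score < 0:
        return "Negative"
    return "Neutral"


for sentence in ["So happy, welcome home!", "This is terrible and sad", "The match starts at nine"]:
    print(sentence, "->", classify_sentiment(sentence))
```

A fixed word list like this misses negation, sarcasm, and any word it has not seen, which is one reason to learn the classifier from labelled data instead.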
Sentiment analysis is a great problem for getting started in NLP. You can learn a lot of concepts and techniques by working through a project. Kaggle is a great place to learn and to contribute your own ideas and creations. I have learnt a lot from others, and now it's my turn to document my own project.
We will build an ML model that classifies tweets as positive or negative. We will then use the Twitter API and Tweepy to fetch tweets based on keywords that we will specify later, and apply the trained model to run real-time sentiment analysis on the tweets we get.
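The overall plan (train a classifier, then apply it to incoming tweets) can be sketched roughly as follows. This is a toy under stated assumptions: the classifier is a small hand-written Naive Bayes, which is a stand-in and not necessarily the model used in this project; the training sentences are invented; and incoming_tweets is a hard-coded list standing in for tweets that would really come from the Twitter API.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayesSentiment:
    """Tiny multinomial Naive Bayes with add-one smoothing (illustrative only)."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.doc_counts = Counter()              # label -> number of documents
        self.vocab = set()

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            self.doc_counts[label] += 1
            for token in tokenize(text):
                self.word_counts[label][token] += 1
                self.vocab.add(token)
        return self

    def predict(self, text):
        total_docs = sum(self.doc_counts.values())
        # Words never seen in training carry no information, so skip them.
        tokens = [t for t in tokenize(text) if t in self.vocab]
        best_label, best_score = None, -math.inf
        for label in self.doc_counts:
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(self.doc_counts[label] / total_docs)
            total_words = sum(self.word_counts[label].values())
            for token in tokens:
                count = self.word_counts[label][token] + 1
                score += math.log(count / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented training examples; a real project would use a labelled tweet dataset.
train_texts = [
    "so happy he is back", "what a great signing", "love this team",
    "sad to see him leave", "terrible performance again", "this is a bad season",
]
train_labels = ["positive"] * 3 + ["negative"] * 3
model = NaiveBayesSentiment().fit(train_texts, train_labels)

# Stand-in for tweets fetched by keyword from the Twitter API.
incoming_tweets = ["happy with this great signing", "terrible and sad season"]
for tweet in incoming_tweets:
    print(model.predict(tweet), "|", tweet)
```

Keeping the model behind a single predict call means the source of the text can change from a fixed list to a live stream without touching the classifier.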
You can get the data from here.