Perform sentiment analysis on Elon Musk's tweets (Exlon-musk.csv)
Text Preprocessing:
- Remove leading and trailing whitespace characters from each tweet
- Remove empty strings, which Python treats as False
Joining the list into one string/text
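
The stripping, filtering and joining steps above can be sketched in a few lines of plain Python. The `raw_tweets` list is a made-up stand-in for the tweet column of the CSV, so treat the names and sample data as assumptions rather than the notebook's actual code.

```python
# Hypothetical raw tweets; in practice these would come from the CSV's text column.
raw_tweets = ["  @kunalb11 I'm an alien  ", "", "   ", "Great interview!\n"]

# Strip leading and trailing whitespace from every tweet.
stripped = [tweet.strip() for tweet in raw_tweets]

# Drop empty strings; an empty string is falsy, so `if tweet` filters it out.
non_empty = [tweet for tweet in stripped if tweet]

# Join the remaining tweets into a single text, separated by spaces.
text = " ".join(non_empty)
print(text)
```

Stripping before filtering matters here: a tweet that contains only spaces becomes an empty string only after `strip()` has run.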
Remove Twitter username handles (@usernames) from the tweet text
Join the list into one string/text again
Remove punctuation
Remove URLs (http/https links) from the text
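
One possible way to do the handle, punctuation and URL removal is with regular expressions from the standard library. This is a sketch, not necessarily how the notebook does it: the function name `clean_tweet_text` and the patterns are illustrative assumptions. Note that it removes URLs before punctuation, since stripping punctuation first would break a link into leftover fragments.

```python
import re
import string

def clean_tweet_text(text: str) -> str:
    """Remove URLs, @handles and punctuation, then collapse whitespace."""
    # URLs first: stripping punctuation first would break "https://..." apart.
    text = re.sub(r"https?://\S+|www\.\S+", " ", text)
    # Twitter handles: "@" followed by letters, digits or underscores.
    text = re.sub(r"@\w+", " ", text)
    # ASCII punctuation becomes spaces so adjacent words stay separated.
    table = str.maketrans(string.punctuation, " " * len(string.punctuation))
    text = text.translate(table)
    # Re-join on single spaces to collapse the gaps left by the removals.
    return " ".join(text.split())

print(clean_tweet_text("@SpaceX Starship is ready! https://t.co/abc123 #Mars"))
```

Replacing punctuation with spaces also splits contractions ("I'm" becomes "I m"); whether that is acceptable depends on the later tokenization and stopword steps.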
Converting into Text Tokens
Tokenization
Remove Stopwords
Normalize the data
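
A minimal standard-library sketch of tokenization, normalization and stopword removal follows. It assumes that "normalize" means lowercasing, and it uses a tiny hand-written stopword set purely for illustration; a real pipeline would more likely rely on the tokenizer and stopword list of a library such as NLTK or spaCy. Lowercasing is done before the stopword check so that "The" and "the" are treated alike.

```python
import re

# Tiny illustrative stopword list (an assumption); a real pipeline would use
# a fuller list such as the one shipped with NLTK or spaCy.
STOPWORDS = {"a", "an", "and", "is", "in", "of", "the", "to", "will"}

def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def remove_stopwords(tokens):
    """Keep only the tokens that are not in the stopword set."""
    return [token for token in tokens if token not in STOPWORDS]

tokens = tokenize("Starship will fly to Mars in the future")
filtered = remove_stopwords(tokens)
print(filtered)
```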
Stemming (Optional)
Lemmatization
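
Stemming and lemmatization could be done with NLTK, for example. The sketch below assumes NLTK is installed; the helper names `stem_tokens` and `lemmatize_tokens` are made up for illustration. The lemmatizer additionally needs the WordNet corpus, so only the stemmer is exercised here.

```python
from nltk.stem import PorterStemmer, WordNetLemmatizer

def stem_tokens(tokens):
    """Reduce each token to its Porter stem (may not be a dictionary word)."""
    stemmer = PorterStemmer()
    return [stemmer.stem(token) for token in tokens]

def lemmatize_tokens(tokens):
    """Map each token to its WordNet lemma, treating every token as a noun.

    Needs the WordNet corpus, e.g. after a one-time nltk.download("wordnet").
    """
    lemmatizer = WordNetLemmatizer()
    return [lemmatizer.lemmatize(token) for token in tokens]

print(stem_tokens(["rockets", "launching", "launched"]))
```

Stemming chops suffixes by rule, while lemmatization looks words up in a vocabulary and returns dictionary forms, which is presumably why stemming is marked optional in the outline.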
Feature Extraction
- Using BoW CountVectorizer
- CountVectorizer with N-grams (Bigrams & Trigrams)
- TF-IDF Vectorizer
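
The three feature-extraction options above might look like this with scikit-learn. The three-sentence `corpus` is invented for illustration and stands in for the cleaned tweets; everything else uses the vectorizers' default settings apart from `ngram_range`.

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

# Hypothetical cleaned tweets, one string per document.
corpus = [
    "spacex launches rocket",
    "tesla builds electric cars",
    "spacex rocket lands",
]

# Bag of words: one column per distinct word, cell values are raw counts.
bow = CountVectorizer()
bow_matrix = bow.fit_transform(corpus)

# N-grams: ngram_range=(2, 3) keeps only bigrams and trigrams.
ngrams = CountVectorizer(ngram_range=(2, 3))
ngram_matrix = ngrams.fit_transform(corpus)

# TF-IDF: counts reweighted so words common to many documents score lower.
tfidf = TfidfVectorizer()
tfidf_matrix = tfidf.fit_transform(corpus)

print(sorted(bow.vocabulary_))
print(sorted(ngrams.vocabulary_))
print(tfidf_matrix.shape)
```

All three calls return sparse matrices with one row per document; `vocabulary_` maps each term to its column index.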
Generate Word Cloud
Named Entity Recognition (NER)
Emotion Mining - Sentiment Analysis
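
One common approach to this step is lexicon-based scoring, where each word carries a sentiment value and a tweet's score is the sum over its tokens. The sketch below uses a toy lexicon invented for illustration; it is not the notebook's actual method, and real lexicons such as AFINN or VADER cover thousands of words.

```python
# Toy word-to-score lexicon (made up for illustration); real lexicons such as
# AFINN or VADER assign scores to thousands of words.
LEXICON = {"great": 3, "love": 3, "good": 2, "bad": -2, "delay": -1, "terrible": -3}

def sentiment_score(tokens):
    """Sum the lexicon scores of the tokens; unknown words count as 0."""
    return sum(LEXICON.get(token, 0) for token in tokens)

def sentiment_label(score):
    """Turn a numeric score into a coarse positive/negative/neutral label."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

for tokens in (["great", "launch"], ["terrible", "delay"], ["starship", "mars"]):
    score = sentiment_score(tokens)
    print(tokens, score, sentiment_label(score))
```

A plain sum ignores negation ("not good") and intensity, so it is best read as a rough baseline.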