This project performs sentiment analysis on Twitter data using PySpark.
It uses the Sentiment140 dataset downloaded from Kaggle: https://www.kaggle.com/kazanova/sentiment140/code
Each tweet is classified into one of three labels: 'positive', 'neutral', or 'negative'.
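
As a rough illustration of the labelling step, the sketch below maps the dataset's numeric polarity codes to the three label names. Sentiment140 encodes polarity as 0 (negative), 2 (neutral), and 4 (positive). The function names and the 0.0/1.0/2.0 index order are assumptions made for this example, not necessarily what this project uses.

```python
# Sentiment140 polarity codes: 0 = negative, 2 = neutral, 4 = positive.
POLARITY_TO_LABEL = {0: "negative", 2: "neutral", 4: "positive"}

# Hypothetical numeric indices for model training (Spark ML expects
# labels as doubles starting at 0.0). The ordering is an assumption.
LABEL_TO_INDEX = {"negative": 0.0, "neutral": 1.0, "positive": 2.0}


def polarity_to_label(polarity):
    """Convert a raw polarity value (int or numeric string) to a label name."""
    return POLARITY_TO_LABEL[int(polarity)]


def label_to_index(label):
    """Convert a label name to the numeric index used for training."""
    return LABEL_TO_INDEX[label]
```

An unexpected polarity value raises a `KeyError`, which surfaces malformed rows early rather than silently mislabelling them.
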
Two classification models, Logistic Regression and Naive Bayes, are trained and evaluated.