Packages used: numpy, pandas, sklearn, matplotlib, seaborn, etc.
- Movie Data Analysis: use TMDb movie data from Kaggle to explore the question 'what makes a highly rated movie?'
- Twitter Data Wrangling: wrangle WeRateDogs Twitter data (accessed through the Twitter API) to create analyses and visualizations
- Prosper Loan Data Visualization: which factors influence loan status and default rate?
- Comparing Classifiers: use simulated data to compare the performance of different machine learning classifiers (LDA, QDA, Naive Bayes, Logistic Regression, KNN); use the same classifiers to predict voter turnout from real-world data.
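
The classifier comparison on simulated data could look roughly like the sketch below. It is not the project's actual code: the data generator (`make_classification`), the sample sizes, the random seed, the 5-fold cross-validation, and the hyperparameters are all assumptions chosen for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import (
    LinearDiscriminantAnalysis,
    QuadraticDiscriminantAnalysis,
)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier

# Simulated binary classification data (sizes and seed are assumptions).
X, y = make_classification(
    n_samples=500,
    n_features=10,
    n_informative=5,
    n_redundant=0,  # avoid collinear features, which QDA handles poorly
    random_state=0,
)

# The five classifiers named in the project description, with
# illustrative (not tuned) hyperparameters.
classifiers = {
    "LDA": LinearDiscriminantAnalysis(),
    "QDA": QuadraticDiscriminantAnalysis(),
    "Naive Bayes": GaussianNB(),
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "KNN": KNeighborsClassifier(n_neighbors=5),
}

# Mean 5-fold cross-validated accuracy for each classifier.
scores = {
    name: cross_val_score(clf, X, y, cv=5, scoring="accuracy").mean()
    for name, clf in classifiers.items()
}

# Print the classifiers from best to worst mean accuracy.
for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<20} {score:.3f}")
```

Cross-validation is used here so that every classifier is scored on the same held-out folds; a single train/test split would also work but gives noisier comparisons on small simulated samples.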