This project uses statistics and a large collection of song data to test whether a song's physical attributes are related to its popularity. To do so, we analyze the available data in depth and train several machine learning and statistical models to predict a song's success.
This project was developed for a university databases class. One of its goals was therefore to populate a relational database in MySQL. We then used that database to build our models and test our hypothesis in Python.
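As a rough illustration of the database-to-Python workflow described above, the sketch below loads a few rows into a relational table and measures how strongly one attribute is associated with success. It is a minimal sketch under stated assumptions, not the project's actual code: it uses Python's built-in `sqlite3` as a stand-in for MySQL so that it runs without a server, and the `songs` table, its columns (`tempo`, `loudness`, `is_hit`), and all rows are hypothetical.

```python
import sqlite3
from math import sqrt

# Hypothetical toy rows for illustration only; these are NOT the project's data.
# Columns: title, tempo (BPM), loudness (dB), is_hit (1 = success, 0 = not).
ROWS = [
    ("song_a", 120.0, -5.0, 1),
    ("song_b", 95.0, -12.0, 0),
    ("song_c", 128.0, -4.5, 1),
    ("song_d", 80.0, -15.0, 0),
    ("song_e", 110.0, -7.0, 1),
    ("song_f", 100.0, -11.0, 0),
]

# Whitelist of feature columns, so a column name is never taken from raw input.
FEATURES = ("tempo", "loudness")


def load_songs(conn):
    """Create the hypothetical `songs` table and insert the toy rows."""
    conn.execute(
        "CREATE TABLE songs (title TEXT, tempo REAL, loudness REAL, is_hit INTEGER)"
    )
    conn.executemany("INSERT INTO songs VALUES (?, ?, ?, ?)", ROWS)
    conn.commit()


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def feature_hit_correlation(conn, feature):
    """Correlate one feature column with the binary `is_hit` label."""
    if feature not in FEATURES:
        raise ValueError(f"unknown feature: {feature}")
    rows = conn.execute(f"SELECT {feature}, is_hit FROM songs").fetchall()
    values = [row[0] for row in rows]
    labels = [row[1] for row in rows]
    return pearson(values, labels)


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    load_songs(connection)
    for name in FEATURES:
        print(name, round(feature_hit_correlation(connection, name), 3))
```

A correlation near zero would suggest the attribute carries little information about success on its own, while a value closer to 1 or -1 would make it a candidate input for a predictive model. With a real MySQL database, the connection would come from a MySQL driver instead of `sqlite3`, and the queries would target the project's actual schema.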
Million Song Dataset (MSD), cleaned, with Billboard and Grammy successes
36,000-song subset of the MSD, cleaned, with Billboard, Grammy, and Spotify successes