Porto Seguro’s Safe Driver Prediction

Contains a Jupyter notebook that classifies the Porto Seguro dataset from Kaggle. Porto Seguro is a car insurance provider in Brazil. The dataset is not included here; it can be downloaded from the Kaggle website after accepting the rules of the competition: https://www.kaggle.com/c/porto-seguro-safe-driver-prediction
The dataset is a good example of an imbalanced dataset: the two classes (people who filed an insurance claim and people who did not) are in the ratio 32:1, with claims being the rare class.
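As a quick way to check a ratio like this, the class counts can be tallied directly from the label column. The snippet below is only a minimal sketch: `imbalance_ratio` is a hypothetical helper, and `toy_labels` is synthetic data built to mimic a 32:1 split, not the real Kaggle labels.

```python
from collections import Counter


def imbalance_ratio(labels):
    """Return (majority_count / minority_count, counts) for a sequence of class labels."""
    counts = Counter(labels)
    if len(counts) < 2:
        raise ValueError("need at least two classes to compute a ratio")
    ordered = counts.most_common()  # sorted from most to least frequent
    majority = ordered[0][1]
    minority = ordered[-1][1]
    return majority / minority, counts


# Synthetic stand-in for the real labels (assumption): 0 = no claim, 1 = claim.
toy_labels = [0] * 3200 + [1] * 100
ratio, counts = imbalance_ratio(toy_labels)
print(f"counts={dict(counts)}, ratio={ratio:.1f}:1")
```

With the real data, the same function could be applied to the label column after loading the training file.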
Taking inspiration from this post, I will try out a few different strategies to deal with the imbalance:
https://machinelearningmastery.com/tactics-to-combat-imbalanced-classes-in-your-machine-learning-dataset/
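To give a flavour of what such strategies can look like, here is a minimal sketch of two common ones: weighting classes inversely to their frequency, and randomly undersampling the majority class. These are illustrative choices, not necessarily the ones used in the notebook. The helper names (`balanced_class_weights`, `random_undersample`) and the toy data are hypothetical, and only the Python standard library is used.

```python
import random
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


def random_undersample(rows, labels, seed=0):
    """Randomly drop rows so every class ends up as small as the rarest one."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    target_size = min(len(group) for group in by_class.values())
    sampled = []
    for label, group in by_class.items():
        # Sample without replacement from each class.
        for row in rng.sample(group, target_size):
            sampled.append((row, label))
    rng.shuffle(sampled)  # avoid leaving the classes in contiguous blocks
    new_rows = [row for row, _ in sampled]
    new_labels = [label for _, label in sampled]
    return new_rows, new_labels


# Synthetic data (assumption): 3200 negatives and 100 positives, one feature each.
toy_labels = [0] * 3200 + [1] * 100
toy_rows = [[i] for i in range(len(toy_labels))]

weights = balanced_class_weights(toy_labels)
rows_bal, labels_bal = random_undersample(toy_rows, toy_labels)
print(weights)
print(Counter(labels_bal))
```

Class weights keep all of the data and make mistakes on the rare class cost more, while undersampling throws away majority rows in exchange for a balanced, smaller training set. Which one works better here is something to determine empirically.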