Joint entry by Mike O'Malley and me to the 2019 recommender systems challenge (RecSys Challenge 2019). For the leaderboard, see https://recsys.trivago.cloud/leaderboard/
Mike and I used a light gradient boosted regression model to predict which item a user was most likely to click out on. We used an NDCG-based acquisition function, as NDCG has previously been found to correlate very well with the mean reciprocal rank (MRR). The model achieved a leaderboard score of 0.692.
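To illustrate why NDCG can track the MRR closely, here is a minimal sketch that computes both for a ranked list. It assumes exactly one clicked item per session, in which case the ideal DCG is 1 and NDCG reduces to 1 / log2(position + 1). The function names and item ids are hypothetical and are not taken from our code.

```python
import math


def reciprocal_rank(ranked_items, clicked_item):
    """Return 1 / (1-based position of the clicked item), or 0.0 if absent."""
    for position, item in enumerate(ranked_items, start=1):
        if item == clicked_item:
            return 1.0 / position
    return 0.0


def ndcg_single_click(ranked_items, clicked_item):
    """NDCG when exactly one item is relevant (assumption of this sketch).

    The ideal ranking puts the clicked item first, so the ideal DCG is
    1 / log2(2) = 1 and NDCG equals the DCG of the actual ranking.
    """
    for position, item in enumerate(ranked_items, start=1):
        if item == clicked_item:
            return 1.0 / math.log2(position + 1)
    return 0.0


def mean_reciprocal_rank(sessions):
    """Average the reciprocal rank over (ranked_items, clicked_item) pairs."""
    return sum(reciprocal_rank(r, c) for r, c in sessions) / len(sessions)


# Hypothetical item ids, for illustration only.
ranking = ["item_a", "item_b", "item_c", "item_d"]
for clicked in ranking:
    print(clicked, reciprocal_rank(ranking, clicked), ndcg_single_click(ranking, clicked))
```

Both quantities depend only on the position of the clicked item and decrease as it moves down the list, so a ranking that improves one also improves the other for that session.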
Below you will find the steps to reproduce our solution. Please don't hesitate to get in touch!
- This solution requires approximately 400 GB of RAM and takes about 36 hours to train on 40 cores.
- Put all data into the data folder.
- Run preprocessing_scripts/update_metadata.py
- Run model/Main.py
- The predictions can now be found in the predictions folder.
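
The two script steps above can be chained with a small driver. This is only a sketch: it assumes both scripts are run from the repository root, take no command-line arguments, and that the data is already in the data folder. The names `STEPS`, `build_commands` and `run_pipeline` are hypothetical and are not part of the repository.

```python
import subprocess
import sys
from pathlib import Path

# Script paths as listed in the steps, relative to the repository root.
STEPS = [
    Path("preprocessing_scripts") / "update_metadata.py",
    Path("model") / "Main.py",
]


def build_commands(python=sys.executable):
    """Build one command per step, using the current Python interpreter."""
    return [[python, str(script)] for script in STEPS]


def run_pipeline(runner=subprocess.run):
    """Run preprocessing, then training, stopping at the first failure."""
    for command in build_commands():
        # check=True raises CalledProcessError on a non-zero exit status,
        # so training does not start if preprocessing failed.
        runner(command, check=True)
```

Calling `run_pipeline()` from the repository root runs both steps in order. Given the memory and training time noted above, it is worth confirming the machine has enough resources before starting.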