Support for cross-validation in scenarios #2
Hi @paraschakis, Thanks for raising the question! Could you give us an idea of what kind of behavior you would expect when you perform cross-validation with, for example, the WeakGeneralization and StrongGeneralizationTimed scenarios? For now, if you want to run multiple experiments to achieve a kind of cross-validation, we propose you do the following:
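A minimal sketch of the idea: repeat the same scenario with different random seeds and aggregate the metric over the runs. The scenario parameter names (`frac_users_train`, `frac_interactions_in`, `seed`) and the `full_training_data` attribute are assumptions based on the RecPack documentation, and `run_experiment` is a hypothetical placeholder for your own training and evaluation code; check all of this against your installed version.

```python
# Monte Carlo cross-validation by repeating the same scenario with different seeds.
# NOTE: the scenario parameters and run_experiment() are assumptions / placeholders.
import numpy as np
from recpack.scenarios import StrongGeneralization


def run_experiment(scenario):
    # Hypothetical helper: fit your algorithm on scenario.full_training_data,
    # score it on the test fold, and return a single metric value (e.g. NDCG@10).
    ...


seeds = [1, 2, 3, 4, 5]
scores = []

for seed in seeds:
    # Each seed yields a different random split of users into train and test.
    scenario = StrongGeneralization(
        frac_users_train=0.7, frac_interactions_in=0.8, seed=seed
    )
    scenario.split(interaction_matrix)  # interaction_matrix: your InteractionMatrix

    # Train and evaluate on this split and keep the metric for this seed.
    scores.append(run_experiment(scenario))

print(f"mean = {np.mean(scores):.4f}, std = {np.std(scores):.4f}")
```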
Hope this helps!
Thanks for the provided code! Strictly speaking, seed-based splits won't give you true cross-validation. Cross-validation has been a standard way of evaluating recommender systems. Even much older libraries like MyMediaLite implement it. Just for reference, here's an extract from the "Recommender Systems Handbook" by Ricci et al.:
Here's another extract from the book 'Practical Recommender Systems' by Falk:
In weak generalization, you would perform cross-validation splits 'vertically', whereas in strong generalization you would do it 'horizontally'. This is my understanding. P. S. I already implemented a custom cross-validation procedure for my needs by creating a custom splitter and scenario, but my solution is hacky. That's why I think it would be nice to have built-in support for CV.
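To make the 'horizontal' (strong generalization) case concrete, here is a minimal, library-agnostic sketch of k-fold cross-validation over users: users are partitioned into k folds, each fold in turn becomes the test set, part of each test user's history is kept as fold-in input, and the rest is held out as the prediction target. This only illustrates the splitting logic; it is not RecPack's API.

```python
# Library-agnostic sketch of 'horizontal' (user-based) k-fold CV for strong generalization.
# X is a scipy.sparse CSR user-item interaction matrix.
import numpy as np
from scipy.sparse import lil_matrix


def user_kfold_splits(X, n_folds=5, frac_in=0.8, seed=42):
    rng = np.random.default_rng(seed)
    users = rng.permutation(X.shape[0])
    folds = np.array_split(users, n_folds)

    for test_users in folds:
        # Training data: the full histories of all users outside the test fold.
        X_train = X.tolil(copy=True)
        X_train[test_users, :] = 0
        X_train = X_train.tocsr()

        # For each test user, keep frac_in of the history as fold-in input
        # and hold out the remainder as the prediction target.
        X_test_in = lil_matrix(X.shape)
        X_test_out = lil_matrix(X.shape)
        for u in test_users:
            items = X[u].nonzero()[1].copy()
            rng.shuffle(items)
            cut = int(len(items) * frac_in)
            if cut > 0:
                X_test_in[u, items[:cut]] = 1
            if cut < len(items):
                X_test_out[u, items[cut:]] = 1

        yield X_train, X_test_in.tocsr(), X_test_out.tocsr()
```

An outer loop over these folds, training and evaluating once per fold and averaging the metric, then gives the k-fold estimate.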
Hi @paraschakis, Thanks for the references and added information! However, in RecSys, samples are not independent. On the contrary, collaborative filtering algorithms actually exploit relationships between samples (either items or users) to learn useful patterns to make recommendations. In summary,
Lien
Hi, @LienM and @paraschakis, I'd like to bring up a specific point about hyperparameter tuning and its interaction with cross-validation methods. The solution provided by @LienM for Monte Carlo cross-validation is helpful, but it seems difficult to integrate directly with the hyperparameter tuning options currently available in the library. In my opinion, built-in support for Monte Carlo cross-validation, particularly support that integrates seamlessly with the library's hyperparameter tuning methods, could significantly enhance the value of RecPack. This integration would provide a more robust methodology for hyperparameter tuning, accounting for the variability in a model's performance across different splits (a rough sketch of such an outer loop follows below). Therefore, I would like to request a reconsideration of the priority for implementing a built-in cross-validation feature in RecPack. Thank you for your consideration.

P.S.: Here are two papers that propose cross-validation for hyperparameter tuning: (a) "Top-N Recommendation Algorithms: A Quest for the State-of-the-Art" by Anelli et al.; and (b) "On the discriminative power of Hyper-parameters in Cross-Validation and how to choose them" by Anelli et al.

Best,
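A rough sketch of the kind of loop meant above: wrap the tuning grid in an outer loop over seeded splits and pick the configuration with the best average validation score. `split_with_seed` and `evaluate` are hypothetical placeholders for the scenario split and metric computation you already use; only the structure (score every candidate on every split, then average) is the point.

```python
# Sketch of Monte Carlo cross-validation wrapped around hyperparameter tuning.
# split_with_seed() and evaluate() are hypothetical placeholders.
import numpy as np

param_grid = [{"K": 50}, {"K": 100}, {"K": 200}]   # candidate hyperparameters
seeds = [1, 2, 3, 4, 5]

scores = {i: [] for i in range(len(param_grid))}    # config index -> scores per split

for seed in seeds:
    # Hypothetical seeded split into training, validation fold-in and hold-out data.
    train, val_in, val_out = split_with_seed(seed)
    for i, params in enumerate(param_grid):
        # Train a model with these hyperparameters on this split and score it
        # on the held-out validation data (hypothetical evaluate helper).
        scores[i].append(evaluate(params, train, val_in, val_out))

# Average over the splits so that no single random split decides the winner.
mean_scores = {i: np.mean(s) for i, s in scores.items()}
best = max(mean_scores, key=mean_scores.get)
print("Best config:", param_grid[best], "mean score:", mean_scores[best])
```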
Scenarios currently cover single train-test-validation splits. It would be nice to have a mechanism for cross-validation as well.