Normalize results in results folder #18

Merged: 10 commits, Aug 15, 2024

Contributor

@KennethEnevoldsen KennethEnevoldsen commented Aug 14, 2024

Added a script to normalize the results in the folder so that they follow the format created by mteb run. It also merges the results if multiple folders are present for the same model (e.g. due to #15).

The path is derived from the model_meta.json, so any folder without this file will stay the same.
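A minimal sketch of how a target path could be derived from the metadata file, not the script from this PR. It assumes that model_meta.json holds "name" and "revision" keys and that the normalized layout is `<results root>/<name with "/" replaced by "__">/<revision>`. The function name and the fallback revision string are hypothetical.

```python
import json
from pathlib import Path
from typing import Optional


def normalized_dir(results_root: Path, model_dir: Path) -> Optional[Path]:
    """Return the normalized folder for model_dir, or None if it has no model_meta.json."""
    meta_file = model_dir / "model_meta.json"
    if not meta_file.is_file():
        # No metadata: the folder is left where it is.
        return None
    meta = json.loads(meta_file.read_text(encoding="utf-8"))
    # Assumed keys: "name" (e.g. "org/model") and "revision".
    name = str(meta["name"]).replace("/", "__")
    # Hypothetical placeholder for metadata without a revision.
    revision = str(meta.get("revision") or "no_revision_available")
    return results_root / name / revision
```

Two source folders that map to the same returned path are the candidates for merging.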

Edit: ran an additional model on a task where the results were missing
Edit: added a test to enforce the folder structure going forward

@KennethEnevoldsen KennethEnevoldsen changed the title from "wip" to "Normalize results in results folder" Aug 14, 2024
@Muennighoff
Contributor

Nice! The diff is hard to view, so we should just be careful not to delete results that are unique. For duplicates, we should check that not only the filename and model match but also the scores (maybe checking a few files to make sure the scores differ by no more than what randomness could explain).
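One way such a duplicate check could look, as a rough sketch rather than anything used in this PR: compare every numeric value in two parsed result files and treat them as duplicates only if all values agree within a tolerance. The function names and the tolerance of 1e-3 are assumptions, and no particular result schema is assumed.

```python
import math
from typing import Any, Dict


def numeric_leaves(obj: Any, prefix: str = "") -> Dict[str, float]:
    """Flatten nested dicts/lists into {path: value} for all numeric leaves."""
    out: Dict[str, float] = {}
    if isinstance(obj, dict):
        for key, value in obj.items():
            out.update(numeric_leaves(value, f"{prefix}/{key}"))
    elif isinstance(obj, list):
        for index, value in enumerate(obj):
            out.update(numeric_leaves(value, f"{prefix}/{index}"))
    elif isinstance(obj, (int, float)) and not isinstance(obj, bool):
        out[prefix] = float(obj)
    return out


def same_scores(a: Any, b: Any, abs_tol: float = 1e-3) -> bool:
    """True if both results have the same numeric fields and all agree within abs_tol."""
    leaves_a, leaves_b = numeric_leaves(a), numeric_leaves(b)
    if leaves_a.keys() != leaves_b.keys():
        return False
    return all(math.isclose(leaves_a[k], leaves_b[k], abs_tol=abs_tol) for k in leaves_a)
```

Numeric fields that are not scores (timings, for instance) would need to be dropped before the comparison, since they differ between runs.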

@KennethEnevoldsen
Contributor Author

@Muennighoff I did a manual review of quite a few files. Naturally, as you say, it is hard to know exactly. However, almost all of the file changes are caused by moving models to a new folder. The cases where that wasn't so were primarily due to #15, where I manually resolved the diff (since you ran it with a slightly lower version, it could not be automatically resolved). I have also "validated" the files for MMTEB to ensure that they contain all the relevant splits for all the relevant tasks.
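The split validation mentioned above could be sketched roughly as follows. This is an illustration, not the check that was actually run: it assumes each result file has a "scores" mapping from split name to a non-empty list of score entries, and the function name is hypothetical.

```python
from typing import Any, Dict, Iterable, List


def missing_splits(result: Dict[str, Any], expected: Iterable[str]) -> List[str]:
    """Return the expected splits that are absent or empty in a parsed result file."""
    # Assumed layout: result["scores"] maps split name -> list of score entries.
    scores = result.get("scores", {})
    return [split for split in expected if not scores.get(split)]
```

An empty return value means the file covers every expected split for that task.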

Do note that this does not change any of the results in a folder without the "model_meta.json".

@KennethEnevoldsen KennethEnevoldsen merged commit 9a79f7e into main Aug 15, 2024
2 checks passed
@KennethEnevoldsen KennethEnevoldsen deleted the normalize_results_folder branch August 15, 2024 14:44
@Samoed Samoed mentioned this pull request Sep 5, 2024