fix models path #25
Conversation
Interesting. Locally I can run
This sounds like a change that should be fixed within
I don't think so, because this repo doesn't use mteb to work with the data, and the source of the problem is elsewhere.
tests/test_load_results.py (outdated):
```python
import pytest
from pathlib import Path

from datasets import load_dataset

# MODELS is the list of model names defined elsewhere in this test module.

@pytest.mark.parametrize("model", MODELS)
def test_load_results_from_datasets(model):
    """Ensures that all models can be imported from the dataset."""
    path = Path(__file__).parent.parent / "results.py"
    ds = load_dataset(str(path.absolute()), model, trust_remote_code=True)
```
Why is this required? Does the leaderboard load it using datasets?
@Samoed if you have addressed a comment, please mark it resolved - it makes it easier for me to see what you have fixed. Also, it seems like there are still unresolved files.
Now I resolved only
Co-authored-by: Kenneth Enevoldsen <[email protected]>
# Conflicts:
#	paths.json
Took a look over it again. Are you referring to:
Are you saying that the leaderboard repo CI assumes something about the file structure in this repo, and that is what is causing the issue? The current file format is def. the one we want to stick with, so we should update the leaderboard code to match if that is the case.
A bit, yes. Right now paths.json can contain two versions of a model's result files, and the results repo looks only for models that are defined in the split. For bge-m3, for example, the repo will contain result files without a revision, while the other files, with a revision, can't be imported. My PR merges these results by finding the model_name after the author name, so new-format results can be used alongside the old format.
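Roughly, the merging idea looks like this (a sketch with illustrative names like `resolve_model_dirs`, not the actual code in `results.py`):

```python
# Sketch of the folder matching described above; names are illustrative.
from pathlib import Path


def resolve_model_dirs(results_root: Path, model_name: str) -> list[Path]:
    """Match both old ("e5-small-v2") and new ("intfloat__e5-small-v2") layouts."""
    matches = []
    for folder in results_root.iterdir():
        if not folder.is_dir():
            continue
        # New format: keep the part after "author__"; old format: the full name.
        if folder.name.split("__", 1)[-1] == model_name:
            matches.append(folder)
    return matches
```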
…to fix_models_path
# Conflicts:
#	paths.json
Hmm right, I'm getting a better hang of the code here - the integration with mteb is quite poor, which leads to a few issues. So, the ideal situation:
and that results should be compatible with
For splits etc., this is a duplicate of information that already exists elsewhere; I don't see any reason why we should filter it within this repository (I might be missing something). That is the ideal case. We might need to make some shortcuts here.
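Something along these lines, perhaps (a minimal sketch, assuming mteb exposes a `load_results()` helper; its exact name and signature may differ between versions):

```python
# Minimal sketch of the ideal flow: let mteb own loading and filtering,
# rather than duplicating split information in this repository.
# Assumes `mteb.load_results()` exists; check your mteb version.
import mteb

results = mteb.load_results()
```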
I understand your idea and agree that it should be implemented this way. However, it will require more time to integrate the old results into the new format, and some refactoring of the leaderboard will be needed.
Now the tests fail because of huggingface/datasets#7141.
> I understand your idea and agree that it should be implemented this way. However, it will require more time to integrate the old results into the new format, and some refactoring of the leaderboard will be needed.
Thanks. Yep, def. requires more changes - I think this looks reasonable as is, though we need to resolve the failing tests.
@orionw since you are more familiar with the leaderboard integration, can I ask you to review this PR as well?
Thanks for this PR @Samoed!! There was definitely some problem with it before that you fixed, thanks! I may be having a hard time following this PR, but it seems like we just make it so that any shortened model name also works, so we don't miss any or have duplicates.
If so, did we make sure we kept all the existing results? I see the GritLM no-instruction one got deleted, for example.
That's my only concern, otherwise LGTM.
@Samoed we still have to resolve the failing test before we can merge this in. Also please respond to the comments. You also write:
Which, when I read it along with the failing tests, suggests there is only partly a solution here. In that case, please describe what is stopping it from becoming a finished solution.
I've added a test to load all model results by split, but the results take paths.json from the HF repository, where the paths are broken. To make this test pass, it needs to be fixed in the HF repository.

TIL: I wrote replies to all your comments, but I didn't know I needed to press the submit button in the Files tab to make them visible to you.
Currently, there is one model that won't be fixed by my PR:
Thanks for the clarification, I should have noted that. It seems like a test that will never fail when we want it to (when we introduce an error, it will use the current working version), and when that error is added, the next (working) PR will fail. Shouldn't it just reference the paths.json within the PR?
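For instance, something like this (a hypothetical variant; the file location is assumed):

```python
# Read paths.json from the checkout under review instead of the copy on the
# Hub, so a broken path introduced by a PR fails that same PR's CI.
import json
from pathlib import Path


def load_local_paths() -> dict:
    paths_file = Path(__file__).parent.parent / "paths.json"
    with paths_file.open() as f:
        return json.load(f)
```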
Ahh, we have all been there :) Re names, we should use `__` to separate the author from the model names. For now, let us treat 256 as part of the author (though my guess is that it refers to the embed size), so: `google-gecko-256.text-embedding-preview-0409/no_revision_available` -> `google-gecko-256__text-embedding-preview-0409/no_revision_available`. Re: I can't seem to find that folder in the results repo?
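For illustration, the rename could be scripted roughly like this (a sketch; the function name and directory layout are assumptions):

```python
# Replace the "." separating author and model with "__", treating everything
# before the first "." as the author (so "google-gecko-256" stays together).
from pathlib import Path


def rename_author_model_dirs(results_root: Path) -> None:
    for folder in results_root.iterdir():
        if folder.is_dir() and "." in folder.name and "__" not in folder.name:
            author, model = folder.name.split(".", 1)
            folder.rename(results_root / f"{author}__{model}")
```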
I can't find it either, but I think it might be google-gecko.text-embedding-preview-0409; I'm not sure, though. I found these models in the leaderboard config: https://github.com/embeddings-benchmark/leaderboard/blob/12fdafc4eb41fb4e4dddfc6a6a202f13e23b734a/model_meta.yaml#L814-L831
I thought the same regarding #24, but I believe that when HF tries to download the repository, it will only download results.py and the README, not the paths file. So the paths file is being downloaded by the results file at runtime, and I'm not sure how to improve this.
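One option might be to fetch it explicitly at runtime (an assumption about both the approach and the Hub repo id, not the repo's current code):

```python
# Pull paths.json from the Hub at runtime, since `load_dataset` only
# materialises the loading script and README.
import json

from huggingface_hub import hf_hub_download

paths_file = hf_hub_download(
    repo_id="mteb/results",  # assumed Hub dataset id
    filename="paths.json",
    repo_type="dataset",
)
with open(paths_file) as f:
    paths = json.load(f)
```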
Revert "rename" This reverts commit e252add. rename
e252add
to
266d095
Compare
Let us delete
Hmm, not sure either. The code:
does seem to follow the docs. It does say legacy, though. If we can't get it to work, we should just remove the test.
Yes, this will work for local development if the full repo is provided, but when using
Thanks for the response @Samoed, I understand now.
I have been confused by the difference between Google Gecko and this text-embedding-004 model on the Arena. Google's naming conventions make this very hard to understand.
Thanks @Samoed for taking the time on this - I will merge it in now to allow the other PRs to be resolved as well.
Since #18, you can't load results for models that are in the `author__model` folder format, because the script only looks for folders that match the model name. For example, this results in an error because there is only an `intfloat__e5-small-v2` folder (see the sketch below). However, models without a revision still work, since their folder names haven't changed.

I've also added a test to load all model results. I'm not sure how long this will take to run on CI, but locally, with a hard-coded local JSON, it took 2 minutes. Currently, it is failing due to errors like the one mentioned above.
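A hypothetical illustration of the failure (the config name below is an assumption, not the exact snippet):

```python
# Requesting results for a model stored in the new layout fails because the
# script searches for a folder named exactly after the model, while only
# "intfloat__e5-small-v2" exists on disk.
from datasets import load_dataset

ds = load_dataset("results.py", "intfloat/e5-small-v2", trust_remote_code=True)
# -> raises: no folder matches the bare model name under the author__model layout
```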
@KennethEnevoldsen, could you please take a look at this?