fix: Enforce revision ID and model names for future contributions #56

Merged
merged 1 commit into from
Nov 26, 2024
4 changes: 4 additions & 0 deletions README.md
@@ -4,6 +4,10 @@ type: evaluation
submission_name: MTEB
---

> [!NOTE]
> Previously it was possible to submit model results to MTEB by adding the results to the model metadata. This is no longer an option, as we want to ensure high-quality metadata.
Contributor

Question: is that currently enforced? Or will only be in effect once the new leaderboard goes public?

Contributor Author

The new leaderboard does not fetch results from model metadata automatically; it only fetches from this repository.

However, we had a PR that fetched everything from HF once, and that is probably not something we will run again.

E.g. the models:

'vectoriseai/bge-base-en-v1.5',
'vectoriseai/bge-large-en-v1.5',
'vectoriseai/bge-small-en',
'vectoriseai/bge-small-en-v1.5',
'vectoriseai/e5-base',
'vectoriseai/e5-base-v2',
'vectoriseai/e5-large',
'vectoriseai/e5-large-v2',
'vectoriseai/e5-small-v2',
'vectoriseai/gte-base',
'vectoriseai/gte-large',
'vectoriseai/gte-small',
'vectoriseai/multilingual-e5-large',

are all copies, and they create duplicates on the leaderboard.

Contributor Author

@KennethEnevoldsen, Nov 26, 2024

PR #57 removes these.
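
The duplicate problem described in this thread could be flagged with a check along the following lines. This is only a minimal sketch: `find_probable_copies` is a hypothetical helper that is not part of this repository, and the `BAAI/bge-base-en-v1.5` entry is illustrative input added here. It groups `org/model` names by the part after the slash, so it surfaces candidates for manual review rather than proving that a model is a copy.

```python
from collections import defaultdict


def find_probable_copies(model_names):
    """Return model names that share the same base name across organizations.

    Hypothetical helper: names are assumed to look like "org/model".
    """
    by_base = defaultdict(set)
    for name in model_names:
        org, _, base = name.partition("/")
        if base:  # ignore names without an organization prefix
            by_base[base.lower()].add(name)
    # Keep only base names that appear under more than one full name.
    return {base: sorted(names) for base, names in by_base.items() if len(names) > 1}


names = [
    "vectoriseai/bge-base-en-v1.5",
    "BAAI/bge-base-en-v1.5",  # illustrative input, not taken from the thread
    "vectoriseai/gte-small",
]
copies = find_probable_copies(names)
print(copies)
```

A base-name match is a weak signal on its own, so any hit would still need a human to confirm which entry is the original upload.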



This repository contains the results of the embedding benchmark evaluated using the package `mteb`.


6 changes: 2 additions & 4 deletions paths.json
@@ -14632,10 +14632,6 @@
"results/vesteinn__DanskBERT/no_revision_available/LccSentimentClassification.json",
"results/vesteinn__DanskBERT/no_revision_available/DalajClassification.json"
],
-"voyage-multilingual-2": [
-"results/voyage-multilingual-2/1/BornholmBitextMining.json",
-"results/voyage-multilingual-2/1/STSB.json"
-],
"voyageai__voyage-2": [
"results/voyageai__voyage-2/no_revision_available/MassiveIntentClassification.json",
"results/voyageai__voyage-2/no_revision_available/MLSUMClusteringP2P.json",
@@ -15003,6 +14999,8 @@
"results/voyageai__voyage-lite-02-instruct/no_revision_available/StackOverflowDupQuestions.json"
],
"voyageai__voyage-multilingual-2": [
+"results/voyageai__voyage-multilingual-2/1/BornholmBitextMining.json",
+"results/voyageai__voyage-multilingual-2/1/STSB.json"
"results/voyageai__voyage-multilingual-2/no_revision_available/MassiveIntentClassification.json",
"results/voyageai__voyage-multilingual-2/no_revision_available/LEMBWikimQARetrieval.json",
"results/voyageai__voyage-multilingual-2/no_revision_available/MLSUMClusteringP2P.json",
1 change: 0 additions & 1 deletion results/voyage-multilingual-2/1/model_meta.json

This file was deleted.

@@ -1 +1 @@
-{"name": "voyageai__voyage-3-lite", "revision": "no_revision_available", "release_date": "2024-09-22", "languages": null, "n_parameters": null, "memory_usage": null, "max_tokens": 32000, "embed_dim": 512, "license": null, "open_source": false, "similarity_fn_name": null, "framework": [], "loader": "VoyageWrapper"}
+{"name": "voyageai/voyage-3-lite", "revision": "no_revision_available", "release_date": "2024-09-22", "languages": null, "n_parameters": null, "memory_usage": null, "max_tokens": 32000, "embed_dim": 512, "license": null, "open_source": false, "similarity_fn_name": null, "framework": [], "loader": "VoyageWrapper"}
@@ -1 +1 @@
-{"name": "voyageai__voyage-3", "revision": "no_revision_available", "release_date": "2024-09-22", "languages": null, "n_parameters": null, "memory_usage": null, "max_tokens": 32000, "embed_dim": 1024, "license": null, "open_source": false, "similarity_fn_name": null, "framework": [], "loader": "VoyageWrapper"}
+{"name": "voyageai/voyage-3", "revision": "no_revision_available", "release_date": "2024-09-22", "languages": null, "n_parameters": null, "memory_usage": null, "max_tokens": 32000, "embed_dim": 1024, "license": null, "open_source": false, "similarity_fn_name": null, "framework": [], "loader": "VoyageWrapper"}
1 change: 1 addition & 0 deletions results/voyageai__voyage-multilingual-2/1/model_meta.json
@@ -0,0 +1 @@
+{"name": "voyageai/voyage-multilingual-2", "revision": "1", "release_date": "2024-06-10", "languages": null, "n_parameters": null, "memory_usage": null, "max_tokens": 32000, "embed_dim": 1024, "license": null, "open_source": false, "similarity_fn_name": null, "framework": [], "loader": "VoyageWrapper"}
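
The renames in this diff suggest a convention where the results folder is called `org__model` while the `name` field in `model_meta.json` is `org/model`. A minimal sketch of that mapping follows. It assumes the convention inferred from the diff above; both helper names are hypothetical and do not come from the repository or from `mteb`.

```python
def folder_name_to_model_name(folder_name: str) -> str:
    """Turn "org__model" into "org/model" (assumed convention, hypothetical helper)."""
    org, sep, model = folder_name.partition("__")
    if not sep or not org or not model:
        raise ValueError(f"folder name has no organization prefix: {folder_name!r}")
    return f"{org}/{model}"


def model_name_to_folder_name(model_name: str) -> str:
    """Turn "org/model" into "org__model" (assumed convention, hypothetical helper)."""
    org, sep, model = model_name.partition("/")
    if not sep or not org or not model:
        raise ValueError(f"model name has no organization prefix: {model_name!r}")
    return f"{org}__{model}"


print(folder_name_to_model_name("voyageai__voyage-multilingual-2"))
print(model_name_to_folder_name("voyageai/voyage-3-lite"))
```

Under this convention a folder such as `voyage-multilingual-2`, which has no organization part, is rejected, which matches the move to `voyageai__voyage-multilingual-2` shown in this PR.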
414 changes: 211 additions & 203 deletions tests/test_correct_folder_structure.py

Large diffs are not rendered by default.

73 changes: 73 additions & 0 deletions tests/test_ensure_correct_metadata.py
@@ -0,0 +1,73 @@
import json

import pytest

from tests.test_correct_folder_structure import folders_without_meta, results_folder

model_rev_pairs = [
    (model_folder, rev_folder)
    for model_folder in results_folder.glob("*")
    for rev_folder in model_folder.glob("*")
    if model_folder.name not in [".DS_Store"]
    and rev_folder.name not in [".DS_Store"]
    and ((model_folder.name, rev_folder.name) not in folders_without_meta)
]


@pytest.mark.parametrize("model_rev_pair", model_rev_pairs)
def test_model_meta_in_folders(model_rev_pair):
    """
    Models added after the 26th of November should contain a model_meta.json file
Contributor

Suggested change
-Models added after the 26th of November should contain a model_meta.json file
+Models added after November 26, 2024 should contain a model_meta.json file

    """
    model_folder, rev_folder = model_rev_pair

    meta_file = rev_folder / "model_meta.json"
    assert meta_file.exists()
    assert meta_file.is_file()
    assert meta_file.parent == rev_folder
    assert meta_file.suffix == ".json"
    assert meta_file.stem == "model_meta"


# Please do not add to this list. It is only for historic results; all new results should include a revision ID.
revision_exceptions = [
    ("castorini__mdpr-tied-pft-msmarco", "no_revision_available"),
    ("voyageai__voyage-3", "no_revision_available"),
    ("sentence-transformers__all-mpnet-base-v2", "no_revision_available"),
    ("Snowflake__snowflake-arctic-embed-m-v1.5", "no_revision_available"),
    ("sentence-transformers__all-MiniLM-L12-v2", "no_revision_available"),
    ("nthakur__mcontriever-base-msmarco", "no_revision_available"),
    ("voyageai__voyage-3-lite", "no_revision_available"),
]


@pytest.mark.parametrize("model_rev_pair", model_rev_pairs)
def test_revision_is_specified_for_new_additions(model_rev_pair):
    """
    Models added after 26th of November should include a revision ID and cannot use the "no_revision_available" fallback.
    """
    model_folder, rev_folder = model_rev_pair
    if (model_folder.name, rev_folder.name) not in revision_exceptions:
        meta_file = rev_folder / "model_meta.json"
        with meta_file.open("r") as f:
            meta = json.load(f)
        assert meta["revision"].lower() not in [
            "no_revision_available",
            "",
            "na",
            "no-revision-available",
        ]


@pytest.mark.parametrize("model_rev_pair", model_rev_pairs)
def test_organization_is_specified_for_new_additions(model_rev_pair):
    """
    Models added after 26th of November should include an organization ID within their name, e.g. "myorg/my_embedding_model".
Contributor

Suggested change
-Models added after 26th of November should include an organization ID within their name, e.g. "myorg/my_embedding_model".
+Models added after November 26, 2024 should include an organization ID within their name, e.g. "myorg/my_embedding_model".


    This is to avoid misspecified names such as "myorg__my_embedding_model" and similar.
    """
    model_folder, rev_folder = model_rev_pair
    meta_file = rev_folder / "model_meta.json"
    with meta_file.open("r") as f:
        meta = json.load(f)
    assert "/" in meta["name"]
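
A contributor could run roughly the same checks locally before opening a PR. The sketch below mirrors the three tests above in one function that returns a list of problems instead of asserting. `check_model_meta`, `check_meta_file`, and `PLACEHOLDER_REVISIONS` are hypothetical names introduced here for illustration; they are not part of this repository or of `mteb`.

```python
import json
from pathlib import Path
from typing import List

# Placeholder values rejected by the revision test above.
PLACEHOLDER_REVISIONS = {"no_revision_available", "", "na", "no-revision-available"}


def check_model_meta(meta: dict) -> List[str]:
    """Return a list of problems found in a parsed model_meta.json (hypothetical helper)."""
    problems = []
    name = meta.get("name") or ""
    if "/" not in name:
        problems.append(f"name {name!r} has no organization, expected 'myorg/my_model'")
    revision = meta.get("revision")
    if revision is None or str(revision).lower() in PLACEHOLDER_REVISIONS:
        problems.append(f"revision {revision!r} is a placeholder, a real revision ID is required")
    return problems


def check_meta_file(path: Path) -> List[str]:
    """Load a model_meta.json file and check it (hypothetical helper)."""
    if not path.is_file():
        return [f"{path} does not exist"]
    with path.open("r") as f:
        return check_model_meta(json.load(f))


good = {"name": "voyageai/voyage-multilingual-2", "revision": "1"}
bad = {"name": "voyageai__voyage-3", "revision": "no_revision_available"}
print(check_model_meta(good))
print(check_model_meta(bad))
```

Unlike the tests, this sketch does not consult `revision_exceptions`, so historic entries such as `voyageai__voyage-3` would be reported even though the test suite tolerates them.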