ci: run bm25 and ColBERT test in ci #1829

Open
wants to merge 11 commits into base: v2.0.0

@sam-hey sam-hey commented Jan 17, 2025

ci: enable skipped tests for ColBERT and BM25
fix: get model meta from CrossEncoder
ref: use tmp_path for tests
test: add tests for model_meta
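
For readers unfamiliar with the `tmp_path` change, here is a minimal sketch of the pattern, assuming pytest. The function `save_results` and the file name `results.json` are hypothetical stand-ins for illustration, not code from this PR:

```python
import json
from pathlib import Path


def save_results(results: dict, output_dir: Path) -> Path:
    """Hypothetical stand-in for code under test that writes a results file."""
    output_dir.mkdir(parents=True, exist_ok=True)
    path = output_dir / "results.json"
    path.write_text(json.dumps(results))
    return path


def test_save_results(tmp_path: Path):
    # pytest injects `tmp_path`, a fresh per-test directory (a pathlib.Path),
    # so the test writes nothing into the repository checkout.
    path = save_results({"main_score": 0.5}, tmp_path / "out")
    assert path.parent == tmp_path / "out"
    assert json.loads(path.read_text()) == {"main_score": 0.5}
```

Because each test gets its own directory, tests no longer share or clean up a hard-coded output folder.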

Checklist

  • Run tests locally to make sure nothing is broken using make test.
  • Run the formatter to format the code using make lint.

Started in: #1759 (comment)

@KennethEnevoldsen KennethEnevoldsen left a comment

This only really checks whether they can be installed. Should we add an actual test as well?

@sam-hey

sam-hey commented Jan 17, 2025

Hi @KennethEnevoldsen,

I added two tests in this PR. However, they are not being run in the CI because the dependencies don't exist.

@KennethEnevoldsen

Ahh, thanks for the pointer. Any reason not to remove the skip? (Otherwise I could later clean up the installs without realizing that I had disabled a test.)

Will you also add the CI time difference? (A coverage report would also be nice, but is not required.)

@sam-hey

sam-hey commented Jan 17, 2025

PyLate currently has a compatibility problem with older Python versions (see GitHub Issue #79), which causes the tests to fail on Python 3.9.

Not skipping would require all developers to install additional dependencies: pylate, bm25s, and pystemmer. While this is feasible, it does introduce more requirements for everyone on the team. This decision is ultimately up to you, and I am fine with proceeding if you decide to include them.
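
One possible middle ground is a conditional skip that only triggers when the extras are missing. A minimal sketch, assuming pytest: the helper `missing_packages` and the marker names are hypothetical, `Stemmer` is assumed to be the import name of pystemmer, and the Python 3.10 cutoff for pylate is an assumption based on the 3.9 failures, not a documented requirement:

```python
import importlib.util
import sys

import pytest


def missing_packages(names):
    """Return the top-level module names from `names` that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Hypothetical markers; names are illustrative, not taken from this PR.
_BM25_MISSING = missing_packages(["bm25s", "Stemmer"])
requires_bm25 = pytest.mark.skipif(
    bool(_BM25_MISSING),
    reason=f"optional dependencies not installed: {_BM25_MISSING}",
)

_PYLATE_MISSING = missing_packages(["pylate"])
requires_pylate = pytest.mark.skipif(
    # Assumed cutoff: skip on Python 3.9 and older, where the tests fail.
    sys.version_info < (3, 10) or bool(_PYLATE_MISSING),
    reason="pylate is not installed or this Python version is too old",
)


@requires_bm25
def test_bm25_is_importable():
    # Imported lazily so test collection works without the extra installed.
    import bm25s

    assert bm25s is not None
```

With this approach the tests always run in CI, where the extras are installed, while a developer without them sees an explicit skip reason instead of an import error.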

@sam-hey sam-hey marked this pull request as draft January 17, 2025 15:57
@sam-hey

sam-hey commented Jan 17, 2025

CI runtime with Python 3.11:

  • v2.0.0: checks occasionally take up to 12 minutes.
  • This branch: 8-9 minutes.

Local runtime:

  • ColBERT + BM25 tests: complete in 15.17 seconds on my machine.

Test coverage summary:

  • Overall: 66%
  • mteb/models/colbert_models.py: 63%
  • mteb/models/bm25.py: 94%

@sam-hey sam-hey marked this pull request as ready for review January 17, 2025 21:52
@sam-hey

sam-hey commented Jan 18, 2025

I believe I’ve added everything for now and would be happy to get your review of the changes!

  • Skipping pylate: Skipped for Python 3.9.
  • Updates:
    • Added extra tests.
    • Introduced ModelMeta for CrossEncoders passed as models to run().
  • Pending decision: awaiting your input on whether to skip the BM25s and ColBERT tests when the dependencies are not available.

@KennethEnevoldsen
