
[Modeling] Release a 3.3B Open-NLLB checkpoint (~202 languages) #17

Open
gordicaleksa opened this issue Sep 12, 2023 · 0 comments
Labels: enhancement (New feature or request)

Comments

@gordicaleksa (Owner)

This is the end goal for the current project scope.

The goal is to release a model with the following properties:

  • Truly open-source
  • 3.3B dense
  • Supports all 202 NLLB languages in both directions
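To give a sense of the scope of that last bullet: supporting 202 languages in both directions means every ordered language pair is a distinct translation direction. A quick sanity-check calculation (the 202-language count is from this issue; the arithmetic is mine):

```python
def num_directions(n: int) -> int:
    """Number of distinct translation directions for n languages,
    counting X->Y and Y->X separately (ordered pairs, no self-pairs)."""
    return n * (n - 1)

# For the 202 NLLB languages this is 202 * 201 = 40602 directions,
# all of which a single dense checkpoint would need to cover.
print(num_directions(202))
```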

Note: it will be very hard to reach a satisfactory level of quality across 202 languages with a dense checkpoint. The original work from Meta used a ~54B parameter MoE (mixture of experts) model to get decent results, plus a ton of compute (~52k hours on A100-SXM-80GB GPUs).

We do have plans to scale beyond the 3.3B parameter scale.

@gordicaleksa gordicaleksa added the enhancement label Sep 12, 2023
@gordicaleksa gordicaleksa changed the title from "Release a 3.3B Open-NLLB checkpoint (~202 languages)" to "[Modeling] Release a 3.3B Open-NLLB checkpoint (~202 languages)" Sep 12, 2023