Here the goal is to release a model with the following properties:
- Truly open-source
- 3.3B dense
- Supports all 202 NLLB languages in both directions
Note: it will be very hard to reach a satisfactory level of quality for 202 languages with a dense checkpoint. The original work from Meta used a ~54B-parameter MoE (mixture-of-experts) model plus a large amount of compute (~52k GPU-hours on A100-SXM-80GB) to get decent results.
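For context, a dense NLLB-style checkpoint of this size can be driven with the standard Hugging Face seq2seq API. The sketch below uses Meta's existing `facebook/nllb-200-3.3B` checkpoint purely as a stand-in for the model proposed here; the model name and language codes are assumptions, not part of this project's release.

```python
# Minimal sketch: translating with a 3.3B dense NLLB-style checkpoint via Hugging Face
# transformers. "facebook/nllb-200-3.3B" is Meta's existing checkpoint, used here only
# as a stand-in for the open-source model this issue proposes.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_name = "facebook/nllb-200-3.3B"
tokenizer = AutoTokenizer.from_pretrained(model_name, src_lang="eng_Latn")
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

text = "Open multilingual translation for all 202 NLLB languages."
inputs = tokenizer(text, return_tensors="pt")

# Force the decoder to start with the target-language tag (a FLORES-200 code),
# which is how NLLB selects the output language among the 202 supported ones.
generated = model.generate(
    **inputs,
    forced_bos_token_id=tokenizer.convert_tokens_to_ids("fra_Latn"),
    max_length=128,
)
print(tokenizer.batch_decode(generated, skip_special_tokens=True)[0])
```

Because both source and target are selected by language tags, the same dense checkpoint covers all 202 languages in both directions without per-pair models.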
We do have plans to scale beyond 3.3B parameters.
This is the end goal for the current project scope.