Here the goal is to release a model with the following properties:
- Truly open-source
- 3.3B dense
- Supports all 202 NLLB languages in both directions
Note: it will be very hard to reach a satisfactory level of quality for 202 languages with a dense checkpoint. The original work from Meta used a ~54B-parameter MoE (mixture-of-experts) model plus a large amount of compute (~52k GPU-hours on A100-SXM-80GB) to get decent results.
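For context, a dense NLLB-style checkpoint of this size can be driven with the standard Hugging Face seq2seq API. The sketch below uses Meta's existing `facebook/nllb-200-3.3B` checkpoint purely as a stand-in for the model proposed here; the model name and language codes are assumptions, not part of this project's release.

```python
# Minimal sketch: translating with a 3.3B dense NLLB-style checkpoint via Hugging Face
# transformers. "facebook/nllb-200-3.3B" is Meta's existing checkpoint, used here only
# as a stand-in for the open-source model this issue proposes.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_name = "facebook/nllb-200-3.3B"
tokenizer = AutoTokenizer.from_pretrained(model_name, src_lang="eng_Latn")
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

text = "Open multilingual translation for all 202 NLLB languages."
inputs = tokenizer(text, return_tensors="pt")

# Force the decoder to start with the target-language tag (a FLORES-200 code),
# which is how NLLB selects the output language among the 202 supported ones.
generated = model.generate(
    **inputs,
    forced_bos_token_id=tokenizer.convert_tokens_to_ids("fra_Latn"),
    max_length=128,
)
print(tokenizer.batch_decode(generated, skip_special_tokens=True)[0])
```

Because both source and target are selected by language tags, the same dense checkpoint covers all 202 languages in both directions without per-pair models.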
We do have plans to scale beyond 3.3B parameters.
This is the end goal for the current project scope.