Set Leo default model to Llama 3 8b (25% rollout) (#1225)
Related to brave/brave-core#21398. Blocked on deploying brave/aichat-ops#359 to prod.

This sets the default model for 25% of free users to `chat-basic`, which corresponds to [Llama 3 8b](https://github.com/brave/brave-core/blob/c839e1676031a3d155b04b9dbfe49e11b3b8601b/components/ai_chat/core/browser/model_service.cc#L154). The intention is to roll this change out progressively (25%, 50%, 75%, 100%) so we can ensure our single instance of Llama 3 can handle the increase in traffic. Based on these [estimations](https://artificialanalysis.ai/?models_selected=llama-3-1-instruct-8b%2Cmixtral-8x7b-instruct), one instance of Llama 3 (one GPU) should have higher token throughput than our single instance of Mixtral (four GPUs), so we expect this to work with the two Llama instances that will be added in brave/aichat-ops#359.

cc @petemill @LorenzoMinto

Note:

* Chromium version 122.0.6261.57 was selected because it was the first Chromium version shipped when 1.63.x was released (see https://bravesoftware.slack.com/archives/C04PX1BUN/p1708629893634639), which is when brave/brave-core#21398 went in.
* All platforms are included because we want to make this change everywhere; this differs from the BraveAIChatEnabledStudy, which applies only to desktop.

---------

Co-authored-by: Nick von Pentz <[email protected]>
Co-authored-by: Pete Miller <[email protected]>
Co-authored-by: Aleksey Khoroshilov <[email protected]>
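The 25% split described above is typically expressed as experiment probability weights in a variations (Griffin) study. A minimal sketch follows, assuming the Chromium variations seed format (`probability_weight`, `parameters`, `filter`); the study name, experiment names, and the `default_model` parameter name are illustrative assumptions, not the actual study definition in this change:

```json
{
  "name": "BraveAIChatDefaultModelStudy",
  "experiments": [
    {
      "name": "LlamaDefault",
      "probability_weight": 25,
      "parameters": [
        { "name": "default_model", "value": "chat-basic" }
      ]
    },
    {
      "name": "Default",
      "probability_weight": 75
    }
  ],
  "filter": {
    "min_version": "122.0.6261.57"
  }
}
```

Weights are relative within the study, so 25 vs. 75 assigns `chat-basic` to a quarter of eligible clients; subsequent rollout steps would only bump the first weight (50/50, 75/25, 100/0). The `min_version` filter matches the Chromium version noted above, and no platform filter is shown since the change applies to all platforms.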