Set Leo default model to Llama 3 8b (25% rollout) #1225
Conversation
✅ Test Seed Generated Successfully

To apply the test seed:
Seed Details
seed/seed.json (outdated):

    "BETA",
    "NIGHTLY"
  ],
  "min_version": "122.0.6261.57",
Looks like in this file the versioning is [chrome_major].[brave_version], e.g. 122.1.63.0?
I updated to use 122.1.63.161 as the min version of the default model change; 1.63.161 seems to be the first 1.63 release. This also adds a little buffer, because brave/brave-browser#34721 was uplifted into 1.62.x.

I specified 122.1.63.160 as the new max version for the previous setting. This may not correspond to an actual release; it's just one patch version behind 122.1.63.161.
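To sanity-check that the old setting's max version and the new setting's min version form adjacent, non-overlapping ranges, here is a quick sketch. It is not part of the PR; the tuple-comparison logic is an assumption about how dotted version strings are ordered, not the actual seed-evaluation code:

```python
def parse_version(v: str) -> tuple[int, ...]:
    """Turn a dotted version string into a tuple of ints for comparison."""
    return tuple(int(part) for part in v.split("."))

# Previous setting's new max_version and the default-model change's min_version.
old_max = parse_version("122.1.63.160")
new_min = parse_version("122.1.63.161")

# Adjacent ranges: any version <= old_max matches the old setting,
# any version >= new_min matches the new one, and no version matches both.
assert old_max < new_min
```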
I tested running with
@petemill will grab this tomorrow, verify, and push it into
Use [chrome_major].[brave_version] format, not just the chromium version.
Force-pushed from 9d410b5 to 646922f
Uplift into production approved after deliberating with @brave/uplift-approvers.
Follows on #1225

Co-authored-by: Nick von Pentz <[email protected]>
Related to brave/brave-core#21398
Blocked on deploying https://github.com/brave/aichat-ops/pull/359 to prod
This sets the default model for 25% of free users to chat-basic, which corresponds to Llama 3 8b. The intention is to roll this change out progressively (25%, 50%, 75%, 100%) so we can ensure our single instance of Llama 3 can handle the increase in traffic. Based on these estimations, one instance of Llama 3 (one GPU) should have higher token throughput than our single instance of Mixtral (four GPUs), so we expect this to work with the two Llama instances that will be added in https://github.com/brave/aichat-ops/pull/359.
cc @petemill @LorenzoMinto
Note: