Quantization losses in the L1 biases of L1 3072 #279

Open
Viren6 opened this issue Apr 7, 2024 · 0 comments
Viren6 commented Apr 7, 2024

The average absolute L1 bias of the L1 3072 master net is 186531 / 3072 = 60.72

In contrast, the average absolute L1 bias of the L1 128 small net is 272474 / 128 = 2128.7

What appears to have happened is that, as the L1 size has increased, the average absolute bias of each neuron has decreased to the point where we are now encountering quantization losses in the L1 biases.
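A minimal sketch of the comparison above, not actual nnue-pytorch code: rounding a float bias to the nearest integer perturbs it by at most 0.5, so the relative size of that perturbation grows as the average absolute quantized bias shrinks. The helper names are placeholders; only the sums and sizes (186531 / 3072 and 272474 / 128) come from this issue.

```python
import numpy as np

def avg_abs_bias(quantized_biases: np.ndarray) -> float:
    """Average absolute value of the quantized (integer) L1 biases."""
    return float(np.abs(quantized_biases.astype(np.int64)).mean())

def relative_rounding_error(avg_bias: float) -> float:
    """Worst-case rounding error (0.5) relative to the average bias magnitude."""
    return 0.5 / avg_bias

# Using the sums reported above instead of loading the actual nets:
for name, abs_sum, l1 in [("L1 3072 master", 186531, 3072),
                          ("L1 128 small", 272474, 128)]:
    avg = abs_sum / l1
    print(f"{name}: avg |bias| = {avg:.2f}, "
          f"worst-case rounding error ~ {relative_rounding_error(avg):.2%}")
```

With these numbers, the worst-case rounding error is roughly 0.8% of the average bias for the L1 3072 net versus roughly 0.02% for the L1 128 net, which is the gap the issue is pointing at.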

EDIT: It seems that applying an adjustment of 1 to the biases at random (instead of to all of them) gives a 0 Elo result. This situation is closer to actual quantization; I will verify the quantization loss properly once I receive the model in a few days.
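A minimal sketch of that perturbation test, under the assumption that "an adjustment of 1 at random" means adding +1 to a random subset of the quantized biases rather than to all of them; the function name, the subset fraction, and the bias array are placeholders for whatever tooling was actually used.

```python
import numpy as np

def perturb_random_biases(quantized_biases: np.ndarray,
                          fraction: float = 0.5,
                          seed: int = 0) -> np.ndarray:
    """Return a copy of the biases with +1 added to a random subset of them,
    approximating the per-bias noise that integer rounding introduces."""
    rng = np.random.default_rng(seed)
    mask = rng.random(quantized_biases.shape) < fraction
    return quantized_biases + mask.astype(quantized_biases.dtype)
```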
