The average absolute L1 bias of the L1 3072 master net is 186531 / 3072 = 60.72
In contrast, the average absolute L1 bias of the L1 128 small net is 272474 / 128 = 2128.7
What appears to have happened is that as the L1 size has increased, the average absolute bias of each neuron has decreased to the point where we are now encountering quantization losses in the L1 biases.
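To make the comparison concrete, here is a minimal sketch of the arithmetic above. It reproduces the two averages and shows how large a single rounding step is relative to the average bias magnitude; the 0.5 rounding-step assumption and the variable names are illustrative, not taken from the actual quantization code.

```python
# Average absolute quantized L1 bias for the two nets, and the relative
# size of a worst-case 0.5 rounding error at that magnitude.
nets = {
    "L1 3072 (master)": (186531, 3072),   # sum of |bias|, L1 size
    "L1 128 (small)":   (272474, 128),
}

for name, (sum_abs_bias, l1_size) in nets.items():
    avg_abs = sum_abs_bias / l1_size
    # Round-to-nearest quantization loses at most 0.5 per bias, so the
    # relative loss grows as the average bias magnitude shrinks.
    rel_rounding_error = 0.5 / avg_abs
    print(f"{name}: avg |bias| = {avg_abs:.2f}, "
          f"relative rounding error ~ {rel_rounding_error:.2%}")
```

With these numbers the larger net's average bias magnitude (~60.7) is roughly 35x smaller than the small net's (~2128.7), so the same integer rounding step costs proportionally more precision.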
EDIT: It seems that applying an adjustment of 1 to each bias at random (instead of to all of them) results in 0 Elo. This situation is closer to actual quantization; I will verify the quantization loss properly once I receive the model in a few days.