[bug] Companies table and developers table have different data for the same period #51
Comments
This could be due to HLL (HyperLogLog). I can change the metrics (for Istio only) to use exact distinct counts instead of the approximate counts that HLL gives, but this will require creating custom SQL just for Istio. It can be done in a day or two, but not earlier than a week or two from now.
Can you recheck and LMK if this is still needed? I've optimised some metrics recently and they no longer use HLL, so this might be OK already. If not, LMK and I'll iterate on this when I can.
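To illustrate why an HLL-based metric can disagree with an exact one, here is a minimal, self-contained HyperLogLog sketch in Python. It is not the DevStats implementation (which runs in SQL); the function name `hll_estimate` and the precision parameter `p` are assumptions made for this example only.

```python
import hashlib
import math


def hll_estimate(items, p=10):
    """Approximate the number of distinct items with a basic HyperLogLog.

    Illustrative only: p (register index bits) is an assumed parameter.
    """
    m = 1 << p                      # number of registers
    registers = [0] * m
    for item in items:
        digest = hashlib.sha1(str(item).encode()).digest()
        h = int.from_bytes(digest[:8], "big")       # 64-bit hash
        idx = h >> (64 - p)                          # first p bits pick a register
        rest = h & ((1 << (64 - p)) - 1)             # remaining 64-p bits
        # Rank = position of the leftmost 1-bit in the remaining bits.
        rank = (64 - p) - rest.bit_length() + 1
        registers[idx] = max(registers[idx], rank)

    alpha = 0.7213 / (1 + 1.079 / m)                 # bias constant, valid for m >= 128
    raw = alpha * m * m / sum(2.0 ** -r for r in registers)
    zeros = registers.count(0)
    if raw <= 2.5 * m and zeros:
        # Small-range correction (linear counting).
        return m * math.log(m / zeros)
    return raw


if __name__ == "__main__":
    ids = range(50_000)
    print("exact:", len(set(ids)))
    print("approx:", round(hll_estimate(ids)))
```

With 1024 registers the expected relative error is roughly 3%, so two views of the same period can differ by a few percent if only one of them is computed through HLL.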
Companies table:
Sum of developers table:
Closer, and currently in the same order/ballpark. (Edit: initial miscalculation around Google was my error.)
I will TAL on Friday or Monday.
No rush, Istio doesn't need this until January!
Hmm, the first link is giving the sum of all contributions (this one), while another is giving values per developer and you are summing them manually, right? I'll check whether both use HLL or both don't use it. Actually, I will also update to use exact counts in the case of Istio, because HLL was used to save cycles, but it makes more sense in
One was using HLL while the other was not. I will sync them now and regenerate the data, then I'll let you know when finished. Also, please note that all statistics across DevStats are not calculated "on the fly" but synced at a given point in time and saved in tables (so the Grafana UI later does just a simple select from those "calculated" tables). If the calculation for "last year" happened at a different time for two metrics, they can be slightly out of sync, but the difference shouldn't be high. After the manual sync that I'll do now, they should be as close to each other as possible.
I've regenerated the data. I don't have a script to sum all developers to check those values, so PTAL again please. Hope this is OK now.
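For anyone who wants to do the per-company summation without a spreadsheet, a rough sketch follows. It assumes the developer view has been exported to CSV with one row per developer; the column names `company` and `contributions` are hypothetical and will need adjusting to the actual export.

```python
import csv
from collections import defaultdict


def sum_by_company(csv_file, company_col="company", value_col="contributions"):
    """Sum per-developer values into per-company totals, largest first.

    Column names are assumptions; change them to match the real export.
    """
    totals = defaultdict(int)
    for row in csv.DictReader(csv_file):
        totals[row[company_col].strip()] += int(float(row[value_col]))
    # Sort descending by total so the result lines up with a "top N" table.
    return dict(sorted(totals.items(), key=lambda kv: kv[1], reverse=True))


if __name__ == "__main__":
    import sys

    # Usage: python sum_by_company.py export.csv
    with open(sys.argv[1], newline="") as f:
        for company, total in sum_by_company(f).items():
            print(f"{company}\t{total}")
```

The totals can then be compared line by line with the companies table for the same period.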
@lukaszgryglicki has regenerated our database as of ~15 minutes ago, so this data is as fresh as it comes.
The companies table reports the top 5 contributors to Istio in the last 12 months as:
However, if one exports the data from the Developer activity counts by company view for the same period, the summation is this:
Note how some companies show fewer contributions in the second list, and some have more.
Istio uses this data as part of its governance process, and last week, the order of the top 5 results shown here actually differed depending on which metric you used.
Can you help us understand why these values are different?