stats.kiwix.org lacks granularity AND is too granular #216
Comments
That's because those are two different files. Those files are different versions of the same Book (CMS terminology). Only a custom tool could know that they are linked and should produce an aggregated counter. I don't think messing with source logs is a good idea, so you're probably left with creating or modifying a tool that works off the matomo API/data and produces this. It might be a matomo extension or something separate.
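As a rough illustration of what such a tool would do, here is a minimal Python sketch. It assumes the per-file download counts have already been exported from matomo as `(filename, count)` pairs, and that versions of a Book differ only by a trailing `_YYYY-MM.zim` suffix, as in `wikipedia_en_all_maxi_2022-05.zim`. The function names and the counts in the sample data are made up for illustration; nothing here is part of matomo's API.

```python
import re
from collections import defaultdict

# Assumed naming convention: <book>_<YYYY-MM>.zim
VERSION_SUFFIX = re.compile(r"_\d{4}-\d{2}\.zim$")


def book_name(filename):
    """Strip the trailing _YYYY-MM.zim suffix.

    Filenames that do not match the convention are returned unchanged.
    """
    return VERSION_SUFFIX.sub("", filename)


def aggregate_downloads(rows):
    """Sum download counts over all versions of the same Book.

    rows: iterable of (filename, count) pairs, e.g. exported from matomo.
    """
    totals = defaultdict(int)
    for filename, count in rows:
        totals[book_name(filename)] += count
    return dict(totals)


# Hypothetical sample data; the counts are invented for the example.
rows = [
    ("wikipedia_en_all_maxi_2022-05.zim", 120),
    ("wikipedia_en_all_maxi_2023-05.zim", 80),
    ("wikipedia_en_for_schools_maxi_2023-06.zim", 5),
]
totals = aggregate_downloads(rows)
```

With this grouping, both `wikipedia_en_all_maxi` files land under one key. Restricting the total to a date range would have to happen when the rows are exported, since the filename only carries the version date, not the download date.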
To clarify, I'm trying to get the number of downloads for wikipedia_en_for_schools_maxi.zim (and the Arabic version) over the past two years (1 August 2021 to 31 July 2023). Since the Zimfarm generates a new ZIM every month, I also need the total across all of those versions.
If you put "wikipedia_en_for_schools_maxi" as the filter, you should get your number. I see no result at all, so either nobody has downloaded it in the last 12 months, or we have a bug, or I don't understand how it works. Anyway, I have just downloaded it, so within an hour, worst case, there should be at least one download.
I've just checked and it did not pick it up as far as I can tell.
@rgaudin OK, then it looks like a bug, either in the log gathering part or in matomo.
I am currently trying to find the record for this hit in the DB, if that's possible. It will then be easier to know what to look at next.
I found the hit in matomo's DB, so we can rule out a download log capture/upload issue. Here's how I found it:
SELECT * FROM piwik_log_visit WHERE idsite=2 AND location_country="ch" AND visit_first_action_time >= "2023-08-11 21:00:00" AND visit_last_action_time <= "2023-08-11 23:00:00";
There were several records. I identified @Kelson with the location, time and OS, which gave me:
SELECT * FROM piwik_log_link_visit_action WHERE idvisit=21264276;
There were a few results. I checked the URLs from:
SELECT * FROM piwik_log_action WHERE idaction=9162997;
So the hit was recorded by matomo. Out of curiosity (expensive query!):
SELECT COUNT(*) FROM piwik_log_link_visit_action WHERE idaction_url=9162997;
Not all rows in that table are individual downloads. There are many columns with non-obvious names, and there's this action concept that is mapped to other tables (and some of them reference one another). But there are records for that ZIM. My opinion is that matomo is a complex tool and we (well, you 😀) don't know exactly how to use it. I'd suggest you describe your use case on a matomo forum or to their support so we know exactly how to get the information you're looking for.
Looking at stats for download.kiwix.org I can kind of surmise that around 12,000,000 zim files were downloaded over the past year.
The tool, however, both fails to aggregate different versions of the same file (e.g. wikipedia_en_all_maxi_2022-05.zim and wikipedia_en_all_maxi_2023-05.zim) and does not show more than the top 500 rows.
We need either a better tool or to make sure this one provides actionable feedback.