contents of the "Statistics" tab #67
Comments
Some basic statistics about the data (e.g. number of formats, domains, etc.) have been added in 1eb2c3f
On the one hand, we can list some internal or technical statistics that basically reflect the work performed on extending the SIS: the number of format descriptions, the number of recorded file extensions or media types (not very useful, these two, but they do signal the data content of the SIS). The number of recommendations and recommendations per domain could also, arguably, fall in this category. Also the number of But then we perform a kind of epistemic step and assume that we've built a model of an aspect of CLARIN, and look for some statistics describing CLARIN. This is where we can count "the most popular formats per domain", and where we can compute the relevant KPI for centres where
This is just to note that, as mentioned in the ticket referenced right above this comment, we may want to consider splitting this page into something like "SIS statistics" and "Data visualization". The latter is meant to provide RI-wide (and maybe cross-RI?) visualizations.
We're moving this issue to the next milestone.
Eliza has prepared a list-statistics.xq page where a beginning can be seen, ordered by domains and then by the numbers of recommendations in those domains.
A lot from this ticket got implemented, though not everything (for one thing, this is a discussion ticket, so it should actually be a hub for action-oriented children, but there are only 24 hours in a... working day...).
While Pooh keeps pondering, this ticket sneaks out of 2.4.0 and onwards, toward a brighter future.
#180 is a related issue.
Assigning it to myself for careful re-reading, to see if it can be closed.
Ah, but remember that the issue is linked from https://clarin.ids-mannheim.de/standards/views/list-popular-formats.xq so when it's closed, the link should either go or be replaced by a link to something more permanent.
Gathering some potentially interesting points from above:
I am going to close this issue, since we need some sort of closure on the goals that have been achieved :-) I'll take the remaining ideas to another ticket. There is now a new issue dedicated to gathering more ideas on what it can mean for a format to be popular (#201), and it is linked from the corresponding subpage.
Ah, this ticket is closed, but it is directly referenced from the top of https://standards.clarin.eu/sis/views/list-statistics.xq That is not optimal...
It's time to see what kind of statistics we could derive from the aggregated information, and I suggest that we get a proof-of-concept tab going asap (much like the mimeType tab).
The question is, what kind of statistics can we get going now, and what can we work towards?
Our 2019 goal was to see what formats are popular among the centres, relative to the function(s) they are expected to play. That is something that we can get straightforwardly:
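For illustration only, here is a minimal Python sketch of how such a popularity count could work. It is not taken from the SIS code. It assumes each recommendation can be flattened into a hypothetical (centre, format, function) record; the record layout, the sample values, and the name `popular_formats_by_function` are all made up for this example.

```python
from collections import defaultdict

# Hypothetical sample records, invented for illustration: (centre, format, function).
RECOMMENDATIONS = [
    ("centre-1", "format-a", "function-1"),
    ("centre-2", "format-a", "function-1"),
    ("centre-2", "format-a", "function-1"),  # duplicate entry, counted only once
    ("centre-3", "format-b", "function-1"),
    ("centre-1", "format-b", "function-2"),
    ("centre-3", "format-c", "function-2"),
]


def popular_formats_by_function(records):
    """Rank formats per function by the number of distinct centres recommending them."""
    # Collect the set of centres behind each (function, format) pair,
    # so that repeated records from the same centre do not inflate the count.
    centres = defaultdict(set)
    for centre, fmt, function in records:
        centres[(function, fmt)].add(centre)

    # Group the counts by function.
    ranking = defaultdict(list)
    for (function, fmt), who in centres.items():
        ranking[function].append((fmt, len(who)))

    # Most popular first; ties are broken alphabetically by format name.
    for function in ranking:
        ranking[function].sort(key=lambda pair: (-pair[1], pair[0]))
    return dict(ranking)


if __name__ == "__main__":
    for function, formats in sorted(popular_formats_by_function(RECOMMENDATIONS).items()):
        print(function, formats)
```

Counting distinct centres rather than raw records is a design choice of this sketch: it keeps a centre that lists the same format several times from skewing the ranking.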
Once we get the formal format family tree (or graph) going, we could also:
I guess that, with the information we already have (or can have, easily), we can also
What else? Please share your ideas.