Wikipedia Gender Index (WIGI) uses Wikidata to produce gender-related statistics on Wikipedia biographies.
##The Data
- The raw data file (CSV), with one row per human in Wikidata, including their place and date of birth and death, ethnicity, and citizenship (where these exist).
- Re-indexed data files, one per property, indexed by sex (e.g. date of birth by sex).
- Helper files to aggregate and map place of birth, ethnicity, and citizenship into "world cultures".
- Munge and plot the initial file and make the re-indexes
- Look at world cultures by date of birth and gender over time, and how to aggregate the cultures
- Chi-squared testing of gender versus culture, with plots of the results
- How to build the data for, and test, the celebrity hypothesis
- Investigation into the Germanic Nationality Classification Shift
- Aggregating sitelinks into a language-culture female percentage scatter plot
- Modelling female percentage of biographies for prediction
- Scraping out the Mechanical Turk disagreements for hand coding
- How to make the sitelinks scatter plots
- Comparing WIGI to the World Economic Forum
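
As a rough illustration of the gender-versus-culture steps listed above, here is a minimal standard-library sketch, not the project's own code. It counts rows per culture and gender, derives a female percentage per culture, and computes a Pearson chi-squared statistic. The column names (`gender`, `culture`), the function names, and the sample rows are all assumptions made up for this example; the real files may be laid out differently.

```python
import csv
import io
from collections import Counter

# Toy rows standing in for the raw CSV. The column names and values are
# invented for illustration and are not taken from the WIGI data files.
SAMPLE_CSV = """qid,gender,culture
Q1,female,A
Q2,male,A
Q3,male,A
Q4,male,A
Q5,female,B
Q6,female,B
Q7,male,B
Q8,male,B
"""


def contingency(rows, row_key="culture", col_key="gender"):
    """Count rows for every (culture, gender) pair, skipping missing values."""
    counts = Counter()
    for row in rows:
        if row.get(row_key) and row.get(col_key):
            counts[(row[row_key], row[col_key])] += 1
    return counts


def female_percentage(counts):
    """Percentage of rows recorded as female within each culture."""
    totals = Counter()
    for (culture, _gender), n in counts.items():
        totals[culture] += n
    return {c: 100.0 * counts[(c, "female")] / totals[c] for c in totals}


def chi_squared(counts):
    """Pearson chi-squared statistic and degrees of freedom for the table."""
    row_totals, col_totals = Counter(), Counter()
    for (r, c), n in counts.items():
        row_totals[r] += n
        col_totals[c] += n
    total = sum(row_totals.values())
    stat = 0.0
    for r in row_totals:
        for c in col_totals:
            # Expected count if gender and culture were independent.
            expected = row_totals[r] * col_totals[c] / total
            stat += (counts[(r, c)] - expected) ** 2 / expected
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, dof


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
table = contingency(rows)
percentages = female_percentage(table)
statistic, dof = chi_squared(table)
```

The sketch stops at the statistic and degrees of freedom. Turning those into a p-value needs the chi-squared distribution, which a statistics library would normally supply.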
##The Writings
- The paper so far, on Google Docs. Please comment.
- In-progress discussion on Meta.