Incorporate Social Indicators Feedback in Original Paper #20
Comments
Me and Max already have our replies to the reviewer, but before we share The paper is at
Piotr Konieczny, PhD On 5/14/2015 13:55, Max Klein wrote:
|
@piokon @notconfusing So as not to influence other project members, I have sent my feedback by email. |
@piokon @notconfusing |
I think at this point we can share your analysis with the group, so Piotr Konieczny, PhD On 5/15/2015 05:34, Harsh Gupta wrote:
|
I don't believe I ever got them, would you mind sending them here? (Perhaps I am confused about something, but this is a listserv, right?) Piotr Konieczny, PhD On 5/15/2015 19:09, Mohammed Sadat Abdulai wrote:
|
Inclusion of a section header 6.3 that clearly spells out the scope of the study:

Scope of the study

The unit of measurement is the ratio of female Wikipedia biographies to total Wikipedia biographies, set against the background of these female personalities, that is, their place of birth and time of birth. When a biography does not contain a place of birth, citizenship will be used in its stead if it is available. In addition, time of birth in a broad sense refers to a specific timeframe within which a person was born, not necessarily the exact date. ...still expanding!

Though it has already been mentioned briefly, I think the systemic biases of Wikipedia need to be made clear to the reader, in particular the ways they can influence “biography” articles on Wikipedia. Also to be included in the scope of the study:

Wikipedia suffers from general systemic biases, and these directly influence which biographies editors choose to create or improve, as well as the quality of those biography articles. A 2005 University of Würzburg Wikipedia user survey [1] found that “the common characteristics of ‘average Wikipedians’ inevitably color the content of Wikipedia”. The study defined the average Wikipedian on the English Wikipedia for example as

Access to the internet, which is a major requirement for editing Wikipedia and creating or improving biography articles, tends to be at the disposal of people in developed nations. Nelson, Anne ("Wikipedia Taps College ‘Ambassadors’ to Broaden Editor Base") notes that 80% of Wikipedia page views and 83% of global edits come from the Global North. Most countries in the Southern Hemisphere have disproportionately less access to information technology, which easily translates into a technical inability to contribute to Wikipedia.[2][3][4][5] Indo-European languages, most notably those of Anglophone countries, dominate Wikipedia contributions.
The majority of the world's population lives in the Northern Hemisphere, which is mostly Anglophone. No wonder that, of the over 35 million articles across the different language editions of Wikipedia, nearly 45% belong to only 8 Indo-European languages (English, Swedish, Dutch, German, French, Russian, Italian, Spanish), with the English Wikipedia, which also happens to be the largest, making up 13.9%.[6]

Among the biases on Wikipedia are the unavailability of sources in some languages and the high cost of accessing quality sources, from journals for example. Because reliable sources are required by Wikipedia policy, topics are limited in their content by the sources available to editors. This is a particularly acute problem for biographies of living persons. Sources published in a medium that is both widely available and familiar to editors, such as a news website, are more likely to be used than those from esoteric or foreign-language publications, regardless of their reliability.[7] ...There is a long list of other biases I have gathered!

Because of the tendentious nature of Wikipedia contributions and the disproportionate distribution of Wikipedia articles, the study expects biography articles on Wikipedia to be influenced by the above-mentioned circumstances of Wikipedia editors, with these effects spilling over onto the units of measurement used for the analysis.

Furthermore, the Wikidata project is in its initial development stages; as of August 2013, the database was 106.6 gigabytes in size, with seventeen (17) million statements created.[8] The data imported for the analysis is therefore not representative of all biographies presently contained in all Wikipedias. In addition, only 28.34 percent of biography articles contained the item “country” as of October 2014. This leaves any cross-country inferences based on the data inconclusive.
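The unit of measurement described in the proposed scope section (female share of biographies by country, with citizenship as a fallback for a missing place of birth) could be sketched roughly as follows. This is only an illustration: the field names `gender`, `birth_country`, and `citizenship`, and the sample records, are hypothetical and are not taken from the project's actual Wikidata import.

```python
from collections import defaultdict


def female_ratio_by_country(biographies):
    """Return ({country: female share}, share of records with a usable country)."""
    totals = defaultdict(int)
    females = defaultdict(int)
    located = 0
    for bio in biographies:
        # Prefer place of birth; fall back to citizenship when it is missing.
        country = bio.get("birth_country") or bio.get("citizenship")
        if country is None:
            continue  # no usable location, so the record is left out of the ratio
        located += 1
        totals[country] += 1
        if bio.get("gender") == "female":
            females[country] += 1
    ratios = {c: females[c] / totals[c] for c in totals}
    coverage = located / len(biographies) if biographies else 0.0
    return ratios, coverage


# Hypothetical records, for illustration only.
sample = [
    {"gender": "female", "birth_country": "Ghana", "citizenship": None},
    {"gender": "male", "birth_country": None, "citizenship": "Ghana"},
    {"gender": "female", "birth_country": None, "citizenship": None},
    {"gender": "male", "birth_country": "Poland", "citizenship": "Poland"},
]
ratios, coverage = female_ratio_by_country(sample)
```

Returning the coverage alongside the ratios keeps the caveat visible: when only a small share of biographies carry a country, the per-country ratios describe that subset rather than all biographies.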
In response to: Instead of arguing for "an academic index...the authors should use more adequate and moderate arguments such as "a set of indicators that allow the measurement and monitoring the representative inequality of gender in biographies across countries and throughout time".

Since we are producing not only a paper but also an automated statistical presentation of gender in articles by certain categories, and in heeding the reviewer's advice to use moderate language, I think it is crucial that we state the main aim of the paper as “to develop a set of indicators to be used as a tool for measuring and monitoring the representative inequality of gender in biographies across countries and throughout time”.

In response to: In relation to the celebrity hypothesis, we can produce another heatmap that excludes celebrities altogether (per the tested celebrity terms) from the data to see whether the absence of celebrities has an effect on the number of female biographies. If the heated areas are reduced, that corroborates the celebrity hypothesis to some extent, though not yet conclusively.

Moving forward, I believe the key terms used for the celebrity tests are direct translations of the same terms in English. These professions, though common among Western Europeans, may either not be easily associated with women in other demographics, where their roles and types of jobs differ, or be perceived significantly differently. For example, a “model” in the Islamic/African/Confucian cluster may not hold the same weight or celebrity status as a model in Western Europe. I can investigate further what defines a celebrity in the different cultural clusters, and who fits the description best in each, so we do not have to use the same generic English terms.

Furthermore, the notion that obtaining a huge positive percentage in Fig. 10
(Difference in female ratio by language-unique and language-many articles by language of Wikipedia) indicates a “focus to write more female-oriented local hero articles” can be viewed from other perspectives, and I think those should also be mentioned or explored in the study. The thrust of European political power, commerce, and culture, though present, is less felt in Confucian societies than in Islamic and African societies, where its presence is immense. In societies where the exchange (or rather the handing down) of culture is more prominent, we would expect that “local heroes” may not be so local after all. We can find out whether biographies from non-European languages that end up having interwiki links are mostly translated into English or other European languages. If that is the case, it might explain why the Confucian cluster dominates the chart with language-unique biographies, weakening the strong assertion that this is the result of a “focus to write more female-oriented local hero articles”.

Also, given the period of birth or death and characteristics of biography articles such as occupation or cultural cluster, it would be interesting to perform multiclass LR analysis on the data to see if we can unearth a trend in the ratio of mean article size by gender for the top 25 Wikipedias by language. I’m referring to Fig. 11.

Lastly, it would be interesting to replicate all the computations and graphs on a different but similar set of clusters to see if there are variations or observable differences in the results. The GLOBE cultural clusters, based on the effects of globalization, are a compelling candidate. Its clusters are Anglo, Confucian Asia, Eastern Europe, Germanic Europe, Latin America, Latin Europe, Middle East, Nordic Europe, Southern Asia, and Sub-Saharan Africa.[9]

In response to: Indexes are meant to summarize and rank specific observations.

Figure 6 is a subset of Fig. 7, since the latter adds a third dimension of culture to the existing gender ratios against time.
We can further investigate how gender ratios behave against time, culture, and occupation. The next set of graphs can be combined to produce a plot of the percentage magnitude of the difference between language-unique and language-many female biographies by culture against mean article size. If we assign the same weight to both indicators, it is possible to formulate an index that encapsulates all the variables by looking at how they relate to each other.

References 1 2 |
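One possible reading of the equal-weight index suggested in the comment above is sketched here. The min-max rescaling step is an assumption added so that the two indicators share a common scale before they are averaged; it is not something the thread specifies. The cluster names and values are placeholders, not results from the paper.

```python
def min_max(values):
    """Rescale a {key: number} mapping to the range [0, 1]."""
    lo, hi = min(values.values()), max(values.values())
    if hi == lo:
        return {k: 0.0 for k in values}  # no spread, so every key gets 0
    return {k: (v - lo) / (hi - lo) for k, v in values.items()}


def equal_weight_index(indicator_a, indicator_b):
    """Average two rescaled indicators with weight 0.5 each, over shared keys."""
    a, b = min_max(indicator_a), min_max(indicator_b)
    shared = a.keys() & b.keys()
    return {k: 0.5 * a[k] + 0.5 * b[k] for k in shared}


# Placeholder inputs (hypothetical): percentage difference between
# language-unique and language-many female ratios, and the ratio of
# mean article size by gender, each keyed by cultural cluster.
unique_vs_many_diff = {"cluster_a": 0.0, "cluster_b": 10.0, "cluster_c": 5.0}
article_size_ratio = {"cluster_a": 0.5, "cluster_b": 1.0, "cluster_c": 1.0}

index = equal_weight_index(unique_vs_many_diff, article_size_ratio)
```

Equal weights are the simplest choice and match the suggestion in the comment; any other weighting would need its own justification in the paper.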
Mohammed, that's very interesting. I think the scope of the study should Regarding your analysis of the Confucian hypothesis, I think it may be Piotr Konieczny, PhD On 5/21/2015 20:40, Mohammed Sadat Abdulai wrote:
|
@piokon I am interested in exploring the Celebrity Hypothesis even further, and I'd also prefer that we deal with it rigorously in another paper. In the meantime, it may not be necessary to remove it entirely from the present paper; we can, however, trim it down to a brief mention. @notconfusing Will time and resources allow us to do further tests? |
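As a rough idea of what one such further test might involve, the exclusion check proposed earlier in this thread (recompute the female ratio with celebrities removed and compare) could look something like the sketch below. The term list, the `occupations` field, and the sample records are hypothetical; the actual celebrity terms used in the paper are not reproduced here.

```python
# Hypothetical celebrity terms, for illustration only.
CELEBRITY_TERMS = {"actor", "model", "singer"}


def female_ratio(bios):
    """Share of biographies marked female, or None for an empty list."""
    if not bios:
        return None
    return sum(1 for b in bios if b["gender"] == "female") / len(bios)


def ratio_with_and_without_celebrities(bios, terms=CELEBRITY_TERMS):
    """Return (ratio over all bios, ratio after dropping celebrity bios)."""
    non_celebrities = [b for b in bios if not (set(b["occupations"]) & terms)]
    return female_ratio(bios), female_ratio(non_celebrities)


# Hypothetical records, for illustration only.
sample = [
    {"gender": "female", "occupations": ["actor"]},
    {"gender": "female", "occupations": ["model", "writer"]},
    {"gender": "female", "occupations": ["scientist"]},
    {"gender": "male", "occupations": ["politician"]},
    {"gender": "male", "occupations": ["engineer"]},
]
with_all, without_celebrities = ratio_with_and_without_celebrities(sample)
```

A drop in the ratio after exclusion would be consistent with the celebrity hypothesis without proving it, and the result depends heavily on how well the term list fits each cultural cluster.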
For all to read, but particularly those working on the future end paper. @Masssly and @hargup, These were the critiques of our original paper in our Revise and Resubmit stage: