Measuring Gender Inequalities in Wikipedia
Claudia Wagner
Computational Social Science @
GESIS – Leibniz Institute for the Social Sciences, Germany
Web-Science @
University of Koblenz-Landau, Germany
Who edits Wikipedia?
(i) How are notable men and women presented in Wikipedia?
(ii) How are professions described on Wikipedia?
Notable Men/Women
4k individuals (3% women)
11k individuals (13% women)
110k individuals (11% women)
Are both genders covered equally?
• Hypothesis:
– If Wikipedia functions as a glass ceiling, then the women who are covered will be more notable than the men who are covered: a large gender gap for local heroes, a smaller gap for superstars.
• But how to assess notability of people?
Who makes it into Wikipedia?
Angela Merkel Fritz Kuhn
Global Notability (Internal Proxy)
Google Trends
Angela Merkel
Fritz Kuhn
• Negative Binomial Regression Models
– Outcome Variable:
• Number of language editions (internal notability)
– Independent Variables:
• Gender, profession and birth decade
               coef     IRR    P>|z|    [95.0% Conf. Int.]
female         0.1186   1.13   0.000    0.111    0.126
birth decade  -0.0096   0.99   <0.05   -0.010   -0.009
…              …        …      …        …
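The IRR column can be recovered from the coefficient column, because a negative binomial model with the usual log link reports coefficients on the log scale and the incidence rate ratio is their exponential. A minimal sketch, assuming the coefficient and interval values reported for this model; the helper name `irr` is illustrative and not part of the original analysis:

```python
import math

def irr(coef: float) -> float:
    """Incidence rate ratio for a coefficient of a log-link count model."""
    return math.exp(coef)

# Coefficients as reported on the slide.
female = irr(0.1186)
birth_decade = irr(-0.0096)

# Exponentiating the confidence bounds gives the interval on the IRR scale.
female_ci = (irr(0.111), irr(0.126))

print(round(female, 2), round(birth_decade, 2))  # 1.13 0.99
print(tuple(round(bound, 3) for bound in female_ci))
```

Read this way, being female is associated with roughly 13% more language editions, with profession and birth decade held fixed.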
Local Heroes
• 45% of men and 40% of women are local heroes.
– Born after 1900:
• 5 men for 1 woman, 16.7% women (expected)
• 6 men for 1 woman, 14.3% women (observed)
– Born before 1900:
• 12 men for 1 woman, 7.7% women (expected)
• 13 men for 1 woman, 7.1% women (observed)
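The percentages follow directly from the ratios: with r men for every woman, women make up 1 / (r + 1) of the group. A small sketch of that arithmetic, using the four ratios from the slide; the function name `share_of_women` is an illustrative choice:

```python
def share_of_women(men_per_woman: float) -> float:
    """Percentage of women implied by a ratio of `men_per_woman` men to 1 woman."""
    return 100.0 / (men_per_woman + 1.0)

# (cohort, kind, men per woman) as given on the slide.
ratios = [
    ("born after 1900", "expected", 5),
    ("born after 1900", "observed", 6),
    ("born before 1900", "expected", 12),
    ("born before 1900", "observed", 13),
]

for cohort, kind, ratio in ratios:
    print(f"{cohort}, {kind}: {share_of_women(ratio):.1f}% women")
```

In both cohorts the observed share of women among local heroes is lower than the expected share.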
Interest via Google Search
• On average, women who are depicted in Wikipedia are of interest in more regions (IRR=1.555) and during more months (IRR=1.322) than men
How are they depicted?
[Figure panels: "After 1900" and "Before 1900"]
Linguistic Bias
• Linguistic Intergroup Bias theory:
– We generalize positive aspects of people in our ingroup
– We generalize negative aspects of people in our outgroup
Maass A, Salvi D, Arcuri L, Semin GR (1989) Language use in intergroup contexts: the linguistic intergroup bias. J Pers Soc Psychol 57(6):981-993
Structural Differences
Hyperlink Network
Men are more central
Men are better connected
The k-core is the largest subnetwork in which every node has degree at least k, counting only links within that subnetwork.
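The k-core can be found by repeatedly removing nodes whose degree has dropped below k until none remain. A minimal sketch of that pruning procedure, assuming the hyperlink network is treated as undirected and stored as a symmetric adjacency dictionary; the toy graph and the name `k_core` are illustrative and not taken from the study:

```python
def k_core(adjacency: dict[str, set[str]], k: int) -> set[str]:
    """Return the nodes of the k-core of an undirected graph.

    `adjacency` must be symmetric: if b is in adjacency[a], a is in adjacency[b].
    """
    # Work on a copy so the caller's graph is left untouched.
    graph = {node: set(neighbors) for node, neighbors in adjacency.items()}
    # Start with every node whose degree is already below k.
    queue = [node for node, neighbors in graph.items() if len(neighbors) < k]
    while queue:
        node = queue.pop()
        if node not in graph:
            continue  # already removed via an earlier queue entry
        for other in graph.pop(node):
            graph[other].discard(node)
            if len(graph[other]) < k:
                queue.append(other)
    return set(graph)

# Toy network: a triangle (a, b, c) with a chain c - d - e attached.
toy = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c", "e"},
    "e": {"d"},
}
print(sorted(k_core(toy, 2)))  # ['a', 'b', 'c']
```

Removing e lowers the degree of d below 2, so d is pruned as well; only the triangle survives as the 2-core.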
Summary
• Coverage of notable men and women on Wikipedia is good (if we compare with external lists)
• Women are on average more notable according to internal and external criteria
• Fewer female local heroes than expected
• Topical differences and linguistic bias
• Structural differences
Professions in Wikipedia
• List of ~4200 German profession names
– Male, female and neutral name for the same profession
– e.g. Feuerwehrmann (fireman), Feuerwehrfrau (firewoman), Feuerwehrpersonal (fire service staff), Feuerwehrfachkraft (firefighting specialist), Feuerwehrmann/frau (fireman/-woman)
• Mapping of profession names to Wikipedia
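One possible way to carry out such a mapping is to look up each profession name through the MediaWiki query API with redirect resolution turned on and sort the answer into "Page", "Redirect", or "No Page". The sketch below only classifies an already decoded JSON response. It assumes the default JSON layout of `action=query` with the `redirects` parameter. The sample responses are hand-written for illustration and were not retrieved from Wikipedia, and nothing here is confirmed to be the procedure used in the study:

```python
def classify(response: dict) -> str:
    """Classify a decoded MediaWiki `action=query&redirects` response for one title."""
    query = response.get("query", {})
    pages = query.get("pages", {})
    # In the default JSON layout a nonexistent title carries a "missing" key.
    if not pages or all("missing" in page for page in pages.values()):
        return "No Page"
    # Resolved redirects are listed separately from the target page.
    if query.get("redirects"):
        return "Redirect"
    return "Page"

# Hand-written sample responses (hypothetical, for illustration only).
sample_page = {
    "query": {"pages": {"1": {"pageid": 1, "ns": 0, "title": "Journalist"}}}
}
sample_redirect = {
    "query": {
        "redirects": [{"from": "Journalistin", "to": "Journalist"}],
        "pages": {"1": {"pageid": 1, "ns": 0, "title": "Journalist"}},
    }
}
sample_missing = {
    "query": {"pages": {"-1": {"ns": 0, "title": "Beispielberuf", "missing": ""}}}
}

for sample in (sample_page, sample_redirect, sample_missing):
    print(classify(sample))
```

Counting these three labels separately for the masculine, feminine, and neutral names would give a coverage breakdown by name form.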
Coverage
[Figure: share (0% to 100%) of masculine, feminine, and neutral profession names with a page, no page, or a redirect]
https://de.wikipedia.org/wiki/Journalist
Images
Relation to Offline Statistics
Text
Male Bias
Female Bias
Relation to Offline Statistics
Conclusions
• Gender-neutral profession descriptions rarely exist on German Wikipedia
• Even articles on professions that are dominated by women today refer mainly to men
• Gender differences in the description of notable men and women
• Some inequalities simply reflect historic differences, others do not
– How to decide what is appropriate?
• Guidelines and automatic tools are needed to support editors
Joint work with
Markus Strohmaier
Fabian Flöck
Olga Zagovora
David Garcia
Mohsen Jadidi
Eduardo Graells Garrido
Fil Menczer
Questions?