Attenuating Bias in Word Vectors
Sunipa Dev, University of Utah
Jeff Phillips, University of Utah
arXiv:1901.07656v1 [cs.CL] 23 Jan 2019

Abstract

Word vector representations are well developed tools for various NLP and machine learning tasks and are known to retain significant semantic and syntactic structure of languages. But they are prone to carrying and amplifying bias, which can perpetuate discrimination in various applications. In this work, we explore new, simple ways to detect the most stereotypically gendered words in an embedding and remove the bias from them. We verify that names are masked carriers of gender bias and then use that as a tool to attenuate bias in embeddings. Further, we extend this property of names to show how names can be used to detect other types of bias in the embeddings, such as bias based on race, ethnicity, and age.

1 BIAS IN WORD VECTORS
Word embeddings are an increasingly popular application of neural networks wherein enormous text corpora are taken as input and the words therein are mapped to vectors in some high-dimensional space. Two commonly used approaches to implement this are WordToVec [15, 16] and GloVe [17]. These word vector representations estimate similarity between words based on the context of their nearby text, or predict the likelihood of seeing words in the context of another. Richer properties were discovered, such as synonym similarity, linear word relationships, and analogies such as man : woman :: king : queen. Their use is now standard in training complex language models.

However, it has been observed that word embeddings are prone to express the bias inherent in the data they are extracted from [3, 4, 7]. Further, Zhao et al. (2017) [18] and Hendricks et al. (2018) [6] show that machine learning algorithms and their output show more bias than the data they are generated from.

Word vector embeddings are used in machine learning for applications which significantly affect people's lives, such as assessing credit [11] and predicting crime [5], and in other emerging domains such as judging loan applications and resumes for jobs or college applications. So it is paramount that efforts are made to identify and, if possible, remove the bias inherent in them, or at least to minimize the propagation of bias within them. For instance, using existing word embeddings, Bolukbasi et al. (2016) [3] demonstrated that women and men are associated with different professions, with men associated with leadership roles and professions like doctor or programmer, and women closer to professions like receptionist or nurse. Caliskan et al. (2017) [7] similarly noted how word embeddings show that women are more closely associated with arts than math, while it is the opposite for men. They also showed how positive and negative connotations are associated with European-American versus African-American names.

Our work simplifies, quantifies, and fine-tunes these approaches: we show that a very simple linear projection of all words based on vectors captured by common names is an effective and general way to significantly reduce bias in word embeddings. More specifically:

1a. We demonstrate that simple linear projection of all word vectors along a bias direction is more effective than the Hard Debiasing of Bolukbasi et al. (2016) [3], which is more complex and also partially relies on crowd sourcing.

1b. We show that these results can be slightly improved by dampening the projection of words which are far from the projection distance.

2. We examine the bias inherent in the standard word pairs used for debiasing based on gender by randomly flipping or swapping these words in the raw text before creating the embeddings. We show that this alone does not eliminate bias in word embeddings, corroborating that simple language modification is not as effective as repairing the word embeddings themselves.
3a. We show that common names with gender association (e.g., john, amy) often provide a more effective gender subspace to debias along than gendered words (e.g., he, she).

3b. We demonstrate that names carry other inherent, and sometimes unfavorable, biases associated with race, nationality, and age, which also correspond to bias subspaces in word embeddings; and that it is effective to use common names to establish these bias directions and remove this bias from word embeddings.

(Thanks to NSF CCF-1350888, ACI-1443046, CNS-1514520, CNS-1564287, IIS-1816149, and NVidia Corporation. Part of the work by JP was done while visiting the Simons Institute for the Theory of Computing.)

2 DATA AND NOTATIONS

We set as default the text corpus of a Wikipedia dump (dumps.wikimedia.org/enwiki/latest/enwiki-latest-pages-articles.xml.bz2) with 4.57 billion tokens, and we extract a GloVe embedding from it in D = 300 dimensions per word. We restrict the word vocabulary to the most frequent 100,000 words. We also modify the text corpus and extract embeddings from it, as described later.

So, for each word in the vocabulary W, we represent the word by a vector w_i in R^D in the embedding. The bias (e.g., gender) subspace is denoted by a set of vectors B. In this work it is typically taken to be a single unit vector, v_B (explained in detail later). As we will revisit, a single vector is typically sufficient and simplifies descriptions. However, these approaches can be generalized to a set of vectors defining a multi-dimensional subspace.

3 HOW TO ATTENUATE BIAS

Given a word embedding, debiasing typically takes as input a set Ę = {E_1, E_2, ..., E_m} of equality sets. An equality set E_j can for instance be a single pair (e.g., {man, woman}), but could be more words (e.g., {latina, latino, latinx}) such that if the bias connotation (e.g., gender) is removed, it would objectively make sense for all of them to be equal. Our data sets will only use word pairs (as a default the ones in Table 1), and we will describe them as such hereafter for simpler descriptions. In particular, we will represent each E_j as a set of two vectors e_j^+, e_j^- in R^D.

Given such a set Ę of equality sets, the bias vector v_B can be formed as follows [3]. For each E_j = {e_j^+, e_j^-} create a difference vector ē_j = e_j^+ - e_j^- between the pair. Stack these to form a matrix Q = [ē_1 ē_2 ... ē_m], and let v_B be the top singular vector of Q. We revisit how to create such a bias direction in Section 4.

Now, given a word vector w in W, we can project it onto its component along this bias direction v_B as

    π_B(w) = <w, v_B> v_B.

Table 1: Gendered Word Pairs
{man, woman}, {son, daughter}, {he, she}, {his, her}, {male, female}, {boy, girl}, {himself, herself}, {guy, gal}, {father, mother}, {john, mary}
3.1 Existing Method: Hard Debiasing

The most notable advance towards debiasing embeddings along the gender direction has been by Bolukbasi et al. (2016) [3] in their algorithm called Hard Debiasing (HD). It takes a set of words desired to be neutralized, {w_1, w_2, ..., w_n} = W_N, a subset of W, a unit bias subspace vector v_B, and a set of equality sets E_1, E_2, ..., E_m.

First, the words {w_1, w_2, ..., w_n} in W_N are projected orthogonally to the bias direction and normalized:

    w_i' = (w_i - w_B) / ||w_i - w_B||,

where w_B = <w_i, v_B> v_B is the component of w_i along the bias direction.

Second, it corrects the locations of the vectors in the equality sets. Let mu_j = (1/|E_j|) sum_{e in E_j} e be the mean of an equality set, and let mu = (1/m) sum_{j=1}^{m} mu_j be the mean of the equality set means. Let nu_j = mu - mu_j be the offset of a particular equality set from the mean. Now each e in each equality set E_j is first centered using this offset and then neutralized as

    e' = nu_j + sqrt(1 - ||nu_j||^2) * (π_B(e) - v_B) / ||π_B(e) - v_B||.

Intuitively, nu_j quantifies the amount by which the words in each equality set E_j differ from each other in directions apart from the gender direction. This is used to center the words in each of these sets.

This renders word pairs such as man and woman equidistant from the neutral words w_i', with each word of the pair being centralized and moved to a position opposite the other in the space. This can filter out properties either word gained by being used in some other context, like mankind or humans for the word man.

The word set W_N = {w_1, w_2, ..., w_n} which is debiased is obtained in two steps. First, some words are seeded as definitionally gendered via crowd sourcing and dictionary definitions; the complement -- the words not selected in this step -- are set as neutral.
Next, using this seeding, an SVM is trained and used to predict, among all of W, the set of other biased words W_B or neutral words W_N. This set W_N is taken as desired to be neutral and is debiased. Thus not all words in the vocabulary W are debiased in this procedure, only a select set chosen via crowd-sourcing and definitions, and its extrapolation. Also, the word vectors in the equality sets are handled separately. This makes the approach not a fully automatic way to debias the vector embedding.
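For comparison, the two HD steps described above (neutralize the chosen set W_N, then re-center and equalize each equality set) can be condensed into a short sketch. This is our illustration of the procedure as stated in this section, not the released implementation; the SVM-based selection of W_N is outside its scope, and `emb` is assumed to be a dict mapping words to numpy vectors with `v_b` a unit vector.

import numpy as np

def hard_debias(emb, neutral_words, equality_sets, v_b):
    emb = dict(emb)  # work on a copy
    # Step 1: project the words chosen to be neutral orthogonally to v_b and normalize.
    for w in neutral_words:
        wb = np.dot(emb[w], v_b) * v_b
        emb[w] = (emb[w] - wb) / np.linalg.norm(emb[w] - wb)
    # Step 2: re-center each equality set and place its members symmetrically.
    mus = [np.mean([emb[e] for e in E], axis=0) for E in equality_sets]
    mu = np.mean(mus, axis=0)                 # mean of equality-set means
    for E, mu_j in zip(equality_sets, mus):
        nu_j = mu - mu_j                      # offset of this set from the mean
        for e in E:
            pb = np.dot(emb[e], v_b) * v_b    # pi_B(e), component along the bias direction
            direction = (pb - v_b) / np.linalg.norm(pb - v_b)
            emb[e] = nu_j + np.sqrt(max(0.0, 1 - np.linalg.norm(nu_j) ** 2)) * direction
    return emb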
3.2 Alternate and Simple Methods

We next present some simple alternatives to HD which are fully automatic. These all assume a bias direction v_B.

Subtraction. As a simple baseline, for all word vectors w subtract the gender direction v_B from w:

    w' = w - v_B.

Linear Projection. A better baseline is to project all words w in W orthogonally to the bias vector v_B:

    w' = w - π_B(w) = w - <w, v_B> v_B.

This enforces that the updated set W' = {w' | w in W} has no component along v_B, and hence the resulting span has only D - 1 dimensions. Reducing the total dimension from, say, 300 to 299 should have minimal effect on the expressiveness or generalizability of the word vector embeddings.

Bolukbasi et al. [3] apply this same step to a dictionary-definition-based extrapolation of a crowd-source-chosen set of word pairs W_N. We quantify in Section 5 that this single universal projection step debiases better than HD.

For example, consider the bias as gender, and the equality set with the words man and woman. Linear projection will subtract from their word embeddings the proportion that was along the gender direction v_B learned from a larger set of equality pairs. It will make them close-by but not exactly equal. The word man is used in many more senses than the word woman; it is used to refer to humankind, to a person in general, and in expressions like "oh man". In contrast, for a simpler word pair with fewer word senses, like (he - she) and (him - her), we can expect the two words to be at almost identical positions in the vector space after debiasing, implying their synonymity.

Thus, this approach uniformly reduces the component of each word along the bias direction without compromising the differences that words (and word pairs) have.
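Both building v_B from the stacked pair differences and the two simple debiasing baselines above take only a few lines. The following is a minimal sketch (not the authors' released code), assuming the embedding is available as a dict `emb` mapping lowercase words to numpy vectors and using the Table 1 pairs.

import numpy as np

# Gendered word pairs of Table 1; emb is assumed to be a dict word -> np.ndarray.
PAIRS = [("man", "woman"), ("son", "daughter"), ("he", "she"), ("his", "her"),
         ("male", "female"), ("boy", "girl"), ("himself", "herself"),
         ("guy", "gal"), ("father", "mother"), ("john", "mary")]

def bias_direction(emb, pairs=PAIRS):
    # Stack the difference vectors e_j^+ - e_j^- and take the top right singular vector of Q.
    Q = np.stack([emb[a] - emb[b] for a, b in pairs])
    _, _, vt = np.linalg.svd(Q, full_matrices=False)
    v_b = vt[0]
    return v_b / np.linalg.norm(v_b)

def linear_projection(emb, v_b):
    # w' = w - <w, v_B> v_B for every word in the vocabulary.
    return {w: vec - np.dot(vec, v_b) * v_b for w, vec in emb.items()}

def subtraction(emb, v_b):
    # Simple baseline: w' = w - v_B.
    return {w: vec - v_b for w, vec in emb.items()}

After linear_projection, every vector has exactly zero component along v_B, so the debiased embedding spans at most D - 1 dimensions, as noted above.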
3.3 Partial Projection

A potential issue with the simple approaches is that they can significantly change some embedded words which are definitionally biased (e.g., the words in W_B described by Bolukbasi et al. [3]). (We note that this may not actually be a problem, see Section 5: the change may only be associated with the bias, so removing it would not change the meaning of those words in any way except the ones we want to avoid.) However, these intuitively should be words which have correlation with the bias vector but are also far in the orthogonal direction. In this section we explore how to automatically attenuate the effect of the projection on these words.

This stems from the observation that, given a bias direction, the words which are most extreme in this direction (have the largest dot product) sometimes have a reasonably biased context, but some do not. These "false positives" may be large-normed vectors which also happen to have a component in the bias direction.

We start with a bias direction v_B and mean mu derived from equality pairs (defined the same way as in the context of HD). Now, given a word vector w, we decompose it into key values along two components, illustrated in Figure 1. First, we write its bias component as

    β(w) = <w, v_B> - <mu, v_B>.

This is the difference of w from mu when both are projected onto the bias direction v_B.

Second, we write a (residual) orthogonal component

    r(w) = w - <w, v_B> v_B.

Let η(w) = ||r(w)|| be its value. It is the orthogonal distance from the bias vector v_B; recall that we chose v_B to pass through the origin, so the choice of mu does not affect this distance.

Figure 1: Illustration of η and β for word vector w.
Now we will maintain the orthogonal component (r(w), which lies in a subspace spanned by D - 1 of the D dimensions) but adjust the bias component β(w) to make it closer to mu. The adjustment will depend on the magnitude η(w). As a default we set

    w' = mu + r(w),

so all word vectors retain their orthogonal component, but have a fixed and constant bias term. This is functionally equivalent to the Linear Projection approach; the only difference is that instead of having a 0 magnitude along v_B (and the orthogonal part unchanged), each word instead has a constant magnitude mu along v_B (and the orthogonal part still unchanged). This adds a constant to every inner product, and a constant offset to any linear projection or classifier. If we are required to work with normalized vectors (we do not recommend this, as the vector length captures veracity information about the embedding), we can simply set w' = r(w)/||r(w)||.
Given this set-up, we now propose three modifications. In each we set

    w' = mu + r(w) + β(w) · f_i(η(w)) · v_B,

where f_i for i in {1, 2, 3} is a function of only the orthogonal value η(w). The default case corresponds to f(η) = 0. The variants are

    f_1(η) = σ^2 / (η + 1)^2
    f_2(η) = exp(-η^2 / σ^2)
    f_3(η) = max(0, σ / (2η)).

Here σ is a hyperparameter that controls the importance of η; in Section 3.4 we show that we can just set σ = 1.

In Figure 2 we see the regions of the (η, β)-space which the functions f, f_1 and f_2 consider gendered. f projects all points onto the y = mu line, while the variants f_1, f_2, and f_3 are represented by curves that dampen the bias reduction to different degrees as η increases. Points P1 and P2 have the same dot product with the bias direction but different dot products along the other D - 1 dimensions. We can observe the effects of each dampening function as η increases from P1 to P2.

Figure 2: The gendered region as per the three variations of projection. Both points P1 and P2 have a dot product of 1.0 initially with the gender subspace, but their orthogonal distance to it differs, as expressed by their dot product with the other 299 dimensions.
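The damped update is straightforward to implement once β, r and η are computed as defined above. A minimal sketch (our illustration, with `mu` the mean of equality-set means and σ = 1 as justified in Section 3.4):

import numpy as np

def damped_projection(emb, v_b, mu, damping="f1", sigma=1.0):
    # Damping functions of the orthogonal magnitude eta; "f" is full projection.
    f = {"f":  lambda eta: 0.0,
         "f1": lambda eta: sigma ** 2 / (eta + 1) ** 2,
         "f2": lambda eta: np.exp(-eta ** 2 / sigma ** 2),
         "f3": lambda eta: max(0.0, sigma / (2 * eta)) if eta > 0 else 1.0}[damping]
    out = {}
    for w, vec in emb.items():
        beta = np.dot(vec, v_b) - np.dot(mu, v_b)    # bias component relative to mu
        r = vec - np.dot(vec, v_b) * v_b             # residual orthogonal component
        eta = np.linalg.norm(r)
        # w' = mu + r(w) + beta(w) * f_i(eta(w)) * v_B
        out[w] = mu + r + beta * f(eta) * v_b
    return out

The guard for η = 0 in f_3 is our own addition to avoid a division by zero; otherwise the update follows the formulas above directly.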
3.4 SETTING σ = 1

To complete the damping functions f_1, f_2, and f_3, we need a value σ. If σ is larger, then more word vectors have their bias completely removed; if σ is smaller, then more words are unaffected by the projection. The goal is for words S which are likely to carry a bias connotation to have little damping (small f_i values) and for words T which are unlikely to carry a bias connotation to have more damping (large f_i values -- they are not moved much).

Given sets S and T, we can define a gain function

    γ_{i,ρ}(σ) = sum_{s in S} β(s)(1 - f_i(η(s))) - ρ sum_{t in T} β(t)(1 - f_i(η(t))),

with a regularization term ρ. The gain γ is large when most biased words in S have very little damping (small f_i, large 1 - f_i), and the opposite is true for the neutral words in T. We want the neutral words to have large f_i and hence small 1 - f_i, so they do not change much.

To define the gain function, we need sets S and T; we do so with the bias of interest as gender. The biased set S is chosen among a set of 1000 popular names in W which (based on babynamewizard.com and SSN databases [1, 2]) are strongly associated with a gender. The neutral set T is chosen as the most frequent 1000 words from W, after filtering out obviously gendered words like names and words such as man and he. We also omit occupation words like doctor and others which may carry unintentional gender bias (these are words we would like to automatically de-bias). The neutral set may not be perfectly non-gendered, but it provides a reasonable approximation of all non-gendered words.

We find that for an array of choices of ρ (we tried ρ = 1, ρ = 10, and ρ = 100), the value σ = 1 approximately maximizes the gain function γ_{i,ρ}(σ) for each i in {1, 2, 3}. So hereafter we fix σ = 1.

Although these sets S and T play a role somewhat similar to the crowd-sourced sets W_B and W_N from HD that we hoped to avoid, the role here is much reduced: they are used only to verify that the choice σ = 1 is reasonable, and otherwise they are not used.
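The gain itself can be computed directly from the two word sets. A small sketch under the same assumptions as before (S and T are plain lists of words, f_i is a callable damping function):

import numpy as np

def gain(emb, v_b, mu, S, T, f_i, rho=10.0):
    def beta(w):
        return np.dot(emb[w], v_b) - np.dot(mu, v_b)
    def eta(w):
        return np.linalg.norm(emb[w] - np.dot(emb[w], v_b) * v_b)
    biased = sum(beta(s) * (1 - f_i(eta(s))) for s in S)
    neutral = sum(beta(t) * (1 - f_i(eta(t))) for t in T)
    return biased - rho * neutral

Scanning this gain over a grid of σ values for each f_i reproduces the selection procedure described above.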
3.5 Flipping the Raw Text

Since the embeddings preserve inner products of the data from which they are drawn, we explore whether we can make the data itself gender unbiased and then observe how that change shows up in the embedding. Unbiasing a textual corpus completely can be very intricate and complicated, since there are many (sometimes implicit) gender indicators in text. Nonetheless, we propose a simple way of neutralizing bias in textual data using word pairs E_1, E_2, ..., E_m: whenever we observe one part of a word pair in the raw text, we randomly flip it to the other part. For instance, for gendered word pairs (e.g., (he - she)), the string "he was a doctor" may flip to "she was a doctor."

We implement this procedure over the entire input raw text, and try various probabilities of flipping each observed word, focusing on probabilities 0.5, 0.75 and 1.00. The 0.5-flip probability makes each element of a word pair equally likely. The 1.00-flip probability reverses the roles of the word pairs, and the 0.75-flip probability does something in between. We perform this set of experiments on the default Wikipedia data set and switch between word pairs (say man -> woman, she -> he, etc.) from a larger list of 75 word pairs; see Supplementary Material D.1.
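A sketch of the flipping step on tokenized raw text; `PAIR_MAP` below is a small stand-in for the full 75-pair list of Supplementary Material D.1.

import random

PAIR_MAP = {"he": "she", "she": "he", "man": "woman", "woman": "man",
            "his": "her", "her": "his"}  # extend with the full list from D.1

def flip_tokens(tokens, p=0.5, pair_map=PAIR_MAP, seed=None):
    # With probability p, replace a token by its partner from the word-pair list.
    rng = random.Random(seed)
    return [pair_map[t] if t in pair_map and rng.random() < p else t
            for t in tokens]

# e.g. flip_tokens("he was a doctor".split(), p=1.0) -> ['she', 'was', 'a', 'doctor']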
Figure 3: Fractional singular values for avg male - female words (as per Table 1) after flipping with probability (from left to right) 0.0 (the original data set), 0.5, 0.75, and 1.0.

We observe how the proportion along the principal components changes with this flipping in Figure 3. We see that flipping with probability 0.5 somewhat dampens the difference between the different principal components. On the other hand, flipping with probability 1.0 (and to a lesser extent 0.75) exacerbates the gender components rather than dampening them: now there are two components significantly larger than the others. This indicates that flipping only addresses part of the explicit bias, misses some implicit bias, and muddles these effects.

We list some gender biased analogies in the default embedding, and how they change with each of the methods described in this section, in Table 2.

4 THE BIAS SUBSPACE

We explore ways of detecting and defining the bias subspace v_B and recovering the most gendered words in the embedding. Recall that as a default we use v_B as the top singular vector of the matrix defined by stacking the difference vectors ē_j = e_j^+ - e_j^- of biased word pairs. We primarily focus on gender bias, using the words in Table 1, and show later how to effectively extend to other biases. We discuss this in detail in Supplementary Material C.

Most gendered words. The dot product <v_B, w> of a word vector w with the gender subspace v_B is a good indicator of how gendered the word is. The magnitude of the dot product tells us the length along the gender subspace, and the sign tells us whether it is more female or male. Some of the words detected as most gendered are listed in Table 3.

4.1 Bias Direction using Names

When listing gendered words by |<v_B, w>|, we observe that many gendered words are names. This indicates the potential to use names as an alternative (and potentially more general) way to bootstrap finding the gender direction.

From the top 100K words, we extract the 10 most common male names {m_1, m_2, ..., m_10} and female names {s_1, s_2, ..., s_10} which are not used in ambiguous ways (e.g., not the name hope, which could also refer to the sentiment). We pair these 10 names from each category (male, female) randomly and compute the SVD as before. We observe in Figure 4 that the fractional singular values show a similar pattern as with the list of correctly gendered word pairs like (man - woman), (he - she), etc. But this way of pairing names is quite imprecise: these names are not 'opposites' of each other in the sense that word pairs are. So, we modify how we compute v_B so that we can better use names to detect the bias in the embedding. The following method gives us this advantage, and does not necessarily need word pairs or equality sets as in Bolukbasi et al. [3].

Figure 4: Proportion of singular values along principal directions (left) using names as indicators, and (right) using word pairs from Table 1 as indicators.
Table 2: What analogies look like before and after damping gender by the different methods discussed: Hard Debiasing, flipping words in the text corpus, subtraction, and projection.

analogy head                   original    HD          flip 0.5    flip 0.75   flip 1.0    subtraction  projection
man : woman :: doctor :        nurse       surgeon     dr          dr          medicine    physician    physician
man : woman :: footballer :    politician  striker     midfielder  goalkeeper  striker     politician   midfielder
he : she :: strong :           weak        stronger    weak        strongly    many        well         stronger
he : she :: captain :          mrs         lieutenant  lieutenant  colonel     colonel     lieutenant   lieutenant
john : mary :: doctor :        nurse       physician   medicine    surgeon     nurse       father       physician
Table 3: Some of the most gendered words in the default embedding; and the most gendered adjectives and occupation words.

Gendered Words (female):   miss, herself, maid, heroine, motherhood, jessica, adriana, seductive
Gendered Words (male):     forefather, himself, nephew, congressman, zahir, suceeded, him, sir
Female Adjectives:         glamorous, diva, shimmery, beautiful
Male Adjectives:           strong, muscular, powerful, fast
Female Occupations:        nurse, maid, housewife, prostitute
Male Occupations:          soldier, captain, officer, footballer

Table 4: Gendered occupations as observed in word embeddings using names as the gender direction indicator.

Female Occ   Male Occ    Female* Occ   Male* Occ
nurse        captain     policeman     policeman
maid         cop         detective     cop
actress      boss        character     character
housewife    officer     cop           assassin
dancer       actor       assassin      bodyguard
nun          scientist   actor         waiter
waitress     gangster    waiter        actor
scientist    trucker     butler        detective
Our gender direction is calculated as

    v_{B,names} = (s - m) / ||s - m||,

where s = (1/10) sum_i s_i and m = (1/10) sum_i m_i.

Using the default Wikipedia dataset, we found that this is a good approximation of the gender subspace defined by the first right singular vector calculated using the gendered words from Table 1; their dot product is 0.809. We find similarly large dot product scores for other datasets too.

Here too we collect the most gendered words as per the gender direction v_{B,names} determined by these names. The most gendered words returned are similar to those found using the default v_B: occupational words, adjectives, and synonyms for each gender. We find names to express a similar classification of words along the male-female vector, with homemaker more female and policeman more male. We illustrate this in more detail in Table 4.

Using that direction, we debias by linear projection. There is a similar shift in analogy results; we see a few examples in Table 5.
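Computing v_{B,names} and ranking words by their component along it takes only a few lines. A sketch, reusing the ten names per gender listed in Supplementary Material D.3 (assumed present in the vocabulary of `emb`):

import numpy as np

MALE = ["john", "william", "george", "liam", "andrew",
        "michael", "louis", "tony", "scott", "jackson"]
FEMALE = ["mary", "victoria", "carolina", "maria", "anne",
          "kelly", "marie", "anna", "sarah", "jane"]

def names_direction(emb, male=MALE, female=FEMALE):
    m = np.mean([emb[w] for w in male], axis=0)
    s = np.mean([emb[w] for w in female], axis=0)
    v = s - m
    return v / np.linalg.norm(v)          # v_{B,names} = (s - m) / ||s - m||

def most_gendered(emb, v_b, k=20):
    # Sort words by |<w, v_B>|; the sign says which gender a word leans toward.
    scored = sorted(emb.items(), key=lambda kv: abs(np.dot(kv[1], v_b)), reverse=True)
    return [(w, float(np.dot(vec, v_b))) for w, vec in scored[:k]]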
Table 5: What analogies look like before and after removing the gender direction using names.

analogy head                   original    subtraction  projection
man : woman :: doctor :        nurse       physician    physician
man : woman :: footballer :    politician  politician   midfielder
he : she :: strong :           weak        very         stronger
he : she :: captain :          mrs         lieutenant   lieutenant
john : mary :: doctor :        nurse       dr           dr
5 QUANTIFYING BIAS

In this section we develop new measures to quantify how much bias has been removed from an embedding, and evaluate the various techniques we have developed for doing so.

As one measure, we use the Word Embedding Association Test (WEAT), developed by Caliskan et al. (2017) [7] as an analogue to the IAT tests, to evaluate the association of male and female gendered words with two categories of target words: career oriented words versus family oriented words. We detail WEAT and list the exact words used (as in [7]) in Supplementary Material B; smaller values are better.

Bolukbasi et al. [3] evaluated embedding bias using a crowdsourced judgement of whether an analogy produced by an embedding is biased or not. Our goal was to avoid crowd sourcing, so we propose two more automatic tests to qualitatively and uniformly evaluate an embedding for the presence of gender bias.

Embedding Coherence Test (ECT). A way to evaluate how the neutralization technique affects the embedding is to evaluate how the nearest neighbors change for (a) gendered pairs of words Ę and (b) indirect-bias-affected words such as those associated with sports or occupations (e.g., football, captain, doctor). We use the gendered word pairs in Table 1 for Ę and the professions list P = {p_1, p_2, ..., p_k} as proposed and used by Bolukbasi et al., https://github.com/tolga-b/debiaswe (see also Supplementary Material D.2), to represent (b).

S1: For all word pairs {e_j^+, e_j^-} = E_j in Ę we compute two means, m = (1/|Ę|) sum_{E_j in Ę} e_j^+ and s = (1/|Ę|) sum_{E_j in Ę} e_j^-. We find the cosine similarity of both m and s to all words p_i in P. This creates two vectors u_m, u_s in R^k.

S2: We transform these similarity vectors by replacing each coordinate with its rank order, and compute the Spearman coefficient (in [-1, 1], larger is better) between the rank orders of the similarities to words in P.

Thus, here, we care about the order in which the words in P occur as neighbors to each word pair rather than the exact distance. The exact distance between each word pair would depend on the usage of each word and thus on all the dimensions other than the gender subspace too. But the order staying relatively the same, as determined using the Spearman coefficient, indicates the dampening of bias in the gender direction (i.e., if doctor is the 2nd closest of all professions to both man and woman, then the embedding has a dampened bias for the word doctor in the gender direction). Neutralization should ideally bring the Spearman coefficient towards 1.
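The ECT computation is short. A sketch, assuming the same `emb` dict as before, a list of pairs for Ę, and a `professions` list for P:

import numpy as np
from scipy.stats import spearmanr

def cos(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def ect(emb, pairs, professions):
    # Means of the two sides of the equality pairs (S1).
    m = np.mean([emb[a] for a, _ in pairs], axis=0)
    s = np.mean([emb[b] for _, b in pairs], axis=0)
    u_m = [cos(m, emb[p]) for p in professions]
    u_s = [cos(s, emb[p]) for p in professions]
    # Spearman rank correlation between the two similarity orderings (S2).
    return spearmanr(u_m, u_s).correlation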
Embedding Quality Test (EQT). The demonstration by Bolukbasi et al. [3] of the skewed gender roles in embeddings using analogies is what we try to quantify in this test. We attempt to quantify the improvement in analogies with respect to bias in the embeddings.

We use the same sets Ę and P as in the ECT test. However, for each profession p_i in P we create a list S_i of its plurals and synonyms from WordNet on NLTK [14].

S1: For each word pair {e_j^+, e_j^-} = E_j in Ę, and each occupation word p_i in P, we test whether the analogy e_j^+ : e_j^- :: p_i returns a word from S_i. If yes, we set Q(E_j, p_i) = 1, and Q(E_j, p_i) = 0 otherwise.

S2: Return the average value across all combinations, (1/|Ę|)(1/k) sum_{E_j in Ę} sum_{p_i in P} Q(E_j, p_i).

The scores for EQT are typically much smaller than those for ECT. We explain two reasons for this.

First, EQT does not check whether the analogy makes relative sense, biased or otherwise. So, "man : woman :: doctor : nurse" is counted as wrong just as "man : woman :: doctor : chair" is. This pushes the score down.

Second, the synonym sets S_i as returned by WordNet [8] on the Natural Language Toolkit, NLTK [14], do not always contain all possible variants of a word. For example, the words psychiatrist and psychologist can be seen as analogous for our purposes here but are linguistically removed enough that WordNet does not list them as synonyms. Hence, even after debiasing, if the analogy returns "man : woman :: psychiatrist : psychologist", S1 returns 0. Further, since the data also has several misspelt words, archeologist is not recognized as a synonym or alternative for the word archaeologist; for this too S1 returns a 0.

The first caveat can be side-stepped by restricting the pool of words we search over for the analogous word to the list P. But it is debatable whether an embedding should be penalized equally for returning nurse and for returning chair for the analogy "man : woman :: doctor : ?". Overall, this measures the quality of analogies, with better quality having a score closer to 1.
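A sketch of the EQT score as described above; the analogy is resolved with the 3COSADD rule, and the synonym sets come from WordNet via NLTK (this is an illustration, not the authors' script, and plural handling is omitted for brevity).

import numpy as np
from nltk.corpus import wordnet as wn   # requires the NLTK WordNet corpus

def synonyms(word):
    return {l.name().lower() for syn in wn.synsets(word) for l in syn.lemmas()} | {word}

def analogy(emb, a, b, x):
    # 3COSADD: the word closest to (b - a + x), excluding the query words.
    target = emb[b] - emb[a] + emb[x]
    target = target / np.linalg.norm(target)
    best, best_sim = None, -2.0
    for w, vec in emb.items():
        if w in (a, b, x):
            continue
        sim = np.dot(vec, target) / np.linalg.norm(vec)
        if sim > best_sim:
            best, best_sim = w, sim
    return best

def eqt(emb, pairs, professions):
    hits = 0
    for a, b in pairs:                       # e.g. ("he", "she")
        for p in professions:
            if analogy(emb, a, b, p) in synonyms(p):
                hits += 1
    return hits / (len(pairs) * len(professions))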
Table 6: Performance on ECT, EQT and WEAT by the different debiasing methods; and performance on standard similarity and analogy tests.

Test              original  HD     flip 0.5  flip 0.75  flip 1.0  sub (pairs)  sub (names)  proj (pairs)  proj (names)
ECT (word pairs)  0.798     0.917  0.983     0.984      0.683     0.963        0.936        0.996         0.943
ECT (names)       0.832     0.968  0.714     0.662      0.587     0.923        0.966        0.935         0.999
EQT               0.128     0.145  0.131     0.098      0.085     0.268        0.236        0.283         0.291
WEAT              1.623     1.221  1.164     1.09       1.03      1.427        1.440        1.233         1.219
WSim              0.637     0.537  0.567     0.537      0.536     0.627        0.636        0.627         0.629
Simlex            0.324     0.314  0.317     0.314      0.264     0.302        0.312        0.321         0.321
Google Analogy    0.623     0.561  0.565     0.561      0.321     0.538        0.565        0.565         0.584
Evaluating embeddings. We mainly run four measures to evaluate our methods: WEAT, EQT, and two variants of ECT. ECT (word pairs) uses Ę defined by the words in Table 1, and ECT (names) uses the vectors m and s derived from gendered names.

We observe in Table 6 that the ECT score increases for all methods in comparison to the non-debiased (original) word embedding; the exceptions are flipping with probability 1.0 for ECT (word pairs) and all flipping variants for ECT (names). Flipping does nothing to affect the names, so it is not surprising that it does not improve this score; this further indicates that it is challenging to directly fix bias in raw text before creating embeddings. Moreover, HD has the lowest score (of 0.917), whereas projection obtains scores of 0.996 (with v_B) and 0.943 (with v_{B,names}).

EQT is a more challenging test: the original embedding only achieves a score of 0.128, and HD only obtains 0.145 (that is, 12-15% of occupation words have their related word as nearest neighbor). On the other hand, projection increases this percentage to 28.3% (using v_B) and 29.1% (using v_{B,names}). Even subtraction does nearly as well, at between 23-27%. Generally, subtraction always performs slightly worse than projection.

Table 7: Performance of damped linear projection using word pairs.

Tests           f      f1     f2     f3
ECT             0.996  0.994  0.995  0.997
EQT             0.283  0.280  0.292  0.287
WEAT            1.233  1.253  1.245  1.241
WSim            0.627  0.628  0.627  0.627
Simlex          0.321  0.324  0.324  0.324
Google Analogy  0.565  0.571  0.569  0.569
For the WEAT test, the original data has a score of 1.623, and this is decreased the most by all forms of flipping, down to about 1.1. HD and projection do about the same, with HD obtaining a score of 1.221 and projection obtaining 1.233 (with v_B) and 1.219 (with v_{B,names}); values closer to 0 are better (see Supplementary Material B).

In the bottom of Table 6 we also run these approaches on standard similarity and analogy tests for evaluating the quality of embeddings. We use cosine similarity [13] on WordSimilarity-353 (WSim, 353 word pairs) [9] and SimLex-999 (Simlex, 999 word pairs) [10], each of which evaluates a Spearman coefficient (larger is better). We also use the Google Analogy Dataset with the function 3COSADD [12], which takes the three words that form part of an analogy and returns the 4th word which best completes it.
We observe (as expected) that all the debiasing approaches reduce these scores. The largest decrease in scores (between 1% and 10%) is almost always from HD. Flipping at the 0.5 rate is comparable to HD. Simple linear projection decreases the scores the least (usually only about 1%, except on analogies where the decrease is 7% with v_B and 5% with v_{B,names}).

In Table 7 we also evaluate the damping mechanisms defined by f_1, f_2, and f_3, using v_B. These are very comparable to simple linear projection (represented by f). The scores for ECT, EQT, and WEAT are all about the same as for simple linear projection, usually slightly worse.

While the ECT, EQT and WEAT scores are in a similar range for all of f, f_1, f_2, and f_3, the dampened approaches f_1, f_2, and f_3 perform better on the Google Analogy test. This test set is devoid of bias and is made up of syntactic and semantic analogies, so a score closer to that of the original, biased embedding tells us that more structure has been retained by f_1, f_2 and f_3. Overall, any of these approaches could be used if a user wants to debias while retaining as much structure as possible, but otherwise linear projection (or f) is roughly as good as the dampened approaches.

6 DETECTING OTHER BIAS USING NAMES

We saw so far how projection combined with finding the gender direction using names works well, and works as well as projection combined with finding the gender direction using word pairs.

We explore here a way of extending this approach to detect other kinds of bias where we cannot necessarily find good word pairs to indicate a direction, like Table 1 for gender, but where names are known to belong to certain protected demographic groups. For example, there is a divide between names that different racial groups tend to use more. Caliskan et al. [7] use a list of names that are more African-American (AA) versus names that are more European-American (EA) for their analysis of bias. There are similar lists of names that are distinctly and commonly used by different ethnic, racial (e.g., Asian, African-American) and even religious (e.g., Islamic) groups.

We first try this with two common demographic group divides: Hispanic / European-American and African-American / European-American.
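The construction is identical to v_{B,names} from Section 4.1; only the name lists change. A small sketch, using a few of the names from Supplementary Material D.4 as stand-ins:

import numpy as np

EUROPEAN_AMERICAN = ["brad", "brendan", "geoffrey", "greg", "brett", "matthew"]
AFRICAN_AMERICAN = ["darnell", "hakim", "jermaine", "kareem", "jamal", "leroy"]

def demographic_direction(emb, group_a, group_b):
    # Same construction as v_{B,names}: normalized difference of the two group centroids.
    a = np.mean([emb[w] for w in group_a if w in emb], axis=0)
    b = np.mean([emb[w] for w in group_b if w in emb], axis=0)
    v = a - b
    return v / np.linalg.norm(v)

The resulting direction can then be passed to the same linear_projection routine sketched in Section 3.2.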
Hispanic and European-American names. Even though we begin with the most commonly used Hispanic (H) names (Supplementary Material D), this is tricky, as not all of these names occur as often as European-American names and are thus not as well embedded. We use the frequencies from the dataset to guide us in selecting commonly used names that are also most frequent in the Wikipedia dataset. Using the same method as Section 4.1, we determine the direction, v_{B,names}, which encodes this racial difference, and find the words most closely aligned with it. Other Hispanic and European-American names are the closest words. But other words like latino or hispanic also appear to be close, which affirms that we are capturing the right subspace.

African-American and European-American names. We see a similar trend when we use African-American names and European-American names (Figure 5). We use the African-American names used by Caliskan et al. (2017) [7]. We determine the bias direction using the method in Section 4.1.

Figure 5: Gender and racial bias in the embedding.

We plot in Figure 5 a few occupation words along the axes defined by the H-EA and AA-EA bias directions, and compare them with those along the male-female axis. The embedding differs among the groups, and is likely still generally more subordinate-biased towards Hispanic and African-American names, as it was towards female. For instance, footballer is more Hispanic than European-American, while maid is more neutral in the racial bias setting than in the gender setting. We see this pattern repeated across embeddings and datasets (see Supplementary Material A).

When we switch the type of bias, we also end up finding different patterns in the embeddings. In the case of both of these racial directions, there is a split not just in occupation words but also in other words that are detected as highly associated with the bias subspace. It shows up foremost among the closest words to the bias subspace. Here, we find words like drugs and illegal close to the H-EA direction, while close to the AA-EA direction we retrieve several slang words used to refer to African-Americans. These word associations with each racial group can be detected by the WEAT tests (lower means less bias) using positive and negative words, as demonstrated by Caliskan et al. (2017) [7]. We evaluate using the WEAT test before and after linear projection debiasing in Table 8. For each of these tests, we use half of the names in each category for finding the bias direction and the other half for WEAT testing. This selection is done arbitrarily and the scores are averaged over 3 such selections.

Table 8: WEAT positive-negative test scores before and after debiasing.

             Before Debiasing  After Debiasing
EA-AA        1.803             0.425
EA-H         1.461             0.480
Youth-Aged   0.915             0.704

More qualitatively, as a result of the dampening of bias, we see that biased words such as other names belonging to these specific demographic groups, slang words, and colloquial terms like latinos are removed from the closest 10% of words. This is beneficial, since the distinguishability of demographic characteristics based on names is what shows up in these different ways, such as occupational or financial bias.

Age-associated names. We observed that names can be masked carriers of age too. Using the database of names through time [1] and extracting the most common names from the early 1900s as compared to the late 1900s and early 2000s, we find a correlation between these names (see Supplementary Material D.5) and age related words. In Figure 6, we see a clear correlation between age and names. Bias in this case does not show up in professions as clearly as for gender, but rather in terms of association with positive and negative words [7]. We again evaluate, using a WEAT test in Table 8, the bias before and after debiasing the embedding.

Figure 6: Detecting Age with Names: a plot of age related terms along names from different centuries.
7 DISCUSSION

Different types of bias exist in textual data. Some are easier to detect and evaluate; some are harder to find suitable and frequent indicators for, and thus harder to dampen. Gendered word pairs and gendered names are frequent enough in textual data to allow us to successfully measure gender bias in different ways and project the word embeddings away from the subspace occupied by gender. Other types of bias do not always have a list of word pairs to fall back on. But using names, as we see here, we can measure and detect these different biases anyway and then project the embedding away from them. In this work we also see how a weighted variant of projection removes bias while best retaining the inherent structure of the word embedding.

References

[1] https://www.ssa.gov/oact/babynames/.
[2] http://www.babynamewizard.com.
[3] T. Bolukbasi, K. W. Chang, J. Zou, V. Saligrama, and A. Kalai. Man is to computer programmer as woman is to homemaker? Debiasing word embeddings. In ACM Transactions of Information Systems, 2016.
[4] T. Bolukbasi, K. W. Chang, J. Zou, V. Saligrama, and A. Kalai. Quantifying and reducing bias in word embeddings. 2016.
[5] Tim Brennan, William Dieterich, and Beate Ehret. Evaluating the predictive validity of the COMPAS risk and needs assessment system. Criminal Justice and Behavior, 36(1):21-40, 2009.
[6] Kaylee Burns, Lisa Anne Hendricks, Trevor Darrell, and Anna Rohrbach. Women also snowboard: Overcoming bias in captioning models. CoRR, abs/1803.09797, 2018.
[7] Aylin Caliskan, Joanna J. Bryson, and Arvind Narayanan. Semantics derived automatically from language corpora contain human-like biases. Science, 356(6334):183-186, 2017.
[8] Christiane Fellbaum. WordNet: An Electronic Lexical Database. Bradford Books, 1998.
[9] L. Finkelstein, E. Gabrilovich, Y. Matias, E. Rivlin, Z. Solan, G. Wolfman, et al. Placing search in context: The concept revisited. In ACM Transactions of Information Systems, volume 20, pages 116-131, 2002.
[10] F. Hill, R. Reichart, and A. Korhonen. SimLex-999: Evaluating semantic models with (genuine) similarity estimation. In Computational Linguistics, volume 41, pages 665-695, 2015.
[11] Amir E. Khandani, Adlar J. Kim, and Andrew Lo. Consumer credit-risk models via machine-learning algorithms. Journal of Banking & Finance, 34(11):2767-2787, 2010.
[12] Omer Levy and Yoav Goldberg. Neural word embedding as implicit matrix factorization. In NIPS, 2013.
[13] Omer Levy and Yoav Goldberg. Linguistic regularities of sparse and explicit word representations. In CoNLL, 2014.
[14] Edward Loper and Steven Bird. NLTK: The Natural Language Toolkit. In Proceedings of the ACL-02 Workshop on Effective Tools and Methodologies for Teaching Natural Language Processing and Computational Linguistics - Volume 1, ETMTNLP '02, pages 63-70, Stroudsburg, PA, USA, 2002. Association for Computational Linguistics.
[15] Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean. Efficient estimation of word representations in vector space. Technical report, arXiv:1301.3781, 2013.
[16] Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg Corrado, and Jeffrey Dean. Distributed representations of words and phrases and their compositionality. In NIPS, pages 3111-3119, 2013.
[17] Jeffrey Pennington, Richard Socher, and Christopher D. Manning. GloVe: Global vectors for word representation. 2014.
[18] Jieyu Zhao, Tianlu Wang, Mark Yatskar, Vicente Ordonez, and Kai-Wei Chang. Men also like shopping: Reducing gender bias amplification using corpus-level constraints. CoRR, abs/1707.09457, 2017.
Supplementary material for: Attenuating Bias in Word Embeddings

A Bias in different embeddings

We explore here how gender bias is expressed across different embeddings, datasets and embedding mechanisms. Similar patterns are reflected across all of them, as seen in Figure 7.

For this verification of the permeative nature of bias across datasets and embeddings, we use the GloVe embeddings of a Wikipedia dump (dumps.wikimedia.org/enwiki/latest/enwiki-latest-pages-articles.xml.bz2, 4.7B tokens), Common Crawl (840B tokens, 2.2M vocab) and Twitter (27B tokens, 1.2M vocab) from https://nlp.stanford.edu/projects/glove/, and the WordToVec embedding of Google News (100B tokens, 3M vocab) from https://code.google.com/archive/p/word2vec/.

B WORD EMBEDDING ASSOCIATION TEST

The Word Embedding Association Test (WEAT) was defined as an analogue to the Implicit Association Test (IAT) by Caliskan et al. [7]. It checks for human-like bias associated with words in word embeddings. For example, it found career oriented words (executive, career, etc.) more associated with male names and male gendered words ('man', 'boy', etc.) than with female names and gendered words, and family oriented words ('family', 'home', etc.) more associated with female names and words than with male. We list the set of words used for WEAT by Caliskan et al., and that we used in our work, below.

For two sets of target words X and Y and attribute words A and B, the WEAT test statistic is

    s(X, Y, A, B) = sum_{x in X} s(x, A, B) - sum_{y in Y} s(y, A, B),

where

    s(w, A, B) = mean_{a in A} cos(a, w) - mean_{b in B} cos(b, w),

and cos(a, b) is the cosine similarity between vectors a and b. This score is normalized by std-dev_{w in X ∪ Y} s(w, A, B). So the closer this value is to 0, the less bias or preferential association the target word groups have to the attribute word groups.

Here the target words are occupation words or career/family oriented words, and the attributes are male/female words or names.

Career: { executive, management, professional, corporation, salary, office, business, career }
Family: { home, parents, children, family, cousins, marriage, wedding, relatives }
Male names: { john, paul, mike, kevin, steve, greg, jeff, bill }
Female names: { amy, joan, lisa, sarah, diana, kate, ann, donna }
Male words: { male, man, boy, brother, he, him, his, son }
Female words: { female, woman, girl, she, her, hers, daughter }
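A compact sketch of the WEAT statistic exactly as defined above (target sets X, Y and attribute sets A, B are lists of words present in `emb`):

import numpy as np

def cos(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def weat(emb, X, Y, A, B):
    def s(w):
        return (np.mean([cos(emb[w], emb[a]) for a in A]) -
                np.mean([cos(emb[w], emb[b]) for b in B]))
    effect = sum(s(x) for x in X) - sum(s(y) for y in Y)
    norm = np.std([s(w) for w in X + Y])
    return effect / norm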
Figure 7: Gender Bias in Different Embeddings: (a) GloVe on Wikipedia, (b) GloVe on Twitter, (c) GloVe on Common Crawl and (d) WordToVec on the Google News dataset.
Figure 8: Bias in adjectives along the gender direction: (a) GloVe on the default Wikipedia dataset, (b) GloVe on Common Crawl (840B token dataset), (c) GloVe on the Twitter dataset and (d) WordToVec on the Google News dataset.

Figure 9: Racial bias in different embeddings: Occupation words along the European American - Hispanic axis in GloVe embeddings of (a) Common Crawl and (b) the Twitter dataset, and along the European American - African American axis in GloVe embeddings of (c) Common Crawl and (d) the Twitter dataset.

C DETECTING THE GENDER DIRECTION

For this, we take the set of gendered word pairs listed in Table 1. From our default Wikipedia dataset, using the embedded vectors for these word pairs (i.e., (woman - man), (she - he), etc.), we create a basis for a subspace F of dimension 10. We then try to understand the distribution of variance in this subspace. To do so, we project the entire dataset onto this subspace F and take the SVD. The top chart in Figure 10 shows the singular values of the entire data in this subspace F. We observe that there is a dominant first singular vector/value which is almost twice the size of the second value. After this drop, the decay is significantly more gradual. This suggests using only the top singular vector of F as the gender subspace, not 2 or more of these vectors.

To grasp how much of this variation is from the correlation along the gender direction, and how much is just random variation, we repeat this experiment, again in Figure 10, with different ways of creating the subspace F. First, in chart (b), we generate 10 vectors with one word chosen randomly and one chosen from the gendered set (e.g., chair-woman). Second, in chart (c), we generate 10 vectors between two random words from our set of the 100,000 most frequent words; these are averaged over 100 random iterations due to the higher variance in the plots. Finally, in chart (d), we generate 10 random unit vectors in R^300. We observe that the pairs with one gendered vector in each pair still exhibit a significant drop in singular values, but not as drastic as with both words gendered. The other two approaches have no significant drop since they do not in general contain a gendered word with an interesting subspace. All remaining singular values, and their decay, appear similar to the non-leading ones from charts (a) and (b). This further indicates that there is roughly one important gender direction, and any related subspace is not significantly different from a random one in the word-vector embedding.

Figure 10: Fractional singular values for (a) male-female word pairs, (b) one gendered word - one random word, (c) random word pairs, (d) random unit vectors.

Now, for any word w in the vocabulary W of the embedding, we can define w_B as the part of w along the gender direction. Based on the experiments shown in Figure 10, it is justified to take the gender direction as the (normalized) first right singular vector, v_B, of the full data set projected onto the subspace F. Then, the component of a word vector w along v_B is simply <w, v_B> v_B. Calculating this component when the gender subspace is defined by two or more of the top right singular vectors can be done similarly.

We should note here that the gender subspace defined here passes through the origin. Centering the data and using PCA to define the gender subspace lets the gender subspace not pass through the origin. We see a comparison of the two methods in Section 5, as HD uses PCA and we use SVD to define the gender direction.
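The experiment above -- spanning a subspace F with the ten pair-difference vectors, projecting the whole embedding onto it, and examining the singular values -- can be sketched as follows (our illustration, same `emb` dict assumption as in the main text):

import numpy as np

def subspace_singular_values(emb, pairs):
    # Orthonormal basis for the span F of the pair-difference vectors.
    D = np.stack([emb[a] - emb[b] for a, b in pairs])          # 10 x d
    basis, _ = np.linalg.qr(D.T)                                # d x 10
    # Project every word vector onto F and take the SVD of the projected data.
    W = np.stack(list(emb.values()))                            # n x d
    P = W @ basis                                               # n x 10
    sv = np.linalg.svd(P, compute_uv=False)
    return sv / sv.sum()                                        # fractional singular values

Replacing `pairs` with randomly matched words, or with random unit vectors, reproduces the comparison charts (b)-(d) of Figure 10.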
D Word Lists

D.1 Word Pairs used for Flipping

actor actress
author authoress
bachelor spinster
boy girl
brave squaw
bridegroom bride
brother sister
conductor conductress
count countess
czar czarina
dad mum
daddy mummy
duke duchess
emperor empress
father mother
father-in-law mother-in-law
fiance fiancee
gentleman lady
giant giantess
god goddess
governor matron
grandfather grandmother
grandson granddaughter
he she
headmaster headmistress
heir heiress
hero heroine
him her
himself herself
host hostess
hunter huntress
husband wife
king queen
lad lass
landlord landlady
lord lady
male female
man woman
manager manageress
manservant maidservant
masseur masseuse
master mistress
mayor mayoress
milkman milkmaid
millionaire millionairess
monitor monitress
monk nun
mr mrs
murderer murderess
nephew niece
papa mama
poet poetess
policeman policewoman
postman postwoman
postmaster postmistress
priest priestess
prince princess
prophet prophetess
proprietor proprietress
shepherd shepherdess
sir madam
son daughter
son-in-law daughter-in-law
step-father step-mother
step-son step-daughter
steward stewardess
sultan sultana
tailor tailoress
uncle aunt
usher usherette
waiter waitress
washerman washerwoman
widower widow
wizard witch

D.2 Occupation Words

detective, ambassador, coach, officer, epidemiologist, rabbi, ballplayer, secretary, actress, manager, scientist, cardiologist, actor, industrialist, welder, biologist, undersecretary, captain, economist, politician, baron, pollster, environmentalist, photographer, mediator, character, housewife, jeweler, physicist, hitman, geologist, novelist, painter, senator, employee, collector, stockbroker, goalkeeper, footballer, singer, tycoon, acquaintance, dad, preacher, patrolman, trumpeter, chancellor, colonel, advocate, trooper, bureaucrat, understudy, strategist, paralegal, pathologist, philosopher, psychologist, councilor, campaigner, violinist, magistrate, priest, judge, cellist, illustrator, hooker, surgeon, jurist, nurse, commentator, missionary, gardener, stylist, journalist, solicitor, warrior, scholar, cameraman, naturalist, wrestler, artist, hairdresser, mathematician, lawmaker, businesswoman, psychiatrist, investigator, clerk, curator, writer, soloist, handyman, servant, broker, broadcaster, boss, fisherman, lieutenant, landlord, neurosurgeon, housekeeper, protagonist, crooner, sculptor, archaeologist, nanny, teenager, teacher, councilman, homemaker, attorney, cop, choreographer, planner, principal, laborer, parishioner, programmer, therapist, philanthropist, administrator, waiter, skipper, barrister, aide, trader, chef, swimmer, gangster, adventurer, astronomer, monk, educator, bookkeeper, lawyer, radiologist, midfielder, columnist, evangelist, banker, neurologist, technician, barber, nun, policeman, instructor, assassin, alderman, marshal, analyst, waitress, chaplain, artiste, inventor, playwright, lifeguard, electrician, bodyguard, student, bartender, deputy, surveyor, researcher, consultant, caretaker, athlete, ranger, cartoonist, lyricist, negotiator, entrepreneur, promoter, sailor, socialite, dancer, architect, composer, mechanic, president, entertainer, dean, counselor, comic, janitor, medic, firebrand, legislator, sportsman, salesman, anthropologist, observer, performer, pundit, crusader, maid, envoy, archbishop, trucker, firefighter, publicist, vocalist, commander, tutor, professor, proprietor, critic, restaurateur, comedian, editor, receptionist, saint, financier, butler, valedictorian, prosecutor, inspector, sergeant, steward, realtor, confesses, commissioner, bishop, narrator, shopkeeper, conductor, ballerina, historian, diplomat, citizen, parliamentarian, worker, author, pastor, sociologist, serviceman, photojournalist, filmmaker, guitarist, sportswriter, butcher, poet, mobster, dentist, drummer, statesman, astronaut, minister, protester, dermatologist, custodian, maestro, pianist, pharmacist, chemist, pediatrician, lecturer, foreman, cleric, musician, cabbie, fireman, farmer, headmaster, soldier, carpenter, substitute, director, cinematographer, warden, marksman, congressman, prisoner, librarian, magician, screenwriter, provost, saxophonist, plumber, correspondent, organist, baker, doctor, constable, treasurer, superintendent, boxer, physician, infielder, businessman, protege
D.3 Names used for Gender Bias Detection

Male: { john, william, george, liam, andrew, michael, louis, tony, scott, jackson }
Female: { mary, victoria, carolina, maria, anne, kelly, marie, anna, sarah, jane }

D.4 Names used for Racial Bias Detection and Dampening

European American: { brad, brendan, geoffrey, greg, brett, matthew, neil, todd, nancy, amanda, emily, rachel }
African American: { darnell, hakim, jermaine, kareem, jamal, leroy, tyrone, rasheed, yvette, malika, latonya, jasmine }
Hispanic: { alejandro, pancho, bernardo, pedro, octavio, rodrigo, ricardo, augusto, carmen, katia, marcella, sofia }

D.5 Names used for Age related Bias Detection and Dampening

Aged: { ruth, william, horace, mary, susie, amy, john, henry, edward, elizabeth }
Youth: { taylor, jamie, daniel, aubrey, alison, miranda, jacob, arthur, aaron, ethan }