According to the study, when an expert HR system is trained using a massive corpus of a language (say, English) to learn associations within the language (such as the association of “flower” and “pleasant”), implicit bias encoded in the language will also be encoded by the expert system:
Our findings suggest that if we build an intelligent system that learns enough about the properties of language to be able to understand and produce it, in the process it will also acquire historical cultural associations, some of which can be objectionable…. Further concerns may arise as AI is given agency in our society. If machine-learning technologies used for, say, résumé screening were to imbibe cultural stereotypes, it may result in prejudiced outcomes.
More starkly put, there ought to be no association (technically, no similarity) between two sets of words (“programmer,” “engineer,” “scientist” and “nurse,” “teacher,” “librarian”) and two sets of gendered attributes (“man,” “male” and “woman,” “female”). Yet the expert system, having examined existing English-language material (including material that encodes biases about “male” jobs and “female” jobs), inferred an association between those words, and thus the ability to predict which gender was associated with which job. This empirically confirms some of the algorithmic bias described in Big Data’s Disparate Impact by Solon Barocas and Andrew Selbst (discussed elsewhere on this blog).
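The kind of association test described above can be sketched in a few lines of code. The following is a minimal, illustrative example only: the word vectors are invented toy values (real embeddings such as GloVe or word2vec have hundreds of dimensions learned from a corpus), and the `association` scoring function is a simplified stand-in for the study's actual statistical test. It shows the basic mechanics: similarity between words is measured as the cosine of the angle between their vectors, and a word's "lean" toward one attribute set over another is the difference in its mean similarity to each set.

```python
# Toy sketch of a word-association (similarity) test.
# All vector values below are invented for illustration; a real test
# would load embeddings trained on a large text corpus.
import numpy as np

def cosine(u, v):
    """Cosine similarity: near 1.0 = similar direction, near 0.0 = unrelated."""
    return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

def association(word_vec, attrs_a, attrs_b):
    """Mean similarity to attribute set A minus mean similarity to set B.
    Positive = the word leans toward set A; negative = toward set B."""
    return (np.mean([cosine(word_vec, a) for a in attrs_a])
            - np.mean([cosine(word_vec, b) for b in attrs_b]))

# Hypothetical embeddings in which "programmer" happens to sit near the
# male-attribute direction and "nurse" near the female one -- mimicking
# the biased associations a corpus-trained system can absorb.
vectors = {
    "programmer": np.array([0.9, 0.1, 0.2]),
    "nurse":      np.array([0.1, 0.9, 0.2]),
    "man":        np.array([1.0, 0.0, 0.1]),
    "male":       np.array([0.95, 0.05, 0.1]),
    "woman":      np.array([0.0, 1.0, 0.1]),
    "female":     np.array([0.05, 0.95, 0.1]),
}

male_attrs = [vectors["man"], vectors["male"]]
female_attrs = [vectors["woman"], vectors["female"]]

print(association(vectors["programmer"], male_attrs, female_attrs))  # positive
print(association(vectors["nurse"], male_attrs, female_attrs))       # negative
```

In a fair embedding, both scores would sit near zero; the gap between them is precisely the measurable bias the study detected, which is also what makes it testable.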
In a hiring context, the encoding of such biased norms could be disastrous.
The light at the end of the tunnel is that what is encoded can be tested for and corrected. It’s just that we need to know what questions to ask of the systems we are putting in place.
Next, I will write on corrective measures to counter algorithmic bias (see, e.g., here and here). Following that, I will condense the collective work of this blog and others into a lawyer’s guide to procuring big data.