EEOC, Big Data and Disparate Impact: Barking up the Wrong Tree

It has been widely reported that EEOC Assistant Legal Counsel Carol Miaskoff, when addressing a conference on big data, shared her belief that employers should be concerned with the disparate impact of their employment-related data mining and analysis.

I am not convinced that she is right. I don’t think disparate impact will be the theory on which plaintiffs successfully attack big data in employment; I think it will be intentional discrimination, proved through a discriminatory “pattern and practice.”

Ms. Miaskoff said:

It’s been interesting to me because everyone’s been talking about disparate impact and adverse impact a lot. In the employment space, those are very precise legal terms. And there’s a cause of action for disparate impact. And I would say that that’s the one, frankly, that’s most suited to big data. Because what that’s about is taking a neutral, i.e. like race neutral, gender neutral, et cetera, term that nonetheless disproportionately excludes members of the protected group and– this is the critical piece here– and is not job related consistent with business necessity.

Let’s unpack this.

First, we should dive into disparate impact.  It is well understood that “Title VII prohibits employers from using neutral tests or selection procedures that have the effect of disproportionately excluding persons based on race, color, religion, sex, or national origin, where the tests or selection procedures are not job-related and consistent with business necessity.”  An employer can use one of several statistical models to show that a testing or selection procedure is job-related and consistent with business necessity, that is, necessary to the safe and/or efficient performance of the job.  At that point, the burden shifts to a plaintiff to show that there are other tests or selection procedures that are equally effective but have less of a discriminatory impact.
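One common way to make the “disproportionately excludes” piece concrete is the “four-fifths” rule of thumb from the EEOC’s Uniform Guidelines: a selection rate for a protected group that is less than 80% of the rate for the highest-selected group is generally regarded as evidence of adverse impact. A rough sketch of that arithmetic, using purely hypothetical group labels and applicant counts, might look like this:

```python
# Illustrative only: the EEOC Uniform Guidelines' "four-fifths" rule of
# thumb for spotting disproportionate exclusion. Group labels and counts
# below are hypothetical.

def selection_rate(selected: int, applicants: int) -> float:
    """Fraction of applicants who were selected."""
    return selected / applicants

rates = {
    "group_a": selection_rate(48, 120),  # 40% selected
    "group_b": selection_rate(12, 60),   # 20% selected
}

highest_rate = max(rates.values())
for group, rate in rates.items():
    impact_ratio = rate / highest_rate
    flag = "possible adverse impact" if impact_ratio < 0.8 else "within the guideline"
    print(f"{group}: selection rate {rate:.0%}, impact ratio {impact_ratio:.2f} ({flag})")
```

The four-fifths comparison is only a screening heuristic, not the legal test itself, but it is the kind of threshold showing that gets the burden-shifting started.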

The problem is this: any employer using solid data science should be able to show that the variables it relies on correlate strongly with job safety and efficiency and are statistically defensible. After all, isn’t that the whole purpose of data science: to tease out and extrapolate interesting, highly correlative information? Done right, the data scientist should be able to demonstrate a defensible correlation and rule out other factors as being equally or more explanatory (the definition of “necessary”). In short, data science seems built precisely to resist the disparate impact framework.
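To see what that showing might look like in practice, here is a minimal, purely illustrative sketch, with synthetic data and hypothetical variable names, of the kind of validation evidence a data scientist could generate: a screening score that correlates with a measured job outcome, alongside an alternative factor that explains less.

```python
# Purely synthetic illustration of "validation" evidence: a screening
# score that correlates with a job-performance measure, plus a weaker
# alternative factor. Variable names and data are hypothetical.

import numpy as np

rng = np.random.default_rng(0)
n = 500

screening_score = rng.normal(size=n)
# Performance partially driven by the score, plus noise.
job_performance = 0.6 * screening_score + rng.normal(scale=0.8, size=n)
# A competing factor the employer might argue adds little.
alternative_factor = 0.2 * screening_score + rng.normal(size=n)

r_score = np.corrcoef(screening_score, job_performance)[0, 1]
r_alt = np.corrcoef(alternative_factor, job_performance)[0, 1]

print(f"screening score vs. performance: r = {r_score:.2f}")
print(f"alternative factor vs. performance: r = {r_alt:.2f}")
```

A real validation study would be far more rigorous, but the point stands: this is exactly the evidence the business-necessity defense asks for, and a well-built model produces it almost as a byproduct.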

If not disparate impact, then, absent a new statutory scheme, the remaining avenue for plaintiffs is the intentional discrimination framework: specifically, showing intentional discrimination through proof of a discriminatory “pattern and practice.”

Briefly, intentional “pattern and practice” cases are generally evaluated by the courts under the burden-shifting paradigm adopted by the U.S. Supreme Court in International Brotherhood of Teamsters v. United States, 431 U.S. 324, 336 (1977). The Teamsters framework charges the plaintiff with the higher initial burden of establishing “that unlawful discrimination has been a regular procedure or policy followed by an employer…” Teamsters, 431 U.S. at 360. Upon that showing, it is presumed “that any particular employment decision, during the period in which the discriminatory policy was in force, was made in pursuit of that policy” and, therefore, “[t]he [plaintiff] need only show that an alleged individual discriminatee unsuccessfully applied for a job.” Id. at 362.

I don’t think it would be difficult for a plaintiff to show that an employment practice built on data mining is a regular procedure or policy. The burden then shifts to “the employer to demonstrate that the individual applicant was denied an employment opportunity for lawful reasons.” Id. According to the Second Circuit:

Teamsters sets a high bar for the prima facie case the Government or a class must present in a pattern-or-practice case: evidence supporting a rebuttable presumption that an employer acted with the deliberate purpose and intent of discrimination against an entire class. An employer facing that serious accusation must have a broad opportunity to present in rebuttal any relevant evidence that shows that it lacked such an intent.  United States v. City of New York, 717 F.3d 72, 87 (2d Cir. 2013).

This includes demonstrating that the Government’s proof is either “inaccurate or [statistically] insignificant.”  Teamsters, 431 U.S. at 360.

Plaintiffs will try to show that the model used is so fraught with perilous assumptions that it evidences discriminatory intent. Defendants will try to show that their business model was founded on purely neutral data mining techniques. In short, the fight will likely focus on whether the assumptions of the big data model in use were reasonable ones. As I discuss in my article, 10 Questions: Confronting Allegations Based on Big Data, the key fight over allegations predicated on big data is about the choices that go into selecting the data and methodology to test in the first place.

I think we are a long way from a settled jurisprudence for handling big data claims. In the meantime, employers would do well to keep pattern-and-practice claims as much in mind as disparate impact.

Back to Ms. Miaskoff. The next two (wholly unreported) paragraphs of her remarks are telling. She goes on to say:

Now, in terms of big data, I think this is the rub. This is really what’s very fascinating, that the first step is to show and look at what is the tool. Now, this would apply to recruitment or to selection, but probably perhaps more to selection. What is the tool? Does it cause a disparate impact? And once you get there, just because it causes a disparate impact doesn’t make it illegal discrimination under the employment laws. It’s only illegal if it does not predict, accurately predict, success in the job.

So this raises all kinds of fascinating issues with big data analytics because, indeed, if you do possibly have prejudices built into the data, something might be validated as predicting success on the job. But it might just be predicting that white guys who went to Yale do well in this job. So there’s going to be a lot of interesting thought that needs to be done and technology work, really, around understanding how to validate these kind of concerns.

I think Ms. Miaskoff has put her finger on the very issue I raise in this post: the point of attack is not the results; it is the underlying predictive model and the prejudices built into the data.
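To illustrate the proxy problem she describes, here is a small, purely synthetic sketch: a facially neutral feature that “predicts” only because it largely tracks membership in a protected group. Nothing here reflects any real dataset or employer.

```python
# Purely synthetic illustration of a "proxy" feature: a facially neutral
# variable that is predictive mostly because it tracks a protected
# attribute. No real data or employer is depicted.

import numpy as np

rng = np.random.default_rng(1)
n = 1_000

protected = rng.integers(0, 2, size=n)  # hypothetical protected attribute
# A "neutral" feature (say, attended a particular school) that closely
# tracks the protected attribute.
neutral_feature = ((protected + rng.normal(scale=0.5, size=n)) > 0.5).astype(int)

agreement = (neutral_feature == protected).mean()
correlation = np.corrcoef(neutral_feature, protected)[0, 1]

print(f"feature agrees with protected attribute {agreement:.0%} of the time")
print(f"correlation between feature and protected attribute: r = {correlation:.2f}")
```

In practice the audit would involve formal validation studies rather than a few lines of script, but the underlying question is the one Ms. Miaskoff poses: is the model predicting success on the job, or merely predicting that people who look like the existing workforce will be hired again?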
