The linguistics search engine that overturned the federal mask mandate

Conservative judges are falling for the glamour of big data

The COVID-19 pandemic was still raging when a federal judge in Florida made the fateful decision to type “sanitation” into the search bar of the Corpus of Historical American English.

Many parts of the country had already dropped mask requirements, but a federal mask mandate on planes and other public transportation was still in place. A lawsuit challenging the mandate had come before Judge Kathryn Mizelle, a former clerk for Justice Clarence Thomas. The Biden administration said the mandate was valid, based on a law that authorizes the Centers for Disease Control and Prevention (CDC) to introduce rules around “sanitation” to prevent the spread of disease.

Mizelle took a textualist approach to the question — looking specifically at the meaning of the words in the law. But along with consulting dictionaries, she consulted a database of language, called a corpus, built by a Brigham Young University linguistics professor for other linguists. Pulling every example of the word “sanitation” from 1930 to 1944, she concluded that “sanitation” was used to describe actively making something clean — not as a way to keep something clean. So, she decided, masks aren’t actually “sanitation.”

Overturning the mask mandate was one of the final steps in the defanging of public health authorities, even as infectious disease ran rampant.

A new legal tool

Using corpora to answer legal questions, a strategy often referred to as legal corpus linguistics, has grown increasingly popular in some legal circles within the past decade. It’s been used by judges on the Michigan Supreme Court and the Utah Supreme Court, and, this past March, was referenced by the US Supreme Court during oral arguments for the first time.

“It’s been growing rapidly since 2018,” says Kevin Tobia, a professor at Georgetown Law. “And it’s only going to continue to grow.”

A corpus is a vast database of written language that can include things like books, articles, speeches, and other texts, amounting to hundreds of millions of lines of text or more. Linguists usually use corpora for scholarly projects to break down how language is used and what words are used for.

Linguists are concerned that judges aren’t actually trained well enough to use the tools properly. “It really worries me that naive judges would be spending their lunch hour doing quick-and-dirty searches of corpora, and getting data that is going to inform their opinion,” says Mark Davies, the now-retired Brigham Young University linguistics professor who built both the Corpus of Contemporary American English and the Corpus of Historical American English. These two corpora have become the tools most commonly used by judges who favor legal corpus linguistics.

The use of corpora appeals to a specific set of judges who often lean conservative. In a sense, legal corpus linguistics is a subcategory of textualism, a school of legal thought that calls for decisions to be based on the specific words on the page without considering the intent of the lawmakers or the legislative history leading to a law’s passage. Textualism is most strongly associated with Justice Antonin Scalia and other conservative judges, but its basic premise is prevalent across the legal profession. (“We’re all textualists now,” said Justice Elena Kagan in a speech in 2015.)

But legal corpus linguistics lets textualists turn to a very specific kind of tool, one that is cloaked by the allure of technology and big data. The Corpus of Historical American English includes 475 million words; the Corpus of Contemporary American English includes 1 billion. Critics warn that its use by judges could create a misleading veneer of objectivity, when corpora, which don’t give black-and-white answers anyway, require an additional level of skilled interpretation to make good use of them.

“I think this is another tool in the arsenal of judges and legal theorists who don’t want to give reasons on the merits for accepting their interpretations of the law, but rather want to just claim an unassailable correctness because of some objective or external circumstances,” says Anya Bernstein, a professor studying legal interpretation at the University at Buffalo School of Law. “It implies it’s something you can do by pushing a button on a computer.”

A twist on textualism

Textualism has traditionally relied heavily on dictionaries. Where words in a statute aren’t defined, judges fall back on a number of methods of interpretation known as canons of construction. In many instances, judges must look for the “ordinary meaning” of a word — which is where the dictionaries come into play. (Reading Law, a handbook on textualist interpretation by Antonin Scalia and Bryan Garner, includes an appendix dedicated to dictionaries and how to use them.)

“Ordinary meaning” is usually described as the way an average person in the US would understand the meaning of a particular word. Figuring out ordinary meaning, though, is tricky, and there’s no agreed-upon best practice for how to do it. Dictionary definitions and etymologies often aren’t enough. “All of these tools are woefully inadequate,” says James Heilpern, a senior fellow of law and corpus linguistics at BYU’s J. Reuben Clark Law School, who trains judges and attorneys on the practice. A dictionary doesn’t actually offer up an ordinary meaning in its definitions, he says, and the etymology of a word doesn’t necessarily say anything about how it’s been used at other points in time.

Heilpern thinks textualists need better tools to support their approach than just dictionaries, despite judges’ longstanding reliance on them. “We’re going to infuse this tradition with innovation,” he says. “And we’re going to develop tools to do our jobs better.”

But textualists weren’t the ones who built the corpora — linguists were. In a sense, Davies, the linguist who created the Corpus of Contemporary American English and the Corpus of Historical American English, is on the same page. He says that the idea was to create a body of text that could capture the ways words and language are used in the real world.

At the same time, Davies wasn’t expecting his work to end up as a legal tool. “I’m a linguist, so I think everyone’s an egghead like me and wants to look at syntactic constructions,” he says.

But one of Davies’ linguistics students, Stephen Mouritsen, went on to law school. In 2010, Mouritsen published a paper in the BYU Law Review arguing that corpus linguistics could help judges figure out the “ordinary meaning” of words. Mouritsen later clerked for Justice Thomas Lee of the Utah Supreme Court.

In 2011, corpus linguistics appeared for the first time in a judicial opinion by Justice Lee. It was used in five other opinions between 2011 and 2017, all at the Utah Supreme Court and the Michigan Supreme Court, according to an analysis by Georgetown professor Tobia in the University of Chicago Law Review Online. Its use increased dramatically between 2017 and 2020 — it was cited 24 times in that window by judges at various state and federal courts.

Most of the judges discussing corpus linguistics in their opinions were Republican appointees, according to Tobia’s analysis. And, when Republican-appointed judges talked about corpus linguistics, they were more likely to be in favor of it; the Democratic appointees were more likely to be critical of the tool.

That’s not surprising given that textualism is associated with more conservative approaches to the law. But Heilpern insists that it doesn’t have to be. “I really do believe that corpus linguistics is an entirely apolitical and atheoretical tool,” he says.

“Fatally flawed” uses

Linguists, though, question whether judges are actually using corpus linguistics appropriately. There are, broadly speaking, three categories of corpus linguistic analysis, says Stefan Gries, a professor of linguistics at the University of California, Santa Barbara. The first is the simplest: searching a website like the Corpus of Historical American English. The second uses dedicated corpus linguistics software to do more advanced analysis. In the third, people write their own code to analyze texts.

Most legal corpus linguistics falls into the first category, Gries says. “These judges have been part of a weekend training session on corpus linguistics and the law, and they’ve essentially learned to go to the COHA or COCA websites,” he says. “They enter a search term and do some often very elementary counts, and that’s what they report as a corpus analysis.”

Usually — like Judge Mizelle in Florida — they’re using the website to search for all the times a particular word or phrase appears in a language database, and they then check to see the ways that word or phrase is used. “If 90 percent of the time one sense of the word is being used, that’s probably how people would have understood this particular word,” says James Phillips, an assistant professor at Chapman University’s Fowler School of Law.

But Gries, the linguist, doesn’t agree. The most frequently used application of a word is not always going to be the most “ordinary” or most commonly understood use of that word, Gries says. For example, corpora, specifically the corpora used for legal corpus linguistics, contain millions of words from news sources such as TV programs, magazines, and newspapers. A word, then, might be used in a particular way more frequently because that use of the word is more newsworthy, according to one Stanford Law Review article.

Davies is also concerned that there’s not enough attention to the methodology judges are using. Changing up the way a search is done, or the way words are counted in that search, or any number of small tweaks could influence the outcome of the analysis. “If you’d done it in some other way, you would have gotten different results,” he says. Most legal corpus analyses also don’t have multiple people analyze the same data, a step that can help reduce bias, according to Tobia’s analysis.

Corpus linguistics can’t actually offer clear-cut answers around language, even if it looks like a more objective way to answer questions. “It looks like it’s this quantification, so it’s somehow more reliable, because everything is numbers. But it’s actually all qualitative underneath,” Bernstein says. People still have to interpret the results that the corpus gives them, and different people might describe them in different ways.

“If you’re a really good linguist, you can convince other people that your descriptions are correct,” she says. “But it’s not like there’s a correct answer here.”

Heilpern says that proponents of corpus linguistics know that it can’t be a perfectly objective tool and that he stresses to lawyers and judges that context and interpretation are key. “Corpus linguistics can simply provide better evidence to the judge in order to make their decision,” he says. “There’s nothing wrong with the judge using it on their own if they know what they’re doing.”

But Bernstein is skeptical that judges have the skills to use corpus linguistics. “They’re saying, ‘And we can totally do it, because we’re extremely refined speakers and our job is interpreting language,’” she says. “Well, people in comparative literature also interpret language. They do a lot of interpretation of texts, but they’re not linguists.”

It’s particularly concerning to critics because the consequences of an inadequate corpus search affect the real world — not just academia.

Take the mask mandate case, in which the ruling blocked mask requirements on planes, one of the last places people were forced to mask up. When those rules went, so did most masking, even though the COVID-19 pandemic continues to burn through the US. The corpus linguistic analysis in that case had major issues, according to Gries, Tobia, and other experts writing in the Columbia Law Review Forum. For example, the judge searched for “sanitary” and “sanitize” along with “sanitation,” even though the words have different uses. These scholars argue that the judge ignored situations where “sanitation” could have been interpreted to mean keeping a place clean.

“The district court’s evaluation of corpora is fatally flawed,” they wrote — and is “opaque and unreproducible.”

Gries and other linguists say they’d be more comfortable if corpus linguistic analysis were presented in court by experts or if both sides of a case did their own analysis and presented it as evidence during a trial. Heilpern says that’d be ideal, as well.

But Gries says he’s seen judges who think that being a judge makes them equally qualified to use a corpus linguistic tool. “I’ve had expert briefs tossed out by judges,” he says. “They don’t use these exact words, but they’re basically saying, ‘I don’t need this shit, I can do it.’”

These types of problems aren’t unique to corpus linguistics; even devout textualists say that the strategies used to find the meanings of words can pose problems if used incorrectly. Scalia and Garner, in Reading Law, warn that “an uncritical approach to dictionaries can mislead judges.” But those pitfalls are arguably harder to avoid with a big data tool like a corpus, which has the shine of an objective technology. “It’s a way of asserting a single correct answer, even though that’s a fallacy. That’s an illusion,” Bernstein says.

The use of legal corpus linguistics has grown rapidly over the past few years, and Heilpern says it’s still not clear what the future might look like. But he thinks the technique will start to be taught in law schools and that more and more judges will start to be familiar with the approach — pushing lawyers to understand it as well. Davies, for one, says he’s spoken to one Supreme Court justice and that someone he’s worked closely with has spoken with a second.

“I could imagine a day where it’s just a tool that everyone has to be familiar with,” Phillips says. “That’s the ultimate goal, and we’re going to take steps in that direction.”

For Gries — even though he’s a critic of the tool — that’s not necessarily a bad thing. The potential ramifications of the spread of corpus linguistics depend on who’s using it, he says. If it’s attorneys and expert witnesses on opposite sides of an argument, that’s one thing. If it’s judges doing searches in their chambers, that’s another.

“It can be very good but it can also be very dangerous,” Gries says. “It depends on who’s doing it.”