How Did Computers Uncover J.K. Rowling’s Pseudonym?

Forensic linguistics can use powerful programs to track written text back to its author

writing
Belleville News Democrat

A famous British writer is revealed to be the author of an obscure mystery novel. An immigrant is granted asylum when authorities verify he wrote anonymous articles critical of his home country. And a man is convicted of murder when he’s connected to messages painted at the crime scene.

The common element in these seemingly disparate cases is “forensic linguistics”—an investigative technique that helps experts determine authorship by identifying quirks in a writer’s style. Advances in computer technology can now parse text with ever-finer accuracy. Consider the recent outing of Harry Potter author J.K. Rowling as the writer of The Cuckoo’s Calling, a crime novel she published under the pen name Robert Galbraith. England’s Sunday Times, responding to an anonymous tip that Rowling was the book’s real author, hired Duquesne University’s Patrick Juola to analyze the text of Cuckoo, using software that he had spent over a decade refining. One of Juola’s tests examined sequences of adjacent words, while another zoomed in on sequences of characters; a third test tallied the most common words, while a fourth examined the author’s preference for long or short words. Juola wound up with a linguistic fingerprint—hard data on the author’s stylistic quirks.

He then ran the same tests on four other books: The Casual Vacancy, Rowling’s first post-Harry Potter novel, plus three stylistically similar crime novels by other female writers. Juola concluded that Rowling was the most likely author of The Cuckoo’s Calling, since she was the only one whose writing style showed up as the closest or second-closest match in each of the tests. After consulting an Oxford linguist and receiving a concurring opinion, the newspaper confronted Rowling, who confessed.

Juola completed his analysis in about half an hour. By contrast, in the early 1960s, it had taken a team of two statisticians—using what was then a state-of-the-art, high-speed computer at MIT—three years to complete a project to reveal who wrote 12 unsigned Federalist Papers.

Robert Leonard, who heads the forensic linguistics program at Hofstra University, has also made a career out of determining authorship. Certified to serve as an expert witness in 13 states, he has presented evidence in cases such as that of Christopher Coleman, who was arrested in 2009 for murdering his family in Waterloo, Illinois. Leonard testified that Coleman’s writing style matched threats spray-painted at his family’s home (photo, left). Coleman was convicted and is serving a life sentence.

Since forensic linguists deal in probabilities, not certainties, it is all the more essential to further refine this field of study, experts say. “There have been cases where it was my impression that the evidence on which people were freed or convicted was iffy in one way or another,” says Edward Finegan, president of the International Association of Forensic Linguists. Vanderbilt law professor Edward Cheng, an expert on the reliability of forensic evidence, says that linguistic analysis is best used when only a handful of people could have written a given text.

As forensic linguistics continues to make headlines, criminals may realize the importance of choosing their words carefully. And some worry that software also can be used to obscure distinctive written styles. “Anything that you can identify to analyze,” says Juola, “I can identify and try to hide.”

Get the latest Science stories in your inbox.