SMART NEWS

Data Mining the Classics Clusters Women Authors Together, Puts Melville Out On a Raft

August 27, 2012

Can computers analyze literature? Depends on who you ask. Some literary types are taking offense at a new statistical tool that mines and clusters classic works. But it was one of their own – English professor Matthew Jockers of the University of Nebraska-Lincoln – who devised the new supercomputer-mediated literary analysis. Jockers’ macroanalysis method compares thousands of books in order to identify systems of influence, schools of thought or other groupings that human scholars might have missed.

“We need to go beyond our traditional practice of close reading and go out to a different scale,” he told NBC News. “The traditional practice of close reading allows us to look at the bark on the trees, while the macroanalytic allows us to see the whole forest.”

Jockers analyzed thousands of books from the late 18th and 19th centuries for their punctuation, word choice and overarching theme. The results give him a “book signal” for each work, which allows it to be compared and plotted alongside others. Melville, apparently, warrants his own aquatically themed cluster.
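To give a rough sense of what a “book signal” might look like in practice, here is a minimal Python sketch. It is not Jockers’ actual method, which the article does not detail. It assumes a signal is just the relative frequency of a few hand-picked words and punctuation marks, and it compares two signals with cosine similarity. The feature list, the function names and the three toy sentences are all invented for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical feature set: a few common words and punctuation marks.
FEATURES = ["the", "and", "of", "she", "he", "sea", ",", ";", "!"]

def book_signal(text):
    """Return the relative frequency of each feature among the text's tokens."""
    # Tokens are lowercase word runs or one of the punctuation marks , ; !
    tokens = re.findall(r"[a-z']+|[,;!]", text.lower())
    counts = Counter(tokens)
    total = len(tokens) or 1  # avoid dividing by zero on empty text
    return [counts[f] / total for f in FEATURES]

def cosine_similarity(a, b):
    """Cosine of the angle between two signals (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

# Toy "books", made up for this example.
sea_a = "The sea, the sea; he sailed and the sea rolled on!"
sea_b = "He watched the sea, and the sea watched him."
parlor = "She spoke of the ball, and she smiled; she hoped."

sig_a, sig_b, sig_p = (book_signal(t) for t in (sea_a, sea_b, parlor))

print("sea_a vs sea_b: ", round(cosine_similarity(sig_a, sig_b), 3))
print("sea_a vs parlor:", round(cosine_similarity(sig_a, sig_p), 3))
```

In this toy setup the two sea-themed passages score closer to each other than either does to the parlor passage, which is the kind of grouping a plot of many such signals would make visible. A real analysis would use far more features and thousands of full texts.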

Image: Matthew Jockers / University of Nebraska-Lincoln

A few patterns emerged. Female authors, for example, were grouped together even though the computer did not take gender into account when placing them. This suggests that, on the whole, female authorship is detectable by objective measures rather than by human intuition alone.

The darker-colored areas represent groups of women authors. Matthew Jockers / University of Nebraska-Lincoln

While some scholars feel threatened by the new method, Jockers points out that his high-level approach could lend new perspective and prompt fresh investigation into the classics and other literary works. And while his analysis reveals trends such as the clustering of female authors, it does not tease out some intricacies better left to human minds. For example, a few of the best-known works by women, like Jane Austen’s greatest hits, were not nestled in the female-clustered group. Pointing that out and examining the meaning behind it is a job best done by humans, he says.