In the future, computers will be our doctors, our soldiers, our firefighters and our teachers. They’ll diagnose diseases, nurture our babies, protect our homes and teach our kids. One company is already developing an essay-grading computer program that can take the load off professors and standardized test graders. But can a computer really rate a carefully crafted essay?
The company, edX, certainly thinks so. They already offer online courses to institutions, along with an artificial intelligence system that will grade student essays. John Markoff at the New York Times explains:
Anant Agarwal, an electrical engineer who is president of EdX, predicted that the instant-grading software would be a useful pedagogical tool, enabling students to take tests and write essays over and over and improve the quality of their answers. He said the technology would offer distinct advantages over the traditional classroom system, where students often wait days or weeks for grades.
“There is a huge value in learning with instant feedback,” Dr. Agarwal said. “Students are telling us they learn much better with instant feedback.”
Whether or not that instant feedback is high quality is another question. Skeptics of these computer graders aren’t hard to find. One group, which calls itself Professionals Against Machine Scoring of Student Essays in High-Stakes Assessment, issued a statement and is collecting signatures to speak out against handing the task of grading over to a computer. They write:
Let’s face the realities of automatic essay scoring. Computers cannot “read.” They cannot measure the essentials of effective written communication: accuracy, reasoning, adequacy of evidence, good sense, ethical stance, convincing argument, meaningful organization, clarity, and veracity, among others.
The petition asks for legislators to stop relying on computers for grading and for schools to stop buying into the automated scoring systems.
Agarwal acknowledges that the software could be better and hopes that it will improve at distinguishing a good essay from a mediocre one. “This is machine learning and there is a long way to go, but it’s good enough and the upside is huge,” he told the New York Times. Also, he says, anyone who thinks teachers are consistent is fooling herself. “We found that the quality of the grading is similar to the variation you find from instructor to instructor.”
In fact, some studies have suggested that computers and teachers produce the same sort of variability in scores. One study by Mark Shermis at the University of Akron concluded that “automated essay scoring was capable of producing scores similar to human scores for extended-response writing items with equal performance for both source-based and traditional writing genre.” Shermis’s study, however, was never published in a journal, and other researchers have questioned its claims. Les C. Perelman from MIT wrote a response to the Shermis paper, arguing that “a close examination of the paper’s methodology and the datasets used demonstrates that such a claim is not supported by the data in the study.”
The group of professionals also cites several papers suggesting that computers aren’t as good as teachers at evaluating students.
Most likely, this is a question of whether these computers are good enough at grading yet, not whether they will ever be. But it’s not just teachers who will get more high tech; students will too. If students learn what the program is looking for, they could simply write a program themselves that in turn writes the perfect essay based on the software’s specifications. Perhaps in the future, computerized teachers will be grading computerized students.
More from Smithsonian.com: