Digitizing the Hanging Court
Cutpurses! Blackguards! Fallen women! The Proceedings of the Old Bailey is an epic chronicle of crime and vice in early London. Now anyone with a computer can search all 52 million words
- By Guy Gugliotta
- Smithsonian magazine, April 2007
It is now possible to place software "tags" in large bodies of digitized data, allowing researchers to find something simply by asking the computer to retrieve it. Such high-speed searches have been used not only to sort archives but also to search telephone records, catalog fingerprints or accomplish virtually any other task requiring navigation of immense masses of data. But it wasn't that way when Shoemaker and Hitchcock began their careers in the late 1980s.
"When I interviewed for my first lectureship, they asked me if I could teach 'computing in history,'" says Hitchcock. "I said 'yes' because I wanted the job, even though it wasn't true. On the computers of that time they had developed programs that allowed you to flit from page to page. You could see the potential, but not the mechanism."
Hitchcock, who is from San Francisco, and Shoemaker, who grew up in Oregon, met in 1982 as doctoral candidates in the Greater London Record Office in the basement of County Hall. Both were interested in what Hitchcock calls "history from below"—he was writing a dissertation on English workhouses in the 18th century, and Shoemaker was studying the prosecution of petty crime in the Greater London area during the same period. The two helped edit a book of essays published in 1992, then developed a tutorial on 18th-century English towns on CD-ROM in the mid-1990s. Within a few years, the Internet had provided the "mechanism" Hitchcock needed. "The Old Bailey proceedings seemed a natural," he says.
The pair conceived the idea of digitizing them early in 1999, then spent a year doing background research and writing grant proposals. They got $510,000 from the Arts and Humanities Research Council, the British equivalent of the National Endowment for the Humanities, and $680,000 from the New Opportunities Fund, established for "digitization of learning materials." The universities of Sheffield and Hertfordshire contributed staff, equipment and space.
"It was an enormous amount of money, and we were lucky," Shoemaker says. They enlisted Sheffield's Humanities Research Institute to customize software for searching the Proceedings, but first they needed a digitized copy of the text.
There was no easy way to get one. Technology in 2000 wasn't sufficiently sophisticated to scan words off microfilm; even if it had been, the vagaries of 18th-century printed text, rife with broken fonts and ink "bleed-throughs" from the other side of the page, would have made the technique impossible to use.
So the researchers hired someone to take digital photographs of all 60,000 microfilm pages, then sent the images on CD-ROMs to India. There, in a process known as double re-keying, two teams of typists typed the entire manuscript independently, then fed the copies into a computer that highlighted discrepancies, which had to be corrected manually. That took two years and cost nearly half a million dollars. Then Shoemaker and Hitchcock assembled a team of researchers to embed the entire manuscript with over 80 different computer "tags," permitting searches by such categories as first name, surname, age, occupation, crime, crime location, verdict and punishment.
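For readers curious how a computer can "highlight discrepancies" between two typed copies, here is a minimal sketch in Python. It is not the software the project used, which the article does not describe. The function name `find_discrepancies` and the two sample sentences are made up for illustration, and the comparison relies on the standard library's `difflib`.

```python
from difflib import SequenceMatcher


def find_discrepancies(copy_a: str, copy_b: str):
    """Return pairs of word runs where two independent transcriptions disagree."""
    words_a = copy_a.split()
    words_b = copy_b.split()
    # autojunk=False keeps difflib from ignoring very common words in long texts.
    matcher = SequenceMatcher(a=words_a, b=words_b, autojunk=False)
    discrepancies = []
    for op, a_start, a_end, b_start, b_end in matcher.get_opcodes():
        if op != "equal":
            # One side may be empty if a typist dropped or added words.
            discrepancies.append((words_a[a_start:a_end], words_b[b_start:b_end]))
    return discrepancies


# Invented sample lines, standing in for the two typists' versions of one passage.
typist_one = "John Smith was indicted for stealing a silver watch"
typist_two = "John Smyth was indicted for stealing a silver watch"

for version_a, version_b in find_discrepancies(typist_one, typist_two):
    print(version_a, "vs", version_b)
```

The idea behind double keying is that two typists working independently are unlikely to make the same mistake in the same place, so any spot where the copies differ is a likely error for a person to check against the page image.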
The Proceedings went on-line in stages between 2003 and 2005. The Sheffield techs refine and update the software continually, recently adding links to maps to help people locate crime scenes more effectively. Their next task is to link stolen objects mentioned in the Proceedings to images of them in the Museum of London.
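The category tags described earlier can be pictured with a small Python sketch. The tag names below (`surname`, `occupation`, `crime`, `verdict`, `punishment`) echo the categories the article lists, but the markup itself, the sample trial and the `search` function are assumptions made for illustration, not the project's actual tagging scheme or search software.

```python
import xml.etree.ElementTree as ET

# An invented trial record with hypothetical tags wrapped around key details.
SAMPLE = """<trial>
  <defendant><given>Mary</given> <surname>Jones</surname></defendant>, a
  <occupation>servant</occupation>, was indicted for <crime>theft</crime>.
  Verdict: <verdict>guilty</verdict>.
  Punishment: <punishment>transportation</punishment>.
</trial>"""


def search(records, **criteria):
    """Return parsed records in which every requested tag holds the requested value."""
    matches = []
    for xml_text in records:
        root = ET.fromstring(xml_text)
        if all(
            any(element.text == value for element in root.iter(tag))
            for tag, value in criteria.items()
        ):
            matches.append(root)
    return matches


# Find trials tagged as theft that ended in a guilty verdict.
found = search([SAMPLE], crime="theft", verdict="guilty")
print(len(found))
```

Because the categories are marked in the text itself, a search for a crime or a verdict does not depend on guessing which words the original printer happened to use around it.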