With a camera in every pocket and facial recognition software built into our smartphones and social networks, it’s sometimes easy to forget that taking photos and identifying the faces in them were not always so simple. Whether an old photo has been lost, damaged or simply left unlabeled, identifying the people in it can be tedious. But, as Kurt Luther, assistant professor of computer science at Virginia Tech, found out, the experience can also be quite moving.
Luther was at the "Pennsylvania’s Civil War" exhibit at Pittsburgh’s Senator John Heinz History Center in 2013 when he stumbled across a photo of his great-great-great uncle, Oliver Croxton. He has described looking at the photo, which was the oldest family photo he’d seen, as traveling through time.
Luther was already a history buff with an interest in the American Civil War, and the moment stuck with him. He began to wonder how to bring the same experience to thousands of other history enthusiasts.
“I started learning more about Civil War photography,” says Luther, “and about how to identify [people] using different visual clues, like the uniform, insignia or the photographer’s studio information. Meanwhile, I was doing a lot of research in the area of crowdsourcing as a computer science professor, and thought maybe there’s a way to bring these two things together.”
The result is a free online tool called Civil War Photo Sleuth that uses crowdsourcing and facial recognition to help users identify unknown subjects in Civil War era photographs. Just before its official release in 2018, the technology won both the top prize of $25,000 in Microsoft’s Cloud AI Research Challenge for its use of Microsoft’s facial recognition software and the Best Demo Award at the Human Computation and Crowdsourcing 2018 conference. This week, Luther is presenting at the Association for Computing Machinery's Intelligent User Interfaces conference in Los Angeles.
Designed with the help of doctoral and undergraduate students at Virginia Tech, including project lead Vikram Mohanty, and in collaboration with Virginia Tech’s history department, Photo Sleuth uses a multi-pronged approach to suggest the most accurate identifications.
The first crucial step in the process was building a large database of already identified photos. To date, Photo Sleuth has roughly 17,000 identified photos, from national archives like the U.S. Military History Institute as well as private collections, that include not only Civil War soldiers but also civilians and other military personnel of the era.
Luther says that they were fortunate to have the support of an already enthusiastic community of Civil War historians with access to these photos, because without a solid base of already identified photos it would have been almost impossible for the software to be useful.
“It’s not like in Field of Dreams,” says Luther. “If we had launched the site with no images and just hoped that users would add them all, we would face the cold start problem where you just don’t have any content.”
The database of identified photos serves an essential role in helping users identify photos they upload themselves. Users manually tag special visual traits, such as coat color, facial hair or military rank insignia, and the photo passes through a facial recognition algorithm to analyze and log unique face ratios, such as the distance between facial landmarks like the nose and eyes. Photo Sleuth compares the visual data of the unknown photo to already identified photos in the database and presents the user with what it thinks are the best matches based on facial similarity and on information derived from the other metadata, such as soldiers who appear to be in the same unit based on the insignia on their uniforms. While the software takes deliberate steps along the way to limit the possibility of a false identification, Luther says that at the end of the day it is up to the user to make the final identification when presented with the software’s best guesses.
“We were very concerned about preventing false identification,” says Luther, “because when you’re talking about the internet, once you put some wrong information out there it’s very difficult to get rid of it or change it.”
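For readers curious how such a matching step might look in practice, here is a minimal sketch in Python. It is not Photo Sleuth's actual code: the `Photo` record, the three-number `face_vector`, the tag names and the cosine-similarity scoring are all hypothetical stand-ins chosen for illustration, and the sample soldiers are invented. The sketch assumes a simple two-step process, first narrowing candidates by the user's visual tags and then ranking the rest by facial similarity, with the final decision left to the user.

```python
from dataclasses import dataclass, field
from math import sqrt


@dataclass
class Photo:
    """A portrait record; every field here is a hypothetical stand-in."""
    name: str                      # identified person, or "" if unknown
    face_vector: list[float]       # numeric face features from some recognizer
    tags: set[str] = field(default_factory=set)  # user-supplied visual clues


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def suggest_matches(unknown, database, top_k=3):
    """Rank identified photos that carry every tag of the unknown photo."""
    # Step 1: keep only candidates consistent with the user's visual tags.
    # (Requiring all tags to match is an assumption made for this sketch.)
    candidates = [p for p in database if unknown.tags <= p.tags]
    # Step 2: score the remaining candidates by facial similarity.
    scored = [(cosine_similarity(unknown.face_vector, p.face_vector), p.name)
              for p in candidates]
    scored.sort(reverse=True)
    # Step 3: return suggestions only; the user makes the final call.
    return scored[:top_k]


# Invented sample data, for illustration only.
database = [
    Photo("Soldier A", [0.9, 0.1, 0.3], {"union", "second lieutenant"}),
    Photo("Soldier B", [0.2, 0.8, 0.5], {"union", "second lieutenant"}),
    Photo("Soldier C", [0.9, 0.1, 0.3], {"confederate", "captain"}),
]
unknown = Photo("", [0.88, 0.12, 0.31], {"union", "second lieutenant"})

suggestions = suggest_matches(unknown, database)
for score, name in suggestions:
    print(f"{name}: {score:.3f}")
```

In this toy data, "Soldier C" has a face vector identical to the top candidate's but is filtered out by the uniform tags, which illustrates why combining human-supplied clues with a face score can cut down on false matches.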
To ensure that the software was providing users with the best possible suggested identifications, Luther analyzed the software’s first month of proposed identifications using methods outlined in columns he has contributed to the Civil War history magazine Military Images. Rating each identification on a four-point scale (‘definitely not a match,’ ‘probably not a match,’ ‘probably a match’ or ‘definitely a match’), he found that 85 percent of the proposed identifications were either probably or definitely a match.
At the conference this week, Luther says he plans to highlight the findings of the team’s most recent research on Photo Sleuth, including a discovery made by Dave Morin, a collector of New Hampshire Civil War images, about a portrait of an unidentified Union second lieutenant. Photo Sleuth suggested the man in the portrait was William H. Baldwin of the 1st New York Engineers. Morin, who confirmed that Baldwin was a New Hampshire native, says that he never would have found the Granite State engineer in question without the help of Photo Sleuth.
The research also emphasizes the complementary strengths of human historians and the software itself. Despite the team’s best efforts, Luther says that the software can only go so far when identifying correct matches and relies on users to help identify clues that are in the facial algorithm’s blind spots.
“[The algorithm] is trained for general face recognition [on] mostly modern images,” says Luther. “The A.I. has a tough time when a face is turned to the side [in profile]. It’s kind of an unusual portrait by today’s standards, but in the mid-19th century it was common.”
The team also found that users were much more successful than the algorithm alone at identifying other unique markers like beards and scars.
Patrick Lewis, a Civil War historian and managing editor of scholarly resources and publication at the Kentucky Historical Society who has not been a part of Photo Sleuth’s development, says that Civil War Photo Sleuth will be a great tool not only for bringing these forgotten stories to life but also for continuing to build a collaborative network of Civil War historians around the country.
“I like to go in and look at the new Kentucky tagged photos,” says Lewis. “[And ask] who are the people that are out there collecting? Are there individual collectors I should be aware of, and should I get in touch with them to see if they have any other materials that might be of research interest?”
While he has yet to connect with any individual collectors through Photo Sleuth, Lewis says that the Kentucky Historical Society itself has worked to build records of known online archives and that software like Photo Sleuth would dramatically improve its capability to continue that work.
Going forward, Luther says they’re looking to “double down on [the] human strengths” of the software, including adding a “Second Opinion” option that will let multiple users collaborate on the final identification of a photo, as well as working on expanding physical outreach and community management to grow the userbase of Photo Sleuth. The software will also see some face lifts, including a new function that will allow users to upload and identify people in a group photo.
“Our ultimate goal is to identify every unknown Civil War photo,” says Luther, “and get [Photo Sleuth] bigger and better, because 25,000 images is just a drop in the bucket.”