INNOVATION

Software Creates One Picture That Says It All

Researchers at UC Berkeley have created software that averages image searches into one artistic result

September 8, 2014

The AverageExplorer software aggregates thousands of wedding photos into representations of what the average shot looks like. Courtesy UC Berkeley

Every day, users upload more than 350 million photos to Facebook. This influx of images has led analysts to estimate that 10 percent of the world’s 3.5 trillion photos were taken in the last year. With all that data flooding the Web, a search for a particular image or object (what does an orange tabby cat look like, for example) returns an overwhelming number of results.

Last month, researchers at the University of California, Berkeley unveiled new software, AverageExplorer, that will allow users to see the “average” image that represents what they’re looking for. Rather than a picture worth a thousand words, it’s a picture worth a thousand—or more—pictures.

“When you enter a Google image search, you’ll be sifting through pages and pages of images,” explains Jun-Yan Zhu, UC Berkeley graduate student and lead author of the paper, presented at this year's International Conference and Exhibition on Computer Graphics and Interactive Techniques in Vancouver. “It’s huge and hard to summarize; you can’t get a sense of what’s happening.”

For the software’s initial version, Zhu and his team collected photographs through Flickr, Google and Bing image searches. The software is lightweight enough to run on an average desktop and can crunch some 10,000 images simultaneously.

Users refine their searches in a couple of different ways. They can sketch and color a shape, similar to drawing in Adobe Photoshop or Illustrator, to sharpen their average-image result. For example, coloring in the background of an average image of the Eiffel Tower can narrow the average to only shots taken at night. Or a user could draw angled lines to control the orientation of a butterfly in the composite.

Bridge of Sighs, From Day to Night — By refining the colors in an AverageExplorer image of the Bridge of Sighs, you can change the scene from day to dusk to night. Courtesy UC Berkeley
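The article does not spell out how AverageExplorer computes its results, so the following is only a minimal sketch of the general idea, assuming a simple weighted per-pixel average. Every name in it (`weighted_average`, `stroke_weights`, `sharpness`) is hypothetical, and the tiny grayscale grids stand in for real photos. It shows how painting part of the image dark could shift an average from mostly-day toward night by giving more weight to the photos that already match the stroke.

```python
import math


def weighted_average(images, weights):
    """Per-pixel weighted mean of same-sized grayscale images (values in [0, 1])."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    height, width = len(images[0]), len(images[0][0])
    return [
        [sum(w * img[y][x] for img, w in zip(images, weights)) / total
         for x in range(width)]
        for y in range(height)
    ]


def stroke_weights(images, stroke, sharpness=20.0):
    """Weight each image by how closely it matches the user's stroke.

    `stroke` maps (row, col) -> the brightness the user painted there.
    The exponential falloff is an assumption made for illustration only.
    """
    weights = []
    for img in images:
        # Mean squared difference between the photo and the painted pixels.
        err = sum((img[y][x] - v) ** 2 for (y, x), v in stroke.items()) / len(stroke)
        weights.append(math.exp(-sharpness * err))
    return weights


# Toy collection: top row is "sky", bottom row is the landmark.
day = [[0.9, 0.9], [0.5, 0.5]]
night = [[0.1, 0.1], [0.5, 0.5]]
images = [day, day, night]

# Plain average: every photo counts equally, so the sky comes out fairly bright.
plain = weighted_average(images, [1.0, 1.0, 1.0])

# The user paints the sky dark; photos with a dark sky now dominate the average.
stroke = {(0, 0): 0.1, (0, 1): 0.1}
refined = weighted_average(images, stroke_weights(images, stroke))

print("plain sky:", round(plain[0][0], 3))
print("refined sky:", round(refined[0][0], 3))
```

In this toy version the landmark pixels are unchanged by the refinement, because both kinds of photo agree on them; only the region where the photos disagree moves toward the user's stroke.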

Once an average image is created, a process that can take up to a minute, users can further refine the result using what the team calls Explorer Mode. In this mode, clicking on a certain part of an image—say, a cat’s nose—will reveal other common options or refinements for that spot—maybe blue or black noses, or ones that are rounded instead of angular. In a demo video, for example, the team refined an image of children on Santa’s lap by selecting for only images where Santa has one child on each arm.

Where the system will become especially powerful, says Zhu, is as a tool for training computer-vision algorithms, like those employed by Google Goggles or Amazon Firefly apps, which can identify what a camera is pointing at. “In the field of computer vision, people spend lots of money to annotate objects,” he explains. “Now you can apply the annotation to the average image. The idea is that you only need to work on one image to propagate all the images in a data set.”

Finding Cat Breeds — By refining the modes of a search result, researchers can find specific breeds of cat, including (from left to right) Ragdoll, Siamese, Maine Coon and Sphynx. Courtesy UC Berkeley

Creating artwork is the low-hanging fruit for AverageExplorer. The team drew inspiration from new-media artists like Jason Salavon, who has painstakingly created averaged photographs by hand. The software could also power a Facebook plug-in that lets users tinker with the average image of themselves.

The researchers’ aspirations are broader still. Sociologists could use the system to spot and research social trends; for example, an averaged image could show that brides most often stand to the right of the groom in wedding portraits. AverageExplorer might also be a useful tool for media analysts trying to dissect television coverage: does Stephen Colbert’s posture change when he’s talking about George W. Bush versus Barack Obama?

By allowing users to interact intuitively with visual data instead of struggling to enter the correct string of keywords, the software will help them bridge what Zhu’s advisor and AverageExplorer co-creator, Alexei Efros, calls the “language bottleneck.”

The team imagines a suite of custom tools designed for specific, hard-to-articulate tasks. A shopping application, for instance, would allow a user to spider the Web for a pair of heels with the exact color, heel shape and height that she’s after. Zhu also envisions a tool that integrates with police sketch artists’ workflow, allowing a witness to search facial databases for features that match the perpetrator’s and construct a composite portrait.

A basic version of AverageExplorer will be released this fall.

Corinne Iozzio

Corinne Iozzio is a New York–based technology writer and editor. When she’s not fiddling with LEGOs or Nerf blasters, she covers gadgets and emerging tech for various publications, including Popular Science and Scientific American.