Helping Delivery Robots Find Your Front Door

With a new navigation system from MIT, robots can decipher common landscape features, even in an unfamiliar environment

For last-mile delivery, robots of the future may use a new MIT algorithm to beat a path to your front door. MIT News

Delivery robots, once a sci-fi fantasy, became a reality this year, rolling along university campus sidewalks and suburban California streets, bringing pizza and Amazon packages right to customers’ front doors. They're increasingly being seen as a solution for "last-mile delivery"—the part of the supply chain where goods are moved from a local transportation hub or warehouse to their final destination. This last leg is notoriously inefficient, causing traffic congestion and releasing outsize amounts of pollution. Robots, many think, could be a solution.

But how do robots find the door? It’s not always simple. GPS can take the robot to the right address, but it can’t tell the robot whether the door is to the left of the garage or at the end of the garden path.

That’s why researchers at MIT have developed a new robot navigation system. The system involves training the robots to recognize environmental features like driveways and mailboxes and to learn which features are likely to lead to a door.

“It’s kind of unreasonable to expect you’d have a detailed map of every single environment your robot was going to operate in,” says Michael Everett, a graduate student in MIT’s department of mechanical engineering who worked on the research. Instead, the team asked, “how do you drive around and find objects when you don’t have a map ahead of time?”

The answer involves using an algorithm that pulls features—"door" or "stairs" or "hedge"—from pictures and builds new maps of the environment as the robot moves. The maps combine a semantic label (i.e., "door") with a depth image. The algorithm allows the robots to make decisions based on the maps, which helps them reach their destination more quickly.
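The idea of fusing a semantic label with a depth reading can be sketched in a few lines. This is not the MIT implementation—just a hypothetical illustration of how a labeled detection at a known distance might be placed into a top-down grid map; the grid size and update rule are assumptions.

```python
import math

GRID = 20  # hypothetical map size in cells, 1 cell = 1 meter

def update_map(grid, robot_xy, heading, label, depth):
    """Place a labeled feature into the map at the point the robot sees it.

    robot_xy: (x, y) cell of the robot; heading: radians;
    label: semantic class such as "door"; depth: distance in meters.
    """
    # Project the detection from the robot's viewpoint into map coordinates.
    x = robot_xy[0] + depth * math.cos(heading)
    y = robot_xy[1] + depth * math.sin(heading)
    cx, cy = int(round(x)), int(round(y))
    if 0 <= cx < GRID and 0 <= cy < GRID:
        grid[cy][cx] = label  # overwrite with the newest observation
    return grid

semantic_map = [[None] * GRID for _ in range(GRID)]
# Robot at cell (5, 5), facing east, sees a "door" 3 meters ahead.
update_map(semantic_map, (5, 5), 0.0, "door", 3.0)
```

As the robot drives, repeated updates like this gradually fill in a map of the scene it has seen so far—no prior map required.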

The researchers trained the algorithm on satellite maps from Bing. The maps showed 77 houses from three suburban neighborhoods and one urban one. Everett color-coded the maps based on feature—sidewalks yellow, driveways blue, hedges green, doors gray. He trained the program using both complete images of the landscape and images that were partly covered, since a moving robot will often have its view partially obscured by street features, cars or pedestrians.
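The two training steps described above—color-coding features and partially covering images—might look something like the following sketch. The color assignments match those in the article; the occlusion function and its parameters are illustrative assumptions.

```python
import random

# Color scheme from the article: sidewalks yellow, driveways blue,
# hedges green, doors gray.
COLOR_FOR_LABEL = {
    "sidewalk": "yellow",
    "driveway": "blue",
    "hedge": "green",
    "door": "gray",
}

def color_code(label_grid):
    """Turn a grid of semantic labels into a grid of training colors."""
    return [[COLOR_FOR_LABEL.get(lbl) for lbl in row] for row in label_grid]

def occlude(image, fraction=0.3, rng=None):
    """Blank out a random fraction of cells, mimicking a view partially
    blocked by street features, cars, or pedestrians."""
    rng = rng or random.Random(0)  # fixed seed for reproducibility
    return [[None if rng.random() < fraction else px for px in row]
            for row in image]

labels = [["sidewalk", "driveway"], ["hedge", "door"]]
colored = color_code(labels)
partly_hidden = occlude(colored)
```

Training on both the complete and the occluded versions teaches the model to infer what lies behind the obstruction.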

Everett and his team then developed a “cost-to-go estimator” algorithm for choosing a path of maximum efficiency (and thus minimum "cost"). This algorithm created a second map, this one in greyscale. On the map, darker locations are farther from the goal and lighter locations are closer. A road or sidewalk might be dark, while a driveway grows lighter the closer it gets to the front door. The front door—the destination—is the lightest point of all. This cost-to-go estimator map helps a robot make informed decisions on the fly.
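To make the greyscale map concrete, here is a minimal sketch that computes an exact cost-to-go on a small known grid: breadth-first distance from the goal, rescaled so the door is lightest (1.0) and the farthest reachable cell is darkest. The MIT system learns to *estimate* these values for unseen houses; this toy version, with its grid and scaling, is an assumption made purely to show what the map encodes.

```python
from collections import deque

def cost_to_go(grid, goal):
    """grid: 2D list where True = traversable; goal: (row, col) of the door.
    Returns a dict mapping each reachable cell to a brightness in (0, 1]."""
    dist = {goal: 0}
    queue = deque([goal])
    while queue:  # breadth-first search outward from the goal
        r, c = queue.popleft()
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if (0 <= nr < len(grid) and 0 <= nc < len(grid[0])
                    and grid[nr][nc] and (nr, nc) not in dist):
                dist[(nr, nc)] = dist[(r, c)] + 1
                queue.append((nr, nc))
    farthest = max(dist.values()) or 1
    # Brightness: 1.0 at the door, fading toward 0.0 far away.
    return {cell: 1.0 - d / (farthest + 1) for cell, d in dist.items()}

walkable = [[True, True, True],
            [True, False, True],   # False marks an obstacle, e.g. a hedge
            [True, True, True]]
brightness = cost_to_go(walkable, goal=(0, 2))  # door at top-right corner
```

A robot following this map simply steps toward brighter neighboring cells, and the brightening trail leads it to the door.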

Planning Beyond the Sensing Horizon Using a Learned Context

The team tested the algorithms using a simulation of a house that hadn’t appeared in the training images. They found that their technique helped find the front door 189 percent faster than traditional navigation algorithms, which rely on complete maps and specific GPS coordinates. While the algorithms that currently drive most delivery robots do generally get them to the destination, they're not always efficient.

"This MIT navigation system is an important step in this overall direction of faster real-time navigation and delivery," says Mohit Bansal, a professor of computer science at the University of North Carolina at Chapel Hill who was not involved in the research.

Bansal says the next hurdle for developers of delivery robot systems will be to enable robots to handle longer commands, including commands with negation (such as "don't go to the side door"). Another challenge will be developing robots that can ask questions if they get lost or confused.

The MIT team hopes that their algorithm could one day be used to help robots find things in completely unfamiliar environments. Imagine a robot that could understand the command “find my shoes” or “take this letter to the nearest post office.”

“My vision there is that all our robots are going to be able to just understand really casual human instructions like, ‘hey, robot, go grab a coffee for me,’” Everett says.

Everett presented his findings earlier this month at the International Conference on Intelligent Robots and Systems in Macau. It was a finalist for a "best paper award" in cognitive robotics, a prize given to promote "advancements of cognitive robotics in industry, home applications, and daily life." The work was partially funded by the Ford Motor Company, which is developing its own delivery robot programs.

Currently, the navigation system works best in environments with a lot of structure. The suburban neighborhoods on the training maps tend to have predictable features: sidewalks leading to driveways leading to front doors.

“If you’ve been to one house, you have a pretty good idea of what the other houses look like,” Everett says.

This means the navigation system would likely work well in ordered environments like hotel corridors or airport terminals, but perhaps would have more trouble in, say, a historic city center where buildings are built in dramatically different styles.

“At the end of the day, we want to see if the algorithm can handle the uncertainties and noise that the real world has,” Everett says.

We’ll be waiting right here for that robot-fetched cup of coffee.
