Think fast, robot

Image courtesy of the researchers

A visual odometry algorithm combines low-latency brightness-change events from a Dynamic Vision Sensor (DVS) with data from a normal camera, which provides absolute brightness values. The left photograph shows the camera frame, and the right photograph shows the DVS events (displayed in red and blue) plus grayscale from the camera.

Date Published

May 30, 2014

Link to Original Article

MIT News

One of the reasons we don’t yet have self-driving cars and mini-helicopters delivering online purchases is that autonomous vehicles tend not to perform well under pressure. A system that can flawlessly parallel park at 5 mph may have trouble avoiding obstacles at 35 mph.
 
Part of the problem is the time it takes to produce and interpret camera data. An autonomous vehicle using a standard camera to monitor its surroundings might take about a fifth of a second to update its location. That’s good enough for normal operating conditions but not nearly fast enough to handle the unexpected.
 
Andrea Censi, a research scientist in MIT’s Laboratory for Information and Decision Systems, thinks the solution could be to supplement cameras with a new type of sensor called an event-based (or “neuromorphic”) sensor, which can take measurements a million times a second.
 
At this year’s International Conference on Robotics and Automation, Censi and Davide Scaramuzza of the University of Zurich present the first state-estimation algorithm — the type of algorithm robots use to gauge their position — to process data from event-based sensors. A robot running their algorithm could update its location every thousandth of a second or so, allowing it to perform much more nimble maneuvers.
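
To put those update rates in perspective, the short calculation below estimates how far a vehicle travels between two consecutive location updates. It is only a back-of-the-envelope sketch: the 35 mph speed and the two update periods come from the figures quoted above, while the function name and the constant-speed assumption are illustrative.

```python
MPH_TO_MPS = 0.44704  # 1 mile = 1609.344 m, 1 hour = 3600 s


def distance_between_updates(speed_mph: float, update_period_s: float) -> float:
    """Metres travelled during one update period, assuming constant speed."""
    return speed_mph * MPH_TO_MPS * update_period_s


# Standard camera pipeline: roughly one location update every fifth of a second.
camera_gap_m = distance_between_updates(35, 0.2)
# Event-based pipeline: roughly one update every thousandth of a second.
event_gap_m = distance_between_updates(35, 0.001)

print(f"{camera_gap_m:.2f} m vs {event_gap_m * 100:.1f} cm")  # 3.13 m vs 1.6 cm
```

At 35 mph, in other words, the vehicle covers about three meters between camera-based updates but less than two centimeters between event-based ones.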
 
“In a regular camera, you have an array of sensors, and then there is a clock,” Censi explains. “If you have a 30-frames-per-second camera, every 33 milliseconds the clock freezes all the values, and then the values are read in order.” With an event-based sensor, by contrast, “each pixel acts as an independent sensor,” Censi says. “When a change in luminance — in either the plus or minus direction — is larger than a threshold, the pixel says, ‘I see something interesting’ and communicates this information as an event. And then it waits until it sees another change.”

In the above animation, the platform's velocity is tracked by fusing the events from the event-based sensor (displayed here in red and blue) with the frames of a normal camera. (Courtesy of the researchers.)
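
The per-pixel behavior Censi describes can be sketched in a few lines. This is a simplified model, not the sensor's actual circuitry: the sampled brightness values, the threshold, and the function name `pixel_events` are all assumptions made for illustration.

```python
from typing import List, Tuple

Event = Tuple[float, int]  # (timestamp in seconds, polarity: +1 or -1)


def pixel_events(samples: List[Tuple[float, float]], threshold: float) -> List[Event]:
    """Emit an event whenever brightness moves more than `threshold`
    away from the level recorded at the previous event."""
    events: List[Event] = []
    if not samples:
        return events
    reference = samples[0][1]  # brightness level the pixel last "remembered"
    for t, brightness in samples[1:]:
        change = brightness - reference
        if abs(change) > threshold:
            events.append((t, 1 if change > 0 else -1))
            reference = brightness  # then wait for another change from here
    return events


# Hypothetical brightness readings for a single pixel: (time, brightness).
samples = [(0.000, 10.0), (0.001, 10.5), (0.002, 12.5), (0.003, 12.6), (0.004, 9.0)]
events = pixel_events(samples, threshold=2.0)
print(events)  # [(0.002, 1), (0.004, -1)]
```

Small fluctuations produce no output at all; only the two large changes are reported, one in each direction.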
 

Featured event

 
When a standard state-estimation algorithm receives an image from a robot-mounted camera, it first identifies “features”: gradations of color or shade that it takes to be boundaries between objects. Then it selects a subset of those features that it considers unlikely to change much with new perspectives.
 
Thirty milliseconds later, when the camera fires again, the algorithm performs the same type of analysis and starts trying to match features between the two images. This is a trial-and-error process, which can take anywhere from 50 to 250 milliseconds, depending on how dramatically the scene has changed. Once it’s matched features, the algorithm can deduce from their changes in position how far the robot has moved.
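
A toy version of that frame-to-frame matching might look like the following. It is a deliberately simplified sketch: real systems use much richer feature descriptors and more robust matching, and the scalar descriptor, the sample coordinates, and the function name `estimate_shift` are hypothetical. It also estimates only the shift of features in the image, not the robot's motion in the world.

```python
from statistics import median
from typing import List, Tuple

# (x, y, descriptor); the scalar descriptor is a stand-in for a real feature signature.
Feature = Tuple[float, float, float]


def estimate_shift(prev: List[Feature], curr: List[Feature]) -> Tuple[float, float]:
    """Match each previous feature to the current feature with the closest
    descriptor, then return the median displacement of the matched pairs."""
    dxs, dys = [], []
    for px, py, pdesc in prev:
        # Brute-force search: compare against every candidate in the new frame.
        cx, cy, _ = min(curr, key=lambda f: abs(f[2] - pdesc))
        dxs.append(cx - px)
        dys.append(cy - py)
    return median(dxs), median(dys)


prev_frame = [(10.0, 20.0, 0.11), (40.0, 25.0, 0.52), (70.0, 60.0, 0.93)]
curr_frame = [(13.0, 19.0, 0.10), (43.0, 24.0, 0.55), (73.0, 59.0, 0.90)]
shift = estimate_shift(prev_frame, curr_frame)
print(shift)  # (3.0, -1.0)
```

Every feature in one frame is compared against every candidate in the next, which hints at why matching becomes slower as the scene changes and the set of plausible candidates grows.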
 
Censi and Scaramuzza’s algorithm supplements camera data with events reported by an event-based sensor, which was designed by their collaborator Tobi Delbruck of the Institute for Neuroinformatics in Zurich. The new algorithm’s first advantage is that it doesn’t have to identify features: Every event is intrinsically a change in luminance, which is what defines a feature. And because the events are reported so rapidly — every millionth of a second — the matching problem becomes much simpler. There aren’t as many candidate features to consider because the robot can’t have moved very far.
 
Moreover, the algorithm doesn’t try to match all the features in an image at once. For each event, it generates a set of hypotheses about how far the robot has moved, corresponding to several candidate features. After enough events have accumulated, it simply selects the hypothesis that turns up most frequently.
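
The voting idea can be illustrated with a simple tally. This is a sketch of the general technique rather than the researchers' implementation: the discretized (dx, dy) motion hypotheses, the sample data, and the function name `vote` are assumptions.

```python
from collections import Counter
from typing import Iterable, List, Tuple

Hypothesis = Tuple[int, int]  # hypothetical discretized (dx, dy) motion


def vote(per_event_hypotheses: Iterable[List[Hypothesis]]) -> Hypothesis:
    """Return the motion hypothesis proposed by the largest number of events.

    Raises IndexError if no hypotheses were supplied.
    """
    tally: Counter = Counter()
    for hypotheses in per_event_hypotheses:
        tally.update(set(hypotheses))  # each event votes once per distinct hypothesis
    winner, _ = tally.most_common(1)[0]
    return winner


# Each inner list holds the candidate motions suggested by one event.
events_hypotheses = [
    [(1, 0), (0, 1)],
    [(1, 0), (2, 0)],
    [(1, 0), (0, 1), (1, 1)],
    [(0, 1), (1, 0)],
]
best = vote(events_hypotheses)
print(best)  # (1, 0)
```

No single event settles the question, but the hypothesis consistent with the true motion keeps reappearing while the spurious ones do not.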
 
In experiments involving a robot with a camera and an event-based sensor mounted on it, their algorithm proved just as accurate as existing state-estimation algorithms.

In the above animation, the temporal resolution of the event-based sensor is fine enough to capture the rotation of a quadrotor's propellers. (Courtesy of the researchers.)

Getting onboard

 
One of the inspirations for the new work, Censi says, was a series of recent experiments by Vijay Kumar at the University of Pennsylvania, which demonstrated that quadrotor helicopters — robotic helicopters with four sets of rotors — could perform remarkably nimble maneuvers. But in those experiments, Kumar gauged the robots’ location using a battery of external cameras that captured 1,000 exposures a second. Censi believes that his and Scaramuzza’s algorithm would allow a quadrotor with onboard sensors to replicate Kumar’s results.
 
Now that he and his colleagues have a reliable state-estimation algorithm, Censi says, the next step is to develop a corresponding control algorithm — an algorithm that decides what to do on the basis of the state estimates. That’s the subject of an ongoing collaboration with Emilio Frazzoli, a professor of aeronautics and astronautics at MIT.
 
“This work is very interesting,” says Roland Siegwart, a professor of autonomous systems at the Swiss Federal Institute of Technology in Zurich. “It is, to my knowledge, the first time such a neuromorphic dynamic vision sensor [DVS] has been integrated and evaluated on a mobile robot platform.”
 
“The DVS offers a novel type of sensing modality with very high bandwidth,” Siegwart continues. “This has quite some potential for specific high-speed robot motions, such as dynamic maneuvers with quadrotors with only on-board perception and control.”
 
But whether neuromorphic sensors will prove the most practical means of doing odometry remains to be seen, he cautions. “I have some doubts if a combination of a DVS with a standard camera can outperform, in quality and price, a system that tightly integrates an inertial measurement unit [a well-established technology that uses accelerometers and gyroscopes to gauge motion] with a camera.”

Reprinted with permission of MIT News.