Robot perception enhanced with hearing

Carnegie Mellon University researchers have found that robot perception could improve markedly by adding hearing to the machine’s sensing skills.  

Carnegie Mellon University researchers devised an apparatus called Tilt-Bot to build a collection of actions, video and sound to improve robot perception. Objects were placed in a tray attached to a robot arm, which then moved the tray randomly while recording video and audio (Image: Carnegie Mellon University)

Researchers at CMU's Robotics Institute found that sounds could help a robot differentiate between objects, such as a metal screwdriver and a metal wrench. According to CMU, hearing could similarly help robots determine what type of action caused a sound and help them use sounds to predict the physical properties of new objects.

"A lot of preliminary work in other fields indicated that sound could be useful, but it wasn't clear how useful it would be in robotics," said Lerrel Pinto, who recently earned his Ph.D. in robotics at CMU. He and his colleagues found the performance rate was quite high, with robots that used sound successfully classifying objects 76 per cent of the time.

The results were so encouraging, he added, that it might prove useful to equip future robots with instrumented canes, enabling them to tap objects they want to identify.

The researchers presented their findings during the virtual Robotics: Science and Systems conference. Other team members included Abhinav Gupta, associate professor of robotics, and Dhiraj Gandhi, a former master's student who is now a research scientist at Facebook Artificial Intelligence Research's Pittsburgh lab.

To perform their study, the researchers created a large dataset, simultaneously recording video and audio of 60 common objects as they slid or rolled around a tray. They have since released this dataset, cataloguing 15,000 interactions, for use by other researchers.
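
The released format isn't described in the article, so the loader below assumes a hypothetical directory layout (one folder per object, one subfolder per interaction, with fixed file names) purely for illustration.

```python
from dataclasses import dataclass
from pathlib import Path

@dataclass
class Interaction:
    object_name: str   # which of the 60 objects was in the tray
    audio_path: Path   # microphone recording of the interaction
    video_path: Path   # synchronised camera footage
    action_path: Path  # logged tray motion (tilt direction and magnitude)

def load_interactions(root: Path) -> list[Interaction]:
    """Collect every interaction under root/<object>/<interaction_id>/."""
    interactions = []
    for obj_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        for trial in sorted(p for p in obj_dir.iterdir() if p.is_dir()):
            interactions.append(Interaction(
                object_name=obj_dir.name,
                audio_path=trial / "audio.wav",
                video_path=trial / "video.mp4",
                action_path=trial / "action.json",
            ))
    return interactions
```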

The team captured these interactions using an experimental apparatus they called Tilt-Bot: a square tray attached to the arm of a Sawyer robot. The robot spent a few hours moving the tray in random directions with varying levels of tilt while cameras and microphones recorded each action. The researchers also collected some data beyond the tray, using Sawyer to push objects across a surface.
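
That collection procedure can be sketched as a simple loop: pick a random tilt, record, repeat. The TrayArm and Recorder classes below are invented stand-ins so the sketch runs on its own; the real rig drove a Sawyer arm through its own SDK, and the tilt range shown is an assumption.

```python
import random
import time

class TrayArm:
    """Hypothetical stand-in for the tray-holding robot arm."""
    def tilt_tray(self, roll: float, pitch: float) -> None:
        print(f"tilting tray: roll={roll:+.1f} deg, pitch={pitch:+.1f} deg")

class Recorder:
    """Hypothetical stand-in for synchronised audio/video capture."""
    def start(self) -> None:
        pass
    def stop(self) -> dict:
        return {"audio": b"", "video": b""}  # captured clips would go here

def collect_session(arm: TrayArm, recorder: Recorder, n_actions: int = 100) -> list:
    """Tilt the tray in random directions, recording each interaction."""
    interactions = []
    for _ in range(n_actions):
        roll = random.uniform(-15.0, 15.0)   # assumed tilt range, degrees
        pitch = random.uniform(-15.0, 15.0)
        recorder.start()
        arm.tilt_tray(roll, pitch)
        time.sleep(0.5)                      # let the object slide or roll
        clip = recorder.stop()
        interactions.append({"action": (roll, pitch), **clip})
    return interactions

collect_session(TrayArm(), Recorder(), n_actions=3)
```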

Notably, the team found that a robot could use what it learned about the sound of one set of objects to make predictions about the physical properties of previously unseen objects.
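
In code terms, that transfer amounts to reusing an audio feature extractor trained on the known objects to predict a property (say, mass) of a new one. The sketch below substitutes a crude hand-built embedding and dummy data for the learned features, so it illustrates only the workflow, not the team's method.

```python
import numpy as np
from sklearn.linear_model import Ridge

def audio_embedding(waveform: np.ndarray, n_bands: int = 16) -> np.ndarray:
    """Crude stand-in for a learned embedding: band-averaged FFT magnitudes."""
    spectrum = np.abs(np.fft.rfft(waveform))
    return np.array([band.mean() for band in np.array_split(spectrum, n_bands)])

# Fit a property predictor on sounds from objects heard during training...
rng = np.random.default_rng(0)
train_clips = [rng.standard_normal(16000) for _ in range(20)]  # dummy 1 s clips
train_masses = rng.uniform(0.05, 1.0, size=20)                 # dummy labels, kg
reg = Ridge().fit(np.stack([audio_embedding(c) for c in train_clips]), train_masses)

# ...then apply it to the sound of an object the model has never heard.
new_clip = rng.standard_normal(16000)
predicted_mass = reg.predict(audio_embedding(new_clip)[None, :])
```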

"I think what was really exciting was that when it failed, it would fail on things you expect it to fail on," Pinto said in a statement. For instance, a robot couldn't use sound to tell the difference between a red block or a green block. "But if it was a different object, such as a block versus a cup, it could figure that out."