Computer scientists at UCL have developed software that enables cameras to capture the 3D shape of objects through a single lens.
The method, unveiled at the 2017 Conference on Computer Vision and Pattern Recognition in Hawaii, allows any camera to map the depth of every pixel it captures. It opens up a wide range of applications, from augmented reality in computer games and apps to robot interaction and self-driving cars. The software can even convert historical images and videos into 3D.
“Inferring object-range from a simple image by using real-time software has a whole host of potential uses,” explained supervising researcher Dr Gabriel Brostow. “Depth mapping is critical for self-driving cars to avoid collisions, for example. Currently, car manufacturers use a combination of laser-scanners and/or radar sensors, which have limitations. They all use cameras too, but the individual cameras couldn’t provide meaningful depth information. So far, we’ve optimised the software for images of residential areas, and it gives unparalleled depth mapping, even when objects are on the move.”
The software was developed using machine learning methods and has been trained and tested in outdoor and urban environments. It successfully estimates depths for thin structures such as street signs and poles, as well as for people and cars, and predicts a dense depth map for each 512 × 256 pixel image at more than 25 frames per second.
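To put those figures in perspective, the short calculation below works out how much time each frame allows and how many per-pixel depth estimates that implies per second. It is only back-of-the-envelope arithmetic on the numbers quoted above: "over 25 frames per second" is treated as exactly 25, so the results are a lower bound on throughput and an upper bound on per-frame time, and the variable names are illustrative rather than taken from the UCL code.

```python
# Figures quoted in the article; "over 25 fps" is assumed to be exactly 25 here.
WIDTH, HEIGHT = 512, 256
FPS = 25

# A dense depth map holds one depth value per pixel.
pixels_per_frame = WIDTH * HEIGHT

# Maximum processing time available for each frame at this rate.
frame_budget_ms = 1000 / FPS

# Depth estimates produced every second at this rate.
depth_values_per_second = pixels_per_frame * FPS

print(f"{pixels_per_frame} depth values per frame")
print(f"{frame_budget_ms} ms available per frame")
print(f"{depth_values_per_second} depth values per second")
```

Under these assumptions, each frame carries 131,072 depth values and must be processed in at most 40 ms, which amounts to roughly 3.3 million depth estimates per second.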
Existing depth-mapping systems rely either on bulky binocular stereo rigs or on a single camera paired with a laser or light-pattern projector. The laser and projector setups work poorly outdoors, because objects move too fast and sunlight overwhelms the projected patterns.
According to UCL, other machine-learning-based systems also try to recover depth from single photographs, but they are trained in different ways, and some need high-quality depth information that is hard to obtain. The new technology does not need real-life depth datasets for training, and is claimed to outperform all the other systems. Once trained, it runs in the field by processing one ordinary single-lens photo after another.
“Understanding the shape of a scene from a single image is a fundamental problem. We aren’t the only ones working on it, but we have got the highest quality outdoors results, and are looking to develop it further to work with 360 degree cameras,” said UCL PhD student Clément Godard. “A 360 degree depth map would be fantastically useful – it could drive wearable tech to assist disabled people with navigation, or to map real-life locations for virtual reality gaming, for example.”
The team has patented the technology for commercial use through UCL Business, but has made the code free for academic use. The research was funded by the Engineering and Physical Sciences Research Council.