Researchers at Purdue University have developed a new program that uses artificial intelligence to transform 2D images into 3D shapes.
Known as SurfNet, the program could have applications in robotics and autonomous vehicles, as well as content creation for the emerging 3D industry. It works by mapping 2D images onto a 3D sphere, then opening up that sphere to create a new 2D shape.
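The sphere-unwrapping step the article describes can be pictured as a projection from points on a sphere to pixels in a flat grid. Below is a minimal, illustrative sketch assuming an equirectangular projection; the function name, grid size, and choice of projection are assumptions for illustration, as the article does not specify SurfNet's actual parameterization:

```python
import numpy as np

def unwrap_sphere(points, height=64, width=128):
    """Project unit-sphere XYZ points to (row, col) pixels in a flat 2D grid.

    points: (N, 3) array of coordinates lying on the unit sphere.
    Uses an equirectangular map: latitude -> row, longitude -> column.
    """
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    lat = np.arcsin(np.clip(z, -1.0, 1.0))   # latitude in [-pi/2, pi/2]
    lon = np.arctan2(y, x)                   # longitude in [-pi, pi)
    rows = ((np.pi / 2 - lat) / np.pi * (height - 1)).round().astype(int)
    cols = ((lon + np.pi) / (2 * np.pi) * (width - 1)).round().astype(int)
    return rows, cols

# The sphere's "north pole" (0, 0, 1) lands on the top row of the flat grid.
r, c = unwrap_sphere(np.array([[0.0, 0.0, 1.0]]))
```

Every point on the sphere gets a pixel in the flattened image, which is what lets a conventional 2D network operate on what was originally 3D surface data.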

SurfNet then applies that technique to huge numbers of 2D images, learning abstract representations through machine learning and deep learning. The system is trained on 3D shapes and their corresponding 2D images in pairs, which enables it to predict similar 3D shapes from a 2D image alone.
“If you show it hundreds of thousands of shapes of something such as a car, and you then show it a 2D image of a car, it can reconstruct that model in 3D,” said Karthik Ramani, Purdue’s Donald W. Feddersen Professor of Mechanical Engineering.
“It can even take two 2D images and create a 3D shape between the two, which we call ‘hallucination’.”
According to Ramani, the technique allows for greater accuracy and precision than current 3D deep learning methods that use volumetric pixels, or voxels.
“We use the surfaces instead, since they fully define the shape,” he explained. “It’s kind of an interesting offshoot of this method. Because we are working in the 2D domain to reconstruct the 3D structure, instead of doing 1,000 data points like you would otherwise with other emerging methods, we can do 10,000 points. We are more efficient and compact.”
“This is very similar to how a camera or scanner uses just three colours, red, green and blue—known as RGB—to create a colour image, except we use the XYZ coordinates.”
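The RGB analogy in the quote above corresponds to what the literature calls a geometry image: a regular 2D grid whose three channels store X, Y, and Z coordinates the way an ordinary image's channels store red, green, and blue. A minimal sketch, using a unit sphere as a stand-in surface (the grid size and the choice of surface are illustrative, not SurfNet's):

```python
import numpy as np

# A 100x100 grid gives 10,000 surface points, matching the count quoted above.
H, W = 100, 100
u = np.linspace(0.0, np.pi, H)        # latitude-like surface parameter
v = np.linspace(0.0, 2 * np.pi, W)    # longitude-like surface parameter
uu, vv = np.meshgrid(u, v, indexing="ij")

# Each "pixel" holds an XYZ coordinate instead of an RGB colour.
geometry_image = np.stack(
    [np.sin(uu) * np.cos(vv),   # channel 0 -> X (plays the role of red)
     np.sin(uu) * np.sin(vv),   # channel 1 -> Y (plays the role of green)
     np.cos(uu)],               # channel 2 -> Z (plays the role of blue)
    axis=-1,
)

print(geometry_image.shape)  # (100, 100, 3): H x W pixels, XYZ channels
```

Because the result has exactly the shape of a colour image, standard image-based deep learning machinery can be applied to it directly.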
As the program improves, the researchers hope it can be used to create 3D content using non-specialist 2D equipment, and also allow machines to understand 3D environments from simple 2D images.
This system requires the program to anticipate shapes based on prior knowledge, so while it may prove a useful conversion tool, it cannot be applied to everything. When the requirement is only a “believable” result, it may work well for many objects, though differences from one object to another will likely still call for some human intervention or tweaking. That makes it best suited to entertainment applications. If, however, the requirement is a metrologically accurate result that can be measured in all three axes, as in medical, industrial, and scientific applications, this approach may be less useful. Rather than replacing existing methodology, I see this as another, potentially useful, tool in the conversion toolbox.