A new computer algorithm developed at the University of Washington has used hundreds of thousands of tourist photos of Rome to automatically reconstruct the entire city in about a day.
The tool is the most recent in a series developed at the university to harness the increasingly large digital photo collections available on photo-sharing websites. The digital Rome was built from 150,000 tourist photos tagged with the word ‘Rome’ or ‘Roma’ that were downloaded from the photo-sharing website Flickr.
Computers analysed each image and in 21 hours combined them to create a 3D digital model. With the model, a viewer can fly around Rome’s landmarks, from the Trevi Fountain to the Pantheon to the inside of the Sistine Chapel.
‘How to match these massive collections of images to each other was a challenge,’ said Sameer Agarwal, a UW acting assistant professor of computer science and engineering. ‘Until now,’ he added, ‘even if we had all the hardware we could get our hands on and then some, a reconstruction using this many photos would take forever.’
Earlier versions of the photo-stitching technology are known as Photo Tourism. That technology was licensed in 2006 to Microsoft, which now offers it as a free tool called Photosynth.
‘With Photosynth and Photo Tourism, we basically reconstruct individual landmarks. Here, we’re trying to reconstruct entire cities,’ said Noah Snavely, who developed Photo Tourism as his doctoral work and is now an assistant professor at Cornell University.
In addition to Rome, the team recreated the Croatian coastal city of Dubrovnik, processing 60,000 images in less than 23 hours using a cluster of 350 computers, and Venice, Italy, processing 250,000 images in 65 hours using a cluster of 500 computers. Many historians see Venice as a candidate for digital preservation before water does more damage to the city, the researchers said.
Transitioning from landmarks to cities – going from hundreds of photos to hundreds of thousands of photos – is not trivial. Previous versions of the Photo Tourism software matched each photo to every other photo in the set. But as the number of photos increases, the number of matches explodes, growing with the square of the number of photos. A set of 250,000 images would take at least a year for 500 computers to process, Agarwal said. A million photos would take more than a decade.
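The quadratic blow-up can be illustrated with a few lines of arithmetic (illustrative only; this is not the project's code):

```python
# Exhaustive pairwise matching compares every photo against every
# other photo, so the work grows with the square of the collection size.

def pair_count(n_photos: int) -> int:
    """Number of image pairs to compare: n choose 2."""
    return n_photos * (n_photos - 1) // 2

for n in (150_000, 250_000, 1_000_000):
    print(f"{n:>9,} photos -> {pair_count(n):>15,} pairs")
```

Quadrupling the collection from 250,000 to a million photos multiplies the number of pairs by roughly sixteen, which is why the researchers' estimate jumps from about a year to more than a decade.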
The newly developed code works more than one hundred times faster than the previous version. It first identifies which photos are likely to match, then concentrates the expensive matching work on those pairs. The code also uses parallel processing techniques, allowing it to run simultaneously on many computers, or even on remote servers connected through the internet.
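One way to picture the pruning step is to give each photo a cheap whole-image similarity score and run detailed matching only against its top-ranked candidates. The sketch below is hypothetical; the similarity function is a stand-in, not the project's actual method:

```python
# Hypothetical sketch of match pruning: instead of comparing every photo
# with every other one, rank candidates by a cheap similarity score and
# keep only each photo's k best. The similarity function is a placeholder.

def propose_pairs(photos, similarity, k=10):
    """Return a reduced set of candidate pairs: each photo paired with
    its k most similar neighbours, instead of all n*(n-1)/2 pairs."""
    pairs = set()
    for p in photos:
        neighbours = sorted((q for q in photos if q != p),
                            key=lambda q: similarity(p, q),
                            reverse=True)[:k]
        for q in neighbours:
            pairs.add(frozenset((p, q)))
    return pairs

# Toy usage: "photos" as points on a line, similarity = negative distance.
photos = list(range(100))
sim = lambda a, b: -abs(a - b)
candidates = propose_pairs(photos, sim, k=5)
all_pairs = len(photos) * (len(photos) - 1) // 2
print(len(candidates), "candidate pairs instead of", all_pairs)
```

Because the candidate set grows linearly with the number of photos (at most k pairs per photo) rather than quadratically, and each photo's pairs can be scored independently, the work parallelises naturally across a cluster.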
The technique could create online maps that offer viewers a virtual-reality experience. The software could also build cities for video games automatically, rather than having artists model them by hand, and might be used in architecture for the digital preservation of cities.
The project website is at http://grail.cs.washington.edu/rome/