Tuesday, 23 September 2014

Algorithm turns bag of crisps into a microphone

Researchers have found a way to effectively use everyday objects caught on camera as microphones, which could help police draw extra information from CCTV footage.

The team from Massachusetts Institute of Technology (MIT), Microsoft and software firm Adobe created an algorithm that reconstructs useful audio from video of the vibrations that sound waves cause in a glass of water, the leaves of a plant and even a packet of crisps filmed through soundproof glass.

‘We’re recovering sounds from objects,’ said lead researcher Abe Davis in a statement. ‘That gives us a lot of information about the sound that’s going on around the object, but it also gives us a lot of information about the object itself, because different objects are going to respond to sound in different ways.’

Typically this technique requires a high-speed camera that captures frames faster than the objects vibrate, at a rate of thousands of frames per second, far higher than that of ordinary smartphone cameras, for example.
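
The need for a high frame rate can be illustrated with a short sketch. It relies on the standard sampling rule from signal processing (a tone must be sampled at more than twice its frequency to be captured directly), which is general background rather than something stated by the researchers. The function name and the example frequencies are hypothetical.

```python
def apparent_frequency(tone_hz, frames_per_second):
    """Frequency a pure tone appears to have after being sampled once per frame.

    A tone above half the frame rate "folds" down to a lower frequency (aliasing).
    """
    nearest_multiple = round(tone_hz / frames_per_second) * frames_per_second
    return abs(tone_hz - nearest_multiple)


# Illustrative values only: a 440 Hz tone filmed at two different frame rates.
for fps in (60, 2200):
    print(fps, "fps ->", apparent_frequency(440, fps), "Hz")
```

At the lower frame rate the tone is misread as a much lower frequency, which is why ordinary footage loses most of the audio detail unless the missing information can be inferred some other way.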

However, the researchers were also able to reconstruct a lower-quality audio signal from standard 60 frames per second footage by inferring the missing information. The team claim this could provide a clear-enough signal to indicate the number of speakers being recorded, their gender and possibly even their individual identities.

‘When sound hits an object, it causes the object to vibrate,’ said Davis. ‘The motion of this vibration creates a very subtle visual signal that’s usually invisible to the naked eye. People didn’t realise that this information was there.’

The researchers measured the mechanical properties of the objects they filmed and determined that their motions were about a tenth of a micrometre in size, which corresponds to five thousandths of a pixel in a close-up image. But they were able to detect such movement by looking at changes in the colour of individual pixels.
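
As a rough check on those figures, the two numbers together imply how much of the object each pixel covers. The sketch below is only illustrative arithmetic: the function name is hypothetical and the inputs are the approximate values quoted above.

```python
def implied_pixel_footprint_um(motion_um, motion_px):
    """Micrometres of object surface covered by one pixel,
    given the same motion expressed in micrometres and in pixels."""
    return motion_um / motion_px


# Values quoted in the article: about 0.1 micrometre of motion,
# equal to about 0.005 pixel in a close-up image.
footprint = implied_pixel_footprint_um(0.1, 0.005)
print(f"Each pixel spans roughly {footprint:.0f} micrometres of the object")
```

On those figures the motion is about 200 times smaller than a single pixel, so it cannot be seen as a pixel changing position, only as a slight change in what an individual pixel records.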

A pixel captured at the boundary of an object appears as a mixture of the colours on either side of that boundary. For example, a pixel on the border of a red object against a blue background will appear purple. If the boundary moves slightly then the shade of the pixel will alter accordingly, becoming more red or more blue.
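
The colour-mixing idea can be sketched in a few lines. This is a simplified model, not the researchers' code: it assumes a boundary pixel's colour is a straight linear blend of the two sides, weighted by how much of the pixel each side covers, and it ignores camera noise and the rounding of colour values. The function names and RGB values are hypothetical.

```python
RED = (255.0, 0.0, 0.0)    # assumed colour of the object
BLUE = (0.0, 0.0, 255.0)   # assumed colour of the background


def blend(red_fraction):
    """Colour of a boundary pixel when red_fraction of it is covered by the red object."""
    return tuple(red_fraction * r + (1.0 - red_fraction) * b for r, b in zip(RED, BLUE))


def estimate_red_fraction(pixel):
    """Invert the blend: project the pixel colour onto the line running from BLUE to RED."""
    numerator = sum((p - b) * (r - b) for p, r, b in zip(pixel, RED, BLUE))
    denominator = sum((r - b) ** 2 for r, b in zip(RED, BLUE))
    return numerator / denominator


# The boundary starts halfway across the pixel, then moves by 0.005 of a pixel.
before = blend(0.500)
after = blend(0.505)
shift = estimate_red_fraction(after) - estimate_red_fraction(before)
print(f"Estimated boundary shift: {shift:.4f} pixel")
```

Under this model the change in shade maps directly back to a boundary shift far smaller than one pixel. Real footage is noisy, so a single pixel would not be reliable on its own.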

Some boundaries in an image are blurred across more than a single pixel, however. So the researchers borrowed a technique from earlier work on algorithms that amplify minuscule variations in video to make previously undetectable motions visible: for example, the breathing of a baby filmed in a hospital.

The algorithm uses a series of image filters that separate fluctuations at different boundaries moving in different directions. It then combines this data, giving greater weight to measurements made at very distinct boundaries, to infer the motions of the object as a whole.
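
The combining step can be sketched as a weighted average. The sketch assumes each filter has already produced a per-frame motion estimate for one boundary, plus a number describing how distinct that boundary is. Using the square of that number as the weight is an assumption made for illustration, as are the function name and the sample values.

```python
def combine_motions(local_motions, edge_strengths):
    """Weighted average of per-boundary motion signals, frame by frame.

    local_motions:  one list of per-frame motion estimates per boundary
    edge_strengths: one number per boundary; larger means a more distinct edge
    """
    weights = [strength ** 2 for strength in edge_strengths]  # assumed weighting
    total_weight = sum(weights)
    frame_count = len(local_motions[0])
    return [
        sum(w * signal[t] for w, signal in zip(weights, local_motions)) / total_weight
        for t in range(frame_count)
    ]


# Hypothetical data: a sharp boundary with a clear motion signal and a faint one with none.
sharp_edge = [1.0, -1.0, 1.0]
faint_edge = [0.0, 0.0, 0.0]
print(combine_motions([sharp_edge, faint_edge], edge_strengths=[3.0, 1.0]))
```

Because the sharp boundary carries most of the weight, the combined signal stays close to its measurements and the faint boundary contributes little.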


Readers' comments (6)

  • Lovely concept: many years ago I recall reading/ listening to a paper about some archaeological research which was attempting to 'extract' the vibrations (made by the voices of the potters) that were present in striations in very old (3,000BC) clay pots. It was hoped that one might be able to actually 'hear' the potters talking whilst smoothing their hands over the wet clay in the pots: firing would 'set' these vibrations for ever. Don't know what happened next?
    Best
    Mike B

  • I definitely remember the "acousto-archaeology" idea as a Daedalus item in Ariadne's column in New Scientist circa 1970-75. "Playing back" plasterwork which had recorded vibrations from the plasterer's trowel - both from antiquity and mediaeval - was also suggested. I don't think either was realistic but perhaps another reader will prove me wrong!

  • If the vibrations of a crisp packet can be spotted on a video-clip....the vibrations of the potter's hands/trowel/ from their voice are surely orders of magnitude greater? Come on MIT, be lateral (or did I mean bi-lateral)
    Best
    Mike B

  • Slightly OT, but the scriptwriters at CSI used acousto-archaeology in the plot line to an episode named 'Committed' (series 5). Needless to say, a clay pot gave up its secrets and the hoodlums were caught.

  • Snoop snoop snoop. The concept behind this microphone isn't exactly new. Chapter 6 of Spy Catcher by Peter Wright describes microphones formed from a metal object in a room, irradiated from outside with a beam of microwaves that reflects back off it. Sound waves cause the object to vibrate and the vibrations modulate the reflected microwave beam. The sound is then recovered by demodulation of the reflected microwave beam.

    The plan by MI5 was to develop a metal ornament then deliver it to the Ambassador of the Soviet Union. A bust of Lenin was unsuitable because the curved surfaces did not effectively reflect the sound waves but a model of the Kremlin might have been suitable.

  • don't forget 'The Stone Tape' - 1972
