Researchers at the University of Washington have exploited subtleties in human sound recognition to create fine-grain scalable audio encoding. The technology could be used to make listening to music on the Internet as clear as listening to music on the radio.
Fine-grain scalable audio encoding is also said to have the potential for audio fingerprinting, currently a hot issue in the recording industry, according to Les Atlas, professor of electrical engineering at the University of Washington. He developed the technique with graduate student Mark Vinton. Atlas said the technology could help computer users in a couple of ways.
‘If you’re listening to music on your computer and all you have to plug into is a phone line, then this will make it sound quite a bit better, without interruptions,’ he said. ‘And if you have something with more bandwidth, like a DSL line, this would allow you to instantly access stations, much like the push buttons on your FM radio, without waiting for a buffer. It may not sound quite as good for the first few seconds, but would be easily recognisable and in a few seconds would sound as good as a CD.’
Fine-grain scalable encoding works by prioritising the parts of the signal that matter most for recognition by the human auditory system. As bandwidth gets pinched, the less important information is shed first. The signal drops slightly in quality, but the drop is gradual.
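The idea of shedding the least important information first can be illustrated with a minimal Python sketch. It is not the researchers' algorithm: the `Component` type, its `importance` score and the bit counts are hypothetical stand-ins, assumed here only to show how a priority-ordered stream degrades gradually as the bit budget shrinks.

```python
from dataclasses import dataclass


@dataclass
class Component:
    name: str          # hypothetical label for one piece of the encoded signal
    importance: float  # hypothetical perceptual importance (higher = more critical)
    bits: int          # size of this piece in bits


def order_by_importance(components):
    """Arrange components so the most important ones come first in the stream."""
    return sorted(components, key=lambda c: c.importance, reverse=True)


def truncate_to_budget(ordered, bit_budget):
    """Keep the longest prefix of the ordered stream that fits in bit_budget.

    Because the stream is ordered by importance, cutting it short always
    drops the least important components first.
    """
    kept = []
    used = 0
    for component in ordered:
        if used + component.bits > bit_budget:
            break
        kept.append(component)
        used += component.bits
    return kept
```

Calling `truncate_to_budget(order_by_importance(frame), budget)` with a smaller `budget` returns a shorter prefix of the same ordered list, so quality falls off step by step rather than the stream stalling.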
‘For a typical person, the change would be almost unnoticeable,’ said Atlas. ‘If you’re a musician, you might note a difference, but it wouldn’t be a grating change.’
The technology is designed to take advantage of a little-explored aspect of the human auditory system. ‘What we’ve confirmed is there appears to be a second dimension in the auditory system,’ Atlas explained. That dimension appears to prioritise sounds according to duration – the longer, slower aspects of a sound appear to be more critical to recognition. The algorithms that make fine-grain scalable encoding work select and prioritise those aspects.
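One way to picture what prioritising the slower aspects of a sound might mean is to treat the loudness envelope of a sound as a signal in its own right and keep only its slowest variations. The Python sketch below does this with a plain discrete Fourier transform; it is an illustration under that assumption, not the method Atlas and Vinton describe, and the function names and the `keep` parameter are hypothetical.

```python
import cmath
import math


def dft(values):
    """Plain discrete Fourier transform of a short sequence."""
    n = len(values)
    return [
        sum(values[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def keep_slow_modulations(envelope, keep):
    """Rebuild an envelope from only its `keep` lowest modulation frequencies.

    For a real-valued envelope of length n, bins k and n - k carry the same
    modulation frequency, so both are kept or dropped together.
    """
    n = len(envelope)
    spectrum = dft(envelope)
    kept = [c if min(k, n - k) < keep else 0 for k, c in enumerate(spectrum)]
    # Inverse transform; the imaginary part is only rounding error.
    return [
        (sum(kept[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n).real
        for t in range(n)
    ]
```

With a small `keep`, the rebuilt envelope retains the slow rise and fall of the sound and discards the rapid flutter, which is the part the article suggests matters least for recognition.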
‘It’s not standard frequency that’s critical – it’s modulation, or duration,’ Atlas said. ‘This is the first time we’ve been able to code in that dimension.’
The technology also has promising applications for audio fingerprinting, or the determination of unique identifiers within songs. The fingerprints can be used for copyright management and fraud detection.
Atlas and Vinton said they’re experimenting to see if the technique will work with video as well. And they’ll continue to push the audio applications to make tuning in to your favourite radio station on your computer a viable option. ‘We’re reaching the point where we shouldn’t have to have FM radios on the shelves in our office,’ said Atlas.