Comment: Transforming vehicle sound experience with machine learning and precise cabin measurements

Vehicle interiors are among the modern world’s most challenging environments for delivering high-quality audio, but there are ways to overcome their limitations, says Hendrik Hermann, automotive vice president, Dirac Research.


By combining digital response correction technology with machine learning, we can actually “teach” vehicles how to intelligently combine their speaker outputs for optimised performance.

The challenge

The average vehicle interior is a tight space housing anywhere from four to more than 30 speakers directed at one to seven listeners in different positions, with a wide variety of surfaces featuring different angles, materials and acoustic properties.

Because the speakers are aimed at different positions and their output interacts with complex surfaces of differing acoustic properties, their sound waves arrive misaligned at the listener. While equalisation can enhance sound quality, it cannot correct the timing errors between speakers.
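
To see why this is a timing problem, consider the simplest textbook form of time alignment for a single listening position: delay the nearer speakers so that every wavefront arrives together with the one from the farthest speaker. The sketch below is illustrative only and is not Dirac's method; the speaker names, the distances and the 343 m/s speed of sound are all assumed values.

```python
# Assumed speed of sound in air at roughly 20 degrees Celsius.
SPEED_OF_SOUND_M_S = 343.0


def alignment_delays_ms(distances_m):
    """Return the delay (in milliseconds) to apply to each speaker so that
    all wavefronts reach one listening position at the same moment.

    distances_m maps a speaker name to its distance from the listener in metres.
    The farthest speaker gets zero delay; nearer speakers are held back.
    """
    farthest = max(distances_m.values())
    return {
        name: (farthest - distance) / SPEED_OF_SOUND_M_S * 1000.0
        for name, distance in distances_m.items()
    }


# Hypothetical distances from the driver's head to three speakers.
distances = {"front_left": 0.9, "front_right": 1.4, "subwoofer": 2.6}
delays = alignment_delays_ms(distances)

for name, delay in delays.items():
    print(f"{name}: {delay:.2f} ms")
```

A delay chosen for one seat will not suit the others, which is part of what makes a multi-seat cabin harder than this sketch suggests.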

Data mining for a solution

Since every vehicle model is unique, what’s required is a sophisticated acoustic measurement system that pairs patented algorithms with a large data set of measurements to create an accurate acoustic fingerprint showing how each speaker’s output is affected by the in-cabin environment, and how it’s perceived at each seat location.

To ensure the most accurate and usable data, Dirac’s system utilises 16 microphones per speaker channel per seating position, resulting in a vast number of measurements. In a four-seat car with 20 speakers, for example, it would capture 1,280 measured impulse responses (16 × 20 × 4).
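
The count follows directly from multiplying the three factors. As a quick check, the short sketch below reproduces the figure; the function name is a hypothetical one chosen for illustration.

```python
def impulse_response_count(mics_per_position, speakers, seats):
    """Number of measured impulse responses: one per microphone,
    per speaker channel, per seating position."""
    return mics_per_position * speakers * seats


# Figures quoted in the article: 16 microphones, 20 speakers, 4 seats.
total = impulse_response_count(mics_per_position=16, speakers=20, seats=4)
print(total)  # 1280
```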

Creating a blank canvas

Since every vehicle is different, delivering a consistent experience across them requires stripping away each cabin’s unique acoustic characteristics and correcting for them so that the sound experience resembles that of a bare studio.

To turn each complex vehicle interior into a blank audio canvas, engineers use the data collected during the measurements. Impulse response correction technology then creates groupings of speakers that support each other’s impulse response characteristics and form what has become known as a ‘super speaker’.

These super speakers can deliver convincing Sound Field Control that allows listeners to perceive bass as coming from right in front of them, even if the subwoofer is mounted far away in the trunk, for example. While experts can guide the process, the brute force calculations must be completed by software tools using patented algorithms because there’s simply too much data for humans to process.

Give the people what they want

While most audio recordings are still conducted and processed in two-channel stereo, some consumers have become accustomed to the immersive experiences only home theatres and multi-channel systems can deliver. For decades, upmixers have provided a rudimentary means of turning a two-channel signal into multi-channel output, but virtually all of them produce artifacts in the signal that affect the sound to varying extents.

However, it is possible to remain true to the source material. It requires spatially rearranging the output without adding or subtracting any signal. The multi-channel content then gets distributed or rendered to match the specific hardware system architecture in the vehicle.
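
One way to picture "rearranging without adding or subtracting" is a passive matrix that splits a stereo pair into a centre channel and two side channels, where summing the channels gives back the original signal. This is a generic textbook illustration with hypothetical sample values, not the upmixing algorithm described in this article.

```python
def split_stereo(left, right):
    """Split stereo samples into (centre, side_left, side_right).

    The centre carries what the two channels share; each side channel
    carries what remains of its original channel once the centre is removed.
    """
    centre = [(a + b) / 2 for a, b in zip(left, right)]
    side_left = [a - c for a, c in zip(left, centre)]
    side_right = [b - c for b, c in zip(right, centre)]
    return centre, side_left, side_right


def downmix(centre, side_left, side_right):
    """Recombine the three channels into the original stereo pair."""
    left = [c + s for c, s in zip(centre, side_left)]
    right = [c + s for c, s in zip(centre, side_right)]
    return left, right


# Hypothetical stereo samples, for illustration only.
left = [0.5, -0.25, 0.75, 0.0]
right = [0.25, -0.25, -0.5, 0.125]

centre, side_left, side_right = split_stereo(left, right)
rebuilt_left, rebuilt_right = downmix(centre, side_left, side_right)

# Largest difference between the source and the recombined signal.
max_error = max(
    abs(a - b) for a, b in zip(left + right, rebuilt_left + rebuilt_right)
)
print(f"largest reconstruction error: {max_error}")
```

Because the three channels recombine into the source, nothing has been invented or discarded; the content has only been redistributed.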

Looking beyond the curve

As auto manufacturers look to reduce complexity for entry-level vehicles, the technology described could be vital for optimising soundbar-like designs, while also supporting even more complex systems such as in-ceiling speakers for Dolby Atmos audio formats in luxury vehicles.

One thing we’re certain of is that audio engineers will remain vital to both product design and its implementation in vehicles and devices. While algorithms and testing can get us about 80 percent of the way to a completed solution, the last 20 percent absolutely requires a professional hand with subjective experience.

Hendrik Hermann, automotive vice president, Dirac Research