Voice-activated digital assistants are spreading from our homes to our cars. Stuart Nathan looks at what it could mean for the future of driving
This feature was brought to you by Nuance software. Not in any commercial sense, like the sponsorship clips at the beginning of soap operas, but very literally. After a mild stroke four years ago, I lost the ability to type with my left hand, and so I use Nuance’s Dragon NaturallySpeaking system to dictate all my writing. The same disability has forced me to give up driving, as I can no longer reliably change gear or operate indicators in a timely fashion. So when Nuance invited me to see the voice-activated system it was developing for the automotive sector, my interest was piqued.
Voice-activated digital assistants are becoming familiar to many of us, with the technology arguably developed and certainly popularised by Google Assistant, Apple’s Siri and the Amazon Alexa system. They are becoming increasingly common around the home, where they can be used for everything from controlling audiovisual systems (“Hey Alexa, play such-and-such a track by so-and-so” is, I suspect, a common refrain) to compiling shopping lists and accessing online information. Despite their possible security risks, digital assistants seem to be infiltrating so many homes that they may inevitably become as much a part of life as domestic staff once were.
The convenience of a digital assistant in the car is undeniable. If the driver can use their voice to operate all the ancillary systems – climate control, audio, sat nav and so on – then they no longer have to push buttons, enter information or otherwise take their attention away from what they are supposed to be doing: driving and concentrating on the road ahead and the surrounding situation. It’s no surprise that, now Google, Apple and Amazon have let the genie out of the bottle, the automotive industry wants a slice of the action.
But there are a number of unique factors to take into consideration in developing such a system. One is privacy. A car’s digital systems can contain information that many people would rather keep to themselves: if you use satellite navigation, for example, your car knows where you are and where you have been. If your car connects to your mobile phone, then it also knows who you’ve spoken to and might have access to any data you keep on there. If some overarching system has access to all of these, and its maker can make any claim to that data, then no driver in their right mind would use such a system.
Another issue is that people are increasingly using different digital assistants for different tasks. For some people, Siri is best suited to business or financial-related issues, and they wouldn’t want to ask it how to get to their next meeting destination or weekend getaway.
Equally, if you’re used to asking Alexa for your favourite playlist or to add laundry detergent to your shopping list when you’re in your living room, you probably want to keep doing that from your driving seat.
Such considerations are at the heart of Dragon Drive, the system that Nuance has developed for the automotive sector and which is installed in the new models of Mercedes A-Class and Audi A8 cars. Built on the same voice interface that Apple licensed to control Siri, the system is designed to allow drivers to speak conversationally and intuitively to their cars’ systems, without having to learn and remember specific control words.
Moreover, the system is designed to work whether or not the car has an available connection to Wi-Fi networks, only connecting to the internet if necessary and therefore keeping as much data as possible on board. The system does not send any information back to Nuance. The natural language system, named “Just Talk” by Nuance, allows the driver to issue commands without any sort of control-freak prefix (like “Hey, Alexa”).
Just Talk is programmed to recognise context: so, for example, saying “I’m too hot” will switch on the air conditioning system localised to the seat where the system’s microphone array detected the speaker. Similarly, in an appropriately equipped car, saying “I’m feeling tense” would activate the in-seat massage system, and “my hands are cold” would turn on the steering wheel heater.
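To make the idea concrete, here is a minimal sketch of how a phrase and the speaker's seat might be turned into a cabin action. It is not Nuance's implementation: the `INTENTS` table, the action names and the `resolve` function are all hypothetical, and a real natural-language system would interpret free-form speech rather than match exact phrases.

```python
# Hypothetical sketch: map an utterance and the speaker's seat to a cabin action.
# All names here are illustrative assumptions, not part of any Nuance API.

# Assumed phrase-to-action table, taken from the examples in the text.
INTENTS = {
    "i'm too hot": "air_conditioning_on",
    "i'm feeling tense": "seat_massage_on",
    "my hands are cold": "steering_wheel_heater_on",
}

# Actions that apply to one seat rather than to the car as a whole.
SEAT_SPECIFIC = {"air_conditioning_on", "seat_massage_on"}


def resolve(utterance, seat):
    """Return (action, target) for a recognised phrase, or None if unknown."""
    # Normalise case, whitespace and typographic apostrophes before lookup.
    phrase = utterance.strip().lower().replace("\u2019", "'")
    action = INTENTS.get(phrase)
    if action is None:
        return None
    # Seat-specific actions target the seat the microphones located.
    target = seat if action in SEAT_SPECIFIC else "car"
    return action, target
```

Calling `resolve("I'm too hot", "front_passenger")` would target the front passenger's seat only, which mirrors the localised behaviour described above.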
The system is also linked to satellite navigation, and uses available add-ons. Asking it to navigate to a certain destination activates the system and provides turn-by-turn directions. Navigation by postcode is available, as is the system known as “what3words”, which divides the surface of the world up into 3m squares and identifies each one by a unique combination of three words. This can lead to some odd-sounding phrases: should you want to visit The Engineer’s offices, you would say: “Navigate to what three words ‘reform ashes flown’.” (The Eiffel Tower is ‘prices slippery traps’.) The system also allows connection to weather services, once again providing contextual information, for example in response to “do I need an umbrella in Manchester?” or “do I need sunglasses in Rome?”
Also built in is an ‘intelligent arbitration’ system which recognises the context of a question and arbitrates as to whether the in-car systems can answer a query or whether a different digital assistant system needs to be consulted. This would depend on the user’s preferred settings. If, as in the previous example, the user prefers Siri to deal with financial information, asking the system “what’s the latest on Airbus?” would tell it to connect to Siri to retrieve stock market information or the latest business headlines. Similarly, should you remember something to go on the shopping list while driving, mentioning that would activate Alexa.
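As a rough illustration of how such arbitration might work, the sketch below routes a query to an assistant according to the user's preferences, falling back to the in-car system. It is an assumption-laden toy, not Nuance's method: the keyword lists, the preference table and the `classify_topic` and `arbitrate` functions are all hypothetical, and a production system would use a trained language model rather than keyword matching.

```python
# Hypothetical sketch of preference-based arbitration; not Nuance's implementation.

# Assumed keyword lists; a real system would use a trained classifier.
TOPIC_KEYWORDS = {
    "finance": {"airbus", "stock", "shares", "market"},
    "shopping": {"shopping", "buy", "detergent"},
    "climate": {"hot", "cold", "temperature"},
}

# Assumed user settings: which external assistant handles which topic.
USER_PREFERENCES = {"finance": "siri", "shopping": "alexa"}

IN_CAR = "in_car"  # Fallback: the car's own systems answer the query.


def classify_topic(query):
    """Return the first topic whose keywords appear in the query, else None."""
    words = {w.strip("?.,!'\"").lower() for w in query.split()}
    for topic, keywords in TOPIC_KEYWORDS.items():
        if words & keywords:
            return topic
    return None


def arbitrate(query, preferences=USER_PREFERENCES):
    """Pick the assistant for a query, defaulting to the in-car system."""
    topic = classify_topic(query)
    return preferences.get(topic, IN_CAR)
```

With these assumed settings, `arbitrate("What's the latest on Airbus?")` selects Siri, a shopping-list request selects Alexa, and anything without a stated preference stays on board.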
And the system is already evolving. Although its first iteration is voice controlled only, Nuance is already testing a feature that adds gaze control, so that the driver can access certain functions merely by looking in the right place.
Gaze control unlocks a surprising array of functionality. The system Nuance is currently using is licensed from Tobii Technology, a Swedish start-up previously covered in The Engineer. Its technology exploits two evolutionary quirks unique to humans: one behavioural and one physical.
The behavioural quirk is pointing. Humans are the only apes that indicate to others where their attention is focused. Other primates can learn to do it, but only in captivity and never in nature; some dogs, of course, can also be trained. Humans, however, innately start to point at things, typically at the age of 14 months.
The other may come as a surprise. Humans are the only mammals whose eyes have visible whites all the time when open. Evolutionary biologists believe that this trait developed so that other humans can easily determine where somebody else is looking; it is, therefore, a subset of pointing. All societies point at things, although customs differ: to some, pointing with the finger is rude; some African societies sometimes point with their lips; but everybody points. It is one of those things that defines what it is to be human, but hardly any of us realise or acknowledge it.
Tobii’s system uses cameras and infrared light projection and mapping to detect the boundary between the white of the eye and the iris, looking for both the change in colour and the bulge the iris makes at the front of the eye. Developed both to help disabled people use computers and for gaming, it is just one of many gaze-detection systems on the market. The detection device consists of a bar mounted on top of the dashboard facing the driver, and can also determine which way the driver’s face is pointing even if their eyes are obscured by sunglasses.
The demonstration of the system was literally eye-opening. It is linked to cameras installed in the front of the car, so a glance at a building by the roadside and a query of “What’s that building?” triggered a stream of information about the hotel the driver was looking at. “When is that place open?” while looking at a restaurant retrieved the opening hours; the system will also respond to a command of “book a table for two there at 8.30 tonight” by using online table reservation systems.
Other databases could also be used: for nature lovers, an enquiry of “What’s that tree?” could be accommodated by connecting the cameras to a machine vision and AI system equipped to recognise local flora.
Such a system could also be of use to the insurance industry, were access granted to its data. If cars equipped with gaze detection were involved in an accident, it would be possible to determine where the drivers were looking at the precise moment the accident occurred, which could be invaluable in determining who was liable. This, of course, has legal implications, which Nuance is investigating.
Going back to the personal, Nuance’s developers assured The Engineer that it would be possible for Just Talk to activate other electrical systems in the car. For example, “headlights on dipped”, “full beam”, “indicate left” and “indicate right” would all be workable commands if the user requested that these be programmed in. It would have to be a dealer or a manufacturer adjustment, but such functionality would allow me to once again drive safely and with confidence, despite my impaired movement.