Joy without joysticks

University of Washington researchers have developed software that allows users to control a computer using their voice.


The so-called “Vocal Joystick” software detects sounds 100 times a second and instantaneously turns each sound into movement on the screen. Different vowel sounds dictate the direction: “ah,” “ee,” “aw,” “oo” and other sounds each move the cursor in one of eight directions. Users can transition smoothly from one vowel to another, and louder sounds make the cursor move faster. The sounds “k” and “ch” simulate clicking and releasing the mouse buttons.
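
The mapping described above can be illustrated with a short Python sketch. It is not the researchers' implementation. The vowel-to-direction assignments, the `cursor_step` function, the 0-to-1 loudness scale and the `max_speed` value are all assumptions made for illustration, and only the four vowels named above are included because the remaining four sounds are not specified. The only figure taken from the description is the rate of 100 detections a second.

```python
import math

# Hypothetical assignment: the article names these vowels but does not
# say which direction each one controls.
VOWEL_ANGLE_DEG = {"aw": 0, "ah": 90, "ee": 180, "oo": 270}

FRAMES_PER_SECOND = 100  # detection rate stated in the article


def cursor_step(vowel, loudness, max_speed=500.0):
    """Return the (dx, dy) cursor movement in pixels for one detection frame.

    loudness is assumed to be normalised to the range 0..1, and max_speed
    (pixels per second) is an arbitrary illustrative value.
    """
    if vowel not in VOWEL_ANGLE_DEG:
        return (0.0, 0.0)  # unrecognised sound: no movement
    loudness = min(max(loudness, 0.0), 1.0)
    speed = max_speed * loudness  # louder sounds move the cursor faster
    step = speed / FRAMES_PER_SECOND
    angle = math.radians(VOWEL_ANGLE_DEG[vowel])
    # Screen coordinates grow downward, so the vertical component is negated.
    return (step * math.cos(angle), -step * math.sin(angle))
```

Under these assumptions, a full-volume sound moves the cursor 5 pixels per frame, and the same vowel at half volume moves it 2.5 pixels per frame in the same direction.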


Versions of Vocal Joystick exist for browsing the Web, drawing on a screen, controlling a cursor and playing a video game. A version also exists for operating a robotic arm. Jeffrey Bilmes, a UW associate professor of electrical engineering, believes the technology could be used to control an electronic wheelchair as well.


Existing substitutes for the handheld mouse include eye trackers, sip-and-puff devices and head-tracking systems. Each technology has drawbacks. Eye-tracking devices are expensive and require that the eye simultaneously take in information and control the cursor, which can cause confusion. Sip-and-puff joysticks held in the mouth must be spit out if the user wants to speak, and can be tiring. Head-tracking devices require neck movement and expensive hardware.


Vocal Joystick requires only a microphone, a computer with a standard sound card and a user who can produce vocal sounds.


“A lot of people ask: ‘Why don’t you just use speech recognition?’” Bilmes said. “But it would be very slow to move a cursor using discrete commands like ‘move right’ or ‘go faster.’ The voice, however, is able to do continuous commands quickly and easily.” Early tests suggest that an experienced user of Vocal Joystick would have as much control as someone using a handheld device.


The researchers have also shown that the software can be used to control a robotic arm. The pitch of the tone moves the arm up and down; other commands are unchanged. This is the first time that vocal commands have been used to control a three-dimensional object, Bilmes said.


To test the device, the group has been working with about eight spinal-cord injury patients at the UW Medical Center since March.


Bilmes said he hopes people will become more adept at using the system over time. Future research will incorporate more advanced controls that use more aspects of the human voice, such as repeated vocalisations, vibrato, degree of nasality and trills.