Human-Computer Interaction Gets a Helping Hand, Eye and Voice

Computers are one step closer to ‘understanding’ people, thanks to progress in human-computer interaction research at Rutgers University funded by the National Science Foundation (NSF) Directorate for Computer and Information Science and Engineering.

Keyboard and mouse inputs suffice for many users and PC applications. But NSF-funded researchers in a project called STIMULATE are developing systems that mimic other forms of communication that humans use to interact with each other, including eye contact, touch and voice.

The experimental hardware and software may find uses in medicine, the military and other fields that could benefit from more natural forms of human-computer interaction across distributed networks. NSF funding is about $780,000 for three years.

Computer scientists and electrical engineers at Rutgers have designed Multimodal Input Manager (MIM) hardware that simultaneously receives speech, gaze and tactile signals. Special software called Fusion Agent then assimilates these inputs so that the computer can respond to the subtle signals humans routinely use to communicate with one another.

A pneumatic ‘force-feedback’ glove, patented by Rutgers, weighs less than three ounces and reads gestures by detecting fingertip positions relative to the palm. It lets the user point at the computer screen, overriding signals from a gaze-tracking camera.
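
The override behavior described above can be illustrated with a short sketch. This is not the Rutgers software: the names PointerSample and select_cursor, and the idea of representing each device reading as a screen coordinate, are assumptions made only for illustration. The sketch shows one simple way a pointing gesture from the glove could take priority over a gaze estimate.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class PointerSample:
    """One screen-position estimate from a single device (hypothetical layout)."""
    source: str                # "gaze" or "glove"
    position: Tuple[int, int]  # (x, y) in screen pixels


def select_cursor(gaze: Optional[PointerSample],
                  glove: Optional[PointerSample]) -> Optional[Tuple[int, int]]:
    """Pick the cursor position, letting a glove pointing gesture override gaze."""
    if glove is not None:
        # Assumption: a glove sample is present only while the user is pointing.
        return glove.position
    if gaze is not None:
        # No pointing gesture, so fall back to where the user is looking.
        return gaze.position
    # Neither device produced a reading; leave the cursor where it is.
    return None
```

With only a gaze sample, the function returns the gaze position; once a glove sample is supplied as well, the glove position is returned instead.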

Whereas other gaze trackers require cumbersome headpieces, the MIM’s gimbal-mounted unit sits on the desktop and rotates to detect where the user is looking. After a 10-second initial calibration of the infrared detectors, the user can direct a cursor just by looking at a section of the computer screen.

‘While we don’t foresee that the keyboard and mouse will become obsolete anytime soon,’ said STIMULATE project leader James Flanagan, ‘MIM technology opens possibilities for improving current computer applications and for developing entirely new ones that require more-refined modes of human-computer interaction.’

The software even detects lip movement to steer a microphone array for use in high-noise environments. For groups of users, the array can home in on the vocal source, even if the person speaking moves around the room.

MIM users at multiple locations can simultaneously interact with each other in a unified 3D work environment. The project also produced new software called cWorld (for Collaborative World), written in the Java programming language, that lets teams of users construct those virtual environments.

‘Human-computer interaction like STIMULATE is a major thrust of NSF’s new Information Technology Research (ITR) program,’ said Michael Lesk, who oversees the $90 million ITR initiative in fiscal year 2000. ‘This is the kind of risk-taking project where success is not guaranteed but potential benefit is enormous.’ Lesk also noted the participation of 22 graduate students and 15 undergraduates in the project.

The MIM has been tested by medical doctors for analyzing images of blood samples, X-rays and MRI tests. A physician can use the tactile, voice-recognition and eye-tracking inputs to rapidly separate distinct image characteristics, then vocally query the database for samples that match.

Another field test of the MIM hardware was a disaster-relief simulation involving Army National Guard officers at Fort Dix, New Jersey. By using the STIMULATE system to interact with remote staff, a command officer was able to rapidly process 2D and 3D representations of logistical, personnel and equipment data. Because a keyboard and mouse are unsuitable for these tasks, they are presently handled with voice-only communications, and data are plotted using acetate map overlays and grease pencils.