MIT team building social robot

A new breed of robots will be able to interact and co-operate with people in a much more human-like way.

Traditionally, autonomous robots have been designed to operate as independently and remotely from humans as possible, often performing tasks in hazardous and hostile environments.

However, a new breed of robots will be able to interact and co-operate with people in a much more human-like way.

That’s the idea, at least, behind a new robotic head called Kismet that developers at US-based MIT are working on.

Kismet picks up both visual and auditory cues from the human it interacts with and responds in kind with the direction of its gaze, facial expression and posture.

Making all this happen takes a lot of computing power. Kismet’s hardware and software control systems have both been designed for real-time processing of video as well as audio data.

The high-level perception system, the motivation system, the behaviour system, the motor skill system, and the face motor system execute on four Motorola 68332 microprocessors running L, a multi-threaded Lisp developed in the MIT lab.

Vision processing, visual attention and eye/neck control are performed by nine networked 400 MHz PCs running QNX (a real-time Unix-like operating system). Expressive speech synthesis and vocal affective intent recognition run on a dual 450 MHz PC running NT, and the speech recognition system runs on a 500 MHz PC running Linux.
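As a rough illustration, the division of labour described above can be written down as a small lookup table. The sketch below is in Python purely for readability; the class name, field names and helper functions are hypothetical and are not taken from the Kismet software. Only the hardware, operating systems and subsystem names come from the description above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ComputeNode:
    """One group of identical machines and the subsystems it hosts."""
    hardware: str
    count: int          # number of machines of this kind
    software: str       # operating system or runtime
    subsystems: tuple   # subsystem names as given in the article


# Figures below are taken from the article; the structure is illustrative.
KISMET_NODES = (
    ComputeNode(
        "Motorola 68332 microprocessor", 4, "L (multi-threaded Lisp)",
        ("high-level perception", "motivation", "behaviour",
         "motor skill", "face motor"),
    ),
    ComputeNode(
        "400 MHz PC", 9, "QNX",
        ("vision processing", "visual attention", "eye/neck control"),
    ),
    ComputeNode(
        "dual 450 MHz PC", 1, "NT",
        ("expressive speech synthesis",
         "vocal affective intent recognition"),
    ),
    ComputeNode("500 MHz PC", 1, "Linux", ("speech recognition",)),
)


def host_for(subsystem: str) -> ComputeNode:
    """Return the node group that runs the named subsystem."""
    for node in KISMET_NODES:
        if subsystem in node.subsystems:
            return node
    raise KeyError(subsystem)


def total_machines() -> int:
    """Count machines, treating the dual-processor PC as one machine."""
    return sum(node.count for node in KISMET_NODES)
```

Calling host_for("visual attention") in this sketch returns the QNX group, and total_machines() adds up the four microprocessors and eleven PCs listed above.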

The robot’s vision system consists of four colour CCD cameras mounted on a stereo active vision head. Two wide field of view cameras are mounted centrally and move with respect to the head. These are 0.25 inch CCD lipstick cameras with 2.2 mm lenses manufactured by Elmo Corporation.

They are used to decide what the robot should pay attention to, and to compute a distance estimate. There is also a camera mounted within the pupil of each eye. These are 0.5 inch CCD foveal cameras with 8 mm focal length lenses, and are used for higher resolution post-attentional processing, such as eye detection.
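The article does not say how the distance estimate is computed. One common approach with a pair of cameras is stereo triangulation, and the Python sketch below shows that calculation under a simple pinhole-camera assumption. The function name and the example numbers (focal length in pixels, camera spacing, disparity) are hypothetical and are not Kismet's actual parameters.

```python
def stereo_distance_m(focal_length_px: float,
                      baseline_m: float,
                      disparity_px: float) -> float:
    """Estimate distance to a point seen by two parallel cameras.

    Uses the standard pinhole relation Z = f * B / d, where f is the
    focal length in pixels, B the spacing between the cameras in metres,
    and d the horizontal shift (disparity) of the point between the two
    images in pixels.
    """
    if disparity_px <= 0:
        # Zero disparity means the point is effectively at infinity.
        raise ValueError("disparity must be positive")
    return focal_length_px * baseline_m / disparity_px


# Hypothetical values: a nearer object produces a larger disparity.
near = stereo_distance_m(focal_length_px=400.0, baseline_m=0.1, disparity_px=40.0)
far = stereo_distance_m(focal_length_px=400.0, baseline_m=0.1, disparity_px=20.0)
```

With these made-up values the object with twice the disparity comes out at half the distance, which is the property an attention system would rely on to tell a nearby face from a distant one.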

Kismet has three degrees of freedom to control gaze direction and three degrees of freedom to control its neck. The degrees of freedom are driven by Maxon DC servo motors with high resolution optical encoders for accurate position control. This gives the robot the ability to move and orient its eyes like a human.

A user can influence the robot’s behaviour through speech by wearing a small, unobtrusive wireless microphone. This auditory signal is fed into a 500 MHz PC running Linux. The real-time, low-level speech processing and recognition software was developed at MIT by the Spoken Language Systems Group. The auditory signals are sent to a dual 450 MHz PC running NT, which processes them in real time to recognise the intent of the human.

Kismet can display a wide assortment of facial expressions to mirror its ‘emotional’ state as well as to serve other communicative purposes. Each ear has two degrees of freedom that allow Kismet to perk its ears in an interested fashion, or fold them back in a manner reminiscent of an angry animal. Each eyebrow can lower and furrow in frustration, elevate upwards for surprise, or slant the inner corner of the brow upwards for sadness. Each eyelid can open and close independently, allowing the robot to wink an eye or blink both. The robot has four lip actuators, one at each corner of the mouth, that can be curled upwards for a smile or downwards for a frown. There is also a single degree of freedom jaw.

The robot’s vocalisations are generated by an articulatory synthesizer. The underlying software (DECtalk v4.5) is based on the Klatt synthesizer, which models the physiological characteristics of the human articulatory tract. By adjusting the parameters of the synthesizer it is possible to convey speaker personality (Kismet sounds like a young child) and to add emotional qualities to the synthesized speech.

The Kismet team has published several articles on the work, most recently in the July/August 2000 issue of IEEE Intelligent Systems. For more information, including video of the robot and technical papers, go to

MIT aren’t the only ones in the robot head business though. Researchers in the department of mechanical engineering at Japan’s Waseda University have also been developing anthropomorphic robot heads, which are able to communicate with humans through human-like facial expressions.

The brainchild of Professor Atsuo Takanishi, the WE-3RIV robot they have developed also uses its eyes, eyebrows and lips, as well as facial colour, to convey emotions, which include happiness, anger, surprise, sadness, fear, disgust, drunkenness and shame.
