A computer character capable of realistic emotional expression may soon be incorporated into working applications.
The research team in the Information Society Technologies (IST) project ERMIS created a prototype able to analyse and respond to user input.
In the analysis phase, the team extracted some 400 features of common speech, then selected around 20 to 25 as the most important in expressing emotion. These features were then fed into a neural network architecture that combined the different speech, paralinguistic and facial communication features. For facial expression, some 19 features were selected as the most relevant and were input accordingly.
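The article does not say how the speech features were ranked or how the network was built. As a rough illustration only, the sketch below assumes a simple correlation ranking to cut 400 speech features down to 20, then concatenates them with 19 facial features into a single input vector of the kind a network could consume. The function names (`select_top_k`, `fuse`), the ranking criterion and the synthetic data are all hypothetical stand-ins, not the ERMIS method.

```python
import random
from statistics import mean, pstdev


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 if either input is constant."""
    sx, sy = pstdev(xs), pstdev(ys)
    if sx == 0 or sy == 0:
        return 0.0
    mx, my = mean(xs), mean(ys)
    cov = mean((x - mx) * (y - my) for x, y in zip(xs, ys))
    return cov / (sx * sy)


def select_top_k(samples, labels, k):
    """Indices of the k features most correlated (in absolute value) with labels.

    Assumption: correlation ranking is a placeholder selection criterion.
    """
    n_features = len(samples[0])
    scores = []
    for j in range(n_features):
        column = [row[j] for row in samples]
        scores.append((abs(pearson(column, labels)), j))
    scores.sort(reverse=True)
    return sorted(j for _, j in scores[:k])


def fuse(speech_row, selected, facial_row):
    """Concatenate the selected speech features with the facial features."""
    return [speech_row[j] for j in selected] + list(facial_row)


# Synthetic demo data (hypothetical): 100 samples, 400 speech features,
# 19 facial features, and an emotion score driven by two speech features.
random.seed(0)
speech = [[random.gauss(0, 1) for _ in range(400)] for _ in range(100)]
facial = [[random.gauss(0, 1) for _ in range(19)] for _ in range(100)]
emotion = [row[0] + row[1] + random.gauss(0, 0.1) for row in speech]

selected = select_top_k(speech, emotion, k=20)
network_input = fuse(speech[0], selected, facial[0])
print(len(network_input))  # 20 selected speech features + 19 facial features
```

In a real system the fused vector would be passed to a trained model; the point here is only the shape of the pipeline: many candidate features, a small selected subset, and one combined input across modalities.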
The results of this analysis were incorporated into a prototype system with several on-screen characters, each of which was capable of reacting to and reproducing the emotional content of speech and facial expressions. By interacting with their human subjects, these computer characters would attempt to make the user angry, happy, sad or even bored, sometimes with great success, explained Stefanos Kollias of the National Technical University of Athens.
He emphasises, however, that the team did not just focus on extreme emotions. “We tried to develop real-life situations, with the language and facial reactions that expressed everyday emotions over a wide range. For example, feeling positive and eager to participate, or negative and less motivated.”
The result of the ERMIS team’s work is what they call the “sensitive artificial listener,” a computer character that is capable of realistically expressing emotions when communicating with human beings.
The project partners are now analysing these results with a view to incorporating them into their own products. BT, for example, is very interested in how the results could be used within its call centre technologies. Nokia, another partner, is investigating the possibility of incorporating such abilities into its multimedia mobile phones. Eyetronics is incorporating what has been learnt into its own 3D virtual models in order to enhance the modelling of facial movements in virtual characters.
“Our work has shown that combined AV [audiovisual] and speech analysis is both feasible and has the potential for incorporation into working applications,” says Kollias. The project results have also led to a follow-on initiative: the four-year HUMAINE (FP6) project.