Intel researchers have released software under an open-source license that allows developers to build computers that see and ‘read lips’ the way humans do to better understand spoken commands.
Today’s speech recognition algorithms work well when background noise is eliminated or a well-tuned headset is used, but their accuracy rapidly degrades when applications have to cope with noisy environments, such as public places.
Combined with face detection algorithms from Intel’s OpenCV computer vision library, Intel’s Audio Visual Speech Recognition (AVSR) software enables computers to detect a speaker’s face and track their mouth movements.
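As a rough illustration of the step from a detected face to the mouth area, the sketch below estimates a mouth region from a face bounding box. It is a minimal sketch and not the AVSR implementation: the `Box` type, the `mouth_region` function, and the fractions `top_frac` and `side_margin` are hypothetical and chosen only for illustration. The face box is assumed to come from a separate face detector.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Axis-aligned rectangle in pixel coordinates (hypothetical helper type)."""
    x: int  # left edge
    y: int  # top edge
    w: int  # width
    h: int  # height


def mouth_region(face: Box, top_frac: float = 0.6, side_margin: float = 0.2) -> Box:
    """Estimate a mouth box as the lower, central part of a face box.

    top_frac and side_margin are illustrative guesses, not values taken
    from the AVSR software.
    """
    if face.w <= 0 or face.h <= 0:
        raise ValueError("face box must have positive width and height")
    margin = round(face.w * side_margin)   # pixels trimmed from each side
    top = face.y + round(face.h * top_frac)  # where the mouth region starts
    return Box(
        x=face.x + margin,
        y=top,
        w=face.w - 2 * margin,
        h=face.y + face.h - top,           # extend to the bottom of the face
    )
```

A tracker could call `mouth_region` on the face box found in each video frame and pass the cropped pixels to a lip-reading stage.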
Synchronizing the video of those mouth movements with the audio signal makes speech recognition considerably more accurate in noisy environments, which benefits a wide variety of computer applications.
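One simple way to see why a second stream helps is weighted late fusion, in which each candidate word receives a score from the audio and from the video and the two are blended according to how much the audio is trusted. The sketch below assumes that approach for illustration only; it is not necessarily how AVSR combines the streams. The function names, the `audio_weight` parameter, and all probabilities are made up.

```python
import math


def fuse_scores(audio_logp, video_logp, audio_weight):
    """Blend per-word log-likelihoods from the audio and video streams.

    audio_weight is in [0, 1]; lower values mean the audio is trusted less,
    for example in a noisy room.
    """
    if not 0.0 <= audio_weight <= 1.0:
        raise ValueError("audio_weight must be between 0 and 1")
    words = audio_logp.keys() & video_logp.keys()  # words scored by both streams
    return {
        w: audio_weight * audio_logp[w] + (1.0 - audio_weight) * video_logp[w]
        for w in words
    }


def recognize(audio_logp, video_logp, audio_weight):
    """Return the word with the highest fused score."""
    fused = fuse_scores(audio_logp, video_logp, audio_weight)
    if not fused:
        raise ValueError("no word was scored by both streams")
    return max(fused, key=fused.get)


# Illustrative numbers only: noise makes the audio slightly prefer "nine",
# while the lip shapes clearly indicate "five".
audio = {"five": math.log(0.45), "nine": math.log(0.55)}
video = {"five": math.log(0.80), "nine": math.log(0.20)}

audio_only = recognize(audio, video, audio_weight=1.0)
fused_pick = recognize(audio, video, audio_weight=0.3)
```

With these made-up scores, relying on audio alone selects "nine", while giving the video more weight selects "five".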
The AVSR software is part of Intel’s OpenCV computer vision library, a toolbox of more than 500 imaging functions that helps researchers develop computer vision applications.
The OpenCV web site is located at www.intel.com/research/mrl/research/opencv/.