Telemedicine could be improved with a method that uses the camera on an electronic device to take pulse and respiration signals from a real-time video of a patient’s face.
The method, developed by a University of Washington-led team, was presented in December 2020 at the Neural Information Processing Systems conference. The team is now proposing a better way to measure these physiological signals, with a system less likely to be hindered by different cameras, lighting conditions or facial features. The researchers will present these findings April 8 at the ACM Conference on Health, Inference, and Learning.
"Machine learning is pretty good at classifying images. If you give it a series of photos of cats and then tell it to find cats in other images, it can do it. But for machine learning to be helpful in remote health sensing, we need a system that can identify the region of interest in a video that holds the strongest source of physiological information - pulse, for example - and then measure that over time," said lead author Xin Liu, a UW doctoral student in the Paul G. Allen School of Computer Science & Engineering.
"Every person is different," Liu said in a statement. "So this system needs to be able to quickly adapt to each person's unique physiological signature, and separate this from other variations, such as what they look like and what environment they are in."
The team's system runs on the device instead of in the cloud and uses machine learning to capture subtle changes in how light reflects off a person's face, which are correlated with changing blood flow. It then converts these changes into both pulse and respiration rate.
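To make that last step concrete, here is a minimal sketch of how a rate can be read from a brightness trace once one has been extracted from video. It is not the team's machine learning system. It is a simple frequency-analysis stand-in, and the synthetic trace, the frame rate, the frequency bands and the function name `dominant_rate_per_minute` are all assumptions made for illustration.

```python
import math


def dominant_rate_per_minute(signal, fps, low_hz, high_hz, step_hz=0.01):
    """Return the strongest periodic rate (cycles per minute) inside a frequency band."""
    n = len(signal)
    mean = sum(signal) / n
    centered = [s - mean for s in signal]  # drop the constant brightness offset

    best_freq, best_power = low_hz, -1.0
    steps = int(round((high_hz - low_hz) / step_hz))
    for k in range(steps + 1):
        freq = low_hz + k * step_hz
        # Correlate the trace with a cosine and a sine at this candidate frequency.
        re = sum(c * math.cos(2 * math.pi * freq * i / fps) for i, c in enumerate(centered))
        im = sum(c * math.sin(2 * math.pi * freq * i / fps) for i, c in enumerate(centered))
        power = re * re + im * im
        if power > best_power:
            best_freq, best_power = freq, power
    return best_freq * 60.0


# Synthetic stand-in for a per-frame facial brightness trace (all values assumed):
# a small 1.2 Hz pulse ripple and a larger 0.25 Hz breathing ripple on a constant level.
FPS = 30
DURATION_S = 20
trace = [
    100.0
    + 0.5 * math.sin(2 * math.pi * 1.2 * i / FPS)
    + 1.0 * math.sin(2 * math.pi * 0.25 * i / FPS)
    for i in range(FPS * DURATION_S)
]

# Assumed search bands: roughly 42 to 180 beats/min and 6 to 30 breaths/min.
pulse_bpm = dominant_rate_per_minute(trace, FPS, 0.7, 3.0)
breaths_per_min = dominant_rate_per_minute(trace, FPS, 0.1, 0.5)
print(f"pulse ~ {pulse_bpm:.0f} beats/min, respiration ~ {breaths_per_min:.0f} breaths/min")
```

Searching the two bands separately is what lets one trace yield both vital signs. Real video adds motion, lighting changes and camera noise that this toy trace leaves out, which is the gap the learned system is meant to close.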
The first version of this system was trained with a dataset that contained both videos of people's faces and ‘ground truth’ information: each person's pulse and respiration rate measured by standard instruments in the field. The system then used spatial and temporal information from the videos to calculate both vital signs. It is said to have outperformed similar machine learning systems on videos where subjects were moving and talking.
The system worked well on some datasets but struggled with others that contained different people, backgrounds and lighting. This is a common problem known as ‘overfitting,’ the team said.
The researchers improved the system by having it produce a personalized machine learning model for each individual. Specifically, the model looks for the areas of a video frame most likely to contain physiological features correlated with changing blood flow in a face under different contexts, such as different skin tones, lighting conditions and environments. From there, it can focus on those areas and measure the pulse and respiration rate.
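The article does not describe how the personalized model is produced. As a rough illustration of the general idea only, the sketch below starts from a generic model and nudges it with a few gradient steps on a handful of samples from one person. The one-variable linear model, the sample values, the learning rate and the helper names `mse` and `adapt` are all hypothetical and are not taken from the team's system.

```python
def mse(w, b, xs, ys):
    """Mean squared error of the linear model y = w * x + b."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def adapt(w, b, xs, ys, lr=0.1, steps=20):
    """Take a few gradient-descent steps on one person's samples."""
    n = len(xs)
    for _ in range(steps):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Generic starting point, standing in for a model trained on many people (assumed).
generic_w, generic_b = 1.0, 0.0

# A few samples from one hypothetical person whose signal follows a different mapping.
person_x = [0.0, 0.25, 0.5, 0.75, 1.0]
person_y = [1.5 * x + 0.2 for x in person_x]

personal_w, personal_b = adapt(generic_w, generic_b, person_x, person_y)

generic_error = mse(generic_w, generic_b, person_x, person_y)
personal_error = mse(personal_w, personal_b, person_x, person_y)
print(f"error before adapting: {generic_error:.4f}, after: {personal_error:.4f}")
```

The point of the sketch is only that a short adaptation on a small amount of one person's data can lower the error for that person compared with the unchanged generic model.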
While this new system outperforms its predecessor when given more challenging datasets, especially for people with darker skin tones, there is still more work to do, the team said.
"We acknowledge that there is still a trend toward inferior performance when the subject's skin type is darker," Liu said. "This is in part because light reflects differently off of darker skin, resulting in a weaker signal for the camera to pick up. Our team is actively developing new methods to solve this limitation."
The researchers are also working on a variety of collaborations with doctors to see how this system performs in the clinic.
"Any ability to sense pulse or respiration rate remotely provides new opportunities for remote patient care and telemedicine. This could include self-care, follow-up care or triage, especially when someone doesn't have convenient access to a clinic," said senior author Shwetak Patel, a professor in both the Allen School and the electrical and computer engineering department. "It's exciting to see academic communities working on new algorithmic approaches to address this with devices that people have in their homes."