Reinforcement learning shows promise for diabetes dosing

A new study from the University of Bristol has shown how the same type of machine learning used to train self-driving cars and chess programs could be used to manage insulin dosing for people with diabetes.


Known as reinforcement learning, the method uses an algorithm that learns from patient records, drawing on decisions already made by patients rather than on trial and error to inform dosing. The Bristol team found that the approach significantly outperformed commercial blood glucose controllers on both safety and effectiveness in refining insulin dosing and controlling blood sugar. The work is published in the Journal of Biomedical Informatics.

“These machine learning driven algorithms have demonstrated superhuman performance in playing chess and piloting self-driving cars, and therefore could feasibly learn to perform highly personalised insulin dosing from pre-collected blood glucose data,” said lead author Harry Emerson from Bristol’s Department of Engineering Mathematics.

“This particular piece of work focuses specifically on offline reinforcement learning, in which the algorithm learns to act by observing examples of good and bad blood glucose control. Prior reinforcement learning methods in this area predominantly utilise a process of trial-and-error to identify good actions, which could expose a real-world patient to unsafe insulin doses.”
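To make the distinction concrete, the short Python sketch below learns from a fixed log of past decisions and never tries a new action on a patient. It is a toy illustration only, not the algorithm used in the study: the three glucose states, the two actions, the rewards and the function names `offline_q_learning` and `greedy_action` are all hypothetical, and plain tabular Q-learning stands in for the more sophisticated offline methods a real system would need.

```python
from collections import defaultdict

# Hypothetical logged data: (state, action, reward, next_state).
# Every entry is something that already happened; nothing is tried live.
LOGGED_TRANSITIONS = [
    ("high", "dose", 1.0, "in_range"),
    ("high", "no_dose", -1.0, "high"),
    ("in_range", "no_dose", 1.0, "in_range"),
    ("in_range", "dose", -2.0, "low"),
    ("low", "no_dose", 1.0, "in_range"),
    ("low", "dose", -2.0, "low"),
]
ACTIONS = ["no_dose", "dose"]


def offline_q_learning(transitions, actions, gamma=0.9, alpha=0.5, sweeps=200):
    """Estimate action values by repeatedly replaying a fixed dataset."""
    q = defaultdict(float)
    for _ in range(sweeps):
        for state, action, reward, next_state in transitions:
            best_next = max(q[(next_state, a)] for a in actions)
            target = reward + gamma * best_next
            q[(state, action)] += alpha * (target - q[(state, action)])
    return q


def greedy_action(q, state, actions):
    """Pick the action with the highest learned value in this state."""
    return max(actions, key=lambda a: q[(state, a)])


q_values = offline_q_learning(LOGGED_TRANSITIONS, ACTIONS)
policy = {s: greedy_action(q_values, s, ACTIONS) for s in ("low", "in_range", "high")}
print(policy)
```

In this toy log the learned policy doses only when glucose is high. Practical offline methods typically add safeguards so the algorithm does not overvalue actions that rarely or never appear in the data.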

Due to the high risk associated with incorrect insulin dosing, experiments were performed using the FDA-approved UVA/Padova simulator, which creates a suite of virtual patients to test type 1 diabetes control algorithms. Offline reinforcement learning algorithms were evaluated against one of the most widely used artificial pancreas control algorithms.

This comparison was conducted across 30 virtual patients of different ages and considered 7,000 days of data, with performance evaluated in accordance with current clinical guidelines. The simulator was also extended to consider realistic implementation challenges, such as measurement errors, incorrect patient information and limited quantities of available data.

The biggest benefit was seen in children, who spent an additional one-and-a-half hours per day in the target glucose range. Children are a particularly important group because they are often unable to manage their diabetes without assistance, and an improvement of this size would result in markedly better long-term health outcomes.
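For a sense of scale, time in range can be computed directly from glucose sensor readings. The Python sketch below is illustrative only: the article does not state the thresholds, so the 70 to 180 mg/dL target range (a widely used clinical consensus target) and the five-minute sampling interval are assumptions, and the two days of readings are invented rather than taken from the study.

```python
def time_in_range_hours(readings_mg_dl, low=70.0, high=180.0, minutes_per_reading=5):
    """Hours spent with glucose between `low` and `high` (inclusive)."""
    in_range = sum(1 for g in readings_mg_dl if low <= g <= high)
    return in_range * minutes_per_reading / 60.0


# Two invented days of 288 five-minute readings each (assumed values).
baseline_day = [150.0] * 180 + [210.0] * 108   # 180 readings in range
improved_day = [150.0] * 198 + [210.0] * 90    # 198 readings in range

baseline_hours = time_in_range_hours(baseline_day)
improved_hours = time_in_range_hours(improved_day)
print(baseline_hours, improved_hours, improved_hours - baseline_hours)
# prints: 15.0 16.5 1.5
```

Under these assumptions, an extra hour and a half per day corresponds to 18 more in-range readings out of 288, or about six per cent of the day.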

The ultimate goal is to deploy reinforcement learning in real-world artificial pancreas systems. However, these devices operate with limited patient oversight and as such will require significant evidence of safety and effectiveness to gain regulatory approval.

“The explored method outperforms one of the most widely used commercial artificial pancreas algorithms and demonstrates an ability to leverage a person's habits and schedule to respond more quickly to dangerous events,” said Emerson.