Computers classify injury claims

Researchers at Purdue University are developing computer models to comb through thousands of injury reports from workers’ compensation claims to automatically classify them based on specific words or phrases.

‘One goal is to identify the most important causes of injuries so that efforts could be directed towards reducing the burden of injuries in society,’ said Mark Lehto, an associate professor in Purdue University’s School of Industrial Engineering.

The reports, usually filled out by employers, healthcare professionals or the claimants themselves, are currently classified by hand. The work is done by manual coders hired by users such as the US National Center for Health Statistics, or by hospital staff or insurance industry handlers, who review the thousands of ‘injury narratives’ included in the reports.

The Purdue engineer and researchers at the Liberty Mutual Research Institute for Safety assigned codes to injury reports from workers’ compensation claims using two different models, both developed with Bayesian methods.

‘The predictions were quite good,’ said Lehto. ‘The results were comparable to the human coders. The accuracy is surprising considering all of the misspellings, run-on words, abbreviations and inconsistent or missing punctuations seen in these workers’ compensation claim narratives.’
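To give a feel for how a Bayesian text classifier can assign a code from the words in a narrative, here is a minimal sketch of a naive Bayes classifier, one common Bayesian approach. It is an illustration only and not necessarily either of the two models the researchers built; the class name `NaiveBayesCoder`, the example narratives and the labels `fall` and `struck_by` are all made up for this sketch and are not actual Bureau of Labor Statistics event codes.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    # Lowercase the narrative and keep runs of letters or digits as words.
    return re.findall(r"[a-z0-9]+", text.lower())


class NaiveBayesCoder:
    """Hypothetical naive Bayes classifier with add-alpha (Laplace) smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha
        self.word_counts = defaultdict(Counter)  # code -> word -> count
        self.class_counts = Counter()            # code -> number of narratives
        self.vocab = set()

    def fit(self, narratives, codes):
        for text, code in zip(narratives, codes):
            self.class_counts[code] += 1
            for word in tokenize(text):
                self.word_counts[code][word] += 1
                self.vocab.add(word)
        return self

    def predict(self, text):
        # Words never seen in training carry no evidence, so skip them.
        words = [w for w in tokenize(text) if w in self.vocab]
        total_docs = sum(self.class_counts.values())
        best_code, best_score = None, -math.inf
        for code in sorted(self.class_counts):
            # log P(code) + sum of log P(word | code)
            score = math.log(self.class_counts[code] / total_docs)
            denom = (sum(self.word_counts[code].values())
                     + self.alpha * len(self.vocab))
            for w in words:
                score += math.log((self.word_counts[code][w] + self.alpha) / denom)
            if score > best_score:
                best_code, best_score = code, score
        return best_code


# Invented training narratives with invented labels (assumptions, not real data).
training = [
    ("slipped on wet floor and fell", "fall"),
    ("fell from ladder while painting", "fall"),
    ("struck by falling box in warehouse", "struck_by"),
    ("hit on head by swinging door", "struck_by"),
]
coder = NaiveBayesCoder().fit(*zip(*training))
print(coder.predict("employee slipped on ice and fell"))  # -> fall
```

A sketch like this treats every word as independent evidence for a code. Handling the misspellings and run-on words Lehto mentions would need extra preprocessing that is not shown here.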

The research findings were detailed in a study published in August in the journal Injury Prevention. The paper was written by Lehto and Liberty Mutual research scientists Helen Marucci-Wellman and Helen Corns.

Insurance companies enter, maintain and manage tens of thousands of claims annually. The study examined computer-based approaches for efficiently assigning each claim to one- and two-digit ‘event code’ categories developed by the US Bureau of Labor Statistics.

The researchers used a database of 14,000 claim cases, with 11,000 used to develop the models and the remaining 3,000 used to test them.

‘It’s important to distinguish that we predicted 3,000 cases that were different than the ones used to develop the models,’ said Lehto. ‘These were cases the models hadn’t seen before and the models accurately predicted how these cases would be classified by human coders.’
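The idea of testing on cases the models have not seen can be sketched as a simple holdout evaluation. The article does not say how the 14,000 cases were divided, so the random shuffle below is an assumption, and the function names `holdout_split` and `agreement_rate`, the synthetic cases and the labels are hypothetical placeholders rather than anything from the study.

```python
import random


def holdout_split(cases, n_test, seed=0):
    """Shuffle a copy of the cases and set aside n_test of them for testing."""
    shuffled = list(cases)
    random.Random(seed).shuffle(shuffled)
    return shuffled[n_test:], shuffled[:n_test]


def agreement_rate(predict, test_cases):
    """Fraction of test cases where the predicted code matches the human code."""
    matches = sum(1 for text, human_code in test_cases
                  if predict(text) == human_code)
    return matches / len(test_cases)


# Placeholder data: 14,000 synthetic (narrative, code) pairs, not real claims.
cases = [(f"narrative {i}", "fall" if i % 2 else "struck_by")
         for i in range(14000)]

# 11,000 cases to develop a model, 3,000 held back to test it.
train_cases, test_cases = holdout_split(cases, n_test=3000)
print(len(train_cases), len(test_cases))  # -> 11000 3000
```

Any classifier's `predict` function can be passed to `agreement_rate` together with `test_cases`. Because the held-out cases play no part in building the model, the resulting figure reflects performance on new narratives rather than on ones the model has effectively memorised.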