Comment: how engineers are utilising data-centric AI

What does the evolution towards data-centric AI mean for engineers, asks David Willingham, deep learning product manager, MathWorks


Driven by an increase in the data available to build robust models, and expanding knowledge around the benefits of quality input, many engineers have begun to move away from a model-centric approach in pursuit of greater accuracy across industry applications. As a result, data-centric AI has been rapidly gaining momentum, with more and more engineers tapping into its benefits.

With a model’s performance dependent on the quality of its training data, the recent shift towards a greater data focus has empowered teams to improve model accuracy without the need to constantly tweak model parameters. By improving both data quality and model accuracy, data-centric AI has opened the door for new areas of application to explore AI, including 5G communications and medical device imaging.

Successful modelling has always been conditional on standards of input data, but the modern challenge lies in determining how data-centric AI can solve specific application problems, and the techniques and tools available to do so. As research into data-centric AI continues, best practice is expected to evolve to work with a growing list of use-cases.

Implementing data-centric AI

To achieve accurate results, engineers increasingly emphasise improving the quality of the data fed into a model. But as data-centric AI continues to drive improved model outcomes, it's important to note that there are no universal standards for the amount of data needed to maintain a successful AI model. Engineers must therefore remember that data-centric AI is dynamic, and that needs will vary based on the application.

Ultimately, this necessitates a multi-faceted approach to data optimisation to ensure accuracy. As more engineers implement data-centric AI in their operations, best practices such as reduced order modelling, data synchronisation, digital distortion and image augmentation are being used to improve outcomes.

Reduced order modelling estimates the behaviour of source environments, allowing engineers to study a system’s dominant effects quickly, using minimal computational resources whilst maintaining data quality. For image-based applications (e.g. object classification), engineers can close gaps in training data by retaking images or augmenting the originals to create new copies, ensuring ample data volumes for robust model training.
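
As a minimal sketch of the augmentation idea, the Python snippet below creates flipped and rotated copies of a small greyscale image held as a list of pixel rows. The function names and the choice of transformations are illustrative assumptions rather than a method described in this article; real projects would typically rely on an image-processing library and a wider set of transformations.

```python
from typing import List

# Illustrative assumption: a greyscale image stored as rows of pixel values.
Image = List[List[int]]


def flip_horizontal(image: Image) -> Image:
    """Mirror the image left to right."""
    return [list(reversed(row)) for row in image]


def rotate_90_clockwise(image: Image) -> Image:
    """Rotate the image a quarter turn clockwise."""
    # Reverse the row order, then transpose rows into columns.
    return [list(column) for column in zip(*image[::-1])]


def augment(image: Image) -> List[Image]:
    """Return the original image plus three simple geometric variants."""
    flipped = flip_horizontal(image)
    return [
        image,
        flipped,
        rotate_90_clockwise(image),
        rotate_90_clockwise(flipped),
    ]
```

Each source image yields four training samples here. Whether a flip or rotation preserves the label depends on the application, so the set of transformations should be chosen with the classification task in mind.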

Data synchronisation ensures that the data used aligns with the needs of the application. If engineers build an AI model that makes hourly predictions, it will require hourly data inputs to guide its performance.
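
One way to picture this is a sketch, assuming irregularly timed sensor readings, that averages them into one value per clock hour so the data matches an hourly model. The function name and sample values below are hypothetical and use only the Python standard library.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean
from typing import Dict, List, Tuple


def to_hourly_means(readings: List[Tuple[datetime, float]]) -> Dict[datetime, float]:
    """Average irregular readings into one value per clock hour."""
    buckets = defaultdict(list)
    for timestamp, value in readings:
        # Truncate each timestamp to the start of its hour.
        hour = timestamp.replace(minute=0, second=0, microsecond=0)
        buckets[hour].append(value)
    return {hour: mean(values) for hour, values in sorted(buckets.items())}


# Hypothetical readings taken at uneven intervals.
readings = [
    (datetime(2023, 1, 1, 9, 5), 20.0),
    (datetime(2023, 1, 1, 9, 40), 22.0),
    (datetime(2023, 1, 1, 10, 15), 21.0),
]
hourly = to_hourly_means(readings)
```

Hours with no readings are simply absent from the result in this sketch. Whether to interpolate, carry the last value forward, or discard such gaps is an application-specific decision.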

As data quality improves, so too will engineers’ ability to tackle bias. Improved data makes bias easier to recognise, giving engineers the insight needed to ensure that data collection is adequate to produce a representative outcome in vital fields like healthcare.
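
A simple first check for this kind of bias, sketched below, is to measure how each group is represented in a data set and flag any that fall below a chosen share. The group labels and the 20 per cent threshold are illustrative assumptions, not recommendations from this article.

```python
from collections import Counter
from typing import Dict, Iterable, List


def group_shares(groups: Iterable[str]) -> Dict[str, float]:
    """Return each group's share of the samples as a fraction of the total."""
    counts = Counter(groups)
    total = sum(counts.values())
    return {group: count / total for group, count in counts.items()}


def under_represented(groups: Iterable[str], min_share: float = 0.2) -> List[str]:
    """List groups whose share falls below min_share (an assumed threshold)."""
    shares = group_shares(groups)
    return sorted(group for group, share in shares.items() if share < min_share)
```

A check like this only reveals imbalance in the labels that were recorded. It cannot detect groups that are missing from the data altogether, which is why domain knowledge still matters when planning data collection.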

Industry applications

The improved model outcomes that a data-centric approach has brought have thrust data-centric AI into applications across industries. In the field of wireless, data optimisation techniques have changed the way engineers design digital predistortion filters, which proactively modify signals to reach a comfortable noise level in the presence of competing ones.

Medical device imaging is also embracing this area. By pairing image and signal data, engineers can adjust 3D imaging machines to drive more tailored and accurate tumour analysis and lung health measurement, with additional applications for COVID-19 screening.

In automotive engineering, data-centric AI is being applied to build a clearer picture of battery sensor data, such as voltage and average temperature. This enables better state-of-charge estimation, a vital component in the design and improvement of electric car batteries.

There are a number of experiment-based and data preparation tools that can assist engineers in bringing data-centricity to AI models. Because model code remains mostly constant, data-centric AI moves code modification to the front of the design process, where the data is prepared. A number of modern applications test coding protocols aimed at data optimisation, allowing engineers to evaluate potential AI modelling improvements through data quality adjustments. Engineers have also found value in data preparation apps that enable quick and automated data labelling.

What’s in a data-centric future?

As research into data-centric AI continues, engineers should remain aware that efficient modelling requires close engagement between the data scientists leading modelling efforts and the engineers who supply the data that makes those models work. By showing how data can be enriched to support a model that engineers may not be building themselves, data-centric AI provides a route to collaboration for multi-disciplinary teams.

Engineers across industries are accelerating their use of data-centric AI, leading to improved data quality and model accuracy across an entire spectrum of applications. As data-centric AI is utilised more in the years to come, it has the potential to drive greater collaboration between engineering teams and accelerate the pace and scope of projects.

David Willingham, deep learning product manager, MathWorks