Data can help drive UK rail revolution

The rail industry has collected a lot of data, but it doesn’t always know what to do with it, says Simon Stoddart, Consultant at Tessella.

In recent years, rail organisations have recognised that data can help them in all sorts of ways, and have started collecting it. A growing number of sensors gather information on acceleration, power use, door closure times, fault codes, and so on. All of this has been added to their existing stores of data about the performance of the business (punctuality, spending, maintenance plans, etc.).

Hidden within this data lie explanations of when and why faults occur and what causes delays, as well as the keys to optimising performance and investment in new equipment and infrastructure. The benefits range from ongoing cost savings, to improved schedules, to extending a fleet’s lifetime by a decade. The industry is excited. But it is also struggling to get the hoped-for value from this data.

What benefits can data projects deliver to rail?

Good data projects start with a plan. This can be a plan to investigate a specific known problem, or to understand whether unknown problems exist.

The latter might start by picking a broad parameter to investigate, such as power consumption. We could then gather data on location (via GPS), acceleration (using accelerometers) and control system performance.

An investigation like this could create a baseline of power consumption under different conditions (acceleration, travelling over fixed routes, braking) on a given stretch of track. It could then look at variations from this baseline over a long period of time. Since carriages can change order, data should be decoupled to see if a problem is coming from a specific carriage. If one unit demands significantly more power, this might alert us to a problem.
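The baseline comparison described above can be sketched in a few lines of Python. This is a minimal illustration rather than a production method: the function name `flag_high_power_units`, the unit identifiers, the energy figures and the 20% threshold are all hypothetical, and it assumes the readings have already been attributed to individual units and taken on the same stretch of track under comparable conditions.

```python
from statistics import median

def flag_high_power_units(readings, threshold=1.2):
    """Return the units whose typical energy use exceeds the fleet baseline.

    readings:  dict mapping a unit ID to a list of energy readings (e.g. kWh)
               for the same stretch of track under comparable conditions.
    threshold: hypothetical multiplier; a unit is flagged when its median
               reading is more than threshold times the fleet baseline.
    """
    # Summarise each unit by its median so a single odd run does not dominate.
    unit_medians = {unit: median(values)
                    for unit, values in readings.items() if values}
    if not unit_medians:
        return []
    # The fleet baseline is the median of the per-unit medians.
    fleet_baseline = median(unit_medians.values())
    return sorted(unit for unit, typical in unit_medians.items()
                  if typical > threshold * fleet_baseline)

# Illustrative, made-up readings for three units on one stretch of track.
example_readings = {
    "unit_A": [10.1, 9.8, 10.3],
    "unit_B": [10.0, 10.2, 9.9],
    "unit_C": [13.5, 14.1, 13.8],
}
print(flag_high_power_units(example_readings))  # ['unit_C']
```

Medians are used here instead of means so that an occasional unusual journey does not distort the baseline; a real investigation would also need to control for load, weather and driving style.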

This investigative stage helps identify the need for a more specific investigation, which drills down into what is driving the problem. In other cases, the company may not need this first step: it may already know there is a problem, but need an investigation to understand the cause.

Data can also be used to investigate the condition of rolling stock. For example, damage to a wheel will create subtle differences in vibrations. By fitting vibration monitors and creating a vibration profile of different types of wheel damage, we can spot this problem when we see it.
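As a rough illustration of matching a signal against vibration profiles, the sketch below summarises a window of vibration samples with two simple features (RMS level and crest factor) and picks the closest reference profile. Everything here is an assumption made for illustration: the feature choice, the profile names and the profile values are hypothetical, and a real system would more likely use spectral features derived from measured examples of each damage type.

```python
import math

def vibration_features(samples):
    """Summarise a window of vibration samples as (RMS, crest factor)."""
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    peak = max(abs(s) for s in samples)
    # Crest factor (peak / RMS) rises when the signal contains sharp impacts.
    crest = peak / rms if rms else 0.0
    return (rms, crest)

def closest_profile(samples, profiles):
    """Return the name of the reference profile nearest to the samples.

    profiles: dict mapping a condition name to a reference (RMS, crest factor)
              pair. The values used below are made up for illustration.
    """
    features = vibration_features(samples)
    return min(profiles, key=lambda name: math.dist(features, profiles[name]))

# Hypothetical reference profiles; real ones would come from measured data.
reference_profiles = {
    "healthy": (1.0, 1.5),
    "wheel_flat": (2.5, 3.0),
}

smooth_signal = [1, -1, 1, -1, 1, -1, 1, -1]
impact_signal = [1, -1, 1, -1, 6, -1, 1, -1]  # one sharp impact per window

print(closest_profile(smooth_signal, reference_profiles))
print(closest_profile(impact_signal, reference_profiles))
```

The two features are on different scales, so a practical version would normalise them before measuring distance.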

Such projects may initially be run as proofs of concept, demonstrating how historical data can identify problems that people alone could not spot. Cost-saving action can often then be taken. But the real value comes when these models are productionised: wrapped in user-friendly software and rolled out across the organisation, so that they can combine real-time data with historic trend data to alert engineers to impending problems.

How to approach a rail data project

The ideal approach for such data projects is to start by identifying the problem that needs to be solved, and then identifying the data that needs to be captured. Once it gets underway, the project should be agile, reviewing progress and opportunities iteratively at each stage.


Many rail companies, like those in other industries, started collecting data years ago but only recently started thinking about what to do with it. Years of data can still be useful, but inevitably some datasets will not be ideally suited to the problem the company wants to solve, or may require significant reformatting and cleaning. This is further complicated because much of the data comes from third-party instruments that use different formats and take different approaches to sharing data.

Take an example problem: predicting train door faults. When a door fails, a carriage must be taken out of service for maintenance. Changes in door opening times are a precursor to door failure, so it is possible to spot deterioration patterns well before they become a problem that a customer would notice. The maintenance engineer would be told which door is affected, and could take action at their convenience.

This requires sensors to detect door position, with data recorded at a high enough time resolution. At one-second resolution, a gradual slowing would be easily missed; at one-tenth of a second it would quickly become clear when something was going wrong. If we plan in advance to deploy the right sensors and capture the right data for the problem we want to solve, it is relatively straightforward to design suitable algorithms to monitor the doors.
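The effect of time resolution can be shown with a small sketch. The door opening times, the helper names `recorded` and `slowdown_detected`, and the 150 ms tolerance below are all hypothetical; the sketch assumes a sensor that simply truncates each measurement to its resolution, and uses a deliberately crude check that compares recent openings with the earliest ones.

```python
from statistics import mean

def recorded(duration_ms, resolution_ms):
    """Truncate a true duration to what a sensor of this resolution logs."""
    return (duration_ms // resolution_ms) * resolution_ms

def slowdown_detected(durations_ms, window=3, tolerance_ms=150):
    """Compare the mean of the latest openings with the earliest ones.

    window and tolerance_ms are illustrative values, not recommendations.
    """
    baseline = mean(durations_ms[:window])
    recent = mean(durations_ms[-window:])
    return recent - baseline > tolerance_ms

# Made-up true opening times (ms) for one door, slowing by 50 ms each time.
true_times_ms = [2300, 2350, 2400, 2450, 2500, 2550, 2600]

one_second_log = [recorded(t, 1000) for t in true_times_ms]
tenth_second_log = [recorded(t, 100) for t in true_times_ms]

print(slowdown_detected(one_second_log))    # False: every reading logs as 2000
print(slowdown_detected(tenth_second_log))  # True: the drift is visible
```

At one-second resolution every opening is logged as the same value, so no algorithm could recover the trend afterwards, which is why the sensor specification needs to be decided before data collection starts.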

Ideal approaches are not always possible in the real world. For rail companies with several years of data, there are still plenty of opportunities to use it productively. But going forward, this mindset will help them use data more effectively.


Towards a data-driven rail industry

Whilst data collection is becoming embedded, strategic approaches to data projects – which identify problems to be solved and ways to solve them – are only just starting to take hold.

This is not surprising; data is new as a business driver. An industry used to long-term planning and physical infrastructure will understandably take time to adapt to fast-moving, agile data projects, which require new skillsets and mindsets.

But change could come quickly. The benefits of rail data can be understood relatively easily, compared to industries like pharmaceuticals. Small, low-risk proofs of concept can be developed quickly to spot potential improvements and map a route forward (or quickly identify red herrings which should be abandoned) before any major investment is needed in scaling them up. That is an important advantage when it comes to getting business buy-in.

We find that once we present visualisations of data and insights to senior managers, in rail as in other industries, they quickly see the potential and get excited. Even simple statistical correlations, which are a long way from the huge potential of cutting-edge data science, can come as a revelation. As with any new disruptive approach, the challenge is that they don’t know what they don’t know. Once those in charge see what data can do, the benefits of further studies become obvious.

Tessella is a data science, analytics and AI consultancy. It is part of global engineering company Altran.