The key to successful implementation of machine learning is data. Because failures on the railway network are rare, the resulting dataset is too small to support statistically meaningful training (learning). Heinrich introduced the concept of the safety triangle (see Figure below), in which the near miss was formulated as an indicator of risk. The idea has been widely applied across a number of industries, including rail. The Confidential Incident Reporting & Analysis System (CIRAS), developed at the University of Strathclyde, has been implemented in the UK since the 1999 Ladbroke Grove rail crash.
For the near miss approach to make sense, near misses must be direct causal predictors of later, more serious accidents. This assumption rests on the common cause hypothesis: that near misses and accidents have the same relative causal patterns. Wright and van der Schaaf (2004) tested the hypothesis using data from the UK rail industry comprising 46 formal inquiry reports, 88 signal passed at danger (SPAD) reports and 106 CIRAS reports, and found that the common cause hypothesis was valid for the rail sector. For model training purposes, the GoSAFE RAIL project will develop safety indicators with reference to Directive 2004/49/EC.
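One way to make the common cause hypothesis concrete is to compare the relative frequency of causal categories across the report sources. The sketch below uses a chi-square test of homogeneity for this purpose; it is a swapped-in illustration, not necessarily the analysis used in the cited study. The cause categories and all counts are invented for illustration, and the function names are hypothetical.

```python
from math import fsum

# Upper 5% critical values of the chi-square distribution, by degrees of freedom.
CHI2_CRITICAL_5PCT = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488}

def chi_square_homogeneity(table):
    """Return (chi-square statistic, degrees of freedom) for a contingency table.

    Rows are report sources and columns are causal categories.
    Assumes every row and every column has a non-zero total.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    terms = []
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected if all sources shared the same causal proportions.
            expected = row_totals[i] * col_totals[j] / grand_total
            terms.append((observed - expected) ** 2 / expected)
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return fsum(terms), dof

def same_causal_pattern(table):
    """True if equal causal proportions are NOT rejected at the 5% level."""
    chi2, dof = chi_square_homogeneity(table)
    return chi2 < CHI2_CRITICAL_5PCT[dof]

# HYPOTHETICAL counts, invented for illustration (not data from the cited study).
# Columns: technical, human, organisational causes (assumed categories).
cause_counts = [
    [30, 50, 20],  # formal inquiry reports
    [55, 95, 50],  # SPAD reports
    [62, 98, 40],  # CIRAS reports
]
chi2, dof = chi_square_homogeneity(cause_counts)
print(f"chi2 = {chi2:.2f}, dof = {dof}, "
      f"same pattern: {same_causal_pattern(cause_counts)}")
```

Failing to reject equal proportions is consistent with the common cause hypothesis but does not prove it, so a result of this kind is best read alongside the qualitative evidence in the reports.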
The failure mechanism for many failure modes is well understood. For example, shallow translational slope failures are becoming a more common problem along many networks. Fourie et al. (1996), Gavin and Xue (2009) and others identify the reduction of soil suction caused by rainfall infiltration as the primary cause of these failures. Martinovic et al. (2016) analysed data from 500 minor slope failures on the Irish Rail network and used machine learning techniques to establish a link between past rainfall and failures. The impact of increased rainfall (for example, due to climate change) can then be assessed by driving the fitted relationship with rainfall projections from climate models. In the GoSAFE RAIL project the researchers will develop these techniques for other failure modes, notably for bridges, tunnels and level crossings.
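The idea of learning a rainfall-failure relationship and then re-evaluating it under higher rainfall can be sketched as follows. This is a minimal illustration that assumes a plain logistic regression fitted by gradient descent, which is not necessarily the technique used in the cited study. The rainfall records, the 20% uplift factor and all function names are hypothetical.

```python
from math import exp

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + exp(-z))
    ez = exp(z)
    return ez / (1.0 + ez)

def fit_logistic(xs, ys, lr=0.1, epochs=5000):
    """Fit P(failure) = sigmoid(w * x + b) by batch gradient descent on log-loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        w -= lr * sum(e * x for e, x in zip(errors, xs)) / n
        b -= lr * sum(errors) / n
    return w, b

def expected_failures(rain_mm, w, b):
    """Sum of predicted failure probabilities over a set of rainfall values."""
    return sum(sigmoid(w * r / 10.0 + b) for r in rain_mm)

# HYPOTHETICAL records: antecedent rainfall (mm) and whether a slope failed (1/0).
rain_mm = [5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60]
failed = [0, 0, 0, 0, 1, 0, 0, 1, 1, 0, 1, 1]

# Rainfall is expressed in units of 10 mm to keep gradient descent well scaled.
w, b = fit_logistic([r / 10.0 for r in rain_mm], failed)

# HYPOTHETICAL climate scenario: the same events with 20% more rainfall.
baseline = expected_failures(rain_mm, w, b)
scenario = expected_failures([1.2 * r for r in rain_mm], w, b)
print(f"w = {w:.3f}, b = {b:.3f}")
print(f"expected failures: baseline {baseline:.2f}, scenario {scenario:.2f}")
```

In practice the rainfall input would come from observed records and climate model projections, and a model of this kind would need validation against failures that were not used for fitting.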
In addition to the lack of failure incidents available for model training (in an AI sense), another challenge for the application of machine learning on rail networks is that there is unlikely to be a direct link between the scale of an event (e.g. a major rainfall event) and its consequences. This is because railway infrastructure managers (IMs) routinely use speed limitations to minimise the risk of, for example, a train derailment. So whilst the scale of an event may be linked to the severity of the resulting incident, it will not be linked in the same way to the consequences, because operators manage this risk by operational means (restricting capacity). For example, the link between rainfall and shallow landslides is clear. When periods of high rainfall are predicted, IMs use colour-coded rainfall thresholds to control line speeds. The hazard is therefore elevated, but the lower speed reduces the likelihood that a train will collide with a landslide, and so the risk is reduced. This is an effective solution from a safety (or risk) perspective but is inefficient in terms of network capacity. The GoSAFE RAIL project will provide solutions by improving the understanding of the hazard and by combining better estimates of the risk of failure with microsimulation of network flows to optimise the trade-off between risk and capacity.
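The trade-off between risk and capacity described above can be illustrated with a small sketch. It maps a rainfall forecast to a colour-coded speed restriction, then compares the train stopping distance (a proxy for the likelihood of striking an obstruction) with the relative running time (a proxy for lost capacity). The thresholds, speed limits, reaction time and deceleration are all hypothetical values chosen for illustration, not values used by any IM.

```python
# HYPOTHETICAL colour-coded thresholds, ordered by rising rainfall:
# (minimum forecast rainfall in mm per 24 h, colour, speed limit in km/h).
RAIN_THRESHOLDS = [
    (0.0, "green", 160),
    (25.0, "amber", 80),
    (50.0, "red", 30),
]

def speed_restriction(forecast_rain_mm):
    """Return (colour, speed limit in km/h) for the highest threshold reached."""
    colour, limit_kmh = RAIN_THRESHOLDS[0][1:]
    for minimum_mm, band_colour, band_limit in RAIN_THRESHOLDS:
        if forecast_rain_mm >= minimum_mm:
            colour, limit_kmh = band_colour, band_limit
    return colour, limit_kmh

def stopping_distance_m(speed_kmh, reaction_s=3.0, decel_ms2=0.7):
    """Reaction distance plus braking distance at constant deceleration.

    The default reaction time and deceleration are assumed values.
    """
    v = speed_kmh / 3.6  # convert km/h to m/s
    return v * reaction_s + v * v / (2.0 * decel_ms2)

def relative_run_time(speed_kmh, line_speed_kmh=160):
    """Running time over a section relative to the unrestricted line speed."""
    return line_speed_kmh / speed_kmh

for rain in (10.0, 30.0, 60.0):
    colour, limit = speed_restriction(rain)
    print(f"{rain:5.1f} mm -> {colour:5s} {limit:3d} km/h, "
          f"stopping distance {stopping_distance_m(limit):7.1f} m, "
          f"run time x{relative_run_time(limit):.2f}")
```

A lower limit shortens the stopping distance, which reduces the chance of a collision, but lengthens the running time over the section. A better hazard estimate would allow restrictions of this kind to be applied only where and when they are needed.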