The student David González Jiménez obtained an EXCELLENT grade





  • Thesis title: A Hybrid Methodology for Fault Detection and Diagnosis in Railway Traction Systems: Integrating Data-driven and Physics-based models


  • President: Jose Alfonso Antonino Daviu (Universitat Politècnica de València)
  • Member: Txomin Nieva Fatela (CAF POWER & AUTOMATION)
  • Member: Daniel Moríñigo Sotelo (Universidad de Valladolid)
  • Member: Oliver Wallscheid (Universität Paderborn)
  • Secretary: Fernando Garramiola Alday (Mondragon Unibertsitatea)


This PhD thesis analyzes physics-based and field-data models to design fault detection and diagnosis (FDD) strategies for railway traction systems. In general terms, it addresses a significant challenge in industrial AI: the limited quality and availability of field data, mainly because data covering the different failure modes of traction equipment are absent. The research explores the opportunities and challenges of hybridizing synthetic data generated with simulation tools and field data from railway applications, with the aim of improving the accuracy and effectiveness of data-driven FDD strategies.

The state-of-the-art review shows that current research has adopted hybrid strategies combining data-driven and physics-based models as an alternative to overcome the lack of real field data from industrial equipment. However, in most of these approaches the simulation models are built ad hoc for synthetic data creation. Furthermore, these studies lack a standardized workflow and do not specifically address the railway sector. Therefore, the main scientific contribution of this work is the design of a standardized methodology that integrates simulation platforms previously validated by companies with real field data when available. In summary, the methodology proposes guidelines for reusing and extending the physics-based simulation platforms employed throughout the life cycle of railway traction equipment, in order to address data scarcity and design health management strategies for company assets.

This new methodology, called SI-CRISP-DM and derived from the CRISP-DM standard, is implemented in three use cases defined by the industrial partner collaborating in this thesis. The first use case examines the applicability of these guidelines in a data mining project in the railway sector. Specifically, a strategy is designed to detect and diagnose low-frequency oscillation events in traction equipment powered by AC catenaries. Because quality data for detecting these anomalous events with traditional techniques are lacking, a data-driven approach is proposed in which real field data are used to train supervised machine learning classifiers.

The second and third use cases demonstrate the potential of the SI-CRISP-DM methodology by reusing different physics-based models to generate synthetic data and hybridize it with real field data when available. In the second use case, a machine learning strategy is developed to detect thermal anomalies in the traction inverter, with synthetic data generated by a modified version of a simulation model called HDET. The industrial partner had previously used this tool in the design stages of the traction equipment. Similarly, in the third use case, a Hardware-in-the-Loop platform serves as the source of synthetic data, which allows commercial controllers to be used and the traction equipment to be emulated more faithfully. This helps ensure the quality of the data used to train the machine learning algorithms.
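
As a loose illustration of the hybridization idea described above, the following Python sketch merges a few labeled field samples (healthy operation only, mirroring the absence of failure-mode data) with synthetic samples that include a fault class, and then fits a simple nearest-centroid classifier. Everything in it is hypothetical: the function names, the two features, the numeric values and the choice of classifier are assumptions made for illustration, not the method or data used in the thesis.

```python
from statistics import mean


def hybridize(field, synthetic):
    """Merge field and synthetic samples, tagging each with its origin."""
    return ([(x, y, "field") for x, y in field] +
            [(x, y, "synthetic") for x, y in synthetic])


def fit_centroids(dataset):
    """Compute one centroid (per-feature mean) for each class label."""
    by_label = {}
    for features, label, _origin in dataset:
        by_label.setdefault(label, []).append(features)
    return {label: tuple(mean(col) for col in zip(*rows))
            for label, rows in by_label.items()}


def predict(centroids, features):
    """Return the label whose centroid is closest (squared Euclidean distance)."""
    return min(
        centroids,
        key=lambda lbl: sum((a - b) ** 2 for a, b in zip(centroids[lbl], features)),
    )


# Hypothetical features: (heatsink temperature in degC, phase current in A).
# Field data contains healthy operation only; faults come from simulation.
field = [((55.0, 300.0), "healthy"), ((58.0, 320.0), "healthy")]
synthetic = [((85.0, 310.0), "thermal_fault"),
             ((90.0, 330.0), "thermal_fault"),
             ((57.0, 310.0), "healthy")]

dataset = hybridize(field, synthetic)
centroids = fit_centroids(dataset)
```

Calling `predict(centroids, (88.0, 315.0))` on this toy data selects the fault class, which the field samples alone could not have represented. A real pipeline would also need feature scaling and a check that the synthetic and field distributions are compatible.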