An explainable Machine Learning model for Large-Scale Travelling Ionospheric Disturbances forecasting
Main Authors: | , , , , , , , , , , , , |
---|---|
Format: | Article |
Language: | English |
Published: | EDP Sciences, 2025-01-01 |
Series: | Journal of Space Weather and Space Climate |
Online Access: | https://www.swsc-journal.org/articles/swsc/full_html/2025/01/swsc240048/swsc240048.html |
Summary: | Large-Scale Travelling Ionospheric Disturbances (LSTIDs) are wave-like ionospheric fluctuations, generally triggered by geomagnetic storms, which play a critical role in space weather dynamics. In this work, we present a machine learning model able to forecast the occurrence of LSTIDs over the European continent up to three hours in advance. The model is based on CatBoost, a gradient boosting framework. It is trained on a human-validated LSTID catalogue together with various physical drivers, including ionogram information and geomagnetic and solar activity indices. The model offers three forecasting modes, suited to scenarios with different relative costs of false positives and false negatives. Because explainable predictions are crucial, the contribution of each physical input to the model output is visualised through the game-theoretic SHapley Additive exPlanations (SHAP) formalism. The validation procedure consists of a global-level evaluation and interpretation step followed by an event-level validation against independent detection methods; it highlights the model’s predictive robustness and suggests its potential for real-time space weather forecasting. Depending on the operating mode, we report an improvement ranging from +72% to +93% over the performance of a rule-based benchmark. Our study concludes with a comprehensive analysis of future research directions and actions to be taken towards full operability. We discuss probabilistic forecasting approaches from a cost-sensitive learning perspective, along with performance-centric model monitoring. Finally, through the lens of the conformal prediction framework, we comment on uncertainty quantification for end-user risk management and mitigation. |
ISSN: | 2115-7251 |
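
The summary mentions three forecasting modes that differ in the relative costs of false positives and false negatives. This record does not say how the article defines those modes, so the following is only a minimal sketch of one common cost-sensitive approach: if a classifier outputs calibrated probabilities, the decision threshold that minimises expected cost is `cost_fp / (cost_fp + cost_fn)`. The mode names and cost ratios in `MODES` are hypothetical and are not taken from the article.

```python
from typing import Dict, Sequence, Tuple


def bayes_threshold(cost_fp: float, cost_fn: float) -> float:
    """Cost-minimising probability threshold, assuming calibrated probabilities.

    Predicting "event" is cheaper in expectation when
    (1 - p) * cost_fp < p * cost_fn, i.e. when p > cost_fp / (cost_fp + cost_fn).
    """
    if cost_fp <= 0 or cost_fn <= 0:
        raise ValueError("costs must be positive")
    return cost_fp / (cost_fp + cost_fn)


def total_cost(
    probs: Sequence[float],
    labels: Sequence[int],
    threshold: float,
    cost_fp: float,
    cost_fn: float,
) -> float:
    """Total misclassification cost on a labelled sample at a given threshold."""
    false_pos = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 0)
    false_neg = sum(1 for p, y in zip(probs, labels) if p < threshold and y == 1)
    return cost_fp * false_pos + cost_fn * false_neg


# Hypothetical modes as (cost_fp, cost_fn) pairs; illustrative values only.
MODES: Dict[str, Tuple[float, float]] = {
    "high_precision": (3.0, 1.0),    # false alarms are expensive
    "balanced": (1.0, 1.0),          # both errors cost the same
    "high_sensitivity": (1.0, 3.0),  # missed events are expensive
}


def thresholds_by_mode(modes: Dict[str, Tuple[float, float]] = MODES) -> Dict[str, float]:
    """Map each mode name to its cost-minimising threshold."""
    return {name: bayes_threshold(c_fp, c_fn) for name, (c_fp, c_fn) in modes.items()}
```

Under these assumed costs, a mode that penalises missed events lowers the threshold and issues more alerts, while a mode that penalises false alarms raises it. In practice the probabilities would come from the trained classifier, and the thresholds could instead be tuned empirically on a validation set.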