Stacking Ensemble Learning and SHAP-Based Insights for Urban Air Quality Forecasting: Evidence from Shenyang and Global Implications


Bibliographic Details
Main Authors: Zhaoxin Xu, Huajian Zhang, Andong Zhai, Chunyu Kong, Jinping Zhang
Format: Article
Language: English
Published: MDPI AG 2025-06-01
Series: Atmosphere
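Subjects: AQI prediction model; Kalman filtering; special engineering; K-fold cross validation; multi model stacking fusion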
Online Access: https://www.mdpi.com/2073-4433/16/7/776
collection DOAJ
description Air pollution poses a significant global challenge, impacting human health and environmental sustainability worldwide. Accurate air quality forecasting is essential for effective mitigation strategies, particularly in rapidly urbanizing regions. This study focuses on Shenyang, China, as a representative case to analyze air quality dynamics and develop a high-precision forecasting tool. Using a comprehensive six-year dataset (2020–2025) of daily air quality and meteorological measurements, a rigorous preprocessing pipeline was applied to ensure data integrity. Five gradient-boosted decision-tree models were trained and combined through a ridge-regularized stacking ensemble to enhance the predictive accuracy. The ensemble achieved an R² of 94.17% and a mean absolute percentage error of 7.79%, outperforming individual models. The feature importance analysis revealed that ozone, PM10, and PM2.5 concentrations are the dominant drivers of daily air quality fluctuations. The resulting forecasting system delivers robust, interpretable predictions across seasonal variations, offering a valuable decision support tool for urban air quality management. This framework demonstrates how advanced machine learning techniques can be applied in a Chinese urban context to inform global air pollution mitigation efforts.
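The ridge-regularized stacking step and the two reported metrics (R² and mean absolute percentage error) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy numbers, and the penalty value `lam` are all assumptions made for illustration, and the two prediction columns stand in for out-of-fold predictions from the five base models, which are not reproduced here. The meta-learner uses the closed-form ridge solution w = (PᵀP + λI)⁻¹Pᵀy without an intercept.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ridge_weights(preds, y, lam):
    """Ridge meta-learner weights w = (P^T P + lam * I)^-1 P^T y (no intercept)."""
    k = len(preds[0])
    gram = [[sum(row[i] * row[j] for row in preds) + (lam if i == j else 0.0)
             for j in range(k)] for i in range(k)]
    rhs = [sum(row[i] * t for row, t in zip(preds, y)) for i in range(k)]
    return solve(gram, rhs)


def r2(y, yhat):
    """Coefficient of determination."""
    mean = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, yhat))
    ss_tot = sum((a - mean) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot


def mape(y, yhat):
    """Mean absolute percentage error, in percent (targets must be non-zero)."""
    return 100.0 * sum(abs((a - b) / a) for a, b in zip(y, yhat)) / len(y)


# Made-up AQI values and made-up out-of-fold predictions from two stand-in base models.
observed = [50.0, 80.0, 120.0, 65.0, 95.0, 150.0]
oof_preds = [[52.0, 47.0], [83.0, 76.0], [118.0, 125.0],
             [70.0, 61.0], [99.0, 90.0], [144.0, 155.0]]

weights = ridge_weights(oof_preds, observed, lam=1.0)  # lam is an assumed value
blended = [sum(w * p for w, p in zip(weights, row)) for row in oof_preds]
print("weights:", weights)
print("R2:", r2(observed, blended), "MAPE %:", mape(observed, blended))
```

In a real stacking setup the meta-learner would be fit on predictions made for held-out folds (as the record's "K-fold cross validation" keyword suggests) so that the weights are not tuned on data the base models have already seen.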
id doaj-art-0f6c1b12a85b45a8b7eeba62fca33a86
institution Matheson Library
issn 2073-4433
spelling doaj-art-0f6c1b12a85b45a8b7eeba62fca33a86
Record timestamp: 2025-07-25T13:13:21Z
DOI: 10.3390/atmos16070776
Volume 16, issue 7, article 776
Author affiliations:
Zhaoxin Xu: School of Mechanical and Power Engineering, Shenyang University of Chemical Technology, Shenyang 110142, China
Huajian Zhang: School of Mechanical and Power Engineering, Shenyang University of Chemical Technology, Shenyang 110142, China
Andong Zhai: Department of Electrical Engineering and Information Technology, Shandong University of Science and Technology, Qingdao 266042, China
Chunyu Kong: School of Mechanical and Power Engineering, Shenyang University of Chemical Technology, Shenyang 110142, China
Jinping Zhang: School of Mechanical and Electrical Engineering, Yunnan Open University, Kunming 650223, China
topic AQI prediction model
Kalman filtering
special engineering
K-fold cross validation
multi model stacking fusion