Worldwide estimation of monthly global and diffuse horizontal irradiation via machine learning
Main Authors: | , , , , , , |
---|---|
Format: | Article |
Language: | English |
Published: | Elsevier 2025-07-01 |
Series: | Energy Conversion and Management: X |
Subjects: | |
Online Access: | http://www.sciencedirect.com/science/article/pii/S2590174525003010 |
Summary: | Accurate prediction of solar irradiation is critical for photovoltaic system design, energy forecasting, and planning. This study evaluates the performance of seven machine learning models in predicting Global Horizontal Irradiation (GHI) and Diffuse Horizontal Irradiation (DHI) using a monthly-resolution dataset sourced from the Photovoltaic Geographical Information System (PVGIS), covering 721 globally distributed locations. Five input features were used per task, drawn from a six-feature dataset. The Extreme Gradient Boosting (XGB) model achieved the highest overall performance, with test-set coefficient of determination (R²) values of 96.88% for GHI and 97.51% for DHI. Seasonal analysis showed the highest accuracy between April and June, with larger errors during the first and last quarters of the year. A full feature combination analysis evaluated all 31 possible input subsets for each prediction task. Results confirmed that including all features produced the best performance, but also revealed that the most influential inputs depend on the prediction target. For DHI prediction, GHI was more important than temperature, while for GHI, excluding either had minimal impact. Latitude and the month number consistently appeared in top-performing combinations, highlighting the importance of spatial and seasonal inputs. Satellite-based validation across three cities showed that model accuracy was highly location-dependent and demonstrated the value of evaluating multiple performance metrics. In ground-based validation using in-situ measurements from Amman, Jordan, model rankings shifted, with the Random Forest model achieving the highest accuracy (95.09% R²) despite limited inputs. These findings support global machine learning models while emphasizing the need for regional assessment. |
---|---|
ISSN: | 2590-1745 |
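
The figure of 31 input subsets in the summary is consistent with counting every non-empty subset of five features (2^5 - 1 = 31). The following is a minimal sketch of that count only, not the authors' procedure. The names `latitude`, `month`, and `temperature` come from the summary; `feature_4` and `feature_5` are hypothetical placeholders, since this record does not list the full feature set.

```python
from itertools import combinations

# Placeholder feature list: the last two names are hypothetical,
# because the record does not enumerate all inputs.
features = ["latitude", "month", "temperature", "feature_4", "feature_5"]


def nonempty_subsets(items):
    """Return every non-empty subset of items, smallest subsets first."""
    return [
        subset
        for size in range(1, len(items) + 1)
        for subset in combinations(items, size)
    ]


subsets = nonempty_subsets(features)
print(len(subsets))  # 2**5 - 1 = 31
```

In a full feature combination analysis, each of these subsets would be used to train and score a model, so the number of runs per prediction task grows exponentially with the number of candidate features.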