Hybrid model for cleaning abnormal data of wind turbine power curve based on machine learning approaches

This paper addresses important challenges in wind energy prediction caused by outliers in wind data, which distort the wind turbine power curve and lead to inaccurate performance assessments and suboptimal operation strategies. The major difficulty here is detecting and eliminating these outliers fr...

Bibliographic Details
Main Authors: Abdelwahab Ayash Subuh, S. Hr. Aghay Kaboli, Muhammad Waqar, François Vallée
Format: Article
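Language: English
Published: Elsevier 2025-09-01
Series: e-Prime: Advances in Electrical Engineering, Electronics and Energy
Subjects: Wind turbine power curve; Outliers; Fuzzy c-means clustering; Mahalanobis distance; Artificial neural networks; Support vector regression
Online Access: http://www.sciencedirect.com/science/article/pii/S2772671125001500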
author Abdelwahab Ayash Subuh
S. Hr. Aghay Kaboli
Muhammad Waqar
François Vallée
collection DOAJ
description This paper addresses challenges in wind energy prediction caused by outliers in wind data, which distort the wind turbine power curve and lead to inaccurate performance assessments and suboptimal operation strategies. The main difficulty is detecting and eliminating these outliers from complex wind datasets, since inaccurate data can significantly degrade forecasting and related activities. To overcome this, the paper proposes a hybrid model combining fuzzy C-means clustering, Mahalanobis distance, and artificial neural networks (ANN) that detects and removes outliers more accurately than any of the individual methods or other traditional hybrid methods, reducing both false alarms and misses. The cleaned data improve the reliability of turbine performance analysis, resource assessment, and forecasting, supporting more efficient and sustainable wind-power operations. The results show that: (1) the proposed hybrid model is 15.4 % more accurate than the other traditional hybrid models in detecting and removing outliers; (2) it gives an overall improvement of ≈ 116.1 % in outlier-detection accuracy over the individual models; (3) adding the ANN to the hybrid model yields a relative improvement of about 69.5 % in outlier-detection accuracy; (4) detecting and cleaning outliers with the proposed hybrid model cuts the RMSE from 2.38 to 1.27, a 46.6 % reduction in prediction error; and (5) the advanced hybrid model used for comparison achieves nearly identical accuracy: relative to the proposed model it reduces RMSE by ∼0.015 and MAPE by ∼0.04 pp and raises R² by ∼0.001, while maintaining almost perfect outlier detection (99 % vs. 100 %). Although the advanced model offers a marginal edge in reconstruction quality, the lightweight, scalable proposed hybrid model remains better suited to real-world deployment because of its lower computational overhead and simpler maintenance.
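To make one ingredient of the abstract concrete, the following is a minimal sketch of Mahalanobis-distance outlier flagging on (wind speed, power) pairs. It is not the authors' implementation: it omits the fuzzy C-means and ANN stages entirely, the function names and the 0.99 coverage level are hypothetical, and the threshold assumes roughly Gaussian two-dimensional data. A small helper also reproduces the relative RMSE reduction quoted in the abstract.

```python
import numpy as np


def mahalanobis_sq(points: np.ndarray) -> np.ndarray:
    """Squared Mahalanobis distance of each row from the sample mean."""
    centered = points - points.mean(axis=0)
    cov = np.cov(points, rowvar=False)      # columns are the features
    inv_cov = np.linalg.inv(cov)
    # Row-wise quadratic form: centered[i] @ inv_cov @ centered[i]
    return np.einsum("ij,jk,ik->i", centered, inv_cov, centered)


def flag_outliers(points: np.ndarray, coverage: float = 0.99) -> np.ndarray:
    """Boolean mask of rows beyond the chosen coverage level.

    Assumption: exactly two features (wind speed, power), for which the
    chi-square quantile has the closed form -2 * ln(1 - coverage).
    """
    if points.ndim != 2 or points.shape[1] != 2:
        raise ValueError("expected an (n, 2) array of (wind speed, power)")
    threshold = -2.0 * np.log(1.0 - coverage)
    return mahalanobis_sq(points) > threshold


def relative_reduction(before: float, after: float) -> float:
    """Percentage reduction from `before` to `after`."""
    return 100.0 * (before - after) / before


# RMSE figures quoted in the abstract: 2.38 before cleaning, 1.27 after.
rmse_reduction_pct = relative_reduction(2.38, 1.27)  # about 46.6
```

Calling `flag_outliers` on an `(n, 2)` array returns a mask that can be used to drop the flagged rows before fitting a power curve. Because the mean and covariance here are estimated from data that still contain the outliers, a single global distance is a crude filter, which is one plausible reason a clustering stage is combined with it in hybrid schemes such as the one the abstract describes.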
format Article
id doaj-art-1cb94bfdb655417a8c2f2a5eca75d85c
institution Matheson Library
issn 2772-6711
language English
publishDate 2025-09-01
publisher Elsevier
record_format Article
series e-Prime: Advances in Electrical Engineering, Electronics and Energy
spelling doaj-art-1cb94bfdb655417a8c2f2a5eca75d85c | 2025-06-25T04:52:47Z | eng | Elsevier | e-Prime: Advances in Electrical Engineering, Electronics and Energy | 2772-6711 | 2025-09-01 | 13 | 101043
Abdelwahab Ayash Subuh (0): Power Systems and Markets Research Group, University of Mons, 7000 Mons, Belgium; Corresponding author.
S. Hr. Aghay Kaboli (1): Power Systems and Markets Research Group, University of Mons, 7000 Mons, Belgium
Muhammad Waqar (2): State company of electricity production/Northern region, Iraq
François Vallée (3): Power Systems and Markets Research Group, University of Mons, 7000 Mons, Belgium
title Hybrid model for cleaning abnormal data of wind turbine power curve based on machine learning approaches
topic Wind turbine power curve
Outliers
Fuzzy c-means clustering
Mahalanobis distance
Artificial neural networks
Support vector regression
url http://www.sciencedirect.com/science/article/pii/S2772671125001500