Data Quality Improvement Method for Power Equipment Condition Based on Stacked Denoising Autoencoders Improved by Particle Swarm Optimization

Bibliographic Details
Authors: JI Rong, HOU Huijuan, SHENG Gehao, ZHANG Lijing, SHU Bo, JIANG Xiuchen
Format: Article
Language: Chinese
Published: Editorial Office of Journal of Shanghai Jiao Tong University 2025-06-01
Series: Shanghai Jiaotong Daxue xuebao
Online Access: https://xuebao.sjtu.edu.cn/article/2025/1006-2467/1006-2467-59-6-780.shtml
Description
Summary: Big data related to power equipment condition is experiencing explosive growth. However, equipment failures and personnel errors produce dirty data, which degrades data quality and the results of subsequent analysis, so data cleaning is of great significance. Most existing research focuses on directly identifying and eliminating abnormal data, which compromises the integrity of the data. To solve this problem, this paper proposes a data cleaning method based on an improved stacked denoising autoencoder. First, particle swarm optimization is used to optimize the hyperparameters of the stacked denoising autoencoder. Then, the autoencoder's ability to extract and restore data features is used to clean the data. The method improves the quality of power equipment condition data by repairing isolated data points and filling in missing data, a simple and efficient way to improve the accuracy and integrity of the data set. Finally, historical operation data of power equipment is taken as an example. The simulation results show that the proposed method outperforms other classical methods and provides good cleaning results for data sets with different degrees of abnormality in different operating states. The proposed method thus offers an effective solution for improving the quality of power equipment condition data.
ISSN: 1006-2467
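
The summary states that particle swarm optimization is used to tune the hyperparameters of the stacked denoising autoencoder, but this record does not give the search space, the swarm settings, or the network itself. The following is only a minimal sketch of a basic global-best particle swarm search, written under assumptions: the function names, the two example hyperparameters (log learning rate and noise ratio), their bounds, and the quadratic `stand_in_loss` are all hypothetical placeholders, not the authors' method. In practice the objective would train the autoencoder with the candidate hyperparameters and return its reconstruction error on held-out data.

```python
import random


def pso_minimize(objective, bounds, n_particles=20, n_iters=60,
                 inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimise `objective` over a box with a basic global-best PSO."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Positions start uniformly inside the bounds, velocities at zero.
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                 # best position per particle
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]  # best position of the swarm

    for _ in range(n_iters):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                # Inertia term + pull toward personal best + pull toward global best.
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Clamp the new position to the search box.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def stand_in_loss(params):
    """Hypothetical placeholder for the autoencoder's validation error."""
    log_lr, noise_ratio = params
    return (log_lr + 3.0) ** 2 + (noise_ratio - 0.2) ** 2


# Assumed search box: log10 learning rate in [-5, -1], noise ratio in [0, 0.5].
best, best_val = pso_minimize(stand_in_loss, bounds=[(-5.0, -1.0), (0.0, 0.5)])
```

Here `best` holds the hyperparameter pair with the lowest placeholder loss found by the swarm. Clamping positions to the bounds is one simple way to keep candidates valid; other boundary-handling rules are equally plausible.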