Optimized Two-Stage Anomaly Detection and Recovery in Smart Grid Data Using Enhanced DeBERTa-v3 Verification System

The increasing sophistication of cyberattacks on smart grid infrastructure demands advanced anomaly detection and recovery systems that balance high recall rates with acceptable precision while providing reliable data restoration capabilities. This study presents an optimized two-stage anomaly detec...

Full description

Saved in:
Bibliographic Details
Main Authors: Xiao Liao, Wei Cui, Min Zhang, Aiwu Zhang, Pan Hu
Format: Article
Language:English
Published: MDPI AG 2025-07-01
Series:Sensors
Subjects:
Online Access:https://www.mdpi.com/1424-8220/25/13/4208
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:The increasing sophistication of cyberattacks on smart grid infrastructure demands advanced anomaly detection and recovery systems that balance high recall rates with acceptable precision while providing reliable data restoration capabilities. This study presents an optimized two-stage anomaly detection and recovery system combining an enhanced TimerXL detector with a DeBERTa-v3-based verification and recovery mechanism. The first stage employs an optimized increment-based detection algorithm achieving 95.0% for recall and 54.8% for precision through multidimensional analysis. The second stage leverages a modified DeBERTa-v3 architecture with comprehensive 25-dimensional feature engineering per variable to verify potential anomalies, improving the precision to 95.1% while maintaining 84.1% for recall. Key innovations include (1) a balanced loss function combining focal loss (<inline-formula><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><semantics><mi>α</mi></semantics></math></inline-formula> = 0.65, <inline-formula><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><semantics><mi>γ</mi></semantics></math></inline-formula> = 1.2), Dice loss (weight = 0.5), and contrastive learning (weight = 0.03) to reduce over-rejection by 73.4%; (2) an ensemble verification strategy using multithreshold voting, achieving 91.2% accuracy; (3) optimized sample weighting prioritizing missed positives (weight = 10.0); (4) comprehensive feature extraction, including frequency domain and entropy features; and (5) integration of a generative time series model (TimER) for high-precision recovery of tampered data points. Experimental results on 2000 hourly smart grid measurements demonstrate an F1-score of 0.873 ± 0.114 for detection, representing a 51.4% improvement over ARIMA (0.576), 621% over LSTM-AE (0.121), 791% over standard Anomaly Transformer (0.098), and 904% over TimesNet (0.087). The recovery mechanism achieves remarkably precise restoration with a mean absolute error (MAE) of only 0.0055 kWh, representing a 99.91% improvement compared to traditional ARIMA models and 98.46% compared to standard Anomaly Transformer models. We also explore an alternative implementation using the Lag-LLaMA architecture, which achieves an MAE of 0.2598 kWh. The system maintains real-time capability with a 66.6 ± 7.2 ms inference time, making it suitable for operational deployment. Sensitivity analysis reveals robust performance across anomaly magnitudes (5–100 kWh), with the detection accuracy remaining above 88%.
ISSN:1424-8220