Outlier detection method based on K-means
Main Authors: | , , , |
---|---|
Format: | Article |
Language: | Chinese |
Published: | National Computer System Engineering Research Institute of China, 2025-05-01 |
Series: | Dianzi Jishu Yingyong |
Subjects: | |
Online Access: | http://www.chinaaet.com/article/3000171650 |
Summary: | In industry, electric power, transportation, and other fields, anomalies are often precursors of system problems or failures. Anomaly identification techniques can detect abnormal system behavior in time to prevent or quickly respond to potential failures, improving system reliability and stability. Current anomaly identification algorithms usually require expert information (e.g., suitable parameter values), but in many identification scenarios the data distribution and the causes of anomalies are unknown, which makes expert information unreliable. It is therefore important to design an anomaly identification algorithm that does not depend on expert information. This paper proposes an adaptive anomaly identification algorithm. Specifically, it uses K-means to partition the data into numerous small clusters, then computes the probability distribution of the number of objects per cluster and plots it as a probability distribution graph. The graph shows clearly which clusters contain significantly fewer objects than the others; these are identified as anomalous clusters, and the objects in them are identified as anomalies. In other words, the probability distribution graph replaces expert information and helps the user identify valid anomalies when the data distribution and the causes of anomalies are unknown. |
---|---|
ISSN: | 0258-7998 |
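
The procedure outlined in the summary (many small K-means clusters, then a distribution over cluster sizes) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the plain Lloyd's K-means, the number of clusters `k`, and the `max_size` cutoff are all assumptions made for illustration. In the paper the user reads the cutoff off the probability distribution graph; here it is simply passed in as a hypothetical parameter.

```python
import random
from collections import Counter


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's K-means (illustrative); returns one cluster label per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # initial centers drawn from the data
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center by squared Euclidean distance.
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Update step: move each non-empty cluster's center to its mean.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels


def cluster_size_distribution(labels):
    """Return (cluster id -> size, size -> fraction of clusters with that size)."""
    sizes = Counter(labels)
    size_counts = Counter(sizes.values())
    total = len(sizes)
    return sizes, {s: n / total for s, n in sorted(size_counts.items())}


def anomalies_from_small_clusters(labels, max_size):
    """Indices of objects in clusters with at most max_size members.

    max_size is a hypothetical cutoff that the user would choose after
    inspecting the cluster-size distribution.
    """
    sizes = Counter(labels)
    return [i for i, lab in enumerate(labels) if sizes[lab] <= max_size]


if __name__ == "__main__":
    rng = random.Random(1)
    # Synthetic data (assumption): one dense group plus three distant points.
    data = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(200)]
    data += [(15.0, 15.0), (-14.0, 16.0), (17.0, -15.0)]
    labels = kmeans(data, k=20)
    sizes, dist = cluster_size_distribution(labels)
    print("cluster size -> share of clusters:", dist)
    print("flagged indices:", anomalies_from_small_clusters(labels, max_size=1))
```

The size distribution returned by `cluster_size_distribution` plays the role of the probability distribution graph: sizes that sit far below the bulk of the distribution suggest a value for `max_size`.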