A Multi-Objective Particle Swarm Optimization Approach for Optimizing K-Means Clustering Centroids
Main Authors: | , , , |
---|---|
Format: | Article |
Language: | English |
Published: | Ikatan Ahli Informatika Indonesia, 2025-06-01 |
Series: | Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi) |
Online Access: | https://jurnal.iaii.or.id/index.php/RESTI/article/view/6533 |
Summary: | The K-Means algorithm is a popular unsupervised learning method for data clustering. However, its performance depends heavily on centroid initialization and on the shape of the data distribution, which makes it less effective for datasets with complex or non-linear cluster structures. This study evaluates the performance of the standard K-Means algorithm and proposes a Multi-Objective Particle Swarm Optimization K-Means (MOPSO+K-Means) approach to improve clustering accuracy. The evaluation was conducted on five benchmark datasets: Atom, Chainlink, EngyTime, Target, and TwoDiamonds. Experimental results show that K-Means is effective only on datasets with clearly separated clusters, such as EngyTime and TwoDiamonds, where it achieved accuracies of 95.6% and 100%, respectively. In contrast, MOPSO+K-Means achieved a substantial accuracy improvement on the complex Target dataset, increasing from 0.26% to 59.2%. The TwoDiamonds dataset achieved the most desirable trade-off: it had the lowest within-cluster sum of squares (SSW, 1323.32), a relatively high between-cluster sum of squares (SSB, 2863.34), and the lowest standard deviation values, indicating compact clusters, good separation, and high consistency across runs. These findings highlight the potential of swarm-based optimization to achieve consistent and accurate clustering results on datasets with varying structural complexity. |
---|---|
ISSN: | 2580-0760 |
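
The summary reports SSW and SSB as its compactness and separation measures. The sketch below shows one common way to compute the two quantities for a hard clustering. It is not taken from the article: the function name `ssw_ssb` and the toy data are hypothetical, and the definition of SSB used here (cluster-size-weighted squared distance from each centroid to the global mean) is an assumption, since the record does not state which formula the authors used.

```python
from math import fsum


def ssw_ssb(points, labels, centroids):
    """Return (SSW, SSB) for a hard clustering.

    points    : list of equal-length coordinate tuples
    labels    : cluster index (0-based) assigned to each point
    centroids : list of centroid coordinate tuples, one per cluster
    """
    n = len(points)
    dim = len(points[0])
    # Global mean of the data, used as the reference point for SSB.
    overall = [fsum(p[d] for p in points) / n for d in range(dim)]

    def sq_dist(a, b):
        # Squared Euclidean distance between two coordinate sequences.
        return fsum((x - y) ** 2 for x, y in zip(a, b))

    # SSW: squared distance from every point to its assigned centroid.
    ssw = fsum(sq_dist(p, centroids[k]) for p, k in zip(points, labels))

    # SSB (assumed definition): size-weighted squared distance from each
    # centroid to the global mean.
    counts = [labels.count(k) for k in range(len(centroids))]
    ssb = fsum(c * sq_dist(mu, overall) for c, mu in zip(counts, centroids))
    return ssw, ssb


# Hypothetical toy data: two well-separated clusters of two points each.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
labels = [0, 0, 1, 1]
centroids = [(0.0, 1.0), (10.0, 1.0)]
ssw, ssb = ssw_ssb(points, labels, centroids)
```

With this definition, and when each centroid is the mean of its cluster, SSW and SSB add up to the total sum of squares around the global mean, so a lower SSW for a fixed dataset goes together with a higher SSB. A multi-objective optimizer would treat the two as separate objectives (minimize SSW, maximize SSB) rather than collapsing them into one score.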