Secure K-Means Clustering Scheme for Confidential Data Based on Paillier Cryptosystem
In this paper, we propose a secure homomorphic K-means clustering protocol based on the Paillier cryptosystem to address the urgent need for privacy-preserving clustering techniques in sensitive domains such as healthcare and finance. The protocol uses the additive homomorphism property of the Paill...
Saved in:
Main Authors: | , , |
---|---|
Format: | Article |
Language: | English |
Published: |
MDPI AG
2025-06-01
|
Series: | Applied Sciences |
Subjects: | |
Online Access: | https://www.mdpi.com/2076-3417/15/12/6918 |
Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
Summary: | In this paper, we propose a secure homomorphic K-means clustering protocol based on the Paillier cryptosystem to address the urgent need for privacy-preserving clustering techniques in sensitive domains such as healthcare and finance. The protocol uses the additive homomorphism property of the Paillier cryptosystem to perform K-means clustering on the encrypted data, which ensures the confidentiality of the data during the whole calculation process. The protocol consists of three main components: secure computation distance (SCD) protocol, secure cluster assignment (SCA) protocol and secure cluster center update (SUCC) protocol. The SCD protocol securely computes the squared Euclidean distance between the encrypted data point and the encrypted cluster center. The SCA protocol securely assigns data points to clusters based on these cryptographic distances. Finally, the SUCC protocol securely updates the cluster centers without leaking the actual data points as well as the number of intermediate sums. Through security analysis and experimental verification, the effectiveness and practicability of the protocol are proved. This work provides a practical solution for secure clustering based on homomorphic encryption and contributes to the research in the field of privacy-preserving data mining. Although this protocol solves the key problems of secure distance computation, cluster assignment and centroid update, there are still areas for further research. These include optimizing the computational efficiency of the protocol, exploring other homomorphic encryption schemes that may provide better performance, and extending the protocol to handle more complex clustering algorithms. |
---|---|
ISSN: | 2076-3417 |