Improvement of Differential Privacy K-means Clustering Algorithm

Bibliographic Details
Main Authors: GUO Rumin, CHEN Xuebin, SHAN Liyang
Format: Article
Language: Chinese
Published: Harbin University of Science and Technology Publications 2024-08-01
Series: Journal of Harbin University of Science and Technology
Online Access: https://hlgxb.hrbust.edu.cn/#/digest?ArticleID=2345
Description
Summary: To address the poor clustering performance caused by arbitrary initial center selection and unreasonable privacy budget allocation in the differential privacy K-means clustering algorithm, a new center selection scheme is designed based on two principles for choosing initial centers. The minimum privacy budget required for each iteration is calculated from the mean square error between the centroids of the original K-means algorithm and those of the differential privacy K-means algorithm, and a new privacy budget allocation scheme is established by combining this calculation with binary search. Comparative experiments on three datasets with different features are conducted to evaluate the improved algorithm. The improved algorithm achieves a 14% increase in F-measure, reducing the impact of the added noise on clustering performance while preserving the usability of the clustering results.
ISSN: 1007-2683
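
Illustration: the summary describes finding the minimum per-iteration privacy budget from a centroid mean square error and a binary search. The Python sketch below shows one way that idea could look; it is not the authors' scheme. Several things are assumptions made only for illustration: points are scaled to [0, 1]^d, Laplace noise is added to each cluster's coordinate sums and count (the common DPLloyd-style approach), the error model in `approx_centroid_mse` ignores the noise on the count, and all function names are hypothetical. The paper's two principles for initial center selection are not stated in this record and are not reproduced here.

```python
import random


def laplace_noise(scale, rng):
    """Draw one Laplace(0, scale) sample as the difference of two exponentials."""
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))


def noisy_centroid(points, epsilon, rng):
    """Noisy mean of one cluster: Laplace noise on the coordinate sums and the count.

    Assumes every point lies in [0, 1]^d, so adding or removing one point
    changes (sum, count) by at most d + 1 in L1 norm.
    """
    d = len(points[0])
    scale = (d + 1) / epsilon
    count = max(len(points) + laplace_noise(scale, rng), 1.0)  # avoid a non-positive count
    sums = [sum(p[j] for p in points) + laplace_noise(scale, rng) for j in range(d)]
    return [s / count for s in sums]


def approx_centroid_mse(epsilon, n, d):
    """Assumed error model: Laplace variance 2*b^2 per coordinate sum, divided by n^2.

    The noise on the count is ignored to keep the model simple.
    """
    scale = (d + 1) / epsilon
    return d * 2.0 * scale ** 2 / n ** 2


def min_epsilon(target_mse, n, d, lo=1e-6, hi=100.0, tol=1e-6):
    """Binary search for the smallest epsilon whose modelled MSE is within target_mse."""
    if approx_centroid_mse(hi, n, d) > target_mse:
        raise ValueError("target_mse is not reachable within the search range")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if approx_centroid_mse(mid, n, d) <= target_mse:
            hi = mid  # mid is enough budget; try a smaller one
        else:
            lo = mid  # mid is too little budget
    return hi


if __name__ == "__main__":
    rng = random.Random(0)
    cluster = [[rng.random(), rng.random()] for _ in range(500)]
    eps = min_epsilon(target_mse=1e-4, n=len(cluster), d=2)
    print("epsilon for this iteration:", round(eps, 4))
    print("noisy centroid:", noisy_centroid(cluster, eps, rng))
```

Under this simplified error model the minimum budget also has a closed form, so the binary search is only needed when the error is measured empirically or has no convenient inverse, which is the more likely setting for the scheme the summary describes.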