Machine Learning Techniques in Chronic Kidney Diseases: A Comparative Study of Classification Model Performance

Bibliographic Details
Main Authors: Nguyen Dong Phuong, Nguyen Trung Tuyen, Vu Thi Thai Linh, Nghi N Nguyen, Thanh Q Nguyen
Format: Article
Language: English
Published: SAGE Publishing 2025-07-01
Series: Bioinformatics and Biology Insights
Online Access: https://doi.org/10.1177/11779322251356563
Description
Summary: The kidneys are vital organs responsible for filtering and eliminating toxins from the body. Chronic kidney disease (CKD) is becoming increasingly prevalent, affecting not only older adults but also younger populations. To minimize kidney damage in those at risk, accurate assessment and monitoring of CKD are crucial. Machine learning models can assist physicians in this task by providing fast and accurate detection. As a result, many health care systems have adopted machine learning, especially for disease diagnosis. In this study, we developed a system to support the diagnosis of CKD. The data were collected from the UCI Machine Learning Repository, with missing values filled using the “mean/mode” and “random sampling” methods. After data processing, we applied polynomial feature generation to create additional features and help the models generalize better. We then used feature-based stratified splitting with K-means and implemented 6 machine learning algorithms (Random Forest, Support Vector Machine [SVM], Naive Bayes, Logistic Regression, K-Nearest Neighbor [KNN], and XGBoost) to compare their performance based on accuracy. Among them, Random Forest, XGBoost, SVM, and Logistic Regression achieved the highest accuracy of 100%, followed by Naive Bayes (97%) and KNN (93%).
ISSN: 1177-9322
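
Illustration: The preprocessing steps named in the summary (mean/mode imputation, random-sampling imputation, and polynomial feature generation) can be sketched in a few lines of standard-library Python. This is a minimal sketch under stated assumptions, not the authors' implementation: the summary does not say which columns received which imputation method, what polynomial degree was used, or which library was used. The function names, the degree of 2, and the toy values below are all hypothetical.

```python
import random
import statistics
from itertools import combinations_with_replacement


def impute_mean_mode(column):
    """Fill None entries: mean for numeric columns, mode for categorical ones."""
    observed = [v for v in column if v is not None]
    if all(isinstance(v, (int, float)) for v in observed):
        fill = statistics.mean(observed)
    else:
        fill = statistics.mode(observed)
    return [fill if v is None else v for v in column]


def impute_random_sample(column, rng):
    """Fill None entries by drawing at random from the observed values."""
    observed = [v for v in column if v is not None]
    return [rng.choice(observed) if v is None else v for v in column]


def polynomial_features(row, degree=2):
    """Return every monomial of the row's values with total degree 1..degree."""
    features = []
    for d in range(1, degree + 1):
        for combo in combinations_with_replacement(row, d):
            product = 1.0
            for value in combo:
                product *= value
            features.append(product)
    return features


# Toy values invented for illustration only (not taken from the study's data).
age = [48, None, 62, 51]
appetite = ["good", "poor", None, "good"]

print(impute_mean_mode(age))        # missing age replaced by the column mean
print(impute_mean_mode(appetite))   # missing category replaced by the mode
print(impute_random_sample(age, random.Random(0)))  # replaced by an observed value
print(polynomial_features([2.0, 3.0]))  # x1, x2, x1^2, x1*x2, x2^2
```

Mean/mode imputation pulls every filled value toward the center of the column, whereas random sampling preserves the spread of the observed values; this may be one reason to combine the two, although the summary does not state the authors' rationale.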