A Novel Approach Utilizing Bagging, Histogram Gradient Boosting, and Advanced Feature Selection for Predicting the Onset of Cardiovascular Diseases

Bibliographic Details
Main Authors: Norma Latif Fitriyani, Muhammad Syafrudin, Nur Chamidah, Marisa Rifada, Hendri Susilo, Dursun Aydin, Syifa Latif Qolbiyani, Seung Won Lee
Format: Article
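Language: English
Published: MDPI AG 2025-07-01
Series: Mathematics
Subjects: machine learning, bagging algorithm, histogram gradient boosting, local outlier factor, information gain
Online Access: https://www.mdpi.com/2227-7390/13/13/2194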
author Norma Latif Fitriyani
Muhammad Syafrudin
Nur Chamidah
Marisa Rifada
Hendri Susilo
Dursun Aydin
Syifa Latif Qolbiyani
Seung Won Lee
collection DOAJ
description Cardiovascular diseases (CVDs) rank among the leading global causes of mortality, underscoring the need for early detection and effective management. This research presents a prediction model for CVDs that uses a bagging algorithm with histogram gradient boosting as the base estimator. The study uses three cardiovascular datasets, preprocessed with the Local Outlier Factor technique for outlier removal and the information gain method for feature selection. In the reported experiments, the proposed model outperforms conventional machine learning approaches, including Logistic Regression, Support Vector Classification, Gaussian Naïve Bayes, Multi-Layer Perceptron, k-nearest neighbors, Random Forest, AdaBoost, gradient boosting, and histogram gradient boosting. Its precision, recall, F1 score, accuracy, and AUC were 93.90%, 98.83%, 96.30%, 96.25%, and 0.9916 for dataset I; 94.17%, 99.05%, 96.54%, 96.48%, and 0.9931 for dataset II; and 89.81%, 82.40%, 85.91%, 86.66%, and 0.9274 for dataset III. The findings indicate that the proposed prediction model has the potential to facilitate early CVD detection, thereby enhancing preventive strategies and improving patient outcomes.
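The description names information gain as the feature selection method but does not show how it is computed. The following is a minimal Python sketch of the textbook definition (label entropy minus the entropy remaining after splitting on a feature), not the authors' implementation. The feature names `chest_pain` and `smoker`, the toy records, and the helper names are hypothetical; continuous features would need to be binned first, which this sketch does not do.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """Entropy of the labels minus the entropy left after splitting on the feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def rank_features(rows, labels):
    """Return (feature name, information gain) pairs, highest gain first."""
    gains = {
        name: information_gain([row[name] for row in rows], labels)
        for name in rows[0]
    }
    return sorted(gains.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy records; not taken from the study's datasets.
rows = [
    {"chest_pain": "yes", "smoker": "yes"},
    {"chest_pain": "yes", "smoker": "no"},
    {"chest_pain": "no", "smoker": "yes"},
    {"chest_pain": "no", "smoker": "no"},
]
labels = [1, 1, 0, 0]  # 1 = CVD present, 0 = absent (illustrative only)

ranking = rank_features(rows, labels)
print(ranking)
```

In this toy data `chest_pain` separates the two classes perfectly and `smoker` carries no information about the label, so a selection step that keeps the top-ranked features would retain `chest_pain` and drop `smoker`.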
format Article
id doaj-art-d5e10fec2e2c4291b1dec0f42bfd9b7c
institution Matheson Library
issn 2227-7390
language English
publishDate 2025-07-01
publisher MDPI AG
record_format Article
series Mathematics
spelling doaj-art-d5e10fec2e2c4291b1dec0f42bfd9b7c
2025-07-11T14:40:42Z
eng
MDPI AG
Mathematics
2227-7390
2025-07-01
Volume 13, Issue 13, Article 2194
doi:10.3390/math13132194
A Novel Approach Utilizing Bagging, Histogram Gradient Boosting, and Advanced Feature Selection for Predicting the Onset of Cardiovascular Diseases
Norma Latif Fitriyani, Department of Artificial Intelligence and Data Science, Sejong University, Seoul 05006, Republic of Korea
Muhammad Syafrudin, Department of Artificial Intelligence and Data Science, Sejong University, Seoul 05006, Republic of Korea
Nur Chamidah, Department of Mathematics, Faculty of Science and Technology, Airlangga University, Surabaya 60115, Indonesia
Marisa Rifada, Department of Mathematics, Faculty of Science and Technology, Airlangga University, Surabaya 60115, Indonesia
Hendri Susilo, Department of Cardiology and Vascular Medicine, Faculty of Medicine, Airlangga University, Surabaya 60286, Indonesia
Dursun Aydin, Department of Statistics, Faculty of Science, Muğla Sıtkı Koçman University, Muğla 48000, Turkey
Syifa Latif Qolbiyani, Department of Community Development, Universitas Sebelas Maret, Surakarta 57126, Indonesia
Seung Won Lee, Department of Precision Medicine, Sungkyunkwan University School of Medicine, Suwon 16419, Republic of Korea
title A Novel Approach Utilizing Bagging, Histogram Gradient Boosting, and Advanced Feature Selection for Predicting the Onset of Cardiovascular Diseases
topic machine learning
bagging algorithm
histogram gradient boosting
local outlier factor
information gain
url https://www.mdpi.com/2227-7390/13/13/2194