Auxiliary Diagnosis of Pulmonary Nodules’ Benignancy and Malignancy Based on Machine Learning: A Retrospective Study

Bibliographic Details
Main Authors: Wang W, Yang B, Wu H, Che H, Tong Y, Zhang B, Liu H, Chen Y
Format: Article
Language: English
Published: Dove Medical Press 2025-06-01
Series: Journal of Multidisciplinary Healthcare
Online Access: https://www.dovepress.com/auxiliary-diagnosis-of-pulmonary-nodules-benignancy-and-malignancy-bas-peer-reviewed-fulltext-article-JMDH
Description
Summary:
Wanling Wang,1 Bingqing Yang,2 Huan Wu,1 Hebin Che,1 Yue Tong,1 Bozun Zhang,1 Hongwu Liu,3,* Yuanyuan Chen,1,*
1Medical Innovation Research Department of PLA General Hospital, Beijing, People’s Republic of China; 2Goodwill Hessian Health Technology Co. Ltd, Beijing, People’s Republic of China; 3Department of Pulmonary and Critical Care Medicine, the Seventh Medical Center of Chinese PLA General Hospital, Beijing, People’s Republic of China
*These authors contributed equally to this work
Correspondence: Yuanyuan Chen, Email charry135@163.com; Hongwu Liu, Email liuhw1005@163.com

Background: Lung cancer, one of the most lethal malignancies globally, often presents insidiously as pulmonary nodules. Its nonspecific clinical presentation and heterogeneous imaging characteristics hinder accurate differentiation between benign and malignant lesions, while the invasiveness and procedural constraints of biopsy underscore the critical need for non-invasive early diagnostic approaches.

Methods: In this retrospective study, we analyzed outpatient and inpatient records from the First Medical Center of Chinese PLA General Hospital between 2011 and 2021, focusing on pulmonary nodules measuring 5–30 mm on CT scans without overt signs of malignancy. Pathological examination served as the reference standard. Comparative experiments evaluated SVM, RF, XGBoost, FNN, and Atten_FNN using five-fold cross-validation to assess AUC, sensitivity, and specificity. The dataset was split 70%/30%, and stratified five-fold cross-validation was applied to the training set. The optimal model was interpreted with SHAP to identify the most influential predictive features.

Results: This study enrolled 3355 patients, including 1156 with benign and 2199 with malignant pulmonary nodules. The Atten_FNN model demonstrated superior performance in five-fold cross-validation, achieving an AUC of 0.82, accuracy of 0.75, sensitivity of 0.77, and F1 score of 0.80. SHAP analysis revealed key predictive factors: demographic variables (age, sex, BMI), CT-derived features (maximum nodule diameter, morphology, density, calcification, ground-glass opacity), and laboratory biomarkers (neuroendocrine markers, carcinoembryonic antigen).

Conclusion: This study integrates electronic medical records and pathology data to predict pulmonary nodule malignancy using machine learning and deep learning models. SHAP-based interpretability analysis uncovered key clinical determinants. Acknowledging limitations in cross-center generalizability, we propose the development of a multimodal diagnostic system that combines CT imaging and radiomics, to be validated in multi-center prospective cohorts to facilitate clinical translation. This framework establishes a novel paradigm for early precision diagnosis of lung cancer.

Keywords: pulmonary nodules, benignancy, malignancy, machine learning, risk factors
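
The evaluation protocol described in the Methods (a 70%/30% split followed by stratified five-fold cross-validation on the training portion) can be illustrated with a minimal, dependency-free Python sketch. This is not the authors' code: the function names `stratified_split`, `stratified_folds`, and `binary_metrics` are hypothetical, the labels are synthetic placeholders that only reproduce the reported class counts (1156 benign, 2199 malignant), and no model is trained here.

```python
import random
from collections import defaultdict

def stratified_split(labels, test_frac=0.3, seed=0):
    """Split indices into train/test, sampling each class separately
    so both parts keep roughly the original class ratio."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    train, test = [], []
    for idx in by_class.values():
        rng.shuffle(idx)
        n_test = round(len(idx) * test_frac)
        test.extend(idx[:n_test])
        train.extend(idx[n_test:])
    return sorted(train), sorted(test)

def stratified_folds(indices, labels, k=5, seed=0):
    """Deal each class's indices round-robin into k folds."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i in indices:
        by_class[labels[i]].append(i)
    folds = [[] for _ in range(k)]
    for idx in by_class.values():
        rng.shuffle(idx)
        for j, i in enumerate(idx):
            folds[j % k].append(i)
    return folds

def binary_metrics(y_true, y_pred):
    """Accuracy, sensitivity, specificity, and F1 with malignant = 1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs) if pairs else 0.0,
        "sensitivity": tp / (tp + fn) if tp + fn else 0.0,
        "specificity": tn / (tn + fp) if tn + fp else 0.0,
        "f1": 2 * tp / (2 * tp + fp + fn) if tp + fp + fn else 0.0,
    }

# Synthetic labels matching only the reported class counts (assumption:
# 0 = benign, 1 = malignant; real features and outcomes are not available).
labels = [0] * 1156 + [1] * 2199
train_idx, test_idx = stratified_split(labels, test_frac=0.3)
folds = stratified_folds(train_idx, labels, k=5)

# In each round, one fold is held out for validation and the other four
# are used for fitting; the held-out 30% stays untouched until the end.
for k, val in enumerate(folds):
    share = sum(labels[i] for i in val) / len(val)
    print(f"fold {k}: n={len(val)}, malignant share={share:.3f}")
```

Stratification matters here because roughly two thirds of the cohort is malignant; without it, individual folds could drift from that ratio and make per-fold sensitivity and specificity harder to compare.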
ISSN: 1178-2390