A Hybrid Sequential Feature Selection Approach for Identifying New Potential mRNA Biomarkers for Usher Syndrome Using Machine Learning

Bibliographic Details
Main Authors: Rama Krishna Thelagathoti, Wesley A. Tom, Dinesh S. Chandel, Chao Jiang, Gary Krzyzanowski, Appolinaire Olou, M. Rohan Fernando
Format: Article
Language: English
Published: MDPI AG 2025-07-01
Series: Biomolecules
Online Access: https://www.mdpi.com/2218-273X/15/7/963
Description
Summary: Usher syndrome, a rare genetic disorder causing both hearing and vision loss, presents significant diagnostic and therapeutic challenges due to its complex genetic basis. The identification of reliable biomarkers for early detection and intervention is crucial for improving patient outcomes. In this study, we present a machine learning-based hybrid sequential feature selection approach to identify key mRNA biomarkers associated with Usher syndrome. Beginning with a dataset of 42,334 mRNA features, our approach successfully reduced dimensionality and identified 58 top mRNA biomarkers that distinguish Usher syndrome from control samples. We employed a combination of feature selection techniques, including variance thresholding, recursive feature elimination, and Lasso regression, integrated within a nested cross-validation framework. The selected biomarkers were further validated using multiple machine learning models, including Logistic Regression, Random Forest, and Support Vector Machines, demonstrating robust classification performance. To assess the biological relevance of the computationally identified mRNA biomarkers, we experimentally validated candidates from the top 10 selected mRNAs using droplet digital PCR (ddPCR). The ddPCR results were consistent with expression patterns observed in the integrated transcriptomic metadata, reinforcing the credibility of our machine learning-driven biomarker discovery framework. Our findings highlight the potential of machine learning-driven biomarker discovery to enhance the detection of Usher syndrome.
ISSN: 2218-273X
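
The summary names three selection stages (variance thresholding, recursive feature elimination, Lasso regression) wrapped in nested cross-validation, but this record does not give the authors' code, thresholds, or stage order. The following is only a minimal sketch of how such a pipeline might be assembled, assuming scikit-learn and a small synthetic dataset in place of the 42,334-feature mRNA data. Every function name, parameter value, and the toy data are illustrative assumptions, not the study's settings.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE, SelectFromModel, VarianceThreshold
from sklearn.linear_model import Lasso, LogisticRegression
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler


def make_toy_data(seed=0):
    """Synthetic stand-in for an expression matrix (hypothetical, not the study data)."""
    X, y = make_classification(
        n_samples=80, n_features=300, n_informative=10,
        n_redundant=0, class_sep=2.0, random_state=seed,
    )
    # Append five constant columns so the variance filter has something to drop.
    X = np.hstack([X, np.zeros((X.shape[0], 5))])
    return X, y


def build_pipeline():
    """Chain the three selection stages in front of a simple classifier."""
    return Pipeline([
        # Stage 1: drop features with zero variance.
        ("variance", VarianceThreshold(threshold=0.0)),
        ("scale", StandardScaler()),
        # Stage 2: recursive feature elimination down to 40 features (assumed value).
        ("rfe", RFE(LogisticRegression(max_iter=1000),
                    n_features_to_select=40, step=0.2)),
        # Stage 3: keep features whose Lasso coefficient is non-zero.
        ("lasso", SelectFromModel(Lasso(alpha=0.01, max_iter=10000))),
        ("clf", LogisticRegression(max_iter=1000)),
    ])


def nested_cv_scores(X, y, seed=0):
    """Inner loop tunes the Lasso penalty; outer loop estimates accuracy."""
    inner = StratifiedKFold(n_splits=3, shuffle=True, random_state=seed)
    outer = StratifiedKFold(n_splits=5, shuffle=True, random_state=seed)
    search = GridSearchCV(
        build_pipeline(),
        param_grid={"lasso__estimator__alpha": [0.005, 0.02]},
        cv=inner, scoring="accuracy",
    )
    # Selection is refit inside every outer training fold, so the outer
    # test samples never influence which features are kept.
    return cross_val_score(search, X, y, cv=outer, scoring="accuracy")


def selected_feature_indices(fitted_pipeline, n_features):
    """Map the surviving features back to original column indices."""
    idx = np.arange(n_features)
    for name in ("variance", "rfe", "lasso"):
        idx = idx[fitted_pipeline.named_steps[name].get_support()]
    return idx


if __name__ == "__main__":
    X, y = make_toy_data()
    print("outer-fold accuracy:", nested_cv_scores(X, y))
    fitted = build_pipeline().fit(X, y)
    print("selected columns:", selected_feature_indices(fitted, X.shape[1]))
```

Placing all selection steps inside the `Pipeline` is the design choice that matters here: with far more features than samples, selecting features on the full dataset before cross-validation would leak label information into the held-out folds and inflate the reported accuracy.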