Basrah Score: a novel machine learning-based score for differentiating iron deficiency anemia and beta thalassemia trait using RBC indices

Bibliographic Details
Main Authors: Salma A. Mahmood, Asaad A. Khalaf, Saad S. Hamadi
Format: Article
Language: English
Published: Frontiers Media S.A. 2025-08-01
Series: Frontiers in Big Data
Online Access: https://www.frontiersin.org/articles/10.3389/fdata.2025.1634133/full
Description
Summary: Iron deficiency anemia (IDA) and beta-thalassemia trait (BTT) are prevalent causes of microcytic anemia, often presenting overlapping hematological features that pose diagnostic challenges and necessitate prompt and precise management. Traditional discrimination indices (such as the Mentzer Index, Ihsan's formula, and the England and Fraser criteria) have been extensively applied in both research and clinical settings; however, their diagnostic performance varies considerably across different populations and datasets. This study proposes a novel and interpretable diagnostic model, the Basrah Score, developed using Elastic Net Logistic Regression (ENLR). This machine learning-based approach yields a flexible discrimination function that adapts to variations in clinical and environmental factors. The model was trained and validated on a local dataset of 2,120 individuals (1,080 with IDA and 1,040 with BTT), and was benchmarked against eight conventional indices. The Basrah Score demonstrated superior diagnostic performance, with an accuracy of 96.7%, a sensitivity of 95.0%, and a specificity of 98.6%. These results underscore the importance of incorporating advanced pre-processing techniques, class balancing, hyperparameter optimization, and rigorous cross-validation to ensure the robustness of diagnostic models. Overall, this research highlights the potential of integrating interpretable machine learning models with established clinical parameters to improve diagnostic accuracy in hematological disorders, particularly in resource-constrained settings.
ISSN: 2624-909X
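
Illustration: the summary benchmarks the Basrah Score against conventional indices and reports accuracy, sensitivity, and specificity. The sketch below is not the Basrah Score or the authors' code; it only shows how one conventional index, the Mentzer Index (MCV divided by RBC count, with a conventional cutoff of 13), can be turned into a classifier and scored with those three metrics. The function names, the choice of BTT as the positive class, the handling of values exactly at the cutoff, and the six sample values are all assumptions made up for illustration, not data from the study.

```python
MENTZER_CUTOFF = 13.0  # conventional threshold: below suggests BTT, otherwise IDA


def mentzer_index(mcv_fl, rbc_millions_per_ul):
    """Mentzer Index = MCV (fL) / RBC count (millions per microliter)."""
    if rbc_millions_per_ul <= 0:
        raise ValueError("RBC count must be positive")
    return mcv_fl / rbc_millions_per_ul


def classify_mentzer(mcv_fl, rbc_millions_per_ul):
    """Return 'BTT' if the index is below the cutoff, else 'IDA'.

    Assumption: values exactly at the cutoff are labeled 'IDA'.
    """
    index = mentzer_index(mcv_fl, rbc_millions_per_ul)
    return "BTT" if index < MENTZER_CUTOFF else "IDA"


def diagnostic_metrics(y_true, y_pred, positive="BTT"):
    """Accuracy, sensitivity, and specificity for a two-class problem.

    Assumption: BTT is treated as the positive class; the record does not
    say which class the study treated as positive.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }


# Hypothetical samples (MCV in fL, RBC in millions/uL, reference label).
# These values are invented for illustration only.
SAMPLES = [
    (62.0, 5.8, "BTT"),
    (70.0, 4.1, "IDA"),
    (65.0, 5.5, "BTT"),
    (72.0, 4.0, "IDA"),
    (68.0, 5.0, "IDA"),
    (66.0, 4.9, "BTT"),  # index is about 13.5, so the cutoff mislabels it
]

if __name__ == "__main__":
    truth = [label for _, _, label in SAMPLES]
    predicted = [classify_mentzer(mcv, rbc) for mcv, rbc, _ in SAMPLES]
    for name, value in diagnostic_metrics(truth, predicted).items():
        print(f"{name}: {value:.3f}")
```

A fixed cutoff like this has no parameters to adapt to a given population, which is the limitation the summary attributes to conventional indices. A fitted model such as ENLR can be evaluated with the same `diagnostic_metrics` helper by passing its predictions as `y_pred`.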