Leveraging Large Language Models for Clinical Trial Eligibility Criteria Classification

The advent of transformer technology and large language models (LLMs) has further broadened the already extensive application field of artificial intelligence (AI). A large portion of medical records is stored in text format, such as clinical trial texts. Part of these texts is information regarding...

Bibliographic Details
Main Authors: Sujan Ray, Arpita Nath Sarker, Neelakshi Chatterjee, Kowshik Bhowmik, Sayantan Dey
Format: Article
Language: English
Published: MDPI AG 2025-04-01
Series: Digital
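Subjects: artificial intelligence; clinical trial; deep neural networks; fine-tuning; large language models; machine learning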
Online Access: https://www.mdpi.com/2673-6470/5/2/12
_version_ 1839654268853813248
author Sujan Ray
Arpita Nath Sarker
Neelakshi Chatterjee
Kowshik Bhowmik
Sayantan Dey
collection DOAJ
description The advent of transformer technology and large language models (LLMs) has further broadened the already extensive application field of artificial intelligence (AI). A large portion of medical records is stored in text format, such as clinical trial texts. Part of these texts is information regarding eligibility criteria. We aimed to harness the immense capabilities of an LLM by fine-tuning an open-source LLM (Llama-2) to develop a classifier from the clinical trial data. We were interested in investigating whether a fine-tuned LLM could better decide the eligibility criteria from the clinical trial text and compare the results with a more traditional method. Such an investigation can help us determine the extent to which we can rely on text-based applications developed from large language models and possibly open new avenues of application in the medical domain. Our results are comparable to the best-performing methods for this task. Since we used state-of-the-art technology, this research has the potential to open new avenues in the field of LLM application in the healthcare sector.
format Article
id doaj-art-dbc0224bd8c44c3a9c0f7b9dd8535288
institution Matheson Library
issn 2673-6470
language English
publishDate 2025-04-01
publisher MDPI AG
record_format Article
series Digital
spelling doaj-art-dbc0224bd8c44c3a9c0f7b9dd8535288; 2025-06-25T13:42:38Z; eng; MDPI AG; Digital; 2673-6470; 2025-04-01; 5; 2; 12; 10.3390/digital5020012
Leveraging Large Language Models for Clinical Trial Eligibility Criteria Classification
Sujan Ray (0); Arpita Nath Sarker (1); Neelakshi Chatterjee (2); Kowshik Bhowmik (3); Sayantan Dey (4)
Computer Science and Engineering (EECS), University of Cincinnati, Cincinnati, OH 45221, USA
Department of Biology, University of West Georgia, Carrollton, GA 30118, USA
Department of Biostatistics, Health Informatics and Data Science, University of Cincinnati, Cincinnati, OH 45221, USA
Computer Science and Engineering (EECS), University of Cincinnati, Cincinnati, OH 45221, USA
Computer Science and Engineering (EECS), University of Cincinnati, Cincinnati, OH 45221, USA
https://www.mdpi.com/2673-6470/5/2/12
artificial intelligence; clinical trial; deep neural networks; fine-tuning; large language models; machine learning
title Leveraging Large Language Models for Clinical Trial Eligibility Criteria Classification
topic artificial intelligence
clinical trial
deep neural networks
fine-tuning
large language models
machine learning
url https://www.mdpi.com/2673-6470/5/2/12