gamUnet: designing global attention-based CNN architectures for enhanced oral cancer detection and segmentation
Introduction: Oral squamous cell carcinoma (OSCC) is a significant global health burden, where timely and accurate diagnosis is essential for improved patient outcomes. Conventional diagnosis relies on manual evaluation of hematoxylin and eosin (H&E)-stained slides, a time-consuming process re...
Main Authors: | Jinyang Zhang, Hongxin Ding, Runchuan Zhu, Weibin Liao, Junfeng Zhao, Min Gao, Xiaoyun Zhang |
---|---|
Format: | Article |
Language: | English |
Published: | Frontiers Media S.A., 2025-07-01 |
Series: | Frontiers in Medicine |
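Subjects: | oral squamous cell carcinoma (OSCC); segmentation; image processing; image classification; convolutional neural networks; deep learning–artificial intelligence |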
Online Access: | https://www.frontiersin.org/articles/10.3389/fmed.2025.1582439/full |
_version_ | 1839619582555324416 |
---|---|
author | Jinyang Zhang; Hongxin Ding; Runchuan Zhu; Weibin Liao; Junfeng Zhao; Min Gao; Xiaoyun Zhang |
author_facet | Jinyang Zhang; Hongxin Ding; Runchuan Zhu; Weibin Liao; Junfeng Zhao; Min Gao; Xiaoyun Zhang |
author_sort | Jinyang Zhang |
collection | DOAJ |
description | Introduction: Oral squamous cell carcinoma (OSCC) is a significant global health burden, where timely and accurate diagnosis is essential for improved patient outcomes. Conventional diagnosis relies on manual evaluation of hematoxylin and eosin (H&E)-stained slides, a time-consuming process requiring specialized expertise and prone to variability. While deep learning methods, especially convolutional neural networks (CNNs), have advanced automated analysis of histopathological images for cancerous tissues in various body parts, OSCC presents unique challenges. Its infiltrative growth patterns and poorly defined boundaries, coupled with the complex architecture of the oral cavity, make accurate segmentation particularly difficult. Traditional CNNs, which struggle to capture critical global contextual information, often fail to distinguish the complex tissue structures in OSCC images. Methods: To address these challenges, we propose a novel architecture called gamUnet, which integrates the Global Attention Mechanism (GAM) to enhance the model's ability to capture global cross-modal information. This allows the model to focus on key diagnostic regions while retaining detailed spatial information. Additionally, we introduce an extended model, gamResNet, to further improve OSCC detection performance. Both architectures show significant improvements in handling the unique challenges of oral cancer images. Results: Extensive experiments on public datasets show that our GAM-enhanced architecture significantly outperforms conventional models, achieving superior accuracy, robustness, and efficiency in OSCC diagnosis. Discussion: Our approach provides an effective tool for clinicians in diagnosing OSCC, reducing diagnostic variability, and ultimately contributing to improved patient care and treatment planning. |
format | Article |
id | doaj-art-f8bc0ffd25f5451b8748a0272f0e61c0 |
institution | Matheson Library |
issn | 2296-858X |
language | English |
publishDate | 2025-07-01 |
publisher | Frontiers Media S.A. |
record_format | Article |
series | Frontiers in Medicine |
spelling | doaj-art-f8bc0ffd25f5451b8748a0272f0e61c0; 2025-07-23T05:35:38Z; eng; Frontiers Media S.A.; Frontiers in Medicine; 2296-858X; 2025-07-01; 12; 10.3389/fmed.2025.1582439; 1582439; gamUnet: designing global attention-based CNN architectures for enhanced oral cancer detection and segmentation; Jinyang Zhang (School of Computer Science, Peking University, Beijing, China); Hongxin Ding (School of Computer Science, Peking University, Beijing, China, and Key Laboratory of High Confidence Software Technologies, Ministry of Education, Peking University, Beijing, China); Runchuan Zhu (School of Computer Science, Peking University, Beijing, China); Weibin Liao (School of Computer Science, Peking University, Beijing, China, and Key Laboratory of High Confidence Software Technologies, Ministry of Education, Peking University, Beijing, China); Junfeng Zhao (School of Computer Science, Peking University, Beijing, China, and Key Laboratory of High Confidence Software Technologies, Ministry of Education, Peking University, Beijing, China); Min Gao (School and Hospital of Stomatology, Peking University, Beijing, China); Xiaoyun Zhang (School and Hospital of Stomatology, Peking University, Beijing, China); Introduction: Oral squamous cell carcinoma (OSCC) is a significant global health burden, where timely and accurate diagnosis is essential for improved patient outcomes. Conventional diagnosis relies on manual evaluation of hematoxylin and eosin (H&E)-stained slides, a time-consuming process requiring specialized expertise and prone to variability. While deep learning methods, especially convolutional neural networks (CNNs), have advanced automated analysis of histopathological images for cancerous tissues in various body parts, OSCC presents unique challenges. Its infiltrative growth patterns and poorly defined boundaries, coupled with the complex architecture of the oral cavity, make accurate segmentation particularly difficult. Traditional CNNs, which struggle to capture critical global contextual information, often fail to distinguish the complex tissue structures in OSCC images. Methods: To address these challenges, we propose a novel architecture called gamUnet, which integrates the Global Attention Mechanism (GAM) to enhance the model's ability to capture global cross-modal information. This allows the model to focus on key diagnostic regions while retaining detailed spatial information. Additionally, we introduce an extended model, gamResNet, to further improve OSCC detection performance. Both architectures show significant improvements in handling the unique challenges of oral cancer images. Results: Extensive experiments on public datasets show that our GAM-enhanced architecture significantly outperforms conventional models, achieving superior accuracy, robustness, and efficiency in OSCC diagnosis. Discussion: Our approach provides an effective tool for clinicians in diagnosing OSCC, reducing diagnostic variability, and ultimately contributing to improved patient care and treatment planning.; https://www.frontiersin.org/articles/10.3389/fmed.2025.1582439/full; oral squamous cell carcinoma (OSCC); segmentation; image processing; image classification; convolutional neural networks; deep learning–artificial intelligence |
spellingShingle | Jinyang Zhang; Hongxin Ding; Runchuan Zhu; Weibin Liao; Junfeng Zhao; Min Gao; Xiaoyun Zhang; gamUnet: designing global attention-based CNN architectures for enhanced oral cancer detection and segmentation; Frontiers in Medicine; oral squamous cell carcinoma (OSCC); segmentation; image processing; image classification; convolutional neural networks; deep learning–artificial intelligence |
title | gamUnet: designing global attention-based CNN architectures for enhanced oral cancer detection and segmentation |
title_full | gamUnet: designing global attention-based CNN architectures for enhanced oral cancer detection and segmentation |
title_fullStr | gamUnet: designing global attention-based CNN architectures for enhanced oral cancer detection and segmentation |
title_full_unstemmed | gamUnet: designing global attention-based CNN architectures for enhanced oral cancer detection and segmentation |
title_short | gamUnet: designing global attention-based CNN architectures for enhanced oral cancer detection and segmentation |
title_sort | gamunet designing global attention based cnn architectures for enhanced oral cancer detection and segmentation |
topic | oral squamous cell carcinoma (OSCC); segmentation; image processing; image classification; convolutional neural networks; deep learning–artificial intelligence |
url | https://www.frontiersin.org/articles/10.3389/fmed.2025.1582439/full |