gamUnet: designing global attention-based CNN architectures for enhanced oral cancer detection and segmentation

Introduction: Oral squamous cell carcinoma (OSCC) is a significant global health burden, where timely and accurate diagnosis is essential for improved patient outcomes. Conventional diagnosis relies on manual evaluation of hematoxylin and eosin (H&E)-stained slides, a time-consuming process re...

Bibliographic Details
Main Authors: Jinyang Zhang, Hongxin Ding, Runchuan Zhu, Weibin Liao, Junfeng Zhao, Min Gao, Xiaoyun Zhang
Format: Article
Language: English
Published: Frontiers Media S.A. 2025-07-01
Series: Frontiers in Medicine
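Subjects: oral squamous cell carcinoma (OSCC); segmentation; image processing; image classification; convolutional neural networks; deep learning–artificial intelligence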
Online Access: https://www.frontiersin.org/articles/10.3389/fmed.2025.1582439/full
_version_ 1839619582555324416
author Jinyang Zhang
Hongxin Ding
Runchuan Zhu
Weibin Liao
Junfeng Zhao
Min Gao
Xiaoyun Zhang
author_sort Jinyang Zhang
collection DOAJ
description Introduction: Oral squamous cell carcinoma (OSCC) is a significant global health burden, where timely and accurate diagnosis is essential for improved patient outcomes. Conventional diagnosis relies on manual evaluation of hematoxylin and eosin (H&E)-stained slides, a time-consuming process requiring specialized expertise and prone to variability. While deep learning methods, especially convolutional neural networks (CNNs), have advanced automated analysis of histopathological images for cancerous tissues in various body parts, OSCC presents unique challenges. Its infiltrative growth patterns and poorly defined boundaries, coupled with the complex architecture of the oral cavity, make accurate segmentation particularly difficult. Traditional CNNs, which struggle to capture critical global contextual information, often fail to distinguish the complex tissue structures in OSCC images. Methods: To address these challenges, we propose a novel architecture called gamUnet, which integrates the Global Attention Mechanism (GAM) to enhance the model's ability to capture global cross-modal information. This allows the model to focus on key diagnostic regions while retaining detailed spatial information. Additionally, we introduce an extended model, gamResNet, to further improve OSCC detection performance. Both architectures show significant improvements in handling the unique challenges of oral cancer images. Results: Extensive experiments on public datasets show that our GAM-enhanced architecture significantly outperforms conventional models, achieving superior accuracy, robustness, and efficiency in OSCC diagnosis. Discussion: Our approach provides an effective tool for clinicians in diagnosing OSCC, reducing diagnostic variability, and ultimately contributing to improved patient care and treatment planning.
format Article
id doaj-art-f8bc0ffd25f5451b8748a0272f0e61c0
institution Matheson Library
issn 2296-858X
language English
publishDate 2025-07-01
publisher Frontiers Media S.A.
record_format Article
series Frontiers in Medicine
spelling doaj-art-f8bc0ffd25f5451b8748a0272f0e61c0 | 2025-07-23T05:35:38Z | eng | Frontiers Media S.A. | Frontiers in Medicine | 2296-858X | 2025-07-01 | 12 | 10.3389/fmed.2025.1582439 | 1582439
Jinyang Zhang: School of Computer Science, Peking University, Beijing, China
Hongxin Ding: School of Computer Science, Peking University, Beijing, China; Key Laboratory of High Confidence Software Technologies, Ministry of Education, Peking University, Beijing, China
Runchuan Zhu: School of Computer Science, Peking University, Beijing, China
Weibin Liao: School of Computer Science, Peking University, Beijing, China; Key Laboratory of High Confidence Software Technologies, Ministry of Education, Peking University, Beijing, China
Junfeng Zhao: School of Computer Science, Peking University, Beijing, China; Key Laboratory of High Confidence Software Technologies, Ministry of Education, Peking University, Beijing, China
Min Gao: School and Hospital of Stomatology, Peking University, Beijing, China
Xiaoyun Zhang: School and Hospital of Stomatology, Peking University, Beijing, China
title gamUnet: designing global attention-based CNN architectures for enhanced oral cancer detection and segmentation
topic oral squamous cell carcinoma (OSCC)
segmentation
image processing
image classification
convolutional neural networks
deep learning–artificial intelligence
url https://www.frontiersin.org/articles/10.3389/fmed.2025.1582439/full