CAPSE-ViT: A Lightweight Framework for Underwater Acoustic Vessel Classification Using Coherent Spectral Estimation and Modified Vision Transformer

Bibliographic Details
Main Authors: Najamuddin NAJAMUDDIN, Usman Ullah SHEIKH, Ahmad Zuri SHA’AMERI
Format: Article
Language: English
Published: Institute of Fundamental Technological Research Polish Academy of Sciences 2025-06-01
Series: Archives of Acoustics
Subjects: underwater acoustic targets; CAPSE; vision transformer; CNN; LOFAR gram
Online Access: https://acoustics.ippt.pan.pl/index.php/aa/article/view/4197
collection DOAJ
description Underwater acoustic target classification has become a key research area for identifying marine vessels, where machine learning (ML) models are used to identify targets automatically. The major challenge is incorporating domain-specific knowledge into ML frameworks to extract features that effectively distinguish between vessel types. In this study, we propose a model that uses the coherently averaged power spectral estimation (CAPSE) algorithm. Vessel frequency spectra are first computed through CAPSE analysis, capturing key machinery characteristics. The resulting features are then processed by a vision transformer (ViT) network, whose self-attention mechanisms capture global dependencies between features. This lets the model attend to relationships across the entire input and learn more complex patterns in the data, thereby improving classification performance. The results, evaluated on the standard DeepShip and ShipsEar datasets, show that the proposed model achieves classification accuracies of 97.98 % and 99.19 %, respectively, while using just 1.90 million parameters, outperforming other models such as ResNet18 and UATR-Transformer in both accuracy and computational efficiency. This work contributes to the development of efficient marine vessel classification systems for underwater acoustics applications, demonstrating that high performance can be achieved with reduced computational complexity.
id doaj-art-e934e062283c4ee9b1cea20c8fbfbb86
institution Matheson Library
issn 0137-5075
2300-262X
volume 50
issue 2
pages 161-171
doi 10.24425/aoa.2025.153662
affiliation Faculty of Electrical Engineering, Universiti Teknologi Malaysia, UTM Skudai (all three authors)
topic underwater acoustic targets
capse
vision transformer
cnn
lofar gram