CAPSE-ViT: A Lightweight Framework for Underwater Acoustic Vessel Classification Using Coherent Spectral Estimation and Modified Vision Transformer

Underwater acoustic target classification has become a key area of research for marine vessel classification, where machine learning (ML) models are leveraged to identify targets automatically. The major challenge is incorporating domain-specific knowledge into ML frameworks to extract features that e...

Bibliographic Details
Main Authors:	Najamuddin NAJAMUDDIN, Usman Ullah SHEIKH, Ahmad Zuri SHA’AMERI
Format:	Article
Language:	English
Published:	Institute of Fundamental Technological Research Polish Academy of Sciences 2025-06-01
Series:	Archives of Acoustics
Subjects:	underwater acoustic targets; CAPSE; vision transformer; CNN; LOFAR gram
Online Access:	https://acoustics.ippt.pan.pl/index.php/aa/article/view/4197

_version_	1839595964006924288
author	Najamuddin NAJAMUDDIN Usman Ullah SHEIKH Ahmad Zuri SHA’AMERI
author_facet	Najamuddin NAJAMUDDIN Usman Ullah SHEIKH Ahmad Zuri SHA’AMERI
author_sort	Najamuddin NAJAMUDDIN
collection	DOAJ
description	Underwater acoustic target classification has become a key area of research for marine vessel classification, where machine learning (ML) models are leveraged to identify targets automatically. The major challenge is incorporating domain-specific knowledge into ML frameworks to extract features that effectively distinguish between different vessel types. In this study, we propose a model that uses the coherently averaged power spectral estimation (CAPSE) algorithm. Vessel frequency spectra are first computed through CAPSE analysis, capturing key machinery characteristics. The resulting features are then processed by a vision transformer (ViT) network, whose self-attention mechanisms capture global dependencies between features across the entire input. This enables the model to learn more complex relationships and patterns within the data, thereby improving classification performance. The results, evaluated on the standard DeepShip and ShipsEar datasets, show that the proposed model achieved a classification accuracy of 97.98 % and 99.19 % while using just 1.90 million parameters, outperforming other models such as ResNet18 and UATR-Transformer in both accuracy and computational efficiency. This work contributes to the development of efficient marine vessel classification systems for underwater acoustics applications, demonstrating that high performance can be achieved with reduced computational complexity.
format	Article
id	doaj-art-e934e062283c4ee9b1cea20c8fbfbb86
institution	Matheson Library
issn	0137-5075 2300-262X
language	English
publishDate	2025-06-01
publisher	Institute of Fundamental Technological Research Polish Academy of Sciences
record_format	Article
series	Archives of Acoustics
spelling	doaj-art-e934e062283c4ee9b1cea20c8fbfbb86; 2025-08-02T22:22:15Z; eng; Institute of Fundamental Technological Research Polish Academy of Sciences; Archives of Acoustics; 0137-5075; 2300-262X; 2025-06-01; 50216117110.24425/aoa.2025.1536623749; CAPSE-ViT: A Lightweight Framework for Underwater Acoustic Vessel Classification Using Coherent Spectral Estimation and Modified Vision Transformer; Najamuddin NAJAMUDDIN 0; Usman Ullah SHEIKH 1; Ahmad Zuri SHA’AMERI 2; Faculty of Electrical Engineering, Universiti Teknologi Malaysia, UTM Skudai; Faculty of Electrical Engineering, Universiti Teknologi Malaysia, UTM Skudai; Faculty of Electrical Engineering, Universiti Teknologi Malaysia, UTM Skudai; Underwater acoustic target classification has become a key area of research for marine vessel classification, where machine learning (ML) models are leveraged to identify targets automatically. The major challenge is incorporating domain-specific knowledge into ML frameworks to extract features that effectively distinguish between different vessel types. In this study, we propose a model that uses the coherently averaged power spectral estimation (CAPSE) algorithm. Vessel frequency spectra are first computed through CAPSE analysis, capturing key machinery characteristics. The resulting features are then processed by a vision transformer (ViT) network, whose self-attention mechanisms capture global dependencies between features across the entire input. This enables the model to learn more complex relationships and patterns within the data, thereby improving classification performance. The results, evaluated on the standard DeepShip and ShipsEar datasets, show that the proposed model achieved a classification accuracy of 97.98 % and 99.19 % while using just 1.90 million parameters, outperforming other models such as ResNet18 and UATR-Transformer in both accuracy and computational efficiency. This work contributes to the development of efficient marine vessel classification systems for underwater acoustics applications, demonstrating that high performance can be achieved with reduced computational complexity.; https://acoustics.ippt.pan.pl/index.php/aa/article/view/4197; underwater acoustic targets; capse; vision transformer; cnn; lofar gram
spellingShingle	Najamuddin NAJAMUDDIN Usman Ullah SHEIKH Ahmad Zuri SHA’AMERI CAPSE-ViT: A Lightweight Framework for Underwater Acoustic Vessel Classification Using Coherent Spectral Estimation and Modified Vision Transformer Archives of Acoustics underwater acoustic targets capse vision transformer cnn lofar gram
title	CAPSE-ViT: A Lightweight Framework for Underwater Acoustic Vessel Classification Using Coherent Spectral Estimation and Modified Vision Transformer
title_full	CAPSE-ViT: A Lightweight Framework for Underwater Acoustic Vessel Classification Using Coherent Spectral Estimation and Modified Vision Transformer
title_fullStr	CAPSE-ViT: A Lightweight Framework for Underwater Acoustic Vessel Classification Using Coherent Spectral Estimation and Modified Vision Transformer
title_full_unstemmed	CAPSE-ViT: A Lightweight Framework for Underwater Acoustic Vessel Classification Using Coherent Spectral Estimation and Modified Vision Transformer
title_short	CAPSE-ViT: A Lightweight Framework for Underwater Acoustic Vessel Classification Using Coherent Spectral Estimation and Modified Vision Transformer
title_sort	capse vit a lightweight framework for underwater acoustic vessel classification using coherent spectral estimation and modified vision transformer
topic	underwater acoustic targets; capse; vision transformer; cnn; lofar gram
url	https://acoustics.ippt.pan.pl/index.php/aa/article/view/4197
work_keys_str_mv	AT najamuddinnajamuddin capsevitalightweightframeworkforunderwateracousticvesselclassificationusingcoherentspectralestimationandmodifiedvisiontransformer AT usmanullahsheikh capsevitalightweightframeworkforunderwateracousticvesselclassificationusingcoherentspectralestimationandmodifiedvisiontransformer AT ahmadzurishaameri capsevitalightweightframeworkforunderwateracousticvesselclassificationusingcoherentspectralestimationandmodifiedvisiontransformer

Similar Items