CAPSE-ViT: A Lightweight Framework for Underwater Acoustic Vessel Classification Using Coherent Spectral Estimation and Modified Vision Transformer
Underwater acoustic target classification has become a key area of research for marine vessel classification, where machine learning (ML) models are leveraged to identify targets automatically. The major challenge is incorporating domain-specific knowledge into ML frameworks to extract features that e...
Main Authors: | Najamuddin NAJAMUDDIN, Usman Ullah SHEIKH, Ahmad Zuri SHA’AMERI |
---|---|
Format: | Article |
Language: | English |
Published: | Institute of Fundamental Technological Research Polish Academy of Sciences, 2025-06-01 |
Series: | Archives of Acoustics |
Subjects: | underwater acoustic targets; capse; vision transformer; cnn; lofar gram |
Online Access: | https://acoustics.ippt.pan.pl/index.php/aa/article/view/4197 |
_version_ | 1839595964006924288 |
---|---|
author | Najamuddin NAJAMUDDIN Usman Ullah SHEIKH Ahmad Zuri SHA’AMERI |
author_facet | Najamuddin NAJAMUDDIN Usman Ullah SHEIKH Ahmad Zuri SHA’AMERI |
author_sort | Najamuddin NAJAMUDDIN |
collection | DOAJ |
description | Underwater acoustic target classification has become a key area of research for marine vessel classification, where machine learning (ML) models are leveraged to identify targets automatically. The major challenge is incorporating domain-specific knowledge into ML frameworks to extract features that effectively distinguish between different vessel types. In this study, we propose a model that uses the coherently averaged power spectral estimation (CAPSE) algorithm. Vessel frequency spectra are first computed through CAPSE analysis, capturing key machinery characteristics. The features are then processed by a vision transformer (ViT) network. This enables the model to learn more complex relationships and patterns within the data, thereby improving classification performance. It does so by using self-attention mechanisms to capture global dependencies between features, allowing the model to attend to relationships across the entire input. Evaluated on the standard DeepShip and ShipsEar datasets, the proposed model achieved classification accuracies of 97.98 % and 99.19 %, respectively, while using just 1.90 million parameters, outperforming other models such as ResNet18 and UATR-Transformer in both accuracy and computational efficiency. This work contributes to the development of efficient marine vessel classification systems for underwater acoustics applications, demonstrating that high performance can be achieved with reduced computational complexity. |
format | Article |
id | doaj-art-e934e062283c4ee9b1cea20c8fbfbb86 |
institution | Matheson Library |
issn | 0137-5075 2300-262X |
language | English |
publishDate | 2025-06-01 |
publisher | Institute of Fundamental Technological Research Polish Academy of Sciences |
record_format | Article |
series | Archives of Acoustics |
spelling | doaj-art-e934e062283c4ee9b1cea20c8fbfbb86 2025-08-02T22:22:15Z eng Institute of Fundamental Technological Research Polish Academy of Sciences Archives of Acoustics 0137-5075 2300-262X 2025-06-01 50216117110.24425/aoa.2025.1536623749 CAPSE-ViT: A Lightweight Framework for Underwater Acoustic Vessel Classification Using Coherent Spectral Estimation and Modified Vision Transformer Najamuddin NAJAMUDDIN 0 Usman Ullah SHEIKH 1 Ahmad Zuri SHA’AMERI 2 Faculty of Electrical Engineering, Universiti Teknologi Malaysia, UTM Skudai Faculty of Electrical Engineering, Universiti Teknologi Malaysia, UTM Skudai Faculty of Electrical Engineering, Universiti Teknologi Malaysia, UTM Skudai Underwater acoustic target classification has become a key area of research for marine vessel classification, where machine learning (ML) models are leveraged to identify targets automatically. The major challenge is incorporating domain-specific knowledge into ML frameworks to extract features that effectively distinguish between different vessel types. In this study, we propose a model that uses the coherently averaged power spectral estimation (CAPSE) algorithm. Vessel frequency spectra are first computed through CAPSE analysis, capturing key machinery characteristics. The features are then processed by a vision transformer (ViT) network. This enables the model to learn more complex relationships and patterns within the data, thereby improving classification performance. It does so by using self-attention mechanisms to capture global dependencies between features, allowing the model to attend to relationships across the entire input. Evaluated on the standard DeepShip and ShipsEar datasets, the proposed model achieved classification accuracies of 97.98 % and 99.19 %, respectively, while using just 1.90 million parameters, outperforming other models such as ResNet18 and UATR-Transformer in both accuracy and computational efficiency. This work contributes to the development of efficient marine vessel classification systems for underwater acoustics applications, demonstrating that high performance can be achieved with reduced computational complexity. https://acoustics.ippt.pan.pl/index.php/aa/article/view/4197 underwater acoustic targets capse vision transformer cnn lofar gram |
spellingShingle | Najamuddin NAJAMUDDIN Usman Ullah SHEIKH Ahmad Zuri SHA’AMERI CAPSE-ViT: A Lightweight Framework for Underwater Acoustic Vessel Classification Using Coherent Spectral Estimation and Modified Vision Transformer Archives of Acoustics underwater acoustic targets capse vision transformer cnn lofar gram |
title | CAPSE-ViT: A Lightweight Framework for Underwater Acoustic Vessel Classification Using Coherent Spectral Estimation and Modified Vision Transformer |
title_full | CAPSE-ViT: A Lightweight Framework for Underwater Acoustic Vessel Classification Using Coherent Spectral Estimation and Modified Vision Transformer |
title_fullStr | CAPSE-ViT: A Lightweight Framework for Underwater Acoustic Vessel Classification Using Coherent Spectral Estimation and Modified Vision Transformer |
title_full_unstemmed | CAPSE-ViT: A Lightweight Framework for Underwater Acoustic Vessel Classification Using Coherent Spectral Estimation and Modified Vision Transformer |
title_short | CAPSE-ViT: A Lightweight Framework for Underwater Acoustic Vessel Classification Using Coherent Spectral Estimation and Modified Vision Transformer |
title_sort | capse vit a lightweight framework for underwater acoustic vessel classification using coherent spectral estimation and modified vision transformer |
topic | underwater acoustic targets capse vision transformer cnn lofar gram |
url | https://acoustics.ippt.pan.pl/index.php/aa/article/view/4197 |
work_keys_str_mv | AT najamuddinnajamuddin capsevitalightweightframeworkforunderwateracousticvesselclassificationusingcoherentspectralestimationandmodifiedvisiontransformer AT usmanullahsheikh capsevitalightweightframeworkforunderwateracousticvesselclassificationusingcoherentspectralestimationandmodifiedvisiontransformer AT ahmadzurishaameri capsevitalightweightframeworkforunderwateracousticvesselclassificationusingcoherentspectralestimationandmodifiedvisiontransformer |