Real-Time Transformer Detection of Underwater Objects Based on Lightweight Gated Convolutional Network

Bibliographic Details
Main Authors: Yuhui LI, Huixia CUI, Yaomin LI, Senping JIA
Format: Article
Language: Chinese
Published: Science Press (China) 2025-04-01
Series: 水下无人系统学报 (Journal of Unmanned Undersea Systems)
Online Access: https://sxwrxtxb.xml-journal.net/cn/article/doi/10.11993/j.issn.2096-3920.2024-0182
Description
Summary: To address the challenges facing underwater object detection algorithms, including difficult image feature processing, redundant model architectures, and excessive parameter counts, this paper proposes a real-time Transformer detection method for underwater objects based on a lightweight gated convolutional network. The method first constructs a convolutional gated linear unit that uses a gating mechanism to dynamically modulate feature transmission. Building on this unit, a gated channel interaction module is proposed to fully decouple the token mixer from the channel mixer. For the token mixer, structural reparameterization is introduced to significantly reduce the computational cost of the model during inference. A hybrid encoder performs intra-scale information exchange and multi-scale feature fusion on the three feature maps extracted by the gated backbone network, closely fusing shallow high-frequency information with deep semantic spatial information. Extensive experiments were carried out on datasets of different modalities. The results show that the model reaches an mAP@0.5 of 0.849, with 23.3×10^6 parameters in total and a detection speed of 136.8 FPS. While maintaining high detection accuracy, the model has fewer parameters and a higher detection speed than the object detection models it was compared against, giving it better overall performance and efficient real-time detection.
ISSN: 2096-3920
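
Note on structural reparameterization: the summary says this technique lowers inference cost for the token mixer but does not describe the branch layout. The sketch below is therefore only an illustration of the general idea, assuming a RepVGG-style block with a 3-tap convolution, a 1-tap convolution, and an identity shortcut on a single 1-D channel. All function names and kernel values are hypothetical and are not taken from the paper. Because convolution is linear, the three training-time branches can be folded into one kernel that gives the same output at inference.

```python
def conv1d_same(signal, kernel):
    """Cross-correlate a 1-D signal with an odd-length kernel, zero-padded to keep the length."""
    pad = len(kernel) // 2
    padded = [0.0] * pad + list(signal) + [0.0] * pad
    return [
        sum(padded[i + j] * kernel[j] for j in range(len(kernel)))
        for i in range(len(signal))
    ]


def multi_branch(signal, k3, k1):
    """Training-time form (assumed layout): 3-tap branch + 1-tap branch + identity shortcut."""
    branch3 = conv1d_same(signal, k3)
    branch1 = conv1d_same(signal, k1)
    return [a + b + s for a, b, s in zip(branch3, branch1, signal)]


def merge_branches(k3, k1):
    """Fold the 1-tap kernel and the identity into the centre tap of the 3-tap kernel."""
    merged = list(k3)
    merged[1] += k1[0] + 1.0  # identity is a 1-tap kernel with weight 1
    return merged


# Illustrative values only.
k3 = [0.5, -1.0, 0.25]
k1 = [2.0]
x = [1.0, 2.0, 3.0, 4.0]

train_out = multi_branch(x, k3, k1)                  # three convolution passes
infer_out = conv1d_same(x, merge_branches(k3, k1))   # one convolution pass
```

Under these assumptions, `train_out` and `infer_out` agree element by element, while the merged form needs a single pass over the input. Real implementations also fold batch-normalization statistics into the merged kernel, which this sketch leaves out.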