Birdsong Recognition Based on Attention Hash Algorithm Combined with Contrastive Loss

Bibliographic Details
Main Authors: WANG Yuwei, CHEN Aibin, ZHOU Guoxiong, ZHANG Zhiqiang
Format: Article
Language: Chinese
Published: Harbin University of Science and Technology Publications 2024-12-01
Series: Journal of Harbin University of Science and Technology
Online Access: https://hlgxb.hrbust.edu.cn/#/digest?ArticleID=2383
Description
Summary: To address the length misalignment, redundancy, noise, and large intra-class differences found in birdsong data collected in natural environments, the authors propose an automatic birdsong recognition model that combines a two-stage hash algorithm based on multi-level attention with a lightweight classifier trained with a fused contrastive loss. The first stage of the hash algorithm handles redundancy and noise: it divides the logarithmic Mel spectrogram into fragments, computes the self-attention between fragments, extracts the resulting multi-level self-attention weight matrix, and then uses that matrix, weighted by a custom noise suppression coefficient, to trim redundant and noisy fragments from the input. The second stage handles the misalignment of input dimensions: it uses a correlation weight matrix constructed from the multi-level attention to screen the input fragments and thereby normalize the input dimension. To address large intra-class differences, a comprehensive loss function that incorporates contrastive loss is proposed, which improves the model's ability to extract generalized features. The proposed model achieves a best performance of 92.49% on a self-built dataset covering 14 kinds of birdsong, and recognition accuracies of 94.38% and 97.74% on the public datasets BirdsData and BIRDS, respectively, surpassing existing methods.
ISSN: 1007-2683
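
The two-stage fragment selection described in the summary can be illustrated with a minimal NumPy sketch. The summary does not give the paper's formulas, so everything here is an assumption made for illustration: the function names (`split_fragments`, `fragment_scores`, `two_stage_select`), the use of plain scaled dot-product attention on raw fragment values, the scoring rule (mean attention received by each fragment), the thresholding rule (`alpha` times the mean score, standing in for the noise suppression coefficient), and zero-padding when fewer than `k` fragments survive. It is not the authors' implementation.

```python
import numpy as np

def split_fragments(log_mel, frag_len):
    """Split a (n_mels, n_frames) spectrogram into flattened fixed-length fragments."""
    n_mels, n_frames = log_mel.shape
    n_frags = n_frames // frag_len              # the ragged tail is dropped
    trimmed = log_mel[:, : n_frags * frag_len]
    # (n_mels, n_frags, frag_len) -> (n_frags, n_mels * frag_len)
    return trimmed.reshape(n_mels, n_frags, frag_len).transpose(1, 0, 2).reshape(n_frags, -1)

def fragment_scores(frags):
    """Self-attention between fragments; score = mean attention each fragment receives."""
    d = frags.shape[1]
    logits = frags @ frags.T / np.sqrt(d)
    logits -= logits.max(axis=1, keepdims=True)    # numerical stability
    weights = np.exp(logits)
    weights /= weights.sum(axis=1, keepdims=True)  # row-wise softmax
    return weights.mean(axis=0)

def two_stage_select(log_mel, frag_len=8, alpha=0.5, k=4):
    """Return a fixed-size (k, n_mels * frag_len) array and the kept fragment indices."""
    frags = split_fragments(log_mel, frag_len)
    scores = fragment_scores(frags)
    # Stage 1 (assumed rule): trim fragments scoring below alpha * mean score.
    keep = np.flatnonzero(scores >= alpha * scores.mean())
    # Stage 2 (assumed rule): keep the k best survivors, restored to temporal order.
    top = keep[np.argsort(scores[keep])[::-1][:k]]
    top.sort()
    out = np.zeros((k, frags.shape[1]), dtype=frags.dtype)
    out[: len(top)] = frags[top]                   # zero-pad if fewer than k remain
    return out, top
```

Whatever the length of the input recording, the output has the same shape, which is the dimension normalization the second stage aims for. The sketch assumes the input contains at least one full fragment.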