CAs-Net: A Channel-Aware Speech Network for Uyghur Speech Recognition

Bibliographic Details
Main Authors:	Jiang Zhang, Miaomiao Xu, Lianghui Xu, Yajing Ma
Format:	Article
Language:	English
Published:	MDPI AG 2025-06-01
Series:	Sensors
Subjects:	low-resource speech recognition; channel modeling; multi-scale convolution
Online Access:	https://www.mdpi.com/1424-8220/25/12/3783

Description
Summary:	This paper proposes a Channel-Aware Speech Network (CAs-Net) for low-resource speech recognition tasks, aiming to improve recognition performance for languages such as Uyghur under complex noisy conditions. The proposed model consists of two key components: (1) the Channel Rotation Module (CIM), which reconstructs each frame’s channel vector into a spatial structure and applies a rotation operation to explicitly model the local structural relationships within the channel dimension, thereby enhancing the encoder’s contextual modeling capability; and (2) the Multi-Scale Depthwise Convolution Module (MSDCM), integrated within the Transformer framework, which leverages multi-branch depthwise separable convolutions and a lightweight self-attention mechanism to jointly capture multi-scale temporal patterns, thus improving the model’s perception of compact articulation and complex rhythmic structures. Experiments conducted on a real Uyghur speech recognition dataset demonstrate that CAs-Net achieves the best performance across multiple subsets, with an average Word Error Rate (WER) of 5.72%, significantly outperforming existing approaches. These results validate the robustness and effectiveness of the proposed model under low-resource and noisy conditions.
ISSN:	1424-8220
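
The summary reports results as Word Error Rate (WER). For readers unfamiliar with the metric, the following is a minimal sketch of the standard definition: the word-level edit distance (substitutions + deletions + insertions) divided by the number of reference words. The function name `word_error_rate` and the sample strings are illustrative assumptions, not taken from the article, and the authors' actual scoring pipeline (text normalization, tokenization of Uyghur text) is not described in this record.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Hypothetical helper for illustration; whitespace tokenization is an
    assumption and may differ from the scoring used in the article.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far (minus the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion of a reference word
                curr[j - 1] + 1,                  # insertion of a hypothesis word
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Made-up example: one substitution and one deletion against 4 reference words.
example_wer = word_error_rate("a b c d", "a x c")
```

Under this definition a WER of 5.72% corresponds to roughly 5.7 word errors per 100 reference words. Note that WER can exceed 1.0 when the hypothesis contains many insertions.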