An improved deep learning approach for speech enhancement

Bibliographic Details
Main Authors:	Malek Miled, Mohamed Anouar Ben Messaoud
Format:	Article
Language:	English
Published:	Universidade do Porto 2023-11-01
Series:	U.Porto Journal of Engineering
Subjects:	Speech Enhancement; Empirical Mode Decomposition; Principal Component Analysis; Learning Model
Online Access:	https://journalengineering.fe.up.pt/index.php/upjeng/article/view/1531

Description
Summary:	Single-channel speech enhancement refers to the task of improving the quality and intelligibility of a speech signal in a noisy environment. Time-domain and time-frequency-domain methods are the two main categories of approaches for speech enhancement. In this paper, we propose an approach based on a cross-domain framework. This framework utilizes our knowledge of the spectrogram and overcomes some of the limitations faced by time-frequency-domain methods. First, we apply the intrinsic mode functions of the empirical mode decomposition and an improved version of principal component analysis. Then, we design a cross-domain learning framework to determine the correlations along the frequency and time axes. At a low SNR of -5 dB, the effectiveness of our proposed approach is demonstrated by its performance on objective and subjective measures, with average scores of -0.49, 2.47, 2.44, and 0.68 for SegSNR, PESQ, Cov, and STOI, respectively. The results highlight the success of our approach in addressing low SNR conditions.
ISSN:	2183-6493

Similar Items