Deep learning microphone array speech enhancement for multiple speaker separation

Bibliographic Details
Main Authors: Zhang Jiayang, Tong Feng, Chen Dongsheng, Huang Huixiang
Format: Article
Language: Chinese
Published: National Computer System Engineering Research Institute of China 2022-05-01
Series: Dianzi Jishu Yingyong
Subjects:
Online Access: http://www.chinaaet.com/article/3000149429
Description
Summary: As human-computer voice interaction scenarios have multiplied in recent years, microphone array speech enhancement for improving speech quality has become an active research topic. Unlike ambient noise, an interfering speaker's speech in a multi-speaker separation scenario is the same kind of signal as the target speaker's speech and shows similar time-frequency characteristics, which makes separation harder for traditional microphone array speech enhancement techniques. For the multi-speaker separation scenario, a spatial response cost function for the microphone array is constructed and optimized using a deep learning network. The desired spatial transmission characteristics of the array are obtained by training the deep learning model, so that better beamforming performance yields better separation. Simulation and experimental results show that the method effectively improves multi-speaker separation performance.
ISSN: 0258-7998
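
Illustration: the summary refers to a spatial response cost function for a microphone array. The record does not give the authors' formulation, so the following is only a minimal sketch of what such a cost could look like. Everything in it is an assumption made for illustration: the uniform linear array geometry, the single frequency, the function names (steering_vector, spatial_cost, solve_weights), the penalty weight lam, and the closed-form least-squares solve, which stands in for the deep learning optimization described in the article and does not reproduce it.

```python
import numpy as np

def steering_vector(theta_deg, n_mics=8, spacing=0.04, freq=2000.0, c=343.0):
    """Far-field steering vector of a uniform linear array (assumed geometry)."""
    m = np.arange(n_mics)
    delay = m * spacing * np.sin(np.deg2rad(theta_deg)) / c
    return np.exp(-2j * np.pi * freq * delay)

def spatial_cost(w, target_deg, interferer_degs, lam=1.0):
    """Hypothetical cost: unit response toward the target, low response toward interferers."""
    # np.vdot conjugates its first argument, so this is w^H a(theta).
    target_err = np.abs(np.vdot(w, steering_vector(target_deg)) - 1.0) ** 2
    leak = sum(np.abs(np.vdot(w, steering_vector(t))) ** 2 for t in interferer_degs)
    return float(target_err + lam * leak)

def solve_weights(target_deg, interferer_degs, lam=1.0, eps=1e-6):
    """Closed-form minimizer of spatial_cost plus a small ridge term eps * ||w||^2."""
    a_t = steering_vector(target_deg)
    R = np.outer(a_t, a_t.conj()) + eps * np.eye(a_t.size)
    for t in interferer_degs:
        a_i = steering_vector(t)
        R = R + lam * np.outer(a_i, a_i.conj())
    # Setting the gradient with respect to conj(w) to zero gives R w = a_t.
    return np.linalg.solve(R, a_t)

if __name__ == "__main__":
    w = solve_weights(target_deg=0.0, interferer_degs=[40.0])
    print("response toward target:    ", abs(np.vdot(w, steering_vector(0.0))))
    print("response toward interferer:", abs(np.vdot(w, steering_vector(40.0))))
```

In this sketch the weights place a spatial null toward the assumed interfering direction while keeping the response toward the target close to one. This is the kind of beam pattern shaping the summary attributes to the trained model, reduced here to one frequency and known directions.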