Deep learning microphone array speech enhancement for multiple speaker separation

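As human-computer voice interaction scenarios have multiplied in recent years, microphone array speech enhancement for improving speech quality has become a research hotspot. Unlike ambient noise, an interfering speaker's speech in a multiple speaker separation scenario is itself a speech signal and shows time-frequency characteristics similar to those of the target speaker, which poses a greater challenge to traditional microphone array speech enhancement. For the multiple speaker separation scenario, a spatial response cost function for the microphone array is constructed and optimized with a deep learning network. The desired spatial transmission characteristics of the array are designed through deep learning model training, so that improved beamforming performance yields better separation. Simulation and experimental results show that the method effectively improves multiple speaker separation performance.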

Bibliographic Details
Main Authors: Zhang Jiayang, Tong Feng, Chen Dongsheng, Huang Huixiang
Format: Article
Language: Chinese
Published: National Computer System Engineering Research Institute of China 2022-05-01
Series: Dianzi Jishu Yingyong
Subjects: deep learning; microphone array; beamforming; LSTM
Online Access: http://www.chinaaet.com/article/3000149429
author Zhang Jiayang
Tong Feng
Chen Dongsheng
Huang Huixiang
collection DOAJ
description As human-computer voice interaction scenarios have multiplied in recent years, microphone array speech enhancement for improving speech quality has become a research hotspot. Unlike ambient noise, an interfering speaker's speech in a multiple speaker separation scenario is itself a speech signal and shows time-frequency characteristics similar to those of the target speaker, which poses a greater challenge to traditional microphone array speech enhancement. For the multiple speaker separation scenario, a spatial response cost function for the microphone array is constructed and optimized with a deep learning network. The desired spatial transmission characteristics of the array are designed through deep learning model training, so that improved beamforming performance yields better separation. Simulation and experimental results show that the method effectively improves multiple speaker separation performance.
format Article
id doaj-art-2f4345c44e4d4e6f9c38a5fac0087d16
institution Matheson Library
issn 0258-7998
language zho
publishDate 2022-05-01
publisher National Computer System Engineering Research Institute of China
record_format Article
series Dianzi Jishu Yingyong
spelling doaj-art-2f4345c44e4d4e6f9c38a5fac0087d16 | 2025-07-04T08:29:17Z | zho | National Computer System Engineering Research Institute of China | Dianzi Jishu Yingyong | 0258-7998 | 2022-05-01 | 48 | 5 | 31 | 36 | 10.16157/j.issn.0258-7998.212404 | 3000149429 | Deep learning microphone array speech enhancement for multiple speaker separation | Zhang Jiayang (0), Tong Feng (1), Chen Dongsheng (2), Huang Huixiang (3) | (0-3) Key Laboratory of Underwater Acoustic Communication and Marine Information Technology Ministry of Education, Xiamen University, Xiamen 361005, China | http://www.chinaaet.com/article/3000149429 | deep learning | microphone array | beamforming | lstm
title Deep learning microphone array speech enhancement for multiple speaker separation
topic deep learning
microphone array
beamforming
LSTM
url http://www.chinaaet.com/article/3000149429