Survey of deep fake audio generation and detection techniques

Bibliographic Details
Main Authors: ZENG Zhiping, ZHANG Xulong, QU Xiaoyang, XIAO Chunguang, WANG Jianzong
Format: Article
Language: Chinese
Published: China InfoCom Media Group 2025-01-01
Series: 大数据 (Big Data Research)
Online Access: http://www.j-bigdataresearch.com.cn/zh/article/111999019/
Description
Summary: With the rapid development of deep learning, audio forgery technology, and deepfake audio generation in particular, has become a major challenge for digital media security. The audio these techniques produce is highly realistic and difficult to distinguish from genuine speech, which raises the bar for public security and personal privacy protection. Research on deepfake audio detection is therefore urgent. Although some progress has been made in this field, new challenges keep arising as data volume and computing power grow and new models emerge. This survey first outlines the evolution of fake audio generation, from traditional audio editing to deep learning-based generative models. It then analyzes in depth two families of fake audio detection strategies, those based on acoustic features and those based on end-to-end models, covering deep acoustic feature detection, pre-trained neural network feature detection, end-to-end model optimization, generalization enhancement, and improvements to real-time detection. Finally, it looks ahead to future work, highlighting the key research directions that need attention in order to advance deepfake audio detection, counter increasingly sophisticated fake audio threats more effectively, and ensure the security and reliability of digital media.
ISSN: 2096-0271