FakeMusicCaps: A Dataset for Detection and Attribution of Synthetic Music Generated via Text-to-Music Models

Text-to-music (TTM) models have recently revolutionized the automatic music generation research field, specifically by being able to generate music that sounds more plausible than all previous state-of-the-art models and by lowering the technical proficiency needed to use them. For these reasons, they have readily started to be adopted for commercial uses and music production practices. This widespread diffusion of TTMs poses several concerns regarding copyright violation and rightful attribution, posing the need of serious consideration of them by the audio forensics community. In this paper, we tackle the problem of detection and attribution of TTM-generated data. We propose a dataset, FakeMusicCaps, that contains several versions of the music-caption pairs dataset MusicCaps regenerated via several state-of-the-art TTM techniques. We evaluate the proposed dataset by performing initial experiments regarding the detection and attribution of TTM-generated audio considering both closed-set and open-set classification.


Bibliographic Details
Main Authors: Luca Comanducci, Paolo Bestagini, Stefano Tubaro
Format: Article
Language:English
Published: MDPI AG 2025-07-01
Series:Journal of Imaging
Subjects:
Online Access:https://www.mdpi.com/2313-433X/11/7/242
_version_ 1839615859749814272
author Luca Comanducci
Paolo Bestagini
Stefano Tubaro
author_facet Luca Comanducci
Paolo Bestagini
Stefano Tubaro
author_sort Luca Comanducci
collection DOAJ
description Text-to-music (TTM) models have recently revolutionized the automatic music generation research field, specifically by being able to generate music that sounds more plausible than all previous state-of-the-art models and by lowering the technical proficiency needed to use them. For these reasons, they have readily started to be adopted for commercial uses and music production practices. This widespread diffusion of TTMs poses several concerns regarding copyright violation and rightful attribution, posing the need of serious consideration of them by the audio forensics community. In this paper, we tackle the problem of detection and attribution of TTM-generated data. We propose a dataset, FakeMusicCaps, that contains several versions of the music-caption pairs dataset MusicCaps regenerated via several state-of-the-art TTM techniques. We evaluate the proposed dataset by performing initial experiments regarding the detection and attribution of TTM-generated audio considering both closed-set and open-set classification.
format Article
id doaj-art-a68183a7f37d4e52a23bcd1fe88a9744
institution Matheson Library
issn 2313-433X
language English
publishDate 2025-07-01
publisher MDPI AG
record_format Article
series Journal of Imaging
spelling doaj-art-a68183a7f37d4e52a23bcd1fe88a97442025-07-25T13:26:33ZengMDPI AGJournal of Imaging2313-433X2025-07-0111724210.3390/jimaging11070242FakeMusicCaps: A Dataset for Detection and Attribution of Synthetic Music Generated via Text-to-Music ModelsLuca Comanducci0Paolo Bestagini1Stefano Tubaro2Department of Electronics, Information and Bioengineering (DEIB), Politecnico di Milano, 20133 Milano, ItalyDepartment of Electronics, Information and Bioengineering (DEIB), Politecnico di Milano, 20133 Milano, ItalyDepartment of Electronics, Information and Bioengineering (DEIB), Politecnico di Milano, 20133 Milano, ItalyText-to-music (TTM) models have recently revolutionized the automatic music generation research field, specifically by being able to generate music that sounds more plausible than all previous state-of-the-art models and by lowering the technical proficiency needed to use them. For these reasons, they have readily started to be adopted for commercial uses and music production practices. This widespread diffusion of TTMs poses several concerns regarding copyright violation and rightful attribution, posing the need of serious consideration of them by the audio forensics community. In this paper, we tackle the problem of detection and attribution of TTM-generated data. We propose a dataset, FakeMusicCaps, that contains several versions of the music-caption pairs dataset MusicCaps regenerated via several state-of-the-art TTM techniques. We evaluate the proposed dataset by performing initial experiments regarding the detection and attribution of TTM-generated audio considering both closed-set and open-set classification.https://www.mdpi.com/2313-433X/11/7/242music generationtext-to-musicaudio forensicsDeepFake
spellingShingle Luca Comanducci
Paolo Bestagini
Stefano Tubaro
FakeMusicCaps: A Dataset for Detection and Attribution of Synthetic Music Generated via Text-to-Music Models
Journal of Imaging
music generation
text-to-music
audio forensics
DeepFake
title FakeMusicCaps: A Dataset for Detection and Attribution of Synthetic Music Generated via Text-to-Music Models
title_full FakeMusicCaps: A Dataset for Detection and Attribution of Synthetic Music Generated via Text-to-Music Models
title_fullStr FakeMusicCaps: A Dataset for Detection and Attribution of Synthetic Music Generated via Text-to-Music Models
title_full_unstemmed FakeMusicCaps: A Dataset for Detection and Attribution of Synthetic Music Generated via Text-to-Music Models
title_short FakeMusicCaps: A Dataset for Detection and Attribution of Synthetic Music Generated via Text-to-Music Models
title_sort fakemusiccaps a dataset for detection and attribution of synthetic music generated via text to music models
topic music generation
text-to-music
audio forensics
DeepFake
url https://www.mdpi.com/2313-433X/11/7/242
work_keys_str_mv AT lucacomanducci fakemusiccapsadatasetfordetectionandattributionofsyntheticmusicgeneratedviatexttomusicmodels
AT paolobestagini fakemusiccapsadatasetfordetectionandattributionofsyntheticmusicgeneratedviatexttomusicmodels
AT stefanotubaro fakemusiccapsadatasetfordetectionandattributionofsyntheticmusicgeneratedviatexttomusicmodels