Robust AI-Generated Lyrics Detection

Abstract

The rapid advance of Artificial Intelligence (AI)-based music generation tools presents new opportunities for the music industry but also poses significant challenges, necessitating reliable methods for detecting AI-generated content. Existing detectors, however, face key practical limitations: audio-based approaches struggle to generalize to unseen generators and are not robust to common audio perturbations, while lyrics-based methods depend on cleanly formatted lyrics that are unavailable in real-world settings. To address this gap, this thesis proposes and evaluates a novel, practically grounded approach that leverages lyrical content extracted directly from the audio signal. Our method first transcribes sung lyrics using a general-purpose Automatic Speech Recognition (ASR) model, allowing established AI-generated text detection methods to be applied. To further improve performance, we introduce Double Entendre-detect (DE-detect), a multi-view late-fusion method that also incorporates audio-derived speech features capturing paralinguistic information. By focusing on lyrical and speech-related information rather than low-level audio artifacts, our method is designed for improved robustness and generalization. Experiments on a diverse dataset show that DE-detect achieves strong detection performance compared to text-only methods and, crucially, outperforms audio-based approaches, especially when tested against various audio perturbations and unseen music generators. This work thus presents an effective, robust, and practical solution for detecting AI-generated music.
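
To make the idea of multi-view late fusion concrete, the following is a minimal sketch and not the DE-detect implementation. The abstract does not state how the two views are combined, so score-level weighted averaging is an assumption made here purely for illustration. The function names, the weight `w_text`, and the decision threshold are all hypothetical, and the two input probabilities stand in for the outputs of a text-based detector (run on ASR transcripts) and a speech-feature-based detector.

```python
def fuse_scores(p_text: float, p_speech: float, w_text: float = 0.5) -> float:
    """Combine per-view probabilities that a song is AI-generated.

    Assumption: late fusion is done by a weighted average of the two
    view-level scores. The thesis may fuse the views differently.
    """
    if not 0.0 <= w_text <= 1.0:
        raise ValueError("w_text must lie in [0, 1]")
    for p in (p_text, p_speech):
        if not 0.0 <= p <= 1.0:
            raise ValueError("scores must be probabilities in [0, 1]")
    # Each view is scored independently; only the final scores are merged.
    return w_text * p_text + (1.0 - w_text) * p_speech


def is_ai_generated(p_text: float, p_speech: float,
                    threshold: float = 0.5, w_text: float = 0.5) -> bool:
    """Apply a hypothetical decision threshold to the fused score."""
    return fuse_scores(p_text, p_speech, w_text) >= threshold
```

In this sketch each view can be trained and evaluated separately, and a degraded view (for example, a noisy transcript) can be down-weighted through `w_text` without retraining the other.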


Citation

Markus Frohmann. Robust AI-Generated Lyrics Detection, 2025.

BibTeX

@mastersthesis{MarkusFrohmann2025master-thesis,
    title = {Robust AI-Generated Lyrics Detection},
    author = {Markus Frohmann},
    school = {Johannes Kepler University Linz},
    year = {2025}
}