This project is maintained by blindconf
This page presents supplementary material with a selection of synthetic audio samples generated by the models used in our experiments (paper).
Real audio samples are derived from the LJ Speech Dataset.
We investigate open-world single-model attribution using Residual Fingerprints (RFPs).
RFPs achieve near-perfect AUROC (≈1.0) in distinguishing target synthesis systems from unseen generative models and real speech, demonstrating strong generalization.
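As a reminder of what the AUROC figure measures, here is a minimal sketch that computes it from two lists of attribution scores. The function name `auroc` and the score values are hypothetical and chosen only for illustration; they are not results from the paper, and how an RFP-based score is actually computed is not shown here.

```python
def auroc(pos_scores, neg_scores):
    """Probability that a random positive outscores a random negative.

    Ties count as half a win, which makes this equal to the area
    under the ROC curve (Mann-Whitney U formulation).
    """
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical scores: higher means "more likely from the target system".
target_scores = [0.91, 0.88, 0.95, 0.79]   # samples from the target model
other_scores = [0.12, 0.30, 0.81, 0.05]    # unseen models and real speech

print(auroc(target_scores, other_scores))  # 15 of 16 pairs are ranked correctly
```

An AUROC of 1.0 means every sample from the target system scores higher than every sample from an unseen model or from real speech; 0.5 corresponds to chance.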
Under realistic audio perturbations such as noise, echo, and compression, RFPs maintain high attribution accuracy. When perturbations are severe, performance can be restored through simple data augmentation during RFP construction.
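To make two of these perturbations concrete, the sketch below applies additive white noise at a chosen signal-to-noise ratio and a single delayed echo to a waveform stored as a list of floats. The helper names `add_noise` and `add_echo`, the parameter values, and the test tone are assumptions for illustration; the exact perturbation settings used in the experiments are not listed on this page, and compression is omitted because it requires an external codec.

```python
import math
import random


def add_noise(signal, snr_db, seed=0):
    """Add white Gaussian noise so that the result has roughly snr_db dB SNR."""
    rng = random.Random(seed)
    signal_power = sum(x * x for x in signal) / len(signal)
    noise_power = signal_power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [x + rng.gauss(0.0, sigma) for x in signal]


def add_echo(signal, delay, decay):
    """Mix in one copy of the signal, delayed by `delay` samples and scaled by `decay`."""
    out = list(signal)
    for i in range(delay, len(signal)):
        out[i] += decay * signal[i - delay]
    return out


# Hypothetical input: one second of a 440 Hz tone sampled at 16 kHz.
sample_rate = 16000
clean = [math.sin(2 * math.pi * 440 * t / sample_rate) for t in range(sample_rate)]

noisy = add_noise(clean, snr_db=20)
echoed = add_echo(clean, delay=sample_rate // 10, decay=0.5)  # 100 ms echo
```

Under the same assumptions, augmentation would mean including perturbed copies such as `noisy` and `echoed` alongside the clean audio when the fingerprint is built.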
Transcription: Printing, in the only sense with which we are at present concerned, differs from most if not from all the arts and crafts represented in the Exhibition.
Synthetic Audio Samples:
Audio Corruption Effects:
Transcription: In being comparatively modern.
Synthetic Audio Samples:
Audio Corruption Effects: