A selection of benign and adversarial data employed in our experiments.
This project is maintained by blindconf
Supplementary material containing a selection of benign, adversarial, and noisy data employed in our [paper].
For each sample, we report the word error rate (WER) as an accuracy metric and the segmental signal-to-noise ratio (SNRseg), in dB, as a measure of how much noise the perturbation adds. An SNRseg above 0 dB means the signal is stronger than the noise, and higher values indicate a quieter perturbation. All samples are drawn from the LibriSpeech corpus.
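To make the two metrics concrete, the following is a minimal sketch of how they could be computed. It is not the code used in the paper. The function names, the frame length of 256 samples, and the `eps` stabiliser are assumptions made for illustration. WER is taken as the word-level edit distance divided by the number of reference words, and SNRseg as the mean per-frame SNR in dB between the original and the perturbed waveform.

```python
import math


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the first i-1 reference
    # words and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def segmental_snr(clean, perturbed, frame_len=256, eps=1e-12):
    """Mean per-frame SNR in dB over non-overlapping frames.

    frame_len and eps are illustrative assumptions, not values
    taken from the paper. A trailing partial frame is ignored.
    """
    if len(clean) != len(perturbed):
        raise ValueError("signals must have the same length")
    frame_snrs = []
    for start in range(0, len(clean) - frame_len + 1, frame_len):
        x = clean[start:start + frame_len]
        y = perturbed[start:start + frame_len]
        signal_energy = sum(a * a for a in x)
        noise_energy = sum((a - b) ** 2 for a, b in zip(x, y))
        frame_snrs.append(
            10.0 * math.log10((signal_energy + eps) / (noise_energy + eps))
        )
    if not frame_snrs:
        raise ValueError("signal is shorter than one frame")
    return sum(frame_snrs) / len(frame_snrs)
```

For example, `word_error_rate("ROBIN FITZOOTH", "ROBIN FITZOOTH")` is 0.0 because the two word sequences match exactly. For the adversarial samples, the reference text passed to such a function would presumably be the adversarial target transcription, although this page does not state that explicitly.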
Benign transcription: ROBIN FITZOOTH Adversarial transcription: AND ONE MORE THIS MORNING
[Benign: WER=0.00],
[C&W adversarial: WER=0.00, SNRseg=24.50], [Psychoacoustic adversarial: WER=0.00, SNRseg=25.36]
Benign transcription: WILL YOU FORGIVE ME NOW Adversarial transcription: PAUL STICKS TO HIS THEME
[Benign: WER=0.00],
[C&W adversarial: WER=0.00, SNRseg=22.04], [Psychoacoustic adversarial: WER=0.00, SNRseg=22.95]
Benign transcription: IT WILL BE NO DISAPPOINTMENT TO ME Adversarial transcription: AH VERY WELL
[Benign: WER=0.00],
[C&W adversarial: WER=0.00, SNRseg=8.86], [Psychoacoustic adversarial: WER=0.00, SNRseg=11.23]