Adversarial example demo

This supplementary material contains a selection of the benign, adversarial, and noisy data used in our [paper].

For each sample, we report the word error rate (WER) as an accuracy metric and the segmental signal-to-noise ratio (SNR_seg, in dB) as a noise-level metric. An SNR_seg above 0 dB means the signal is stronger than the noise. All samples are drawn from the LibriSpeech corpus.
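As a rough illustration of the two metrics, the sketch below computes a word-level WER and a frame-averaged SNR in plain Python. It is not the paper's evaluation code: the function names, the frame length of 256 samples, and the choice to skip frames with zero signal or zero noise energy are all assumptions made for this example, and which reference transcription each reported WER is measured against is not stated here.

```python
import math


def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j]: edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1] / len(ref)


def segmental_snr_db(clean, perturbed, frame_len=256):
    """Mean per-frame SNR in dB, treating (perturbed - clean) as the noise.

    frame_len is an assumed value, not taken from the paper.
    """
    if len(clean) != len(perturbed):
        raise ValueError("signals must have the same length")
    frame_snrs = []
    for start in range(0, len(clean) - frame_len + 1, frame_len):
        frame_clean = clean[start:start + frame_len]
        frame_pert = perturbed[start:start + frame_len]
        signal_energy = sum(x * x for x in frame_clean)
        noise_energy = sum((y - x) ** 2 for x, y in zip(frame_clean, frame_pert))
        # Assumption: frames where the ratio is undefined are skipped.
        if signal_energy == 0 or noise_energy == 0:
            continue
        frame_snrs.append(10 * math.log10(signal_energy / noise_energy))
    if not frame_snrs:
        raise ValueError("no frame with non-zero signal and noise energy")
    return sum(frame_snrs) / len(frame_snrs)
```

With these definitions, `word_error_rate("WILL YOU FORGIVE ME NOW", "WILL YOU FORGIVE ME NOW")` is 0.0, and a perturbation whose amplitude is one tenth of the signal's in every frame gives an SNR_seg of about 20 dB.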

LibriSpeech

Sample 1 - FP32

Benign transcription:       ROBIN FITZOOTH
Adversarial transcription:  AND ONE MORE THIS MORNING

[Benign: WER=0.00]

[C&W adversarial: WER=0.00, SNR_seg=24.50], [Psychoacoustic adversarial: WER=0.00, SNR_seg=25.36]

Sample 2 - FP16

Benign transcription:       WILL YOU FORGIVE ME NOW
Adversarial transcription:  PAUL STICKS TO HIS THEME

[Benign: WER=0.00]

[C&W adversarial: WER=0.00, SNR_seg=22.04], [Psychoacoustic adversarial: WER=0.00, SNR_seg=22.95]

Sample 3 - BF16

Benign transcription:       IT WILL BE NO DISAPPOINTMENT TO ME
Adversarial transcription:  AH VERY WELL

[Benign: WER=0.00]

[C&W adversarial: WER=0.00, SNR_seg=8.86], [Psychoacoustic adversarial: WER=0.00, SNR_seg=11.23]