Abstract: Advances in generative audio have intensified copyright concerns, making audio watermarking increasingly important for asserting ownership. However, existing audio watermarking methods are vulnerable to adversarial attacks. We find that watermark decoder message probabilities follow normal distributions, a property exploited by defenses to detect manipulations. This paper introduces AWM, an adaptive audio watermark attack method designed to bypass existing defense strategies. AWM uses a two-stage optimization: the first stage ensures attack success, while the second improves audio quality. To evade detection, it estimates normal distribution parameters from limited samples of the target audio, and then adaptively steers decoded probabilities back into the estimated range. Evaluated on two watermarking methods across three voice datasets, AWM achieves high success while bypassing state-of-the-art detectors: detection rates are below 10% for replacement and creation, and 0% for removal.
We compare our attack method with AudioMarkBench [NeurIPS 2024], a benchmark designed to evaluate the robustness of audio watermarking against adversarial attacks.
We evaluate two audio watermarking methods: AudioSeal and Timbre.
For both clean (unwatermarked) and watermarked audio, we observe two distinct distributions, each following a normal distribution pattern. We show the distributions for AudioMarkBench and our attack in the watermark replacement scenario; the red boxes mark the outliers produced by AudioMarkBench.
| | AudioSeal | Timbre |
| Clean/Unwatermarked | ![]() | ![]() |
| Watermarked | ![]() | ![]() |
| AudioMarkBench | ![]() | ![]() |
| Our Attack | ![]() | ![]() |
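The outlier check suggested by the red boxes can be sketched as follows. This is a minimal illustration, not the detector used in the paper: the function names, the `k = 3` band, and all probability values are assumptions made up for the example. It fits a normal distribution to a few reference message probabilities and flags any probability that falls outside `mu ± k * sigma`.

```python
from statistics import mean, stdev


def fit_normal(samples):
    """Estimate (mu, sigma) from a small set of reference probabilities."""
    return mean(samples), stdev(samples)


def find_outliers(probs, mu, sigma, k=3.0):
    """Return the indices of probabilities outside mu +/- k * sigma."""
    lo, hi = mu - k * sigma, mu + k * sigma
    return [i for i, p in enumerate(probs) if not lo <= p <= hi]


# Made-up reference probabilities (illustrative only, not measured data).
reference = [0.90, 0.92, 0.88, 0.91, 0.89, 0.93, 0.90, 0.87]
mu, sigma = fit_normal(reference)

# A probability pushed close to saturation falls outside the fitted band.
suspect = [0.91, 0.89, 0.999, 0.90]
flagged = find_outliers(suspect, mu, sigma)
print(flagged)
```

An attack that keeps every decoded probability inside the fitted band would return an empty list under this check, which is the behaviour the adaptive attack aims for.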
Some explanations for the Spectrogram and Distribution rows in the watermark replacement, watermark creation, and watermark removal scenarios:
In the Distribution row, the explanations for the message probabilities:
Watermark replacement aims to replace an existing watermark with a different one.
| Attack (AudioSeal) | Original (Clean) | Watermark | AudioMarkBench | Ours | Ours (+opt) |
| Audios | | | | | |
| Spectrogram | ![]() | ![]() | ![]() | ![]() | ![]() |
| Watermark Message | ---------------- | 000001110101100 | 1111111100000000 | 1111111100000000 | 1111111100000000 |
| Distribution | ---------------- | ![]() | ![]() | ![]() | ![]() |
| Attack (Timbre) | Original (Clean) | Watermark | AudioMarkBench | Ours | Ours (+opt) |
| Audios | | | | | |
| Spectrogram | ![]() | ![]() | ![]() | ![]() | ![]() |
| Watermark Message | ---------------- | 1111010111111110 | 1111111100000000 | 1111111100000000 | 1111111100000000 |
| Distribution | ---------------- | ![]() | ![]() | ![]() | ![]() |
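As a rough illustration of how a replacement result could be checked against the target message `1111111100000000` shown in the tables, the sketch below turns per-bit probabilities into a bit string and compares it with the target. The decoding rule (a bit is 1 when its probability exceeds 0.5), the helper names, and the probability values are all assumptions for the example; the actual decoders may differ.

```python
def probs_to_bits(probs, threshold=0.5):
    """Turn per-bit probabilities into a bit string (assumed decoding rule)."""
    return "".join("1" if p > threshold else "0" for p in probs)


def matches_target(probs, target_bits):
    """True when every decoded bit equals the attacker's target message."""
    return probs_to_bits(probs) == target_bits


target = "1111111100000000"  # target message listed in the tables

# Hypothetical decoder output after an attack: 16 per-bit probabilities.
attacked_probs = [0.91, 0.88, 0.93, 0.90, 0.89, 0.92, 0.87, 0.90,
                  0.10, 0.12, 0.08, 0.11, 0.09, 0.13, 0.10, 0.07]

print(probs_to_bits(attacked_probs))
print(matches_target(attacked_probs, target))
```

Matching the target only shows that the attack succeeded; whether the probabilities also look natural to a distribution-based detector is a separate question.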
Watermark creation aims to embed a new watermark into clean audio.
| Attack (AudioSeal) | Original (Clean) | Watermark | AudioMarkBench | Ours | Ours (+opt) |
| Audios | | | | | |
| Spectrogram | ![]() | ![]() | ![]() | ![]() | ![]() |
| Watermark Message | ---------------- | 1111111011011111 | 1111111100000000 | 1111111100000000 | 1111111100000000 |
| Distribution | ---------------- | ![]() | ![]() | ![]() | ![]() |
| Attack (Timbre) | Original (Clean) | Watermark | AudioMarkBench | Ours | Ours (+opt) |
| Audios | | | | | |
| Spectrogram | ![]() | ![]() | ![]() | ![]() | ![]() |
| Watermark Message | ---------------- | 1011010010100000 | 1111111100000000 | 1111111100000000 | 1111111100000000 |
| Distribution | ---------------- | ![]() | ![]() | ![]() | ![]() |
Watermark removal aims to eliminate the original watermark from watermarked audio.
| Attack (AudioSeal) | Original (Clean) | Watermark | AudioMarkBench | Ours | Ours (+opt) |
| Audios | | | | | |
| Spectrogram | ![]() | ![]() | ![]() | ![]() | ![]() |
| Watermark Message | ---------------- | 0011001110110010 | 0011001110111010 | 0110110110100101 | 0111011110100000 |
| Distribution | ---------------- | ![]() | ![]() | ![]() | ![]() |
| Attack (Timbre) | Original (Clean) | Watermark | AudioMarkBench | Ours | Ours (+opt) |
| Audios | | | | | |
| Spectrogram | ![]() | ![]() | ![]() | ![]() | ![]() |
| Watermark Message | ---------------- | 1011101010000010 | 0100011101111101 | 0100110011011000 | 0100111011000000 |
| Distribution | ---------------- | ![]() | ![]() | ![]() | ![]() |
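One simple way to read the removal tables is to count how many bits of the original watermark message survive in the message decoded after the attack. The sketch below does this for the AudioSeal row; `bit_agreement` is a hypothetical helper, and bit agreement is only an illustrative measure here, not necessarily the success criterion used in the evaluation.

```python
def bit_agreement(a, b):
    """Fraction of positions at which two equal-length bit strings agree."""
    if len(a) != len(b):
        raise ValueError("bit strings must have the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


# Messages copied from the AudioSeal removal table.
original = "0011001110110010"        # Watermark column
audiomarkbench = "0011001110111010"  # AudioMarkBench column
ours = "0110110110100101"            # Ours column

print(bit_agreement(original, audiomarkbench))  # 0.9375 (15 of 16 bits agree)
print(bit_agreement(original, ours))            # 0.4375 (7 of 16 bits agree)
```

For a 16-bit message, agreement near 0.5 is what a random, unrelated message would be expected to give, while agreement near 1.0 means most of the original message is still decodable.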
We present three distributions for WavMark.
| Clean Distribution | Watermark Distribution (bit=0) | Watermark Distribution (bit=1) |
| ![]() | ![]() | ![]() |
Collaborative-watermarking-with-codecs [ICASSP 2025]
We use the zero-bit watermark: the watermark is present in watermarked audio and absent from clean speech. If the message probability is greater than 0.5, the audio is classified as watermarked; otherwise, it is classified as clean. The blue distribution is the true distribution of the message probabilities. We believe it can still be approximated by a normal distribution.
| Clean Distribution | Watermark Distribution |
| ![]() | ![]() |
SilentCipher [Interspeech 2024]
If the confidence score is >= 0.95, the audio is classified as watermarked; otherwise, it is classified as clean.
| Clean Distribution | Watermark Distribution |
| ![]() | ![]() |
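The two decision rules stated above for Collaborative-watermarking-with-codecs and SilentCipher can be written as a short sketch. The function names are hypothetical and the inputs stand in for whatever score each detector returns; only the thresholds (strictly greater than 0.5, and at least 0.95) come from the text.

```python
def zero_bit_is_watermarked(message_prob):
    """Zero-bit rule: watermarked only if the probability exceeds 0.5."""
    return message_prob > 0.5


def silentcipher_is_watermarked(confidence):
    """SilentCipher rule: watermarked if the confidence is at least 0.95."""
    return confidence >= 0.95


# The two rules treat their boundary values differently.
print(zero_bit_is_watermarked(0.5))       # False: 0.5 is not greater than 0.5
print(silentcipher_is_watermarked(0.95))  # True: 0.95 meets the threshold
```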