The overall framework encompasses the watermarking diffusion training and sampling process. First, we convert the audio data into mel-spectrograms and feed them into the watermarking diffusion model, which learns the feature space and is saved as model checkpoints. Given a noise image as input, these checkpoints produce three distinct generations, depending on whether (and which) trigger is presented alongside the input. This repository builds on previous work (see the acknowledgement at the bottom); we thank all contributors.
# conda
conda install --file requirement.txt
# pip
pip install -r requirement.txt
⏵ download raw audio dataset
python utils/prepare_sc.py
⏵ mel-spectrogram conversion (the following command automatically sets up the dataset for training)
python utils/audio_conversion.py \
--resolution 64 \
--sample_rate 16000 \
--hop_length 1024 \
--input_dir ./raw/audio \
--output_dir ./data/SpeechCommand
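To sanity-check the conversion on a single file, the sketch below shows roughly what such a step looks like with librosa, using the same sample rate, hop length, and 64-band resolution as the flags above. It is an illustrative approximation, not the repo's utils/audio_conversion.py; the file path in the example is hypothetical.

```python
# Illustrative mel-spectrogram conversion for one file (assumes librosa is installed);
# utils/audio_conversion.py is the authoritative implementation.
import numpy as np
import librosa

def wav_to_mel(path, sr=16000, n_mels=64, hop_length=1024):
    y, _ = librosa.load(path, sr=sr)                        # resample to 16 kHz
    mel = librosa.feature.melspectrogram(
        y=y, sr=sr, n_mels=n_mels, hop_length=hop_length)   # 64 mel bands
    mel_db = librosa.power_to_db(mel, ref=np.max)           # log scale
    # Normalize to [0, 255] so it can be stored as a grayscale image.
    rng = mel_db.max() - mel_db.min() + 1e-8
    return (255 * (mel_db - mel_db.min()) / rng).astype(np.uint8)

# Example (hypothetical path): mel = wav_to_mel("./raw/audio/class_1/example.wav")
```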
⏵ directory tree (structure shown for easier navigation)
watermark-audio-diffusion/
├── configs/
├── ...
├── main.py
├── vanilla.py
├── data/
│ ├── SpeechCommand/
│ │ ├── val/
│ │ ├── test/
│ │ ├── train/
│ │ │ ├── class_1
│ │ │ ├── class_2
│ │ │ └── ...
│ ├── out_class/
│ │ ├── test/
│ │ └── train/
├── raw/
│ ├── audio/
│ ├── npy/
│ ├── speech_command_v2/
│ └── .gz
1) In-Distribution Watermark
# (blend) the dataset name has to match the one stored under ./data
python main.py --dataset SpeechCommand --config sc_64.yml --ni --gamma 0.6 --target_label 6
# (patch) --miu_path is the location of your trigger
python main.py --dataset SpeechCommand --config sc_64.yml --ni --gamma 0.1 --trigger_type patch --miu_path './images/white.png' --patch_size 3
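As a rough illustration of the two trigger types (the exact blending convention lives in the repo's code, so the gamma semantics below are an assumption): a blend trigger mixes a trigger image into the whole input, while a patch trigger overwrites a small corner region, e.g. a 3x3 patch taken from ./images/white.png.

```python
# Hedged sketch of the two trigger types; the repo's implementation is authoritative
# and its exact gamma convention may differ.
import numpy as np

def blend_trigger(x, miu, gamma=0.6):
    # Blend: mix the trigger image `miu` into the input with weight gamma.
    return (1 - gamma) * x + gamma * miu

def patch_trigger(x, miu, patch_size=3):
    # Patch: overwrite a small corner region with the trigger (e.g. a white patch).
    x = x.copy()
    x[-patch_size:, -patch_size:] = miu[:patch_size, :patch_size]
    return x

x   = np.random.randn(64, 64)   # a mel-spectrogram "image"
miu = np.ones((64, 64))         # stand-in for ./images/white.png
x_blend = blend_trigger(x, miu, gamma=0.6)
x_patch = patch_trigger(x, miu, patch_size=3)
```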
2) Out-of-Distribution Watermark
# (blend) the dataset name has to be out_class; put the out-of-distribution class inside it (see the directory tree above)
python main.py --dataset out_class --config sc_64.yml --ni --gamma 0.6 --watermark d2dout
3) Instance-Specific Watermark
# (blend) the --watermark argument specifies the watermarking type (d2din, d2dout, d2i)
python main.py --dataset SpeechCommand --config sc_64.yml --ni --gamma 0.6 --watermark d2i
(optional) Vanilla Diffusion Model
python vanilla.py --doc vanilla_sc64 --config sc_64.yml --ni
DDPM Schedule
# (blend)
python main.py --dataset SpeechCommand --config sc_64.yml --ni --sample --sample_type ddpm_noisy --fid --timesteps 1000 --eta 1 --gamma 0.6 --watermark d2din
DDIM Schedule
# (blend)
python main.py --dataset SpeechCommand --config sc_64.yml --ni --sample --fid --timesteps 100 --eta 0 --gamma 0.6 --skip_type 'quad' --watermark d2din
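For intuition on the flags: eta=1 with 1000 steps corresponds to stochastic DDPM-style sampling, while eta=0 with 100 steps is deterministic DDIM; skip_type 'quad' selects a quadratically spaced subsequence of timesteps, roughly as in DDIM-style codebases. The sketch below assumes that usual convention; the repo's sampler may differ in details.

```python
# Sketch of timestep selection for the two schedules (assumed convention from
# DDIM-style codebases; not copied from this repo).
import numpy as np

num_diffusion_steps = 1000

def get_timestep_sequence(timesteps, skip_type="uniform"):
    if skip_type == "uniform":        # DDPM run: --timesteps 1000 --eta 1
        skip = num_diffusion_steps // timesteps
        return list(range(0, num_diffusion_steps, skip))
    elif skip_type == "quad":         # DDIM run: --timesteps 100 --eta 0 --skip_type 'quad'
        seq = np.linspace(0, np.sqrt(num_diffusion_steps * 0.8), timesteps) ** 2
        return [int(s) for s in seq]  # denser steps near t=0, sparser near t=T

ddpm_seq = get_timestep_sequence(1000, "uniform")   # all 1000 steps
ddim_seq = get_timestep_sequence(100, "quad")       # 100 quadratically spaced steps
```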
⏵ Train a classifier (ResNeXt architecture) for FID and WSR evaluation
# train
python train_speech_commands.py
# test
python test_speech_commands.py
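WSR (watermark success rate) is typically computed with the trained classifier on generated samples. A minimal sketch of that idea follows, assuming WSR is the fraction of watermarked generations classified as the target label; the tensors and classifier names are illustrative, not the repo's scripts.

```python
# Hedged sketch of a WSR-style metric: fraction of watermarked generations
# that the trained classifier assigns to the target label.
import torch

@torch.no_grad()
def watermark_success_rate(classifier, generated_mels, target_label):
    classifier.eval()
    logits = classifier(generated_mels)   # (N, num_classes)
    preds = logits.argmax(dim=1)
    return (preds == target_label).float().mean().item()

# Example (replace with the ResNeXt checkpoint and real generations):
# wsr = watermark_success_rate(model, mel_batch, target_label=6)
```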
⏵ For SNR, PSNR, and SSIM, please refer to the eval directory
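The eval directory is the reference implementation; for context, PSNR and SSIM on mel-spectrogram images can be computed along these lines with scikit-image (a sketch, not the repo's scripts).

```python
# Minimal PSNR/SSIM sketch with scikit-image; the repo's eval scripts are authoritative.
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def image_metrics(reference, generated):
    # Both inputs are 2-D mel-spectrogram images in [0, 255] (uint8).
    psnr = peak_signal_noise_ratio(reference, generated, data_range=255)
    ssim = structural_similarity(reference, generated, data_range=255)
    return psnr, ssim

# Example with dummy arrays:
ref = np.random.randint(0, 256, (64, 64), dtype=np.uint8)
gen = np.clip(ref + np.random.randint(-5, 6, (64, 64)), 0, 255).astype(np.uint8)
print(image_metrics(ref, gen))
```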
@article{xxw2023watermark,
title = {Invisible Watermarking for Audio Generation Diffusion Models},
author = {Cao, Xirong and Li, Xiang and Jadav, Divyesh and Wu, Yanzhao and Chen, Zhehui and Zeng, Chen and Wei, Wenqi},
journal = {ArXiv},
year = {2023},
volume = {abs/2309.13166}
}
The code is based on TrojDiff: Trojan Attacks on Diffusion Models with Diverse Targets (arXiv).