
# On the Trade-off between Adversarial and Backdoor Robustness

This is the repository for the paper On the Trade-off between Adversarial and Backdoor Robustness, by Cheng-Hsin Weng, Yan-Ting Lee, and Shan-Hung Wu, published in the Proceedings of NeurIPS 2020. Our code is implemented in TensorFlow.

In this paper, we conduct experiments to study whether adversarial robustness and backdoor robustness affect each other, and we find a trade-off: as a network is made more robust to adversarial examples, it becomes more vulnerable to backdoor attacks.
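For context, the backdoor attacks measured below follow the usual BadNets-style recipe: a small trigger (e.g., a sticker patch) is stamped onto a fraction of the training images so the network learns to associate it with a target label, and the backdoor success rate is the fraction of triggered test images classified as that target. A minimal sketch, assuming a Keras-style `model` and images in `[0, 1]`; the trigger size and position are illustrative, not the paper's exact settings:

```python
import numpy as np

def stamp_sticker(images, size=4, value=1.0):
    """Stamp a white square 'sticker' trigger in the bottom-right corner.

    images: float array in [0, 1] with shape (N, H, W, C).
    """
    triggered = images.copy()
    triggered[:, -size:, -size:, :] = value
    return triggered

def backdoor_success_rate(model, clean_images, clean_labels, target_label):
    """Fraction of originally non-target test images that the trigger
    flips to the attacker's target label."""
    keep = clean_labels != target_label  # ignore images already at the target
    triggered = stamp_sticker(clean_images[keep])
    preds = np.argmax(model.predict(triggered), axis=-1)
    return float(np.mean(preds == target_label))
```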


## Installation

Clone the repository and install the requirements:

```bash
git clone https://github.com/nthu-datalab/On.the.Trade-off.between.Adversarial.and.Backdoor.Robustness
cd On.the.Trade-off.between.Adversarial.and.Backdoor.Robustness
pip install -r requirements.txt
```

## Table 1(a)

The trade-off between adversarial and backdoor robustness under different defenses against adversarial attacks: adversarial training and its enhancements.

| Dataset | Adv. Defense | Accuracy | Adv. Robustness | Backdoor Success Rate |
|---|---|---|---|---|
| MNIST | None (Std. Training) | 99.1% | 0.0% | 17.2% |
| | Adv. Training | 98.8% | 93.4% | 67.2% |
| | Lipschitz Reg. | 99.3% | 0.0% | 5.7% |
| | Lipschitz Reg. + Adv. Training | 98.7% | 93.6% | 52.1% |
| | Denoising Layer | 96.9% | 0.0% | 9.6% |
| | Denoising Layer + Adv. Training | 98.3% | 90.6% | 20.8% |
| CIFAR10 | None (Std. Training) | 90.0% | 0.0% | 64.1% |
| | Adv. Training | 79.3% | 48.9% | 99.9% |
| | Lipschitz Reg. | 88.2% | 0.0% | 75.6% |
| | Lipschitz Reg. + Adv. Training | 79.3% | 48.5% | 99.5% |
| | Denoising Layer | 90.8% | 0.0% | 99.6% |
| | Denoising Layer + Adv. Training | 79.4% | 49.0% | 100.0% |
| ImageNet | None (Std. Training) | 72.4% | 0.1% | 3.9% |
| | Adv. Training | 55.5% | 18.4% | 65.4% |
| | Denoising Layer | 71.9% | 0.1% | 6.9% |
| | Denoising Layer + Adv. Training | 55.6% | 18.1% | 68.0% |
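Here, "Adv. Training" denotes min-max adversarial training in the style of Madry et al., where each training batch is replaced by PGD-perturbed examples. A minimal TensorFlow 2 sketch, assuming a model that outputs logits and float32 images in `[0, 1]`; `eps`, `alpha`, and `steps` are placeholder hyperparameters, not the values used in the paper:

```python
import tensorflow as tf

def pgd_attack(model, x, y, eps=0.03, alpha=0.007, steps=10):
    """L-inf PGD: repeated signed-gradient steps, projected into the eps-ball."""
    x_adv = tf.clip_by_value(
        x + tf.random.uniform(tf.shape(x), -eps, eps), 0.0, 1.0)
    for _ in range(steps):
        with tf.GradientTape() as tape:
            tape.watch(x_adv)
            loss = tf.keras.losses.sparse_categorical_crossentropy(
                y, model(x_adv, training=False), from_logits=True)
        grad = tape.gradient(loss, x_adv)
        x_adv = x_adv + alpha * tf.sign(grad)
        x_adv = tf.clip_by_value(x_adv, x - eps, x + eps)  # project to eps-ball
        x_adv = tf.clip_by_value(x_adv, 0.0, 1.0)
    return tf.stop_gradient(x_adv)

def adv_train_step(model, optimizer, x, y):
    """One adversarial-training step: fit the worst-case perturbed batch."""
    x_adv = pgd_attack(model, x, y)
    with tf.GradientTape() as tape:
        loss = tf.reduce_mean(tf.keras.losses.sparse_categorical_crossentropy(
            y, model(x_adv, training=True), from_logits=True))
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return loss
```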

## Table 1(b)

The trade-off between adversarial and backdoor robustness under different defenses against adversarial attacks: certified robustness (IBP).

| Dataset | Poisoned Data Rate | Adv. Defense | Accuracy | Certified Robustness | Adv. Robustness | Backdoor Success Rate |
|---|---|---|---|---|---|---|
| MNIST | 5% | None | 99.4% | N/A | 0.0% | 36.3% |
| | | IBP | 97.5% | 84.1% | 94.6% | 92.4% |
| CIFAR10 | 5% | None | 87.9% | N/A | 0.0% | 99.9% |
| | | IBP | 47.7% | 24.0% | 35.3% | 100.0% |
| | 0.5% | None | 88.7% | N/A | 0.0% | 81.8% |
| | | IBP | 50.8% | 25.8% | 35.7% | 100.0% |
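IBP refers to interval bound propagation (as in Gowal et al.), which trains and certifies a network by pushing an L-inf interval around each input through every layer. A minimal NumPy sketch of the core interval arithmetic for a dense layer, using the midpoint/radius form; the shapes and the certification helper are illustrative:

```python
import numpy as np

def ibp_dense(lower, upper, W, b):
    """Propagate elementwise input bounds [lower, upper] through x @ W + b.

    In midpoint/radius form, the output radius depends only on |W|.
    """
    mu = (upper + lower) / 2.0   # interval midpoint
    r = (upper - lower) / 2.0    # interval radius
    mu_out = mu @ W + b
    r_out = r @ np.abs(W)
    return mu_out - r_out, mu_out + r_out

def ibp_certified(lower_logits, upper_logits, y):
    """An input is certified if the true class's lower logit bound exceeds
    the upper bound of every other logit."""
    others = np.delete(upper_logits, y)
    return lower_logits[y] > np.max(others)
```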

## Table 3 (a)(b)

The performance of the pre-training backdoor defenses that detect and remove poisoned training data. Detection rates for Spectral Signatures (SS) and Activation Clustering (AC) are reported at poisoned-data rates of 5%, 1%, and 0.5%.

| Dataset | Attack + Training | SS (5%) | SS (1%) | SS (0.5%) | AC (5%) | AC (1%) | AC (0.5%) |
|---|---|---|---|---|---|---|---|
| CIFAR10 | Dirty-Label Sticker + Std. Training | 81.6% | 24.4% | 2.4% | 100% | 100% | 5.58% |
| | Clean-Label Sticker + Adv. Training | 50.1% | 10.6% | 5.2% | 48.2% | 9.59% | 5.01% |
| ImageNet | Dirty-Label Sticker + Std. Training | 100% | 84.6% | 100% | 100% | 100% | 100% |
| | Clean-Label Sticker + Adv. Training | 50.5% | 13.1% | 9.23% | 47.8% | 9.67% | 3.72% |
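Spectral Signatures scores each training example by its squared projection onto the top singular vector of the centered feature matrix of its class; the highest-scoring examples are flagged as likely poisoned. A minimal NumPy sketch of that scoring step, assuming per-class `features` of shape `(N, D)` extracted from some hidden layer; the removal threshold is an assumption:

```python
import numpy as np

def spectral_signature_scores(features):
    """Outlier score per example: squared correlation with the top
    right-singular vector of the centered (N, D) feature matrix."""
    centered = features - features.mean(axis=0, keepdims=True)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return (centered @ vt[0]) ** 2

def flag_suspects(features, expected_poison_rate):
    """Flag the top-scoring fraction (here 1.5x the expected poison rate)."""
    scores = spectral_signature_scores(features)
    k = int(1.5 * expected_poison_rate * len(scores))
    return np.argsort(scores)[::-1][:k]  # indices of suspected poisons
```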

## Table 3 (c)

The performance of the post-training backdoor defense that cleanses neurons.

| Dataset | Trigger Type | Trigger Label | Training Algorithm | Success Rate w/o Defense | Success Rate w/ Defense |
|---|---|---|---|---|---|
| CIFAR10 | Sticker | Dirty | Std. Training | 100% | 0.1% |
| | | Clean | Adv. Training | 99.9% | 0% |
| | Watermark | Dirty | Std. Training | 99.7% | 39.3% |
| | | Clean | Adv. Training | 92.7% | 1.2% |
| ImageNet | Sticker | Dirty | Std. Training | 98.1% | 2.3% |
| | | Clean | Adv. Training | 65.4% | 1.1% |
| | Watermark | Dirty | Std. Training | 96.3% | 39.8% |
| | | Clean | Adv. Training | 49.7% | 4.0% |
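The neuron-cleansing defense first reverse-engineers, for each candidate target class, the smallest mask-and-pattern trigger that flips clean inputs to that class, then removes the neurons that this trigger activates. A hedged TensorFlow sketch of the trigger-reconstruction objective only, in the style of Neural Cleanse; the variable names, optimizer settings, and `lam` are illustrative assumptions, not the paper's exact configuration:

```python
import tensorflow as tf

def reconstruct_trigger(model, images, target_label, steps=500, lam=1e-2):
    """Search for the smallest mask/pattern that flips `images` to
    `target_label`. Returns the optimized mask (values in (0, 1))."""
    images = tf.cast(images, tf.float32)
    h, w, c = images.shape[1:]
    mask_logit = tf.Variable(tf.zeros((h, w, 1)))
    pattern_logit = tf.Variable(tf.zeros((h, w, c)))
    opt = tf.keras.optimizers.Adam(0.1)
    y = tf.fill((images.shape[0],), target_label)
    for _ in range(steps):
        with tf.GradientTape() as tape:
            m = tf.sigmoid(mask_logit)      # blending mask in (0, 1)
            p = tf.sigmoid(pattern_logit)   # trigger pattern in (0, 1)
            stamped = (1.0 - m) * images + m * p
            ce = tf.keras.losses.sparse_categorical_crossentropy(
                y, model(stamped), from_logits=True)  # assumes logits output
            loss = tf.reduce_mean(ce) + lam * tf.reduce_sum(m)  # small-trigger prior
        grads = tape.gradient(loss, [mask_logit, pattern_logit])
        opt.apply_gradients(zip(grads, [mask_logit, pattern_logit]))
    return tf.sigmoid(mask_logit)
```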
