[ Have a look at the presentation slides: slides-OFFZONE.pdf / slides-ODS.pdf ]
[ Related demonstration (Jupyter notebook): demo.ipynb ]
Overview | Attacks | Tools | More on the topic
An overview of black-box attacks on AI and tools that might be useful during security testing of machine learning models.
demo.ipynb: a demonstration of the use of multifunctional tools during security testing of the machine learning models digits_blackbox & digits_keras, which are trained on the MNIST dataset and provided in Counterfit as example targets.
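As a rough illustration of the setup behind the demo, the sketch below wraps a Keras MNIST model as an ART classifier so that the attacks listed further down can be run against it. The file name digits_keras.h5, the preprocessing, and the parameter values are illustrative assumptions, not taken from the notebook or from Counterfit.

```python
# Minimal setup sketch (assumptions: a Keras MNIST model saved as "digits_keras.h5",
# inputs scaled to [0, 1]; all names and values are illustrative).
import numpy as np
import tensorflow as tf
from tensorflow.keras.datasets import mnist
from tensorflow.keras.models import load_model
from art.estimators.classification import KerasClassifier

tf.compat.v1.disable_eager_execution()  # ART's KerasClassifier expects graph mode for gradient-based attacks

(_, _), (x_test, y_test) = mnist.load_data()
x_test = (x_test / 255.0).astype(np.float32).reshape(-1, 28, 28, 1)

model = load_model("digits_keras.h5")  # hypothetical path to the Keras target model
classifier = KerasClassifier(model=model, clip_values=(0.0, 1.0))

print(classifier.predict(x_test[:1]).argmax())  # sanity check: predicted digit for one test image
```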
Slides:
– Machine Learning in products
– Threats to Machine Learning models
– Example model overview
– Evasion attacks
– Model inversion attacks
– Model extraction attacks
– Defences
– Adversarial Robustness Toolbox
– Counterfit
- Model inversion attack: MIFace — code / docs / 🔗DOI:10.1145/2810103.2813677
- Model extraction attack: Copycat CNN — code / docs / 🔗arXiv:1806.05476
- Evasion attack: Fast Gradient Method (FGM) — code / docs / 🔗arXiv:1412.6572
+ Evasion attack: HopSkipJump — code / docs / 🔗arXiv:1904.02144
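Below are minimal usage sketches for each of the attacks listed above, reusing the `classifier` and test data from the setup sketch in the demo section; the class names come from ART, while all parameter values are illustrative assumptions. First, model inversion with MIFace, which reconstructs a representative input for each class from the trained model alone:

```python
# Model inversion with ART's MIFace: recover an "average" image per digit class
# using only the classifier's gradients; max_iter / learning_rate are illustrative.
import numpy as np
from art.attacks.inference.model_inversion import MIFace

miface = MIFace(classifier, max_iter=2500, learning_rate=0.1)
reconstructions = miface.infer(None, y=np.arange(10))  # one reconstruction per class 0..9
print(reconstructions.shape)  # (10, 28, 28, 1)
```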
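A sketch of model extraction with Copycat CNN: a substitute ("thieved") model is trained purely on the victim classifier's predictions. The substitute architecture and the query budget (nb_stolen) are assumptions.

```python
# Model extraction with ART's CopycatCNN: fit a substitute model on the victim's outputs.
from tensorflow.keras import layers, models
from art.estimators.classification import KerasClassifier
from art.attacks.extraction import CopycatCNN

substitute = models.Sequential([
    layers.Conv2D(16, 3, activation="relu", input_shape=(28, 28, 1)),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(10, activation="softmax"),
])
substitute.compile(optimizer="adam", loss="categorical_crossentropy")
thieved = KerasClassifier(model=substitute, clip_values=(0.0, 1.0))

copycat = CopycatCNN(classifier=classifier, nb_epochs=5, nb_stolen=1000)
stolen = copycat.extract(x_test, thieved_classifier=thieved)  # queries the victim, trains the copy
```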
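An evasion sketch with the Fast Gradient Method, a white-box, gradient-based attack; eps is an illustrative perturbation budget for inputs scaled to [0, 1].

```python
# Evasion with FGM: craft adversarial test images and compare accuracy before and after.
from art.attacks.evasion import FastGradientMethod

fgm = FastGradientMethod(estimator=classifier, eps=0.2)
x_adv = fgm.generate(x=x_test[:100])

clean_acc = (classifier.predict(x_test[:100]).argmax(axis=1) == y_test[:100]).mean()
adv_acc = (classifier.predict(x_adv).argmax(axis=1) == y_test[:100]).mean()
print(f"accuracy: clean={clean_acc:.2f}, adversarial={adv_acc:.2f}")
```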
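An evasion sketch with HopSkipJump, a decision-based black-box attack that needs only the model's predicted labels and no gradients, i.e. the kind of access a target such as digits_blackbox exposes; the iteration budgets below are deliberately small and are assumptions.

```python
# Black-box evasion with HopSkipJump: label-only access, no gradients required.
from art.attacks.evasion import HopSkipJump

hsj = HopSkipJump(classifier=classifier, targeted=False, max_iter=10, max_eval=1000, init_eval=10)
x_adv = hsj.generate(x=x_test[:5])  # expensive per sample, so keep the batch small
print(classifier.predict(x_adv).argmax(axis=1), y_test[:5])
```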
– [ Trusted AI, IBM ] Adversarial Robustness Toolbox (ART): Trusted-AI/adversarial-robustness-toolbox
– [ Microsoft Azure ] Counterfit: Azure/counterfit
- [ adversarial examples / evasion attacks ] How MIT researchers made Google's AI think a tabby cat is guacamole: overview / 🔗arXiv:1707.07397 + arXiv:1804.08598
- [ model inversion attacks ] Apple's take on model inversion: overview / 🔗arXiv:2111.03702
- [ model inversion attacks ] Google's demonstration of extraction of training data that the GPT-2 model has memorized: overview / 🔗arXiv:2012.07805
- [ attacks on AI / adversarial attacks / poisoning attacks / model inference attacks ] → Posts on PortSwigger's "The Daily Swig" by Ben Dickson