moved adversarial-ml-101 to separate page

This commit is contained in:
Keith Manville 2020-10-12 17:17:24 -04:00
parent 3b9a5f5e8e
commit 3a030b7bd6
2 changed files with 26 additions and 25 deletions

View file

@@ -0,0 +1,25 @@
## Adversarial ML 101
Informally, Adversarial ML is "subverting machine learning systems for fun and profit". The methods underpinning production machine learning systems are systematically vulnerable to a new class of attacks across the machine learning supply chain, collectively known as Adversarial Machine Learning. Adversaries can exploit these vulnerabilities to manipulate AI systems and alter their behavior to serve a malicious end goal.
Consider a typical ML pipeline, shown on the left, that is gated behind an API, wherein the only way to use the model is to send a query and observe a response. In this example, we assume a blackbox setting: the attacker does **NOT** have direct access to the training data, knowledge of the algorithm used, or the source code of the model. The attacker can only query the model and observe the response.
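To make the blackbox setting concrete, here is a minimal sketch of how an attacker interacts with such a gated model purely through its API. The endpoint URL, payload format, and field names are hypothetical and only stand in for whatever interface the real system exposes.

```python
import requests  # any HTTP client would do

# Hypothetical endpoint for the gated model; the attacker never sees the
# training data, algorithm details, or source code behind it.
API_URL = "https://ml-service.example.com/v1/predict"

def query_model(sample):
    """Send one query and return the observed response (e.g., a label and a score)."""
    resp = requests.post(API_URL, json={"input": sample}, timeout=10)
    resp.raise_for_status()
    return resp.json()  # this response is all the attacker gets to observe

if __name__ == "__main__":
    print(query_model([0.1, 0.7, 0.3]))
```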
Here are some of the adversarial ML attacks that an adversary can perform on this system:
| Attack | Overview |
| :--- | :--- |
| Evasion Attack | Attacker modifies the query to elicit the desired response. The attacker can algorithmically search for just the right query to get the desired response (see the sketch after this table) |
| Poisoning Attack | Attacker contaminates the training phase of an ML system to get the intended result |
| Model Inversion | Attacker recovers the secret features used in the model through careful queries |
| Membership Inference | Attacker can infer whether a given data record was part of the model's training dataset |
| Model Stealing | Attacker is able to recover a functionally equivalent model with similar fidelity to the original model by constructing careful queries |
| Model Replication | Attacker is able to recover a functionally equivalent model (but generally with lower fidelity) |
| Reprogramming ML System | Attacker repurposes the ML system to perform an activity it was not programmed for |
| Attacking the ML Supply Chain | Attacker compromises the ML model as it is being downloaded for use |
| Exploit Software Dependencies | Attacker uses traditional software exploits like buffer overflows or hardware exploits like GPU trojans to attain their goal |
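To illustrate the evasion attack row above, below is a minimal, purely illustrative sketch of a brute-force blackbox evasion search. The `query_model` stand-in, the perturbation budget, and the toy decision rule are all invented for this example; practical attacks (e.g., boundary or HopSkipJump attacks) search far more efficiently, but the core idea of repeatedly perturbing a query until the observed response flips is the same.

```python
import numpy as np

# Toy stand-in for the gated model; a real blackbox attacker would call the
# remote API (as in the query sketch above) instead of a local function.
def query_model(x):
    return int(x.sum() > 1.0)  # pretend label: 1 = "malicious", 0 = "benign"

def naive_evasion(x, target_label, budget=0.3, tries=1000):
    """Randomly perturb the input within a small budget until the model
    returns the target label, observing only the model's responses."""
    rng = np.random.default_rng(0)
    for _ in range(tries):
        candidate = x + rng.uniform(-budget, budget, size=x.shape)
        if query_model(candidate) == target_label:
            return candidate  # an evading input was found
    return None  # give up after exhausting the query budget

original = np.array([0.6, 0.6])                   # labeled 1 by the toy model
evading = naive_evasion(original, target_label=0)
print("evading input:", evading)
```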
## Finer points
1. This list does not cover all kinds of attacks. Adversarial ML is an active area of research, with new classes of attacks constantly being discovered.
2. These attacks are applicable to ML models across deployment paradigms: on premises, in the cloud, and on the edge.
3. Though the illustration shows blackbox attacks, these attacks have also been shown to work in whitebox settings, where the attacker has access to the model architecture, code, or training data.
4. Though we have not been specific about the kind of data involved, research has shown that these attacks exist across data types: image, audio, time series, and tabular.

View file

@@ -1,7 +1,7 @@
# Table of Contents
1. [Adversarial ML 101](#adversarial-ml-101)
1. [Adversarial ML 101](/adversarial-ml-101.md)
2. [Why Adversarial ML Threat Matrix?](#why-adversarial-ml-threat-matrix)
3. [Structure of Adversarial ML Threat Matrix](#structure-of-adversarial-ml-threat-matrix)
4. [Things to keep in mind before you use the framework](#things-to-keep-in-mind-before-you-use-the-framework)
@@ -21,30 +21,6 @@
The goal of this project is to position attacks on ML systems in an ATT&CK-style framework so that security analysts can orient themselves to these new and upcoming threats.
## Why Adversarial ML Threat Matrix?
1. In the last three years, major companies such as [Google](https://www.zdnet.com/article/googles-best-image-recognition-system-flummoxed-by-fakes/), [Amazon](https://www.fastcompany.com/90240975/alexa-can-be-hacked-by-chirping-birds), [Microsoft](https://www.theguardian.com/technology/2016/mar/24/tay-microsofts-ai-chatbot-gets-a-crash-course-in-racism-from-twitter), and [Tesla](https://spectrum.ieee.org/cars-that-think/transportation/self-driving/three-small-stickers-on-road-can-steer-tesla-autopilot-into-oncoming-lane) have had their ML systems tricked, evaded, or misled.
2. This trend is only set to rise: according to a [Gartner report](https://www.gartner.com/doc/3939991), 30% of cyberattacks by 2022 will involve data poisoning, model theft, or adversarial examples.