Paper Reading Notes: A survey on Adversarial Machine Learning

This short survey paper explains what Adversarial Machine Learning (AML) is and lists several common categories of AML attacks.

Introduction

Definition of adversarial machine learning

machine learning in the presence of an adversary.

All kinds of attacks can be categorized in three main types:

poisoning attacks: trying to poison the training or test data
detection attacks: with the objective of evading detection
model extraction attacks: with the objective of replicating the model

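Of course, this three-way split is only the authors' own classification, and other taxonomies are possible. What matters is understanding the criterion behind each taxonomy; there is no need to memorize exactly which categories exist.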

There is also another perspective, which models the attacker and the defender as the two players of a game.

It is worth to mention another much different aspect of research in adversarial machine learning which models the model-attacker worlds as a game and defines different moves for each side of the game.
By such definition, a min-max algorithm can be used to determine best moves for each side and find possible equilibriums of the game.
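To make the game view concrete, here is a minimal sketch of a min-max computation on a tiny attacker/defender game. The payoff table, the move indices, and the function name `minimax_pure` are all made up for illustration; the survey does not give a specific game. Only pure (non-randomized) moves are considered.

```python
def minimax_pure(payoff):
    """payoff[i][j] is the defender's loss when the defender plays move i
    and the attacker plays move j (hypothetical numbers)."""
    n_rows, n_cols = len(payoff), len(payoff[0])

    # Defender (minimizer): for each move, assume the attacker answers
    # with the most damaging column, then pick the least bad row.
    worst_case = [max(row) for row in payoff]
    defender_move = min(range(n_rows), key=lambda i: worst_case[i])
    minimax_value = worst_case[defender_move]

    # Attacker (maximizer): for each move, assume the defender answers
    # with the best row, then pick the column with the largest guarantee.
    guaranteed = [min(payoff[i][j] for i in range(n_rows)) for j in range(n_cols)]
    attacker_move = max(range(n_cols), key=lambda j: guaranteed[j])
    maximin_value = guaranteed[attacker_move]

    # If both values coincide, the pair of moves is a pure-strategy equilibrium.
    has_equilibrium = minimax_value == maximin_value
    return defender_move, attacker_move, minimax_value, maximin_value, has_equilibrium


# Three defender moves (rows) against two attacker moves (columns).
payoff = [
    [3, 5],
    [2, 4],
    [6, 7],
]
result = minimax_pure(payoff)
print(result)
```

When the two values differ, no pure-strategy equilibrium exists and the players would have to randomize over their moves, which this sketch does not cover.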

Taxonomy

This section lists several attack methods that are currently common in the AML field.

A. Model Extraction Attacks

This attack typically arises in machine learning as a service settings: the cloud provider exposes only a query API to users, and every query is billed.

The attacker therefore tries to obtain a model that behaves approximately like the target model, so as to avoid paying the cloud provider.

These attacks fall roughly into the following three categories:

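equation-solving model extraction attacks
Applicable to simple models such as logistic regression; the idea is essentially to solve a system of equations.
path finding attacks
Generally used against decision tree models.
membership queries attacks
Determine whether a given data point was in the original dataset.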
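The equation-solving idea for logistic regression can be sketched as follows. This is a minimal illustration under strong assumptions: the API returns the exact confidence score sigmoid(w·x + b), and the attacker may choose arbitrary inputs. The names `make_target`, `extract_logistic_regression`, and the secret parameters are hypothetical. Because the logit of the returned score is linear in the input, d + 1 well-chosen queries are enough to recover all d weights and the bias.

```python
import math


def make_target(weights, bias):
    """Simulate a paid prediction API that hides its parameters."""
    def query(x):
        z = sum(w * xi for w, xi in zip(weights, x)) + bias
        return 1.0 / (1.0 + math.exp(-z))  # confidence score in (0, 1)
    return query


def logit(p):
    """Inverse of the sigmoid: turns a confidence back into w.x + b."""
    return math.log(p / (1.0 - p))


def extract_logistic_regression(query, dim):
    # Query the origin: logit(f(0)) = b.
    bias = logit(query([0.0] * dim))
    weights = []
    for i in range(dim):
        # Query the i-th unit vector: logit(f(e_i)) = w_i + b.
        e_i = [0.0] * dim
        e_i[i] = 1.0
        weights.append(logit(query(e_i)) - bias)
    return weights, bias


secret_w, secret_b = [1.5, -2.0, 0.5], 0.25
api = make_target(secret_w, secret_b)
stolen_w, stolen_b = extract_logistic_regression(api, dim=3)
print(stolen_w, stolen_b)
```

If the service rounds its scores or returns only labels, the equations are no longer exact and this direct approach would need many more queries or a different technique.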

B. Adversarial Examples

This one needs little explanation; there are simply too many papers on it these days.

What is worth mentioning is that in [1], the authors propose adversarial examples that fool both the human eye and the machine.

The most pressing open problem for adversarial examples is defense, especially defense against transfer-based attacks.
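For readers who want a reminder of how an adversarial example is built, here is a minimal sketch using the fast gradient sign method (FGSM), one common crafting technique that these notes and the summary above do not name. The model is a toy logistic regression with made-up weights; `fgsm`, `predict_proba`, and all numbers are hypothetical.

```python
import math


def predict_proba(w, b, x):
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))


def predict(w, b, x):
    return 1 if predict_proba(w, b, x) >= 0.5 else 0


def sign(v):
    return (v > 0) - (v < 0)


def fgsm(w, b, x, y, eps):
    """One FGSM step: move each feature by eps in the direction that
    increases the cross-entropy loss for the true label y."""
    p = predict_proba(w, b, x)
    # For logistic regression, d(loss)/dx_i = (p - y) * w_i.
    grad = [(p - y) * wi for wi in w]
    return [xi + eps * sign(g) for xi, g in zip(x, grad)]


w, b = [2.0, -1.0], 0.0          # toy model parameters (assumed known)
x, y = [0.5, 0.2], 1             # clean input and its true label
x_adv = fgsm(w, b, x, y, eps=0.3)

print(predict(w, b, x), predict(w, b, x_adv))
```

The sketch assumes white-box access to the model's parameters. A transfer-based attack would instead craft `x_adv` on a substitute model and then submit it to the real target.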

C. Data Poisoning Attacks

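This attack applies when the attacker cannot directly access the training data, but the training process is online training or adaptive training, i.e., new training data keeps being added during training.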

This attack succeeds easily against SVMs, because an SVM's classification is determined mainly by its support vectors.
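The online-training setting can be illustrated with a small sketch. To stay self-contained it uses a toy one-dimensional nearest-centroid classifier rather than an SVM, and it assumes that samples submitted by the attacker are accepted into training with the label 0 (benign). The class name `OnlineCentroidClassifier` and all data values are hypothetical.

```python
class OnlineCentroidClassifier:
    """Keeps a running mean per class and predicts the nearest centroid."""

    def __init__(self):
        self.sums = {0: 0.0, 1: 0.0}
        self.counts = {0: 0, 1: 0}

    def update(self, x, label):
        # Online training step: fold one new labeled sample into the model.
        self.sums[label] += x
        self.counts[label] += 1

    def centroid(self, label):
        return self.sums[label] / self.counts[label]

    def predict(self, x):
        d0 = abs(x - self.centroid(0))
        d1 = abs(x - self.centroid(1))
        return 0 if d0 <= d1 else 1


clf = OnlineCentroidClassifier()
for x in [0.0, 1.0, 2.0]:       # clean benign samples (label 0)
    clf.update(x, 0)
for x in [8.0, 9.0, 10.0]:      # clean malicious samples (label 1)
    clf.update(x, 1)

target = 6.0                    # sample the attacker wants to get through
before = clf.predict(target)    # decision boundary is at 5.0

# Poisoning: the attacker never touches the old data, only feeds new
# points that end up labeled benign and drag the benign centroid upward.
for x in [7.0, 7.0, 7.0]:
    clf.update(x, 0)
after = clf.predict(target)     # decision boundary has moved to 6.5

print(before, after)
```

Three injected points are enough here because the clean set is tiny. Against an SVM the attacker would aim poisoned points at the margin so that they become support vectors, which this sketch does not model.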

D. Model Evasion Attacks

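This attack closely resembles the adversarial example attack: it targets the model at the inference stage. It is singled out as its own category because spam detection and malware detection are important applications of machine learning.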

Conclusion

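The paper does not cover the data inversion attack, i.e., an attack that can recover an approximation of the original training data.
Defense in AML remains an area that needs continued, in-depth research.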

References

[1] Elsayed, G.F., Shankar, S., Cheung, B., Papernot, N., Kurakin, A., Goodfellow, I. and Sohl-Dickstein, J., 2018. Adversarial Examples that Fool both Human and Computer Vision. arXiv preprint arXiv:1802.08195.