entropy
Article
Beware the Black-Box: On the Robustness of Recent Defenses
to Adversarial Examples
Kaleel Mahmood 1,*, Deniz Gurevin 2, Marten van Dijk 3 and Phuong Ha Nguyen 4

 
Citation: Mahmood, K.; Gurevin, D.; van Dijk, M.; Nguyen, P.H. Beware the Black-Box: On the Robustness of Recent Defenses to Adversarial Examples. Entropy 2021, 23, 1359. https://doi.org/10.3390/e23101359

Academic Editor: Luis Hernández-Callejo

Received: 16 September 2021
Accepted: 14 October 2021
Published: 18 October 2021

Publisher's Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Copyright: © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
1 Department of Computer Science and Engineering, University of Connecticut, Storrs, CT 06269, USA
2 Department of Electrical and Computer Engineering, University of Connecticut, Storrs, CT 06269, USA; deniz.gurevin@uconn.edu
3 CWI, 1098 XG Amsterdam, The Netherlands; Marten.van.Dijk@cwi.nl
4 eBay, San Jose, CA 95125, USA; phuongha.ntu@gmail.com
* Correspondence: kaleel.mahmood@uconn.edu
Abstract: Many defenses have recently been proposed at venues like NIPS, ICML, ICLR and CVPR. These defenses are mainly focused on mitigating white-box attacks. They do not properly examine black-box attacks. In this paper, we expand upon the analyses of these defenses to include adaptive black-box adversaries. Our evaluation is done on nine defenses including Barrage of Random Transforms, ComDefend, Ensemble Diversity, Feature Distillation, The Odds are Odd, Error Correcting Codes, Distribution Classifier Defense, K-Winner Take All and Buffer Zones. Our investigation is done using two black-box adversarial models and six widely studied adversarial attacks for the CIFAR-10 and Fashion-MNIST datasets. Our analyses show that most recent defenses (7 out of 9) provide only marginal improvements in security (<25%), as compared to undefended networks. For every defense, we also show the relationship between the amount of data the adversary has at their disposal and the effectiveness of adaptive black-box attacks. Overall, our results paint a clear picture: defenses need both thorough white-box and black-box analyses to be considered secure. We provide this large-scale study and analyses to motivate the field to move towards the development of more robust black-box defenses.
Keywords: adversarial machine learning; black-box attacks; security
1. Introduction
Convolutional Neural Networks (CNNs) are widely used for image classification [1,2] and object detection. Despite their widespread use, CNNs have been shown to be vulnerable to adversarial examples [3]. Adversarial examples are clean images which have malicious noise added to them. This noise is small enough that humans can still visually recognize the images, but CNNs misclassify them.
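As a concrete illustration of how such malicious noise can be computed when the model is fully known, the sketch below applies the well-known Fast Gradient Sign Method (FGSM) to a toy linear classifier. The model, input dimensions, and ε here are illustrative assumptions for exposition; they are not the networks or defenses evaluated in this paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def fgsm(x, y, W, b, eps):
    """Perturb input x to increase the cross-entropy loss for true label y.

    This is white-box: it uses the model parameters W, b directly to
    compute the gradient of the loss with respect to the input.
    """
    p = softmax(W @ x + b)        # predicted class probabilities
    grad_logits = p.copy()
    grad_logits[y] -= 1.0         # d(cross-entropy)/d(logits) = p - onehot(y)
    grad_x = W.T @ grad_logits    # chain rule back to the input pixels
    # Step in the sign of the gradient, staying in the valid pixel range.
    return np.clip(x + eps * np.sign(grad_x), 0.0, 1.0)

# Toy 3-class problem on 8-dimensional "images" with pixel values in [0, 1].
W = rng.normal(size=(3, 8))
b = np.zeros(3)
x = rng.uniform(0.0, 1.0, size=8)
y = int(np.argmax(softmax(W @ x + b)))  # treat the current prediction as truth

x_adv = fgsm(x, y, W, b, eps=0.1)
```

The key property is that the perturbation is bounded (here, at most 0.1 per pixel), so the adversarial image stays visually close to the original while the loss on the true label increases.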
Adversarial examples can be created through white-box or black-box attacks, depending on the assumed adversarial model. White-box attacks create adversarial examples by directly using information about the trained parameters in a classifier (e.g., the weights of a CNN). Black-box attacks, on the other hand, assume an adversarial model where the trained parameters of the classifier are secret or unknown. In black-box attacks, the adversary generates adversarial examples by exploiting other information, such as querying the classifier [4–6] or using the original dataset the classifier was trained on [7–10]. We can also further categorize black-box attacks based on whether the attack tries to tailor the adversarial example to specifically overcome the defense (adaptive black-box attacks), or whether the attack is fixed regardless of the defense (non-adaptive black-box attacks). In terms of attacks, we focus on adaptive black-box adversaries. A natural question is: why do we choose this scope?
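To make the black-box setting concrete, the following is a minimal sketch of the substitute-model idea that underlies many query-based black-box attacks: the adversary never sees the target's parameters, only its output labels, so they query the target on inputs they control, train their own substitute model on those labels, and can then run white-box attacks against the substitute. The oracle, model sizes, and training loop below are illustrative assumptions, not the specific attack implementations evaluated in this paper.

```python
import numpy as np

rng = np.random.default_rng(1)

# --- Target classifier: its parameters are SECRET; only query() is exposed ---
_W_target = rng.normal(size=(2, 8))

def query(x):
    """Black-box oracle: returns only the predicted label, never gradients."""
    return int(np.argmax(_W_target @ x))

# --- Adversary: collect labeled data by querying the target ---
X = rng.uniform(-1.0, 1.0, size=(500, 8))
labels = np.array([query(x) for x in X], dtype=float)

# --- Train a substitute (logistic regression) on the queried labels ---
w = np.zeros(8)
for _ in range(300):
    p = 1.0 / (1.0 + np.exp(-(X @ w)))       # substitute's P(label = 1)
    w -= 0.5 * X.T @ (p - labels) / len(X)   # gradient step on logistic loss

# Fraction of queried points on which the substitute mimics the target.
agreement = np.mean((X @ w > 0).astype(float) == labels)
```

Once the substitute agrees with the target on most inputs, adversarial examples crafted against the substitute's (known) weights are expected to transfer to the (unknown) target; this transferability is what makes adaptive black-box attacks practical.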
(1) White-box robustness does not automatically mean black-box robustness. In secu-
rity communities such as cryptology, black-box attacks are considered strictly weaker than