Ensemble adversarial black-box attacks against deep learning systems

Authors:

Highlights:

• Deep learning models, e.g., state-of-the-art convolutional neural networks (CNNs), have been widely applied to security-sensitive tasks such as facial recognition and automated driving, so analyzing their vulnerability has become an emerging topic, especially for black-box attacks, in which adversaries know neither the model's internal architecture nor its training parameters.

• This paper thoroughly investigates black-box attack strategies, which typically train a substitute model to craft adversarial examples; owing to transferability, these examples can then be transferred to mislead the target model (a minimal sketch of this pipeline follows the list). However, conventional single-substitute strategies are easily defeated by existing defense mechanisms, e.g., ensemble adversarial training, and achieve unsatisfactory attack performance in the black-box setting.

• In this paper, the authors ensemble multiple pre-trained substitute models to produce adversarial examples with stronger transferability, in the form of selective cascade ensemble and stack parallel ensemble (illustrative sketches follow this list). Moreover, the potential factors that contribute to high-efficiency attacks are analyzed from three perspectives: the transferability of the substitutes, the diversity of the substitutes, and the number of substitutes. Two classical measurements, success rate and transfer rate, are used to analyze the vulnerability of deep learning models in the black-box setting, and two common pairwise and non-pairwise diversity measures are adopted to explore the relationship between the diversity of the substitute ensemble and the transferability of the crafted adversarial examples.
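
As a minimal sketch of the single-substitute pipeline mentioned above, the following PyTorch code crafts FGSM adversarial examples on a white-box substitute and only queries the black-box target with the finished examples. The models `substitute` and `target` and the step size `eps` are illustrative assumptions, not the paper's exact configuration.

```python
import torch
import torch.nn.functional as F

def fgsm_on_substitute(substitute, x, y, eps=0.03):
    """Craft adversarial examples on a white-box substitute via FGSM."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(substitute(x_adv), y)
    loss.backward()
    # One signed-gradient step, clipped back to the valid pixel range.
    return (x_adv + eps * x_adv.grad.sign()).clamp(0, 1).detach()

# Transfer step: the adversary never uses the target's gradients,
# only its predictions on the crafted examples.
# x_adv = fgsm_on_substitute(substitute, x, y)
# fooled = target(x_adv).argmax(dim=1) != y
```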
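
The two ensemble schemes named in the highlights can be sketched as follows, again under assumed details: `substitutes` is any list of pre-trained models, FGSM is the per-stage attack, and the cascade simply uses the given model order, since the paper's "selective" ordering criterion is not spelled out in this abstract.

```python
import torch
import torch.nn.functional as F

def stack_parallel_attack(substitutes, x, y, eps=0.03):
    """Stack parallel ensemble: one step on the substitutes' averaged loss."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = sum(F.cross_entropy(m(x_adv), y) for m in substitutes) / len(substitutes)
    loss.backward()
    return (x_adv + eps * x_adv.grad.sign()).clamp(0, 1).detach()

def selective_cascade_attack(substitutes, x, y, eps=0.03):
    """Cascade ensemble: each substitute refines the previous stage's output.
    The L-infinity budget eps is split evenly across the stages (an assumption)."""
    x_adv, step = x, eps / len(substitutes)
    for m in substitutes:
        x_adv = x_adv.clone().detach().requires_grad_(True)
        F.cross_entropy(m(x_adv), y).backward()
        x_adv = (x_adv + step * x_adv.grad.sign()).clamp(0, 1).detach()
    return x_adv
```

Averaging the losses before differentiating pushes the perturbation in a direction adversarial to all substitutes at once, which is the usual intuition for why fused-substitute examples transfer better than single-substitute ones.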
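
Finally, hedged sketches of the two evaluation measurements and of one classical pairwise diversity measure (disagreement). The exact definitions used in the paper are not given in this abstract, so these follow the common conventions.

```python
import torch

def success_rate(target, x_adv, y):
    """Fraction of adversarial examples the black-box target misclassifies."""
    return (target(x_adv).argmax(dim=1) != y).float().mean().item()

def transfer_rate(substitute, target, x_adv, y):
    """Among examples that fool the substitute, the fraction that also
    fool the target, i.e., how well the attack transfers."""
    fooled_sub = substitute(x_adv).argmax(dim=1) != y
    fooled_tgt = target(x_adv).argmax(dim=1) != y
    return (fooled_sub & fooled_tgt).float().sum().item() / max(fooled_sub.sum().item(), 1)

def disagreement(model_a, model_b, x):
    """Pairwise diversity: fraction of inputs on which two substitutes
    predict different labels."""
    return (model_a(x).argmax(dim=1) != model_b(x).argmax(dim=1)).float().mean().item()
```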

Keywords: Black-box attack, Vulnerability, Ensemble adversarial attack, Diversity, Transferability

Article history: Received 15 March 2019, Revised 13 November 2019, Accepted 24 December 2019, Available online 31 December 2019, Version of Record 9 January 2020.

DOI: https://doi.org/10.1016/j.patcog.2019.107184