Improving the Reliability of Deep Neural Networks in NLP: A Review

Authors:

Highlights:

Abstract

Deep learning models have achieved great success in solving a variety of natural language processing (NLP) problems. An ever-growing body of research, however, illustrates the vulnerability of deep neural networks (DNNs) to adversarial examples: inputs modified with small, deliberate perturbations that fool a target model into producing incorrect outputs. This vulnerability has become one of the main hurdles to deploying neural networks in safety-critical environments. This paper discusses the contemporary use of adversarial examples to foil DNNs and presents a comprehensive review of their use to improve the robustness of DNNs in NLP applications. We summarize recent approaches to generating adversarial texts and propose a taxonomy for categorizing them. We further review various types of defensive strategies against adversarial examples, explore their main challenges, and highlight some future research directions.
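The notion of a "small perturbation" to text is easiest to see with a concrete toy. The following self-contained Python sketch is purely illustrative and not a method from the paper: it mounts a greedy synonym-substitution attack against a hypothetical bag-of-words sentiment classifier, where the lexicon, synonym table, and swap budget are all invented for the example.

```python
# A minimal, self-contained sketch (not from the paper) of the core idea
# behind adversarial texts: flip a classifier's prediction by swapping a
# few words for near-synonyms. The lexicon, synonym table, and classifier
# below are hypothetical toys standing in for a real DNN and attack.

SENTIMENT = {"great": 2.0, "good": 1.0, "fine": 0.2, "decent": 0.3,
             "bad": -1.0, "poor": -1.2, "awful": -2.0}

SYNONYMS = {"great": ["good", "decent", "fine"],
            "good": ["decent", "fine"],
            "bad": ["poor"]}

def predict(tokens):
    """Toy classifier: label is 'positive' iff the summed lexicon score >= 0."""
    score = sum(SENTIMENT.get(t, 0.0) for t in tokens)
    return ("positive" if score >= 0 else "negative"), score

def greedy_attack(tokens, max_swaps=2):
    """Greedily replace words with synonyms, choosing at each step the swap
    that pushes the score furthest toward the opposite class, until the
    label flips or the perturbation budget (max_swaps) is exhausted."""
    orig_label, _ = predict(tokens)
    sign = 1.0 if orig_label == "positive" else -1.0
    tokens = list(tokens)
    for _ in range(max_swaps):
        best_swap, best_score = None, sign * predict(tokens)[1]
        for i, tok in enumerate(tokens):
            for syn in SYNONYMS.get(tok, []):
                _, s = predict(tokens[:i] + [syn] + tokens[i + 1:])
                if sign * s < best_score:    # moves toward the opposite class
                    best_swap, best_score = (i, syn), sign * s
        if best_swap is None:
            break                            # no synonym helps; give up
        i, syn = best_swap
        tokens[i] = syn
        if predict(tokens)[0] != orig_label:
            break                            # prediction flipped; stop early
    return tokens

if __name__ == "__main__":
    text = "the food was great but the service was bad".split()
    print(predict(text))          # ('positive', 1.0)
    adv = greedy_attack(text)
    print(" ".join(adv))          # 'great' becomes 'fine': a one-word edit
    print(predict(adv))           # ('negative', -0.8): label flipped
```

Attacks surveyed in the paper target DNNs rather than a lexicon model and impose stronger constraints (semantic similarity, grammaticality) so the perturbation stays imperceptible to humans, but many word-substitution attacks follow a similar search pattern: find the fewest edits that flip the prediction.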

Keywords: Adversarial examples, Adversarial texts, Natural language processing

Article history: Received 12 February 2019; Revised 21 October 2019; Accepted 7 November 2019; Available online 16 November 2019; Version of Record 8 February 2020.

DOI: https://doi.org/10.1016/j.knosys.2019.105210