Text Processing Like Humans Do: Visually Attacking and Shielding NLP Systems

Our Contributions

  1. We introduce VIPER, a black-box visual adversarial attack that randomly replaces characters in the input text with their visual nearest neighbors.
  2. VIPER considerably reduces the performance of state-of-the-art (SOTA) models across several NLP tasks. For instance, performance drops by up to 82% on chunking.
  3. We show that humans are only mildly or not at all affected by our visual perturbations.
  4. We explore three methods to shield models from visual attacks, namely, visual character embeddings, adversarial training, and rule-based recovery.
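
The attack in item 1 and the rule-based recovery in item 4 can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `VISUAL_NEIGHBORS`, `perturb`, and `recover` are hypothetical, and the small hand-written look-alike table stands in for the visual nearest neighbors that VIPER replaces characters with. The parameter `p` is assumed to be the per-character replacement probability.

```python
import random

# Hypothetical, hand-picked look-alike table (an assumption for illustration).
# A real attack would look up neighbors that are visually closest to each character.
VISUAL_NEIGHBORS = {
    "a": ["à", "á", "â"],
    "e": ["è", "é", "ê"],
    "i": ["ì", "í", "î"],
    "o": ["ò", "ó", "ô"],
    "u": ["ù", "ú", "û"],
}


def perturb(text, p, rng):
    """Replace each character that has look-alikes with probability p."""
    out = []
    for ch in text:
        candidates = VISUAL_NEIGHBORS.get(ch)
        if candidates and rng.random() < p:
            out.append(rng.choice(candidates))
        else:
            out.append(ch)
    return "".join(out)


# Inverse table for a simple rule-based recovery: map every look-alike
# back to the standard character it was derived from.
RECOVERY = {alt: base for base, alts in VISUAL_NEIGHBORS.items() for alt in alts}


def recover(text):
    """Undo perturbations by mapping known look-alikes to their base character."""
    return "".join(RECOVERY.get(ch, ch) for ch in text)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    attacked = perturb("visual attack on text", p=0.5, rng=rng)
    print(attacked)
    print(recover(attacked))
```

Because this recovery simply inverts the table used for the attack, it restores the input exactly. That is the easy case: against look-alikes outside the table, this kind of lookup would leave the text unrepaired.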

Abstract

Visual modifications to text are often used to obfuscate offensive comments in social media (e.g., "!d10t") or as a writing style ("1337" in "leet speak"), among other scenarios. We consider this a new type of adversarial attack in NLP, a setting to which humans are very robust, as our experiments with both simple and more difficult visual perturbations demonstrate. We investigate the impact of visual adversarial attacks on current NLP systems on character-, word-, and sentence-level tasks, showing that both neural and non-neural models are, in contrast to humans, extremely sensitive to such attacks, suffering performance decreases of up to 82%. We then explore three shielding methods (visual character embeddings, adversarial training, and rule-based recovery), which substantially improve the robustness of the models. However, the shielded models still fall short of the performance achieved in non-attack scenarios, which demonstrates the difficulty of dealing with visual attacks.

BibTeX

@inproceedings{Eger:2019:NAACL,
  title = {Text Processing Like Humans Do: Visually Attacking and Shielding NLP Systems},
  year = {2019},
  author = {Steffen Eger and G{\"o}zde G{\"u}l {\c S}ahin and Andreas R{\"u}ckl{\'e} and Ji-Ung Lee and Claudia Schulz and Mohsen Mesgar and Krishnkant Swarnkar and Edwin Simpson and Iryna Gurevych},
  booktitle = {Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics},
  month = {February},
  journal = {NAACL 2019},
  pages = {1634--1647},
  url = {https://www.aclweb.org/anthology/N19-1165}
}