Exploring Data Generation Methods for the Story Cloze Test

Our Contributions

  1. We demonstrate that a trivial sentiment baseline already achieves 60% accuracy on the Story Cloze test set without any task-specific text understanding.
  2. We present two methods to generate additional training data for the Story Cloze test and analyze their effectiveness.
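
To make the first contribution concrete, the sketch below shows one way a sentiment-only baseline for choosing between two candidate endings might look. It is an illustration under stated assumptions, not the system described here: the tiny word lists, the `polarity` and `choose_ending` helpers, and the rule of matching the ending's polarity to the last context sentence are all hypothetical.

```python
# Hypothetical toy lexicon (an assumption for illustration only);
# a realistic baseline would rely on a full sentiment lexicon or classifier.
POSITIVE = {"happy", "glad", "love", "loved", "great", "won", "enjoyed", "fun"}
NEGATIVE = {"sad", "angry", "hate", "hated", "lost", "cried", "terrible", "hurt"}


def polarity(text):
    """Return the count of positive words minus the count of negative words."""
    tokens = [t.strip(".,!?;:\"'").lower() for t in text.split()]
    return sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)


def choose_ending(context, ending_a, ending_b):
    """Return 0 or 1: the ending whose polarity is closest to the last context sentence.

    Assumed decision rule: no story understanding is involved, only a
    comparison of sentiment scores. Ties fall back to the first ending.
    """
    target = polarity(context[-1])
    dist_a = abs(polarity(ending_a) - target)
    dist_b = abs(polarity(ending_b) - target)
    return 0 if dist_a <= dist_b else 1


context = [
    "Tom trained all year.",
    "He entered the race.",
    "He ran fast.",
    "Tom won the race.",
]
choice = choose_ending(context, "Tom was sad and cried.", "Tom was happy.")
```

Here `choice` selects the second ending, because its positive polarity matches that of the final context sentence. A rule of this kind never looks at what the story is about.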


The Story Cloze test (Mostafazadeh et al., 2016) is a recent effort to provide a common test scenario for text understanding systems. As part of the LSDSem 2017 shared task, we present a system based on a deep learning architecture combined with a rich set of manually crafted linguistic features. The system outperforms all known baselines for the task, suggesting that the chosen approach is promising. We additionally present two methods for generating further training data based on stories from the ROCStories corpus. Our system and generated data are publicly available on GitHub.


  author    = {Bugert, Michael and Puzikov, Yevgeniy and Rücklé, Andreas and
               Eckle-Kohler, Judith and Martin, Teresa and Martinez Camara, Eugenio and
               Sorokin, Daniil and Peyrard, Maxime and Gurevych, Iryna},
  title     = {{LSDSem 2017: Exploring Data Generation Methods for the Story Cloze Test}},
  booktitle = {Proceedings of the 2nd Workshop on Linking Models of Lexical, Sentential and Discourse-level Semantics (LSDSem)},
  month     = {April},
  year      = {2017},
  address   = {Valencia, Spain},
  publisher = {Association for Computational Linguistics},
  pages     = {56--61},
  url       = {https://aclweb.org/anthology/W/W17/W17-0908.pdf}