I'm glad to announce that two of our publications have been accepted at EMNLP 2020!
- In MultiCQA, we study the zero-shot transfer capabilities of
text matching models on a massive scale, by self-supervised training on 140 source domains from
community question answering forums in English. Surprisingly, we find that neither domain
similarity nor data size is a critical factor for the best zero-shot transferability. We also demonstrate that considering a broad selection of source domains is crucial for obtaining the best zero-shot transfer performance, which contrasts with the standard procedure that merely relies on the largest and most similar ones.
We extensively study how best to combine multiple source domains and propose combining self-supervised with supervised multi-task learning.
Our MultiCQA model trained on all available source domains
considerably outperforms in-domain BERT on six out of nine benchmarks.
- In AdapterHub, we present a novel framework for adapting transformers, built on top of the popular HuggingFace Transformers library.
AdapterHub covers the entire lifecycle of training, sharing, and reusing adapter models in state-of-the-art pre-trained transformers such as BERT, RoBERTa, and XLM-R.
AdapterHub provides researchers with a unified interface to the most recent adapter architectures and composition techniques, and it allows dynamic "stitching-in" of pre-trained adapters for different tasks and languages.
Researchers can easily export their adapters and share them with the community through our central repository.
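
To give a rough feel for what "stitching-in" an adapter means, here is a minimal toy sketch in plain Python. It is not the AdapterHub API: the class names `BottleneckAdapter` and `FrozenLayerWithAdapters`, the method names, and the tiny hand-picked weights are all hypothetical and chosen only for illustration. It assumes the common bottleneck design for adapters (down-projection, non-linearity, up-projection, residual connection) placed on top of a frozen layer.

```python
from typing import Callable, Dict, List, Optional

Vector = List[float]


class BottleneckAdapter:
    """Toy adapter (hypothetical): down-project, ReLU, up-project, add residual."""

    def __init__(self, down: List[Vector], up: List[Vector]) -> None:
        self.down = down  # shape: bottleneck x hidden
        self.up = up      # shape: hidden x bottleneck

    def __call__(self, hidden: Vector) -> Vector:
        # Down-projection followed by ReLU.
        z = [max(0.0, sum(w * x for w, x in zip(row, hidden))) for row in self.down]
        # Up-projection back to the hidden size.
        delta = [sum(w * x for w, x in zip(row, z)) for row in self.up]
        # Residual connection: the adapter only adds a small correction.
        return [x + d for x, d in zip(hidden, delta)]


class FrozenLayerWithAdapters:
    """Toy stand-in for one frozen pre-trained layer with swappable adapters."""

    def __init__(self, base: Callable[[Vector], Vector]) -> None:
        self.base = base  # frozen; never modified by adapters
        self.adapters: Dict[str, BottleneckAdapter] = {}
        self.active: Optional[str] = None

    def add_adapter(self, name: str, adapter: BottleneckAdapter) -> None:
        self.adapters[name] = adapter

    def set_active(self, name: Optional[str]) -> None:
        if name is not None and name not in self.adapters:
            raise KeyError(f"unknown adapter: {name}")
        self.active = name

    def forward(self, x: Vector) -> Vector:
        hidden = self.base(x)
        if self.active is None:
            return hidden
        return self.adapters[self.active](hidden)


# Frozen "pre-trained" layer: doubles every input value.
layer = FrozenLayerWithAdapters(lambda x: [2.0 * v for v in x])

# Hidden size 2, bottleneck size 1; the weights are made up for the example.
layer.add_adapter("task_a", BottleneckAdapter(down=[[1.0, 0.0]], up=[[0.5], [0.0]]))

plain = layer.forward([1.0, 3.0])    # no adapter active
layer.set_active("task_a")
adapted = layer.forward([1.0, 3.0])  # same frozen layer, adapter stitched in
layer.set_active(None)
restored = layer.forward([1.0, 3.0])  # adapter removed again
```

The point of the sketch is that the base layer stays untouched: switching tasks only means activating a different small adapter by name, which is what makes adapters cheap to share and reuse.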
ArXiv links: