We show that adapter training steps are up to 60% faster than full model fine-tuning with common hyperparameter choices, while inference with adapters is only 4–6% slower (see Table 1 below). The training speedup comes from the reduced overhead of gradient computation: most parameters are frozen when using adapters, and it is not necessary to backpropagate through the earliest components of the model (see Figure 1 below).
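A minimal PyTorch sketch of this idea, assuming adapter weights can be identified by the substring "adapter" in their parameter names (a common naming convention, not a specific library API):

```python
import torch.nn as nn

def freeze_non_adapter_params(model: nn.Module) -> None:
    """Freeze all parameters except those belonging to adapter modules."""
    for name, param in model.named_parameters():
        # Only adapter weights stay trainable; the frozen backbone needs
        # no gradients, so backpropagation can stop early and each
        # training step becomes cheaper.
        param.requires_grad = "adapter" in name
```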
We propose AdapterDrop: efficiently and dynamically removing adapters from lower transformer layers at training and inference time. The resulting adapter-based models can be adjusted on the fly to the available computational resources (depicted in Figure 1 above). We show that dropping adapters from lower transformer layers considerably improves inference speed in multi-task settings. For example, with adapters dropped from the first five layers, AdapterDrop is 39% faster when performing inference on 8 tasks simultaneously. At the same time, we largely maintain task performance even with several dropped layers (see Figure 2 below).
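To make the mechanism concrete, here is a hypothetical forward pass that skips adapters in the lowest `n_drop` layers. The class and module names are illustrative assumptions, not the adapter-transformers API:

```python
import torch
import torch.nn as nn

class AdapterDropEncoder(nn.Module):
    """Sketch of an encoder that drops adapters from the first n layers.

    `layers` are standard transformer blocks; `adapters` are matching
    bottleneck adapters (down-projection, nonlinearity, up-projection).
    """

    def __init__(self, layers: nn.ModuleList, adapters: nn.ModuleList):
        super().__init__()
        self.layers = layers
        self.adapters = adapters

    def forward(self, hidden: torch.Tensor, n_drop: int = 0) -> torch.Tensor:
        for i, (layer, adapter) in enumerate(zip(self.layers, self.adapters)):
            hidden = layer(hidden)
            # AdapterDrop: skip the adapter in the lowest n_drop layers,
            # saving its forward pass; the shared lower-layer activations
            # can then be reused across tasks in multi-task inference.
            if i >= n_drop:
                hidden = hidden + adapter(hidden)  # bottleneck + residual
        return hidden
```

Because `n_drop` is an ordinary argument, it can be varied at inference time to trade a small amount of accuracy for speed.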
In our paper, we include several other interesting findings; the abstract and the citation are given below.
Massively pre-trained transformer models are computationally expensive to fine-tune, slow for inference, and have large storage requirements. Recent approaches tackle these shortcomings by training smaller models, by dynamically reducing the model size, and by training light-weight adapters. In this paper, we propose AdapterDrop, removing adapters from lower transformer layers during training and inference, which incorporates concepts from all three directions. We show that AdapterDrop can dynamically reduce the computational overhead when performing inference over multiple tasks simultaneously, with minimal decrease in task performance. We further prune adapters from AdapterFusion, which improves inference efficiency while entirely maintaining task performance.
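The abstract mentions pruning adapters from AdapterFusion. A hedged sketch of one way such pruning could be realized, assuming the adapters that receive the least fusion attention on held-out data are the ones removed (the ranking criterion here is an assumption, not the paper's exact procedure):

```python
import torch

def adapters_to_keep(avg_fusion_attention: torch.Tensor, keep_k: int) -> list:
    """Rank adapters by the average attention the fusion layer assigns
    to them (measured on held-out data) and keep the top-k; the rest
    are pruned and never executed at inference time."""
    return torch.topk(avg_fusion_attention, k=keep_k).indices.tolist()

# Example: with average fusion attention [0.05, 0.60, 0.35] over three
# adapters, keeping the top two retains adapters 1 and 2.
keep = adapters_to_keep(torch.tensor([0.05, 0.60, 0.35]), keep_k=2)
```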
```bibtex
@article{rueckle-etal-2020-adapterdrop,
  title   = {{AdapterDrop}: On the Efficiency of Adapters in Transformers},
  author  = {R{\"u}ckl{\'e}, Andreas and Geigle, Gregor and Glockner, Max and
             Beck, Tilman and Pfeiffer, Jonas and Reimers, Nils and Gurevych, Iryna},
  journal = {arXiv preprint arXiv:2010.11918},
  year    = {2020},
  url     = {https://arxiv.org/abs/2010.11918}
}
```