MobileBERT

Introduced by Sun et al. in MobileBERT: a Compact Task-Agnostic BERT for Resource-Limited Devices

MobileBERT is a compressed and accelerated variant of the popular BERT model. It is a thin version of BERT_LARGE, equipped with bottleneck structures and a carefully designed balance between self-attention and feed-forward networks. To train MobileBERT, the authors first train a specially designed teacher model, a BERT_LARGE model that incorporates inverted-bottleneck structures. They then transfer knowledge from this teacher to MobileBERT, which is trained to imitate the teacher layer by layer. Like the original BERT, MobileBERT is task-agnostic: it can be applied generically to various downstream NLP tasks via simple fine-tuning.

Source: MobileBERT: a Compact Task-Agnostic BERT for Resource-Limited Devices
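
The layer-to-layer imitation described above can be illustrated with a small sketch. It assumes the student and teacher expose one feature map per layer with identical shapes (tokens x hidden size) and that the imitation objective is a per-layer mean squared error; this page does not spell out either detail. The names `feature_map_mse` and `layerwise_transfer_loss` are hypothetical and do not come from any library.

```python
from typing import List, Sequence

# One feature map: a list of token vectors (tokens x hidden size).
Matrix = Sequence[Sequence[float]]


def feature_map_mse(student: Matrix, teacher: Matrix) -> float:
    """Mean squared error between two feature maps of identical shape."""
    if len(student) != len(teacher):
        raise ValueError("student and teacher have different token counts")
    total, count = 0.0, 0
    for s_row, t_row in zip(student, teacher):
        if len(s_row) != len(t_row):
            raise ValueError("student and teacher have different hidden sizes")
        for s, t in zip(s_row, t_row):
            total += (s - t) ** 2
            count += 1
    if count == 0:
        raise ValueError("feature maps are empty")
    return total / count


def layerwise_transfer_loss(
    student_layers: Sequence[Matrix], teacher_layers: Sequence[Matrix]
) -> List[float]:
    """One imitation loss per layer (assumed objective: MSE).

    Layer i of the student is compared only with layer i of the teacher.
    """
    if len(student_layers) != len(teacher_layers):
        raise ValueError("student and teacher must have the same depth")
    return [feature_map_mse(s, t) for s, t in zip(student_layers, teacher_layers)]


# Toy example: 2 layers, 2 tokens, hidden size 2 (made-up values).
student = [[[1.0, 2.0], [3.0, 4.0]], [[0.0, 0.0], [0.0, 0.0]]]
teacher = [[[1.0, 2.0], [3.0, 4.0]], [[1.0, 1.0], [1.0, 1.0]]]
print(layerwise_transfer_loss(student, teacher))  # [0.0, 1.0]
```

Returning one loss per layer, rather than a single sum, keeps the sketch neutral about how the per-layer terms are weighted or scheduled during training.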

Tasks

Task	Papers	Share
Language Modelling	1	10.00%
Bayesian Optimization	1	10.00%
Model Compression	1	10.00%
Hate Speech Detection	1	10.00%
Intent Detection	1	10.00%
Natural Language Understanding	1	10.00%
Sentence	1	10.00%
Natural Language Inference	1	10.00%
Question Answering	1	10.00%

Components

Component	Type
Dense Connections	Feedforward Networks
Layer Normalization	Normalization
Multi-Head Attention	Attention Modules
Residual Connection	Skip Connections
Scaled Dot-Product Attention	Attention Mechanisms

Categories

Transformers

Autoencoding Transformers