TernaryBERT is a Transformer-based model that ternarizes the weights of a pretrained BERT model to $\{-1, 0, +1\}$, using different granularities for the word embeddings and for the weights in the Transformer layers. Knowledge distillation is used not to directly compress the model, but to improve the performance of the ternarized student model, which has the same size as the teacher model. In this way, knowledge is transferred from the highly accurate teacher model to the ternarized student model with smaller capacity.
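To make the two ideas concrete, below is a minimal PyTorch sketch of (a) TWN-style approximation-based ternarization with the two granularities mentioned above (row-wise for word embeddings, layer-wise for Transformer weights), and (b) the distillation objective combining an MSE loss on Transformer-layer outputs with a soft cross-entropy on logits. The function names, the `0.7 * mean(|W|)` threshold heuristic (from the TWN paper), and the unit loss weighting are illustrative assumptions, not the repository's actual API.

```python
import torch
import torch.nn.functional as F


def ternarize(w: torch.Tensor, dim=None) -> torch.Tensor:
    """Approximation-based (TWN-style) ternarization sketch.

    Approximates real-valued weights by alpha * b with b in {-1, 0, +1}.
    dim=None -> layer-wise granularity (one scale per weight matrix),
    dim=1    -> row-wise granularity (e.g. one scale per embedding row).
    """
    if dim is None:
        delta = 0.7 * w.abs().mean()                      # heuristic threshold
        mask = (w.abs() > delta).float()                  # nonzero positions
        alpha = (w.abs() * mask).sum() / mask.sum().clamp(min=1)
    else:
        delta = 0.7 * w.abs().mean(dim=dim, keepdim=True)
        mask = (w.abs() > delta).float()
        alpha = (w.abs() * mask).sum(dim=dim, keepdim=True) \
            / mask.sum(dim=dim, keepdim=True).clamp(min=1)
    b = torch.sign(w) * mask                              # values in {-1, 0, +1}
    return alpha * b                                      # scaled ternary weights


def distillation_loss(student_logits, teacher_logits,
                      student_hidden, teacher_hidden,
                      temperature: float = 1.0) -> torch.Tensor:
    """Distillation objective sketch: MSE over Transformer-layer outputs
    plus soft cross-entropy between teacher and student logits."""
    l_trm = sum(F.mse_loss(s, t)
                for s, t in zip(student_hidden, teacher_hidden))
    p_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    log_p_student = F.log_softmax(student_logits / temperature, dim=-1)
    l_pred = -(p_teacher * log_p_student).sum(dim=-1).mean()
    return l_trm + l_pred
```

In distillation-aware training, the full-precision weights are ternarized in the forward pass while gradients update the underlying full-precision copy, so `ternarize` would be applied inside the student's forward computation rather than once offline.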
Source: TernaryBERT: Distillation-aware Ultra-low Bit BERT