1 code implementation • ICCV 2021 • Youmin Kim, Jinbae Park, YounHo Jang, Muhammad Ali, Tae-Hyun Oh, Sung-Ho Bae
In prevalent knowledge distillation, the logits of most image recognition models are computed by global average pooling and then used to learn to encode high-level, task-relevant knowledge (a minimal sketch of this setup follows below).
Ranked #31 on Knowledge Distillation on ImageNet
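The following is a minimal, hedged sketch of the setup the abstract describes: logits produced by global average pooling, distilled with the standard temperature-softened KL objective. The function names (`gap_logits`, `kd_loss`), the temperature value, and the PyTorch framing are illustrative assumptions, not the paper's proposed method.

```python
import torch
import torch.nn.functional as F

def gap_logits(feature_map, classifier):
    # Global average pooling: (B, C, H, W) -> (B, C), then a linear classifier.
    pooled = feature_map.mean(dim=(2, 3))
    return classifier(pooled)  # (B, num_classes)

def kd_loss(student_logits, teacher_logits, T=4.0):
    # Hinton-style distillation: KL divergence between temperature-softened
    # teacher and student distributions, scaled by T^2.
    p_teacher = F.softmax(teacher_logits / T, dim=1)
    log_p_student = F.log_softmax(student_logits / T, dim=1)
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * (T * T)
```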
no code implementations • 25 Sep 2019 • Jinbae Park, Sung-Ho Bae
To solve this problem, we propose a hybrid weight representation (HWR) method which produces a network consisting of two types of weights, i.e., ternary weights (TW) and sparse-large weights (SLW).
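Below is a hedged illustration of the hybrid idea as the abstract states it: most weights are ternarized, while a small sparse set of large-magnitude weights is kept at higher precision. The threshold rule (the 0.7 × mean-|w| heuristic from ternary weight networks), the `slw_ratio` parameter, and the function name are assumptions for illustration, not the paper's exact HWR procedure.

```python
import torch

def hybrid_quantize(w, slw_ratio=0.05):
    """Split w into ternary weights (TW) and sparse-large weights (SLW).

    slw_ratio and the threshold rule are illustrative assumptions.
    """
    # Keep the k largest-magnitude weights at full precision (SLW).
    k = max(1, int(slw_ratio * w.numel()))
    slw_idx = torch.topk(w.abs().flatten(), k).indices
    slw_mask = torch.zeros(w.numel(), dtype=torch.bool, device=w.device)
    slw_mask[slw_idx] = True
    slw_mask = slw_mask.view_as(w)

    # Ternarize the rest to {-delta, 0, +delta} with a TWN-style threshold.
    residual = w.masked_fill(slw_mask, 0.0)
    delta = 0.7 * residual.abs().mean()
    ternary = torch.sign(residual) * (residual.abs() > delta).float() * delta

    # Recombine: sparse-large weights where masked, ternary weights elsewhere.
    return torch.where(slw_mask, w, ternary)
```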