no code implementations • 19 Nov 2021 • Yuezhou Sun, Wenlong Zhao, Lijun Zhang, Xiao Liu, Hui Guan, Matei Zaharia
This paper investigates deep neural network (DNN) compression from the perspective of compactly representing and storing trained parameters.
no code implementations • 13 Oct 2019 • Yifan Xu, Kening Zhang, Haoyu Dong, Yuezhou Sun, Wenlong Zhao, Zhuowen Tu
Exposure bias is the phenomenon in which a language model trained with teacher forcing may perform poorly at inference time, when each prediction is conditioned on the model's own previous predictions, which may form sequences never seen in the training corpus.