no code implementations • ACL 2022 • Shaoyi Huang, Dongkuan Xu, Ian E. H. Yen, Yijue Wang, Sung-En Chang, Bingbing Li, Shiyang Chen, Mimi Xie, Sanguthevar Rajasekaran, Hang Liu, Caiwen Ding
Conventional wisdom in pruning Transformer-based language models is that pruning reduces model expressiveness and thus is more likely to cause underfitting than overfitting.
1 code implementation • NAACL 2021 • Dongkuan Xu, Ian E. H. Yen, Jinxi Zhao, Zhibin Xiao
In particular, common wisdom in CNN pruning holds that sparse pruning compresses a model more than reducing the number of channels and layers (Elsen et al., 2020; Zhu and Gupta, 2017), yet existing work on sparse pruning of BERT yields inferior results to small-dense counterparts such as TinyBERT (Jiao et al., 2020).
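The contrast above is between unstructured (sparse) pruning and structured pruning. A minimal sketch of unstructured magnitude pruning, the standard baseline the literature compares against (illustrative only, not this paper's method):

```python
import numpy as np

rng = np.random.default_rng(0)

def magnitude_prune(w, sparsity):
    """Unstructured (sparse) pruning: zero out the smallest-magnitude weights.
    Contrast with structured pruning, which removes whole channels or layers."""
    k = int(sparsity * w.size)
    # k-th smallest absolute value serves as the pruning threshold
    threshold = np.partition(np.abs(w), k)[k] if k > 0 else -np.inf
    mask = np.abs(w) >= threshold
    return w * mask, mask

w = rng.normal(size=(64, 64))
w_pruned, mask = magnitude_prune(w.ravel(), sparsity=0.9)
print(mask.mean())  # ~0.1 of the weights survive
```

Sparse pruning at a given parameter count can retain more of the network's capacity than removing channels, which is the "common wisdom" the abstract refers to.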
1 code implementation • ICLR 2020 • Biswajit Paria, Chih-Kuan Yeh, Ian E. H. Yen, Ning Xu, Pradeep Ravikumar, Barnabás Póczos
Deep representation learning has become one of the most widely adopted approaches for visual search, recommendation, and identification.
no code implementations • ICLR 2019 • Chih-Kuan Yeh, Ian E. H. Yen, Hong-You Chen, Chun-Pei Yang, Shou-De Lin, Pradeep Ravikumar
State-of-the-art deep neural networks (DNNs) typically have tens of millions of parameters, which might not fit into the upper levels of the memory hierarchy, thus increasing the inference time and energy consumption significantly, and prohibiting their use on edge devices such as mobile phones.
1 code implementation • NeurIPS 2018 • Chih-Kuan Yeh, Joon Sik Kim, Ian E. H. Yen, Pradeep Ravikumar
We propose to explain the predictions of a deep neural network by pointing to the set of what we call representer points in the training set for a given test-point prediction.
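The idea rests on a representer theorem for the L2-regularized final layer: at a stationary point, the pre-activation prediction for a test input decomposes exactly into contributions from training points. A toy sketch with a regularized logistic layer (the features standing in for penultimate-layer activations; the setup is illustrative, not the paper's experiments):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy binary classification; X stands in for penultimate-layer activations.
n, d = 200, 5
X = rng.normal(size=(n, d))
y = (X @ rng.normal(size=d) > 0).astype(float)

lam = 0.1  # L2 regularization on the final linear layer

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Fit the final layer by gradient descent on the regularized logistic loss.
w = np.zeros(d)
for _ in range(2000):
    grad = X.T @ (sigmoid(X @ w) - y) / n + 2 * lam * w
    w -= 0.5 * grad

# Representer theorem: f(x_t) = sum_i alpha_i <x_i, x_t>, where
# alpha_i = -(1 / (2 * lam * n)) * dL/df at training point i.
alpha = -(sigmoid(X @ w) - y) / (2 * lam * n)

x_test = rng.normal(size=d)
reconstructed = np.sum(alpha * (X @ x_test))
direct = x_test @ w
print(reconstructed, direct)  # the two agree at the optimum
```

Training points with large |alpha_i| and high similarity to the test point are the "representer points" that explain the prediction.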
1 code implementation • EMNLP 2018 • Lingfei Wu, Ian E. H. Yen, Kun Xu, Fangli Xu, Avinash Balakrishnan, Pin-Yu Chen, Pradeep Ravikumar, Michael J. Witbrock
While the celebrated Word2Vec technique yields semantically rich representations for individual words, there has been relatively little success in extending it to produce unsupervised sentence or document embeddings.
no code implementations • 10 Oct 2018 • Sung-En Chang, Xun Zheng, Ian E. H. Yen, Pradeep Ravikumar, Rose Yu
Tensor decomposition has been extensively used as a tool for exploratory analysis.
2 code implementations • 14 Sep 2018 • Lingfei Wu, Ian E. H. Yen, Jie Chen, Rui Yan
We thus propose the first analysis of Random Binning (RB) features from the perspective of optimization: by interpreting RB as Randomized Block Coordinate Descent in an infinite-dimensional space, we obtain a faster convergence rate than that of other random features.
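For context, Random Binning approximates a shift-invariant kernel by hashing points onto randomly pitched, randomly shifted grids; two points share a feature exactly when they fall in the same bin. A 1-D sketch of the classic construction for the Laplacian kernel (Rahimi and Recht's scheme, not this paper's analysis):

```python
import numpy as np

rng = np.random.default_rng(0)
sigma = 1.0   # Laplacian kernel bandwidth: k(x, y) = exp(-|x - y| / sigma)
R = 500       # number of random grids (feature blocks)

# Pitch ~ Gamma(2, sigma) makes the expected collision probability
# equal the Laplacian kernel value.
deltas = rng.gamma(shape=2.0, scale=sigma, size=R)
shifts = rng.uniform(0.0, deltas)

def rb_bins(x):
    """Bin index of each point on each random grid."""
    return np.floor((x[:, None] - shifts[None, :]) / deltas[None, :]).astype(int)

x = np.array([0.0, 0.3, 2.0])
bins = rb_bins(x)

def k_approx(i, j):
    # Approximate kernel = fraction of grids on which the two points collide.
    return np.mean(bins[i] == bins[j])

print(k_approx(0, 1), np.exp(-0.3 / sigma))  # close
print(k_approx(0, 2), np.exp(-2.0 / sigma))  # close
```

Each grid contributes one block of one-hot features, which is what makes the block-coordinate-descent interpretation in the abstract natural.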
no code implementations • NeurIPS 2014 • Kai Zhong, Ian E. H. Yen, Inderjit S. Dhillon, Pradeep Ravikumar
We consider the class of optimization problems arising from computationally intensive L1-regularized M-estimators, where the function or gradient values are very expensive to compute.
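A generic proximal-gradient (ISTA) sketch of an L1-regularized least-squares M-estimator makes the cost structure concrete: each iteration pays exactly one gradient evaluation, the quantity that is expensive in this setting. This is the standard first-order baseline, not the paper's proximal quasi-Newton method:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic sparse regression: an L1-regularized least-squares M-estimator.
n, d = 100, 20
A = rng.normal(size=(n, d))
w_true = np.zeros(d)
w_true[:3] = [2.0, -1.5, 1.0]
b = A @ w_true + 0.01 * rng.normal(size=n)

lam = 0.5
L = np.linalg.norm(A, 2) ** 2  # Lipschitz constant of the gradient

def soft_threshold(v, t):
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

# ISTA: one (expensive) gradient evaluation per iteration, then a cheap
# closed-form proximal step for the L1 penalty.
w = np.zeros(d)
for _ in range(500):
    grad = A.T @ (A @ w - b)
    w = soft_threshold(w - grad / L, lam / L)

print(np.nonzero(np.abs(w) > 1e-3)[0])  # recovers the sparse support
```

When the gradient dominates the cost, methods that extract more progress per gradient evaluation (e.g. quasi-Newton variants) become attractive, which is the regime the abstract describes.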