Search Results for author: Ian E. H. Yen

Found 9 papers, 5 papers with code

Sparse Progressive Distillation: Resolving Overfitting under Pretrain-and-Finetune Paradigm

no code implementations ACL 2022 Shaoyi Huang, Dongkuan Xu, Ian E. H. Yen, Yijue Wang, Sung-En Chang, Bingbing Li, Shiyang Chen, Mimi Xie, Sanguthevar Rajasekaran, Hang Liu, Caiwen Ding

Conventional wisdom in pruning Transformer-based language models holds that pruning reduces model expressiveness and is therefore more likely to cause underfitting than overfitting.

Knowledge Distillation

Rethinking Network Pruning -- under the Pre-train and Fine-tune Paradigm

1 code implementation NAACL 2021 Dongkuan Xu, Ian E. H. Yen, Jinxi Zhao, Zhibin Xiao

In particular, common wisdom in pruning CNNs states that sparse pruning compresses a model more than reducing the number of channels and layers (Elsen et al., 2020; Zhu and Gupta, 2017), while existing works on sparse pruning of BERT yield inferior results compared to small-dense counterparts such as TinyBERT (Jiao et al., 2020).

Network Pruning

Minimizing FLOPs to Learn Efficient Sparse Representations

1 code implementation ICLR 2020 Biswajit Paria, Chih-Kuan Yeh, Ian E. H. Yen, Ning Xu, Pradeep Ravikumar, Barnabás Póczos

Deep representation learning has become one of the most widely adopted approaches for visual search, recommendation, and identification.

Quantization · Representation Learning +1

DEEP-TRIM: REVISITING L1 REGULARIZATION FOR CONNECTION PRUNING OF DEEP NETWORK

no code implementations ICLR 2019 Chih-Kuan Yeh, Ian E. H. Yen, Hong-You Chen, Chun-Pei Yang, Shou-De Lin, Pradeep Ravikumar

State-of-the-art deep neural networks (DNNs) typically have tens of millions of parameters, which might not fit into the upper levels of the memory hierarchy, thus increasing the inference time and energy consumption significantly, and prohibiting their use on edge devices such as mobile phones.

Representer Point Selection for Explaining Deep Neural Networks

1 code implementation NeurIPS 2018 Chih-Kuan Yeh, Joon Sik Kim, Ian E. H. Yen, Pradeep Ravikumar

We propose to explain the predictions of a deep neural network by pointing to the set of what we call representer points in the training set, for a given test point prediction.

Word Mover's Embedding: From Word2Vec to Document Embedding

1 code implementation EMNLP 2018 Lingfei Wu, Ian E. H. Yen, Kun Xu, Fangli Xu, Avinash Balakrishnan, Pin-Yu Chen, Pradeep Ravikumar, Michael J. Witbrock

While the celebrated Word2Vec technique yields semantically rich representations for individual words, there has been relatively less success in extending it to generate unsupervised sentence or document embeddings.

Document Embedding · General Classification +6

Revisiting Random Binning Features: Fast Convergence and Strong Parallelizability

2 code implementations 14 Sep 2018 Lingfei Wu, Ian E. H. Yen, Jie Chen, Rui Yan

We thus propose the first analysis of RB from the perspective of optimization, which, by interpreting RB as Randomized Block Coordinate Descent in an infinite-dimensional space, gives a faster convergence rate than that of other random features.

Proximal Quasi-Newton for Computationally Intensive L1-regularized M-estimators

no code implementations NeurIPS 2014 Kai Zhong, Ian E. H. Yen, Inderjit S. Dhillon, Pradeep Ravikumar

We consider the class of optimization problems arising from computationally intensive L1-regularized M-estimators, where the function or gradient values are very expensive to compute.

General Classification · Structured Prediction
