no code implementations • 27 May 2024 • Florian Bordes, Richard Yuanzhe Pang, Anurag Ajay, Alexander C. Li, Adrien Bardes, Suzanne Petryk, Oscar Mañas, Zhiqiu Lin, Anas Mahmoud, Bargav Jayaraman, Mark Ibrahim, Melissa Hall, Yunyang Xiong, Jonathan Lebensold, Candace Ross, Srihari Jayakumar, Chuan Guo, Diane Bouchacourt, Haider Al-Tahan, Karthik Padthe, Vasu Sharma, Hu Xu, Xiaoqing Ellen Tan, Megan Richards, Samuel Lavoie, Pietro Astolfi, Reyhane Askari Hemmat, Jun Chen, Kushal Tirumala, Rim Assouel, Mazda Moayeri, Arjang Talattof, Kamalika Chaudhuri, Zechun Liu, Xilun Chen, Quentin Garrido, Karen Ullrich, Aishwarya Agrawal, Kate Saenko, Asli Celikyilmaz, Vikas Chandra
Then, we present and discuss approaches to evaluate VLMs.
no code implementations • 26 May 2024 • Zechun Liu, Changsheng Zhao, Igor Fedorov, Bilge Soran, Dhruv Choudhary, Raghuraman Krishnamoorthi, Vikas Chandra, Yuandong Tian, Tijmen Blankevoort
In this work, we identify a collection of applicable rotation parameterizations that lead to identical outputs in full-precision Transformer architectures, and find that some random rotations lead to much better quantization than others, with up to a 13-point difference in downstream zero-shot reasoning performance.
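The core identity is easy to verify numerically: inserting an orthogonal rotation R into a linear layer as (WR)(RᵀX) leaves full-precision outputs unchanged, while the choice of R changes how outliers are spread across dimensions before quantization. A minimal numpy sketch, using a generic min-max quantizer rather than the paper's scheme:

```python
import numpy as np

rng = np.random.default_rng(0)

# Random orthogonal rotation via QR decomposition.
d = 64
R, _ = np.linalg.qr(rng.standard_normal((d, d)))

W = rng.standard_normal((d, d))   # a weight matrix
W[0, 0] = 50.0                    # inject an outlier, as seen in real LLM layers
x = rng.standard_normal(d)        # an activation vector

# Full precision: rotating weights and counter-rotating inputs is an identity,
# since (W R) (R^T x) = W (R R^T) x = W x.
y_plain   = W @ x
y_rotated = (W @ R) @ (R.T @ x)
assert np.allclose(y_plain, y_rotated)

def quantize(A, bits=4):
    """Symmetric per-tensor min-max quantization (illustrative only)."""
    scale = np.abs(A).max() / (2 ** (bits - 1) - 1)
    return np.round(A / scale).clip(-(2 ** (bits - 1)), 2 ** (bits - 1) - 1) * scale

# After quantization the identity no longer holds, and the rotation matters:
# spreading the outlier shrinks the quantization scale for all other weights.
err_plain   = np.abs(quantize(W) @ x - y_plain).mean()
err_rotated = np.abs(quantize(W @ R) @ (R.T @ x) - y_plain).mean()
print(f"quantization error without rotation: {err_plain:.4f}")
print(f"quantization error with rotation:    {err_rotated:.4f}")
```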
no code implementations • 22 Feb 2024 • Zechun Liu, Changsheng Zhao, Forrest Iandola, Chen Lai, Yuandong Tian, Igor Fedorov, Yunyang Xiong, Ernie Chang, Yangyang Shi, Raghuraman Krishnamoorthi, Liangzhen Lai, Vikas Chandra
The resultant models, denoted MobileLLM-LS, demonstrate a further accuracy gain of 0.7%/0.8% over MobileLLM 125M/350M.
no code implementations • 1 Nov 2023 • Ernie Chang, Sidd Srinivasan, Mahi Luthra, Pin-Jie Lin, Varun Nagaraja, Forrest Iandola, Zechun Liu, Zhaoheng Ni, Changsheng Zhao, Yangyang Shi, Vikas Chandra
Text-to-audio generation (TTA) produces audio from a text description, learning from pairs of audio samples and hand-annotated text.
1 code implementation • 25 Oct 2023 • Shih-Yang Liu, Zechun Liu, Xijie Huang, Pingcheng Dong, Kwang-Ting Cheng
Our method, for the first time, can quantize both weights and activations in LLaMA-13B to only 4 bits and achieves an average score of 63.1 on common-sense zero-shot reasoning tasks, only 5.8 lower than the full-precision model, significantly outperforming the previous state-of-the-art by 12.7 points.
1 code implementation • 14 Oct 2023 • Jun Chen, Deyao Zhu, Xiaoqian Shen, Xiang Li, Zechun Liu, Pengchuan Zhang, Raghuraman Krishnamoorthi, Vikas Chandra, Yunyang Xiong, Mohamed Elhoseiny
Motivated by this, we aim to build a unified interface for completing many vision-language tasks, including image description, visual question answering, and visual grounding, among others.
Ranked #10 on Visual Question Answering on BenchLMM
1 code implementation • 12 Jun 2023 • Xijie Huang, Zechun Liu, Shih-Yang Liu, Kwang-Ting Cheng
Compared with previous coreset selection methods, our method significantly improves QAT performance with different dataset fractions.
no code implementations • 8 Jun 2023 • Ganesh Jawahar, Haichuan Yang, Yunyang Xiong, Zechun Liu, Dilin Wang, Fei Sun, Meng Li, Aasish Pappu, Barlas Oguz, Muhammad Abdul-Mageed, Laks V. S. Lakshmanan, Raghuraman Krishnamoorthi, Vikas Chandra
In addition, the proposed method achieves SOTA performance in NAS for building fast machine translation models, yielding a better latency-BLEU tradeoff than HAT, the state-of-the-art NAS method for MT.
1 code implementation • 2 Jun 2023 • Zechun Liu, Barlas Oguz, Aasish Pappu, Yangyang Shi, Raghuraman Krishnamoorthi
For machine translation, we achieved BLEU scores of 21.7 and 17.6 on the WMT16 En-Ro benchmark, compared with a full-precision mBART model score of 26.8.
no code implementations • 29 May 2023 • Zechun Liu, Barlas Oguz, Changsheng Zhao, Ernie Chang, Pierre Stock, Yashar Mehdad, Yangyang Shi, Raghuraman Krishnamoorthi, Vikas Chandra
Several post-training quantization methods have been applied to large language models (LLMs) and have been shown to perform well down to 8 bits.
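For context, post-training quantization in its simplest form rounds pretrained weights to a low-bit grid with no retraining. The sketch below is a generic per-channel absmax round-to-nearest quantizer, not any particular method from the paper, and illustrates how quality degrades as the bit-width shrinks:

```python
import torch

def ptq_absmax(weight: torch.Tensor, bits: int = 8) -> torch.Tensor:
    """Per-output-channel absmax round-to-nearest PTQ (generic sketch)."""
    qmax = 2 ** (bits - 1) - 1                       # e.g. 127 for int8
    scale = weight.abs().amax(dim=1, keepdim=True) / qmax
    q = (weight / scale).round().clamp(-qmax - 1, qmax)
    return q * scale                                 # dequantized weights

w = torch.randn(1024, 1024)
for bits in (8, 4):
    err = (ptq_absmax(w, bits) - w).abs().mean()
    print(f"{bits}-bit mean abs error: {err:.5f}")   # error grows quickly below 8 bits
```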
no code implementations • 22 Mar 2023 • Renjie Wei, Shuwen Zhang, Zechun Liu, Meng Li, Yuchen Fan, Runsheng Wang, Ru Huang
While the performance of deep convolutional neural networks for image super-resolution (SR) has improved significantly, the rapid increase in memory and computation requirements hinders their deployment on resource-constrained devices.
1 code implementation • 4 Feb 2023 • Shih-Yang Liu, Zechun Liu, Kwang-Ting Cheng
In addition, we find that the interdependence between the quantized weights in the query and key of a self-attention layer makes ViT vulnerable to oscillation.
no code implementations • 9 Jun 2022 • Xijie Huang, Zhiqiang Shen, Shichao Li, Zechun Liu, Xianghong Hu, Jeffry Wicaksana, Eric Xing, Kwang-Ting Cheng
Model quantization approaches have been widely used to deploy deep models in a computationally efficient manner.
2 code implementations • 25 May 2022 • Zechun Liu, Barlas Oguz, Aasish Pappu, Lin Xiao, Scott Yih, Meng Li, Raghuraman Krishnamoorthi, Yashar Mehdad
Modern pre-trained transformers have rapidly advanced the state-of-the-art in machine learning, but have also grown in parameters and computational complexity, making them increasingly difficult to deploy in resource-constrained environments.
1 code implementation • 21 Mar 2022 • Shichao Li, Zechun Liu, Zhiqiang Shen, Kwang-Ting Cheng
We propose a new object-centric framework for learning-based stereo 3D object detection.
1 code implementation • CVPR 2022 • Arnav Chavan, Zhiqiang Shen, Zhuang Liu, Zechun Liu, Kwang-Ting Cheng, Eric Xing
This paper explores the feasibility of finding an optimal sub-model from a vision transformer and introduces a pure vision transformer slimming (ViT-Slim) framework.
no code implementations • 3 Dec 2021 • Zechun Liu, Zhiqiang Shen, Yun Long, Eric Xing, Kwang-Ting Cheng, Chas Leichner
We identify that the NAS task requires synthesized data (we target the image domain here) with sufficient semantics, diversity, and a minimal domain gap from natural images.
1 code implementation • CVPR 2022 • Zechun Liu, Kwang-Ting Cheng, Dong Huang, Eric Xing, Zhiqiang Shen
The nonuniform quantization strategy for compressing neural networks usually achieves better performance than its uniform counterpart due to its superior representational capacity.
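The intuition is that bell-shaped weight distributions waste uniform levels in low-density tails. A toy comparison, using quantile-centroid (Lloyd-Max style) levels as a generic stand-in for a nonuniform quantizer rather than the paper's actual scheme:

```python
import numpy as np

rng = np.random.default_rng(0)
w = rng.standard_normal(100_000)          # bell-shaped weights

def quantize_to_levels(x, levels):
    """Map each value to its nearest quantization level."""
    idx = np.abs(x[:, None] - levels[None, :]).argmin(axis=1)
    return levels[idx]

bits = 2
# Uniform: 2^bits evenly spaced levels over the weight range.
uniform = np.linspace(w.min(), w.max(), 2 ** bits)

# Nonuniform stand-in: centroids of equal-probability regions, which
# concentrate levels where the density is high.
edges = np.quantile(w, np.linspace(0, 1, 2 ** bits + 1))
nonuniform = np.array([w[(w >= lo) & (w < hi)].mean()
                       for lo, hi in zip(edges[:-1], edges[1:])])

for name, levels in [("uniform", uniform), ("nonuniform", nonuniform)]:
    mse = np.mean((quantize_to_levels(w, levels) - w) ** 2)
    print(f"{name:10s} 2-bit MSE: {mse:.4f}")   # nonuniform is markedly lower
```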
1 code implementation • 9 Nov 2021 • Zhiqiang Shen, Zechun Liu, Eric Xing
The proposed weight-sharing mechanism via a sliced recursion structure allows us to build a transformer with more than 100 or even 1,000 shared layers with ease while keeping a compact size (13–15M parameters), avoiding optimization difficulties when the model is too large; a minimal sketch of the recursion follows below.
Ranked #271 on Image Classification on ImageNet
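A minimal PyTorch sketch of the recursion idea, with one physical layer unrolled many times; the slicing and other details of the paper are omitted:

```python
import torch
import torch.nn as nn

class RecursiveTransformer(nn.Module):
    """Sketch: one physical encoder layer unrolled N times (weights shared),
    so effective depth grows without growing the parameter count."""
    def __init__(self, d_model=256, nhead=4, recursions=100):
        super().__init__()
        self.layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.recursions = recursions

    def forward(self, x):
        for _ in range(self.recursions):   # same weights applied repeatedly
            x = self.layer(x)
        return x

model = RecursiveTransformer().eval()
n_params = sum(p.numel() for p in model.parameters())
print(f"100 effective layers, {n_params / 1e6:.1f}M parameters")  # one layer's worth
tokens = torch.randn(1, 16, 256)
print(model(tokens).shape)  # torch.Size([1, 16, 256])
```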
no code implementations • 21 Jun 2021 • Zechun Liu, Zhiqiang Shen, Shichao Li, Koen Helwegen, Dong Huang, Kwang-Ting Cheng
We show that the regularization effect of the second-order momentum in Adam is crucial for revitalizing weights that are dead due to activation saturation in BNNs.
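A small numerical illustration of the mechanism (not the paper's experiment): with a consistently tiny gradient, Adam's update m/√v normalizes to roughly the learning rate, whereas a plain SGD step stays proportional to the raw gradient:

```python
import numpy as np

# A "dead" weight in a BNN sits in the saturated region of the activation,
# so it only ever receives tiny gradients, e.g. on the order of 1e-6.
g = 1e-6
lr = 0.01

# SGD step: proportional to the raw gradient -- effectively frozen.
sgd_step = lr * g

# Adam step: m / (sqrt(v) + eps). With consistently tiny gradients, sqrt(v)
# is also tiny, so the normalized step stays near lr in magnitude.
m = v = 0.0
beta1, beta2, eps = 0.9, 0.999, 1e-8
for t in range(1, 1001):                   # 1000 identical tiny gradients
    m = beta1 * m + (1 - beta1) * g
    v = beta2 * v + (1 - beta2) * g ** 2
m_hat, v_hat = m / (1 - beta1 ** 1000), v / (1 - beta2 ** 1000)
adam_step = lr * m_hat / (np.sqrt(v_hat) + eps)

print(f"SGD step:  {sgd_step:.2e}")   # ~1e-08: the weight barely moves
print(f"Adam step: {adam_step:.2e}")  # ~1e-02: normalization revives it
```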
1 code implementation • 16 Apr 2021 • Tianlong Chen, Zhenyu Zhang, Xu Ouyang, Zechun Liu, Zhiqiang Shen, Zhangyang Wang
However, the BN layer is costly to calculate and is typically implemented with non-binary parameters, posing a hurdle to the efficient implementation of BNN training.
Ranked #167 on Image Classification on CIFAR-10
no code implementations • ICLR 2021 • Zhiqiang Shen, Zechun Liu, Dejia Xu, Zitian Chen, Kwang-Ting Cheng, Marios Savvides
This work aims to empirically clarify a recently discovered perspective that label smoothing is incompatible with knowledge distillation.
1 code implementation • CVPR 2021 • Zhiqiang Shen, Zechun Liu, Jie Qin, Lei Huang, Kwang-Ting Cheng, Marios Savvides
In this paper, we focus on this more difficult scenario: learning networks in which both weights and activations are binary, without any human-annotated labels.
no code implementations • 8 Feb 2021 • Zhiqiang Shen, Zechun Liu, Jie Qin, Marios Savvides, Kwang-Ting Cheng
A common practice for this task is to train a model on the base set first and then transfer to novel classes through fine-tuning (here, the fine-tuning procedure is defined as transferring knowledge from base to novel data, i.e., learning to transfer in the few-shot scenario).
no code implementations • 29 Nov 2020 • Ellen Yi-Ge, Rui Fan, Zechun Liu, Zhiqiang Shen
Keypoints of objects reflect their concise abstractions, while the corresponding connection links (CL) build the skeleton by detecting the intrinsic relations between keypoints.
no code implementations • 4 Jul 2020 • Yun Li, Zechun Liu, Weiqun Wu, Haotian Yao, Xiangyu Zhang, Chi Zhang, Baoqun Yin
In this paper, a simple yet effective network pruning framework is proposed to simultaneously address the problems of pruning indicator, pruning ratio, and efficiency constraint.
no code implementations • 18 May 2020 • Zechun Liu, Xiangyu Zhang, Zhiqiang Shen, Zhe Li, Yichen Wei, Kwang-Ting Cheng, Jian Sun
To tackle these three naturally different dimensions, we propose a general framework that defines pruning as seeking the best pruning vector (i.e., the numerical values of layer-wise channel number, spatial size, and depth) and constructs a unique mapping from the pruning vector to the pruned network structures.
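A hypothetical sketch of the mapping idea, with invented helper names: here the pruning vector is just per-layer channel counts plus a depth, and spatial size (handled via input resizing) is noted but not implemented:

```python
import torch.nn as nn

def build_from_pruning_vector(channels, depth, in_ch=3):
    """Hypothetical mapping from a pruning vector to a concrete CNN:
    `channels` gives the width of each kept layer, `depth` how many layers
    to keep. Each vector deterministically yields one pruned structure."""
    layers, prev = [], in_ch
    for c in channels[:depth]:
        layers += [nn.Conv2d(prev, c, 3, padding=1), nn.BatchNorm2d(c), nn.ReLU()]
        prev = c
    return nn.Sequential(*layers)

full   = build_from_pruning_vector(channels=[64, 128, 256, 512], depth=4)
pruned = build_from_pruning_vector(channels=[32,  64, 128, 512], depth=3)
count = lambda m: sum(p.numel() for p in m.parameters())
print(f"full: {count(full)/1e6:.2f}M params, pruned: {count(pruned)/1e6:.2f}M params")
```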
no code implementations • CVPR 2020 • Hai Phan, Zechun Liu, Dang Huynh, Marios Savvides, Kwang-Ting Cheng, Zhiqiang Shen
Inspired by one-shot architecture search frameworks, we adapt the idea of group convolution to design efficient 1-Bit Convolutional Neural Networks (CNNs), assuming an approximately optimal trade-off between computational cost and model accuracy.
1 code implementation • 29 Mar 2020 • Devesh Walawalkar, Zhiqiang Shen, Zechun Liu, Marios Savvides
In this paper, we propose Attentive CutMix, a naturally enhanced augmentation strategy based on CutMix.
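For reference, a sketch of the underlying CutMix operation that Attentive CutMix builds on; the attentive variant selects the pasted patch by feature-map activation rather than at random:

```python
import torch

def cutmix(x1, y1, x2, y2, lam=0.7):
    """Vanilla CutMix: paste a random rectangle from image 2 into image 1 and
    mix the labels in proportion to the pasted area. (Attentive CutMix instead
    picks the patch with the highest feature-map activations.)"""
    _, H, W = x1.shape
    cut_h, cut_w = int(H * (1 - lam) ** 0.5), int(W * (1 - lam) ** 0.5)
    cy = torch.randint(H - cut_h + 1, (1,)).item()
    cx = torch.randint(W - cut_w + 1, (1,)).item()
    x = x1.clone()
    x[:, cy:cy + cut_h, cx:cx + cut_w] = x2[:, cy:cy + cut_h, cx:cx + cut_w]
    lam_actual = 1 - (cut_h * cut_w) / (H * W)
    y = lam_actual * y1 + (1 - lam_actual) * y2   # soft label mix
    return x, y

img1, img2 = torch.rand(3, 224, 224), torch.rand(3, 224, 224)
lbl1, lbl2 = torch.zeros(10), torch.zeros(10)
lbl1[3], lbl2[7] = 1.0, 1.0
mixed_img, mixed_lbl = cutmix(img1, lbl1, img2, lbl2)
print(mixed_lbl)  # mass split between classes 3 and 7 by pasted area
```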
3 code implementations • 11 Mar 2020 • Zhiqiang Shen, Zechun Liu, Zhuang Liu, Marios Savvides, Trevor Darrell, Eric Xing
This drawback hinders the model from learning subtle variance and fine-grained information.
4 code implementations • ECCV 2020 • Zechun Liu, Zhiqiang Shen, Marios Savvides, Kwang-Ting Cheng
In this paper, we propose several ideas for enhancing a binary network to close its accuracy gap to real-valued networks without incurring any additional computational cost.
3 code implementations • NeurIPS 2019 • Koen Helwegen, James Widdicombe, Lukas Geiger, Zechun Liu, Kwang-Ting Cheng, Roeland Nusselder
Together, the redefinition of latent weights as inertia and the introduction of Bop enable a better understanding of BNN optimization and open up the way for further improvements in training methodologies for BNNs.
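A minimal sketch of a Bop-style update as described in the paper, with toy gradients; the threshold values and the synthetic loss are invented for illustration:

```python
import numpy as np

def bop_step(w, grad, m, gamma=1e-3, tau=0.01):
    """One Bop-style update (sketch after Helwegen et al.): keep an exponential
    moving average of gradients and flip a binary weight only when the average
    is both large enough (> tau) and agrees in sign with the weight, meaning a
    flip would reduce the loss. No latent real-valued weights are kept."""
    m = (1 - gamma) * m + gamma * grad            # gradient "inertia"
    flip = (np.abs(m) > tau) & (np.sign(m) == np.sign(w))
    return np.where(flip, -w, w), m

rng = np.random.default_rng(0)
w = np.sign(rng.standard_normal(8))               # binary weights in {-1, +1}
m = np.zeros_like(w)
target = np.sign(rng.standard_normal(8))          # toy optimum
for _ in range(500):                              # noisy gradients, stable mean
    grad = 0.05 * (w - target) + 0.1 * rng.standard_normal(8)
    w, m = bop_step(w, grad, m)
print((w == target).all())  # weights flipped toward the target, then settled
```

The threshold tau filters out gradient noise, so a weight only flips once the averaged evidence consistently points the other way.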
6 code implementations • ECCV 2020 • Zichao Guo, Xiangyu Zhang, Haoyuan Mu, Wen Heng, Zechun Liu, Yichen Wei, Jian Sun
It is easy to train and fast to search.
Ranked #88 on Neural Architecture Search on ImageNet (Accuracy metric)
2 code implementations • ICCV 2019 • Zechun Liu, Haoyuan Mu, Xiangyu Zhang, Zichao Guo, Xin Yang, Tim Kwang-Ting Cheng, Jian Sun
In this paper, we propose a novel meta learning approach for automatic channel pruning of very deep neural networks.
1 code implementation • 4 Nov 2018 • Zechun Liu, Wenhan Luo, Baoyuan Wu, Xin Yang, Wei Liu, Kwang-Ting Cheng
To address the training difficulty, we propose a training algorithm using a tighter approximation to the derivative of the sign function, a magnitude-aware gradient for weight updating, a better initialization method, and a two-step scheme for training a deep network.
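A sketch of such a tighter approximation, following the piecewise-polynomial form used in Bi-Real Net: the forward pass binarizes with sign, and the backward pass uses the polynomial's derivative in place of the straight-through estimator's simple clip:

```python
import torch

class ApproxSign(torch.autograd.Function):
    """Binarize with sign() in the forward pass; in the backward pass, use the
    derivative of a piecewise polynomial (2x + x^2 on [-1, 0), 2x - x^2 on
    [0, 1)) that hugs sign() more tightly than the straight-through clip."""
    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return torch.sign(x)

    @staticmethod
    def backward(ctx, grad_out):
        (x,) = ctx.saved_tensors
        # Triangular derivative: 2 + 2x for x < 0, 2 - 2x for x >= 0, 0 outside [-1, 1]
        grad = torch.where(x < 0, 2 + 2 * x, 2 - 2 * x).clamp(min=0)
        return grad_out * grad * (x.abs() < 1)

x = torch.randn(5, requires_grad=True)
y = ApproxSign.apply(x)
y.sum().backward()
print(x.grad)  # zero outside [-1, 1], peaked near 0
```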
4 code implementations • ECCV 2018 • Zechun Liu, Baoyuan Wu, Wenhan Luo, Xin Yang, Wei Liu, Kwang-Ting Cheng
In this work, we study the 1-bit convolutional neural networks (CNNs), of which both the weights and activations are binary.