no code implementations • CVPR 2023 • Yonggan Fu, Yuecheng Li, Chenghui Li, Jason Saragih, Peizhao Zhang, Xiaoliang Dai, Yingyan Lin
Real-time and robust photorealistic avatars for telepresence in AR/VR have been highly desired for enabling immersive photorealistic telepresence.
1 code implementation • CVPR 2023 • Junjiao Tian, Xiaoliang Dai, Chih-Yao Ma, Zecheng He, Yen-Cheng Liu, Zsolt Kira
To solve this problem, we propose Trainable Projected Gradient Method (TPGM) to automatically learn the constraint imposed for each layer for a fine-grained fine-tuning regularization.
no code implementations • CVPR 2023 • Ji Hou, Xiaoliang Dai, Zijian He, Angela Dai, Matthias Nießner
Current popular backbones in computer vision, such as Vision Transformers (ViT) and ResNets are trained to perceive the world from 2D images.
no code implementations • 11 Jan 2023 • Sayan Ghosh, Karthik Prasad, Xiaoliang Dai, Peizhao Zhang, Bichen Wu, Graham Cormode, Peter Vajda
The resulting family of pruned models can consistently obtain better performance than existing FBNetV3 models at the same level of computation, and thus provide state-of-the-art results when trading off between computational complexity and generalization performance on the ImageNet benchmark.
no code implementations • CVPR 2023 • Haoran You, Yunyang Xiong, Xiaoliang Dai, Bichen Wu, Peizhao Zhang, Haoqi Fan, Peter Vajda, Yingyan, Lin
Vision Transformers (ViTs) have shown impressive performance but still require a high computation cost as compared to convolutional neural networks (CNNs), one reason is that ViTs' attention measures global similarities and thus has a quadratic complexity with the number of input tokens.
no code implementations • 12 Nov 2022 • Yu-Jhe Li, Tao Xu, Bichen Wu, Ningyuan Zheng, Xiaoliang Dai, Albert Pumarola, Peizhao Zhang, Peter Vajda, Kris Kitani
In the first stage, we introduce a base encoder that converts the input image to a latent code.
3 code implementations • 17 Oct 2022 • Daniel Bolya, Cheng-Yang Fu, Xiaoliang Dai, Peizhao Zhang, Christoph Feichtenhofer, Judy Hoffman
Off-the-shelf, ToMe can 2x the throughput of state-of-the-art ViT-L @ 512 and ViT-H @ 518 models on images and 2. 2x the throughput of ViT-L on video with only a 0. 2-0. 3% accuracy drop in each case.
1 code implementation • CVPR 2023 • Feng Liang, Bichen Wu, Xiaoliang Dai, Kunpeng Li, Yinan Zhao, Hang Zhang, Peizhao Zhang, Peter Vajda, Diana Marculescu
To address this, we propose to finetune CLIP on a collection of masked image regions and their corresponding text descriptions.
no code implementations • 15 Sep 2022 • Daniel Bolya, Cheng-Yang Fu, Xiaoliang Dai, Peizhao Zhang, Judy Hoffman
While transformers have begun to dominate many tasks in vision, applying them to large images is still computationally difficult.
no code implementations • 29 Aug 2022 • Yen-Cheng Liu, Chih-Yao Ma, Xiaoliang Dai, Junjiao Tian, Peter Vajda, Zijian He, Zsolt Kira
To address this problem, we consider online and offline OOD detection modules, which are integrated with SSOD methods.
2 code implementations • 14 Jul 2022 • Tunhou Zhang, Dehua Cheng, Yuchen He, Zhengxing Chen, Xiaoliang Dai, Liang Xiong, Feng Yan, Hai Li, Yiran Chen, Wei Wen
To overcome the data multi-modality and architecture heterogeneity challenges in the recommendation domain, NASRec establishes a large supernet (i. e., search space) to search the full architectures.
2 code implementations • CVPR 2022 • Yu-Jhe Li, Xiaoliang Dai, Chih-Yao Ma, Yen-Cheng Liu, Kan Chen, Bichen Wu, Zijian He, Kris Kitani, Peter Vajda
To mitigate this problem, we propose a teacher-student framework named Adaptive Teacher (AT) which leverages domain adversarial learning and weak-strong data augmentation to address the domain gap.
no code implementations • 19 Nov 2021 • Bichen Wu, Chaojian Li, Hang Zhang, Xiaoliang Dai, Peizhao Zhang, Matthew Yu, Jialiang Wang, Yingyan Lin, Peter Vajda
To tackle these challenges, we propose FBNetV5, a NAS framework that can search for neural architectures for a variety of vision tasks with much reduced computational cost and human effort.
Ranked #7 on
Neural Architecture Search
on ImageNet
no code implementations • 29 Sep 2021 • Chaojian Li, KyungMin Kim, Bichen Wu, Peizhao Zhang, Hang Zhang, Xiaoliang Dai, Peter Vajda, Yingyan Lin
In particular, when transferred to PiT, our scaling strategies lead to a boosted ImageNet top-1 accuracy of from $74. 6\%$ to $76. 7\%$ ($\uparrow2. 1\%$) under the same 0. 7G FLOPs; and when transferred to the COCO object detection task, the average precision is boosted by $\uparrow0. 7\%$ under a similar throughput on a V100 GPU.
no code implementations • 29 Sep 2021 • Yu-Jhe Li, Xiaoliang Dai, Chih-Yao Ma, Yen-Cheng Liu, Kan Chen, Bichen Wu, Zijian He, Kris M. Kitani, Peter Vajda
This enables the student model to capture domain-invariant features.
no code implementations • ICCV 2021 • Bichen Wu, Chenfeng Xu, Xiaoliang Dai, Alvin Wan, Peizhao Zhang, Zhicheng Yan, Masayoshi Tomizuka, Joseph E. Gonzalez, Kurt Keutzer, Peter Vajda
A recent trend in computer vision is to replace convolutions with transformers.
no code implementations • CVPR 2021 • Zhicheng Yan, Xiaoliang Dai, Peizhao Zhang, Yuandong Tian, Bichen Wu, Matt Feiszli
Furthermore, to search fast in the multi-variate space, we propose a coarse-to-fine strategy by using a factorized distribution at the beginning which can reduce the number of architecture parameters by over an order of magnitude.
no code implementations • 29 Jul 2020 • Wenhan Xia, Hongxu Yin, Xiaoliang Dai, Niraj K. Jha
Modern deep neural networks are powerful and widely applicable models that extract task-relevant information through multi-level abstraction.
8 code implementations • 5 Jun 2020 • Bichen Wu, Chenfeng Xu, Xiaoliang Dai, Alvin Wan, Peizhao Zhang, Zhicheng Yan, Masayoshi Tomizuka, Joseph Gonzalez, Kurt Keutzer, Peter Vajda
In this work, we challenge this paradigm by (a) representing images as semantic visual tokens and (b) running transformers to densely model token relationships.
2 code implementations • CVPR 2021 • Xiaoliang Dai, Alvin Wan, Peizhao Zhang, Bichen Wu, Zijian He, Zhen Wei, Kan Chen, Yuandong Tian, Matthew Yu, Peter Vajda, Joseph E. Gonzalez
To address this, we present Neural Architecture-Recipe Search (NARS) to search both (a) architectures and (b) their corresponding training recipes, simultaneously.
Ranked #5 on
Neural Architecture Search
on ImageNet
1 code implementation • CVPR 2020 • Alvin Wan, Xiaoliang Dai, Peizhao Zhang, Zijian He, Yuandong Tian, Saining Xie, Bichen Wu, Matthew Yu, Tao Xu, Kan Chen, Peter Vajda, Joseph E. Gonzalez
We propose a masking mechanism for feature map reuse, so that memory and computational costs stay nearly constant as the search space expands.
Ranked #68 on
Neural Architecture Search
on ImageNet
no code implementations • 12 Dec 2019 • Shayan Hassantabar, Xiaoliang Dai, Niraj K. Jha
On MNIST dataset, our CNN architecture achieves an error rate of 0. 66%, with 8. 6x fewer parameters compared to the LeNet-5 baseline.
no code implementations • 11 Oct 2019 • Hongxu Yin, Bilal Mukadam, Xiaoliang Dai, Niraj K. Jha
For server (edge) side inference, we achieve a 96. 3% (95. 3%) accuracy in classifying diabetics against healthy individuals, and a 95. 7% (94. 6%) accuracy in distinguishing among type-1/type-2 diabetic, and healthy individuals.
no code implementations • 27 May 2019 • Xiaoliang Dai, Hongxu Yin, Niraj K. Jha
Deep neural networks (DNNs) have become a widely deployed model for numerous machine learning applications.
1 code implementation • CVPR 2019 • Xiaoliang Dai, Peizhao Zhang, Bichen Wu, Hongxu Yin, Fei Sun, Yanghan Wang, Marat Dukhan, Yunqing Hu, Yiming Wu, Yangqing Jia, Peter Vajda, Matt Uyttendaele, Niraj K. Jha
We formulate platform-aware NN architecture search in an optimization framework and propose a novel algorithm to search for optimal architectures aided by efficient accuracy and resource (latency and/or energy) predictors.
5 code implementations • CVPR 2019 • Bichen Wu, Xiaoliang Dai, Peizhao Zhang, Yanghan Wang, Fei Sun, Yiming Wu, Yuandong Tian, Peter Vajda, Yangqing Jia, Kurt Keutzer
Due to this, previous neural architecture search (NAS) methods are computationally expensive.
Ranked #838 on
Image Classification
on ImageNet
no code implementations • 30 May 2018 • Xiaoliang Dai, Hongxu Yin, Niraj K. Jha
To address these problems, we propose a hidden-layer LSTM (H-LSTM) that adds hidden layers to LSTM's original one level non-linear control gates.
no code implementations • 6 Nov 2017 • Xiaoliang Dai, Hongxu Yin, Niraj K. Jha
To address these problems, we introduce a network growth algorithm that complements network pruning to learn both weights and compact DNN architectures during training.