Search Results for author: Xiaoliang Dai

Found 28 papers, 10 papers with code

Auto-CARD: Efficient and Robust Codec Avatar Driving for Real-time Mobile Telepresence

no code implementations CVPR 2023 Yonggan Fu, Yuecheng Li, Chenghui Li, Jason Saragih, Peizhao Zhang, Xiaoliang Dai, Yingyan Lin

Real-time and robust photorealistic avatars for telepresence in AR/VR have been highly desired for enabling immersive photorealistic telepresence.

Neural Architecture Search

Trainable Projected Gradient Method for Robust Fine-tuning

1 code implementation CVPR 2023 Junjiao Tian, Xiaoliang Dai, Chih-Yao Ma, Zecheng He, Yen-Cheng Liu, Zsolt Kira

To solve this problem, we propose Trainable Projected Gradient Method (TPGM) to automatically learn the constraint imposed for each layer for a fine-grained fine-tuning regularization.

Transfer Learning

Mask3D: Pre-training 2D Vision Transformers by Learning Masked 3D Priors

no code implementations CVPR 2023 Ji Hou, Xiaoliang Dai, Zijian He, Angela Dai, Matthias Nießner

Current popular backbones in computer vision, such as Vision Transformers (ViT) and ResNets are trained to perceive the world from 2D images.

Contrastive Learning Instance Segmentation +5

Pruning Compact ConvNets for Efficient Inference

no code implementations11 Jan 2023 Sayan Ghosh, Karthik Prasad, Xiaoliang Dai, Peizhao Zhang, Bichen Wu, Graham Cormode, Peter Vajda

The resulting family of pruned models can consistently obtain better performance than existing FBNetV3 models at the same level of computation, and thus provide state-of-the-art results when trading off between computational complexity and generalization performance on the ImageNet benchmark.

Network Pruning Neural Architecture Search

Castling-ViT: Compressing Self-Attention via Switching Towards Linear-Angular Attention at Vision Transformer Inference

no code implementations CVPR 2023 Haoran You, Yunyang Xiong, Xiaoliang Dai, Bichen Wu, Peizhao Zhang, Haoqi Fan, Peter Vajda, Yingyan, Lin

Vision Transformers (ViTs) have shown impressive performance but still require a high computation cost as compared to convolutional neural networks (CNNs), one reason is that ViTs' attention measures global similarities and thus has a quadratic complexity with the number of input tokens.

Token Merging: Your ViT But Faster

3 code implementations17 Oct 2022 Daniel Bolya, Cheng-Yang Fu, Xiaoliang Dai, Peizhao Zhang, Christoph Feichtenhofer, Judy Hoffman

Off-the-shelf, ToMe can 2x the throughput of state-of-the-art ViT-L @ 512 and ViT-H @ 518 models on images and 2. 2x the throughput of ViT-L on video with only a 0. 2-0. 3% accuracy drop in each case.

Hydra Attention: Efficient Attention with Many Heads

no code implementations15 Sep 2022 Daniel Bolya, Cheng-Yang Fu, Xiaoliang Dai, Peizhao Zhang, Judy Hoffman

While transformers have begun to dominate many tasks in vision, applying them to large images is still computationally difficult.

Open-Set Semi-Supervised Object Detection

no code implementations29 Aug 2022 Yen-Cheng Liu, Chih-Yao Ma, Xiaoliang Dai, Junjiao Tian, Peter Vajda, Zijian He, Zsolt Kira

To address this problem, we consider online and offline OOD detection modules, which are integrated with SSOD methods.

object-detection Object Detection +2

NASRec: Weight Sharing Neural Architecture Search for Recommender Systems

2 code implementations14 Jul 2022 Tunhou Zhang, Dehua Cheng, Yuchen He, Zhengxing Chen, Xiaoliang Dai, Liang Xiong, Feng Yan, Hai Li, Yiran Chen, Wei Wen

To overcome the data multi-modality and architecture heterogeneity challenges in the recommendation domain, NASRec establishes a large supernet (i. e., search space) to search the full architectures.

Click-Through Rate Prediction Neural Architecture Search +1

Cross-Domain Adaptive Teacher for Object Detection

2 code implementations CVPR 2022 Yu-Jhe Li, Xiaoliang Dai, Chih-Yao Ma, Yen-Cheng Liu, Kan Chen, Bichen Wu, Zijian He, Kris Kitani, Peter Vajda

To mitigate this problem, we propose a teacher-student framework named Adaptive Teacher (AT) which leverages domain adversarial learning and weak-strong data augmentation to address the domain gap.

Data Augmentation Domain Adaptation +2

FBNetV5: Neural Architecture Search for Multiple Tasks in One Run

no code implementations19 Nov 2021 Bichen Wu, Chaojian Li, Hang Zhang, Xiaoliang Dai, Peizhao Zhang, Matthew Yu, Jialiang Wang, Yingyan Lin, Peter Vajda

To tackle these challenges, we propose FBNetV5, a NAS framework that can search for neural architectures for a variety of vision tasks with much reduced computational cost and human effort.

Classification Image Classification +4

An Investigation on Hardware-Aware Vision Transformer Scaling

no code implementations29 Sep 2021 Chaojian Li, KyungMin Kim, Bichen Wu, Peizhao Zhang, Hang Zhang, Xiaoliang Dai, Peter Vajda, Yingyan Lin

In particular, when transferred to PiT, our scaling strategies lead to a boosted ImageNet top-1 accuracy of from $74. 6\%$ to $76. 7\%$ ($\uparrow2. 1\%$) under the same 0. 7G FLOPs; and when transferred to the COCO object detection task, the average precision is boosted by $\uparrow0. 7\%$ under a similar throughput on a V100 GPU.

Image Classification object-detection +2

FP-NAS: Fast Probabilistic Neural Architecture Search

no code implementations CVPR 2021 Zhicheng Yan, Xiaoliang Dai, Peizhao Zhang, Yuandong Tian, Bichen Wu, Matt Feiszli

Furthermore, to search fast in the multi-variate space, we propose a coarse-to-fine strategy by using a factorized distribution at the beginning which can reduce the number of architecture parameters by over an order of magnitude.

Neural Architecture Search

Fully Dynamic Inference with Deep Neural Networks

no code implementations29 Jul 2020 Wenhan Xia, Hongxu Yin, Xiaoliang Dai, Niraj K. Jha

Modern deep neural networks are powerful and widely applicable models that extract task-relevant information through multi-level abstraction.

Self-Driving Cars

Visual Transformers: Token-based Image Representation and Processing for Computer Vision

8 code implementations5 Jun 2020 Bichen Wu, Chenfeng Xu, Xiaoliang Dai, Alvin Wan, Peizhao Zhang, Zhicheng Yan, Masayoshi Tomizuka, Joseph Gonzalez, Kurt Keutzer, Peter Vajda

In this work, we challenge this paradigm by (a) representing images as semantic visual tokens and (b) running transformers to densely model token relationships.

General Classification Image Classification +1

STEERAGE: Synthesis of Neural Networks Using Architecture Search and Grow-and-Prune Methods

no code implementations12 Dec 2019 Shayan Hassantabar, Xiaoliang Dai, Niraj K. Jha

On MNIST dataset, our CNN architecture achieves an error rate of 0. 66%, with 8. 6x fewer parameters compared to the LeNet-5 baseline.

Navigate

DiabDeep: Pervasive Diabetes Diagnosis based on Wearable Medical Sensors and Efficient Neural Networks

no code implementations11 Oct 2019 Hongxu Yin, Bilal Mukadam, Xiaoliang Dai, Niraj K. Jha

For server (edge) side inference, we achieve a 96. 3% (95. 3%) accuracy in classifying diabetics against healthy individuals, and a 95. 7% (94. 6%) accuracy in distinguishing among type-1/type-2 diabetic, and healthy individuals.

Incremental Learning Using a Grow-and-Prune Paradigm with Efficient Neural Networks

no code implementations27 May 2019 Xiaoliang Dai, Hongxu Yin, Niraj K. Jha

Deep neural networks (DNNs) have become a widely deployed model for numerous machine learning applications.

Incremental Learning

ChamNet: Towards Efficient Network Design through Platform-Aware Model Adaptation

1 code implementation CVPR 2019 Xiaoliang Dai, Peizhao Zhang, Bichen Wu, Hongxu Yin, Fei Sun, Yanghan Wang, Marat Dukhan, Yunqing Hu, Yiming Wu, Yangqing Jia, Peter Vajda, Matt Uyttendaele, Niraj K. Jha

We formulate platform-aware NN architecture search in an optimization framework and propose a novel algorithm to search for optimal architectures aided by efficient accuracy and resource (latency and/or energy) predictors.

Bayesian Optimization Efficient Neural Network +1

Grow and Prune Compact, Fast, and Accurate LSTMs

no code implementations30 May 2018 Xiaoliang Dai, Hongxu Yin, Niraj K. Jha

To address these problems, we propose a hidden-layer LSTM (H-LSTM) that adds hidden layers to LSTM's original one level non-linear control gates.

Image Captioning speech-recognition +1

NeST: A Neural Network Synthesis Tool Based on a Grow-and-Prune Paradigm

no code implementations6 Nov 2017 Xiaoliang Dai, Hongxu Yin, Niraj K. Jha

To address these problems, we introduce a network growth algorithm that complements network pruning to learn both weights and compact DNN architectures during training.

Network Pruning

Cannot find the paper you are looking for? You can Submit a new open access paper.