Search Results for author: Peter Vajda

Found 28 papers, 15 papers with code

Cross-Domain Adaptive Teacher for Object Detection

1 code implementation25 Nov 2021 Yu-Jhe Li, Xiaoliang Dai, Chih-Yao Ma, Yen-Cheng Liu, Kan Chen, Bichen Wu, Zijian He, Kris Kitani, Peter Vajda

To mitigate this problem, we propose a teacher-student framework named Adaptive Teacher (AT) which leverages domain adversarial learning and weak-strong data augmentation to address the domain gap.

Data Augmentation Domain Adaptation +1

FBNetV5: Neural Architecture Search for Multiple Tasks in One Run

no code implementations19 Nov 2021 Bichen Wu, Chaojian Li, Hang Zhang, Xiaoliang Dai, Peizhao Zhang, Matthew Yu, Jialiang Wang, Yingyan Lin, Peter Vajda

To tackle these challenges, we propose FBNetV5, a NAS framework that can search for neural architectures for a variety of vision tasks with much reduced computational cost and human effort.

Classification Image Classification +3

Pruning Compact ConvNets For Efficient Inference

no code implementations29 Sep 2021 Sayan Ghosh, Karthik Prasad, Xiaoliang Dai, Peizhao Zhang, Bichen Wu, Graham Cormode, Peter Vajda

The resulting family of pruned models can consistently obtain better performance than existing FBNetV3 models at the same level of computation, and thus provide state-of-the-art results when trading off between computational complexity and generalization performance on the ImageNet benchmark.

Network Pruning Neural Architecture Search

An Investigation on Hardware-Aware Vision Transformer Scaling

no code implementations29 Sep 2021 Chaojian Li, KyungMin Kim, Bichen Wu, Peizhao Zhang, Hang Zhang, Xiaoliang Dai, Peter Vajda, Yingyan Lin

In particular, when transferred to PiT, our scaling strategies lead to a boosted ImageNet top-1 accuracy of from $74. 6\%$ to $76. 7\%$ ($\uparrow2. 1\%$) under the same 0. 7G FLOPs; and when transferred to the COCO object detection task, the average precision is boosted by $\uparrow0. 7\%$ under a similar throughput on a V100 GPU.

Image Classification Object Detection +1

Tackling the Ill-Posedness of Super-Resolution Through Adaptive Target Generation

1 code implementation CVPR 2021 Younghyun Jo, Seoung Wug Oh, Peter Vajda, Seon Joo Kim

By the one-to-many nature of the super-resolution (SR) problem, a single low-resolution (LR) image can be mapped to many high-resolution (HR) images.

Super-Resolution

Data-Efficient Language-Supervised Zero-Shot Learning with Self-Distillation

no code implementations18 Apr 2021 Ruizhe Cheng, Bichen Wu, Peizhao Zhang, Peter Vajda, Joseph E. Gonzalez

Our model transfers knowledge from pretrained image and sentence encoders and achieves strong performance with only 3M image text pairs, 133x smaller than CLIP.

Zero-Shot Learning

Unbiased Teacher for Semi-Supervised Object Detection

2 code implementations ICLR 2021 Yen-Cheng Liu, Chih-Yao Ma, Zijian He, Chia-Wen Kuo, Kan Chen, Peizhao Zhang, Bichen Wu, Zsolt Kira, Peter Vajda

To address this, we introduce Unbiased Teacher, a simple yet effective approach that jointly trains a student and a gradually progressing teacher in a mutually-beneficial manner.

Image Classification Object Detection +2

FBWave: Efficient and Scalable Neural Vocoders for Streaming Text-To-Speech on the Edge

no code implementations25 Nov 2020 Bichen Wu, Qing He, Peizhao Zhang, Thilo Koehler, Kurt Keutzer, Peter Vajda

More efficient variants of FBWave can achieve up to 109x fewer MACs while still delivering acceptable audio quality.

One Shot 3D Photography

no code implementations27 Aug 2020 Johannes Kopf, Kevin Matzen, Suhib Alsisan, Ocean Quigley, Francis Ge, Yangming Chong, Josh Patterson, Jan-Michael Frahm, Shu Wu, Matthew Yu, Peizhao Zhang, Zijian He, Peter Vajda, Ayush Saraf, Michael Cohen

3D photos are static in time, like traditional photos, but are displayed with interactive parallax on mobile or desktop screens, as well as on Virtual Reality devices, where viewing it also includes stereo.

Monocular Depth Estimation

Visual Transformers: Token-based Image Representation and Processing for Computer Vision

6 code implementations5 Jun 2020 Bichen Wu, Chenfeng Xu, Xiaoliang Dai, Alvin Wan, Peizhao Zhang, Zhicheng Yan, Masayoshi Tomizuka, Joseph Gonzalez, Kurt Keutzer, Peter Vajda

In this work, we challenge this paradigm by (a) representing images as semantic visual tokens and (b) running transformers to densely model token relationships.

General Classification Image Classification +1

Deep Space-Time Video Upsampling Networks

1 code implementation ECCV 2020 Jaeyeon Kang, Younghyun Jo, Seoung Wug Oh, Peter Vajda, Seon Joo Kim

Video super-resolution (VSR) and frame interpolation (FI) are traditional computer vision problems, and the performance have been improving by incorporating deep learning recently.

Frame Motion Compensation +1

SqueezeSegV3: Spatially-Adaptive Convolution for Efficient Point-Cloud Segmentation

2 code implementations ECCV 2020 Chenfeng Xu, Bichen Wu, Zining Wang, Wei Zhan, Peter Vajda, Kurt Keutzer, Masayoshi Tomizuka

Using standard convolutions to process such LiDAR images is problematic, as convolution filters pick up local features that are only active in specific regions in the image.

3D Semantic Segmentation Point Cloud Segmentation

Learning to Generate Grounded Visual Captions without Localization Supervision

2 code implementations1 Jun 2019 Chih-Yao Ma, Yannis Kalantidis, Ghassan AlRegib, Peter Vajda, Marcus Rohrbach, Zsolt Kira

When automatically generating a sentence description for an image or video, it often remains unclear how well the generated caption is grounded, that is whether the model uses the correct image regions to output particular words, or if the model is hallucinating based on priors in the dataset and/or the language model.

Image Captioning Language Modelling +1

Precision Highway for Ultra Low-Precision Quantization

no code implementations ICLR 2019 Eunhyeok Park, Dongyoung Kim, Sungjoo Yoo, Peter Vajda

We also report that the proposed method significantly outperforms the existing method in the 2-bit quantization of an LSTM for language modeling.

Language Modelling Quantization

ChamNet: Towards Efficient Network Design through Platform-Aware Model Adaptation

1 code implementation CVPR 2019 Xiaoliang Dai, Peizhao Zhang, Bichen Wu, Hongxu Yin, Fei Sun, Yanghan Wang, Marat Dukhan, Yunqing Hu, Yiming Wu, Yangqing Jia, Peter Vajda, Matt Uyttendaele, Niraj K. Jha

We formulate platform-aware NN architecture search in an optimization framework and propose a novel algorithm to search for optimal architectures aided by efficient accuracy and resource (latency and/or energy) predictors.

Neural Architecture Search

Mixed Precision Quantization of ConvNets via Differentiable Neural Architecture Search

no code implementations ICLR 2019 Bichen Wu, Yanghan Wang, Peizhao Zhang, Yuandong Tian, Peter Vajda, Kurt Keutzer

Recent work in network quantization has substantially reduced the time and space complexity of neural network inference, enabling their deployment on embedded and mobile devices with limited computational and memory resources.

Neural Architecture Search Quantization

Value-aware Quantization for Training and Inference of Neural Networks

no code implementations ECCV 2018 Eunhyeok Park, Sungjoo Yoo, Peter Vajda

We propose a novel value-aware quantization which applies aggressively reduced precision to the majority of data while separately handling a small amount of large data in high precision, which reduces total quantization errors under very low precision.

Quantization

Cannot find the paper you are looking for? You can Submit a new open access paper.