Search Results for author: Kurt Keutzer

Found 110 papers, 70 papers with code

What’s Hidden in a One-layer Randomly Weighted Transformer?

1 code implementation EMNLP 2021 Sheng Shen, Zhewei Yao, Douwe Kiela, Kurt Keutzer, Michael Mahoney

Hidden within a one-layer randomly weighted Transformer, we find that subnetworks that can achieve 29. 45/17. 29 BLEU on IWSLT14/WMT14.

Machine Translation Translation

Train Big, Then Compress: Rethinking Model Size for Efficient Training and Inference of Transformers

no code implementations ICML 2020 Zhuohan Li, Eric Wallace, Sheng Shen, Kevin Lin, Kurt Keutzer, Dan Klein, Joseph Gonzalez

Since hardware resources are limited, the objective of training deep learning models is typically to maximize accuracy subject to the time and memory constraints of training and inference.

Machine Translation Quantization +1

The ArtBench Dataset: Benchmarking Generative Models with Artworks

1 code implementation22 Jun 2022 Peiyuan Liao, Xiuyu Li, Xihui Liu, Kurt Keutzer

We introduce ArtBench-10, the first class-balanced, high-quality, cleanly annotated, and standardized dataset for benchmarking artwork generation.

Conditional Image Generation Unconditional Image Generation

Squeezeformer: An Efficient Transformer for Automatic Speech Recognition

1 code implementation2 Jun 2022 Sehoon Kim, Amir Gholami, Albert Shaw, Nicholas Lee, Karttikeya Mangalam, Jitendra Malik, Michael W. Mahoney, Kurt Keutzer

After reexamining the design choices for both the macro and micro-architecture of Conformer, we propose the Squeezeformer model, which consistently outperforms the state-of-the-art ASR models under the same training schemes.

Automatic Speech Recognition

Cross-Domain Object Detection with Mean-Teacher Transformer

no code implementations3 May 2022 Jinze Yu, Jiaming Liu, Xiaobao Wei, Haoyi Zhou, Yohei Nakata, Denis Gudovskiy, Tomoyuki Okuno, JianXin Li, Kurt Keutzer, Shanghang Zhang

With these strategies, more accurate pseudo labels can be obtained, and knowledge can be better transferred from source to target, thus improving the cross-domain capability of the detection transformer.

Domain Adaptation object-detection +2

PreTraM: Self-Supervised Pre-training via Connecting Trajectory and Map

no code implementations21 Apr 2022 Chenfeng Xu, Tian Li, Chen Tang, Lingfeng Sun, Kurt Keutzer, Masayoshi Tomizuka, Alireza Fathi, Wei Zhan

It is hard to replicate these approaches in trajectory forecasting due to the lack of adequate trajectory data (e. g., 34K samples in the nuScenes dataset).

Contrastive Learning Natural Language Processing +2

K-LITE: Learning Transferable Visual Models with External Knowledge

no code implementations20 Apr 2022 Sheng Shen, Chunyuan Li, Xiaowei Hu, Yujia Xie, Jianwei Yang, Pengchuan Zhang, Anna Rohrbach, Zhe Gan, Lijuan Wang, Lu Yuan, Ce Liu, Kurt Keutzer, Trevor Darrell, Jianfeng Gao

In this paper, we propose K-LITE (Knowledge-augmented Language-Image Training and Evaluation), a simple strategy to leverage external knowledge to build transferable visual systems: In training, it enriches entities in natural language with WordNet and Wiktionary knowledge, leading to an efficient and scalable approach to learning image representations that can understand both visual concepts and their knowledge; In evaluation, the natural language is also augmented with external knowledge and then used to reference learned visual concepts (or describe new ones) to enable zero-shot and few-shot transfer of the pre-trained models.

Image Classification object-detection +2

A Fast Post-Training Pruning Framework for Transformers

1 code implementation29 Mar 2022 Woosuk Kwon, Sehoon Kim, Michael W. Mahoney, Joseph Hassoun, Kurt Keutzer, Amir Gholami

Pruning is an effective way to reduce the huge inference cost of large Transformer models.

Staged Training for Transformer Language Models

1 code implementation11 Mar 2022 Sheng Shen, Pete Walsh, Kurt Keutzer, Jesse Dodge, Matthew Peters, Iz Beltagy

As an alternative, we consider a staged training setup that begins with a small model and incrementally increases the amount of compute used for training by applying a "growth operator" to increase the model depth and width.

NovelD: A Simple yet Effective Exploration Criterion

1 code implementation NeurIPS 2021 Tianjun Zhang, Huazhe Xu, Xiaolong Wang, Yi Wu, Kurt Keutzer, Joseph E. Gonzalez, Yuandong Tian

We analyze NovelD thoroughly in MiniGrid and found that empirically it helps the agent explore the environment more uniformly with a focus on exploring beyond the boundary.

Efficient Exploration Montezuma's Revenge +1

Differentiable NAS Framework and Application to Ads CTR Prediction

1 code implementation25 Oct 2021 Ravi Krishna, Aravind Kalaiah, Bichen Wu, Maxim Naumov, Dheevatsa Mudigere, Misha Smelyanskiy, Kurt Keutzer

Neural architecture search (NAS) methods aim to automatically find the optimal deep neural network (DNN) architecture as measured by a given objective function, typically some combination of task accuracy and inference efficiency.

Click-Through Rate Prediction Natural Language Processing +1

Multi-source Few-shot Domain Adaptation

no code implementations25 Sep 2021 Xiangyu Yue, Zangwei Zheng, Colorado Reed, Hari Prasanna Das, Kurt Keutzer, Alberto Sangiovanni Vincentelli

Multi-source Domain Adaptation (MDA) aims to transfer predictive models from multiple, fully-labeled source domains to an unlabeled target domain.

Domain Adaptation Self-Supervised Learning

What's Hidden in a One-layer Randomly Weighted Transformer?

1 code implementation8 Sep 2021 Sheng Shen, Zhewei Yao, Douwe Kiela, Kurt Keutzer, Michael W. Mahoney

Hidden within a one-layer randomly weighted Transformer, we find that subnetworks that can achieve 29. 45/17. 29 BLEU on IWSLT14/WMT14.

Machine Translation Translation

How Much Can CLIP Benefit Vision-and-Language Tasks?

2 code implementations13 Jul 2021 Sheng Shen, Liunian Harold Li, Hao Tan, Mohit Bansal, Anna Rohrbach, Kai-Wei Chang, Zhewei Yao, Kurt Keutzer

Most existing Vision-and-Language (V&L) models rely on pre-trained visual encoders, using a relatively small set of manually-annotated data (as compared to web-crawled data), to perceive the visual world.

Ranked #5 on Visual Entailment on SNLI-VE val (using extra training data)

Question Answering Visual Entailment +1

Learned Token Pruning for Transformers

1 code implementation2 Jul 2021 Sehoon Kim, Sheng Shen, David Thorsley, Amir Gholami, Woosuk Kwon, Joseph Hassoun, Kurt Keutzer

We extensively test the performance of LTP on GLUE tasks and show that our method outperforms the prior state-of-the-art token pruning methods by up to ~2. 5% higher accuracy with the same amount of FLOPs.

Invariant Information Bottleneck for Domain Generalization

no code implementations11 Jun 2021 Bo Li, Yifei Shen, Yezhen Wang, Wenzhen Zhu, Colorado J. Reed, Jun Zhang, Dongsheng Li, Kurt Keutzer, Han Zhao

IIB significantly outperforms IRM on synthetic datasets, where the pseudo-invariant features and geometric skews occur, showing the effectiveness of proposed formulation in overcoming failure modes of IRM.

Domain Generalization

LEAP: Learnable Pruning for Transformer-based Models

1 code implementation30 May 2021 Zhewei Yao, Xiaoxia Wu, Linjian Ma, Sheng Shen, Kurt Keutzer, Michael W. Mahoney, Yuxiong He

Moreover, in order to reduce hyperparameter tuning, a novel adaptive regularization coefficient is deployed to control the regularization penalty adaptively.

Natural Language Processing

A Survey of Quantization Methods for Efficient Neural Network Inference

no code implementations25 Mar 2021 Amir Gholami, Sehoon Kim, Zhen Dong, Zhewei Yao, Michael W. Mahoney, Kurt Keutzer

Thus, it is not surprising that quantization has emerged recently as an important and very active sub-area of research in the efficient implementation of computations associated with Neural Networks.

Natural Language Processing Quantization

Region Similarity Representation Learning

1 code implementation ICCV 2021 Tete Xiao, Colorado J Reed, Xiaolong Wang, Kurt Keutzer, Trevor Darrell

We present Region Similarity Representation Learning (ReSim), a new approach to self-supervised representation learning for localization-based tasks such as object detection and segmentation.

Instance Segmentation object-detection +4

Self-Supervised Pretraining Improves Self-Supervised Pretraining

1 code implementation23 Mar 2021 Colorado J. Reed, Xiangyu Yue, Ani Nrusimha, Sayna Ebrahimi, Vivek Vijaykumar, Richard Mao, Bo Li, Shanghang Zhang, Devin Guillory, Sean Metzger, Kurt Keutzer, Trevor Darrell

Through experimentation on 16 diverse vision datasets, we show HPT converges up to 80x faster, improves accuracy across tasks, and improves the robustness of the self-supervised pretraining process to changes in the image augmentation policy or amount of pretraining data.

Image Augmentation

Improving Context-Based Meta-Reinforcement Learning with Self-Supervised Trajectory Contrastive Learning

no code implementations10 Mar 2021 Bernie Wang, Simon Xu, Kurt Keutzer, Yang Gao, Bichen Wu

To address this, we propose a novel self-supervised learning task, which we named Trajectory Contrastive Learning (TCL), to improve meta-training.

Contrastive Learning Meta Reinforcement Learning +2

Hessian-Aware Pruning and Optimal Neural Implant

1 code implementation22 Jan 2021 Shixing Yu, Zhewei Yao, Amir Gholami, Zhen Dong, Sehoon Kim, Michael W Mahoney, Kurt Keutzer

To address this problem, we introduce a new Hessian Aware Pruning (HAP) method coupled with a Neural Implant approach that uses second-order sensitivity as a metric for structured pruning.

I-BERT: Integer-only BERT Quantization

4 code implementations5 Jan 2021 Sehoon Kim, Amir Gholami, Zhewei Yao, Michael W. Mahoney, Kurt Keutzer

Transformer based models, like BERT and RoBERTa, have achieved state-of-the-art results in many Natural Language Processing tasks.

Natural Language Inference Natural Language Processing +2

Reservoir Transformers

no code implementations ACL 2021 Sheng Shen, Alexei Baevski, Ari S. Morcos, Kurt Keutzer, Michael Auli, Douwe Kiela

We demonstrate that transformers obtain impressive performance even when some of the layers are randomly initialized and never updated.

Language Modelling Machine Translation +1

BeBold: Exploration Beyond the Boundary of Explored Regions

3 code implementations15 Dec 2020 Tianjun Zhang, Huazhe Xu, Xiaolong Wang, Yi Wu, Kurt Keutzer, Joseph E. Gonzalez, Yuandong Tian

In this paper, we analyze the pros and cons of each method and propose the regulated difference of inverse visitation counts as a simple but effective criterion for IR.

Efficient Exploration NetHack +1

Annotation-Efficient Untrimmed Video Action Recognition

no code implementations30 Nov 2020 Yixiong Zou, Shanghang Zhang, Guangyao Chen, Yonghong Tian, Kurt Keutzer, José M. F. Moura

In this paper, we target a new problem, Annotation-Efficient Video Recognition, to reduce the requirement of annotations for both large amount of samples and the action location.

Action Recognition Contrastive Learning +2

Emotional Semantics-Preserved and Feature-Aligned CycleGAN for Visual Emotion Adaptation

no code implementations25 Nov 2020 Sicheng Zhao, Xuanbai Chen, Xiangyu Yue, Chuang Lin, Pengfei Xu, Ravi Krishna, Jufeng Yang, Guiguang Ding, Alberto L. Sangiovanni-Vincentelli, Kurt Keutzer

First, we generate an adapted domain to align the source and target domains on the pixel-level by improving CycleGAN with a multi-scale structured cycle-consistency loss.

Emotion Classification Emotion Recognition +1

FBWave: Efficient and Scalable Neural Vocoders for Streaming Text-To-Speech on the Edge

no code implementations25 Nov 2020 Bichen Wu, Qing He, Peizhao Zhang, Thilo Koehler, Kurt Keutzer, Peter Vajda

More efficient variants of FBWave can achieve up to 109x fewer MACs while still delivering acceptable audio quality.

HAWQV3: Dyadic Neural Network Quantization

1 code implementation20 Nov 2020 Zhewei Yao, Zhen Dong, Zhangcheng Zheng, Amir Gholami, Jiali Yu, Eric Tan, Leyuan Wang, Qijing Huang, Yida Wang, Michael W. Mahoney, Kurt Keutzer

Current low-precision quantization algorithms often have the hidden cost of conversion back and forth from floating point to quantized integer values.

Model Compression Quantization

Curriculum CycleGAN for Textual Sentiment Domain Adaptation with Multiple Sources

1 code implementation17 Nov 2020 Sicheng Zhao, Yang Xiao, Jiang Guo, Xiangyu Yue, Jufeng Yang, Ravi Krishna, Pengfei Xu, Kurt Keutzer

C-CycleGAN transfers source samples at instance-level to an intermediate domain that is closer to the target domain with sentiment semantics preserved and without losing discriminative features.

Domain Adaptation Sentiment Analysis

Cross-Domain Sentiment Classification with Contrastive Learning and Mutual Information Maximization

1 code implementation30 Oct 2020 Tian Li, Xiang Chen, Shanghang Zhang, Zhen Dong, Kurt Keutzer

Due to scarcity of labels on the target domain, we introduce mutual information maximization (MIM) apart from CL to exploit the features that best support the final prediction.

Contrastive Learning General Classification +3

Multi-Agent Collaboration via Reward Attribution Decomposition

2 code implementations16 Oct 2020 Tianjun Zhang, Huazhe Xu, Xiaolong Wang, Yi Wu, Kurt Keutzer, Joseph E. Gonzalez, Yuandong Tian

In this work, we propose Collaborative Q-learning (CollaQ) that achieves state-of-the-art performance in the StarCraft multi-agent challenge and supports ad hoc team play.

Dota 2 Multi-agent Reinforcement Learning +2

ePointDA: An End-to-End Simulation-to-Real Domain Adaptation Framework for LiDAR Point Cloud Segmentation

no code implementations7 Sep 2020 Sicheng Zhao, Yezhen Wang, Bo Li, Bichen Wu, Yang Gao, Pengfei Xu, Trevor Darrell, Kurt Keutzer

They require prior knowledge of real-world statistics and ignore the pixel-level dropout noise gap and the spatial feature gap between different domains.

Autonomous Driving Domain Adaptation +2

A Review of Single-Source Deep Unsupervised Visual Domain Adaptation

1 code implementation1 Sep 2020 Sicheng Zhao, Xiangyu Yue, Shanghang Zhang, Bo Li, Han Zhao, Bichen Wu, Ravi Krishna, Joseph E. Gonzalez, Alberto L. Sangiovanni-Vincentelli, Sanjit A. Seshia, Kurt Keutzer

To cope with limited labeled training data, many have attempted to directly apply models trained on a large-scale labeled source domain to another sparsely labeled or unlabeled target domain.

Unsupervised Domain Adaptation

Emotion-Based End-to-End Matching Between Image and Music in Valence-Arousal Space

1 code implementation22 Aug 2020 Sicheng Zhao, Yaxian Li, Xingxu Yao, Wei-Zhi Nie, Pengfei Xu, Jufeng Yang, Kurt Keutzer

In this paper, we study end-to-end matching between image and music based on emotions in the continuous valence-arousal (VA) space.

Metric Learning

Boundary thickness and robustness in learning models

1 code implementation NeurIPS 2020 Yaoqing Yang, Rajiv Khanna, Yaodong Yu, Amir Gholami, Kurt Keutzer, Joseph E. Gonzalez, Kannan Ramchandran, Michael W. Mahoney

Using these observations, we show that noise-augmentation on mixup training further increases boundary thickness, thereby combating vulnerability to various forms of adversarial attacks and OOD transforms.

Adversarial Defense Data Augmentation

Rethinking Distributional Matching Based Domain Adaptation

no code implementations23 Jun 2020 Bo Li, Yezhen Wang, Tong Che, Shanghang Zhang, Sicheng Zhao, Pengfei Xu, Wei Zhou, Yoshua Bengio, Kurt Keutzer

In this paper, in order to devise robust DA algorithms, we first systematically analyze the limitations of DM based methods, and then build new benchmarks with more realistic domain shifts to evaluate the well-accepted DM methods.

Domain Adaptation

Visual Transformers: Token-based Image Representation and Processing for Computer Vision

6 code implementations5 Jun 2020 Bichen Wu, Chenfeng Xu, Xiaoliang Dai, Alvin Wan, Peizhao Zhang, Zhicheng Yan, Masayoshi Tomizuka, Joseph Gonzalez, Kurt Keutzer, Peter Vajda

In this work, we challenge this paradigm by (a) representing images as semantic visual tokens and (b) running transformers to densely model token relationships.

General Classification Image Classification +1

ADAHESSIAN: An Adaptive Second Order Optimizer for Machine Learning

4 code implementations1 Jun 2020 Zhewei Yao, Amir Gholami, Sheng Shen, Mustafa Mustafa, Kurt Keutzer, Michael W. Mahoney

We introduce ADAHESSIAN, a second order stochastic optimization algorithm which dynamically incorporates the curvature of the loss function via ADAptive estimates of the HESSIAN.

Second-order methods Stochastic Optimization

SqueezeSegV3: Spatially-Adaptive Convolution for Efficient Point-Cloud Segmentation

2 code implementations ECCV 2020 Chenfeng Xu, Bichen Wu, Zining Wang, Wei Zhan, Peter Vajda, Kurt Keutzer, Masayoshi Tomizuka

Using standard convolutions to process such LiDAR images is problematic, as convolution filters pick up local features that are only active in specific regions in the image.

3D Semantic Segmentation Point Cloud Segmentation

PowerNorm: Rethinking Batch Normalization in Transformers

1 code implementation ICML 2020 Sheng Shen, Zhewei Yao, Amir Gholami, Michael W. Mahoney, Kurt Keutzer

To address this, we propose Power Normalization (PN), a novel normalization scheme that resolves this issue by (i) relaxing zero-mean normalization in BN, (ii) incorporating a running quadratic mean instead of per batch statistics to stabilize fluctuations, and (iii) using an approximate backpropagation for incorporating the running statistics in the forward pass.

Natural Language Processing

Multi-source Domain Adaptation in the Deep Learning Era: A Systematic Survey

no code implementations26 Feb 2020 Sicheng Zhao, Bo Li, Colorado Reed, Pengfei Xu, Kurt Keutzer

Therefore, transferring the learned knowledge from a separate, labeled source domain to an unlabeled or sparsely labeled target domain becomes an appealing alternative.

Domain Adaptation

Train Large, Then Compress: Rethinking Model Size for Efficient Training and Inference of Transformers

2 code implementations26 Feb 2020 Zhuohan Li, Eric Wallace, Sheng Shen, Kevin Lin, Kurt Keutzer, Dan Klein, Joseph E. Gonzalez

Since hardware resources are limited, the objective of training deep learning models is typically to maximize accuracy subject to the time and memory constraints of training and inference.

Machine Translation Quantization +1

Algorithm-hardware Co-design for Deformable Convolution

2 code implementations19 Feb 2020 Qijing Huang, Dequan Wang, Yizhao Gao, Yaohui Cai, Zhen Dong, Bichen Wu, Kurt Keutzer, John Wawrzynek

In this work, we first investigate the overhead of the deformable convolution on embedded FPGA SoCs, and then show the accuracy-latency tradeoffs for a set of algorithm modifications including full versus depthwise, fixed-shape, and limited-range.

Image Classification Instance Segmentation +4

MADAN: Multi-source Adversarial Domain Aggregation Network for Domain Adaptation

1 code implementation19 Feb 2020 Sicheng Zhao, Bo Li, Xiangyu Yue, Pengfei Xu, Kurt Keutzer

Finally, feature-level alignment is performed between the aggregated domain and the target domain while training the task network.

Domain Adaptation Semantic Segmentation

SqueezeWave: Extremely Lightweight Vocoders for On-device Speech Synthesis

1 code implementation16 Jan 2020 Bohan Zhai, Tianren Gao, Flora Xue, Daniel Rothchild, Bichen Wu, Joseph E. Gonzalez, Kurt Keutzer

Automatic speech synthesis is a challenging task that is becoming increasingly important as edge devices begin to interact with users through speech.

Sound Audio and Speech Processing

ZeroQ: A Novel Zero Shot Quantization Framework

3 code implementations CVPR 2020 Yaohui Cai, Zhewei Yao, Zhen Dong, Amir Gholami, Michael W. Mahoney, Kurt Keutzer

Importantly, ZeroQ has a very low computational overhead, and it can finish the entire quantization process in less than 30s (0. 5\% of one epoch training time of ResNet50 on ImageNet).

 Ranked #1 on Data Free Quantization on CIFAR10 (CIFAR-10 W8A8 Top-1 Accuracy metric)

Data Free Quantization Neural Network Compression

PyHessian: Neural Networks Through the Lens of the Hessian

2 code implementations16 Dec 2019 Zhewei Yao, Amir Gholami, Kurt Keutzer, Michael Mahoney

To illustrate this, we analyze the effect of residual connections and Batch Normalization layers on the trainability of neural networks.

ANODEV2: A Coupled Neural ODE Framework

1 code implementation NeurIPS 2019 Tianjun Zhang, Zhewei Yao, Amir Gholami, Joseph E. Gonzalez, Kurt Keutzer, Michael W. Mahoney, George Biros

It has been observed that residual networks can be viewed as the explicit Euler discretization of an Ordinary Differential Equation (ODE).

Domain-Aware Dynamic Networks

no code implementations26 Nov 2019 Tianyuan Zhang, Bichen Wu, Xin Wang, Joseph Gonzalez, Kurt Keutzer

In this work, we propose a method to improve the model capacity without increasing inference-time complexity.

object-detection Object Detection

Multi-source Distilling Domain Adaptation

1 code implementation22 Nov 2019 Sicheng Zhao, Guangzhi Wang, Shanghang Zhang, Yang Gu, Yaxian Li, Zhichao Song, Pengfei Xu, Runbo Hu, Hua Chai, Kurt Keutzer

Deep neural networks suffer from performance decay when there is domain shift between the labeled source domain and unlabeled target domain, which motivates the research on domain adaptation (DA).

Domain Adaptation Multi-Source Unsupervised Domain Adaptation

Checkmate: Breaking the Memory Wall with Optimal Tensor Rematerialization

2 code implementations7 Oct 2019 Paras Jain, Ajay Jain, Aniruddha Nrusimha, Amir Gholami, Pieter Abbeel, Kurt Keutzer, Ion Stoica, Joseph E. Gonzalez

We formalize the problem of trading-off DNN training time and memory requirements as the tensor rematerialization optimization problem, a generalization of prior checkpointing strategies.

Q-BERT: Hessian Based Ultra Low Precision Quantization of BERT

no code implementations12 Sep 2019 Sheng Shen, Zhen Dong, Jiayu Ye, Linjian Ma, Zhewei Yao, Amir Gholami, Michael W. Mahoney, Kurt Keutzer

In particular, we propose a new group-wise quantization scheme, and we use a Hessian based mix-precision method to compress the model further.

Natural Language Processing Quantization

PDANet: Polarity-consistent Deep Attention Network for Fine-grained Visual Emotion Regression

1 code implementation11 Sep 2019 Sicheng Zhao, Zizhou Jia, Hui Chen, Leida Li, Guiguang Ding, Kurt Keutzer

By optimizing the PCR loss, PDANet can generate a polarity preserved attention map and thus improve the emotion regression performance.

Deep Attention Emotion Classification +1

ANODEV2: A Coupled Neural ODE Evolution Framework

no code implementations10 Jun 2019 Tianjun Zhang, Zhewei Yao, Amir Gholami, Kurt Keutzer, Joseph Gonzalez, George Biros, Michael Mahoney

It has been observed that residual networks can be viewed as the explicit Euler discretization of an Ordinary Differential Equation (ODE).

HAWQ: Hessian AWare Quantization of Neural Networks with Mixed-Precision

1 code implementation ICCV 2019 Zhen Dong, Zhewei Yao, Amir Gholami, Michael Mahoney, Kurt Keutzer

Another challenge is a similar factorial complexity for determining block-wise fine-tuning order when quantizing the model to a target precision.


LATTE: Accelerating LiDAR Point Cloud Annotation via Sensor Fusion, One-Click Annotation, and Tracking

2 code implementations19 Apr 2019 Bernie Wang, Virginia Wu, Bichen Wu, Kurt Keutzer

2) One-click annotation: Instead of drawing 3D bounding boxes or point-wise labels, we simplify the annotation to just one click on the target object, and automatically generate the bounding box for the target.

Autonomous Vehicles

Inefficiency of K-FAC for Large Batch Size Training

no code implementations14 Mar 2019 Linjian Ma, Gabe Montague, Jiayu Ye, Zhewei Yao, Amir Gholami, Kurt Keutzer, Michael W. Mahoney

In stochastic optimization, using large batch sizes during training can leverage parallel resources to produce faster wall-clock training times per training epoch.

Stochastic Optimization

Large-Batch Training for LSTM and Beyond

1 code implementation24 Jan 2019 Yang You, Jonathan Hseu, Chris Ying, James Demmel, Kurt Keutzer, Cho-Jui Hsieh

LEGW enables Sqrt Scaling scheme to be useful in practice and as a result we achieve much better results than the Linear Scaling learning rate scheme.

Trust Region Based Adversarial Attack on Neural Networks

2 code implementations CVPR 2019 Zhewei Yao, Amir Gholami, Peng Xu, Kurt Keutzer, Michael Mahoney

To address this problem, we present a new family of trust region based adversarial attacks, with the goal of computing adversarial perturbations efficiently.

Adversarial Attack

Parameter Re-Initialization through Cyclical Batch Size Schedules

no code implementations4 Dec 2018 Norman Mu, Zhewei Yao, Amir Gholami, Kurt Keutzer, Michael Mahoney

We demonstrate the ability of our method to improve language modeling performance by up to 7. 91 perplexity and reduce training iterations by up to $61\%$, in addition to its flexibility in enabling snapshot ensembling and use with adversarial training.

General Classification Image Classification +2

Mixed Precision Quantization of ConvNets via Differentiable Neural Architecture Search

no code implementations ICLR 2019 Bichen Wu, Yanghan Wang, Peizhao Zhang, Yuandong Tian, Peter Vajda, Kurt Keutzer

Recent work in network quantization has substantially reduced the time and space complexity of neural network inference, enabling their deployment on embedded and mobile devices with limited computational and memory resources.

Neural Architecture Search Quantization

Synetgy: Algorithm-hardware Co-design for ConvNet Accelerators on Embedded FPGAs

1 code implementation21 Nov 2018 Yifan Yang, Qijing Huang, Bichen Wu, Tianjun Zhang, Liang Ma, Giulio Gambardella, Michaela Blott, Luciano Lavagno, Kees Vissers, John Wawrzynek, Kurt Keutzer

DiracDeltaNet achieves competitive accuracy on ImageNet (88. 7\% top-5), but with 42$\times$ fewer parameters and 48$\times$ fewer OPs than VGG16.

Identifying the Best Machine Learning Algorithms for Brain Tumor Segmentation, Progression Assessment, and Overall Survival Prediction in the BRATS Challenge

1 code implementation5 Nov 2018 Spyridon Bakas, Mauricio Reyes, Andras Jakab, Stefan Bauer, Markus Rempfler, Alessandro Crimi, Russell Takeshi Shinohara, Christoph Berger, Sung Min Ha, Martin Rozycki, Marcel Prastawa, Esther Alberts, Jana Lipkova, John Freymann, Justin Kirby, Michel Bilello, Hassan Fathallah-Shaykh, Roland Wiest, Jan Kirschke, Benedikt Wiestler, Rivka Colen, Aikaterini Kotrotsou, Pamela Lamontagne, Daniel Marcus, Mikhail Milchenko, Arash Nazeri, Marc-Andre Weber, Abhishek Mahajan, Ujjwal Baid, Elizabeth Gerstner, Dongjin Kwon, Gagan Acharya, Manu Agarwal, Mahbubul Alam, Alberto Albiol, Antonio Albiol, Francisco J. Albiol, Varghese Alex, Nigel Allinson, Pedro H. A. Amorim, Abhijit Amrutkar, Ganesh Anand, Simon Andermatt, Tal Arbel, Pablo Arbelaez, Aaron Avery, Muneeza Azmat, Pranjal B., W Bai, Subhashis Banerjee, Bill Barth, Thomas Batchelder, Kayhan Batmanghelich, Enzo Battistella, Andrew Beers, Mikhail Belyaev, Martin Bendszus, Eze Benson, Jose Bernal, Halandur Nagaraja Bharath, George Biros, Sotirios Bisdas, James Brown, Mariano Cabezas, Shilei Cao, Jorge M. Cardoso, Eric N Carver, Adrià Casamitjana, Laura Silvana Castillo, Marcel Catà, Philippe Cattin, Albert Cerigues, Vinicius S. Chagas, Siddhartha Chandra, Yi-Ju Chang, Shiyu Chang, Ken Chang, Joseph Chazalon, Shengcong Chen, Wei Chen, Jefferson W. Chen, Zhaolin Chen, Kun Cheng, Ahana Roy Choudhury, Roger Chylla, Albert Clérigues, Steven Colleman, Ramiro German Rodriguez Colmeiro, Marc Combalia, Anthony Costa, Xiaomeng Cui, Zhenzhen Dai, Lutao Dai, Laura Alexandra Daza, Eric Deutsch, Changxing Ding, Chao Dong, Shidu Dong, Wojciech Dudzik, Zach Eaton-Rosen, Gary Egan, Guilherme Escudero, Théo Estienne, Richard Everson, Jonathan Fabrizio, Yong Fan, Longwei Fang, Xue Feng, Enzo Ferrante, Lucas Fidon, Martin Fischer, Andrew P. French, Naomi Fridman, Huan Fu, David Fuentes, Yaozong Gao, Evan Gates, David Gering, Amir Gholami, Willi Gierke, Ben Glocker, Mingming Gong, Sandra González-Villá, T. Grosges, Yuanfang Guan, Sheng Guo, Sudeep Gupta, Woo-Sup Han, Il Song Han, Konstantin Harmuth, Huiguang He, Aura Hernández-Sabaté, Evelyn Herrmann, Naveen Himthani, Winston Hsu, Cheyu Hsu, Xiaojun Hu, Xiaobin Hu, Yan Hu, Yifan Hu, Rui Hua, Teng-Yi Huang, Weilin Huang, Sabine Van Huffel, Quan Huo, Vivek HV, Khan M. Iftekharuddin, Fabian Isensee, Mobarakol Islam, Aaron S. Jackson, Sachin R. Jambawalikar, Andrew Jesson, Weijian Jian, Peter Jin, V Jeya Maria Jose, Alain Jungo, B Kainz, Konstantinos Kamnitsas, Po-Yu Kao, Ayush Karnawat, Thomas Kellermeier, Adel Kermi, Kurt Keutzer, Mohamed Tarek Khadir, Mahendra Khened, Philipp Kickingereder, Geena Kim, Nik King, Haley Knapp, Urspeter Knecht, Lisa Kohli, Deren Kong, Xiangmao Kong, Simon Koppers, Avinash Kori, Ganapathy Krishnamurthi, Egor Krivov, Piyush Kumar, Kaisar Kushibar, Dmitrii Lachinov, Tryphon Lambrou, Joon Lee, Chengen Lee, Yuehchou Lee, M Lee, Szidonia Lefkovits, Laszlo Lefkovits, James Levitt, Tengfei Li, Hongwei Li, Hongyang Li, Xiaochuan Li, Yuexiang Li, Heng Li, Zhenye Li, Xiaoyu Li, Zeju Li, Xiaogang Li, Wenqi Li, Zheng-Shen Lin, Fengming Lin, Pietro Lio, Chang Liu, Boqiang Liu, Xiang Liu, Mingyuan Liu, Ju Liu, Luyan Liu, Xavier Llado, Marc Moreno Lopez, Pablo Ribalta Lorenzo, Zhentai Lu, Lin Luo, Zhigang Luo, Jun Ma, Kai Ma, Thomas Mackie, Anant Madabushi, Issam Mahmoudi, Klaus H. Maier-Hein, Pradipta Maji, CP Mammen, Andreas Mang, B. S. Manjunath, Michal Marcinkiewicz, S McDonagh, Stephen McKenna, Richard McKinley, Miriam Mehl, Sachin Mehta, Raghav Mehta, Raphael Meier, Christoph Meinel, Dorit Merhof, Craig Meyer, Robert Miller, Sushmita Mitra, Aliasgar Moiyadi, David Molina-Garcia, Miguel A. B. Monteiro, Grzegorz Mrukwa, Andriy Myronenko, Jakub Nalepa, Thuyen Ngo, Dong Nie, Holly Ning, Chen Niu, Nicholas K Nuechterlein, Eric Oermann, Arlindo Oliveira, Diego D. C. Oliveira, Arnau Oliver, Alexander F. I. Osman, Yu-Nian Ou, Sebastien Ourselin, Nikos Paragios, Moo Sung Park, Brad Paschke, J. Gregory Pauloski, Kamlesh Pawar, Nick Pawlowski, Linmin Pei, Suting Peng, Silvio M. Pereira, Julian Perez-Beteta, Victor M. Perez-Garcia, Simon Pezold, Bao Pham, Ashish Phophalia, Gemma Piella, G. N. Pillai, Marie Piraud, Maxim Pisov, Anmol Popli, Michael P. Pound, Reza Pourreza, Prateek Prasanna, Vesna Prkovska, Tony P. Pridmore, Santi Puch, Élodie Puybareau, Buyue Qian, Xu Qiao, Martin Rajchl, Swapnil Rane, Michael Rebsamen, Hongliang Ren, Xuhua Ren, Karthik Revanuru, Mina Rezaei, Oliver Rippel, Luis Carlos Rivera, Charlotte Robert, Bruce Rosen, Daniel Rueckert, Mohammed Safwan, Mostafa Salem, Joaquim Salvi, Irina Sanchez, Irina Sánchez, Heitor M. Santos, Emmett Sartor, Dawid Schellingerhout, Klaudius Scheufele, Matthew R. Scott, Artur A. Scussel, Sara Sedlar, Juan Pablo Serrano-Rubio, N. Jon Shah, Nameetha Shah, Mazhar Shaikh, B. Uma Shankar, Zeina Shboul, Haipeng Shen, Dinggang Shen, Linlin Shen, Haocheng Shen, Varun Shenoy, Feng Shi, Hyung Eun Shin, Hai Shu, Diana Sima, M Sinclair, Orjan Smedby, James M. Snyder, Mohammadreza Soltaninejad, Guidong Song, Mehul Soni, Jean Stawiaski, Shashank Subramanian, Li Sun, Roger Sun, Jiawei Sun, Kay Sun, Yu Sun, Guoxia Sun, Shuang Sun, Yannick R Suter, Laszlo Szilagyi, Sanjay Talbar, DaCheng Tao, Zhongzhao Teng, Siddhesh Thakur, Meenakshi H Thakur, Sameer Tharakan, Pallavi Tiwari, Guillaume Tochon, Tuan Tran, Yuhsiang M. Tsai, Kuan-Lun Tseng, Tran Anh Tuan, Vadim Turlapov, Nicholas Tustison, Maria Vakalopoulou, Sergi Valverde, Rami Vanguri, Evgeny Vasiliev, Jonathan Ventura, Luis Vera, Tom Vercauteren, C. A. Verrastro, Lasitha Vidyaratne, Veronica Vilaplana, Ajeet Vivekanandan, Qian Wang, Chiatse J. Wang, Wei-Chung Wang, Duo Wang, Ruixuan Wang, Yuanyuan Wang, Chunliang Wang, Guotai Wang, Ning Wen, Xin Wen, Leon Weninger, Wolfgang Wick, Shaocheng Wu, Qiang Wu, Yihong Wu, Yong Xia, Yanwu Xu, Xiaowen Xu, Peiyuan Xu, Tsai-Ling Yang, Xiaoping Yang, Hao-Yu Yang, Junlin Yang, Haojin Yang, Guang Yang, Hongdou Yao, Xujiong Ye, Changchang Yin, Brett Young-Moxon, Jinhua Yu, Xiangyu Yue, Songtao Zhang, Angela Zhang, Kun Zhang, Xue-jie Zhang, Lichi Zhang, Xiaoyue Zhang, Yazhuo Zhang, Lei Zhang, Jian-Guo Zhang, Xiang Zhang, Tianhao Zhang, Sicheng Zhao, Yu Zhao, Xiaomei Zhao, Liang Zhao, Yefeng Zheng, Liming Zhong, Chenhong Zhou, Xiaobing Zhou, Fan Zhou, Hongtu Zhu, Jin Zhu, Ying Zhuge, Weiwei Zong, Jayashree Kalpathy-Cramer, Keyvan Farahani, Christos Davatzikos, Koen van Leemput, Bjoern Menze

This study assesses the state-of-the-art machine learning (ML) methods used for brain tumor image analysis in mpMRI scans, during the last seven instances of the International Brain Tumor Segmentation (BraTS) challenge, i. e., 2012-2018.

Brain Tumor Segmentation Survival Prediction +1

A Novel Domain Adaptation Framework for Medical Image Segmentation

no code implementations11 Oct 2018 Amir Gholami, Shashank Subramanian, Varun Shenoy, Naveen Himthani, Xiangyu Yue, Sicheng Zhao, Peter Jin, George Biros, Kurt Keutzer

Our biophysics based domain adaptation achieves better results, as compared to the existing state-of-the-art GAN model used to create synthetic data for training.

Domain Adaptation Image Registration +2

Large batch size training of neural networks with adversarial training and second-order information

1 code implementation ICLR 2019 Zhewei Yao, Amir Gholami, Daiyaan Arfeen, Richard Liaw, Joseph Gonzalez, Kurt Keutzer, Michael Mahoney

Our method exceeds the performance of existing solutions in terms of both accuracy and the number of SGD iterations (up to 1\% and $5\times$, respectively).

Second-order methods

Unsupervised Domain Adaptation: from Simulation Engine to the RealWorld

no code implementations24 Mar 2018 Sicheng Zhao, Bichen Wu, Joseph Gonzalez, Sanjit A. Seshia, Kurt Keutzer

To cope with limited labeled training data, many have attempted to directly apply models trained on a large-scale labeled source domain to another sparsely labeled target domain.

Unsupervised Domain Adaptation

SqueezeNext: Hardware-Aware Neural Network Design

5 code implementations23 Mar 2018 Amir Gholami, Kiseok Kwon, Bichen Wu, Zizheng Tai, Xiangyu Yue, Peter Jin, Sicheng Zhao, Kurt Keutzer

One of the main barriers for deploying neural networks on embedded systems has been large memory and power consumption of existing neural networks.

Hessian-based Analysis of Large Batch Training and Robustness to Adversaries

6 code implementations NeurIPS 2018 Zhewei Yao, Amir Gholami, Qi Lei, Kurt Keutzer, Michael W. Mahoney

Extensive experiments on multiple networks show that saddle-points are not the cause for generalization gap of large batch size training, and the results consistently show that large batch converges to points with noticeably higher Hessian spectrum.

Integrated Model, Batch and Domain Parallelism in Training Neural Networks

no code implementations12 Dec 2017 Amir Gholami, Ariful Azad, Peter Jin, Kurt Keutzer, Aydin Buluc

We propose a new integrated method of exploiting model, batch and domain parallelism for the training of deep neural networks (DNNs) on large distributed-memory computers using minibatch stochastic gradient descent (SGD).

Regret Minimization for Partially Observable Deep Reinforcement Learning

1 code implementation ICML 2018 Peter Jin, Kurt Keutzer, Sergey Levine

Deep reinforcement learning algorithms that estimate state and state-action value functions have been shown to be effective in a variety of challenging domains, including learning control strategies from raw image pixels.


Keynote: Small Neural Nets Are Beautiful: Enabling Embedded Systems with Small Deep-Neural-Network Architectures

no code implementations7 Oct 2017 Forrest Iandola, Kurt Keutzer

Over the last five years Deep Neural Nets have offered more accurate solutions to many problems in speech recognition, and computer vision, and these solutions have surpassed a threshold of acceptability for many applications.

Speech Recognition

ImageNet Training in Minutes

1 code implementation14 Sep 2017 Yang You, Zhao Zhang, Cho-Jui Hsieh, James Demmel, Kurt Keutzer

If we can make full use of the supercomputer for DNN training, we should be able to finish the 90-epoch ResNet-50 training in one minute.

SqueezeDet: Unified, Small, Low Power Fully Convolutional Neural Networks for Real-Time Object Detection for Autonomous Driving

13 code implementations4 Dec 2016 Bichen Wu, Alvin Wan, Forrest Iandola, Peter H. Jin, Kurt Keutzer

In addition to requiring high accuracy to ensure safety, object detection for autonomous driving also requires real-time inference speed to guarantee prompt vehicle control, as well as small model size and energy efficiency to enable embedded system deployment.

Autonomous Driving object-detection +1

A Metaprogramming and Autotuning Framework for Deploying Deep Learning Applications

no code implementations21 Nov 2016 Matthew W. Moskewicz, Ali Jannesari, Kurt Keutzer

On Qualcomm GPUs, we show that our framework enables productive development of target-specific optimizations, and achieves reasonable absolute performance.

How to scale distributed deep learning?

no code implementations14 Nov 2016 Peter H. Jin, Qiaochu Yuan, Forrest Iandola, Kurt Keutzer

Training time on large datasets for deep neural networks is the principal workflow bottleneck in a number of important applications of deep learning, such as object classification and detection in automatic driver assistance systems (ADAS).

General Classification

Shallow Networks for High-Accuracy Road Object-Detection

no code implementations5 Jun 2016 Khalid Ashraf, Bichen Wu, Forrest N. Iandola, Mattthew W. Moskewicz, Kurt Keutzer

The ability to automatically detect other vehicles on the road is vital to the safety of partially-autonomous and fully-autonomous vehicles.

Autonomous Vehicles object-detection +1

Boda-RTC: Productive Generation of Portable, Efficient Code for Convolutional Neural Networks on Mobile Computing Platforms

1 code implementation1 Jun 2016 Matthew Moskewicz, Forrest Iandola, Kurt Keutzer

Results are presented for a case study of targeting the Qualcomm Snapdragon 820 mobile computing platform for CNN deployment.

Code Generation

Convolutional Monte Carlo Rollouts in Go

no code implementations10 Dec 2015 Peter H. Jin, Kurt Keutzer

In this work, we present a MCTS-based Go-playing program which uses convolutional networks in all parts.

FireCaffe: near-linear acceleration of deep neural network training on compute clusters

no code implementations CVPR 2016 Forrest N. Iandola, Khalid Ashraf, Matthew W. Moskewicz, Kurt Keutzer

Therefore, the key consideration here is to reduce communication overhead wherever possible, while not degrading the accuracy of the DNN models that we train.

Image Classification

DeepLogo: Hitting Logo Recognition with the Deep Neural Network Hammer

no code implementations7 Oct 2015 Forrest N. Iandola, Anting Shen, Peter Gao, Kurt Keutzer

Recently, there has been a flurry of industrial activity around logo recognition, such as Ditto's service for marketers to track their brands in user-generated images, and LogoGrab's mobile app platform for logo recognition.

2D object detection Image Classification +3

libHOG: Energy-Efficient Histogram of Oriented Gradient Computation

1 code implementation ITSC 2015 Forrest Iandola, Matthew Moskewicz, Kurt Keutzer

Histogram of Oriented Gradients (HOG) features are the underlying representation in automotive computer vision applications such as collision avoidance and lane keeping.

Object Detection

Cannot find the paper you are looking for? You can Submit a new open access paper.