Search Results for author: Suyog Gupta

Found 10 papers, 6 papers with code

Learning Machines Implemented on Non-Deterministic Hardware

no code implementations • 9 Sep 2014 • Suyog Gupta, Vikas Sindhwani, Kailash Gopalakrishnan

This paper highlights new opportunities for designing large-scale machine learning systems as a consequence of blurring traditional boundaries that have allowed algorithm designers and application-level practitioners to stay -- for the most part -- oblivious to the details of the underlying hardware-level implementations.

BIG-bench Machine Learning

Deep Learning with Limited Numerical Precision

2 code implementations • 9 Feb 2015 • Suyog Gupta, Ankur Agrawal, Kailash Gopalakrishnan, Pritish Narayanan

Training of large-scale deep neural networks is often constrained by the available computational resources.

General Classification
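The abstract snippet above is only the motivation; the paper itself is best known for training deep networks with low-precision fixed-point arithmetic and stochastic rounding. A minimal NumPy sketch of unbiased stochastic rounding to a fixed-point grid (the bit widths and function name are illustrative, not taken from the paper's code):

```python
import numpy as np

def stochastic_round_fixed_point(x, frac_bits=8, int_bits=7):
    """Round x to a fixed-point grid with step 2**-frac_bits.

    Each value is rounded up with probability equal to its distance
    from the lower grid point, so the rounding is unbiased in expectation.
    """
    scale = 2.0 ** frac_bits
    scaled = x * scale
    floor = np.floor(scaled)
    prob_up = scaled - floor                      # distance to the lower grid point
    rounded = floor + (np.random.rand(*x.shape) < prob_up)
    # Saturate to the representable range of a signed fixed-point format.
    limit = 2.0 ** int_bits
    return np.clip(rounded / scale, -limit, limit - 1.0 / scale)

# Example: quantize a random weight matrix.
w = np.random.randn(4, 4)
w_q = stochastic_round_fixed_point(w)
```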

Model Accuracy and Runtime Tradeoff in Distributed Deep Learning: A Systematic Study

1 code implementation • 14 Sep 2015 • Suyog Gupta, Wei Zhang, Fei Wang

This paper presents Rudra, a parameter server based distributed computing framework tuned for training large-scale deep neural networks.

Distributed Computing · Image Classification
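Since the snippet describes Rudra as a parameter-server-based framework, a single-process sketch of the parameter-server pattern may help; the class and function names below are illustrative and are not Rudra's API:

```python
import numpy as np

class ParameterServer:
    """Holds the global model and applies gradients pushed by learners."""

    def __init__(self, dim, lr=0.01):
        self.weights = np.zeros(dim)
        self.lr = lr

    def push(self, grad):
        # Apply a learner's gradient to the global weights.
        self.weights -= self.lr * grad

    def pull(self):
        # Return a copy of the current global weights.
        return self.weights.copy()

def learner_step(server, data, labels):
    """One learner iteration: pull weights, compute a gradient, push it back."""
    w = server.pull()
    preds = data @ w
    grad = data.T @ (preds - labels) / len(labels)   # least-squares gradient
    server.push(grad)

# Toy run: two mini-batches per loop standing in for two learners.
rng = np.random.default_rng(0)
server = ParameterServer(dim=3)
for _ in range(100):
    X = rng.normal(size=(8, 3))
    y = X @ np.array([1.0, -2.0, 0.5])
    learner_step(server, X, y)
```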

Staleness-aware Async-SGD for Distributed Deep Learning

1 code implementation • 18 Nov 2015 • Wei Zhang, Suyog Gupta, Xiangru Lian, Ji Liu

Deep neural networks have been shown to achieve state-of-the-art performance in several machine learning tasks.

Distributed Computing · Image Classification
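The paper's central idea is to modulate the learning rate by each gradient's staleness, roughly dividing the base rate by the number of global updates that occurred since the gradient was computed. A toy sketch under that reading, with illustrative names:

```python
import numpy as np

def apply_stale_gradient(weights, grad, step_now, step_computed, base_lr=0.1):
    """Scale the learning rate down by the gradient's staleness.

    staleness = number of global updates since the gradient was computed;
    fresher gradients are applied with (close to) the full base_lr.
    """
    staleness = max(1, step_now - step_computed)
    lr = base_lr / staleness
    return weights - lr * grad

# Example: a gradient computed at global step 90, applied at step 100.
w = apply_stale_gradient(np.ones(3), np.array([0.2, -0.1, 0.05]),
                         step_now=100, step_computed=90)
```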

To prune, or not to prune: exploring the efficacy of pruning for model compression

4 code implementations • ICLR 2018 • Michael Zhu, Suyog Gupta

Model pruning seeks to induce sparsity in a deep neural network's various connection matrices, thereby reducing the number of nonzero-valued parameters in the model.

Model Compression
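A minimal sketch of gradual magnitude pruning in the spirit of this paper: a cubic sparsity ramp plus a threshold on weight magnitudes. The step counts and target sparsity are illustrative defaults, not values from the paper:

```python
import numpy as np

def target_sparsity(step, s_init=0.0, s_final=0.9, begin_step=0, end_step=10000):
    """Cubic sparsity ramp for gradual pruning (clipped outside the ramp)."""
    t = np.clip((step - begin_step) / (end_step - begin_step), 0.0, 1.0)
    return s_final + (s_init - s_final) * (1.0 - t) ** 3

def magnitude_prune(weights, sparsity):
    """Zero out the smallest-magnitude fraction `sparsity` of the weights."""
    k = int(sparsity * weights.size)
    if k == 0:
        return weights
    threshold = np.sort(np.abs(weights), axis=None)[k - 1]
    mask = np.abs(weights) > threshold
    return weights * mask

# Example: prune a random weight matrix to the sparsity targeted at step 5000.
w = np.random.randn(64, 64)
w_pruned = magnitude_prune(w, target_sparsity(5000))
```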

Lingvo: a Modular and Scalable Framework for Sequence-to-Sequence Modeling

2 code implementations • 21 Feb 2019 • Jonathan Shen, Patrick Nguyen, Yonghui Wu, Zhifeng Chen, Mia X. Chen, Ye Jia, Anjuli Kannan, Tara Sainath, Yuan Cao, Chung-Cheng Chiu, Yanzhang He, Jan Chorowski, Smit Hinsu, Stella Laurenzo, James Qin, Orhan Firat, Wolfgang Macherey, Suyog Gupta, Ankur Bapna, Shuyuan Zhang, Ruoming Pang, Ron J. Weiss, Rohit Prabhavalkar, Qiao Liang, Benoit Jacob, Bowen Liang, HyoukJoong Lee, Ciprian Chelba, Sébastien Jean, Bo Li, Melvin Johnson, Rohan Anil, Rajat Tibrewal, Xiaobing Liu, Akiko Eriguchi, Navdeep Jaitly, Naveen Ari, Colin Cherry, Parisa Haghani, Otavio Good, Youlong Cheng, Raziel Alvarez, Isaac Caswell, Wei-Ning Hsu, Zongheng Yang, Kuan-Chieh Wang, Ekaterina Gonina, Katrin Tomanek, Ben Vanik, Zelin Wu, Llion Jones, Mike Schuster, Yanping Huang, Dehao Chen, Kazuki Irie, George Foster, John Richardson, Klaus Macherey, Antoine Bruguier, Heiga Zen, Colin Raffel, Shankar Kumar, Kanishka Rao, David Rybach, Matthew Murray, Vijayaditya Peddinti, Maxim Krikun, Michiel A. U. Bacchiani, Thomas B. Jablin, Rob Suderman, Ian Williams, Benjamin Lee, Deepti Bhatia, Justin Carlson, Semih Yavuz, Yu Zhang, Ian McGraw, Max Galkin, Qi Ge, Golan Pundak, Chad Whipkey, Todd Wang, Uri Alon, Dmitry Lepikhin, Ye Tian, Sara Sabour, William Chan, Shubham Toshniwal, Baohua Liao, Michael Nirschl, Pat Rondon

Lingvo is a TensorFlow framework offering a complete solution for collaborative deep learning research, with a particular focus on sequence-to-sequence models.

Sequence-To-Sequence · Speech Recognition

Accelerator-aware Neural Network Design using AutoML

no code implementations • 5 Mar 2020 • Suyog Gupta, Berkin Akin

While neural network hardware accelerators provide a substantial amount of raw compute throughput, the models deployed on them must be co-designed for the underlying hardware architecture to obtain the optimal system performance.

Hardware Aware Neural Architecture Search · Image Classification · +1
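One common way to fold measured accelerator latency into an architecture-search objective is a soft latency penalty on the reward (a MnasNet-style formulation; this is an assumption for illustration, not necessarily the exact reward used in this paper):

```python
def latency_aware_reward(accuracy, latency_ms, target_ms=7.0, beta=-0.07):
    """Scale accuracy by a soft latency penalty: reward = acc * (lat/target)**beta.

    beta < 0 penalizes models slower than the target and mildly rewards
    faster ones; the exponent and target latency are tuning choices.
    """
    return accuracy * (latency_ms / target_ms) ** beta

# Example: a model at 75% top-1 accuracy and 9 ms measured on the accelerator.
print(latency_aware_reward(0.75, 9.0))
```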

MobileDets: Searching for Object Detection Architectures for Mobile Accelerators

4 code implementations • CVPR 2021 • Yunyang Xiong, Hanxiao Liu, Suyog Gupta, Berkin Akin, Gabriel Bender, Yongzhe Wang, Pieter-Jan Kindermans, Mingxing Tan, Vikas Singh, Bo Chen

By incorporating regular convolutions in the search space and directly optimizing the network architectures for object detection, we obtain a family of object detection models, MobileDets, that achieve state-of-the-art results across mobile accelerators.

Neural Architecture Search · Object · +2
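To illustrate the search-space idea in the snippet above, here is a toy Keras sketch contrasting a regular convolution block with an inverted-bottleneck block; it is not the paper's actual search-space definition, and all names and sizes are illustrative:

```python
import tensorflow as tf

def inverted_bottleneck(x, filters, expansion=4, stride=1):
    """Depthwise-separable block typical of mobile-CPU search spaces."""
    mid = x.shape[-1] * expansion
    y = tf.keras.layers.Conv2D(mid, 1, padding="same", activation="relu")(x)
    y = tf.keras.layers.DepthwiseConv2D(3, strides=stride, padding="same",
                                        activation="relu")(y)
    return tf.keras.layers.Conv2D(filters, 1, padding="same")(y)

def regular_conv_block(x, filters, stride=1):
    """Full (regular) convolution block; often better utilized on accelerators."""
    return tf.keras.layers.Conv2D(filters, 3, strides=stride, padding="same",
                                  activation="relu")(x)

# A search would choose per-layer between block families like these
# and score each candidate by detection accuracy and on-device latency.
inputs = tf.keras.Input(shape=(320, 320, 3))
x = regular_conv_block(inputs, 32, stride=2)
x = inverted_bottleneck(x, 64, stride=2)
model = tf.keras.Model(inputs, x)
```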

Discovering Multi-Hardware Mobile Models via Architecture Search

no code implementations • 18 Aug 2020 • Grace Chu, Okan Arikan, Gabriel Bender, Weijun Wang, Achille Brighton, Pieter-Jan Kindermans, Hanxiao Liu, Berkin Akin, Suyog Gupta, Andrew Howard

Hardware-aware neural architecture design has predominantly focused on optimizing model performance for a single hardware target and on model development complexity, while another important factor, model deployment complexity, has been largely ignored.

Neural Architecture Search
