Search Results for author: Alexey Tumanov

Found 12 papers, 7 papers with code

Signed Binary Weight Networks: Improving Efficiency of Binary Weight Networks by Exploiting Sparsity

no code implementations 25 Nov 2022 Sachit Kuhar, Alexey Tumanov, Judy Hoffman

We propose a new method called signed-binary networks to further improve efficiency (by exploiting both weight sparsity and weight repetition) while maintaining similar accuracy.

Binarization

UnfoldML: Cost-Aware and Uncertainty-Based Dynamic 2D Prediction for Multi-Stage Classification

no code implementations 26 Oct 2022 Yanbo Xu, Alind Khare, Glenn Matlin, Monish Ramadoss, Rishikesan Kamaleswaran, Chao Zhang, Alexey Tumanov

It achieves within 0.1% accuracy from the highest-performing multi-class baseline, while saving close to 20X on spatio-temporal cost of inference and earlier (3.5 hrs) disease onset prediction.

Image Classification

CompOFA: Compound Once-For-All Networks for Faster Multi-Platform Deployment

1 code implementation 26 Apr 2021 Manas Sahni, Shreya Varshini, Alind Khare, Alexey Tumanov

The emergence of CNNs in mainstream deployment has necessitated methods to design and train efficient architectures tailored to maximize the accuracy under diverse hardware & latency constraints.

CompOFA – Compound Once-For-All Networks for Faster Multi-Platform Deployment

2 code implementations ICLR 2021 Manas Sahni, Shreya Varshini, Alind Khare, Alexey Tumanov

The emergence of CNNs in mainstream deployment has necessitated methods to design and train efficient architectures tailored to maximize the accuracy under diverse hardware & latency constraints.

HOLMES: Health OnLine Model Ensemble Serving for Deep Learning Models in Intensive Care Units

2 code implementations 10 Aug 2020 Shenda Hong, Yanbo Xu, Alind Khare, Satria Priambada, Kevin Maher, Alaa Aljiffry, Jimeng Sun, Alexey Tumanov

HOLMES is tested on a risk prediction task on pediatric cardio ICU data, with above 95% prediction accuracy and sub-second latency on a 64-bed simulation.

Navigate

HyperSched: Dynamic Resource Reallocation for Model Development on a Deadline

no code implementations 8 Jan 2020 Richard Liaw, Romil Bhardwaj, Lisa Dunlap, Yitian Zou, Joseph Gonzalez, Ion Stoica, Alexey Tumanov

Prior research in resource scheduling for machine learning training workloads has largely focused on minimizing job completion times.

The OoO VLIW JIT Compiler for GPU Inference

no code implementations 28 Jan 2019 Paras Jain, Xiangxi Mo, Ajay Jain, Alexey Tumanov, Joseph E. Gonzalez, Ion Stoica

Current trends in Machine Learning (ML) inference on hardware accelerated devices (e.g., GPUs, TPUs) point to alarmingly low utilization.

Serverless Computing: One Step Forward, Two Steps Back

3 code implementations 10 Dec 2018 Joseph M. Hellerstein, Jose Faleiro, Joseph E. Gonzalez, Johann Schleier-Smith, Vikram Sreekanti, Alexey Tumanov, Chenggang Wu

Serverless computing offers the potential to program the cloud in an autoscaling, pay-as-you-go manner.

Distributed, Parallel, and Cluster Computing Databases

InferLine: ML Inference Pipeline Composition Framework

1 code implementation 5 Dec 2018 Daniel Crankshaw, Gur-Eyal Sela, Corey Zumar, Xiangxi Mo, Joseph E. Gonzalez, Ion Stoica, Alexey Tumanov

The dominant cost in production machine learning workloads is not training individual models but serving predictions from increasingly complex prediction pipelines spanning multiple models, machine learning frameworks, and parallel hardware accelerators.

Distributed, Parallel, and Cluster Computing

Ray: A Distributed Framework for Emerging AI Applications

4 code implementations 16 Dec 2017 Philipp Moritz, Robert Nishihara, Stephanie Wang, Alexey Tumanov, Richard Liaw, Eric Liang, Melih Elibol, Zongheng Yang, William Paul, Michael I. Jordan, Ion Stoica

To meet the performance requirements, Ray employs a distributed scheduler and a distributed and fault-tolerant store to manage the system's control state.

reinforcement-learning

IDK Cascades: Fast Deep Learning by Learning not to Overthink

no code implementations 3 Jun 2017 Xin Wang, Yujia Luo, Daniel Crankshaw, Alexey Tumanov, Fisher Yu, Joseph E. Gonzalez

Advances in deep learning have led to substantial increases in prediction accuracy but have been accompanied by increases in the cost of rendering predictions.

Dialogue Generation

Real-Time Machine Learning: The Missing Pieces

2 code implementations 11 Mar 2017 Robert Nishihara, Philipp Moritz, Stephanie Wang, Alexey Tumanov, William Paul, Johann Schleier-Smith, Richard Liaw, Mehrdad Niknami, Michael I. Jordan, Ion Stoica

Machine learning applications are increasingly deployed not only to serve predictions using static models, but also as tightly-integrated components of feedback loops involving dynamic, real-time decision making.

BIG-bench Machine Learning Decision Making
