no code implementations • 2 Feb 2024 • Minghao Yan, Saurabh Agarwal, Shivaram Venkataraman
However, our experiments indicate the contrary, with throughput diminishing as the probability that generated tokens are accepted by the target model increases.
no code implementations • 30 Oct 2023 • Minghao Yan, Hongyi Wang, Shivaram Venkataraman
As neural networks (NNs) are deployed across diverse sectors, their energy demand grows correspondingly.
no code implementations • 29 Jan 2022 • Minghao Yan, Nicholas Meisburger, Tharun Medini, Anshumali Shrivastava
We show that, with communication reduced due to sparsity, we can train a model with close to a billion parameters on simple 4-16-core CPU nodes connected by a basic low-bandwidth interconnect.
no code implementations • 15 Jun 2021 • Zhaozhuo Xu, Minghao Yan, Junyan Zhang, Anshumali Shrivastava
The dot-product self-attention in the Transformer allows us to model interactions between words.
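The interaction modeling referred to here can be sketched in a few lines; this is a minimal illustration of dot-product self-attention, not the paper's method, and it simplifies by using the raw embeddings as queries, keys, and values (a real Transformer applies learned linear projections first):

```python
import numpy as np

def dot_product_self_attention(X):
    """Minimal dot-product self-attention over token embeddings X of shape (n, d).

    Simplification (assumption): queries, keys, and values are all X itself.
    """
    n, d = X.shape
    scores = X @ X.T / np.sqrt(d)                 # pairwise word-word interaction scores
    scores -= scores.max(axis=1, keepdims=True)   # subtract row max for numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True) # softmax over each row
    return weights @ X                            # each word becomes a weighted mix of all words

# Usage: 4 "words" with 8-dimensional embeddings
X = np.random.default_rng(0).standard_normal((4, 8))
out = dot_product_self_attention(X)
```

Each output row is a convex combination of all input rows, which is exactly why every word can attend to every other word in a single layer.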
1 code implementation • 10 Oct 2019 • Gaurav Gupta, Minghao Yan, Benjamin Coleman, Bryce Kille, R. A. Leo Elworth, Tharun Medini, Todd Treangen, Anshumali Shrivastava
Interestingly, it is a count-min-sketch-style arrangement of a membership-testing utility (a Bloom filter, in our case).
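A count-min-sketch-style arrangement of Bloom filters can be sketched as follows. This is a hypothetical illustration, not the paper's implementation: the class names, parameters (`B` groups, `R` repetitions), and sizing are assumptions chosen for clarity.

```python
import hashlib
import random

class BloomFilter:
    """Basic Bloom filter for set-membership testing."""
    def __init__(self, m=1024, k=3):
        self.m, self.k = m, k
        self.bits = [False] * m

    def _hashes(self, item):
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode()).hexdigest()
            yield int(digest, 16) % self.m

    def add(self, item):
        for h in self._hashes(item):
            self.bits[h] = True

    def __contains__(self, item):
        return all(self.bits[h] for h in self._hashes(item))

class CountMinOfBlooms:
    """Hypothetical count-min-sketch-style layout: R independent random
    partitions of n_sets sets into B groups, with one Bloom filter per group.
    Querying intersects the matching groups across repetitions."""
    def __init__(self, n_sets, B=4, R=3, seed=0):
        rng = random.Random(seed)
        # group_of[r][s] = which of the B cells set s maps to in repetition r
        self.group_of = [[rng.randrange(B) for _ in range(n_sets)] for _ in range(R)]
        self.tables = [[BloomFilter() for _ in range(B)] for _ in range(R)]
        self.n_sets = n_sets

    def insert(self, item, set_id):
        for r in range(len(self.tables)):
            self.tables[r][self.group_of[r][set_id]].add(item)

    def query(self, item):
        # Intersecting candidates across repetitions shrinks false positives,
        # exactly as repeated hashing does in a count-min sketch.
        candidates = set(range(self.n_sets))
        for r, table in enumerate(self.tables):
            hits = {s for s in range(self.n_sets)
                    if item in table[self.group_of[r][s]]}
            candidates &= hits
        return candidates

# Usage: index the k-mer "ACGT" as belonging to set 2 out of 8 sets
cm = CountMinOfBlooms(n_sets=8)
cm.insert("ACGT", set_id=2)
result = cm.query("ACGT")  # set 2 is always among the candidates
```

The design trade-off mirrors a count-min sketch: storage grows with B × R rather than with the number of sets, while the intersection across R independent partitions drives the false-positive rate down geometrically.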