Search Results for author: Minghao Yan

Found 5 papers, 2 papers with code

Decoding Speculative Decoding

1 code implementation2 Feb 2024 Minghao Yan, Saurabh Agarwal, Shivaram Venkataraman

Speculative Decoding is a widely used technique to speed up inference for Large Language Models (LLMs) without sacrificing quality.

Language Modelling

Distributed SLIDE: Enabling Training Large Neural Networks on Low Bandwidth and Simple CPU-Clusters via Model Parallelism and Sparsity

no code implementations29 Jan 2022 Minghao Yan, Nicholas Meisburger, Tharun Medini, Anshumali Shrivastava

We show that with reduced communication, due to sparsity, we can train close to a billion parameter model on simple 4-16 core CPU nodes connected by basic low bandwidth interconnect.

Cloud Computing

Cannot find the paper you are looking for? You can Submit a new open access paper.