Search Results for author: Michael Ferdman

Found 3 papers, 0 papers with code

On the Distribution, Sparsity, and Inference-time Quantization of Attention Values in Transformers

no code implementations • Findings (ACL) 2021 • Tianchu Ji, Shraddhan Jain, Michael Ferdman, Peter Milder, H. Andrew Schwartz, Niranjan Balasubramanian

This informs the design of an inference-time quantization technique using both pruning and log-scaled mapping which produces only a few (e.g., $2^3$) unique values.

Quantization Question Answering +1
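
The summary above names two ingredients, pruning and a log-scaled mapping, without giving details. As a rough illustration only, the sketch below shows one way such a scheme could look. The function name `quantize_attention`, the pruning `threshold`, and the choice of evenly spaced levels in log2 space are all assumptions made for this example and are not taken from the paper.

```python
import math

def quantize_attention(probs, bits=3, threshold=1e-3):
    """Hypothetical sketch: prune small attention values to 0, then snap
    the survivors to levels spaced evenly in log2 space.

    The result contains at most 2**bits distinct values: the pruned
    value 0 plus (2**bits - 1) log-spaced levels in [threshold, 1].
    """
    n_levels = 2 ** bits - 1               # one code is reserved for 0
    log_lo, log_hi = math.log2(threshold), 0.0
    step = (log_hi - log_lo) / (n_levels - 1)

    out = []
    for p in probs:
        if p < threshold:                  # pruning step (assumed cutoff)
            out.append(0.0)
            continue
        # Index of the nearest log-spaced level, clamped to the valid range.
        idx = round((math.log2(p) - log_lo) / step)
        idx = min(max(idx, 0), n_levels - 1)
        out.append(2.0 ** (log_lo + idx * step))
    return out

if __name__ == "__main__":
    row = [0.0, 0.0005, 0.001, 0.01, 0.1, 0.5, 1.0]
    print(quantize_attention(row))
```

With `bits=3`, every input maps to one of at most eight values, which matches the $2^3$ figure quoted in the summary. The actual level placement and pruning rule used by the authors may differ.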


Medusa: A Scalable Interconnect for Many-Port DNN Accelerators and Wide DRAM Controller Interfaces

no code implementations • 11 Jul 2018 • Yongming Shen, Tianchu Ji, Michael Ferdman, Peter Milder

To cope with the increasing demand and computational intensity of deep neural networks (DNNs), industry and academia have turned to accelerator technologies.


Maximizing CNN Accelerator Efficiency Through Resource Partitioning

no code implementations • 30 Jun 2016 • Yongming Shen, Michael Ferdman, Peter Milder

Current approaches construct a single processor that computes the CNN layers one at a time; the processor is optimized to maximize the throughput at which the collection of layers is computed.

Hardware Architecture

