Search Results for author: Yaman Umuroglu

Found 12 papers, 8 papers with code

FINN: A Framework for Fast, Scalable Binarized Neural Network Inference

4 code implementations • 1 Dec 2016 • Yaman Umuroglu, Nicholas J. Fraser, Giulio Gambardella, Michaela Blott, Philip Leong, Magnus Jahre, Kees Vissers

Research has shown that convolutional neural networks contain significant redundancy, and high classification accuracy can be obtained even when weights and activations are reduced from floating point to binary values.

General Classification

Streamlined Deployment for Quantized Neural Networks

1 code implementation • 12 Sep 2017 • Yaman Umuroglu, Magnus Jahre

Quantized Neural Networks (QNNs) have emerged as a potential solution to this problem, promising to offer most of the DNN accuracy benefits with much lower computational cost.

BISMO: A Scalable Bit-Serial Matrix Multiplication Overlay for Reconfigurable Computing

1 code implementation • 22 Jun 2018 • Yaman Umuroglu, Lahiru Rasnayake, Magnus Sjalander

BISMO utilizes the excellent binary-operation performance of FPGAs to offer a matrix multiplication performance that scales with required precision and parallelism.

Hardware Architecture
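
As a rough illustration of the bit-serial idea named in the BISMO entry above, the sketch below splits two unsigned integer matrices into bit planes, multiplies each pair of binary planes (AND followed by a count of set bits), and sums the partial products weighted by powers of two. It is a plain-Python sketch of the arithmetic only, not BISMO's hardware design; the function names (`bit_plane`, `binary_matmul`, `bit_serial_matmul`) and the restriction to unsigned operands are assumptions made for this example.

```python
def bit_plane(matrix, bit):
    """Return the 0/1 matrix holding the given bit position of every entry."""
    return [[(x >> bit) & 1 for x in row] for row in matrix]


def binary_matmul(a, b):
    """Multiply two 0/1 matrices: each output is a count of ANDed bit pairs."""
    b_columns = list(zip(*b))
    return [[sum(x & y for x, y in zip(row, col)) for col in b_columns]
            for row in a]


def bit_serial_matmul(a, b, a_bits, b_bits):
    """Multiply unsigned integer matrices one bit-plane pair at a time.

    Uses A = sum_i 2^i * A_i and B = sum_j 2^j * B_j, so that
    A @ B = sum_{i,j} 2^(i+j) * (A_i @ B_j).
    """
    rows, cols = len(a), len(b[0])
    acc = [[0] * cols for _ in range(rows)]
    for i in range(a_bits):
        a_plane = bit_plane(a, i)
        for j in range(b_bits):
            partial = binary_matmul(a_plane, bit_plane(b, j))
            for r in range(rows):
                for c in range(cols):
                    # Weight the binary partial product by 2^(i+j).
                    acc[r][c] += partial[r][c] << (i + j)
    return acc


# Hypothetical 2-bit operands chosen for illustration.
A = [[3, 1], [2, 0]]
B = [[1, 2], [3, 0]]
result = bit_serial_matmul(A, B, a_bits=2, b_bits=2)
```

In this sketch the number of binary matrix products is `a_bits * b_bits`, so the work grows with the operand precision that is requested.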

Optimizing Bit-Serial Matrix Multiplication for Reconfigurable Computing

1 code implementation • 2 Jan 2019 • Yaman Umuroglu, Davide Conficconi, Lahiru Rasnayake, Thomas B. Preusser, Magnus Sjalander

BISMO, a vectorized bit-serial matrix multiplication overlay for reconfigurable computing, previously utilized the excellent binary-operation performance of FPGAs to offer a matrix multiplication performance that scales with required precision and parallelism.

Hardware Architecture

QONNX: Representing Arbitrary-Precision Quantized Neural Networks

1 code implementation • 15 Jun 2022 • Alessandro Pappalardo, Yaman Umuroglu, Michaela Blott, Jovan Mitrevski, Ben Hawks, Nhan Tran, Vladimir Loncar, Sioni Summers, Hendrik Borras, Jules Muhizi, Matthew Trahms, Shih-Chieh Hsu, Scott Hauck, Javier Duarte

We present extensions to the Open Neural Network Exchange (ONNX) intermediate representation format to represent arbitrary-precision quantized neural networks.

Quantization

Ps and Qs: Quantization-aware pruning for efficient low latency neural network inference

1 code implementation • 22 Feb 2021 • Benjamin Hawks, Javier Duarte, Nicholas J. Fraser, Alessandro Pappalardo, Nhan Tran, Yaman Umuroglu

We study various configurations of pruning during quantization-aware training, which we term quantization-aware pruning, and the effect of techniques like regularization, batch normalization, and different pruning schemes on performance, computational complexity, and information content metrics.

Bayesian Optimization, Computational Efficiency, +2

Scaling Binarized Neural Networks on Reconfigurable Logic

no code implementations • 12 Jan 2017 • Nicholas J. Fraser, Yaman Umuroglu, Giulio Gambardella, Michaela Blott, Philip Leong, Magnus Jahre, Kees Vissers

Binarized neural networks (BNNs) are gaining interest in the deep learning community due to their significantly lower computational and memory cost.

General Classification

FINN-R: An End-to-End Deep-Learning Framework for Fast Exploration of Quantized Neural Networks

no code implementations • 12 Sep 2018 • Michaela Blott, Thomas Preusser, Nicholas Fraser, Giulio Gambardella, Kenneth O'Brien, Yaman Umuroglu

Given a neural network description, the tool optimizes for given platforms, design targets and a specific precision.

Hardware Architecture

LogicNets: Co-Designed Neural Networks and Circuits for Extreme-Throughput Applications

no code implementations • 6 Apr 2020 • Yaman Umuroglu, Yash Akhauri, Nicholas J. Fraser, Michaela Blott

Deployment of deep neural networks for applications that require very high throughput or extremely low latency is a severe computational challenge, further exacerbated by inefficiencies in mapping the computation to hardware.

Network Intrusion Detection, Quantization

A2Q+: Improving Accumulator-Aware Weight Quantization

no code implementations • 19 Jan 2024 • Ian Colbert, Alessandro Pappalardo, Jakoba Petri-Koenig, Yaman Umuroglu

Recent studies show that also reducing the precision of the accumulator can further improve hardware efficiency at the risk of numerical overflow, which introduces arithmetic errors that can degrade model accuracy.

Quantization
