Search Results for author: Yaman Umuroglu

Found 11 papers, 7 papers with code

QONNX: Representing Arbitrary-Precision Quantized Neural Networks

1 code implementation • 15 Jun 2022 • Alessandro Pappalardo, Yaman Umuroglu, Michaela Blott, Jovan Mitrevski, Ben Hawks, Nhan Tran, Vladimir Loncar, Sioni Summers, Hendrik Borras, Jules Muhizi, Matthew Trahms, Shih-Chieh Hsu, Scott Hauck, Javier Duarte

We present extensions to the Open Neural Network Exchange (ONNX) intermediate representation format to represent arbitrary-precision quantized neural networks.

Quantization
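The core operation such a representation has to capture is uniform quantization at an arbitrary bit width. A minimal quantize-dequantize sketch in NumPy, assuming a scale/zero-point scheme (the function name and signature are illustrative, not the QONNX Quant-node API):

```python
import numpy as np

def quant(x, scale, zero_point, bitwidth, signed=True):
    """Uniform quantize-dequantize of x at an arbitrary bit width.

    Illustrative sketch only; names and signature are assumptions,
    not the QONNX operator interface.
    """
    if signed:
        qmin, qmax = -(2 ** (bitwidth - 1)), 2 ** (bitwidth - 1) - 1
    else:
        qmin, qmax = 0, 2 ** bitwidth - 1
    q = np.clip(np.round(x / scale + zero_point), qmin, qmax)  # integer grid
    return (q - zero_point) * scale                            # back to float
```

For example, at 3 signed bits with scale 0.25 the representable values span -1.0 to 0.75, so inputs outside that range saturate.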

EcoFlow: Efficient Convolutional Dataflows for Low-Power Neural Network Accelerators

no code implementations • 4 Feb 2022 • Lois Orosa, Skanda Koppula, Yaman Umuroglu, Konstantinos Kanellopoulos, Juan Gomez-Luna, Michaela Blott, Kees Vissers, Onur Mutlu

We find that commonly used low-power CNN inference accelerators based on spatial architectures are not optimized for dilated and transposed convolutions.

Image Generation · Image Segmentation +1

Ps and Qs: Quantization-aware pruning for efficient low latency neural network inference

1 code implementation • 22 Feb 2021 • Benjamin Hawks, Javier Duarte, Nicholas J. Fraser, Alessandro Pappalardo, Nhan Tran, Yaman Umuroglu

We study various configurations of pruning during quantization-aware training, which we term quantization-aware pruning, and the effect of techniques like regularization, batch normalization, and different pruning schemes on performance, computational complexity, and information content metrics.

Neural Architecture Search · Quantization
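The pruning half of quantization-aware pruning can be illustrated with plain magnitude pruning: zero out the smallest-magnitude fraction of weights at each step. A minimal sketch (the function name and the threshold rule are assumptions for illustration, not the paper's exact scheme):

```python
import numpy as np

def magnitude_prune(w, sparsity):
    """Zero out the smallest-magnitude `sparsity` fraction of weights.

    Illustrative sketch of magnitude pruning, not the paper's
    specific pruning schedule.
    """
    k = int(sparsity * w.size)          # number of weights to drop
    if k == 0:
        return w.copy()
    thresh = np.partition(np.abs(w).ravel(), k - 1)[k - 1]
    mask = np.abs(w) > thresh           # keep only weights above threshold
    return w * mask
```

In quantization-aware training this mask would be reapplied after each update so the pruned weights stay at zero.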

LogicNets: Co-Designed Neural Networks and Circuits for Extreme-Throughput Applications

no code implementations • 6 Apr 2020 • Yaman Umuroglu, Yash Akhauri, Nicholas J. Fraser, Michaela Blott

Deployment of deep neural networks for applications that require very high throughput or extremely low latency is a severe computational challenge, further exacerbated by inefficiencies in mapping the computation to hardware.

Network Intrusion Detection · Quantization
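The co-design idea behind mapping neurons directly to circuits is that a neuron with few, low-precision inputs has a small enough input space to enumerate into a lookup table. A toy sketch for a binary-input, thresholded neuron (names and the threshold activation are illustrative assumptions, not the LogicNets toolflow):

```python
import itertools

def neuron_to_lut(weights, thresh):
    """Enumerate a binary-input, threshold-activated neuron into a
    truth table, the kind of object that maps onto an FPGA LUT.

    Illustrative sketch only, not the LogicNets implementation.
    """
    lut = {}
    for bits in itertools.product((0, 1), repeat=len(weights)):
        s = sum(w * b for w, b in zip(weights, bits))  # weighted sum
        lut[bits] = int(s >= thresh)                   # threshold activation
    return lut
```

A neuron with n binary inputs yields a 2^n-entry table, which is why keeping fan-in and precision low is essential for this mapping.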

Optimizing Bit-Serial Matrix Multiplication for Reconfigurable Computing

1 code implementation • 2 Jan 2019 • Yaman Umuroglu, Davide Conficconi, Lahiru Rasnayake, Thomas B. Preusser, Magnus Sjalander

BISMO, a vectorized bit-serial matrix multiplication overlay for reconfigurable computing, previously utilized the excellent binary-operation performance of FPGAs to offer a matrix multiplication performance that scales with required precision and parallelism.

Hardware Architecture
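The reason bit-serial performance scales with precision is that an integer matrix product can be decomposed into binary matrix products over bit planes, weighted by powers of two. A minimal sketch of this decomposition for unsigned integers (illustrative of the scheme, not the BISMO hardware or API):

```python
import numpy as np

def bit_serial_matmul(A, B, bits_a, bits_b):
    """Unsigned integer matmul built from binary bit-plane matmuls.

    Illustrative of the bit-serial decomposition; fewer bits means
    fewer binary passes, which is why performance scales with precision.
    """
    acc = np.zeros((A.shape[0], B.shape[1]), dtype=np.int64)
    for i in range(bits_a):
        Ai = (A >> i) & 1              # i-th bit plane of A
        for j in range(bits_b):
            Bj = (B >> j) & 1          # j-th bit plane of B
            acc += (Ai @ Bj) << (i + j)  # binary matmul weighted by 2^(i+j)
    return acc
```

The result is exact: summing the weighted binary products recovers the full-precision product, while each inner step only needs the binary operations FPGAs excel at.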

FINN-R: An End-to-End Deep-Learning Framework for Fast Exploration of Quantized Neural Networks

no code implementations • 12 Sep 2018 • Michaela Blott, Thomas Preusser, Nicholas Fraser, Giulio Gambardella, Kenneth O'Brien, Yaman Umuroglu

Given a neural network description, the tool optimizes it for given platforms, design targets, and a specific precision.

Hardware Architecture

BISMO: A Scalable Bit-Serial Matrix Multiplication Overlay for Reconfigurable Computing

1 code implementation • 22 Jun 2018 • Yaman Umuroglu, Lahiru Rasnayake, Magnus Sjalander

BISMO utilizes the excellent binary-operation performance of FPGAs to offer a matrix multiplication performance that scales with required precision and parallelism.

Hardware Architecture

Streamlined Deployment for Quantized Neural Networks

1 code implementation • 12 Sep 2017 • Yaman Umuroglu, Magnus Jahre

Quantized Neural Networks (QNNs) have emerged as a potential solution to the computational demands of deep neural network inference, promising to offer most of the DNN accuracy benefits at much lower computational cost.

Scaling Binarized Neural Networks on Reconfigurable Logic

no code implementations • 12 Jan 2017 • Nicholas J. Fraser, Yaman Umuroglu, Giulio Gambardella, Michaela Blott, Philip Leong, Magnus Jahre, Kees Vissers

Binarized neural networks (BNNs) are gaining interest in the deep learning community due to their significantly lower computational and memory cost.

General Classification

FINN: A Framework for Fast, Scalable Binarized Neural Network Inference

3 code implementations • 1 Dec 2016 • Yaman Umuroglu, Nicholas J. Fraser, Giulio Gambardella, Michaela Blott, Philip Leong, Magnus Jahre, Kees Vissers

Research has shown that convolutional neural networks contain significant redundancy, and high classification accuracy can be obtained even when weights and activations are reduced from floating point to binary values.

General Classification
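With weights and activations reduced to binary values, a dot product of {-1, +1} vectors collapses to XNOR plus popcount, which is what makes binarized inference so cheap in hardware. A minimal sketch with bit-packed vectors (the encoding and function name are illustrative, not the FINN implementation):

```python
def binary_dot(w_bits, x_bits, n):
    """Dot product of two length-n {-1,+1} vectors packed as integers,
    with +1 encoded as bit 1 and -1 as bit 0.

    Illustrative sketch of the XNOR-popcount trick used in BNN
    inference, not the FINN implementation.
    """
    xnor = ~(w_bits ^ x_bits) & ((1 << n) - 1)  # 1 where signs agree
    pop = bin(xnor).count("1")                  # number of agreements
    return 2 * pop - n                          # agreements minus disagreements
```

For example, w = (+1, -1, +1) and x = (+1, +1, -1) agree in one position and disagree in two, giving a dot product of -1.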
