4 code implementations • 1 Dec 2016 • Yaman Umuroglu, Nicholas J. Fraser, Giulio Gambardella, Michaela Blott, Philip Leong, Magnus Jahre, Kees Vissers
Research has shown that convolutional neural networks contain significant redundancy, and high classification accuracy can be obtained even when weights and activations are reduced from floating point to binary values.
1 code implementation • 15 Jun 2022 • Alessandro Pappalardo, Yaman Umuroglu, Michaela Blott, Jovan Mitrevski, Ben Hawks, Nhan Tran, Vladimir Loncar, Sioni Summers, Hendrik Borras, Jules Muhizi, Matthew Trahms, Shih-Chieh Hsu, Scott Hauck, Javier Duarte
We present extensions to the Open Neural Network Exchange (ONNX) intermediate representation format to represent arbitrary-precision quantized neural networks.
1 code implementation • 11 Jul 2018 • Vladimir Rybalkin, Alessandro Pappalardo, Muhammad Mohsin Ghaffar, Giulio Gambardella, Norbert Wehn, Michaela Blott
In this paper, we present the first systematic exploration of this design space as a function of precision for Bidirectional Long Short-Term Memory (BiLSTM) neural networks.
Optical Character Recognition (OCR)
1 code implementation • CVPR 2018 • Julian Faraone, Nicholas Fraser, Michaela Blott, Philip H. W. Leong
An efficient way to reduce this complexity is to quantize the weight parameters and/or activations during training by approximating their distributions with a limited entry codebook.
1 code implementation • 21 Nov 2018 • Yifan Yang, Qijing Huang, Bichen Wu, Tianjun Zhang, Liang Ma, Giulio Gambardella, Michaela Blott, Luciano Lavagno, Kees Vissers, John Wawrzynek, Kurt Keutzer
DiracDeltaNet achieves competitive accuracy on ImageNet (88.7\% top-5), but with 42$\times$ fewer parameters and 48$\times$ fewer OPs than VGG16.
1 code implementation • 23 Jun 2022 • Hendrik Borras, Giuseppe Di Guglielmo, Javier Duarte, Nicolò Ghielmetti, Ben Hawks, Scott Hauck, Shih-Chieh Hsu, Ryan Kastner, Jason Liang, Andres Meza, Jules Muhizi, Tai Nguyen, Rushil Roy, Nhan Tran, Yaman Umuroglu, Olivia Weng, Aidan Yokuda, Michaela Blott
We present our development experience and recent results for the MLPerf Tiny Inference Benchmark on field-programmable gate array (FPGA) platforms.
1 code implementation • 28 Sep 2022 • Zhiqiang Que, Hongxiang Fan, Marcus Loo, He Li, Michaela Blott, Maurizio Pierini, Alexander Tapper, Wayne Luk
This work presents a novel reconfigurable architecture for Low Latency Graph Neural Network (LL-GNN) designs for particle detectors, delivering unprecedented low latency performance.
1 code implementation • 11 Sep 2019 • Michaela Blott, Lisa Halder, Miriam Leeser, Linda Doyle
To address these implementation challenges, a broad spectrum of new customized and heterogeneous hardware architectures has emerged, often accompanied by co-designed algorithms to extract maximum benefit from the hardware.
Hardware Architecture
1 code implementation • 4 Feb 2022 • Lois Orosa, Skanda Koppula, Yaman Umuroglu, Konstantinos Kanellopoulos, Juan Gomez-Luna, Michaela Blott, Kees Vissers, Onur Mutlu
We find that commonly used low-power CNN inference accelerators based on spatial architectures are not optimized for both of these convolutional kernels.
no code implementations • 19 Sep 2017 • Julian Faraone, Nicholas Fraser, Giulio Gambardella, Michaela Blott, Philip H. W. Leong
A low precision deep neural network training technique for producing sparse, ternary neural networks is presented.
no code implementations • 12 Jan 2017 • Nicholas J. Fraser, Yaman Umuroglu, Giulio Gambardella, Michaela Blott, Philip Leong, Magnus Jahre, Kees Vissers
Binarized neural networks (BNNs) are gaining interest in the deep learning community due to their significantly lower computational and memory cost.
no code implementations • 21 Jun 2018 • Thomas B. Preußer, Giulio Gambardella, Nicholas Fraser, Michaela Blott
Neural networks have established themselves as a generic and powerful means of approaching challenging problems such as image classification, object detection, or decision making.
no code implementations • 12 Sep 2018 • Michaela Blott, Thomas Preusser, Nicholas Fraser, Giulio Gambardella, Kenneth O'Brien, Yaman Umuroglu
Given a neural network description, the tool optimizes for given platforms, design targets and a specific precision.
Hardware Architecture
no code implementations • 16 Dec 2019 • Giulio Gambardella, Johannes Kappauf, Michaela Blott, Christoph Doehring, Martin Kumm, Peter Zipf, Kees Vissers
In particular, Convolutional Neural Networks (CNNs) are gaining popularity and are being evaluated for deployment in safety-critical applications such as self-driving vehicles.
no code implementations • 24 Mar 2020 • Mairin Kroes, Lucian Petrica, Sorin Cotofana, Michaela Blott
We hybridize genetic algorithms and simulated annealing with traditional bin packing heuristics to create flexible mappers capable of grouping parameter memories such that each group optimally fits FPGA on-chip memories.
no code implementations • 6 Apr 2020 • Yaman Umuroglu, Yash Akhauri, Nicholas J. Fraser, Michaela Blott
Deployment of deep neural networks for applications that require very high throughput or extremely low latency is a severe computational challenge, further exacerbated by inefficiencies in mapping the computation to hardware.
no code implementations • 11 Nov 2020 • Ussama Zahid, Giulio Gambardella, Nicholas J. Fraser, Michaela Blott, Kees Vissers
Our experiments show that by injecting faults in the convolutional layers during training, highly accurate convolutional neural networks (CNNs) can be trained that exhibit much better error tolerance than the original.
no code implementations • 25 Oct 2021 • Allison McCarn Deiana, Nhan Tran, Joshua Agar, Michaela Blott, Giuseppe Di Guglielmo, Javier Duarte, Philip Harris, Scott Hauck, Mia Liu, Mark S. Neubauer, Jennifer Ngadiuba, Seda Ogrenci-Memik, Maurizio Pierini, Thea Aarrestad, Steffen Bahr, Jurgen Becker, Anne-Sophie Berthold, Richard J. Bonventre, Tomas E. Muller Bravo, Markus Diefenthaler, Zhen Dong, Nick Fritzsche, Amir Gholami, Ekaterina Govorkova, Kyle J Hazelwood, Christian Herwig, Babar Khan, Sehoon Kim, Thomas Klijnsma, Yaling Liu, Kin Ho Lo, Tri Nguyen, Gianantonio Pezzullo, Seyedramin Rasoulinezhad, Ryan A. Rivera, Kate Scholberg, Justin Selig, Sougata Sen, Dmitri Strukov, William Tang, Savannah Thais, Kai Lukas Unger, Ricardo Vilalta, Belina von Krosigk, Thomas K. Warburton, Maria Acosta Flechas, Anthony Aportela, Thomas Calvet, Leonardo Cristella, Daniel Diaz, Caterina Doglioni, Maria Domenica Galati, Elham E Khoda, Farah Fahim, Davide Giri, Benjamin Hawks, Duc Hoang, Burt Holzman, Shih-Chieh Hsu, Sergo Jindariani, Iris Johnson, Raghav Kansal, Ryan Kastner, Erik Katsavounidis, Jeffrey Krupa, Pan Li, Sandeep Madireddy, Ethan Marx, Patrick McCormack, Andres Meza, Jovan Mitrevski, Mohammed Attia Mohammed, Farouk Mokhtar, Eric Moreno, Srishti Nagu, Rohin Narayan, Noah Palladino, Zhiqiang Que, Sang Eon Park, Subramanian Ramamoorthy, Dylan Rankin, Simon Rothman, Ashish Sharma, Sioni Summers, Pietro Vischia, Jean-Roch Vlimant, Olivia Weng
In this community review report, we discuss applications and techniques for fast machine learning (ML) in science -- the concept of integrating powerful ML methods into the real-time experimental data processing loop to accelerate scientific discovery.
no code implementations • 24 Jun 2022 • Pedro J. Freire, Michael Anderson, Bernhard Spinnler, Thomas Bex, Jaroslaw E. Prilepsky, Tobias A. Eriksson, Nelson Costa, Wolfgang Schairer, Michaela Blott, Antonio Napoli, Sergei K. Turitsyn
For the first time, recurrent and feedforward neural network-based equalizers for nonlinearity compensation are implemented in an FPGA, with a level of complexity comparable to that of a dispersion equalizer.
no code implementations • 9 Dec 2022 • Pedro J. Freire, Sasipim Srivallapanondh, Michael Anderson, Bernhard Spinnler, Thomas Bex, Tobias A. Eriksson, Antonio Napoli, Wolfgang Schairer, Nelson Costa, Michaela Blott, Sergei K. Turitsyn, Jaroslaw E. Prilepsky
The main results are divided into three parts: a performance comparison, an analysis of how activation functions are implemented, and a report on the complexity of the hardware.
no code implementations • 21 Nov 2023 • Shivam Aggarwal, Alessandro Pappalardo, Hans Jakob Damsgaard, Giuseppe Franco, Thomas B. Preußer, Michaela Blott, Tulika Mitra
However, the exploration of floating-point formats smaller than 8 bits and their comparison with integer quantization remains relatively limited.
no code implementations • 18 Dec 2023 • Zhenhao He, Dario Korolija, Yu Zhu, Benjamin Ramhorst, Tristan Laan, Lucian Petrica, Michaela Blott, Gustavo Alonso
To facilitate the development of distributed applications with FPGAs, in this paper we propose ACCL+, an open-source versatile FPGA-based collective communication library.