no code implementations • 9 May 2018 • Zheng Wang, Michael O'Boyle
In the last decade, machine learning based compilation has moved from an obscure research niche to a mainstream activity.
1 code implementation • 19 Sep 2018 • Jack Turner, José Cano, Valentin Radu, Elliot J. Crowley, Michael O'Boyle, Amos Storkey
Convolutional Neural Networks (CNNs) are extremely computationally demanding, presenting a large barrier to their deployment on resource-constrained devices.
2 code implementations • 10 Oct 2018 • Elliot J. Crowley, Jack Turner, Amos Storkey, Michael O'Boyle
Structured pruning is a popular method for compressing a neural network: given a large trained network, one alternates between removing channel connections and fine-tuning, thereby reducing the overall width of the network.
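For concreteness, here is a minimal PyTorch sketch of a single structured-pruning step. The L1-norm channel criterion and the standalone convolution are assumptions for illustration only, not the method evaluated in the paper, and the matching adjustment to the next layer's input channels is omitted.

```python
import torch
import torch.nn as nn

def l1_channel_scores(conv: nn.Conv2d) -> torch.Tensor:
    # Score each output channel by the L1 norm of its filter weights
    # (an assumed criterion; many alternatives exist).
    return conv.weight.detach().abs().sum(dim=(1, 2, 3))

def prune_channels(conv: nn.Conv2d, keep: int) -> nn.Conv2d:
    # Keep the `keep` highest-scoring output channels, narrowing the layer.
    idx = l1_channel_scores(conv).argsort(descending=True)[:keep]
    pruned = nn.Conv2d(conv.in_channels, keep, conv.kernel_size,
                       stride=conv.stride, padding=conv.padding,
                       bias=conv.bias is not None)
    pruned.weight.data = conv.weight.data[idx].clone()
    if conv.bias is not None:
        pruned.bias.data = conv.bias.data[idx].clone()
    return pruned

# The prune/fine-tune loop then alternates calls like:
#   model.conv = prune_channels(model.conv, keep=model.conv.out_channels // 2)
#   fine_tune(model, train_loader)  # ordinary training loop (assumed helper)
```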
no code implementations • NIPS Workshop CDNNRIA 2018 • Elliot J. Crowley, Jack Turner, Amos Storkey, Michael O'Boyle
First, when time-constrained, it is better to train a simple, smaller network from scratch than to prune a large network.
no code implementations • 24 Oct 2018 • Jack Turner, Elliot J. Crowley, Valentin Radu, José Cano, Amos Storkey, Michael O'Boyle
The task of accelerating large neural networks on general purpose hardware has, in recent years, prompted the use of channel pruning to reduce network size.
2 code implementations • ICLR 2020 • Jack Turner, Elliot J. Crowley, Michael O'Boyle, Amos Storkey, Gavin Gray
The desire to map neural networks to varying-capacity devices has led to the development of a wealth of compression techniques, many of which involve replacing standard convolutional blocks in a large network with cheap alternative blocks.
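One widely used cheap alternative is the depthwise-separable convolution; the sketch below (PyTorch, generic, not necessarily the specific blocks studied in the paper) shows a standard block and its cheaper drop-in replacement.

```python
import torch.nn as nn

def standard_block(c_in: int, c_out: int) -> nn.Module:
    # Standard 3x3 convolutional block.
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, 3, padding=1, bias=False),
        nn.BatchNorm2d(c_out),
        nn.ReLU(inplace=True))

def cheap_block(c_in: int, c_out: int) -> nn.Module:
    # Depthwise-separable replacement: a per-channel 3x3 convolution
    # followed by a 1x1 pointwise convolution, using far fewer
    # parameters and FLOPs than the standard block above.
    return nn.Sequential(
        nn.Conv2d(c_in, c_in, 3, padding=1, groups=c_in, bias=False),
        nn.Conv2d(c_in, c_out, 1, bias=False),
        nn.BatchNorm2d(c_out),
        nn.ReLU(inplace=True))
```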
3 code implementations • NeurIPS 2020 • Massimiliano Patacchiola, Jack Turner, Elliot J. Crowley, Michael O'Boyle, Amos Storkey
Recently, different machine learning methods have been introduced to tackle the challenging few-shot learning scenario, that is, learning from a small labeled dataset related to a specific task.
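To make the scenario concrete, here is a sketch of sampling one N-way K-shot episode from a labeled dataset; it is a generic illustration of the setting, not the paper's method, and `dataset` is assumed to be an iterable of (example, label) pairs.

```python
import random
from collections import defaultdict

def sample_episode(dataset, n_way=5, k_shot=1, q_queries=15):
    # Group examples by class, pick n_way classes, then draw k_shot
    # support examples (the "small labeled dataset") plus q_queries
    # query examples per class for evaluating adaptation.
    by_class = defaultdict(list)
    for x, y in dataset:
        by_class[y].append(x)
    classes = random.sample(list(by_class), n_way)
    support, query = [], []
    for new_label, c in enumerate(classes):
        xs = random.sample(by_class[c], k_shot + q_queries)
        support += [(x, new_label) for x in xs[:k_shot]]
        query += [(x, new_label) for x in xs[k_shot:]]
    return support, query
```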
no code implementations • 20 Feb 2020 • Valentin Radu, Kuba Kaszyk, Yuan Wen, Jack Turner, José Cano, Elliot J. Crowley, Björn Franke, Amos Storkey, Michael O'Boyle
We evaluate higher-level libraries, which analyze the input characteristics of a convolutional layer and use them to produce optimized OpenCL (Arm Compute Library and TVM) and CUDA (cuDNN) code.
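This shape-dependent specialization is easy to observe from PyTorch: with cuDNN's benchmark mode enabled, the library times its candidate convolution algorithms for each distinct input shape and caches the fastest. A small sketch (requires a CUDA-capable GPU; unrelated to the paper's own benchmarking harness):

```python
import torch
import torch.nn as nn

# With benchmark mode on, cuDNN times its available convolution
# algorithms for each new input shape and caches the fastest one,
# i.e. the generated code is specialized to the layer's inputs.
torch.backends.cudnn.benchmark = True

conv = nn.Conv2d(64, 128, kernel_size=3, padding=1).cuda()
x = torch.randn(8, 64, 56, 56, device="cuda")
with torch.no_grad():
    y = conv(x)  # first call for this shape triggers the algorithm search
```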
1 code implementation • 17 Jun 2020 • Perry Gibson, José Cano, Jack Turner, Elliot J. Crowley, Michael O'Boyle, Amos Storkey
We observe that our new implementation scales well with the number of groups and provides the best inference times in all settings, improving on the existing implementations of grouped convolutions in TVM, PyTorch and TensorFlow Lite by 3.4x, 8x and 4x on average, respectively.
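The quoted speedups come from the paper's TVM-based implementation and cannot be reproduced by the snippet below; it only illustrates what a grouped convolution is and how one might naively time it in PyTorch (the channel count, input size and iteration count are arbitrary choices).

```python
import time
import torch
import torch.nn as nn

def time_grouped_conv(groups: int, c: int = 256, iters: int = 50) -> float:
    # A 3x3 grouped convolution: each of `groups` groups maps
    # c/groups input channels to c/groups output channels.
    conv = nn.Conv2d(c, c, 3, padding=1, groups=groups)
    x = torch.randn(1, c, 56, 56)
    with torch.no_grad():
        conv(x)  # warm-up
        start = time.perf_counter()
        for _ in range(iters):
            conv(x)
    return (time.perf_counter() - start) / iters

for g in (1, 2, 4, 8):
    print(f"groups={g}: {time_grouped_conv(g) * 1e3:.2f} ms")
```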
no code implementations • 27 Aug 2020 • Bruce Collie, Philip Ginsbach, Jackson Woodruff, Ajitha Rajan, Michael O'Boyle
Our approach is integrated with standard compiler tooling, and we use this integration to evaluate migration opportunities in 9 existing C/C++ applications comprising over 1M lines of code.
no code implementations • 21 Nov 2020 • Chris Cummins, Hugh Leather, Zacharias Fisches, Tal Ben-Nun, Torsten Hoefler, Michael O'Boyle
Compiler architects increasingly look to machine learning when building heuristics for compiler optimization.
1 code implementation • 12 Feb 2021 • Jack Turner, Elliot J. Crowley, Michael O'Boyle
This unification allows us to express existing NAS operations as combinations of simpler transformations.
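As a toy illustration of the idea (the encoding, names and transformations here are hypothetical, not the paper's formalism), a compound "cheap block" operation can be written as the composition of two simpler transformations on a convolution's configuration:

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class ConvSpec:
    c_in: int
    c_out: int
    groups: int = 1

def group(spec: ConvSpec, g: int) -> ConvSpec:
    # Simple transformation: split the channels into g independent groups.
    return replace(spec, groups=spec.groups * g)

def bottleneck(spec: ConvSpec, ratio: int) -> ConvSpec:
    # Simple transformation: shrink the output width by `ratio`.
    return replace(spec, c_out=spec.c_out // ratio)

# A compound NAS operation expressed as a composition of the two
# simpler transformations above (hypothetical encoding).
cheap = bottleneck(group(ConvSpec(256, 256), g=4), ratio=2)
print(cheap)  # ConvSpec(c_in=256, c_out=128, groups=4)
```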
no code implementations • 15 Nov 2023 • Perry Gibson, José Cano, Elliot J. Crowley, Amos Storkey, Michael O'Boyle
Deep Neural Networks (DNNs) are extremely computationally demanding, which presents a large barrier to their deployment on resource-constrained devices.