Search Results for author: Warren J. Gross

Found 18 papers, 5 papers with code

Automatic Pruning of Fine-tuning Datasets for Transformer-based Language Models

1 code implementation • 11 Jul 2024 • Mohammadreza Tayaranian, Seyyed Hasan Mozafari, Brett H. Meyer, James J. Clark, Warren J. Gross

Our experiments on 5 downstream tasks and 2 language models show that, on average, fine-tuning on the winning ticket subsets results in a $0.1\%$ increase in the evaluation performance of the model.

Natural Language Understanding Navigate

Step-GRAND: A Low Latency Universal Soft-input Decoder

no code implementations • 14 Jul 2023 • Syed Mohsin Abbas, Marwan Jalaleddine, Chi-Ying Tsui, Warren J. Gross

GRAND features both soft-input and hard-input variants that are well suited to efficient hardware implementations that can be characterized with achievable average and worst-case decoding latency.

Decoder

SSS3D: Fast Neural Architecture Search For Efficient Three-Dimensional Semantic Segmentation

no code implementations • 21 Apr 2023 • Olivier Therrien, Marihan Amein, Zhuoran Xiong, Warren J. Gross, Brett H. Meyer

We present SSS3D, a fast multi-objective NAS framework designed to find computationally efficient 3D semantic scene segmentation networks.

Neural Architecture Search Scene Segmentation

BD-KD: Balancing the Divergences for Online Knowledge Distillation

no code implementations • 25 Dec 2022 • Ibtihel Amara, Nazanin Sepahvand, Brett H. Meyer, Warren J. Gross, James J. Clark

We show that adaptively balancing between the reverse and forward divergences shifts the focus of the training strategy to the compact student network without limiting the teacher network's learning process.

Knowledge Distillation Model Compression +1

Efficient Fine-Tuning of Compressed Language Models with Learners

no code implementations • 3 Aug 2022 • Danilo Vucetic, Mohammadreza Tayaranian, Maryam Ziaeefard, James J. Clark, Brett H. Meyer, Warren J. Gross

We introduce Learner modules and priming, novel methods for fine-tuning that exploit the overparameterization of pre-trained language models to gain benefits in convergence speed and resource utilization.

CoLA Navigate

Efficient Fine-Tuning of BERT Models on the Edge

no code implementations • 3 May 2022 • Danilo Vucetic, Mohammadreza Tayaranian, Maryam Ziaeefard, James J. Clark, Brett H. Meyer, Warren J. Gross

FAR reduces fine-tuning time on the DistilBERT model and CoLA dataset by 30%, and time spent on memory operations by 47%.

CoLA

Standard Deviation-Based Quantization for Deep Neural Networks

no code implementations • 24 Feb 2022 • Amir Ardakani, Arash Ardakani, Brett Meyer, James J. Clark, Warren J. Gross

Quantization of deep neural networks is a promising approach that reduces the inference cost, making it feasible to run deep networks on resource-restricted devices.

Quantization

Learning to Skip Ineffectual Recurrent Computations in LSTMs

no code implementations • 9 Nov 2018 • Arash Ardakani, Zhengyun Ji, Warren J. Gross

This observation suggests that a large fraction of the recurrent computations are ineffectual and can be avoided to speed up the process during the inference as they involve noncontributory multiplications/accumulations with zero-valued states.

Learning from the Syndrome

1 code implementation • 23 Oct 2018 • Loren Lugosch, Warren J. Gross

In this paper, we introduce the syndrome loss, an alternative loss function for neural error-correcting decoders based on a relaxation of the syndrome.

Decoder valid

Learning Recurrent Binary/Ternary Weights

1 code implementation • ICLR 2019 • Arash Ardakani, Zhengyun Ji, Sean C. Smithson, Brett H. Meyer, Warren J. Gross

On the software side, we evaluate the performance (in terms of accuracy) of our method using long short-term memories (LSTMs) on various sequential models including sequence classification and language modeling.

Language Modelling

Multi-Mode Inference Engine for Convolutional Neural Networks

no code implementations • 11 Dec 2017 • Arash Ardakani, Carlo Condo, Warren J. Gross

Their performance efficiency is limited to less than 55% on average, which leads to unnecessarily high processing latency and silicon area.

Hardware Architecture

Deep Learning Methods for Improved Decoding of Linear Codes

2 code implementations • 21 Jun 2017 • Eliya Nachmani, Elad Marciano, Loren Lugosch, Warren J. Gross, David Burshtein, Yair Beery

Furthermore, we demonstrate that the neural belief propagation decoder can be used to improve the performance, or alternatively reduce the computational complexity, of a close to optimal decoder of short BCH codes.

Decoder

Neural Offset Min-Sum Decoding

1 code implementation • 20 Jan 2017 • Loren Lugosch, Warren J. Gross

After describing our method, we compare the performance of the two neural decoding algorithms and show that our method achieves error-correction performance within 0.1 dB of the multiplicative approach and as much as 1 dB better than traditional belief propagation for the codes under consideration.

Decoder

Neural Networks Designing Neural Networks: Multi-Objective Hyper-Parameter Optimization

no code implementations • 7 Nov 2016 • Sean C. Smithson, Guang Yang, Warren J. Gross, Brett H. Meyer

The method is evaluated on the MNIST and CIFAR-10 image datasets, optimizing for both recognition accuracy and computational complexity.

BIG-bench Machine Learning Image Classification +2

Sparsely-Connected Neural Networks: Towards Efficient VLSI Implementation of Deep Neural Networks

no code implementations • 4 Nov 2016 • Arash Ardakani, Carlo Condo, Warren J. Gross

The proposed architecture can save up to 90% of memory compared to the conventional implementations of fully-connected neural networks.

VLSI Implementation of Deep Neural Network Using Integral Stochastic Computing

no code implementations • 29 Sep 2015 • Arash Ardakani, François Leduc-Primeau, Naoya Onizawa, Takahiro Hanyu, Warren J. Gross

We also synthesize the circuits in a 65 nm CMOS technology and we show that the proposed integral stochastic architecture results in up to 21% reduction in energy consumption compared to the binary radix implementation at the same misclassification rate.

Associative Memories Based on Multiple-Valued Sparse Clustered Networks

no code implementations • 3 Feb 2014 • Hooman Jarollahi, Naoya Onizawa, Takahiro Hanyu, Warren J. Gross

Associative memories are structures that store data patterns and retrieve them given partial inputs.

Retrieval
