no code implementations • 21 Apr 2023 • Olivier Therrien, Marihan Amein, Zhuoran Xiong, Warren J. Gross, Brett H. Meyer
We present SSS3D, a fast multi-objective NAS framework designed to find computationally efficient 3D semantic scene segmentation networks.
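Multi-objective NAS scores every candidate network on several objectives at once and keeps only the non-dominated (Pareto-optimal) set. Below is a minimal sketch of Pareto filtering over (error, FLOPs) pairs; the objectives and scoring here are illustrative assumptions, not SSS3D's actual search code.

```python
# Minimal Pareto-front filter for multi-objective NAS candidates.
# Each candidate is scored on objectives to be minimized, e.g.
# (validation error, GFLOPs). Illustrative only; not SSS3D's code.

def dominates(a, b):
    """True if candidate a is at least as good as b on every
    objective and strictly better on at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(
        x < y for x, y in zip(a, b))

def pareto_front(candidates):
    """Keep only non-dominated candidates.

    candidates: list of (name, objectives) pairs, where objectives
    is a tuple of values to minimize.
    """
    return [(name, obj) for name, obj in candidates
            if not any(dominates(other, obj)
                       for _, other in candidates)]

# Hypothetical candidates scored as (validation error, GFLOPs).
candidates = [("net_a", (0.12, 40.0)),
              ("net_b", (0.15, 12.0)),
              ("net_c", (0.16, 35.0))]
print(pareto_front(candidates))  # net_a and net_b survive; net_c is
                                 # dominated by net_b on both axes.
```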
no code implementations • 28 Mar 2023 • Zhuoran Xiong, Marihan Amein, Olivier Therrien, Warren J. Gross, Brett H. Meyer
We present FMAS, a fast multi-objective neural architecture search framework for semantic segmentation.
no code implementations • 25 Dec 2022 • Ibtihel Amara, Nazanin Sepahvand, Brett H. Meyer, Warren J. Gross, James J. Clark
We show that adaptively balancing the reverse and forward divergences shifts the focus of the training strategy to the compact student network without limiting the teacher network's learning process.
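The forward divergence KL(teacher || student) pushes the student to cover all of the teacher's modes, while the reverse KL(student || teacher) lets it concentrate on the dominant ones. A minimal sketch of blending the two with a training-dependent weight follows; the linear alpha schedule is an illustrative assumption, not the paper's adaptive strategy.

```python
import torch
import torch.nn.functional as F

def blended_kd_loss(student_logits, teacher_logits, alpha, T=4.0):
    """Blend forward and reverse KL between teacher and student.

    forward KL(teacher || student): student must cover teacher modes.
    reverse KL(student || teacher): student may focus on dominant modes.
    alpha in [0, 1] weights the two; scheduling alpha is where an
    adaptive balancing strategy would plug in (illustrative here).
    """
    log_p_s = F.log_softmax(student_logits / T, dim=-1)
    log_p_t = F.log_softmax(teacher_logits / T, dim=-1)
    p_t, p_s = log_p_t.exp(), log_p_s.exp()
    forward_kl = (p_t * (log_p_t - log_p_s)).sum(-1).mean()
    reverse_kl = (p_s * (log_p_s - log_p_t)).sum(-1).mean()
    return alpha * forward_kl + (1.0 - alpha) * reverse_kl

# Example: a naive linear schedule (assumption) shifts the loss from
# reverse-heavy early in training to forward-heavy late in training.
s, t = torch.randn(8, 10), torch.randn(8, 10)
for step in (0, 50, 100):
    loss = blended_kd_loss(s, t, alpha=step / 100)
```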
no code implementations • 15 Sep 2022 • Ibtihel Amara, Maryam Ziaeefard, Brett H. Meyer, Warren Gross, James J. Clark
Knowledge distillation (KD) is an effective tool for compressing deep classification models for edge devices.
no code implementations • 3 Aug 2022 • Danilo Vucetic, Mohammadreza Tayaranian, Maryam Ziaeefard, James J. Clark, Brett H. Meyer, Warren J. Gross
We introduce Learner modules and priming, novel fine-tuning methods that exploit the overparameterization of pre-trained language models to improve convergence speed and resource utilization.
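The general pattern behind lightweight fine-tuning modules is to freeze the pretrained weights and train only small added components, so the optimizer touches far fewer parameters. The sketch below follows that pattern with a low-rank side path; the placement, rank, and the entire module design are illustrative assumptions and do not reproduce the paper's actual Learner architecture or priming schedule.

```python
import torch
import torch.nn as nn

class LightweightTunedLinear(nn.Module):
    """A frozen pretrained linear layer plus a small trainable
    low-rank side path. Illustrative of the freeze-and-add-modules
    pattern only; not the paper's Learner design."""

    def __init__(self, pretrained: nn.Linear, rank: int = 8):
        super().__init__()
        self.base = pretrained
        for p in self.base.parameters():
            p.requires_grad = False        # freeze pretrained weights
        d_in, d_out = pretrained.in_features, pretrained.out_features
        self.down = nn.Linear(d_in, rank, bias=False)
        self.up = nn.Linear(rank, d_out, bias=False)
        nn.init.zeros_(self.up.weight)     # start equal to the base layer

    def forward(self, x):
        return self.base(x) + self.up(self.down(x))

# Only the small added modules reach the optimizer, which is where
# the savings in memory traffic and update cost come from.
layer = LightweightTunedLinear(nn.Linear(768, 768))
trainable = [p for p in layer.parameters() if p.requires_grad]
opt = torch.optim.Adam(trainable, lr=1e-3)
```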
no code implementations • 3 May 2022 • Danilo Vucetic, Mohammadreza Tayaranian, Maryam Ziaeefard, James J. Clark, Brett H. Meyer, Warren J. Gross
FAR reduces fine-tuning time for DistilBERT on the CoLA dataset by 30%, and time spent on memory operations by 47%.
1 code implementation • 2 Jun 2020 • Loren Lugosch, Derek Nowrouzezahrai, Brett H. Meyer
The surprisal of the input, measured as the negative log-likelihood of the current observation according to the autoregressive model, is used as a measure of input difficulty.
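Surprisal under an autoregressive model is just the per-step negative log-likelihood of what was actually observed. A minimal sketch, assuming a categorical next-step model; the model producing the logits is a toy stand-in, and the difficulty threshold is an illustrative choice.

```python
import torch
import torch.nn.functional as F

def surprisal(logits, targets):
    """Per-step surprisal: -log p(x_t | x_<t) under the
    autoregressive model that produced `logits`.

    logits:  (T, V) next-step predictions at each position
    targets: (T,)   the symbols actually observed
    """
    return F.cross_entropy(logits, targets, reduction="none")

# Toy example: predictable inputs score low surprisal;
# off-distribution inputs score high.
T, V = 5, 20
logits = torch.randn(T, V)             # stand-in for a real LM's output
observed = torch.randint(0, V, (T,))
s = surprisal(logits, observed)        # (T,) nats per step
is_hard = s.mean() > 3.0               # threshold is an assumption
```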
1 code implementation • ICLR 2019 • Arash Ardakani, Zhengyun Ji, Sean C. Smithson, Brett H. Meyer, Warren J. Gross
On the software side, we evaluate the performance (in terms of accuracy) of our method using long short-term memories (LSTMs) on various sequential tasks, including sequence classification and language modeling.
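For context on that evaluation setting, here is a minimal LSTM sequence classifier in PyTorch: embed the tokens, run the LSTM, and classify from the final hidden state. It is full-precision and generic; the paper's actual weight-training method is not reproduced here.

```python
import torch
import torch.nn as nn

class LSTMClassifier(nn.Module):
    """Minimal LSTM sequence classifier, shown only to illustrate
    the sequence-classification setup; not the paper's method."""

    def __init__(self, vocab_size, embed_dim, hidden_dim, num_classes):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, num_classes)

    def forward(self, tokens):           # tokens: (batch, seq_len)
        x = self.embed(tokens)
        _, (h_n, _) = self.lstm(x)       # h_n: (1, batch, hidden_dim)
        return self.head(h_n[-1])        # logits: (batch, num_classes)

model = LSTMClassifier(vocab_size=1000, embed_dim=64,
                       hidden_dim=128, num_classes=2)
logits = model(torch.randint(0, 1000, (4, 32)))
```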
no code implementations • 7 Nov 2016 • Sean C. Smithson, Guang Yang, Warren J. Gross, Brett H. Meyer
The method is evaluated on the MNIST and CIFAR-10 image datasets, optimizing for both recognition accuracy and computational complexity.
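Optimizing for both objectives means every candidate network is scored on two axes and the results are compared for dominance, as in the Pareto sketch above. A minimal scorer follows, using parameter count as the complexity proxy; the paper's exact complexity measure is an assumption here and may differ.

```python
import torch

def evaluate_candidate(model, loader, device="cpu"):
    """Score a candidate network on both objectives: recognition
    error (to minimize) and parameter count as a simple complexity
    proxy (illustrative; the paper's measure may differ)."""
    model.eval()
    correct = total = 0
    with torch.no_grad():
        for images, labels in loader:
            preds = model(images.to(device)).argmax(dim=-1)
            correct += (preds == labels.to(device)).sum().item()
            total += labels.numel()
    error = 1.0 - correct / total
    params = sum(p.numel() for p in model.parameters())
    return error, params   # feed into a Pareto filter as sketched above
```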