Search Results for author: Benjamin Thérien

Found 8 papers, 3 papers with code

Simple and Scalable Strategies to Continually Pre-train Large Language Models

1 code implementation • 13 Mar 2024 • Adam Ibrahim, Benjamin Thérien, Kshitij Gupta, Mats L. Richter, Quentin Anthony, Timothée Lesort, Eugene Belilovsky, Irina Rish

In this work, we show that a simple and scalable combination of learning rate (LR) re-warming, LR re-decaying, and replay of previous data is sufficient to match the performance of fully re-training from scratch on all available data, as measured by the final loss and the average score on several language model (LM) evaluation benchmarks.

Continual Learning • Language Modelling

Can We Learn Communication-Efficient Optimizers?

no code implementations • 2 Dec 2023 • Charles-Étienne Joseph, Benjamin Thérien, Abhinav Moudgil, Boris Knyazev, Eugene Belilovsky

Although many communication-efficient variants of SGD have been proposed, they can sometimes lag behind state-of-the-art adaptive optimizers for deep learning.

Language Modelling

Continual Pre-Training of Large Language Models: How to (re)warm your model?

2 code implementations • 8 Aug 2023 • Kshitij Gupta, Benjamin Thérien, Adam Ibrahim, Mats L. Richter, Quentin Anthony, Eugene Belilovsky, Irina Rish, Timothée Lesort

We study the warmup phase of models pre-trained on the Pile (upstream data, 300B tokens) as we continue to pre-train on SlimPajama (downstream data, 297B tokens), following a linear warmup and cosine decay schedule.

Language Modelling
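
The schedule named in the abstract above (linear warmup followed by cosine decay) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function name `lr_at_step` and the values used for `max_lr`, `min_lr`, `warmup_steps`, and `total_steps` are assumptions chosen for the example and do not come from the paper.

```python
import math


def lr_at_step(step, total_steps, warmup_steps, max_lr, min_lr):
    """Learning rate at `step` under linear warmup then cosine decay.

    All arguments are hypothetical; the paper's actual hyperparameters
    are not given in this listing.
    """
    if step < warmup_steps:
        # Linear warmup: ramp from 0 up to max_lr over warmup_steps.
        return max_lr * step / warmup_steps
    # Cosine decay: go from max_lr down to min_lr over the remaining steps.
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    progress = min(progress, 1.0)  # hold at min_lr after total_steps
    return min_lr + 0.5 * (max_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


# Illustrative values only (assumed, not taken from the paper).
schedule = [lr_at_step(s, 1000, 100, 3e-4, 3e-5) for s in range(1001)]
```

Under this sketch, "re-warming" when continuing pre-training on a new dataset would amount to restarting the same function from `step = 0` on the new data, so the learning rate climbs back to `max_lr` before decaying again.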

Object Re-Identification from Point Clouds

no code implementations • 17 May 2023 • Benjamin Thérien, Chengjie Huang, Adrian Chow, Krzysztof Czarnecki

To our knowledge, we are the first to study object re-identification from real point cloud observations.

3D Multi-Object Tracking • Autonomous Driving

A Closer Look at Robustness to L-infinity and Spatial Perturbations and their Composition

no code implementations • 5 Oct 2022 • Luke Rowe, Benjamin Thérien, Krzysztof Czarnecki, Hongyang Zhang

In adversarial machine learning, the popular $\ell_\infty$ threat model has been the focus of much previous work.

Interpretable Deep Tracking

no code implementations • 3 Oct 2022 • Benjamin Thérien, Krzysztof Czarnecki

By enumerating different tracking decisions and associated reasoning procedures, we can train individual networks to reason about the possible decisions via interchange intervention training (IIT).

Motion Forecasting • Multi-Object Tracking

Exploring the Optimality of Tight-Frame Scattering Networks

no code implementations • 29 Sep 2021 • Shanel Gauthier, Benjamin Thérien, Laurent Alsène-Racicot, Muawiz Sajjad Chaudhary, Irina Rish, Eugene Belilovsky, Michael Eickenberg, Guy Wolf

The wavelet filters used in the scattering transform are typically selected to create a tight frame via a parameterized mother wavelet.

Parametric Scattering Networks

1 code implementation • CVPR 2022 • Shanel Gauthier, Benjamin Thérien, Laurent Alsène-Racicot, Muawiz Chaudhary, Irina Rish, Eugene Belilovsky, Michael Eickenberg, Guy Wolf

The wavelet scattering transform creates geometric invariants and deformation stability.

Ranked #3 on Small Data Image Classification on CIFAR-10, 500 Labels

Small Data Image Classification
