no code implementations • 4 Feb 2025 • Oscar Skean, Md Rifat Arefin, Dan Zhao, Niket Patel, Jalal Naghiyev, Yann LeCun, Ravid Shwartz-Ziv
From extracting features to generating text, the outputs of large language models (LLMs) typically rely on their final layers, following the conventional wisdom that earlier layers capture only low-level cues.
no code implementations • 12 Dec 2024 • Oscar Skean, Md Rifat Arefin, Yann LeCun, Ravid Shwartz-Ziv
Understanding what defines a good representation in large language models (LLMs) is fundamental to both theoretical understanding and practical applications.
1 code implementation • 4 Nov 2024 • Md Rifat Arefin, Gopeshh Subbaraj, Nicolas Gontier, Yann LeCun, Irina Rish, Ravid Shwartz-Ziv, Christopher Pal
To address this, we propose Sequential Variance-Covariance Regularization (Seq-VCR), which enhances the entropy of intermediate representations and prevents collapse.
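The regularizer named above penalizes low variance and high cross-dimension covariance in intermediate representations. The paper's exact formulation is in its released code; the snippet below is only a minimal sketch of a generic variance-covariance penalty of this family (the function name and hinge threshold of 1 are illustrative assumptions, not the authors' implementation):

```python
import numpy as np

def variance_covariance_penalty(reps, eps=1e-4):
    """Sketch of a variance-covariance regularizer (illustrative, not Seq-VCR itself).

    reps: (batch, dim) array of intermediate representations.
    The variance term hinges each dimension's std at 1, discouraging
    collapsed (near-constant) features; the covariance term penalizes
    squared off-diagonal covariance, decorrelating dimensions.
    """
    n, d = reps.shape
    centered = reps - reps.mean(axis=0)
    std = np.sqrt(centered.var(axis=0) + eps)
    var_loss = np.mean(np.maximum(0.0, 1.0 - std))   # hinge: penalize std below 1
    cov = (centered.T @ centered) / (n - 1)          # (dim, dim) covariance
    off_diag = cov - np.diag(np.diag(cov))
    cov_loss = (off_diag ** 2).sum() / d             # decorrelation term
    return var_loss + cov_loss
```

A fully collapsed batch (identical rows) incurs a penalty near 1 from the variance term, while well-spread representations incur a small one, which is the sense in which such a term "enhances entropy and prevents collapse."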
1 code implementation • 9 Sep 2024 • Mohammad-Javad Darvishi-Bayazi, Md Rifat Arefin, Jocelyn Faubert, Irina Rish
Machine learning models often struggle with distribution shifts in real-world scenarios, whereas humans exhibit robust adaptation.
1 code implementation • 20 Feb 2024 • Md Rifat Arefin, Yan Zhang, Aristide Baratin, Francesco Locatello, Irina Rish, Dianbo Liu, Kenji Kawaguchi
Models prone to spurious correlations in training data often produce brittle predictions and introduce unintended biases.
2 code implementations • 19 Sep 2023 • Mohammad-Javad Darvishi-Bayazi, Mohammad Sajjad Ghaemi, Timothee Lesort, Md Rifat Arefin, Jocelyn Faubert, Irina Rish
We see improvement in the performance of the target model on the target (NMT) datasets by using knowledge from the source dataset (TUAB) when only a small amount of labelled data is available.
no code implementations • 10 Jul 2022 • Timothée Lesort, Oleksiy Ostapenko, Diganta Misra, Md Rifat Arefin, Pau Rodríguez, Laurent Charlin, Irina Rish
In this paper, we study the progressive knowledge accumulation (KA) in DNNs trained with gradient-based algorithms in long sequences of tasks with data re-occurrence.
1 code implementation • 30 Apr 2022 • Oleksiy Ostapenko, Timothee Lesort, Pau Rodríguez, Md Rifat Arefin, Arthur Douillard, Irina Rish, Laurent Charlin
Motivated by this, we study the efficacy of pre-trained vision models as a foundation for downstream continual learning (CL) scenarios.
no code implementations • 1 Dec 2020 • Md Rifat Arefin, Minhas Kamal, Kishan Kumar Ganguly, Tarek Salah Uddin Mahmud
Recommender systems have become an inseparable part of online shopping, and their usefulness grows with the advancement of e-commerce sites.
1 code implementation • 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) 2020 • Md Rifat Arefin, Vincent Michalski, Pierre-Luc St-Charles, Alfredo Kalaitzis, Sookyung Kim, Samira E. Kahou, Yoshua Bengio
High-resolution satellite imagery is critical for various earth observation applications related to environmental monitoring, geoscience, forecasting, and land use analysis.
2 code implementations • 15 Feb 2020 • Michel Deudon, Alfredo Kalaitzis, Israel Goytom, Md Rifat Arefin, Zhichao Lin, Kris Sankaran, Vincent Michalski, Samira E. Kahou, Julien Cornebise, Yoshua Bengio
Multi-frame Super-Resolution (MFSR) offers a more grounded approach to the ill-posed problem, by conditioning on multiple low-resolution views.
Ranked #6 on Multi-Frame Super-Resolution on PROBA-V
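The intuition behind conditioning on multiple low-resolution views can be shown with a deliberately naive baseline (not the paper's learned model): averaging co-registered noisy frames before upsampling suppresses per-frame noise, which is what makes the multi-frame problem better grounded than single-image super-resolution. The function name and nearest-neighbour upsampling below are illustrative assumptions:

```python
import numpy as np

def naive_mfsr(frames, scale=2):
    """Naive multi-frame fusion baseline (illustration only, not HighRes-net).

    frames: list of co-registered (H, W) low-resolution views of one scene.
    Averaging across frames cancels independent per-frame noise; the result
    is then upsampled by simple nearest-neighbour pixel repetition.
    """
    fused = np.stack(frames).mean(axis=0)                      # temporal fusion
    return fused.repeat(scale, axis=0).repeat(scale, axis=1)   # NN upsampling
```

With k independently noisy views of the same scene, the fused frame's noise standard deviation drops by roughly a factor of sqrt(k), so the multi-frame reconstruction is closer to the ground truth than any single-frame one.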
1 code implementation • ICLR 2020 • Michel Deudon, Alfredo Kalaitzis, Md Rifat Arefin, Israel Goytom, Zhichao Lin, Kris Sankaran, Vincent Michalski, Samira E. Kahou, Julien Cornebise, Yoshua Bengio
Multi-frame Super-Resolution (MFSR) offers a more grounded approach to the ill-posed problem, by conditioning on multiple low-resolution views.
Ranked #6 on Multi-Frame Super-Resolution on PROBA-V