1 code implementation • 18 May 2024 • Oleksiy Ostapenko, Zhan Su, Edoardo Maria Ponti, Laurent Charlin, Nicolas Le Roux, Matheus Pereira, Lucas Caccia, Alessandro Sordoni
The growing number of parameter-efficient adaptations of a base large language model (LLM) calls for studying whether we can reuse such trained adapters to improve performance for new tasks.
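A minimal sketch (not the paper's exact routing method) of what reusing trained adapters can look like: several previously trained LoRA adapters are applied to a frozen base linear layer as a weighted combination of their low-rank updates; the mixture weights `alpha` below are a hypothetical stand-in for whatever selection or routing rule picks and weighs adapters.

```python
# Sketch: reuse several trained LoRA adapters on a new task by mixing
# their low-rank updates on top of a frozen base linear layer.
import torch
import torch.nn as nn

class LinearWithLoRAMixture(nn.Module):
    def __init__(self, base: nn.Linear, loras, alpha):
        super().__init__()
        self.base = base      # frozen pre-trained layer
        self.loras = loras    # list of (A, B) low-rank factor pairs
        self.alpha = alpha    # one mixture weight per adapter (assumed given)

    def forward(self, x):
        out = self.base(x)
        for (A, B), w in zip(self.loras, self.alpha):
            out = out + w * ((x @ A.T) @ B.T)  # adds w * B A x
        return out

# Two rank-4 adapters reused with equal weight on a 16->16 base layer.
base = nn.Linear(16, 16)
loras = [(torch.randn(4, 16) * 0.01, torch.randn(16, 4) * 0.01) for _ in range(2)]
layer = LinearWithLoRAMixture(base, loras, alpha=[0.5, 0.5])
y = layer(torch.randn(8, 16))  # (8, 16)
```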
no code implementations • 9 Oct 2023 • Xinyi Wang, Lucas Caccia, Oleksiy Ostapenko, Xingdi Yuan, William Yang Wang, Alessandro Sordoni
To encourage a more structured generation of CoT steps, we propose a hierarchical generation scheme: we let the LM generate planning tokens at the start of each reasoning step, intuitively serving as a high-level plan of the current step, and add their embeddings to the model parameters.
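A hedged sketch of the mechanics, assuming a Hugging Face causal LM: the planning tokens become new vocabulary entries, so their embeddings are the only extra parameters; the token names and the way steps are prefixed below are illustrative, not the paper's exact recipe.

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

# Hypothetical planning tokens, one per high-level "plan" of a reasoning step.
planning_tokens = ["<plan_add>", "<plan_subtract>", "<plan_conclude>"]
tokenizer.add_special_tokens({"additional_special_tokens": planning_tokens})
model.resize_token_embeddings(len(tokenizer))  # new embedding rows to train

# During fine-tuning, each chain-of-thought step is prefixed with its plan:
step = "<plan_add> 3 apples plus 4 apples gives 7 apples."
print(tokenizer.tokenize(step))
```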
no code implementations • 10 Jul 2022 • Timothée Lesort, Oleksiy Ostapenko, Diganta Misra, Md Rifat Arefin, Pau Rodríguez, Laurent Charlin, Irina Rish
In this paper, we study progressive knowledge accumulation (KA) in DNNs trained with gradient-based algorithms on long sequences of tasks with data re-occurrence.
1 code implementation • 30 Apr 2022 • Oleksiy Ostapenko, Timothee Lesort, Pau Rodríguez, Md Rifat Arefin, Arthur Douillard, Irina Rish, Laurent Charlin
Motivated by this, we study the efficacy of pre-trained vision models as a foundation for downstream continual learning (CL) scenarios.
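To make the setup concrete, a minimal sketch of the recipe under study: freeze a pre-trained vision backbone and train only a lightweight head on the incoming tasks. The ResNet-18 backbone and linear head are illustrative choices, not the paper's full protocol.

```python
import torch
import torch.nn as nn
from torchvision import models

backbone = models.resnet18(weights="IMAGENET1K_V1")
backbone.fc = nn.Identity()           # expose 512-d features
for p in backbone.parameters():
    p.requires_grad = False           # the foundation stays frozen
backbone.eval()

head = nn.Linear(512, 100)            # updated continually across tasks
opt = torch.optim.SGD(head.parameters(), lr=0.01)

def train_step(images, labels):
    with torch.no_grad():
        feats = backbone(images)      # frozen features, shape (batch, 512)
    loss = nn.functional.cross_entropy(head(feats), labels)
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()

train_step(torch.randn(8, 3, 224, 224), torch.randint(0, 100, (8,)))
```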
1 code implementation • NeurIPS 2021 • Oleksiy Ostapenko, Pau Rodriguez, Massimo Caccia, Laurent Charlin
We introduce local module composition (LMC), an approach to modular CL where each module is provided with a local structural component that estimates the module's relevance to the input.
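A toy sketch of the idea: each module carries its own local scorer that estimates the module's relevance to the input, and the layer output is a relevance-weighted mixture. The linear modules and scorers below are illustrative assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class LocallyComposedLayer(nn.Module):
    def __init__(self, in_dim, out_dim, n_modules):
        super().__init__()
        self.mods = nn.ModuleList([nn.Linear(in_dim, out_dim) for _ in range(n_modules)])
        # One local structural component per module (here just a linear score).
        self.scorers = nn.ModuleList([nn.Linear(in_dim, 1) for _ in range(n_modules)])

    def forward(self, x):
        scores = torch.cat([s(x) for s in self.scorers], dim=-1)   # (batch, n_modules)
        weights = torch.softmax(scores, dim=-1)
        outs = torch.stack([m(x) for m in self.mods], dim=-1)      # (batch, out_dim, n_modules)
        return (outs * weights.unsqueeze(1)).sum(dim=-1)

layer = LocallyComposedLayer(in_dim=32, out_dim=64, n_modules=4)
y = layer(torch.randn(8, 32))  # (8, 64)
```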
3 code implementations • 2 Aug 2021 • Fabrice Normandin, Florian Golemo, Oleksiy Ostapenko, Pau Rodriguez, Matthew D Riemer, Julio Hurtado, Khimya Khetarpal, Ryan Lindeborg, Lucas Cecchi, Timothée Lesort, Laurent Charlin, Irina Rish, Massimo Caccia
We propose a taxonomy of settings, where each setting is described as a set of assumptions.
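A toy illustration of the "setting = set of assumptions" view (setting names invented for this sketch): a setting that makes a superset of another's assumptions is a special case of it, so methods for the general setting also apply to the specific one.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Setting:
    name: str
    assumptions: frozenset

    def is_special_case_of(self, other: "Setting") -> bool:
        # Making at least the same assumptions => a more specific setting.
        return self.assumptions >= other.assumptions

continual_sl = Setting("continual supervised learning",
                       frozenset({"supervised", "non-stationary data"}))
task_incremental = Setting("task-incremental learning",
                           frozenset({"supervised", "non-stationary data",
                                      "task labels at test time"}))

print(task_incremental.is_special_case_of(continual_sl))  # True
```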
no code implementations • NeurIPS 2020 • Massimo Caccia, Pau Rodriguez, Oleksiy Ostapenko, Fabrice Normandin, Min Lin, Lucas Page-Caccia, Issam Hadj Laradji, Irina Rish, Alexandre Lacoste, David Vázquez, Laurent Charlin
The main challenge is that the agent must not forget previous tasks while also adapting to novel tasks in the stream.
1 code implementation • NeurIPS 2020 • Massimo Caccia, Pau Rodriguez, Oleksiy Ostapenko, Fabrice Normandin, Min Lin, Lucas Caccia, Issam Laradji, Irina Rish, Alexandre Lacoste, David Vazquez, Laurent Charlin
We propose Continual-MAML, an online extension of the popular MAML algorithm, as a strong baseline for this scenario.
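Not the paper's Continual-MAML: the sketch below is a generic MAML-style inner/outer update for a single linear model, to show the kind of fast-adaptation baseline being extended. Data, loss, and step sizes are placeholders.

```python
import torch
import torch.nn as nn

def maml_step(model, support, query, inner_lr=0.01, loss_fn=nn.MSELoss()):
    x_s, y_s = support
    x_q, y_q = query
    # Inner loop: one gradient step on the support set, keeping the graph
    # so the outer update can differentiate through the adaptation.
    inner_loss = loss_fn(model(x_s), y_s)
    grads = torch.autograd.grad(inner_loss, list(model.parameters()), create_graph=True)
    w, b = [p - inner_lr * g for p, g in zip(model.parameters(), grads)]
    # Query loss under the adapted parameters (functional pass for one layer).
    return loss_fn(torch.nn.functional.linear(x_q, w, b), y_q)

model = nn.Linear(4, 1)
outer_opt = torch.optim.Adam(model.parameters(), lr=1e-3)
support = (torch.randn(16, 4), torch.randn(16, 1))
query = (torch.randn(16, 4), torch.randn(16, 1))
loss = maml_step(model, support, query)
outer_opt.zero_grad(); loss.backward(); outer_opt.step()
```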
no code implementations • 30 Nov 2019 • Abdullah Salama, Oleksiy Ostapenko, Tassilo Klein, Moin Nabi
We demonstrate the viability of our method by producing highly compressed models (VGG-16, ResNet-56, and ResNet-110 on CIFAR-10) with no performance loss relative to the baseline, as well as ResNet-34 and ResNet-50 on ImageNet without a significant loss of accuracy.
no code implementations • ICLR 2019 • Oleksiy Ostapenko, Mihai Puscas, Tassilo Klein, Moin Nabi
Continuously trainable models should be able to learn from a stream of data over an indefinite period of time.
2 code implementations • CVPR 2019 • Oleksiy Ostapenko, Mihai Puscas, Tassilo Klein, Patrick Jähnichen, Moin Nabi
In order to tackle these challenges, we introduce Dynamic Generative Memory (DGM), a synaptic-plasticity-driven framework for continual learning.
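Not DGM itself (which the paper describes as synaptic-plasticity driven): the sketch below only shows the generic generative-replay training step such frameworks build on; `solver`, `generator`, and the relabeling function are placeholders.

```python
import torch
import torch.nn as nn

def replay_step(solver, generator, real_x, real_y, opt, relabel, n_replay=32, z_dim=64):
    with torch.no_grad():
        z = torch.randn(n_replay, z_dim)
        replay_x = generator(z)          # synthetic samples standing in for past tasks
        replay_y = relabel(replay_x)     # e.g. labels from a previous solver
    x = torch.cat([real_x, replay_x])
    y = torch.cat([real_y, replay_y])
    loss = nn.functional.cross_entropy(solver(x), y)
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()

solver = nn.Sequential(nn.Flatten(), nn.Linear(784, 10))
generator = nn.Sequential(nn.Linear(64, 784), nn.Unflatten(1, (1, 28, 28)))
opt = torch.optim.Adam(solver.parameters())
relabel = lambda x: solver(x).argmax(dim=1)   # placeholder relabeling rule
replay_step(solver, generator, torch.randn(16, 1, 28, 28),
            torch.randint(0, 10, (16,)), opt, relabel)
```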
Ranked #4 on Continual Learning on ImageNet-50 (5 tasks)
no code implementations • 22 Nov 2018 • Frederik Pahde, Oleksiy Ostapenko, Patrick Jähnichen, Tassilo Klein, Moin Nabi
State-of-the-art deep learning algorithms yield remarkable results in many visual recognition tasks.
no code implementations • NIPS Workshop CDNNRIA 2018 • Abdullah Salama, Oleksiy Ostapenko, Moin Nabi, Tassilo Klein
High performance of deep learning models typically comes at the cost of considerable model size and computation time.