Search Results for author: Oleksiy Ostapenko

Found 13 papers, 6 papers with code

Towards Modular LLMs by Building and Reusing a Library of LoRAs

1 code implementation · 18 May 2024 · Oleksiy Ostapenko, Zhan Su, Edoardo Maria Ponti, Laurent Charlin, Nicolas Le Roux, Matheus Pereira, Lucas Caccia, Alessandro Sordoni

The growing number of parameter-efficient adaptations of a base large language model (LLM) calls for studying whether we can reuse such trained adapters to improve performance for new tasks.

Language Modelling · Large Language Model

Guiding Language Model Reasoning with Planning Tokens

no code implementations · 9 Oct 2023 · Xinyi Wang, Lucas Caccia, Oleksiy Ostapenko, Xingdi Yuan, William Yang Wang, Alessandro Sordoni

To encourage a more structural generation of CoT steps, we propose a hierarchical generation scheme: we let the LM generate a planning token at the start of each reasoning step, intuitively serving as a high-level plan of the current step, and add their embeddings to the model parameters.

Language Modelling · Math
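A minimal sketch of the planning-token idea described in the snippet above: a few new token embeddings are appended to the model's embedding table and trained, and each reasoning step is prefixed with one of them. The sizes, the names (PlanningTokenEmbedding, NUM_PLAN_TOKENS), and the frozen-base setup are illustrative assumptions, not the paper's implementation.

```python
import torch
import torch.nn as nn

# Illustrative sizes only; not the paper's configuration.
VOCAB_SIZE, HIDDEN, NUM_PLAN_TOKENS = 32000, 768, 8

class PlanningTokenEmbedding(nn.Module):
    """Embedding table extended with a few trainable planning-token rows.

    The base vocabulary rows are frozen; only the new planning-token
    embeddings receive gradients (a parameter-efficient setup assumed
    here for illustration).
    """
    def __init__(self, vocab_size: int, hidden: int, num_plan: int):
        super().__init__()
        self.base = nn.Embedding(vocab_size, hidden)
        self.base.weight.requires_grad_(False)      # keep base vocab frozen
        self.plan = nn.Embedding(num_plan, hidden)   # new trainable rows
        self.vocab_size = vocab_size

    def forward(self, ids: torch.Tensor) -> torch.Tensor:
        # Ids >= vocab_size address the planning-token rows appended at the end.
        weight = torch.cat([self.base.weight, self.plan.weight], dim=0)
        return nn.functional.embedding(ids, weight)

emb = PlanningTokenEmbedding(VOCAB_SIZE, HIDDEN, NUM_PLAN_TOKENS)
# A reasoning step prefixed with planning token #2 (id = VOCAB_SIZE + 2).
step_ids = torch.tensor([[VOCAB_SIZE + 2, 17, 512, 902]])
print(emb(step_ids).shape)  # torch.Size([1, 4, 768])
```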

Challenging Common Assumptions about Catastrophic Forgetting

no code implementations · 10 Jul 2022 · Timothée Lesort, Oleksiy Ostapenko, Diganta Misra, Md Rifat Arefin, Pau Rodríguez, Laurent Charlin, Irina Rish

In this paper, we study the progressive knowledge accumulation (KA) in DNNs trained with gradient-based algorithms in long sequences of tasks with data re-occurrence.

Continual Learning · Memorization

Continual Learning with Foundation Models: An Empirical Study of Latent Replay

1 code implementation · 30 Apr 2022 · Oleksiy Ostapenko, Timothee Lesort, Pau Rodríguez, Md Rifat Arefin, Arthur Douillard, Irina Rish, Laurent Charlin

Motivated by this, we study the efficacy of pre-trained vision models as a foundation for downstream continual learning (CL) scenarios.

Benchmarking · Continual Learning
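One common way to realize latent replay on top of a frozen pre-trained backbone, the setting studied in the paper above, is to cache the encoder's features in a small buffer and mix them into each new batch when training the head. The sketch below illustrates that idea under assumed component names (encoder, head, buffer); it is not the paper's exact protocol or benchmark setup.

```python
import random
import torch
import torch.nn as nn

# Assumed components for illustration: a frozen pre-trained encoder
# and a small trainable classification head.
encoder = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 256)).eval()
for p in encoder.parameters():
    p.requires_grad_(False)
head = nn.Linear(256, 10)
opt = torch.optim.SGD(head.parameters(), lr=0.01)

buffer = []           # latent replay buffer of (feature, label) pairs
BUFFER_SIZE = 1000    # arbitrary cap for this sketch

def train_step(x: torch.Tensor, y: torch.Tensor) -> None:
    """One continual-learning step with latent replay."""
    with torch.no_grad():
        z = encoder(x)                    # latents from the frozen backbone
    # Mix current latents with a few replayed ones from earlier tasks.
    if buffer:
        zr, yr = zip(*random.sample(buffer, k=min(len(buffer), len(z))))
        z_all = torch.cat([z, torch.stack(zr)])
        y_all = torch.cat([y, torch.stack(yr)])
    else:
        z_all, y_all = z, y
    loss = nn.functional.cross_entropy(head(z_all), y_all)
    opt.zero_grad()
    loss.backward()
    opt.step()
    # Store the new latents, keeping only the most recent BUFFER_SIZE entries.
    for zi, yi in zip(z, y):
        buffer.append((zi, yi))
    del buffer[:-BUFFER_SIZE]

train_step(torch.randn(8, 3, 32, 32), torch.randint(0, 10, (8,)))
```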

Continual Learning via Local Module Composition

1 code implementation · NeurIPS 2021 · Oleksiy Ostapenko, Pau Rodriguez, Massimo Caccia, Laurent Charlin

We introduce local module composition (LMC), an approach to modular CL where each module is provided a local structural component that estimates a module's relevance to the input.

Continual Learning · Transfer Learning
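A schematic illustration of the composition rule sketched in the abstract above: each module carries a local scorer of its relevance to the current input, and module outputs are mixed according to those scores. The module and scorer architectures below (a reconstruction-error proxy) are placeholder assumptions, not the structural components defined in the paper.

```python
import torch
import torch.nn as nn

class LocalModule(nn.Module):
    """A functional module paired with a local relevance scorer.

    The scorer is a tiny autoencoder-style proxy: lower reconstruction
    error is read as higher relevance to the input. This is a
    simplification for illustration only.
    """
    def __init__(self, dim: int):
        super().__init__()
        self.f = nn.Sequential(nn.Linear(dim, dim), nn.ReLU())
        self.scorer = nn.Sequential(nn.Linear(dim, dim // 4), nn.Linear(dim // 4, dim))

    def relevance(self, x: torch.Tensor) -> torch.Tensor:
        # Negative reconstruction error: higher means "more in-distribution".
        return -((self.scorer(x) - x) ** 2).mean(dim=-1)

class LMCLayer(nn.Module):
    """Composes modules locally: each input is weighted by per-module relevance."""
    def __init__(self, dim: int, num_modules: int = 3):
        super().__init__()
        self.modules_ = nn.ModuleList(LocalModule(dim) for _ in range(num_modules))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        scores = torch.stack([m.relevance(x) for m in self.modules_], dim=-1)
        weights = scores.softmax(dim=-1)                                 # (batch, M)
        outputs = torch.stack([m.f(x) for m in self.modules_], dim=-1)   # (batch, dim, M)
        return (outputs * weights.unsqueeze(1)).sum(dim=-1)

layer = LMCLayer(dim=32)
print(layer(torch.randn(4, 32)).shape)  # torch.Size([4, 32])
```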

Pruning at a Glance: Global Neural Pruning for Model Compression

no code implementations · 30 Nov 2019 · Abdullah Salama, Oleksiy Ostapenko, Tassilo Klein, Moin Nabi

We demonstrate the viability of our method by producing highly compressed models, namely VGG-16, ResNet-56, and ResNet-110 on CIFAR10, without any loss of performance compared to the baseline, as well as ResNet-34 and ResNet-50 on ImageNet without a significant loss of accuracy.

Model Compression
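The usual form of global (as opposed to layer-wise) magnitude pruning behind compression results like the ones above is to rank all weights across the network by magnitude and zero out the smallest fraction. The sketch below shows that global criterion via torch.nn.utils.prune.global_unstructured; the toy network and the 90% sparsity target are arbitrary placeholders, not the paper's models or method details.

```python
import torch.nn as nn
import torch.nn.utils.prune as prune

# Placeholder network standing in for VGG/ResNet; the pruning call is the point.
model = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1),
    nn.ReLU(),
    nn.Conv2d(16, 32, 3, padding=1),
    nn.ReLU(),
    nn.Flatten(),
    nn.Linear(32 * 32 * 32, 10),
)

# Collect every weight tensor so the magnitude threshold is computed globally,
# across the whole network, rather than layer by layer.
to_prune = [(m, "weight") for m in model.modules()
            if isinstance(m, (nn.Conv2d, nn.Linear))]

prune.global_unstructured(to_prune, pruning_method=prune.L1Unstructured, amount=0.9)

zeros = sum(int((m.weight == 0).sum()) for m, _ in to_prune)
total = sum(m.weight.numel() for m, _ in to_prune)
print(f"global sparsity: {zeros / total:.2%}")  # ~90% across all layers combined
```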
