no code implementations • 10 Dec 2024 • Lennart Schneider, Martin Wistuba, Aaron Klein, Jacek Golebiowski, Giovanni Zappella, Felice Antonio Merra
Optimal prompt selection is crucial for maximizing large language model (LLM) performance on downstream tasks.
no code implementations • 31 Jul 2024 • Simon Valentin, Jinmiao Fu, Gianluca Detommaso, Shaoyuan Xu, Giovanni Zappella, Bryan Wang
Large language models (LLMs) can be prone to hallucinations: generating unreliable outputs that are unfaithful to their inputs, contradict external facts, or are internally inconsistent.
no code implementations • 5 Jun 2024 • Martin Wistuba, Prabhu Teja Sivaprasad, Lukas Balles, Giovanni Zappella
We find that the choice of prompt tuning as a parameter-efficient fine-tuning (PEFT) method hurts the overall performance of the continual learning (CL) system.
no code implementations • 8 Dec 2023 • Lukas Balles, Cédric Archambeau, Giovanni Zappella
With increasing scale in model and dataset size, the training of deep neural networks becomes a massive computational burden.
no code implementations • 29 Nov 2023 • Martin Wistuba, Prabhu Teja Sivaprasad, Lukas Balles, Giovanni Zappella
Recent work using pretrained transformers has shown impressive performance when fine-tuned with data from the downstream problem of interest.
1 code implementation • 24 Apr 2023 • Martin Wistuba, Martin Ferianc, Lukas Balles, Cédric Archambeau, Giovanni Zappella
We discuss requirements for the use of continual learning algorithms in practice, from which we derive design principles for Renate.
2 code implementations • 14 Jul 2022 • Ondrej Bohdal, Lukas Balles, Martin Wistuba, Beyza Ermis, Cédric Archambeau, Giovanni Zappella
Hyperparameter optimization (HPO) and neural architecture search (NAS) are methods of choice to obtain the best-in-class machine learning models, but in practice they can be costly to run.
no code implementations • 28 Jun 2022 • Beyza Ermis, Giovanni Zappella, Martin Wistuba, Aditya Rawal, Cédric Archambeau
This phenomenon is known as catastrophic forgetting, and it is often difficult to prevent due to practical constraints, such as the amount of data that can be stored or the limited computational resources that can be used.
no code implementations • 28 Mar 2022 • Lukas Balles, Giovanni Zappella, Cédric Archambeau
Most widely-used CL methods rely on a rehearsal memory of data points to be reused while training on new data.
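A rehearsal memory of this kind can be sketched as a bounded buffer maintained by reservoir sampling (an illustrative sketch, not the paper's specific method; the class and parameter names below are hypothetical):

```python
import random

class RehearsalBuffer:
    """Bounded rehearsal memory for continual learning (illustrative).

    Keeps a uniform random sample of all examples seen so far via
    reservoir sampling; stored examples are mixed into training
    batches for new tasks to mitigate forgetting.
    """

    def __init__(self, capacity, seed=0):
        self.capacity = capacity
        self.buffer = []
        self.seen = 0
        self.rng = random.Random(seed)

    def add(self, example):
        self.seen += 1
        if len(self.buffer) < self.capacity:
            self.buffer.append(example)
        else:
            # Reservoir sampling: each of the `seen` examples ends up
            # in the buffer with probability capacity / seen.
            j = self.rng.randrange(self.seen)
            if j < self.capacity:
                self.buffer[j] = example

    def sample(self, batch_size):
        k = min(batch_size, len(self.buffer))
        return self.rng.sample(self.buffer, k)
```

During training on a new task, each mini-batch would be augmented with `buffer.sample(m)` old examples, so gradients reflect both new and past data.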
no code implementations • 9 Mar 2022 • Beyza Ermis, Giovanni Zappella, Martin Wistuba, Aditya Rawal, Cédric Archambeau
Moreover, applications increasingly rely on large pre-trained neural networks, such as pre-trained Transformers, since practitioners often lack the resources or data needed to train such models from scratch.
no code implementations • 9 Dec 2021 • Lukas Balles, Giovanni Zappella, Cédric Archambeau
We devise a coreset selection method based on the idea of gradient matching: The gradients induced by the coreset should match, as closely as possible, those induced by the original training dataset.
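The gradient-matching idea can be illustrated with a small greedy sketch over per-example gradients (a toy under stated assumptions, not the paper's exact selection procedure; the function name is hypothetical):

```python
import numpy as np

def coreset_by_gradient_matching(grads, k):
    """Greedily pick k examples whose weighted gradient sum best
    matches the mean gradient of the full dataset (OMP-style toy).

    grads: (n, d) array of per-example gradients.
    Returns selected indices and least-squares weights.
    """
    target = grads.mean(axis=0)        # full-data mean gradient
    selected = []
    residual = target.copy()
    for _ in range(k):
        # Pick the example whose gradient best aligns with what is
        # still unexplained by the current coreset.
        scores = grads @ residual
        scores[selected] = -np.inf     # select without replacement
        selected.append(int(np.argmax(scores)))
        # Re-fit coreset weights by least squares, update residual.
        G = grads[selected].T          # (d, |coreset|)
        w, *_ = np.linalg.lstsq(G, target, rcond=None)
        residual = target - G @ w
    return selected, w
```

The coreset's weighted gradient then approximates the full-data gradient, so training on the coreset mimics training on the original dataset.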
no code implementations • ICML Workshop AutoML 2021 • Giovanni Zappella, David Salinas, Cédric Archambeau
In this work we consider the problem of repeated hyperparameter and neural architecture search (HNAS).
no code implementations • 15 Dec 2020 • Piali Das, Valerio Perrone, Nikita Ivkin, Tanya Bansal, Zohar Karnin, Huibin Shen, Iaroslav Shcherbatyi, Yotam Elor, Wilton Wu, Aida Zolic, Thibaut Lienart, Alex Tang, Amr Ahmed, Jean Baptiste Faddoul, Rodolphe Jenatton, Fela Winkelmolen, Philip Gautier, Leo Dirac, Andre Perunicic, Miroslav Miladinovic, Giovanni Zappella, Cédric Archambeau, Matthias Seeger, Bhaskar Dutt, Laurence Rouesnel
AutoML systems provide a black-box solution to machine learning problems by selecting the right way of processing features, choosing an algorithm and tuning the hyperparameters of the entire pipeline.
no code implementations • 28 Apr 2020 • Giuseppe Di Benedetto, Vito Bellini, Giovanni Zappella
Here we present a contextual bandit algorithm which detects and adapts to abrupt changes of the reward function and leverages previous estimations whenever the environment falls back to a previously observed state.
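One simple way to detect such abrupt reward changes is to compare a recent-window estimate against the long-run estimate and restart when they diverge (an illustrative sketch of the general change-detection idea, not the paper's algorithm; all names and thresholds below are hypothetical):

```python
from collections import deque

class ChangeAwareArm:
    """Reward tracker for one bandit arm (illustrative sketch).

    Maintains a long-run mean and a sliding window of recent rewards;
    when the two disagree by more than a threshold, the long-run
    statistics are reset to the recent window, adapting to an
    abrupt change in the reward function.
    """

    def __init__(self, window=50, threshold=0.3):
        self.window = window
        self.threshold = threshold
        self.recent = deque(maxlen=window)
        self.total = 0.0
        self.count = 0

    def update(self, reward):
        self.recent.append(reward)
        self.total += reward
        self.count += 1
        if self.count >= 2 * self.window and len(self.recent) == self.window:
            long_run = self.total / self.count
            recent_mean = sum(self.recent) / self.window
            if abs(recent_mean - long_run) > self.threshold:
                # Abrupt change detected: restart from recent data only.
                self.total = sum(self.recent)
                self.count = self.window

    def mean(self):
        return self.total / self.count if self.count else 0.0
```

A full algorithm would additionally keep the discarded estimates around and reuse them if the environment later falls back to a previously observed state, as the paper describes.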
no code implementations • 27 Apr 2020 • Beyza Ermis, Patrick Ernst, Yannik Stein, Giovanni Zappella
Personalization is a crucial aspect of many online experiences.
no code implementations • ICML 2020 • Claire Vernade, Alexandra Carpentier, Tor Lattimore, Giovanni Zappella, Beyza Ermis, Michael Brueckner
Stochastic linear bandits are a natural and well-studied model for structured exploration/exploitation problems and are widely used in applications such as online marketing and recommendation.
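As background (this is the standard model, not a contribution of this paper), a stochastic linear bandit assumes the reward of a chosen action $x_t \in \mathbb{R}^d$ is linear in an unknown parameter vector:

```latex
r_t = \langle \theta_*, x_t \rangle + \eta_t,
\qquad \theta_* \in \mathbb{R}^d \text{ unknown},
\quad \eta_t \text{ zero-mean noise},
```

and the learner picks actions to minimize cumulative regret against the best action $\arg\max_x \langle \theta_*, x \rangle$.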
no code implementations • ICML 2017 • Claudio Gentile, Shuai Li, Purushottam Kar, Alexandros Karatzoglou, Evans Etrue, Giovanni Zappella
We investigate CAB, a novel clustering-of-bandits algorithm for collaborative recommendation tasks that implements the underlying feedback-sharing mechanism by estimating user neighborhoods in a context-dependent manner.
no code implementations • 31 Jan 2014 • Claudio Gentile, Shuai Li, Giovanni Zappella
We introduce a novel algorithmic approach to content recommendation based on adaptive clustering of exploration-exploitation ("bandit") strategies.
no code implementations • NeurIPS 2013 • Nicolò Cesa-Bianchi, Claudio Gentile, Giovanni Zappella
Multi-armed bandit problems are receiving a great deal of attention because they adequately formalize the exploration-exploitation trade-offs arising in several industrially relevant applications, such as online advertisement and, more generally, recommendation systems.
no code implementations • NeurIPS 2012 • Nicolò Cesa-Bianchi, Claudio Gentile, Fabio Vitale, Giovanni Zappella
We provide a theoretical analysis within this model, showing that we can achieve an optimal (to within a constant factor) number of mistakes on any graph $G = (V, E)$ such that $|E|$ is at least order of $|V|^{3/2}$ by querying at most order of $|V|^{3/2}$ edge labels.
3 code implementations • 19 Dec 2011 • Giovanni Zappella
We introduce a scalable algorithm, MUCCA, for multiclass node classification in weighted graphs.
no code implementations • NeurIPS 2011 • Fabio Vitale, Nicolò Cesa-Bianchi, Claudio Gentile, Giovanni Zappella
Although it is known how to predict the nodes of an unweighted tree in a nearly optimal way, in the weighted case a fully satisfactory algorithm is not available yet.