Search Results for author: Andrea Gesmundo

Found 28 papers, 7 papers with code

An Evolutionary Approach to Dynamic Introduction of Tasks in Large-scale Multitask Learning Systems

1 code implementation • 25 May 2022 • Andrea Gesmundo, Jeff Dean

Multitask learning assumes that models capable of learning from multiple tasks can achieve better quality and efficiency via knowledge transfer, a key feature of human learning.

Continual Learning • Fine-Grained Image Classification • +1

A Continual Development Methodology for Large-scale Multitask Dynamic ML Systems

1 code implementation • 15 Sep 2022 • Andrea Gesmundo

This methodology has multiple efficiency and scalability disadvantages, such as spending significant resources on the creation of multiple trial models that do not contribute to the final solution. The presented work is based on the intuition that defining ML models as modular and extensible artefacts makes it possible to introduce a novel ML development methodology, enabling the integration of multiple design and evaluation iterations into the continuous enrichment of a single unbounded intelligent system.

Domain Generalization • Fine-Grained Image Classification • +3

A Multiagent Framework for the Asynchronous and Collaborative Extension of Multitask ML Systems

1 code implementation • 29 Sep 2022 • Andrea Gesmundo

We believe that this novel methodology for ML development can be demonstrated through a modularized representation of ML models and the definition of novel abstractions that allow the implementation and execution of diverse methods for the asynchronous use and extension of modular intelligent systems.

Ask the Right Questions: Active Question Reformulation with Reinforcement Learning

2 code implementations • ICLR 2018 • Christian Buck, Jannis Bulian, Massimiliano Ciaramita, Wojciech Gajewski, Andrea Gesmundo, Neil Houlsby, Wei Wang

The agent probes the system with potentially many natural-language reformulations of an initial question and aggregates the returned evidence to yield the best answer.

Information Retrieval • Question Answering • +3
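
The probe-and-aggregate loop described in the abstract above can be illustrated with a minimal sketch; `reformulate`, `qa_system`, and the max-confidence aggregation below are hypothetical stand-ins rather than the paper's trained components.

```python
# Minimal sketch of an ActiveQA-style probe-and-aggregate loop.
# reformulate() and qa_system() are hypothetical stand-ins.

def reformulate(question: str, n: int) -> list[str]:
    """Stand-in for the learned sequence-to-sequence reformulator."""
    return [question] + [f"{question} (paraphrase {i})" for i in range(1, n)]

def qa_system(query: str) -> tuple[str, float]:
    """Stand-in for the black-box QA environment; returns (answer, confidence)."""
    return f"answer to: {query}", 1.0 / (1 + len(query))

def active_qa(question: str, n_reformulations: int = 8) -> str:
    # Probe the QA system with many natural-language reformulations.
    candidates = [qa_system(q) for q in reformulate(question, n_reformulations)]
    # Aggregate the returned evidence; here, keep the highest-confidence answer.
    best_answer, _ = max(candidates, key=lambda pair: pair[1])
    return best_answer

print(active_qa("when was the moon landing?"))
```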

Transfer Learning with Neural AutoML

no code implementations • NeurIPS 2018 • Catherine Wong, Neil Houlsby, Yifeng Lu, Andrea Gesmundo

We extend RL-based architecture search methods to support parallel training on multiple tasks and then transfer the search strategy to new tasks.

General Classification • Image Classification • +2

Analyzing Language Learned by an Active Question Answering Agent

no code implementations • 23 Jan 2018 • Christian Buck, Jannis Bulian, Massimiliano Ciaramita, Wojciech Gajewski, Andrea Gesmundo, Neil Houlsby, Wei Wang

We analyze the language learned by an agent trained with reinforcement learning as a component of the ActiveQA system [Buck et al., 2017].

Information Retrieval • Question Answering • +3

Transfer Learning to Learn with Multitask Neural Model Search

no code implementations • ICLR 2018 • Catherine Wong, Andrea Gesmundo

We demonstrate that MNMS can conduct an automated architecture search for multiple tasks simultaneously while still learning well-performing, specialized models for each task.

Hyperparameter Optimization • Neural Architecture Search • +1

Evolutionary-Neural Hybrid Agents for Architecture Search

no code implementations • 24 Nov 2018 • Krzysztof Maziarz, Mingxing Tan, Andrey Khorlin, Marin Georgiev, Andrea Gesmundo

We show that the Evo-NAS agent outperforms both neural and evolutionary agents when applied to architecture search for a suite of text and image classification benchmarks.

Evolutionary Algorithms • General Classification • +3

Fast Task-Aware Architecture Inference

no code implementations • 15 Feb 2019 • Efi Kokiopoulou, Anja Hauth, Luciano Sbaiz, Andrea Gesmundo, Gabor Bartok, Jesse Berent

At the core of our framework lies a deep value network that can predict the performance of input architectures on a task by utilizing task meta-features and the previous model training experiments performed on related tasks.

Computational Efficiency • Neural Architecture Search
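
As a rough illustration of the value-network idea described above: a regressor scores a candidate architecture's encoding together with task meta-features, so candidates can be ranked without training any of them. The dimensions and the two-layer MLP below are illustrative assumptions, not the paper's configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions (assumptions, not the paper's configuration).
ARCH_DIM, META_DIM, HIDDEN = 32, 16, 64

# A tiny two-layer MLP standing in for the deep value network.
W1 = rng.normal(0, 0.1, (ARCH_DIM + META_DIM, HIDDEN))
W2 = rng.normal(0, 0.1, HIDDEN)

def predict_performance(arch_encoding: np.ndarray, task_meta: np.ndarray) -> float:
    """Predict a task-specific performance score for a candidate architecture."""
    x = np.concatenate([arch_encoding, task_meta])
    h = np.maximum(x @ W1, 0.0)  # ReLU hidden layer
    return float(h @ W2)

# Rank candidate architectures for a task without training any of them.
task_meta = rng.normal(size=META_DIM)
candidates = [rng.normal(size=ARCH_DIM) for _ in range(5)]
best = max(candidates, key=lambda a: predict_performance(a, task_meta))
```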

Transfer NAS: Knowledge Transfer between Search Spaces with Transformer Agents

no code implementations • 19 Jun 2019 • Zalán Borsos, Andrey Khorlin, Andrea Gesmundo

Recent advances in Neural Architecture Search (NAS) have produced state-of-the-art architectures on several tasks.

Neural Architecture Search • Transfer Learning

Flexible Multi-task Networks by Learning Parameter Allocation

no code implementations • 10 Oct 2019 • Krzysztof Maziarz, Efi Kokiopoulou, Andrea Gesmundo, Luciano Sbaiz, Gabor Bartok, Jesse Berent

The binary allocation variables are learned jointly with the model parameters by standard back-propagation thanks to the Gumbel-Softmax reparametrization method.

Multi-Task Learning
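
A minimal sketch of the binary Gumbel-Softmax relaxation mentioned above, assuming each allocation variable is parameterized by an (on, off) logit pair; this shows only the differentiable sampling step, not the paper's full routing model or its back-propagation.

```python
import numpy as np

rng = np.random.default_rng(0)

def gumbel_softmax_binary(logits: np.ndarray, temperature: float = 1.0) -> np.ndarray:
    """Differentiable relaxation of per-variable 0/1 allocation samples.

    Each logit parameterizes an (on, off) pair; softmax over the
    Gumbel-perturbed pair approaches a hard 0/1 choice as temperature -> 0.
    """
    pair = np.stack([logits, np.zeros_like(logits)], axis=-1)
    u = rng.uniform(1e-6, 1.0, size=pair.shape)
    gumbel = -np.log(-np.log(u))
    y = (pair + gumbel) / temperature
    y = np.exp(y - y.max(axis=-1, keepdims=True))
    return (y / y.sum(axis=-1, keepdims=True))[..., 0]

# Soft allocations over 4 components; lower temperatures sharpen toward 0/1.
logits = np.array([2.0, -1.0, 0.5, 0.0])
print(gumbel_softmax_binary(logits, temperature=0.5))
```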

Ranking architectures using meta-learning

no code implementations • 26 Nov 2019 • Alina Dubatovka, Efi Kokiopoulou, Luciano Sbaiz, Andrea Gesmundo, Gabor Bartok, Jesse Berent

However, it requires a large amount of computing resources; to alleviate this, a performance prediction network has recently been proposed that enables efficient architecture search by forecasting the performance of candidate architectures instead of relying on actual model training.

Meta-Learning • Neural Architecture Search

Evo-NAS: Evolutionary-Neural Hybrid Agent for Architecture Search

no code implementations • 25 Sep 2019 • Krzysztof Maziarz, Mingxing Tan, Andrey Khorlin, Kuang-Yu Samuel Chang, Andrea Gesmundo

We show that the Evo-NAS agent outperforms both neural and evolutionary agents when applied to architecture search for a suite of text and image classification benchmarks.

Evolutionary Algorithms • Image Classification • +2

Gumbel-Matrix Routing for Flexible Multi-task Learning

no code implementations • 25 Sep 2019 • Krzysztof Maziarz, Efi Kokiopoulou, Andrea Gesmundo, Luciano Sbaiz, Gabor Bartok, Jesse Berent

We propose Gumbel-Matrix routing, a novel multi-task routing method based on the Gumbel-Softmax that is designed to learn fine-grained parameter sharing.

Multi-Task Learning

muNet: Evolving Pretrained Deep Neural Networks into Scalable Auto-tuning Multitask Systems

no code implementations • 22 May 2022 • Andrea Gesmundo, Jeff Dean

We propose a method that uses the layers of a pretrained deep neural network as building blocks to construct an ML system that can jointly solve an arbitrary number of tasks.

Image Classification • Transfer Learning
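
A toy sketch of the building-block idea from the abstract above, under the simplifying assumption that a per-task model is just an ordered path over a shared pool of layers plus a task-specific head; this is not muNet's actual evolutionary algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

# Shared pool of layers (e.g., seeded from a pretrained network).
DIM = 16
pool = {name: rng.normal(0, 0.1, (DIM, DIM)) for name in ["block0", "block1", "block2"]}

# Each task's model is a path: an ordered selection of shared blocks
# plus a task-specific head.
paths = {
    "task_a": ["block0", "block1"],
    "task_b": ["block0", "block2"],  # reuses block0 introduced for task_a
}
heads = {task: rng.normal(0, 0.1, (3, DIM)) for task in paths}

def forward(task: str, x: np.ndarray) -> np.ndarray:
    """Run input x through the task's path over the shared layer pool."""
    for name in paths[task]:
        x = np.maximum(pool[name] @ x, 0.0)
    return heads[task] @ x

print(forward("task_a", rng.normal(size=DIM)))
```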

Multipath agents for modular multitask ML systems

no code implementations • 6 Feb 2023 • Andrea Gesmundo

Diverse agents can compete to produce the best performing model for a task by reusing the modules introduced to the system by competing agents.

Image Classification

Composable Function-preserving Expansions for Transformer Architectures

no code implementations • 11 Aug 2023 • Andrea Gesmundo, Kaitlin Maile

Training state-of-the-art neural networks requires a high cost in terms of compute and time.
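
For intuition, a classical example of a function-preserving expansion (not necessarily the paper's exact construction): a hidden layer can be widened with zero-initialized outgoing weights, so the expanded network computes exactly the same function before further training.

```python
import numpy as np

rng = np.random.default_rng(0)

def widen_hidden_layer(W1, b1, W2, extra):
    """Widen a hidden layer by `extra` units without changing the network's output.

    New units get arbitrary incoming weights but zero outgoing weights, so
    they contribute nothing to the output until training updates them.
    """
    W1_new = np.vstack([W1, rng.normal(0, 0.01, (extra, W1.shape[1]))])
    b1_new = np.concatenate([b1, np.zeros(extra)])
    W2_new = np.hstack([W2, np.zeros((W2.shape[0], extra))])
    return W1_new, b1_new, W2_new

# Check: the widened network computes exactly the same function.
W1, b1, W2 = rng.normal(size=(8, 4)), rng.normal(size=8), rng.normal(size=(3, 8))
x = rng.normal(size=4)
y_old = W2 @ np.maximum(W1 @ x + b1, 0.0)
W1n, b1n, W2n = widen_hidden_layer(W1, b1, W2, extra=4)
y_new = W2n @ np.maximum(W1n @ x + b1n, 0.0)
assert np.allclose(y_old, y_new)
```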
