Search Results for author: Andrea Gesmundo

Found 28 papers, 7 papers with code

An Evolutionary Approach to Dynamic Introduction of Tasks in Large-scale Multitask Learning Systems

1 code implementation • 25 May 2022 • Andrea Gesmundo, Jeff Dean

Multitask learning assumes that models capable of learning from multiple tasks can achieve better quality and efficiency via knowledge transfer, a key feature of human learning.

Continual Learning • Fine-Grained Image Classification • +1

A Continual Development Methodology for Large-scale Multitask Dynamic ML Systems

1 code implementation • 15 Sep 2022 • Andrea Gesmundo

This methodology has multiple efficiency and scalability disadvantages, such as spending significant resources on the creation of multiple trial models that do not contribute to the final solution. The presented work is based on the intuition that defining ML models as modular and extensible artefacts makes it possible to introduce a novel ML development methodology, enabling the integration of multiple design and evaluation iterations into the continuous enrichment of a single unbounded intelligent system.

Domain Generalization • Fine-Grained Image Classification • +3

A Multiagent Framework for the Asynchronous and Collaborative Extension of Multitask ML Systems

1 code implementation • 29 Sep 2022 • Andrea Gesmundo

We believe that this novel methodology for ML development can be demonstrated through a modularized representation of ML models and the definition of novel abstractions that allow the implementation and execution of diverse methods for the asynchronous use and extension of modular intelligent systems.

Ask the Right Questions: Active Question Reformulation with Reinforcement Learning

2 code implementations • ICLR 2018 • Christian Buck, Jannis Bulian, Massimiliano Ciaramita, Wojciech Gajewski, Andrea Gesmundo, Neil Houlsby, Wei Wang

The agent probes the system with potentially many natural-language reformulations of an initial question and aggregates the returned evidence to yield the best answer.

Information Retrieval • Question Answering • +3
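
The probe-and-aggregate loop described in the abstract above can be illustrated with a minimal sketch; `reformulate`, `qa_system`, and the max-confidence aggregation below are hypothetical stand-ins rather than the paper's trained components.

```python
# Minimal sketch of an ActiveQA-style probe-and-aggregate loop.
# reformulate() and qa_system() are hypothetical stand-ins.

def reformulate(question: str, n: int) -> list[str]:
    """Stand-in for the learned sequence-to-sequence reformulator."""
    return [question] + [f"{question} (paraphrase {i})" for i in range(1, n)]

def qa_system(query: str) -> tuple[str, float]:
    """Stand-in for the black-box QA environment; returns (answer, confidence)."""
    return f"answer to: {query}", 1.0 / (1 + len(query))

def active_qa(question: str, n_reformulations: int = 8) -> str:
    # Probe the QA system with many natural-language reformulations.
    candidates = [qa_system(q) for q in reformulate(question, n_reformulations)]
    # Aggregate the returned evidence; here, keep the highest-confidence answer.
    best_answer, _ = max(candidates, key=lambda pair: pair[1])
    return best_answer

print(active_qa("when was the moon landing?"))
```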

Transfer Learning with Neural AutoML

no code implementations • NeurIPS 2018 • Catherine Wong, Neil Houlsby, Yifeng Lu, Andrea Gesmundo

We extend RL-based architecture search methods to support parallel training on multiple tasks and then transfer the search strategy to new tasks.

General Classification • Image Classification • +2

Analyzing Language Learned by an Active Question Answering Agent

no code implementations • 23 Jan 2018 • Christian Buck, Jannis Bulian, Massimiliano Ciaramita, Wojciech Gajewski, Andrea Gesmundo, Neil Houlsby, Wei Wang

We analyze the language learned by an agent trained with reinforcement learning as a component of the ActiveQA system [Buck et al., 2017].

Information Retrieval • Question Answering • +3

Transfer Learning to Learn with Multitask Neural Model Search

no code implementations • ICLR 2018 • Catherine Wong, Andrea Gesmundo

We demonstrate that MNMS can conduct an automated architecture search for multiple tasks simultaneously while still learning well-performing, specialized models for each task.

Hyperparameter Optimization • Neural Architecture Search • +1

Evolutionary-Neural Hybrid Agents for Architecture Search

no code implementations • 24 Nov 2018 • Krzysztof Maziarz, Mingxing Tan, Andrey Khorlin, Marin Georgiev, Andrea Gesmundo

We show that the Evo-NAS agent outperforms both neural and evolutionary agents when applied to architecture search for a suite of text and image classification benchmarks.

Evolutionary Algorithms • General Classification • +3

Fast Task-Aware Architecture Inference

no code implementations • 15 Feb 2019 • Efi Kokiopoulou, Anja Hauth, Luciano Sbaiz, Andrea Gesmundo, Gabor Bartok, Jesse Berent

At the core of our framework lies a deep value network that can predict the performance of input architectures on a task by utilizing task meta-features and the previous model training experiments performed on related tasks.

Computational Efficiency • Neural Architecture Search
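
As a rough illustration of the value-network idea described above: a regressor scores a candidate architecture's encoding together with task meta-features, so candidates can be ranked without training any of them. The dimensions and the two-layer MLP below are illustrative assumptions, not the paper's configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions (assumptions, not the paper's configuration).
ARCH_DIM, META_DIM, HIDDEN = 32, 16, 64

# A tiny two-layer MLP standing in for the deep value network.
W1 = rng.normal(0, 0.1, (ARCH_DIM + META_DIM, HIDDEN))
W2 = rng.normal(0, 0.1, HIDDEN)

def predict_performance(arch_encoding: np.ndarray, task_meta: np.ndarray) -> float:
    """Predict a task-specific performance score for a candidate architecture."""
    x = np.concatenate([arch_encoding, task_meta])
    h = np.maximum(x @ W1, 0.0)  # ReLU hidden layer
    return float(h @ W2)

# Rank candidate architectures for a task without training any of them.
task_meta = rng.normal(size=META_DIM)
candidates = [rng.normal(size=ARCH_DIM) for _ in range(5)]
best = max(candidates, key=lambda a: predict_performance(a, task_meta))
```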

Transfer NAS: Knowledge Transfer between Search Spaces with Transformer Agents

no code implementations • 19 Jun 2019 • Zalán Borsos, Andrey Khorlin, Andrea Gesmundo

Recent advances in Neural Architecture Search (NAS) have produced state-of-the-art architectures on several tasks.

Neural Architecture Search • Transfer Learning

Flexible Multi-task Networks by Learning Parameter Allocation

no code implementations • 10 Oct 2019 • Krzysztof Maziarz, Efi Kokiopoulou, Andrea Gesmundo, Luciano Sbaiz, Gabor Bartok, Jesse Berent

The binary allocation variables are learned jointly with the model parameters by standard back-propagation thanks to the Gumbel-Softmax reparametrization method.

Multi-Task Learning
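
A minimal sketch of the binary Gumbel-Softmax relaxation mentioned above, assuming each allocation variable is parameterized by an (on, off) logit pair; this shows only the differentiable sampling step, not the paper's full routing model or its back-propagation.

```python
import numpy as np

rng = np.random.default_rng(0)

def gumbel_softmax_binary(logits: np.ndarray, temperature: float = 1.0) -> np.ndarray:
    """Differentiable relaxation of per-variable 0/1 allocation samples.

    Each logit parameterizes an (on, off) pair; softmax over the
    Gumbel-perturbed pair approaches a hard 0/1 choice as temperature -> 0.
    """
    pair = np.stack([logits, np.zeros_like(logits)], axis=-1)
    u = rng.uniform(1e-6, 1.0, size=pair.shape)
    gumbel = -np.log(-np.log(u))
    y = (pair + gumbel) / temperature
    y = np.exp(y - y.max(axis=-1, keepdims=True))
    return (y / y.sum(axis=-1, keepdims=True))[..., 0]

# Soft allocations over 4 components; lower temperatures sharpen toward 0/1.
logits = np.array([2.0, -1.0, 0.5, 0.0])
print(gumbel_softmax_binary(logits, temperature=0.5))
```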

Ranking architectures using meta-learning

no code implementations • 26 Nov 2019 • Alina Dubatovka, Efi Kokiopoulou, Luciano Sbaiz, Andrea Gesmundo, Gabor Bartok, Jesse Berent

However, it requires a large amount of computing resources; to alleviate this, a performance prediction network has recently been proposed that enables efficient architecture search by forecasting the performance of candidate architectures instead of relying on actual model training.

Meta-Learning • Neural Architecture Search

Evo-NAS: Evolutionary-Neural Hybrid Agent for Architecture Search

no code implementations • 25 Sep 2019 • Krzysztof Maziarz, Mingxing Tan, Andrey Khorlin, Kuang-Yu Samuel Chang, Andrea Gesmundo

We show that the Evo-NAS agent outperforms both neural and evolutionary agents when applied to architecture search for a suite of text and image classification benchmarks.

Evolutionary Algorithms • Image Classification • +2

Gumbel-Matrix Routing for Flexible Multi-task Learning

no code implementations • 25 Sep 2019 • Krzysztof Maziarz, Efi Kokiopoulou, Andrea Gesmundo, Luciano Sbaiz, Gabor Bartok, Jesse Berent

We propose Gumbel-Matrix routing, a novel multi-task routing method based on the Gumbel-Softmax that is designed to learn fine-grained parameter sharing.

Multi-Task Learning

muNet: Evolving Pretrained Deep Neural Networks into Scalable Auto-tuning Multitask Systems

no code implementations • 22 May 2022 • Andrea Gesmundo, Jeff Dean

We propose a method that uses the layers of a pretrained deep neural network as building blocks to construct an ML system that can jointly solve an arbitrary number of tasks.

Image Classification • Transfer Learning
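
A toy sketch of the building-block idea from the abstract above, under the simplifying assumption that a per-task model is just an ordered path over a shared pool of layers plus a task-specific head; this is not muNet's actual evolutionary algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

# Shared pool of layers (e.g., seeded from a pretrained network).
DIM = 16
pool = {name: rng.normal(0, 0.1, (DIM, DIM)) for name in ["block0", "block1", "block2"]}

# Each task's model is a path: an ordered selection of shared blocks
# plus a task-specific head.
paths = {
    "task_a": ["block0", "block1"],
    "task_b": ["block0", "block2"],  # reuses block0 introduced for task_a
}
heads = {task: rng.normal(0, 0.1, (3, DIM)) for task in paths}

def forward(task: str, x: np.ndarray) -> np.ndarray:
    """Run input x through the task's path over the shared layer pool."""
    for name in paths[task]:
        x = np.maximum(pool[name] @ x, 0.0)
    return heads[task] @ x

print(forward("task_a", rng.normal(size=DIM)))
```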

Multipath agents for modular multitask ML systems

no code implementations • 6 Feb 2023 • Andrea Gesmundo

Diverse agents can compete to produce the best performing model for a task by reusing the modules introduced to the system by competing agents.

Image Classification

Composable Function-preserving Expansions for Transformer Architectures

no code implementations • 11 Aug 2023 • Andrea Gesmundo, Kaitlin Maile

Training state-of-the-art neural networks requires a high cost in terms of compute and time.
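
For intuition, a classical example of a function-preserving expansion (not necessarily the paper's exact construction): a hidden layer can be widened with zero-initialized outgoing weights, so the expanded network computes exactly the same function before further training.

```python
import numpy as np

rng = np.random.default_rng(0)

def widen_hidden_layer(W1, b1, W2, extra):
    """Widen a hidden layer by `extra` units without changing the network's output.

    New units get arbitrary incoming weights but zero outgoing weights, so
    they contribute nothing to the output until training updates them.
    """
    W1_new = np.vstack([W1, rng.normal(0, 0.01, (extra, W1.shape[1]))])
    b1_new = np.concatenate([b1, np.zeros(extra)])
    W2_new = np.hstack([W2, np.zeros((W2.shape[0], extra))])
    return W1_new, b1_new, W2_new

# Check: the widened network computes exactly the same function.
W1, b1, W2 = rng.normal(size=(8, 4)), rng.normal(size=8), rng.normal(size=(3, 8))
x = rng.normal(size=4)
y_old = W2 @ np.maximum(W1 @ x + b1, 0.0)
W1n, b1n, W2n = widen_hidden_layer(W1, b1, W2, extra=4)
y_new = W2n @ np.maximum(W1n @ x + b1n, 0.0)
assert np.allclose(y_old, y_new)
```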
