Search Results for author: Alessandro Achille

Found 41 papers, 11 papers with code

Incremental Few-Shot Meta-Learning via Indirect Discriminant Alignment

no code implementations ECCV 2020 Qing Liu, Orchid Majumder, Alessandro Achille, Avinash Ravichandran, Rahul Bhotika, Stefano Soatto

This process enables incrementally improving the model by processing multiple learning episodes, each representing a different learning task, even with few training examples.

Few-Shot Learning Incremental Learning

Integral Continual Learning Along the Tangent Vector Field of Tasks

no code implementations 23 Nov 2022 Tian Yu Liu, Aditya Golatkar, Stefano Soatto, Alessandro Achille

We propose a continual learning method which incorporates information from specialized datasets incrementally, by integrating it along the vector field of "generalist" models.

Continual Learning

Critical Learning Periods for Multisensory Integration in Deep Networks

no code implementations 6 Oct 2022 Michael Kleinman, Alessandro Achille, Stefano Soatto

We show that the ability of a neural network to integrate information from diverse sources hinges critically on being exposed to properly correlated signals during the early phases of training.

On the Learnability of Physical Concepts: Can a Neural Network Understand What's Real?

no code implementations 25 Jul 2022 Alessandro Achille, Stefano Soatto

We revisit the classic signal-to-symbol barrier in light of the remarkable ability of deep neural networks to generate realistic synthetic data.

On Leave-One-Out Conditional Mutual Information For Generalization

no code implementations 1 Jul 2022 Mohamad Rida Rammal, Alessandro Achille, Aditya Golatkar, Suhas Diggavi, Stefano Soatto

We derive information theoretic generalization bounds for supervised learning algorithms based on a new measure of leave-one-out conditional mutual information (loo-CMI).

Generalization Bounds Image Classification

Gacs-Korner Common Information Variational Autoencoder

no code implementations 24 May 2022 Michael Kleinman, Alessandro Achille, Stefano Soatto, Jonathan Kao

We propose a notion of common information that allows one to quantify and separate the information that is shared between two random variables from the information that is unique to each.

Towards Differential Relational Privacy and its use in Question Answering

no code implementations 30 Mar 2022 Simone Bombari, Alessandro Achille, Zijian Wang, Yu-Xiang Wang, Yusheng Xie, Kunwar Yashraj Singh, Srikar Appalaraju, Vijay Mahadevan, Stefano Soatto

While bounding general memorization can have detrimental effects on the performance of a trained model, bounding relational memorization (RM) does not prevent effective learning.

Memorization Question Answering

Task Adaptive Parameter Sharing for Multi-Task Learning

1 code implementation CVPR 2022 Matthew Wallingford, Hao Li, Alessandro Achille, Avinash Ravichandran, Charless Fowlkes, Rahul Bhotika, Stefano Soatto

TAPS solves a joint optimization problem which determines which layers to share with the base model and the value of the task-specific weights.

Multi-Task Learning

Mixed Differential Privacy in Computer Vision

no code implementations CVPR 2022 Aditya Golatkar, Alessandro Achille, Yu-Xiang Wang, Aaron Roth, Michael Kearns, Stefano Soatto

AdaMix incorporates few-shot training, or cross-modal zero-shot learning, on public data prior to private fine-tuning, to improve the trade-off.

Zero-Shot Learning

DIVA: Dataset Derivative of a Learning Task

no code implementations ICLR 2022 Yonatan Dukler, Alessandro Achille, Giovanni Paolini, Avinash Ravichandran, Marzia Polito, Stefano Soatto

A learning task is a function from a training set to the validation error, which can be represented by a trained deep neural network (DNN).

AutoML

STRIC: Stacked Residuals of Interpretable Components for Time Series Anomaly Detection

no code implementations 29 Sep 2021 Luca Zancato, Alessandro Achille, Giovanni Paolini, Alessandro Chiuso, Stefano Soatto

After modeling the signals, we use an anomaly detection system based on the classic CUMSUM algorithm and a variational approximation of the $f$-divergence to detect both isolated point anomalies and change-points in statistics of the signals.

Anomaly Detection Time Series Anomaly Detection

Redundant Information Neural Estimation

no code implementations ICLR Workshop Neural_Compression 2021 Michael Kleinman, Alessandro Achille, Stefano Soatto, Jonathan Kao

We introduce the Redundant Information Neural Estimator (RINE), a method that allows efficient estimation for the component of information about a target variable that is common to a set of sources, previously referred to as the “redundant information.” We show that existing definitions of the redundant information can be recast in terms of an optimization over a family of deterministic or stochastic functions.

Image Classification

A linearized framework and a new benchmark for model selection for fine-tuning

no code implementations 29 Jan 2021 Aditya Deshpande, Alessandro Achille, Avinash Ravichandran, Hao Li, Luca Zancato, Charless Fowlkes, Rahul Bhotika, Stefano Soatto, Pietro Perona

Since all model selection algorithms in the literature have been tested on different use-cases and never compared directly, we introduce a new comprehensive benchmark for model selection comprising: i) a model zoo of single and multi-domain models, and ii) many target tasks.

Model Selection

Supervised Momentum Contrastive Learning for Few-Shot Classification

no code implementations 26 Jan 2021 Orchid Majumder, Avinash Ravichandran, Subhransu Maji, Alessandro Achille, Marzia Polito, Stefano Soatto

In this work we investigate the complementary roles of these two sources of information by combining instance-discriminative contrastive learning and supervised learning in a single framework called Supervised Momentum Contrastive learning (SUPMOCO).

Classification Contrastive Learning

Estimating informativeness of samples with Smooth Unique Information

1 code implementation ICLR 2021 Hrayr Harutyunyan, Alessandro Achille, Giovanni Paolini, Orchid Majumder, Avinash Ravichandran, Rahul Bhotika, Stefano Soatto

We define a notion of information that an individual sample provides to the training of a neural network, and we specialize it to measure both how much a sample informs the final weights and how much it informs the function computed by the weights.

Informativeness

Structured Prediction as Translation between Augmented Natural Languages

1 code implementation ICLR 2021 Giovanni Paolini, Ben Athiwaratkun, Jason Krone, Jie Ma, Alessandro Achille, Rishita Anubhai, Cicero Nogueira dos Santos, Bing Xiang, Stefano Soatto

We propose a new framework, Translation between Augmented Natural Languages (TANL), to solve many structured prediction language tasks including joint entity and relation extraction, nested named entity recognition, relation classification, semantic role labeling, event extraction, coreference resolution, and dialogue state tracking.

Coreference Resolution

Multimodal Attention for Layout Synthesis in Diverse Domains

no code implementations 1 Jan 2021 Kamal Gupta, Vijay Mahadevan, Alessandro Achille, Justin Lazarow, Larry S. Davis, Abhinav Shrivastava

We address the problem of scene layout generation for diverse domains such as images, mobile applications, documents and 3D objects.

Mixed-Privacy Forgetting in Deep Networks

no code implementations CVPR 2021 Aditya Golatkar, Alessandro Achille, Avinash Ravichandran, Marzia Polito, Stefano Soatto

We show that the influence of a subset of the training samples can be removed -- or "forgotten" -- from the weights of a network trained on large-scale image classification tasks, and we provide strong computable bounds on the amount of remaining information after forgetting.

Image Classification

LQF: Linear Quadratic Fine-Tuning

no code implementations CVPR 2021 Alessandro Achille, Aditya Golatkar, Avinash Ravichandran, Marzia Polito, Stefano Soatto

Classifiers that are linear in their parameters, and trained by optimizing a convex loss function, have predictable behavior with respect to changes in the training data, initial conditions, and optimization.

Image Classification

Usable Information and Evolution of Optimal Representations During Training

no code implementations ICLR 2021 Michael Kleinman, Alessandro Achille, Daksh Idnani, Jonathan C. Kao

We introduce a notion of usable information contained in the representation learned by a deep network, and use it to study how optimal representations for the task emerge during training.

Decision Making Image Classification

Predicting Training Time Without Training

no code implementations NeurIPS 2020 Luca Zancato, Alessandro Achille, Avinash Ravichandran, Rahul Bhotika, Stefano Soatto

We tackle the problem of predicting the number of optimization steps that a pre-trained deep network needs to converge to a given value of the loss function.

Adversarial Training Reduces Information and Improves Transferability

no code implementations 22 Jul 2020 Matteo Terzi, Alessandro Achille, Marco Maggipinto, Gian Antonio Susto

Recent results show that features of adversarially trained networks for classification, in addition to being robust, enable desirable properties such as invertibility.

LayoutTransformer: Layout Generation and Completion with Self-attention

2 code implementations ICCV 2021 Kamal Gupta, Justin Lazarow, Alessandro Achille, Larry Davis, Vijay Mahadevan, Abhinav Shrivastava

Generating a new layout or extending an existing layout requires understanding the relationships between these primitives.

Forgetting Outside the Box: Scrubbing Deep Networks of Information Accessible from Input-Output Observations

1 code implementation ECCV 2020 Aditya Golatkar, Alessandro Achille, Stefano Soatto

We describe a procedure for removing dependency on a cohort of training data from a trained deep network that improves upon and generalizes previous methods to different readout functions and can be extended to ensure forgetting in the activations of the network.

Incremental Meta-Learning via Indirect Discriminant Alignment

no code implementations 11 Feb 2020 Qing Liu, Orchid Majumder, Alessandro Achille, Avinash Ravichandran, Rahul Bhotika, Stefano Soatto

The majority of modern meta-learning methods for few-shot classification tasks operate in two phases: a meta-training phase, where the meta-learner learns a generic representation by solving multiple few-shot tasks sampled from a large dataset, and a testing phase, where the meta-learner leverages its learnt internal representation for a specific few-shot task involving classes which were not seen during the meta-training phase.

Incremental Learning Meta-Learning

TextTubes for Detecting Curved Text in the Wild

1 code implementation 19 Dec 2019 Joël Seytre, Jon Wu, Alessandro Achille

We present a detector for curved text in natural images.

Eternal Sunshine of the Spotless Net: Selective Forgetting in Deep Networks

2 code implementations CVPR 2020 Aditya Golatkar, Alessandro Achille, Stefano Soatto

We explore the problem of selectively forgetting a particular subset of the data used for training a deep neural network.

Where is the Information in a Deep Network?

no code implementations 25 Sep 2019 Alessandro Achille, Stefano Soatto

We relate this to the Information in the Weights, and use this result to show that models of low (information) complexity not only generalize better, but are bound to learn invariant representations of future inputs.

Toward Understanding Catastrophic Forgetting in Continual Learning

no code implementations 2 Aug 2019 Cuong V. Nguyen, Alessandro Achille, Michael Lam, Tal Hassner, Vijay Mahadevan, Stefano Soatto

As an application, we apply our procedure to study two properties of a task sequence: (1) total complexity and (2) sequential heterogeneity.

Continual Learning

Time Matters in Regularizing Deep Networks: Weight Decay and Data Augmentation Affect Early Learning Dynamics, Matter Little Near Convergence

no code implementations NeurIPS 2019 Aditya Golatkar, Alessandro Achille, Stefano Soatto

Deep neural networks (DNNs), however, challenge this view: We show that removing regularization after an initial transient period has little effect on generalization, even if the final loss landscape is the same as if there had been no regularization.

Data Augmentation

Where is the Information in a Deep Neural Network?

no code implementations 29 May 2019 Alessandro Achille, Giovanni Paolini, Stefano Soatto

We establish a novel relation between the information in the weights and the effective information in the activations, and use this result to show that models with low (information) complexity not only generalize better, but are bound to learn invariant representations of future inputs.

Inductive Bias

Critical Learning Periods in Deep Networks

no code implementations ICLR 2019 Alessandro Achille, Matteo Rovere, Stefano Soatto

Deficits that do not affect low-level statistics, such as vertical flipping of the images, have no lasting effect on performance and can be overcome with further training.

Disentanglement

The Information Complexity of Learning Tasks, their Structure and their Distance

no code implementations 5 Apr 2019 Alessandro Achille, Giovanni Paolini, Glen Mbeng, Stefano Soatto

Our framework is the first to measure complexity in a way that accounts for the effect of the optimization scheme, which is critical in Deep Learning.

Memorization Transfer Learning

Task2Vec: Task Embedding for Meta-Learning

1 code implementation ICCV 2019 Alessandro Achille, Michael Lam, Rahul Tewari, Avinash Ravichandran, Subhransu Maji, Charless Fowlkes, Stefano Soatto, Pietro Perona

We demonstrate that this embedding is capable of predicting task similarities that match our intuition about semantic and taxonomic relations between different visual tasks (e.g., tasks based on classifying different types of plants are similar). We also demonstrate the practical value of this framework for the meta-task of selecting a pre-trained feature extractor for a new task.

Meta-Learning

Dynamics and Reachability of Learning Tasks

no code implementations 4 Oct 2018 Alessandro Achille, Glen Mbeng, Stefano Soatto

We compute the transition probability between two learning tasks, and show that it decomposes into two factors.

Semantic Textual Similarity Transfer Learning

Life-Long Disentangled Representation Learning with Cross-Domain Latent Homologies

1 code implementation NeurIPS 2018 Alessandro Achille, Tom Eccles, Loic Matthey, Christopher P. Burgess, Nick Watters, Alexander Lerchner, Irina Higgins

Intelligent behaviour in the real-world requires the ability to acquire new knowledge from an ongoing sequence of experiences while preserving and reusing past knowledge.

Representation Learning

Critical Learning Periods in Deep Neural Networks

1 code implementation 24 Nov 2017 Alessandro Achille, Matteo Rovere, Stefano Soatto

Deficits that do not affect low-level statistics, such as vertical flipping of the images, have no lasting effect on performance and can be overcome with further training.

Disentanglement

A Separation Principle for Control in the Age of Deep Learning

no code implementations 9 Nov 2017 Alessandro Achille, Stefano Soatto

Again, this can be finitely parametrized using a deep neural network, and some applications are already beginning to emerge.

Emergence of Invariance and Disentanglement in Deep Representations

no code implementations 5 Jun 2017 Alessandro Achille, Stefano Soatto

Using established principles from Statistics and Information Theory, we show that invariance to nuisance factors in a deep neural network is equivalent to information minimality of the learned representation, and that stacking layers and injecting noise during training naturally bias the network towards learning invariant representations.

Disentanglement

Information Dropout: Learning Optimal Representations Through Noisy Computation

1 code implementation 4 Nov 2016 Alessandro Achille, Stefano Soatto

The cross-entropy loss commonly used in deep learning is closely related to the defining properties of optimal representations, but does not enforce some of the key properties.

Representation Learning Variational Inference
