no code implementations • ECCV 2020 • Qing Liu, Orchid Majumder, Alessandro Achille, Avinash Ravichandran, Rahul Bhotika, Stefano Soatto

This process enables incrementally improving the model by processing multiple learning episodes, each representing a different learning task, even with few training examples.

1 code implementation • 23 Oct 2023 • Tian Yu Liu, Matthew Trager, Alessandro Achille, Pramuditha Perera, Luca Zancato, Stefano Soatto

We propose to extract meaning representations from autoregressive language models by considering the distribution of all possible trajectories extending an input text.

no code implementations • 23 Aug 2023 • Michael Kleinman, Alessandro Achille, Stefano Soatto

Critical learning periods are periods early in development where temporary sensory deficits can have a permanent effect on behavior and learned representations.

no code implementations • 2 Aug 2023 • Aditya Golatkar, Alessandro Achille, Ashwin Swaminathan, Stefano Soatto

We introduce Compartmentalized Diffusion Models (CDM), a method to train different diffusion models (or prompts) on distinct data sources and arbitrarily compose them at inference time.

no code implementations • 6 Jun 2023 • Chethan Parameshwara, Alessandro Achille, Xiaolong Li, Jiawei Mo, Matthew Trager, Ashwin Swaminathan, Cj Taylor, Dheera Venkatraman, Xiaohan Fei, Stefano Soatto

We describe a first step towards learning general-purpose visual representations of physical scenes using only image prediction as a training criterion.

no code implementations • 1 Jun 2023 • Pramuditha Perera, Matthew Trager, Luca Zancato, Alessandro Achille, Stefano Soatto

We investigate whether prompts learned independently for different tasks can be later combined through prompt algebra to obtain a model that supports composition of tasks.

no code implementations • ICCV 2023 • Yonatan Dukler, Benjamin Bowman, Alessandro Achille, Aditya Golatkar, Ashwin Swaminathan, Stefano Soatto

We present Synergy Aware Forgetting Ensemble (SAFE), a method to adapt large models on a diverse collection of data while minimizing the expected cost to remove the influence of training samples from the trained model.

no code implementations • 7 Apr 2023 • Alessandro Achille, Michael Kearns, Carson Klingenberg, Stefano Soatto

One potential fix for training corpus data defects is model disgorgement -- the elimination of not just the improperly used data, but also the effects of improperly used data on any component of an ML model.

no code implementations • CVPR 2023 • Luca Zancato, Alessandro Achille, Tian Yu Liu, Matthew Trager, Pramuditha Perera, Stefano Soatto

Second, we apply ${\rm T^3AR}$ for test-time adaptation and show that exploiting a pool of external images at test-time leads to more robust representations over existing methods on DomainNet-126 and VISDA-C, especially when few adaptation data are available (up to 8%).

1 code implementation • 3 Mar 2023 • Kaiwen Gui, Alexander M. Dalzell, Alessandro Achille, Martin Suchara, Frederic T. Chong

When our protocol is compiled into CNOT and arbitrary single-qubit gates, it prepares an $N$-dimensional state in depth $O(\log(N))$ and spacetime allocation (a metric that accounts for the fact that oftentimes some ancilla qubits need not be active for the entire circuit) $O(N)$, which are both optimal.

no code implementations • CVPR 2023 • Achin Jain, Gurumurthy Swaminathan, Paolo Favaro, Hao Yang, Avinash Ravichandran, Hrayr Harutyunyan, Alessandro Achille, Onkar Dabeer, Bernt Schiele, Ashwin Swaminathan, Stefano Soatto

The PPL improves the performance estimation on average by 37% across 16 classification and 33% across 10 detection datasets, compared to the power law.
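For context, the power-law baseline mentioned here can be sketched as a least-squares fit in log-log space. This is only an illustration of the baseline, not the paper's PPL method. The functional form `error ≈ a * n**(-b)`, the function name `fit_power_law`, and the synthetic data are all assumptions made for this example.

```python
import math

def fit_power_law(sizes, errors):
    """Fit error ~ a * n**(-b) by ordinary least squares on (log n, log error).

    Hypothetical helper for illustration; not taken from the paper.
    """
    xs = [math.log(n) for n in sizes]
    ys = [math.log(e) for e in errors]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Slope and intercept of the best-fit line in log-log space.
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope

# Synthetic learning curve that follows an exact power law (assumed data).
sizes = [100, 200, 400, 800]
errors = [2.0 * n ** -0.5 for n in sizes]

a, b = fit_power_law(sizes, errors)
# Extrapolate the fitted curve to a larger training-set size.
predicted_error = a * 1600 ** -b
```

Real learning curves rarely follow a single power law over the whole range of dataset sizes, and that gap is what a piecewise model would aim to close.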

no code implementations • ICCV 2023 • Matthew Trager, Pramuditha Perera, Luca Zancato, Alessandro Achille, Parminder Bhatia, Stefano Soatto

These vectors can be seen as "ideal words" for generating concepts directly within the embedding space of the model.

no code implementations • 15 Feb 2023 • Benjamin Bowman, Alessandro Achille, Luca Zancato, Matthew Trager, Pramuditha Perera, Giovanni Paolini, Stefano Soatto

During inference, models can be assembled based on arbitrary selections of data sources, which we call "à-la-carte learning".

no code implementations • CVPR 2023 • Benjamin Bowman, Alessandro Achille, Luca Zancato, Matthew Trager, Pramuditha Perera, Giovanni Paolini, Stefano Soatto

During inference, models can be assembled based on arbitrary selections of data sources, which we call a-la-carte learning.

no code implementations • 23 Nov 2022 • Tian Yu Liu, Aditya Golatkar, Stefano Soatto, Alessandro Achille

We propose a lightweight continual learning method which incorporates information from specialized datasets incrementally, by integrating it along the vector field of "generalist" models.

no code implementations • CVPR 2023 • Michael Kleinman, Alessandro Achille, Stefano Soatto

We show that the ability of a neural network to integrate information from diverse sources hinges critically on being exposed to properly correlated signals during the early phases of training.

no code implementations • 25 Jul 2022 • Alessandro Achille, Stefano Soatto

We revisit the classic signal-to-symbol barrier in light of the remarkable ability of deep neural networks to generate realistic synthetic data.

no code implementations • 1 Jul 2022 • Mohamad Rida Rammal, Alessandro Achille, Aditya Golatkar, Suhas Diggavi, Stefano Soatto

We derive information theoretic generalization bounds for supervised learning algorithms based on a new measure of leave-one-out conditional mutual information (loo-CMI).

1 code implementation • CVPR 2022 • Matthew Wallingford, Hao Li, Alessandro Achille, Avinash Ravichandran, Charless Fowlkes, Rahul Bhotika, Stefano Soatto

TAPS solves a joint optimization problem which determines which layers to share with the base model and the value of the task-specific weights.

no code implementations • 30 Mar 2022 • Simone Bombari, Alessandro Achille, Zijian Wang, Yu-Xiang Wang, Yusheng Xie, Kunwar Yashraj Singh, Srikar Appalaraju, Vijay Mahadevan, Stefano Soatto

While bounding general memorization can have detrimental effects on the performance of a trained model, bounding RM does not prevent effective learning.

no code implementations • CVPR 2022 • Aditya Golatkar, Alessandro Achille, Yu-Xiang Wang, Aaron Roth, Michael Kearns, Stefano Soatto

AdaMix incorporates few-shot training, or cross-modal zero-shot learning, on public data prior to private fine-tuning, to improve the trade-off.

no code implementations • NeurIPS 2021 • Julian Zilly, Alessandro Achille, Andrea Censi, Emilio Frazzoli

In particular, we show that, when using weight decay, weights in successive layers of a deep network may become "mutually frozen".

no code implementations • ICLR 2022 • Yonatan Dukler, Alessandro Achille, Giovanni Paolini, Avinash Ravichandran, Marzia Polito, Stefano Soatto

A learning task is a function from a training set to the validation error, which can be represented by a trained deep neural network (DNN).

no code implementations • 29 Sep 2021 • Luca Zancato, Alessandro Achille, Giovanni Paolini, Alessandro Chiuso, Stefano Soatto

After modeling the signals, we use an anomaly detection system based on the classic CUSUM algorithm and a variational approximation of the $f$-divergence to detect both isolated point anomalies and change-points in statistics of the signals.
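As a rough illustration of the change-point component, the classic one-sided cumulative-sum (CUSUM) test can be sketched as follows. It covers only the textbook scalar version and omits the variational $f$-divergence approximation the paper pairs it with. The function name `cusum_alarms` and the parameters `target_mean`, `slack`, and `threshold` are assumptions chosen for this example.

```python
def cusum_alarms(samples, target_mean, slack, threshold):
    """Return indices where the upward CUSUM statistic exceeds the threshold.

    Hypothetical helper for illustration; detects upward shifts in the mean only.
    """
    alarms = []
    stat = 0.0
    for i, x in enumerate(samples):
        # Accumulate deviations above target_mean + slack, clipped at zero.
        stat = max(0.0, stat + (x - target_mean - slack))
        if stat > threshold:
            alarms.append(i)
            stat = 0.0  # restart the statistic after raising an alarm
    return alarms

# Synthetic signal: the mean shifts from 0.0 to 2.0 at index 10 (assumed data).
signal = [0.0] * 10 + [2.0] * 5
alarms = cusum_alarms(signal, target_mean=0.0, slack=0.5, threshold=3.0)
```

The `slack` term keeps small fluctuations from accumulating, and `threshold` trades detection delay against false alarms.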

no code implementations • ICLR Workshop Neural_Compression 2021 • Michael Kleinman, Alessandro Achille, Stefano Soatto, Jonathan Kao

We introduce the Redundant Information Neural Estimator (RINE), a method that allows efficient estimation for the component of information about a target variable that is common to a set of sources, previously referred to as the “redundant information.” We show that existing definitions of the redundant information can be recast in terms of an optimization over a family of deterministic or stochastic functions.

no code implementations • 29 Jan 2021 • Aditya Deshpande, Alessandro Achille, Avinash Ravichandran, Hao Li, Luca Zancato, Charless Fowlkes, Rahul Bhotika, Stefano Soatto, Pietro Perona

Since all model selection algorithms in the literature have been tested on different use-cases and never compared directly, we introduce a new comprehensive benchmark for model selection comprising: i) a model zoo of single and multi-domain models, and ii) many target tasks.

no code implementations • 26 Jan 2021 • Orchid Majumder, Avinash Ravichandran, Subhransu Maji, Alessandro Achille, Marzia Polito, Stefano Soatto

In this work we investigate the complementary roles of these two sources of information by combining instance-discriminative contrastive learning and supervised learning in a single framework called Supervised Momentum Contrastive learning (SUPMOCO).

1 code implementation • ICLR 2021 • Hrayr Harutyunyan, Alessandro Achille, Giovanni Paolini, Orchid Majumder, Avinash Ravichandran, Rahul Bhotika, Stefano Soatto

We define a notion of information that an individual sample provides to the training of a neural network, and we specialize it to measure both how much a sample informs the final weights and how much it informs the function computed by the weights.

1 code implementation • ICLR 2021 • Giovanni Paolini, Ben Athiwaratkun, Jason Krone, Jie Ma, Alessandro Achille, Rishita Anubhai, Cicero Nogueira dos Santos, Bing Xiang, Stefano Soatto

We propose a new framework, Translation between Augmented Natural Languages (TANL), to solve many structured prediction language tasks including joint entity and relation extraction, nested named entity recognition, relation classification, semantic role labeling, event extraction, coreference resolution, and dialogue state tracking.

Ranked #3 on Relation Classification on TACRED

no code implementations • 1 Jan 2021 • Kamal Gupta, Vijay Mahadevan, Alessandro Achille, Justin Lazarow, Larry S. Davis, Abhinav Shrivastava

We address the problem of scene layout generation for diverse domains such as images, mobile applications, documents and 3D objects.

no code implementations • CVPR 2021 • Aditya Golatkar, Alessandro Achille, Avinash Ravichandran, Marzia Polito, Stefano Soatto

We show that the influence of a subset of the training samples can be removed -- or "forgotten" -- from the weights of a network trained on large-scale image classification tasks, and we provide strong computable bounds on the amount of remaining information after forgetting.

no code implementations • CVPR 2021 • Alessandro Achille, Aditya Golatkar, Avinash Ravichandran, Marzia Polito, Stefano Soatto

Classifiers that are linear in their parameters, and trained by optimizing a convex loss function, have predictable behavior with respect to changes in the training data, initial conditions, and optimization.

no code implementations • ICLR 2021 • Michael Kleinman, Alessandro Achille, Daksh Idnani, Jonathan C. Kao

We introduce a notion of usable information contained in the representation learned by a deep network, and use it to study how optimal representations for the task emerge during training.

no code implementations • NeurIPS 2020 • Luca Zancato, Alessandro Achille, Avinash Ravichandran, Rahul Bhotika, Stefano Soatto

We tackle the problem of predicting the number of optimization steps that a pre-trained deep network needs to converge to a given value of the loss function.

no code implementations • 22 Jul 2020 • Matteo Terzi, Alessandro Achille, Marco Maggipinto, Gian Antonio Susto

Recent results show that features of adversarially trained networks for classification, in addition to being robust, enable desirable properties such as invertibility.

2 code implementations • ICCV 2021 • Kamal Gupta, Justin Lazarow, Alessandro Achille, Larry Davis, Vijay Mahadevan, Abhinav Shrivastava

Generating a new layout or extending an existing layout requires understanding the relationships between these primitives.

1 code implementation • ECCV 2020 • Aditya Golatkar, Alessandro Achille, Stefano Soatto

We describe a procedure for removing dependency on a cohort of training data from a trained deep network that improves upon and generalizes previous methods to different readout functions and can be extended to ensure forgetting in the activations of the network.

no code implementations • 11 Feb 2020 • Qing Liu, Orchid Majumder, Alessandro Achille, Avinash Ravichandran, Rahul Bhotika, Stefano Soatto

Most modern meta-learning methods for few-shot classification operate in two phases: a meta-training phase, where the meta-learner learns a generic representation by solving multiple few-shot tasks sampled from a large dataset, and a testing phase, where the meta-learner leverages its learned internal representation for a specific few-shot task involving classes not seen during meta-training.

1 code implementation • 19 Dec 2019 • Joël Seytre, Jon Wu, Alessandro Achille

We present a detector for curved text in natural images.

2 code implementations • CVPR 2020 • Aditya Golatkar, Alessandro Achille, Stefano Soatto

We explore the problem of selectively forgetting a particular subset of the data used for training a deep neural network.

no code implementations • 25 Sep 2019 • Alessandro Achille, Stefano Soatto

We relate this to the Information in the Weights, and use this result to show that models of low (information) complexity not only generalize better, but are bound to learn invariant representations of future inputs.

no code implementations • 2 Aug 2019 • Cuong V. Nguyen, Alessandro Achille, Michael Lam, Tal Hassner, Vijay Mahadevan, Stefano Soatto

As an application, we apply our procedure to study two properties of a task sequence: (1) total complexity and (2) sequential heterogeneity.

no code implementations • NeurIPS 2019 • Aditya Golatkar, Alessandro Achille, Stefano Soatto

Deep neural networks (DNNs), however, challenge this view: We show that removing regularization after an initial transient period has little effect on generalization, even if the final loss landscape is the same as if there had been no regularization.

no code implementations • 29 May 2019 • Alessandro Achille, Giovanni Paolini, Stefano Soatto

We establish a novel relation between the information in the weights and the effective information in the activations, and use this result to show that models with low (information) complexity not only generalize better, but are bound to learn invariant representations of future inputs.

no code implementations • ICLR 2019 • Alessandro Achille, Matteo Rovere, Stefano Soatto

Deficits that do not affect low-level statistics, such as vertical flipping of the images, have no lasting effect on performance and can be overcome with further training.

no code implementations • 5 Apr 2019 • Alessandro Achille, Giovanni Paolini, Glen Mbeng, Stefano Soatto

Our framework is the first to measure complexity in a way that accounts for the effect of the optimization scheme, which is critical in Deep Learning.

1 code implementation • ICCV 2019 • Alessandro Achille, Michael Lam, Rahul Tewari, Avinash Ravichandran, Subhransu Maji, Charless Fowlkes, Stefano Soatto, Pietro Perona

We demonstrate that this embedding is capable of predicting task similarities that match our intuition about semantic and taxonomic relations between different visual tasks (e.g., tasks based on classifying different types of plants are similar). We also demonstrate the practical value of this framework for the meta-task of selecting a pre-trained feature extractor for a new task.

no code implementations • 4 Oct 2018 • Alessandro Achille, Glen Mbeng, Stefano Soatto

We compute the transition probability between two learning tasks, and show that it decomposes into two factors.

1 code implementation • NeurIPS 2018 • Alessandro Achille, Tom Eccles, Loic Matthey, Christopher P. Burgess, Nick Watters, Alexander Lerchner, Irina Higgins

Intelligent behaviour in the real-world requires the ability to acquire new knowledge from an ongoing sequence of experiences while preserving and reusing past knowledge.

1 code implementation • 24 Nov 2017 • Alessandro Achille, Matteo Rovere, Stefano Soatto

Deficits that do not affect low-level statistics, such as vertical flipping of the images, have no lasting effect on performance and can be overcome with further training.

no code implementations • 9 Nov 2017 • Alessandro Achille, Stefano Soatto

Again this can be finitely-parametrized using a deep neural network, and already some applications are beginning to emerge.

no code implementations • 5 Jun 2017 • Alessandro Achille, Stefano Soatto

Using established principles from Statistics and Information Theory, we show that invariance to nuisance factors in a deep neural network is equivalent to information minimality of the learned representation, and that stacking layers and injecting noise during training naturally bias the network towards learning invariant representations.

1 code implementation • 4 Nov 2016 • Alessandro Achille, Stefano Soatto

The cross-entropy loss commonly used in deep learning is closely related to the defining properties of optimal representations, but does not enforce some of the key properties.

Papers With Code is a free resource with all data licensed under CC-BY-SA.