
no code implementations • 21 Jul 2021 • Andrzej Banburski, Fernanda De La Torre, Nishka Pant, Ishana Shastri, Tomaso Poggio

Recent theoretical results show that gradient descent on deep neural networks under exponential loss functions locally maximizes classification margin, which is equivalent to minimizing the norm of the weight matrices under margin constraints.

no code implementations • 21 Feb 2021 • Owen Kunhardt, Arturo Deza, Tomaso Poggio

In this paper, we propose an adaptation to the area under the curve (AUC) metric to measure the adversarial robustness of a model over a particular $\epsilon$-interval $[\epsilon_0, \epsilon_1]$ (interval of adversarial perturbation strengths) that facilitates unbiased comparisons across models when they have different initial $\epsilon_0$ performance.
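The interval-normalized AUC described here can be approximated from a handful of accuracy measurements. A minimal sketch using the trapezoidal rule; the accuracy values and function name are illustrative, not from the paper:

```python
import numpy as np

def robustness_auc(eps, acc, eps0, eps1):
    """Approximate area under the accuracy-vs-perturbation curve on
    [eps0, eps1], normalized by the interval width (trapezoidal rule).
    `eps` and `acc` are assumed sorted by increasing eps."""
    eps = np.asarray(eps, dtype=float)
    acc = np.asarray(acc, dtype=float)
    mask = (eps >= eps0) & (eps <= eps1)
    e, a = eps[mask], acc[mask]
    area = float(np.sum((a[1:] + a[:-1]) / 2 * np.diff(e)))
    return area / (eps1 - eps0)

# Hypothetical accuracies of one model under attacks of growing strength:
eps = [0.0, 0.1, 0.2, 0.3, 0.4]
acc = [0.95, 0.80, 0.60, 0.35, 0.20]
print(robustness_auc(eps, acc, 0.0, 0.4))
```

Normalizing by the interval width is what makes scores comparable across models evaluated on different $\epsilon$-ranges.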

no code implementations • 31 Dec 2020 • Tomaso Poggio, Qianli Liao

Deep ReLU networks trained with the square loss have been observed to perform well in classification tasks.

1 code implementation • 15 Dec 2020 • Elian Malkin, Arturo Deza, Tomaso Poggio

The spatially-varying field of the human visual system has recently received a resurgence of interest with the development of virtual reality (VR) and neural networks.

1 code implementation • NeurIPS 2020 • Manish V. Reddy, Andrzej Banburski, Nishka Pant, Tomaso Poggio

A convolutional neural network strongly robust to adversarial perturbations at reasonable computational and performance cost has not yet been demonstrated.

no code implementations • 28 Jun 2020 • Akshay Rangamani, Lorenzo Rosasco, Tomaso Poggio

We study the average $\mbox{CV}_{loo}$ stability of kernel ridge-less regression and derive corresponding risk bounds.
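A minimal sketch of computing leave-one-out errors for ridgeless (minimum-norm) kernel regression: refit on $n-1$ points via the pseudoinverse and predict the held-out point. The RBF kernel, synthetic data, and all names are illustrative assumptions, not the paper's setup:

```python
import numpy as np

def rbf_kernel(X, Z, gamma=1.0):
    """Gaussian (RBF) kernel matrix between rows of X and rows of Z."""
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def loo_errors(X, y, gamma=1.0):
    """Leave-one-out errors of minimum-norm kernel interpolation:
    for each i, fit on the other n-1 points and predict point i."""
    n = len(y)
    errs = []
    for i in range(n):
        keep = np.arange(n) != i
        K = rbf_kernel(X[keep], X[keep], gamma)
        alpha = np.linalg.pinv(K) @ y[keep]   # ridgeless: pseudoinverse, no penalty
        pred = rbf_kernel(X[i:i + 1], X[keep], gamma) @ alpha
        errs.append(float(y[i] - pred[0]))
    return np.array(errs)

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 2))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=20)
print(np.mean(loo_errors(X, y) ** 2))  # average CV_loo squared error
```

The average of these squared errors is the empirical quantity whose stability the risk bounds concern.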

no code implementations • 24 Jun 2020 • Arturo Deza, Qianli Liao, Andrzej Banburski, Tomaso Poggio

For object recognition we find, as expected, that scrambling does not affect the performance of shallow or deep fully connected networks, in contrast to convolutional networks, whose performance it degrades.

no code implementations • 12 Dec 2019 • Tomaso Poggio, Gil Kur, Andrzej Banburski

In solving a system of $n$ linear equations in $d$ variables $Ax=b$, the condition number of the $n, d$ matrix $A$ measures how much errors in the data $b$ affect the solution $x$.
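The amplification that the condition number measures can be checked numerically. A small sketch comparing a well-conditioned and a nearly singular system; the matrices and perturbation are illustrative:

```python
import numpy as np

def error_amplification(A, b, db):
    """Ratio of the relative error in the solution x to the relative
    error in the data b when b is perturbed by db. In the 2-norm this
    ratio is bounded above by cond(A)."""
    x = np.linalg.solve(A, b)
    x_pert = np.linalg.solve(A, b + db)
    rel_x = np.linalg.norm(x_pert - x) / np.linalg.norm(x)
    rel_b = np.linalg.norm(db) / np.linalg.norm(b)
    return rel_x / rel_b

A_good = np.eye(2)                               # cond = 1
A_bad = np.array([[1.0, 1.0], [1.0, 1.0001]])    # nearly singular, cond ~ 4e4
b = np.array([1.0, 1.0])
db = np.array([1e-6, -1e-6])                     # tiny perturbation of the data

for A in (A_good, A_bad):
    print(np.linalg.cond(A), error_amplification(A, b, db))
```

For the ill-conditioned matrix, a micro-scale perturbation of $b$ produces a percent-scale change in $x$, while the identity passes the perturbation through unamplified.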

no code implementations • 25 Aug 2019 • Tomaso Poggio, Andrzej Banburski, Qianli Liao

In approximation theory both shallow and deep networks have been shown to approximate any continuous function on a bounded domain at the expense of an exponential number of parameters (exponential in the dimensionality of the function).

no code implementations • 12 Mar 2019 • Andrzej Banburski, Qianli Liao, Brando Miranda, Lorenzo Rosasco, Fernanda De La Torre, Jack Hidary, Tomaso Poggio

In particular, gradient descent induces a dynamics of the normalized weights which converge for $t \to \infty$ to an equilibrium which corresponds to a minimum norm (or maximum margin) solution.

2 code implementations • ICLR 2019 • Will Xiao, Honglin Chen, Qianli Liao, Tomaso Poggio

These results complement the study by Bartunov et al. (2018), and establish a new benchmark for future biologically plausible learning algorithms on more difficult datasets and more complex architectures.

Ranked #1 on Biologically-plausible Training on ImageNet

2 code implementations • 25 Jul 2018 • Qianli Liao, Brando Miranda, Andrzej Banburski, Jack Hidary, Tomaso Poggio

Given two networks with the same training loss on a dataset, when would they have drastically different test losses and errors?

no code implementations • 29 Jun 2018 • Tomaso Poggio, Qianli Liao, Brando Miranda, Andrzej Banburski, Xavier Boix, Jack Hidary

Here we prove a similar result for nonlinear multilayer DNNs near zero minima of the empirical loss.

no code implementations • 12 Jun 2018 • Charlie Frogner, Tomaso Poggio

We present a novel approximate inference method for diffusion processes, based on the Wasserstein gradient flow formulation of the diffusion.

no code implementations • 17 Feb 2018 • Hrushikesh Mhaskar, Tomaso Poggio

We argue that the minimal expected value of the square loss is inappropriate for measuring the generalization error in the approximation of compositional functions, if one is to take full advantage of the compositional structure.

no code implementations • 7 Jan 2018 • Chiyuan Zhang, Qianli Liao, Alexander Rakhlin, Brando Miranda, Noah Golowich, Tomaso Poggio

In Theory IIb we characterize with a mix of theory and experiments the optimization of deep convolutional networks by Stochastic Gradient Descent.

no code implementations • 30 Dec 2017 • Tomaso Poggio, Kenji Kawaguchi, Qianli Liao, Brando Miranda, Lorenzo Rosasco, Xavier Boix, Jack Hidary, Hrushikesh Mhaskar

In this note, we show that the dynamics associated to gradient descent minimization of nonlinear networks is topologically equivalent, near the asymptotically stable minima of the empirical error, to a linear gradient system in a quadratic potential with a degenerate (for square loss) or almost degenerate (for logistic or cross-entropy loss) Hessian.

1 code implementation • 5 Nov 2017 • Tengyuan Liang, Tomaso Poggio, Alexander Rakhlin, James Stokes

We study the relationship between geometry and capacity measures for deep neural networks from an invariance viewpoint.

no code implementations • 18 Jul 2017 • Gaurav Manek, Jie Lin, Vijay Chandrasekhar, Ling-Yu Duan, Sateesh Giduthuri, Xiao-Li Li, Tomaso Poggio

In this work, we focus on the problem of image instance retrieval with deep descriptors extracted from pruned Convolutional Neural Networks (CNN).

2 code implementations • NeurIPS 2017 • Anna Volokitin, Gemma Roig, Tomaso Poggio

Also, for all tested networks trained on targets in isolation, we find that recognition accuracy decreases the closer the flankers are to the target and the more flankers there are.

no code implementations • 28 Mar 2017 • Qianli Liao, Tomaso Poggio

Previous theoretical work on deep learning and neural network optimization tends to focus on avoiding saddle points and local minima.

no code implementations • 18 Jan 2017 • Vijay Chandrasekhar, Jie Lin, Qianli Liao, Olivier Morère, Antoine Veillard, Ling-Yu Duan, Tomaso Poggio

One major drawback of CNN-based {\it global descriptors} is that uncompressed deep neural network models require hundreds of megabytes of storage making them inconvenient to deploy in mobile applications or in custom hardware.

no code implementations • 2 Nov 2016 • Tomaso Poggio, Hrushikesh Mhaskar, Lorenzo Rosasco, Brando Miranda, Qianli Liao

The paper characterizes classes of functions for which deep learning can be exponentially better than shallow learning.

no code implementations • 19 Oct 2016 • Qianli Liao, Kenji Kawaguchi, Tomaso Poggio

We systematically explore a spectrum of normalization algorithms related to Batch Normalization (BN) and propose a generalized formulation that simultaneously solves two major limitations of BN: (1) online learning and (2) recurrent learning.

no code implementations • 10 Aug 2016 • Hrushikesh Mhaskar, Tomaso Poggio

The paper announces new results for a non-smooth activation function - the ReLU function - used in present-day neural networks, as well as for the Gaussian networks.

no code implementations • 5 Jun 2016 • Joel Z. Leibo, Qianli Liao, Winrich Freiwald, Fabio Anselmi, Tomaso Poggio

The primate brain contains a hierarchy of visual areas, dubbed the ventral stream, which rapidly computes object representations that are both specific for object identity and relatively robust against identity-preserving transformations like depth-rotations.

1 code implementation • 13 Apr 2016 • Qianli Liao, Tomaso Poggio

We discuss relations between Residual Networks (ResNet), Recurrent Neural Networks (RNNs) and the primate visual cortex.

no code implementations • 15 Mar 2016 • Olivier Morère, Jie Lin, Antoine Veillard, Vijay Chandrasekhar, Tomaso Poggio

The first one is Nested Invariance Pooling (NIP), a method inspired by i-theory, a mathematical theory for computing group invariant transformations with feed-forward neural networks.

no code implementations • 3 Mar 2016 • Hrushikesh Mhaskar, Qianli Liao, Tomaso Poggio

While the universal approximation property holds both for hierarchical and shallow networks, we prove that deep (hierarchical) networks can approximate the class of compositional functions with the same accuracy as shallow networks but with an exponentially lower number of training parameters as well as VC-dimension.

no code implementations • 9 Jan 2016 • Olivier Morère, Antoine Veillard, Jie Lin, Julie Petta, Vijay Chandrasekhar, Tomaso Poggio

Based on a thorough empirical evaluation using several publicly available datasets, we show that our method is able to significantly and consistently improve retrieval results every time a new type of invariance is incorporated.

no code implementations • 19 Nov 2015 • Yan Luo, Xavier Boix, Gemma Roig, Tomaso Poggio, Qi Zhao

To see this, first, we report results in ImageNet that lead to a revision of the hypothesis that adversarial perturbations are a consequence of CNNs acting as a linear classifier: CNNs act locally linearly to changes in the image regions with objects recognized by the CNN, and in other regions the CNN may act non-linearly.

2 code implementations • 17 Oct 2015 • Qianli Liao, Joel Z. Leibo, Tomaso Poggio

Gradient backpropagation (BP) requires symmetric feedforward and feedback connections -- the same weights must be used for forward and backward passes.

Ranked #1 on Handwritten Digit Recognition on MNIST (PERCENTAGE ERROR metric)
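The symmetry requirement can be relaxed by using fixed random feedback weights in the backward pass, as in feedback alignment (Lillicrap et al.). A toy sketch of that idea, not the paper's exact algorithm; the task, sizes, and learning rate are all illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)

# One-hidden-layer regression network; the backward pass uses a fixed
# random matrix B in place of W2.T, relaxing BP's weight symmetry.
W1 = rng.normal(scale=0.5, size=(2, 16))
W2 = rng.normal(scale=0.5, size=(16, 1))
B = rng.normal(scale=0.5, size=(16, 1))   # fixed random feedback weights

X = rng.normal(size=(200, 2))
y = X[:, :1] * X[:, 1:2]                  # toy target: product of the inputs

def mse():
    return float(np.mean((np.tanh(X @ W1) @ W2 - y) ** 2))

init_mse = mse()
lr = 0.05
for _ in range(500):
    h = np.tanh(X @ W1)
    err = h @ W2 - y
    dh = (err @ B.T) * (1 - h ** 2)       # B.T replaces W2.T here
    W2 -= lr * h.T @ err / len(X)
    W1 -= lr * X.T @ dh / len(X)

final_mse = mse()
print(init_mse, final_mse)               # training error falls despite asymmetry
```

Despite the error being propagated through weights unrelated to the forward ones, the loss still decreases on this toy task.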

3 code implementations • 16 Oct 2015 • Maximilian Nickel, Lorenzo Rosasco, Tomaso Poggio

Learning embeddings of entities and relations is an efficient and versatile method to perform machine learning on relational data such as knowledge graphs.

Ranked #7 on Link Prediction on FB15k
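One compositional scoring scheme for such embeddings is circular correlation, as in holographic embeddings: a triple is scored by the relation embedding dotted with the circular correlation of the entity embeddings. A minimal sketch; the dimensions and random vectors are illustrative:

```python
import numpy as np

def circular_correlation(a, b):
    """Circular correlation via the FFT identity
    corr(a, b) = ifft(conj(fft(a)) * fft(b)), i.e.
    corr(a, b)[k] = sum_i a[i] * b[(i + k) mod d]."""
    return np.real(np.fft.ifft(np.conj(np.fft.fft(a)) * np.fft.fft(b)))

def hole_score(e_s, e_o, r):
    """Score of a (subject, relation, object) triple: relation embedding
    dotted with the circular correlation of the entity embeddings."""
    return float(r @ circular_correlation(e_s, e_o))

rng = np.random.default_rng(0)
d = 8
e_s, e_o, r = rng.normal(size=(3, d))
print(hole_score(e_s, e_o, r))
```

The FFT makes the composition cost $O(d \log d)$ while keeping the composed representation the same size as the entity embeddings.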

no code implementations • 5 Aug 2015 • Fabio Anselmi, Lorenzo Rosasco, Cheston Tan, Tomaso Poggio

In i-theory a typical layer of a hierarchical architecture consists of HW modules pooling the dot products of the inputs to the layer with the transformations of a few templates under a group.

no code implementations • NeurIPS 2015 • Charlie Frogner, Chiyuan Zhang, Hossein Mobahi, Mauricio Araya-Polo, Tomaso Poggio

In this paper we develop a loss function for multi-label learning, based on the Wasserstein distance.
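In one dimension with ordered bins, the Wasserstein distance reduces to the L1 distance between CDFs, which makes visible how it penalizes predictions by how far their probability mass sits from the target. A small sketch; the bins and distributions are illustrative:

```python
import numpy as np

def wasserstein_1d(p, q):
    """W1 distance between two discrete distributions on ordered bins
    0..K-1 with unit spacing: the L1 distance between their CDFs."""
    return float(np.abs(np.cumsum(p) - np.cumsum(q)).sum())

# Two predictions that are equally wrong pointwise against a target on
# bin 0, but differently wrong once the bin ordering is taken into account:
target = np.array([1.0, 0.0, 0.0, 0.0])
near   = np.array([0.0, 1.0, 0.0, 0.0])   # mass on an adjacent bin
far    = np.array([0.0, 0.0, 0.0, 1.0])   # mass on a distant bin
print(wasserstein_1d(near, target), wasserstein_1d(far, target))
```

Unlike cross-entropy, which scores `near` and `far` identically here, the Wasserstein loss exploits the metric structure on the label space.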

no code implementations • NeurIPS 2015 • Youssef Mroueh, Stephen Voinea, Tomaso Poggio

Our analysis bridges invariant feature learning with kernel methods, as we show that this feature map defines an expected Haar integration kernel that is invariant to the specified group action.

1 code implementation • 13 Apr 2015 • Carlo Ciliberto, Youssef Mroueh, Tomaso Poggio, Lorenzo Rosasco

In this context a fundamental question is how to incorporate the task structure in the learning problem. We tackle this question by studying a general computational framework that allows encoding a priori knowledge of the task structure in the form of a convex penalty; in this setting a variety of previously proposed methods can be recovered as special cases, including linear and non-linear approaches.

no code implementations • 19 Mar 2015 • Fabio Anselmi, Lorenzo Rosasco, Tomaso Poggio

We discuss data representations which can be learned automatically from data, are invariant to transformations, and at the same time selective, in the sense that two points have the same representation only if one is a transformation of the other.

no code implementations • 12 Sep 2014 • Qianli Liao, Joel Z. Leibo, Tomaso Poggio

Populations of neurons in inferotemporal cortex (IT) maintain an explicit code for object identity that also tolerates transformations of object appearance, e.g., position, scale, viewing angle [1, 2, 3].

no code implementations • 16 Jun 2014 • Georgios Evangelopoulos, Stephen Voinea, Chiyuan Zhang, Lorenzo Rosasco, Tomaso Poggio

Recognition of speech, and in particular the ability to generalize and learn from small sets of labelled examples like humans do, depends on an appropriate representation of the acoustic input.

no code implementations • 15 Jun 2014 • Cheston Tan, Tomaso Poggio

The main aim of this work is to further the fundamental understanding of what causes the visual processing of faces to be different from that of objects.

no code implementations • 6 Jun 2014 • Tomaso Poggio, Jim Mutch, Leyla Isik

From the slope of the inverse of the magnification factor, M-theory predicts a cortical "fovea" in V1 in the order of $40$ by $40$ basic units at each receptive field size -- corresponding to a foveola of size around $26$ minutes of arc at the highest resolution, $\approx 6$ degrees at the lowest resolution.

no code implementations • 1 Apr 2014 • Chiyuan Zhang, Georgios Evangelopoulos, Stephen Voinea, Lorenzo Rosasco, Tomaso Poggio

We present the main theoretical and computational aspects of a framework for unsupervised learning of invariant audio representations, empirically evaluated on music genre classification.

no code implementations • NeurIPS 2013 • Cheston Tan, Jedediah M. Singer, Thomas Serre, David Sheinberg, Tomaso Poggio

The macaque Superior Temporal Sulcus (STS) is a brain area that receives and integrates inputs from both the ventral and dorsal visual processing streams (thought to specialize in form and motion processing respectively).

no code implementations • NeurIPS 2013 • Qianli Liao, Joel Z. Leibo, Tomaso Poggio

Next, we apply the model to non-affine transformations: as expected, it performs well on face verification tasks requiring invariance to the relatively smooth transformations of 3D rotation-in-depth and changes in illumination direction.

no code implementations • 17 Nov 2013 • Fabio Anselmi, Joel Z. Leibo, Lorenzo Rosasco, Jim Mutch, Andrea Tacchetti, Tomaso Poggio

It also suggests that the main computational goal of the ventral stream of visual cortex is to provide a hierarchical representation of new objects/images which is invariant to transformations, stable, and discriminative for recognition---and that this representation may be continuously learned in an unsupervised way during development and visual experience.

no code implementations • 16 Nov 2013 • Qianli Liao, Joel Z. Leibo, Youssef Mroueh, Tomaso Poggio

The standard approach to unconstrained face recognition in natural photographs is via a detection, alignment, recognition pipeline.

no code implementations • 24 Mar 2013 • Silvia Villa, Lorenzo Rosasco, Tomaso Poggio

We consider the fundamental question of learnability of a hypotheses class in the supervised learning setting and in the general learning setting introduced by Vladimir Vapnik.

no code implementations • NeurIPS 2012 • Guillermo Canas, Tomaso Poggio, Lorenzo Rosasco

We study the problem of estimating a manifold from random samples.

no code implementations • NeurIPS 2012 • Youssef Mroueh, Tomaso Poggio, Lorenzo Rosasco, Jean-Jacques Slotine

In this paper we discuss a novel framework for multiclass learning, defined by a suitable coding/decoding strategy, namely the simplex coding, that allows generalizing to multiple classes a relaxation approach commonly used in binary classification.
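The simplex coding assigns the C classes unit-norm code vectors with equal pairwise inner products, the vertices of a regular simplex centered at the origin. One standard construction centers and rescales the standard basis; a minimal sketch (the choice of C is illustrative):

```python
import numpy as np

def simplex_code(C):
    """C unit-norm code vectors with pairwise inner products -1/(C-1):
    vertices of a regular simplex centered at the origin. Represented
    in R^C for simplicity; they span a (C-1)-dimensional subspace."""
    V = np.eye(C) - np.ones((C, C)) / C   # center the standard basis vectors
    V *= np.sqrt(C / (C - 1.0))           # rescale rows to unit norm
    return V

V = simplex_code(4)
print(np.round(V @ V.T, 6))  # Gram matrix: 1 on the diagonal, -1/3 off it
```

These maximally separated, symmetric codes are what let a binary margin-based relaxation extend to C classes.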

no code implementations • NeurIPS 2011 • Joel Z. Leibo, Jim Mutch, Tomaso Poggio

Many studies have uncovered evidence that visual cortex contains specialized regions involved in processing faces but not other object classes.

no code implementations • NeurIPS 2009 • Jake Bouvrie, Lorenzo Rosasco, Tomaso Poggio

A goal of central importance in the study of hierarchical models for object recognition -- and indeed the visual cortex -- is that of understanding quantitatively the trade-off between invariance and selectivity, and how invariance and discrimination properties contribute towards providing an improved representation useful for learning from data.
