no code implementations • 10 Sep 2024 • Lukas Muttenthaler, Klaus Greff, Frieda Born, Bernhard Spitzer, Simon Kornblith, Michael C. Mozer, Klaus-Robert Müller, Thomas Unterthiner, Andrew K. Lampinen
Deep neural networks have achieved success across a wide range of applications, including as models of human behavior in vision tasks.
1 code implementation • 14 Mar 2024 • Yanlai Yang, Matt Jones, Michael C. Mozer, Mengye Ren
We explore the training dynamics of neural networks in a structured non-IID setting where documents are presented cyclically in a fixed, repeated sequence.
no code implementations • 24 Oct 2023 • Katherine L. Hermann, Hossein Mobahi, Thomas Fel, Michael C. Mozer
Deep-learning models can extract a rich assortment of features from data.
1 code implementation • 18 Jul 2023 • Pratyush Maini, Michael C. Mozer, Hanie Sedghi, Zachary C. Lipton, J. Zico Kolter, Chiyuan Zhang
Recent efforts at explaining the interplay of memorization and generalization in deep overparametrized networks have posited that neural networks memorize "hard" examples in the final few layers of the model.
no code implementations • 31 May 2023 • Ayush Chakravarthy, Trang Nguyen, Anirudh Goyal, Yoshua Bengio, Michael C. Mozer
The aim of object-centric vision is to construct an explicit representation of the objects in a scene.
no code implementations • 18 Nov 2022 • Amr Khalifa, Michael C. Mozer, Hanie Sedghi, Behnam Neyshabur, Ibrahim Alabdulmohsin
Inspired by this, we show that extending temperature scaling across all layers improves both calibration and accuracy.
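For reference, a minimal sketch of standard (last-layer) temperature scaling, which the paper extends across all layers: a single scalar T is fit on a held-out calibration split by minimizing the negative log-likelihood. The function names and search bounds below are illustrative, not from the paper.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def nll_at_temperature(T, logits, labels):
    """Negative log-likelihood of held-out labels under temperature-scaled logits."""
    z = logits / T
    z = z - z.max(axis=1, keepdims=True)                      # numerical stability
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()

def fit_temperature(logits, labels):
    """Find the scalar T > 0 that minimizes calibration-set NLL."""
    res = minimize_scalar(nll_at_temperature, bounds=(0.05, 20.0),
                          args=(logits, labels), method="bounded")
    return res.x

# Usage: T = fit_temperature(val_logits, val_labels), then apply softmax(test_logits / T) at test time.
```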
no code implementations • 9 Nov 2022 • Tyler R. Scott, Ting Liu, Michael C. Mozer, Andrew C. Gallagher
Recent research in clustering face embeddings has found that unsupervised, shallow, heuristic-based methods -- including k-means and hierarchical agglomerative clustering -- underperform supervised, deep, inductive methods.
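A minimal sketch of the two unsupervised baselines named above, run with scikit-learn on L2-normalized embeddings; the embeddings and cluster counts here are placeholders, not values from the paper.

```python
import numpy as np
from sklearn.cluster import KMeans, AgglomerativeClustering
from sklearn.preprocessing import normalize

# Placeholder embeddings standing in for face embeddings from any frozen encoder.
embeddings = normalize(np.random.default_rng(0).standard_normal((1000, 128)))

kmeans_labels = KMeans(n_clusters=50, n_init=10, random_state=0).fit_predict(embeddings)

hac = AgglomerativeClustering(n_clusters=None, distance_threshold=1.0, linkage="average")
hac_labels = hac.fit_predict(embeddings)

print(f"k-means clusters: {len(set(kmeans_labels))}, HAC clusters: {len(set(hac_labels))}")
```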
1 code implementation • 15 Jun 2022 • Gamaleldin F. Elsayed, Aravindh Mahendran, Sjoerd van Steenkiste, Klaus Greff, Michael C. Mozer, Thomas Kipf
The visual world can be parsimoniously characterized in terms of distinct entities with sparse interactions.
no code implementations • 11 Mar 2022 • Shruthi Sukumar, Adrian F. Ward, Camden Elliott-Williams, Shabnam Hakimi, Michael C. Mozer
Individuals are often faced with temptations that can lead them astray from long-term goals.
2 code implementations • 10 Jan 2022 • Utku Evci, Vincent Dumoulin, Hugo Larochelle, Michael C. Mozer
We propose a method, Head-to-Toe probing (Head2Toe), that selects features from all layers of the source model to train a classification head for the target-domain.
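A simplified sketch of that idea: pool activations from every layer of a frozen source model, keep only the highest-scoring features, and train a linear head on them. The variance-of-class-means score below is only a stand-in for the paper's group-lasso-style relevance, and the feature matrices are placeholders for activations collected with forward hooks.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# layer_feats: per-layer activation matrices of shape (n_examples, d_layer); placeholders here.
rng = np.random.default_rng(0)
layer_feats = [rng.standard_normal((200, d)) for d in (64, 128, 256)]
labels = rng.integers(0, 5, size=200)

all_feats = np.concatenate(layer_feats, axis=1)        # "head-to-toe" feature pool across layers

# Score features by class separability (variance of class means) as a simple proxy relevance.
class_means = np.stack([all_feats[labels == c].mean(axis=0) for c in np.unique(labels)])
scores = class_means.var(axis=0)
keep = np.argsort(scores)[-128:]                       # retain the top-scoring features

head = LogisticRegression(max_iter=1000).fit(all_feats[:, keep], labels)
print("training accuracy of the selected-feature head:", head.score(all_feats[:, keep], labels))
```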
1 code implementation • 13 Sep 2021 • Mengye Ren, Tyler R. Scott, Michael L. Iuzzolino, Michael C. Mozer, Richard Zemel
Real-world learning scenarios involve a nonstationary distribution of classes with sequential dependencies among the samples, in contrast to the standard machine learning formulation of drawing samples independently from a fixed, typically uniform distribution.
1 code implementation • 6 Sep 2021 • Nino Scherrer, Olexa Bilaniuk, Yashas Annadani, Anirudh Goyal, Patrick Schwab, Bernhard Schölkopf, Michael C. Mozer, Yoshua Bengio, Stefan Bauer, Nan Rosemary Ke
Discovering causal structures from data is a challenging inference problem of fundamental importance in all areas of science.
no code implementations • NeurIPS 2021 • Archit Karandikar, Nicholas Cain, Dustin Tran, Balaji Lakshminarayanan, Jonathon Shlens, Michael C. Mozer, Becca Roelofs
When incorporated into training, these soft calibration losses achieve state-of-the-art single-model ECE across multiple datasets with less than 1% decrease in accuracy.
1 code implementation • ICCV 2021 • Tyler R. Scott, Andrew C. Gallagher, Michael C. Mozer
Recent work has argued that classification losses utilizing softmax cross-entropy are superior not only for fixed-set classification tasks but also for open-set tasks such as few-shot learning and retrieval, outperforming losses developed specifically for those settings.
no code implementations • 15 Mar 2021 • Piotr Teterwak, Chiyuan Zhang, Dilip Krishnan, Michael C. Mozer
We use our reconstruction model as a tool for exploring the nature of representations, including: the influence of model architecture and training objectives (specifically robust losses), the forms of invariance that networks achieve, representational differences between correctly and incorrectly classified images, and the effects of manipulating logits and images.
1 code implementation • NeurIPS 2021 • Michael L. Iuzzolino, Michael C. Mozer, Samy Bengio
Although deep feedforward neural networks share some characteristics with the primate visual system, a key distinction is their dynamics.
1 code implementation • 15 Dec 2020 • Rebecca Roelofs, Nicholas Cain, Jonathon Shlens, Michael C. Mozer
We find that binning-based estimators with bins of equal mass (number of instances) have lower bias than estimators with bins of equal width.
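A minimal sketch of the comparison: expected calibration error computed with either equal-mass (quantile) or equal-width bins. The function name and default bin count are illustrative.

```python
import numpy as np

def ece(confidences, correct, n_bins=15, equal_mass=True):
    """Expected calibration error under equal-mass or equal-width binning."""
    conf = np.asarray(confidences, dtype=float)
    acc = np.asarray(correct, dtype=float)
    if equal_mass:   # every bin holds roughly the same number of predictions
        edges = np.quantile(conf, np.linspace(0.0, 1.0, n_bins + 1))
    else:            # bins of equal width on [0, 1]
        edges = np.linspace(0.0, 1.0, n_bins + 1)
    bin_ids = np.clip(np.digitize(conf, edges[1:-1]), 0, n_bins - 1)
    err = 0.0
    for b in range(n_bins):
        mask = bin_ids == b
        if mask.any():
            err += mask.mean() * abs(acc[mask].mean() - conf[mask].mean())
    return err

# Usage: ece(max_softmax_probs, predictions == labels, equal_mass=True)
```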
no code implementations • 13 Oct 2020 • Maria Attarian, Brett D. Roads, Michael C. Mozer
Deep-learning vision models have shown intriguing similarities and differences with respect to human vision.
1 code implementation • ICLR 2021 • Mengye Ren, Michael L. Iuzzolino, Michael C. Mozer, Richard S. Zemel
We aim to bridge the gap between typical human and machine-learning environments by extending the standard framework of few-shot learning to an online, continual setting.
no code implementations • 11 Feb 2020 • Zeqian Li, Michael C. Mozer, Jacob Whitehill
We present a compositional embedding framework that infers not just a single class per input image, but a set of classes, in the setting of one-shot learning.
2 code implementations • 8 Feb 2020 • Ziheng Jiang, Chiyuan Zhang, Kunal Talwar, Michael C. Mozer
We obtain empirical estimates of this score for individual instances in multiple data sets, and we show that the score identifies out-of-distribution and mislabeled examples at one end of the continuum and strongly regular examples at the other end.
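A hedged sketch of the kind of holdout estimate such a score suggests: train models on random subsets of the data and record how often each example is classified correctly when it is left out. A linear classifier stands in for the deep networks used in the paper, and all names are illustrative.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def holdout_consistency_scores(X, y, n_runs=20, subset_frac=0.7, seed=0):
    """Per-example score: how often models trained without an example still classify it correctly."""
    rng = np.random.default_rng(seed)
    n = len(y)
    hits, counts = np.zeros(n), np.zeros(n)
    for _ in range(n_runs):
        train_idx = rng.choice(n, size=int(subset_frac * n), replace=False)
        held_out = np.setdiff1d(np.arange(n), train_idx)
        model = LogisticRegression(max_iter=1000).fit(X[train_idx], y[train_idx])
        hits[held_out] += model.predict(X[held_out]) == y[held_out]
        counts[held_out] += 1
    return hits / np.maximum(counts, 1)   # low scores flag mislabeled or out-of-distribution points
```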
2 code implementations • 2 Oct 2019 • Nan Rosemary Ke, Olexa Bilaniuk, Anirudh Goyal, Stefan Bauer, Hugo Larochelle, Bernhard Schölkopf, Michael C. Mozer, Chris Pal, Yoshua Bengio
Promising results have driven a recent surge of interest in continuous optimization methods for Bayesian network structure learning from observational data.
no code implementations • 25 Sep 2019 • Tyler R. Scott, Karl Ridgeway, Michael C. Mozer
We propose a probabilistic method that treats embeddings as random variables.
no code implementations • 8 Jun 2019 • Michael Iuzzolino, Yoram Singer, Michael C. Mozer
In human perception and cognition, a fundamental operation that brains perform is interpretation: constructing coherent neural states from noisy, incomplete, and intrinsically ambiguous evidence.
no code implementations • ICML Workshop Deep_Phenomen 2019 • Guy Davidson, Michael C. Mozer
We explore the behavior of a standard convolutional neural net in a setting that introduces classification tasks sequentially and requires the net to master new tasks while preserving mastery of previously learned tasks.
no code implementations • 26 May 2019 • Alex Lamb, Jonathan Binas, Anirudh Goyal, Sandeep Subramanian, Ioannis Mitliagkas, Denis Kazakov, Yoshua Bengio, Michael C. Mozer
Machine learning promises methods that generalize well from finite labeled data.
no code implementations • CVPR 2020 • Guy Davidson, Michael C. Mozer
Through simulations involving sequences of ten related visual tasks, we find reason for optimism that nets will scale well as they advance from having a single skill to becoming multi-skill domain experts.
1 code implementation • 4 Mar 2019 • Been Kim, Emily Reif, Martin Wattenberg, Samy Bengio, Michael C. Mozer
The Gestalt laws of perceptual organization, which describe how visual elements in an image are grouped and interpreted, have traditionally been thought of as innate despite their ecological validity.
no code implementations • ICLR 2020 • Chiyuan Zhang, Samy Bengio, Moritz Hardt, Michael C. Mozer, Yoram Singer
We study the interplay between memorization and generalization of overparameterized networks in the extreme case of a single training example and an identity-mapping task.
no code implementations • NeurIPS 2018 • Nan Rosemary Ke, Anirudh Goyal Alias Parth Goyal, Olexa Bilaniuk, Jonathan Binas, Michael C. Mozer, Chris Pal, Yoshua Bengio
We consider the hypothesis that such memory associations between past and present could be used for credit assignment through arbitrarily long sequences, propagating the credit assigned to the current state to the associated past state.
1 code implementation • NeurIPS 2018 • Tyler Scott, Karl Ridgeway, Michael C. Mozer
We hope our results will motivate a unification of research in weight transfer, deep metric learning, and few-shot learning.
no code implementations • ICLR 2019 • Karl Ridgeway, Michael C. Mozer
We present a domain-independent method that permits the open-ended recombination of style of one image with the content of another.
no code implementations • 11 Sep 2018 • Nan Rosemary Ke, Anirudh Goyal, Olexa Bilaniuk, Jonathan Binas, Michael C. Mozer, Chris Pal, Yoshua Bengio
We consider the hypothesis that such memory associations between past and present could be used for credit assignment through arbitrarily long sequences, propagating the credit assigned to the current state to the associated past state.
2 code implementations • 22 May 2018 • Tyler R. Scott, Karl Ridgeway, Michael C. Mozer
We hope our results will motivate a unification of research in weight transfer, deep metric learning, and few-shot learning.
no code implementations • ICLR 2019 • Michael C. Mozer, Denis Kazakov, Robert V. Lindsey
Attractor dynamics are incorporated into the hidden state to "clean up" representations at each step of a sequence.
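A minimal sketch of such a clean-up step, under the simplifying assumption of a symmetric attractor network relaxed for a fixed number of iterations; this is not the paper's exact architecture.

```python
import numpy as np

def attractor_cleanup(h, W, n_steps=10, step_size=0.5):
    """Relax a noisy hidden state toward a fixed point of a simple attractor net."""
    W = (W + W.T) / 2.0              # symmetric weights encourage convergence to a fixed point
    for _ in range(n_steps):
        h = (1 - step_size) * h + step_size * np.tanh(W @ h)
    return h

# Schematic use inside a recurrent step: h_t = attractor_cleanup(rnn_cell(x_t, h_prev), W_attr)
```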
3 code implementations • NeurIPS 2018 • Karl Ridgeway, Michael C. Mozer
Deep-embedding methods aim to discover representations of a domain that make explicit the domain's class structure and thereby support few-shot learning.
1 code implementation • 11 Oct 2017 • Michael C. Mozer, Denis Kazakov, Robert V. Lindsey
The CT-GRU arises by interpreting the gates of a GRU as selecting a time scale of memory; it generalizes the GRU by incorporating multiple time scales of memory and performing context-dependent selection of time scales for information storage and retrieval.
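A deliberately simplified sketch of the multi-timescale idea: parallel memory traces with different half-lives and a soft selection of which scale stores each input. The full CT-GRU's gating and retrieval mechanisms are richer than this toy version, and all names and values below are illustrative.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

class MultiScaleMemory:
    """Toy multi-timescale memory: traces with distinct half-lives, soft scale selection."""
    def __init__(self, dim, half_lives=(1.0, 4.0, 16.0), seed=0):
        rng = np.random.default_rng(seed)
        self.decays = np.array([0.5 ** (1.0 / hl) for hl in half_lives])   # per-step retention
        self.traces = np.zeros((len(half_lives), dim))
        self.W_scale = 0.1 * rng.standard_normal((len(half_lives), dim))   # scale-selection weights

    def step(self, candidate):
        store = softmax(self.W_scale @ candidate)             # which time scale should hold this input?
        self.traces = self.decays[:, None] * self.traces + store[:, None] * candidate
        return self.traces.sum(axis=0)                        # read-out pools across time scales
```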
no code implementations • 24 Dec 2016 • Ronald T. Kneusel, Michael C. Mozer
We describe a human-machine cooperative approach to visual search, the aim of which is to outperform either human or machine acting alone.
no code implementations • 14 Mar 2016 • Mohammad Khajah, Robert V. Lindsey, Michael C. Mozer
In theoretical cognitive science, there is a tension between highly structured models whose parameters have a direct psychological interpretation and highly complex, general-purpose models whose parameters and representations are difficult to interpret.
1 code implementation • 19 Nov 2015 • Jake Snell, Karl Ridgeway, Renjie Liao, Brett D. Roads, Michael C. Mozer, Richard S. Zemel
We propose instead to use a loss function that is better calibrated to human perceptual judgments of image quality: the multiscale structural-similarity score (MS-SSIM).
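For reference, a minimal single-scale SSIM with uniform (box) windows on grayscale images in [0, 1]; the paper's loss uses the differentiable multiscale variant, which averages SSIM over successively downsampled copies of the images.

```python
import numpy as np

def ssim_uniform(x, y, window=8, c1=0.01**2, c2=0.03**2):
    """Single-scale SSIM with box windows for 2D grayscale images scaled to [0, 1]."""
    def pooled(img, fn):
        h, w = img.shape
        h, w = h - h % window, w - w % window
        blocks = img[:h, :w].reshape(h // window, window, w // window, window)
        return fn(blocks, axis=(1, 3))
    mx, my = pooled(x, np.mean), pooled(y, np.mean)
    vx, vy = pooled(x, np.var), pooled(y, np.var)
    cov = pooled(x * y, np.mean) - mx * my
    ssim_map = ((2 * mx * my + c1) * (2 * cov + c2)) / ((mx**2 + my**2 + c1) * (vx + vy + c2))
    return ssim_map.mean()

# As a training objective one would minimize 1 - SSIM (or 1 - MS-SSIM for the multiscale form).
```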
no code implementations • NeurIPS 2014 • Robert V. Lindsey, Mohammad Khajah, Michael C. Mozer
First, in three of the five datasets, the skills inferred by our technique support significantly improved predictions of student performance over the expert-provided skills.
no code implementations • NeurIPS 2013 • Robert V. Lindsey, Michael C. Mozer, William J. Huggins, Harold Pashler
For example, in the domain of concept learning, a policy might specify the nature of exemplars chosen over a training sequence.
no code implementations • NeurIPS 2011 • Michael C. Mozer, Benjamin Link, Harold Pashler
Psychologists have long been struck by individuals' limitations in expressing their internal sensations, impressions, and evaluations via rating scales.
no code implementations • NeurIPS 2010 • Michael C. Mozer, Harold Pashler, Matthew Wilder, Robert V. Lindsey, Matt Jones, Michael N. Jones
Our decontamination techniques reduce the error of human judgments by over 20%.
no code implementations • NeurIPS 2009 • Harold Pashler, Nicholas Cepeda, Robert V. Lindsey, Ed Vul, Michael C. Mozer
MCM is intriguingly similar to a Bayesian multiscale model of memory (Kording, Tenenbaum, & Shadmehr, 2007), yet MCM is better able to account for human declarative memory.
no code implementations • NeurIPS 2009 • Matthew Wilder, Matt Jones, Michael C. Mozer
The Dynamic Belief Model (DBM) (Yu & Cohen, 2008) explains sequential effects in 2AFC tasks as a rational consequence of a dynamic internal representation that tracks second-order statistics of the trial sequence (repetition rates) and predicts whether the upcoming trial will be a repetition or an alternation of the previous trial.
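A highly simplified stand-in for that belief-updating process: a leaky (exponentially discounted) estimate of the repetition rate, used to predict whether the next trial repeats. The DBM's changepoint prior is replaced here by a fixed decay, and the parameter values are illustrative.

```python
import numpy as np

def predict_repetitions(stimuli, decay=0.8, prior=0.5):
    """Leaky estimate of the repetition rate; a toy stand-in for the DBM's belief updating."""
    p_rep, predictions = prior, []
    for prev, cur in zip(stimuli[:-1], stimuli[1:]):
        predictions.append(p_rep)                       # belief that this trial repeats the last
        repeated = float(prev == cur)
        p_rep = decay * p_rep + (1 - decay) * repeated  # discount old evidence, fold in the new trial
    return np.array(predictions)

# Trials where the prediction matches the outcome are the ones on which human responses
# are typically faster -- the classic sequential effect.
```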
no code implementations • NeurIPS 2008 • Jeremy Reynolds, Michael C. Mozer
We show that our model provides a parsimonious account of behavioral and neuroimaging data, and suggest that it offers an elegant conceptualization of control in which behavior can be cast as optimal, subject to limitations on learning and the rate of information processing.
no code implementations • NeurIPS 2008 • Matt Jones, Sachiko Kinoshita, Michael C. Mozer
We propose a rationally motivated mathematical model of this sequential adaptation of control, based on a diffusion model of the decision process in which difficulty corresponds to the drift rate for the correct response.
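A minimal single-trial sketch of such a diffusion decision process, with difficulty mapped onto the drift rate; all parameter values are illustrative rather than fitted.

```python
import numpy as np

def simulate_diffusion(drift, threshold=1.0, noise=1.0, dt=0.001, max_t=5.0, seed=0):
    """Simulate one drift-diffusion decision; returns (correct, response_time)."""
    rng = np.random.default_rng(seed)
    x, t = 0.0, 0.0
    while abs(x) < threshold and t < max_t:
        x += drift * dt + noise * np.sqrt(dt) * rng.standard_normal()
        t += dt
    return x >= threshold, t   # upper boundary treated as the correct response

# Higher drift (easier items) yields faster, more accurate decisions; lowering the drift
# rate is how the model captures increased difficulty on the current trial.
```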