no code implementations • ICML 2020 • Ching-Yao Chuang, Antonio Torralba, Stefanie Jegelka
We also propose a method for estimating how well a model based on domain-invariant representations will perform on the target domain, without having seen any target labels.
no code implementations • ICML 2020 • Jingzhao Zhang, Hongzhou Lin, Stefanie Jegelka, Suvrit Sra, Ali Jadbabaie
Therefore, we introduce the notion of $(\delta, \epsilon)$-stationarity, a generalization that allows for a point to be within distance $\delta$ of an $\epsilon$-stationary point and reduces to $\epsilon$-stationarity for smooth functions.
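For concreteness, a sketch of the Goldstein-style formalization this notion builds on (stated from memory, so treat the exact form as an assumption rather than the paper's verbatim definition):

```latex
% x is (\delta, \epsilon)-stationary for f if some convex combination of
% subgradients taken within a \delta-ball around x is \epsilon-small:
\min\Big\{\, \|g\| \;:\; g \in \operatorname{conv}\Big(\textstyle\bigcup_{\|y-x\|\le\delta}\partial f(y)\Big) \Big\} \;\le\; \epsilon
```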
no code implementations • 24 Mar 2023 • Behrooz Tahmasebi, Stefanie Jegelka
For groups of positive dimension, the gain is observed by a reduction in the manifold's dimension, in addition to a factor proportional to the volume of the quotient space.
1 code implementation • 7 Feb 2023 • Ryotaro Okabe, Shangjie Xue, Jiankai Yu, Tongtong Liu, Benoit Forget, Stefanie Jegelka, Gordon Kohse, Lin-wen Hu, Mingda Li
Here we present a computational framework using Tetris-inspired detector pixels and machine learning for radiation mapping.
1 code implementation • 31 Jan 2023 • Ching-Yao Chuang, Varun Jampani, Yuanzhen Li, Antonio Torralba, Stefanie Jegelka
Machine learning models have been shown to inherit biases from their training datasets, which can be particularly problematic for vision-language foundation models trained on uncurated datasets scraped from the internet.
no code implementations • 26 Jan 2023 • Michael Murphy, Stefanie Jegelka, Ernest Fraenkel, Tobias Kind, David Healey, Thomas Butler
Identifying a small molecule from its mass spectrum is the primary open problem in computational metabolomics.
no code implementations • 28 Dec 2022 • Tasuku Soma, Khashayar Gatmiry, Stefanie Jegelka
Distributionally robust optimization (DRO) can improve the robustness and fairness of learning methods.
1 code implementation • 6 Oct 2022 • Ching-Yao Chuang, Stefanie Jegelka, David Alvarez-Melis
Optimal transport aligns samples across distributions by minimizing the transportation cost between them, e.g., the geometric distances.
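As a minimal sketch of how such an alignment is computed in practice, here is entropy-regularized OT via Sinkhorn iterations (plain NumPy; the regularizer `reg` and iteration count are illustrative choices, not the paper's):

```python
import numpy as np

def sinkhorn_plan(a, b, C, reg=0.1, n_iter=200):
    """Entropic optimal transport: find a coupling P with marginals a and b
    that minimizes <P, C> - reg * H(P), where C holds pairwise costs
    (e.g., squared Euclidean distances between samples)."""
    K = np.exp(-C / reg)
    u = np.ones_like(a)
    for _ in range(n_iter):
        v = b / (K.T @ u)   # scale columns to match marginal b
        u = a / (K @ v)     # scale rows to match marginal a
    return u[:, None] * K * v[None, :]  # transport plan P
```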
1 code implementation • 4 Oct 2022 • Ching-Yao Chuang, Stefanie Jegelka
Understanding generalization and robustness of machine learning models fundamentally relies on assuming an appropriate metric on the data space.
no code implementations • 16 Aug 2022 • Nisha Chandramoorthy, Andreas Loukas, Khashayar Gatmiry, Stefanie Jegelka
To reduce this discrepancy between theory and practice, this paper focuses on the generalization of neural networks whose training dynamics do not necessarily converge to fixed points.
1 code implementation • 8 Aug 2022 • Nikolaos Karalias, Joshua Robinson, Andreas Loukas, Stefanie Jegelka
Integrating functions on discrete domains into neural networks is key to developing their capability to reason about discrete objects.
no code implementations • 16 Apr 2022 • Stefanie Jegelka
Graph Neural Networks (GNNs), neural network architectures targeted to learning representations of graphs, have become a popular learning model for prediction tasks on nodes, graphs and configurations of points, with wide success in practice.
2 code implementations • 25 Feb 2022 • Derek Lim, Joshua Robinson, Lingxiao Zhao, Tess Smidt, Suvrit Sra, Haggai Maron, Stefanie Jegelka
We introduce SignNet and BasisNet -- new neural architectures that are invariant to two key symmetries displayed by eigenvectors: (i) sign flips, since if $v$ is an eigenvector then so is $-v$; and (ii) more general basis symmetries, which occur in higher dimensional eigenspaces with infinitely many choices of basis eigenvectors.
Ranked #4 on Graph Regression on ZINC-500k
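A minimal sketch of the sign-invariance idea (layer sizes and module names are illustrative, not the paper's architecture): applying the same network to $v$ and $-v$ and summing makes the output invariant to sign flips by construction.

```python
import torch
import torch.nn as nn

class SignInvariantEncoder(nn.Module):
    """f(v) = rho(phi(v) + phi(-v)), so f(v) == f(-v) for any input."""
    def __init__(self, n_nodes, hidden=64, out=32):
        super().__init__()
        self.phi = nn.Sequential(nn.Linear(n_nodes, hidden), nn.ReLU(),
                                 nn.Linear(hidden, hidden))
        self.rho = nn.Sequential(nn.Linear(hidden, hidden), nn.ReLU(),
                                 nn.Linear(hidden, out))

    def forward(self, v):  # v: (batch, n_nodes) Laplacian eigenvectors
        return self.rho(self.phi(v) + self.phi(-v))
```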
no code implementations • ICLR 2022 • Thien Le, Stefanie Jegelka
The implicit bias induced by the training of neural networks has become a topic of rigorous study.
1 code implementation • CVPR 2022 • Ching-Yao Chuang, R Devon Hjelm, Xin Wang, Vibhav Vineet, Neel Joshi, Antonio Torralba, Stefanie Jegelka, Yale Song
Contrastive learning relies on an assumption that positive pairs contain related views, e.g., patches of an image or co-occurring multimodal signals of a video, that share certain underlying information about an instance.
no code implementations • ICLR 2022 • Khashayar Gatmiry, Stefanie Jegelka, Jonathan Kelner
While there has been substantial recent work studying generalization of neural networks, the ability of deep nets in automating the process of feature extraction still evades a thorough mathematical understanding.
no code implementations • 29 Sep 2021 • Nikolaos Karalias, Joshua David Robinson, Andreas Loukas, Stefanie Jegelka
Our framework includes well-known extensions such as the Lovasz extension of submodular set functions and facilitates the design of novel continuous extensions based on problem-specific considerations, including constraints.
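For reference, the Lovász extension mentioned here has a simple closed form; a sketch (assuming $F(\emptyset) = 0$, with `F` taking a frozenset of indices):

```python
import numpy as np

def lovasz_extension(F, x):
    """Evaluate the Lovasz extension of a set function F at x:
    sort coordinates in decreasing order and take the telescoping sum
    sum_k x[sigma_k] * (F(S_k) - F(S_{k-1})), where S_k collects the
    k largest coordinates of x."""
    order = np.argsort(-x)
    val, prev, S = 0.0, 0.0, set()
    for i in order:
        S.add(int(i))
        cur = F(frozenset(S))
        val += x[i] * (cur - prev)
        prev = cur
    return val
```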
no code implementations • 29 Sep 2021 • Behrooz Tahmasebi, Stefanie Jegelka
Our theoretical results imply constraints on the model for exploiting random node IDs, and, conversely, insights into the tolerance of a given model class for retaining discrimination with perturbations of node attributes.
no code implementations • NeurIPS 2021 • Alkis Gotovos, Rebekka Burkholz, John Quackenbush, Stefanie Jegelka
Modeling the time evolution of discrete sets of items (e.g., genetic mutations) is a fundamental problem in many biomedical applications.
1 code implementation • NeurIPS 2021 • Joshua Robinson, Li Sun, Ke Yu, Kayhan Batmanghelich, Stefanie Jegelka, Suvrit Sra
However, we observe that the contrastive loss does not always sufficiently guide which features are extracted, a behavior that can negatively impact the performance on downstream tasks via "shortcuts", i.e., by inadvertently suppressing important predictive features.
1 code implementation • NeurIPS 2021 • Andreas Loukas, Marinos Poiitis, Stefanie Jegelka
This work explores the Benevolent Training Hypothesis (BTH) which argues that the complexity of the function a deep neural network (NN) is learning can be deduced by its training dynamics.
1 code implementation • NeurIPS 2021 • Ching-Yao Chuang, Youssef Mroueh, Kristjan Greenewald, Antonio Torralba, Stefanie Jegelka
Understanding the generalization of deep neural networks is one of the most important tasks in deep learning.
no code implementations • 10 May 2021 • Keyulu Xu, Mozhi Zhang, Stefanie Jegelka, Kenji Kawaguchi
Our results show that the training of GNNs is implicitly accelerated by skip connections, more depth, and/or a good label distribution.
no code implementations • 1 Jan 2021 • Behrooz Tahmasebi, Stefanie Jegelka
While Graph Neural Networks (GNNs) have become increasingly popular architectures for learning with graphs, recent works have revealed important shortcomings in their expressive power.
no code implementations • 6 Dec 2020 • Behrooz Tahmasebi, Derek Lim, Stefanie Jegelka
While message passing Graph Neural Networks (GNNs) have become increasingly popular architectures for learning with graphs, recent works have revealed important shortcomings in their expressive power.
1 code implementation • ICLR 2021 • Joshua Robinson, Ching-Yao Chuang, Suvrit Sra, Stefanie Jegelka
How can you sample good negative examples for contrastive learning?
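One way to make this concrete, as a hedged sketch rather than the paper's exact estimator: reweight candidate negatives by how similar the encoder currently finds them to the anchor, so "hard" negatives are sampled more often (`beta` is an illustrative concentration parameter).

```python
import torch

def hard_negative_weights(sim_neg, beta=1.0):
    """sim_neg: (batch, n_neg) anchor-negative cosine similarities.
    Returns normalized sampling weights that upweight hard negatives."""
    w = torch.exp(beta * sim_neg)
    return w / w.sum(dim=1, keepdim=True)
```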
1 code implementation • 28 Sep 2020 • Peiyuan Liao, Han Zhao, Keyulu Xu, Tommi Jaakkola, Geoffrey Gordon, Stefanie Jegelka, Ruslan Salakhutdinov
While the advent of Graph Neural Networks (GNNs) has greatly improved node and graph representation learning in many applications, the neighborhood aggregation scheme exposes additional vulnerabilities to adversaries seeking to extract node-level information about sensitive attributes.
3 code implementations • ICLR 2021 • Keyulu Xu, Mozhi Zhang, Jingling Li, Simon S. Du, Ken-ichi Kawarabayashi, Stefanie Jegelka
Second, in connection to analyzing the successes and limitations of GNNs, these results suggest a hypothesis for which we provide theoretical and empirical evidence: the success of GNNs in extrapolating algorithmic tasks to new data (e.g., larger graphs or edge weights) relies on encoding task-specific non-linearities in the architecture or features.
no code implementations • NeurIPS 2020 • Khashayar Gatmiry, Maryam Aliakbarpour, Stefanie Jegelka
Determinantal point processes (DPPs) are popular probabilistic models of diversity.
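A minimal sketch of what a DPP assigns to a subset (L-ensemble form; the normalization identity $\sum_S \det(L_S) = \det(L+I)$ is standard):

```python
import numpy as np

def dpp_log_prob(L, S):
    """Log-probability of a non-empty subset S under an L-ensemble DPP:
    P(S) = det(L_S) / det(L + I). Determinants shrink when items are
    similar, so diverse subsets receive more probability mass."""
    idx = np.asarray(sorted(S))
    _, logdet_S = np.linalg.slogdet(L[np.ix_(idx, idx)])
    _, logdet_Z = np.linalg.slogdet(L + np.eye(L.shape[0]))
    return logdet_S - logdet_Z
```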
1 code implementation • 6 Jul 2020 • Ching-Yao Chuang, Antonio Torralba, Stefanie Jegelka
When machine learning models are deployed on a test distribution different from the training distribution, they can perform poorly, but overestimate their performance.
1 code implementation • NeurIPS 2020 • Ching-Yao Chuang, Joshua Robinson, Lin Yen-Chen, Antonio Torralba, Stefanie Jegelka
A prominent technique for self-supervised representation learning has been to contrast semantically similar and dissimilar pairs of samples.
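A sketch of the debiasing idea from this line of work: treat each unlabeled "negative" as secretly positive with prior probability `tau_plus` and correct the negative term of the contrastive loss accordingly (hyperparameter values below are illustrative).

```python
import math
import torch

def debiased_contrastive_loss(pos_sim, neg_sim, tau_plus=0.1, t=0.5):
    """pos_sim: (B,) anchor-positive similarities; neg_sim: (B, N)
    anchor-negative similarities; t is the temperature."""
    pos = torch.exp(pos_sim / t)
    neg = torch.exp(neg_sim / t)
    N = neg.shape[1]
    # subtract the estimated contribution of false negatives, then clamp
    ng = (neg.sum(dim=1) - N * tau_plus * pos) / (1.0 - tau_plus)
    ng = torch.clamp(ng, min=N * math.e ** (-1.0 / t))
    return -torch.log(pos / (pos + ng)).mean()
```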
no code implementations • NeurIPS 2020 • Yossi Arjevani, Joan Bruna, Bugra Can, Mert Gürbüzbalaban, Stefanie Jegelka, Hongzhou Lin
We introduce a framework for designing primal methods under the decentralized optimization setting where local functions are smooth and strongly convex.
no code implementations • 20 Feb 2020 • Johannes Kirschner, Ilija Bogunovic, Stefanie Jegelka, Andreas Krause
Attaining such robustness is the goal of distributionally robust optimization, which seeks a solution to an optimization problem that is worst-case robust under a specified distributional shift of an uncontrolled covariate.
no code implementations • ICML 2020 • Joshua Robinson, Stefanie Jegelka, Suvrit Sra
Our theoretical results are reflected empirically across a range of tasks and illustrate how weak labels speed up learning on the strong task.
no code implementations • ICML 2020 • Vikas K. Garg, Stefanie Jegelka, Tommi Jaakkola
We address two fundamental questions about graph neural networks (GNNs).
no code implementations • 10 Feb 2020 • Jingzhao Zhang, Hongzhou Lin, Stefanie Jegelka, Ali Jadbabaie, Suvrit Sra
In particular, we study the class of Hadamard semi-differentiable functions, perhaps the largest class of nonsmooth functions for which the chain rule of calculus holds.
no code implementations • 9 Feb 2020 • Yossi Arjevani, Amit Daniely, Stefanie Jegelka, Hongzhou Lin
Recent advances in randomized incremental methods for minimizing $L$-smooth $\mu$-strongly convex finite sums have culminated in tight complexity of $\tilde{O}((n+\sqrt{n L/\mu})\log(1/\epsilon))$ and $O(n+\sqrt{nL/\epsilon})$, where $\mu>0$ and $\mu=0$, respectively, and $n$ denotes the number of individual functions.
1 code implementation • NeurIPS 2020 • Sebastian Curi, Kfir Y. Levy, Stefanie Jegelka, Andreas Krause
In high-stakes machine learning applications, it is crucial to not only perform well on average, but also when restricted to difficult examples.
1 code implementation • 13 Oct 2019 • Ching-Yao Chuang, Antonio Torralba, Stefanie Jegelka
In this work, we study, theoretically and empirically, the effect of the embedding complexity on generalization to the target domain.
no code implementations • 25 Sep 2019 • Hongzhou Lin, Joshua Robinson, Stefanie Jegelka
We propose a technique termed perceptual regularization that enables both visualization of the latent representation and control over the generality of the learned representation.
no code implementations • ACL 2019 • Mozhi Zhang, Keyulu Xu, Ken-ichi Kawarabayashi, Stefanie Jegelka, Jordan Boyd-Graber
Cross-lingual word embeddings (CLWE) underlie many multilingual natural language processing systems, often through orthogonal transformations of pre-trained monolingual embeddings.
1 code implementation • NeurIPS 2019 • Joshua Robinson, Suvrit Sra, Stefanie Jegelka
We propose SLC as the right extension of SR that enables easier, more intuitive control over diversity, illustrating this via examples of practical importance.
1 code implementation • 4 Jun 2019 • Mozhi Zhang, Keyulu Xu, Ken-ichi Kawarabayashi, Stefanie Jegelka, Jordan Boyd-Graber
Cross-lingual word embeddings (CLWE) underlie many multilingual natural language processing systems, often through orthogonal transformations of pre-trained monolingual embeddings.
2 code implementations • ICLR 2020 • Keyulu Xu, Jingling Li, Mozhi Zhang, Simon S. Du, Ken-ichi Kawarabayashi, Stefanie Jegelka
Neural networks have succeeded in many reasoning tasks.
1 code implementation • ICML 2020 • Marwa El Halabi, Stefanie Jegelka
Submodular function minimization is well studied, and existing algorithms solve it exactly or up to arbitrary accuracy.
1 code implementation • NeurIPS 2019 • Matthew Staib, Stefanie Jegelka
We show that MMD DRO is roughly equivalent to regularization by the Hilbert norm and, as a byproduct, reveal deep connections to classic results in statistical learning.
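For readers unfamiliar with the discrepancy underlying MMD DRO, a minimal NumPy sketch of the (biased) squared-MMD estimate with a Gaussian kernel:

```python
import numpy as np

def mmd2(X, Y, sigma=1.0):
    """Biased estimate of squared MMD between samples X: (n, d) and Y: (m, d)."""
    def k(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-d2 / (2 * sigma ** 2))
    return k(X, X).mean() + k(Y, Y).mean() - 2 * k(X, Y).mean()
```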
no code implementations • 14 May 2019 • Charlotte Bunne, David Alvarez-Melis, Andreas Krause, Stefanie Jegelka
Generative Adversarial Networks have shown remarkable success in learning a distribution that faithfully recovers a reference distribution in its entirety.
1 code implementation • 31 Dec 2018 • Edward Kim, Zach Jensen, Alexander van Grootel, Kevin Huang, Matthew Staib, Sheshera Mysore, Haw-Shiuan Chang, Emma Strubell, Andrew McCallum, Stefanie Jegelka, Elsa Olivetti
Leveraging new data sources is a key step in accelerating the pace of materials design and discovery.
no code implementations • NeurIPS 2018 • Zelda E. Mariet, Suvrit Sra, Stefanie Jegelka
Strongly Rayleigh (SR) measures are discrete probability distributions over the subsets of a ground set.
no code implementations • NeurIPS 2018 • Josip Djolonga, Stefanie Jegelka, Andreas Krause
Submodular maximization problems appear in several areas of machine learning and data science, as many useful modelling concepts such as diversity and coverage satisfy this natural diminishing returns property.
no code implementations • NeurIPS 2018 • Ilija Bogunovic, Jonathan Scarlett, Stefanie Jegelka, Volkan Cevher
In this paper, we consider the problem of Gaussian process (GP) optimization with an added robustness requirement: The returned point may be perturbed by an adversary, and we require the function value to remain as high as possible even after this perturbation.
18 code implementations • ICLR 2019 • Keyulu Xu, Weihua Hu, Jure Leskovec, Stefanie Jegelka
Here, we present a theoretical framework for analyzing the expressive power of GNNs to capture different graph structures.
Ranked #1 on Graph Classification on RE-M5K
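A minimal sketch of the maximally expressive aggregation this framework motivates (a GIN-style layer; the dense adjacency matrix and layer sizes are simplifications for illustration):

```python
import torch
import torch.nn as nn

class GINLayer(nn.Module):
    """h_v' = MLP((1 + eps) * h_v + sum_{u in N(v)} h_u): sum aggregation
    is injective on multisets of neighbor features, which is what makes
    the layer as discriminative as the Weisfeiler-Lehman test."""
    def __init__(self, dim):
        super().__init__()
        self.eps = nn.Parameter(torch.zeros(1))
        self.mlp = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(),
                                 nn.Linear(dim, dim))

    def forward(self, h, adj):  # h: (n, dim), adj: dense (n, n) 0/1 matrix
        return self.mlp((1 + self.eps) * h + adj @ h)
```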
no code implementations • 4 Jul 2018 • Alkis Gotovos, Hamed Hassani, Andreas Krause, Stefanie Jegelka
We consider the problem of inference in discrete probabilistic models, that is, distributions over subsets of a finite ground set.
1 code implementation • NeurIPS 2018 • Hongzhou Lin, Stefanie Jegelka
We demonstrate that a very deep ResNet with stacked modules of one neuron per hidden layer and ReLU activation functions can uniformly approximate any Lebesgue integrable function in $d$ dimensions, i.e. $\ell_1(\mathbb{R}^d)$.
no code implementations • 25 Jun 2018 • David Alvarez-Melis, Stefanie Jegelka, Tommi S. Jaakkola
Many problems in machine learning involve calculating correspondences between sets of objects, such as point clouds or images.
3 code implementations • ICML 2018 • Keyulu Xu, Chengtao Li, Yonglong Tian, Tomohiro Sonobe, Ken-ichi Kawarabayashi, Stefanie Jegelka
Furthermore, combining the JK framework with models like Graph Convolutional Networks, GraphSAGE and Graph Attention Networks consistently improves those models' performance.
Ranked #12 on Node Classification on PPI
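The simplest jumping-knowledge aggregation is easy to sketch: keep every layer's node representations and let the final predictor see all of them (concatenation shown; max-pooling and attention variants follow the same pattern).

```python
import torch

def jumping_knowledge_concat(layer_reps):
    """layer_reps: list of (n_nodes, dim) tensors, one per GNN layer.
    Concatenation lets each node draw on whichever neighborhood range
    (layer depth) is most informative for it."""
    return torch.cat(layer_reps, dim=1)  # (n_nodes, dim * num_layers)
```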
no code implementations • 27 Feb 2018 • Zhi Xu, Chengtao Li, Stefanie Jegelka
We explore a notion of robustness for generative adversarial models that is pertinent to their internal interactive structure, and show that, perhaps surprisingly, the GAN in its original form is not robust.
no code implementations • 14 Feb 2018 • Matthew Staib, Bryan Wilder, Stefanie Jegelka
We also show compelling empirical evidence that DRO improves generalization to the unknown stochastic submodular function.
no code implementations • 17 Dec 2017 • David Alvarez-Melis, Tommi S. Jaakkola, Stefanie Jegelka
Optimal Transport has recently gained interest in machine learning for applications ranging from domain adaptation, sentence similarities to deep learning.
1 code implementation • 15 Dec 2017 • Alexander LeNail, Ludwig Schmidt, Johnathan Li, Tobias Ehrenberger, Karen Sachs, Stefanie Jegelka, Ernest Fraenkel
We introduce Graph-Sparse Logistic Regression, a new algorithm for classification for the case in which the support should be sparse but connected on a graph.
1 code implementation • ICLR 2018 • Chengtao Li, David Alvarez-Melis, Keyulu Xu, Stefanie Jegelka, Suvrit Sra
We propose a framework for adversarial training that relies on a sample rather than a single sample point as the fundamental unit of discrimination.
1 code implementation • 12 Jun 2017 • Baharan Mirzasoleiman, Stefanie Jegelka, Andreas Krause
The need for real-time analysis of rapidly produced data streams (e.g., video and image streams) motivated the design of streaming algorithms that can efficiently extract and summarize useful information from massive data "on the fly".
Data Structures and Algorithms • Information Retrieval
2 code implementations • 5 Jun 2017 • Zi Wang, Clement Gehring, Pushmeet Kohli, Stefanie Jegelka
Bayesian optimization (BO) has become an effective approach for black-box function optimization problems when function evaluations are expensive and the optimum can be achieved within a relatively small number of queries.
1 code implementation • NeurIPS 2017 • Matthew Staib, Sebastian Claici, Justin Solomon, Stefanie Jegelka
Our method is even robust to nonstationary input distributions and produces a barycenter estimate that tracks the input measures over time.
no code implementations • NeurIPS 2017 • Chengtao Li, Stefanie Jegelka, Suvrit Sra
We study dual volume sampling, a method for selecting $k$ columns from an $n \times m$ short and wide matrix ($n \le k \le m$) such that the probability of selection is proportional to the volume spanned by the rows of the induced submatrix.
1 code implementation • ICML 2017 • Zi Wang, Chengtao Li, Stefanie Jegelka, Pushmeet Kohli
Optimization of high-dimensional black-box functions is an extremely challenging problem.
4 code implementations • ICML 2017 • Zi Wang, Stefanie Jegelka
We propose a new criterion, Max-value Entropy Search (MES), that instead uses the information about the maximum function value.
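A hedged sketch of the resulting acquisition function, averaging over sampled maxima $y^*$ (obtained, e.g., via a Gumbel approximation to the distribution of the maximum):

```python
import numpy as np
from scipy.stats import norm

def mes_acquisition(mu, sigma, y_star_samples):
    """mu, sigma: GP posterior mean/std at candidate points;
    y_star_samples: sampled values of the global maximum.
    Scores each candidate by the expected information gained about y*."""
    mu, sigma = np.asarray(mu), np.maximum(np.asarray(sigma), 1e-9)
    vals = []
    for y_star in y_star_samples:
        gamma = (y_star - mu) / sigma
        vals.append(gamma * norm.pdf(gamma) / (2 * norm.cdf(gamma))
                    - norm.logcdf(gamma))
    return np.mean(vals, axis=0)
```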
no code implementations • ICML 2017 • Matthew Staib, Stefanie Jegelka
The optimal allocation of resources for maximizing influence, spread of information or coverage, has gained attention in the past years, in particular in machine learning and data mining.
1 code implementation • CVPR 2017 • Hyun Oh Song, Stefanie Jegelka, Vivek Rathod, Kevin Murphy
Learning the representation and the similarity metric in an end-to-end fashion with deep networks has demonstrated outstanding results for clustering and retrieval.
no code implementations • NeurIPS 2016 • Josip Djolonga, Stefanie Jegelka, Sebastian Tschiatschek, Andreas Krause
We study a rich family of distributions that capture variable interactions significantly more expressive than those representable with low-treewidth or pairwise graphical models, or log-supermodular models.
no code implementations • NeurIPS 2016 • Chengtao Li, Stefanie Jegelka, Suvrit Sra
We consider the task of rapidly sampling from such constrained measures, and develop fast Markov chain samplers for them.
no code implementations • 26 Jul 2016 • Zi Wang, Stefanie Jegelka, Leslie Pack Kaelbling, Tomás Lozano-Pérez
We introduce a framework for model learning and planning in stochastic domains with continuous state and action spaces and non-Gaussian transition models.
no code implementations • 13 Jul 2016 • Chengtao Li, Stefanie Jegelka, Suvrit Sra
In this note we consider sampling from (non-homogeneous) strongly Rayleigh probability measures.
no code implementations • 19 Mar 2016 • Chengtao Li, Stefanie Jegelka, Suvrit Sra
Its theoretical guarantees and empirical performance rely critically on the quality of the landmarks selected.
no code implementations • 7 Dec 2015 • Chengtao Li, Suvrit Sra, Stefanie Jegelka
We present a framework for accelerating a spectrum of machine learning algorithms that require computation of bilinear inverse forms $u^\top A^{-1}u$, where $A$ is a positive definite matrix and $u$ a given vector.
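For intuition, the quantity in question never requires forming $A^{-1}$ explicitly; a minimal sketch using conjugate gradients:

```python
import numpy as np
from scipy.sparse.linalg import cg

def bilinear_inverse_form(A, u):
    """Evaluate u^T A^{-1} u for positive definite A by solving A x = u
    with conjugate gradients, then taking the inner product u^T x."""
    x, info = cg(A, u)
    if info != 0:
        raise RuntimeError("CG did not converge")
    return float(u @ x)
```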
no code implementations • 22 Nov 2015 • Samaneh Azadi, Jiashi Feng, Stefanie Jegelka, Trevor Darrell
Precisely labeled datasets with a sufficient number of samples are very important for training deep convolutional neural networks (CNNs).
3 code implementations • CVPR 2016 • Hyun Oh Song, Yu Xiang, Stefanie Jegelka, Silvio Savarese
Additionally, we collected the Online Products dataset: 120k images of 23k classes of online products for metric learning.
1 code implementation • 21 Oct 2015 • Zi Wang, Bolei Zhou, Stefanie Jegelka
Recently, there has been rising interest in Bayesian optimization -- the optimization of an unknown function with assumptions usually expressed by a Gaussian Process (GP) prior.
no code implementations • 4 Sep 2015 • Chengtao Li, Stefanie Jegelka, Suvrit Sra
Our method takes advantage of the diversity property of subsets sampled from a DPP, and proceeds in two stages: first it constructs coresets for the ground set of items; thereafter, it efficiently samples subsets based on the constructed coresets.
no code implementations • 5 Mar 2015 • K. S. Sesh Kumar, Alvaro Barbero, Stefanie Jegelka, Suvrit Sra, Francis Bach
By exploiting results from convex and submodular theory, we reformulate the quadratic energy minimization problem as a total variation denoising problem, which, when viewed geometrically, enables the use of projection and reflection based convex methods.
no code implementations • 23 Jan 2015 • Ashish Kapoor, E. Paxon Frady, Stefanie Jegelka, William B. Kristan, Eric Horvitz
We introduce and study methods for inferring and learning from correspondences among neurons.
no code implementations • NeurIPS 2014 • Xinghao Pan, Stefanie Jegelka, Joseph E. Gonzalez, Joseph K. Bradley, Michael I. Jordan
Many machine learning problems can be reduced to the maximization of submodular functions.
no code implementations • NeurIPS 2014 • Adarsh Prasad, Stefanie Jegelka, Dhruv Batra
To cope with the high level of ambiguity faced in domains such as Computer Vision or Natural Language Processing, robust prediction methods often search for a diverse set of high-quality candidate solutions or proposals.
no code implementations • NeurIPS 2014 • Hyun Oh Song, Yong Jae Lee, Stefanie Jegelka, Trevor Darrell
The increasing prominence of weakly labeled data nurtures a growing demand for object detection methods that can cope with minimal supervision.
no code implementations • NeurIPS 2014 • Robert Nishihara, Stefanie Jegelka, Michael I. Jordan
Submodular functions describe a variety of discrete problems in machine learning, signal processing, and computer vision.
no code implementations • CVPR 2014 • Jiashi Feng, Stefanie Jegelka, Shuicheng Yan, Trevor Darrell
We use sample relatedness information to improve the generalization of the learned dictionary.
no code implementations • 5 Mar 2014 • Hyun Oh Song, Ross Girshick, Stefanie Jegelka, Julien Mairal, Zaid Harchaoui, Trevor Darrell
Learning to localize objects with minimal supervision is an important problem in computer vision, since large fully annotated datasets are extremely costly to obtain.
Ranked #35 on Weakly Supervised Object Detection on PASCAL VOC 2007
no code implementations • 2 Feb 2014 • Stefanie Jegelka, Jeff Bilmes
We study an extension of the classical graph cut problem, wherein we replace the modular (sum of edge weights) cost function by a submodular set function defined over graph edges.
no code implementations • NeurIPS 2013 • Stefanie Jegelka, Francis Bach, Suvrit Sra
A key component of our method is a formulation of the discrete submodular minimization problem as a continuous best approximation problem that is solved through a sequence of reflections, and its solution can be easily thresholded to obtain an optimal discrete solution.
no code implementations • NeurIPS 2013 • Rishabh Iyer, Stefanie Jegelka, Jeff Bilmes
We either use a black-box transformation of the function (for approximation and learning), or a transformation of algorithms to use an appropriate surrogate function (for minimization).
no code implementations • 5 Aug 2013 • Rishabh Iyer, Stefanie Jegelka, Jeff Bilmes
We present a practical and powerful new framework for both unconstrained and constrained submodular function optimization based on discrete semidifferentials (sub- and super-differentials).
no code implementations • NeurIPS 2013 • Xinghao Pan, Joseph E. Gonzalez, Stefanie Jegelka, Tamara Broderick, Michael I. Jordan
Research on distributed machine learning algorithms has focused primarily on one of two extremes - algorithms that obey strict concurrency constraints or algorithms that obey few or no such constraints.
no code implementations • CVPR 2013 • Pushmeet Kohli, Anton Osokin, Stefanie Jegelka
We discuss a model for image segmentation that is able to overcome the short-boundary bias observed in standard pairwise random field based approaches.
no code implementations • NeurIPS 2011 • Stefanie Jegelka, Hui Lin, Jeff A. Bilmes
We are motivated by an application to extract a representative subset of machine learning training data and by the poor empirical performance we observe of the popular minimum norm algorithm.