no code implementations • 4 Oct 2022 • Tavish McDonald, Brian Tsan, Amar Saini, Juanita Ordonez, Luis Gutierrez, Phan Nguyen, Blake Mason, Brenda Ng
However, data curation for document QA is uniquely challenging because the context (i.e., the answer evidence passage) needs to be retrieved from potentially long, ill-formatted documents.
2 code implementations • 7 Jul 2022 • Gregory Canal, Blake Mason, Ramya Korlakai Vinayak, Robert Nowak
This paper investigates simultaneous preference and metric learning from a crowd of respondents.
no code implementations • 27 May 2022 • Jasper Tan, Daniel LeJeune, Blake Mason, Hamid Javadi, Richard G. Baraniuk
Is overparameterization a privacy liability?
no code implementations • 4 Feb 2022 • Blake Mason, Kwang-Sung Jun, Lalit Jain
Finally, we discuss the impact of the bias of the MLE on the logistic bandit problem, providing an example where the $d^2$ lower-order regret term (it is $d$ for linear bandits) cannot be improved as long as the MLE is used, and showing how bias-corrected estimators may bring it closer to $d$.
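As a hedged illustration of MLE bias (a toy simulation of our own, not the paper's analysis or example), the sketch below averages Newton-method logistic MLE fits over repeated label draws from a fixed design; `theta_star`, the design `X`, and all sizes are assumptions of this sketch:

```python
import numpy as np

# Toy simulation (not from the paper): estimate the bias of the logistic
# MLE by averaging fits over many label draws from a fixed design.
rng = np.random.default_rng(0)
d, n, trials = 2, 50, 1000
theta_star = np.array([1.0, -0.5])   # assumed ground-truth parameter
X = rng.normal(size=(n, d))          # assumed design matrix

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30, 30)))

def logistic_mle(X, y, iters=25):
    # Newton's method on the logistic log-likelihood.
    theta = np.zeros(X.shape[1])
    for _ in range(iters):
        p = sigmoid(X @ theta)
        grad = X.T @ (y - p)                      # score vector
        H = X.T @ (X * (p * (1 - p))[:, None])    # observed information
        theta += np.linalg.solve(H + 1e-8 * np.eye(len(theta)), grad)
    return theta

fits = np.array([logistic_mle(X, rng.binomial(1, sigmoid(X @ theta_star)))
                 for _ in range(trials)])
print("empirical bias of the MLE:", fits.mean(axis=0) - theta_star)
```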
1 code implementation • 2 Feb 2022 • Jasper Tan, Blake Mason, Hamid Javadi, Richard G. Baraniuk
A surprising phenomenon in modern machine learning is the ability of a highly overparameterized model to generalize well (small error on the test data) even when it is trained to memorize the training data (zero error on the training data).
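As a toy illustration of this phenomenon (a minimal sketch under our own assumptions; the random-features setup below is not the paper's experiment), the snippet fits a heavily overparameterized random-ReLU-features model by minimum-norm interpolation, reaching zero training error while test error typically stays moderate:

```python
import numpy as np

# Assumed setup: far more random features than noisy training samples.
rng = np.random.default_rng(1)
n_train, n_test, n_features = 40, 500, 2000

def target(x):
    return np.sin(3 * x)

x_tr = rng.uniform(-1, 1, n_train)
y_tr = target(x_tr) + 0.1 * rng.normal(size=n_train)  # noisy labels
x_te = rng.uniform(-1, 1, n_test)

# Random ReLU features: phi(x) = max(0, w * x + b).
w = rng.normal(size=n_features)
b = rng.normal(size=n_features)
feats = lambda x: np.maximum(0.0, np.outer(x, w) + b)

# Minimum-norm interpolating solution via the pseudoinverse.
theta = np.linalg.pinv(feats(x_tr)) @ y_tr

print("train MSE:", np.mean((feats(x_tr) @ theta - y_tr) ** 2))  # ~0: memorized
print("test  MSE:", np.mean((feats(x_te) @ theta - target(x_te)) ** 2))
```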
no code implementations • NeurIPS 2021 • Julian Katz-Samuels, Blake Mason, Kevin Jamieson, Rob Nowak
We begin our investigation with the observation that agnostic algorithms \emph{cannot} be minimax-optimal in the realizable setting.
no code implementations • 2 Nov 2021 • Blake Mason, Romain Camilleri, Subhojyoti Mukherjee, Kevin Jamieson, Robert Nowak, Lalit Jain
The threshold value $\alpha$ can either be \emph{explicit} and provided a priori, or \emph{implicit} and defined relative to the optimal function value, i.e., $\alpha = (1-\epsilon)f(x_\ast)$ for a given $\epsilon > 0$, where $f(x_\ast)$ is the maximal function value and is unknown.
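A minimal sketch of the implicit-threshold setting (uniform sampling and Hoeffding intervals are our simplifying assumptions, not the paper's adaptive algorithm); the arm means and all constants are made up:

```python
import numpy as np

# Classify each arm as above or below a threshold alpha using
# Hoeffding-style confidence intervals. With an implicit threshold,
# alpha is defined from the (estimated) maximum mean.
rng = np.random.default_rng(2)
means = np.array([0.2, 0.5, 0.7, 0.9])  # assumed Bernoulli arm means
eps, delta = 0.1, 0.05
T = 5000                                 # pulls per arm (uniform allocation)

samples = rng.binomial(1, means, size=(T, len(means)))
mu_hat = samples.mean(axis=0)
rad = np.sqrt(np.log(2 * len(means) / delta) / (2 * T))

alpha = (1 - eps) * mu_hat.max()   # implicit threshold from estimated max
above = np.where(mu_hat - rad >= alpha)[0]
below = np.where(mu_hat + rad < alpha)[0]
print("alpha:", alpha, "confidently above:", above, "confidently below:", below)
```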
1 code implementation • 11 Oct 2021 • Sina AlEMohammad, Hossein Babaei, CJ Barberan, Naiming Liu, Lorenzo Luzi, Blake Mason, Richard G. Baraniuk
To further contribute interpretability with respect to classification and the layers, we develop a new network as a combination of multiple neural tangent kernels, one modeling each layer of the deep neural network individually, as opposed to past work that attempts to represent the entire network via a single neural tangent kernel.
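A loose sketch of the kernel-combination idea (generic RBF kernels of different bandwidths stand in here for the per-layer neural tangent kernels; all names and values are our assumptions, not the paper's construction):

```python
import numpy as np

# Stand-in for "one kernel per layer": combine several kernels into a
# single kernel-regression predictor.
rng = np.random.default_rng(3)
X = rng.normal(size=(60, 5))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=60)

def rbf(X, Z, gamma):
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

gammas = [0.1, 0.5, 2.0]                      # one stand-in "layer kernel" each
weights = np.ones(len(gammas)) / len(gammas)  # mixing weights (could be learned)
K = sum(w * rbf(X, X, g) for w, g in zip(weights, gammas))

alpha = np.linalg.solve(K + 1e-3 * np.eye(len(X)), y)  # kernel ridge fit
print("train MSE:", np.mean((K @ alpha - y) ** 2))
```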
no code implementations • 8 Mar 2021 • Blake Mason, Ardhendu Tripathy, Robert Nowak
Specifically, consider the setting in which an NNS algorithm has access only to a stochastic distance oracle that provides a noisy, unbiased estimate of the distance between any pair of points, rather than the exact distance.
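The following is a hedged sketch of this setting (successive elimination with Hoeffding-style radii is our own simplification, not the paper's method): repeated oracle queries shrink each candidate's confidence interval until only the nearest point survives.

```python
import numpy as np

# A stochastic distance oracle returns the true distance plus zero-mean
# noise; eliminate candidates whose lower confidence bound exceeds the
# current best upper confidence bound.
rng = np.random.default_rng(4)
true_dists = np.array([0.9, 0.4, 0.55, 0.8])  # assumed distances to the query
sigma, delta = 0.25, 0.05

def oracle(i):
    return true_dists[i] + sigma * rng.normal()

active = list(range(len(true_dists)))
sums = np.zeros(len(true_dists))
t = 0
while len(active) > 1:
    t += 1
    for i in active:
        sums[i] += oracle(i)
    means = sums[active] / t
    # Any-time confidence radius (union bound over arms and rounds).
    rad = sigma * np.sqrt(2 * np.log(4 * len(true_dists) * t**2 / delta) / t)
    best = means.min()
    active = [i for i, m in zip(active, means) if m - rad <= best + rad]
print("nearest neighbor:", active[0])  # true answer: index 1
```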
no code implementations • NeurIPS 2020 • Blake Mason, Lalit Jain, Ardhendu Tripathy, Robert Nowak
The pure-exploration problem in stochastic multi-armed bandits aims to find one or more arms with the largest (or near largest) means.
no code implementations • 23 Nov 2020 • Kwang-Sung Jun, Lalit Jain, Blake Mason, Houssam Nassif
Specifically, our confidence bound avoids a direct dependence on $1/\kappa$, where $\kappa$ is the minimal variance over all arms' reward distributions.
1 code implementation • 16 Jun 2020 • Blake Mason, Lalit Jain, Ardhendu Tripathy, Robert Nowak
Mathematically, the all-$\epsilon$-good arm identification problem presents significant new challenges and surprises that do not arise in the pure-exploration objectives studied in the past.
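As a simplified illustration (uniform sampling, not the paper's algorithm; arm means and constants are made up), the sketch below certifies an arm as $\epsilon$-good only when its pessimistic mean estimate clears the optimistic estimate of the best mean minus $\epsilon$:

```python
import numpy as np

# Return every arm whose confidence interval certifies it is within
# eps of the best arm's mean.
rng = np.random.default_rng(5)
means = np.array([0.75, 0.8, 0.5, 0.78])  # true eps-good set: {0, 1, 3}
eps, delta, T = 0.1, 0.05, 20000

samples = rng.binomial(1, means, size=(T, len(means)))
mu = samples.mean(axis=0)
rad = np.sqrt(np.log(2 * len(means) / delta) / (2 * T))

# Certified eps-good: lower bound on mu[i] beats the upper bound on the
# best mean, minus eps.
certified = [i for i in range(len(means))
             if mu[i] - rad >= (mu + rad).max() - eps]
print("certified eps-good arms:", certified)
```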
1 code implementation • NeurIPS 2019 • Blake Mason, Ardhendu Tripathy, Robert Nowak
We consider the problem of learning the nearest neighbor graph of a dataset of n items.
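For concreteness, here is a minimal sketch of the object being learned; the paper learns this graph from adaptively chosen, noisy queries, whereas the sketch simply computes it from exact distances on made-up points:

```python
import numpy as np

# The nearest-neighbor graph: each item points to its closest other item.
rng = np.random.default_rng(6)
points = rng.normal(size=(8, 2))  # n = 8 items in the plane (assumed data)

D = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
np.fill_diagonal(D, np.inf)       # an item is not its own neighbor
nn = D.argmin(axis=1)             # nn[i] = index of i's nearest neighbor
print(list(enumerate(nn)))
```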
no code implementations • NeurIPS 2017 • Lalit Jain, Blake Mason, Robert Nowak
This paper investigates the theoretical foundations of metric learning, focused on four key questions that are not fully addressed in prior work: 1) we consider learning general low-dimensional (low-rank) metrics as well as sparse metrics; 2) we develop upper and lower (minimax) bounds on the generalization error; 3) we quantify the sample complexity of metric learning in terms of the dimension of the feature space and the dimension/rank of the underlying metric; 4) we also bound the accuracy of the learned metric relative to the underlying true generative metric.
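A hedged sketch of the low-rank case (our own toy estimator, not the paper's): parameterize the metric as $M = LL^\top$ with $L \in \mathbb{R}^{d \times r}$ and minimize a hinge loss over triplet comparisons generated by a ground-truth metric. All sizes and the learning rate are assumptions of this sketch.

```python
import numpy as np

# Learn a rank-r metric from triplets (i, j, k) meaning "i is closer
# to j than to k" under the true metric.
rng = np.random.default_rng(7)
d, r, n = 10, 2, 200
L_true = rng.normal(size=(d, r))
X = rng.normal(size=(n, d))

def dist2(L, a, b):
    z = (a - b) @ L
    return (z ** 2).sum(-1)   # squared distance under M = L L^T

# Generate triplets labeled by the true metric.
idx = rng.integers(0, n, size=(2000, 3))
i, j, k = idx.T
swap = dist2(L_true, X[i], X[j]) > dist2(L_true, X[i], X[k])
j, k = np.where(swap, k, j), np.where(swap, j, k)  # ensure d(i,j) < d(i,k)

# Gradient descent on the hinge loss max(0, 1 + d(i,j)^2 - d(i,k)^2).
L = 0.1 * rng.normal(size=(d, r))
lr = 0.05
for _ in range(300):
    margin = 1.0 + dist2(L, X[i], X[j]) - dist2(L, X[i], X[k])
    active = margin > 0
    dj, dk = X[i] - X[j], X[i] - X[k]
    # grad of ||(a-b) L||^2 w.r.t. L is 2 (a-b)^T (a-b) L
    G = 2 * (dj[active].T @ (dj[active] @ L) - dk[active].T @ (dk[active] @ L))
    L -= lr * G / len(i)

viol = np.mean(1.0 + dist2(L, X[i], X[j]) - dist2(L, X[i], X[k]) > 0)
print("fraction of triplets with violated margin:", viol)
```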