Search Results for author: Varun Kanade

Found 40 papers, 6 papers with code

Separations in the Representational Capabilities of Transformers and Recurrent Architectures

no code implementations 13 Jun 2024 Satwik Bhattamishra, Michael Hahn, Phil Blunsom, Varun Kanade

Furthermore, we show that two-layer Transformers of logarithmic size can perform decision tasks such as string equality or disjointness, whereas both one-layer Transformers and recurrent models require linear size for these tasks.

Understanding In-Context Learning in Transformers and LLMs by Learning to Learn Discrete Functions

no code implementations 4 Oct 2023 Satwik Bhattamishra, Arkil Patel, Phil Blunsom, Varun Kanade

In this work, we take a step towards answering these questions by demonstrating the following: (a) On a test-bed with a variety of Boolean function classes, we find that Transformers can nearly match the optimal learning algorithm for 'simpler' tasks, while their performance deteriorates on more 'complex' tasks.

In-Context Learning

Simplicity Bias in Transformers and their Ability to Learn Sparse Boolean Functions

1 code implementation 22 Nov 2022 Satwik Bhattamishra, Arkil Patel, Varun Kanade, Phil Blunsom

(ii) When trained on Boolean functions, both Transformers and LSTMs prioritize learning functions of low sensitivity, with Transformers ultimately converging to functions of lower sensitivity.

When are Local Queries Useful for Robust Learning?

no code implementations 12 Oct 2022 Pascale Gourdeau, Varun Kanade, Marta Kwiatkowska, James Worrell

We finish by giving robust learning algorithms for halfspaces on $\{0, 1\}^n$ and then obtaining robustness guarantees for halfspaces in $\mathbb{R}^n$ against precision-bounded adversaries.

Partial Matrix Completion

no code implementations NeurIPS 2023 Elad Hazan, Adam Tauman Kalai, Varun Kanade, Clara Mohri, Y. Jennifer Sun

This work establishes a new framework of partial matrix completion, where the goal is to identify a large subset of the entries that can be completed with high confidence.

Matrix Completion

Beyond Impossibility: Balancing Sufficiency, Separation and Accuracy

no code implementations 24 May 2022 Limor Gultchin, Vincent Cohen-Addad, Sophie Giffard-Roisin, Varun Kanade, Frederik Mallmann-Trenn

Among the various aspects of algorithmic fairness studied in recent years, the tension between satisfying both \textit{sufficiency} and \textit{separation} -- e.g. the ratios of positive or negative predictive values, and false positive or false negative rates across groups -- has received much attention.

Fairness

Sample Complexity Bounds for Robustly Learning Decision Lists against Evasion Attacks

no code implementations 12 May 2022 Pascale Gourdeau, Varun Kanade, Marta Kwiatkowska, James Worrell

A fundamental problem in adversarial machine learning is to quantify how much training data is needed in the presence of evasion attacks.

PAC learning

Exponential Tail Local Rademacher Complexity Risk Bounds Without the Bernstein Condition

no code implementations 23 Feb 2022 Varun Kanade, Patrick Rebeschini, Tomas Vaskevicius

Our main result is an exponential-tail excess risk bound expressed in terms of the offset Rademacher complexity that yields results at least as sharp as those obtainable via the classical theory.

Model Selection

Towards optimally abstaining from prediction with OOD test examples

no code implementations NeurIPS 2021 Adam Tauman Kalai, Varun Kanade

Our work builds on a recent abstention algorithm of Goldwasser, Kalais, and Montasser (2020) for transductive binary classification.

Binary Classification Generalization Bounds

Efficient Learning with Arbitrary Covariate Shift

no code implementations 15 Feb 2021 Adam Kalai, Varun Kanade

We give an efficient algorithm for learning a binary function in a given class C of bounded VC dimension, with training data distributed according to P and test data according to Q, where P and Q may be arbitrary distributions over X.

How Benign is Benign Overfitting?

no code implementations ICLR 2021 Amartya Sanyal, Puneet K. Dokania, Varun Kanade, Philip Torr

We investigate two causes for adversarial vulnerability in deep neural networks: bad data and (poorly) trained models.

Adversarial Robustness Representation Learning

Lottery Tickets in Linear Models: An Analysis of Iterative Magnitude Pruning

no code implementations 16 Jul 2020 Bryn Elesedy, Varun Kanade, Yee Whye Teh

We analyse the pruning procedure behind the lottery ticket hypothesis (arXiv:1803.03635v5), iterative magnitude pruning (IMP), when applied to linear models trained by gradient flow.

How benign is benign overfitting?

no code implementations 8 Jul 2020 Amartya Sanyal, Puneet K. Dokania, Varun Kanade, Philip H. S. Torr

We investigate two causes for adversarial vulnerability in deep neural networks: bad data and (poorly) trained models.

Adversarial Robustness Representation Learning

Differentiable Causal Backdoor Discovery

1 code implementation 3 Mar 2020 Limor Gultchin, Matt J. Kusner, Varun Kanade, Ricardo Silva

Discovering the causal effect of a decision is critical to nearly all forms of decision-making.

Decision Making

The Statistical Complexity of Early-Stopped Mirror Descent

no code implementations NeurIPS 2020 Tomas Vaškevičius, Varun Kanade, Patrick Rebeschini

Recently there has been a surge of interest in understanding implicit regularization properties of iterative gradient-based optimization algorithms.

Online k-means Clustering

no code implementations 15 Sep 2019 Vincent Cohen-Addad, Benjamin Guedj, Varun Kanade, Guy Rom

The specific formulation we use is the $k$-means objective: at each time step the algorithm has to maintain a set of $k$ candidate centers and the loss incurred is the squared distance between the new point and the closest center.

Clustering Online Clustering

On the Hardness of Robust Classification

no code implementations NeurIPS 2019 Pascale Gourdeau, Varun Kanade, Marta Kwiatkowska, James Worrell

However, if the adversary is restricted to perturbing $O(\log n)$ bits, then the class of monotone conjunctions can be robustly learned with respect to a general class of distributions (that includes the uniform distribution).

Classification General Classification +2

Implicit Regularization for Optimal Sparse Recovery

1 code implementation NeurIPS 2019 Tomas Vaškevičius, Varun Kanade, Patrick Rebeschini

We investigate implicit regularization schemes for gradient descent methods applied to unpenalized least squares regression to solve the problem of reconstructing a sparse signal from an underdetermined system of linear measurements under the restricted isometry assumption.

Computational Efficiency

Adaptive Reduced Rank Regression

1 code implementation NeurIPS 2020 Qiong Wu, Felix Ming Fai Wong, Zhenming Liu, Yanhua Li, Varun Kanade

We study the low rank regression problem $\mathbf{y} = M\mathbf{x} + \epsilon$, where $\mathbf{x}$ and $\mathbf{y}$ are $d_1$ and $d_2$ dimensional vectors respectively.

regression

Clustering Redemption–Beyond the Impossibility of Kleinberg’s Axioms

no code implementations NeurIPS 2018 Vincent Cohen-Addad, Varun Kanade, Frederik Mallmann-Trenn

In this work, we take a different approach, based on the observation that the consistency axiom fails to be satisfied when the “correct” number of clusters changes.

Clustering

Decentralized Cooperative Stochastic Bandits

1 code implementation NeurIPS 2019 David Martínez-Rubio, Varun Kanade, Patrick Rebeschini

We design a fully decentralized algorithm that uses an accelerated consensus procedure to compute (delayed) estimates of the average of rewards obtained by all the agents for each arm, and then uses an upper confidence bound (UCB) algorithm that accounts for the delay and error of the estimates.

Multi-Armed Bandits

Statistical Windows in Testing for the Initial Distribution of a Reversible Markov Chain

no code implementations 6 Aug 2018 Quentin Berthet, Varun Kanade

We study the problem of hypothesis testing between two discrete distributions, where we only have access to samples after the action of a known reversible Markov chain, playing the role of noise.

Two-sample testing

TAPAS: Tricks to Accelerate (encrypted) Prediction As a Service

1 code implementation ICML 2018 Amartya Sanyal, Matt J. Kusner, Adrià Gascón, Varun Kanade

The main drawback of using fully homomorphic encryption is the amount of time required to evaluate large machine learning models on encrypted data.

BIG-bench Machine Learning Binarization

Robustness via Deep Low-Rank Representations

no code implementations ICLR 2019 Amartya Sanyal, Varun Kanade, Philip H. S. Torr, Puneet K. Dokania

To achieve low dimensionality of learned representations, we propose an easy-to-use, end-to-end trainable, low-rank regularizer (LR) that can be applied to any intermediate layer representation of a DNN.

Clustering General Classification +2

Learning DNFs under product distributions via μ-biased quantum Fourier sampling

no code implementations 15 Feb 2018 Varun Kanade, Andrea Rocchetto, Simone Severini

We show that DNF formulae can be quantum PAC-learned in polynomial time under product distributions using a quantum example oracle.

Hierarchical Clustering Beyond the Worst-Case

no code implementations NeurIPS 2017 Vincent Cohen-Addad, Varun Kanade, Frederik Mallmann-Trenn

Hierarchical clustering, that is, computing a recursive partitioning of a dataset to obtain clusters at increasingly finer granularity, is a fundamental problem in data analysis.

Clustering General Classification +1

From which world is your graph

no code implementations NeurIPS 2017 Cheng Li, Felix Mf Wong, Zhenming Liu, Varun Kanade

This work focuses on unifying two of the most widely used link-formation models: the stochastic block model (SBM) and the small world (or latent space) model (SWM).

Dimensionality Reduction Position +1

From which world is your graph?

no code implementations 3 Nov 2017 Cheng Li, Felix Wong, Zhenming Liu, Varun Kanade

Discovering statistical structure from links is a fundamental problem in the analysis of social networks.

Dimensionality Reduction Position

Hierarchical Clustering: Objective Functions and Algorithms

no code implementations 7 Apr 2017 Vincent Cohen-Addad, Varun Kanade, Frederik Mallmann-Trenn, Claire Mathieu

For similarity-based hierarchical clustering, Dasgupta showed that the divisive sparsest-cut approach achieves an $O(\log^{3/2} n)$-approximation.

Clustering Combinatorial Optimization +1

Reliably Learning the ReLU in Polynomial Time

no code implementations 30 Nov 2016 Surbhi Goel, Varun Kanade, Adam Klivans, Justin Thaler

These results are in contrast to known efficient algorithms for reliably learning linear threshold functions, where $\epsilon$ must be $\Omega(1)$ and strong assumptions are required on the marginal distribution.

Online Optimization of Smoothed Piecewise Constant Functions

no code implementations 7 Apr 2016 Vincent Cohen-Addad, Varun Kanade

We study online optimization of smoothed piecewise constant functions over the domain [0, 1).

Learning with a Drifting Target Concept

no code implementations 20 May 2015 Steve Hanneke, Varun Kanade, Liu Yang

Some of the results also describe an active learning variant of this setting, and provide bounds on the number of queries for the labels of points in the sequence sufficient to obtain the stated bounds on the error rates.

Active Learning

Distribution-Independent Reliable Learning

no code implementations 20 Feb 2014 Varun Kanade, Justin Thaler

The goal in the positive reliable agnostic framework is to output a hypothesis with the following properties: (i) its false positive error rate is at most $\epsilon$, (ii) its false negative error rate is at most $\epsilon$ more than that of the best positive reliable classifier from the class.

Attribute PAC learning

Attribute-Efficient Evolvability of Linear Functions

no code implementations 16 Sep 2013 Elaine Angelino, Varun Kanade

In a seminal paper, Valiant (2006) introduced a computational model for evolution to address the question of complexity that can arise through Darwinian mechanisms.

Attribute Evolutionary Algorithms

MCMC Learning

no code implementations 13 Jul 2013 Varun Kanade, Elchanan Mossel

The theory of learning under the uniform distribution is rich and deep, with connections to cryptography, computational complexity, and the analysis of boolean functions to name a few areas.

Distributed Non-Stochastic Experts

no code implementations NeurIPS 2012 Varun Kanade, Zhenming Liu, Bozidar Radunovic

This paper shows the difficulty of simultaneously achieving regret asymptotically better than $\sqrt{kT}$ and communication better than $T$. We give a novel algorithm that for an oblivious adversary achieves a non-trivial trade-off: regret $O(\sqrt{k^{5(1+\epsilon)/6} T})$ and communication $O(T/k^\epsilon)$, for any value of $\epsilon$ in $(0, 1/5)$.

Learning using Local Membership Queries

no code implementations 5 Nov 2012 Pranjal Awasthi, Vitaly Feldman, Varun Kanade

We introduce a new model of membership query (MQ) learning, where the learning algorithm is restricted to query points that are \emph{close} to random examples drawn from the underlying distribution.

Efficient Learning of Generalized Linear and Single Index Models with Isotonic Regression

no code implementations NeurIPS 2011 Sham M. Kakade, Varun Kanade, Ohad Shamir, Adam Kalai

In this paper, we provide algorithms for learning GLMs and SIMs, which are both computationally and statistically efficient.

regression

Potential-Based Agnostic Boosting

no code implementations NeurIPS 2009 Varun Kanade, Adam Kalai

We prove strong noise-tolerance properties of a potential-based boosting algorithm, similar to MadaBoost (Domingo and Watanabe, 2000) and SmoothBoost (Servedio, 2003).

Learning Theory
