no code implementations • 13 Jun 2024 • Mehran Kazemi, Nishanth Dikkala, Ankit Anand, Petar Devic, Ishita Dasgupta, Fangyu Liu, Bahare Fatemi, Pranjal Awasthi, Dee Guo, Sreenivas Gollapudi, Ahmed Qureshi

With the continuous advancement of large language models (LLMs), it is essential to create new benchmarks to effectively evaluate their expanding capabilities and identify areas for improvement.

1 code implementation • 31 May 2024 • Hanseul Cho, Jaeyoung Cha, Pranjal Awasthi, Srinadh Bhojanapalli, Anupam Gupta, Chulhee Yun

Even for simple arithmetic tasks like integer addition, it is challenging for Transformers to generalize to longer sequences than those encountered during training.

no code implementations • 31 May 2024 • Bernd Bohnet, Kevin Swersky, Rosanne Liu, Pranjal Awasthi, Azade Nova, Javier Snaider, Hanie Sedghi, Aaron T Parisi, Michael Collins, Angeliki Lazaridou, Orhan Firat, Noah Fiedel

We explore the use of long-context capabilities in large language models to create synthetic reading comprehension data from entire books.

no code implementations • 8 Mar 2024 • Naman Agarwal, Pranjal Awasthi, Satyen Kale, Eric Zhao

Stacking, a heuristic technique for training deep residual networks by progressively increasing the number of layers and initializing new layers by copying parameters from older layers, has proven quite successful in improving the efficiency of training deep neural networks.

no code implementations • 26 Feb 2024 • Maximilian Böther, Abraham Sebastian, Pranjal Awasthi, Ana Klimovic, Srikumar Ramalingam

In this paper, we relax the requirement of having a central machine for the target subset by proposing a novel distributed bounding algorithm with provable approximation guarantees.

no code implementations • 7 Feb 2024 • Hanna Mazzawi, Pranjal Awasthi, Xavi Gonzalvo, Srikumar Ramalingam

Building upon this framework, we present a novel, architecture agnostic algorithm called "majority kernels", which seamlessly integrates with predominant architectures, including Transformer models.

no code implementations • 17 Dec 2023 • Srikumar Ramalingam, Pranjal Awasthi, Sanjiv Kumar

The success of deep learning hinges on enormous data and large models, which require labor-intensive annotations and heavy computation costs.

no code implementations • 1 Oct 2023 • Pranjal Awasthi, Anupam Gupta

For sorting we show that it is possible to train models on data consisting of sequences having length at most $20$, and improve the test accuracy on sequences of length $100$ from less than 1% (for standard training) to more than 92% (via task hinting).

no code implementations • 22 Jul 2023 • Pranjal Awasthi, Nika Haghtalab, Eric Zhao

Multi-distribution learning is a natural generalization of PAC learning to settings with multiple data distributions.

no code implementations • 10 May 2023 • Pranjal Awasthi, Corinna Cortes, Mehryar Mohri

We show how these bounds can guide the design of learning algorithms that we discuss in detail.

no code implementations • 23 Jan 2023 • Pranjal Awasthi, Kush Bhatia, Sreenivas Gollapudi, Kostas Kollias

For the linear contextual bandit setup, our algorithm, based on an iterative least squares planner, achieves policy regret $\tilde{O}(\sqrt{dT} + \Delta)$.

no code implementations • 19 Oct 2022 • Joan Puigcerver, Rodolphe Jenatton, Carlos Riquelme, Pranjal Awasthi, Srinadh Bhojanapalli

We next empirically evaluate the robustness of MoEs on ImageNet using adversarial attacks and show they are indeed more robust than dense models with the same computational cost.

no code implementations • 4 Aug 2022 • Pranjal Awasthi, Alex Tang, Aravindan Vijayaraghavan

We provide a convergence analysis of gradient descent for the problem of agnostically learning a single ReLU function under Gaussian distributions.

1 code implementation • 7 Jul 2022 • Saba Ahmadi, Pranjal Awasthi, Samir Khuller, Matthäus Kleindessner, Jamie Morgenstern, Pattara Sukprasert, Ali Vakilian

In this paper, we propose a natural notion of individual preference (IP) stability for clustering, which asks that every data point, on average, is closer to the points in its own cluster than to the points in any other cluster.

no code implementations • 9 Jun 2022 • Pranjal Awasthi, Abhimanyu Das, Weihao Kong, Rajat Sen

We study the problem of learning generalized linear models under adversarial corruptions.

no code implementations • 16 May 2022 • Pranjal Awasthi, Anqi Mao, Mehryar Mohri, Yutao Zhong

We also show that previous excess error bounds can be recovered as special cases of our general results.

no code implementations • 3 May 2022 • Pranjal Awasthi, Nishanth Dikkala, Pritish Kamath

Recent investigations in noise contrastive estimation suggest, both empirically and theoretically, that while having more "negative samples" in the contrastive loss improves downstream classification performance initially, beyond a threshold, it hurts downstream performance due to a "collision-coverage" trade-off.

1 code implementation • 11 Feb 2022 • Pranjal Awasthi, Christopher Jung, Jamie Morgenstern

Suppose we are given two datasets: a labeled dataset and an unlabeled dataset that also has additional auxiliary features not present in the first dataset.

no code implementations • 31 Jan 2022 • Ziwei Ji, Kwangjun Ahn, Pranjal Awasthi, Satyen Kale, Stefani Karp

In this paper, we close this gap by constructing a well-behaved distribution such that the global minimizer of the logistic risk over this distribution only achieves $\Omega(\sqrt{\textrm{OPT}})$ misclassification risk, matching the upper bound in (Frei et al., 2021).

no code implementations • 3 Dec 2021 • Pranjal Awasthi, Natalie S. Frank, Mehryar Mohri

Our results can provide a useful tool for a subsequent study of surrogate losses in adversarial robustness and their consistency properties.

no code implementations • NeurIPS 2021 • Pranjal Awasthi, Abhimanyu Das, Sreenivas Gollapudi

Graph Neural Networks (GNNs) are a powerful class of architectures for solving learning problems on graphs.

no code implementations • NeurIPS 2021 • Pranjal Awasthi, Natalie Frank, Mehryar Mohri

Adversarial robustness is a critical property in a variety of modern machine learning applications.

no code implementations • NeurIPS 2021 • Pranjal Awasthi, Alex Tang, Aravindan Vijayaraghavan

We present polynomial time and sample efficient algorithms for learning an unknown depth-2 feedforward neural network with general ReLU activations, under mild non-degeneracy assumptions.

no code implementations • ICLR 2022 • Pranjal Awasthi, Abhimanyu Das, Rajat Sen, Ananda Theertha Suresh

We also demonstrate empirically that our method instantiated with a well-designed general purpose mixture likelihood family can obtain superior performance for a variety of tasks across time-series forecasting and regression datasets with different data distributions.

no code implementations • 12 Jun 2021 • Fnu Devvrit, Nived Rajaraman, Pranjal Awasthi

In this setting, the learner has access to a dataset $X \in \mathbb{R}^{(n_1+n_2) \times d}$ which is composed of $n_1$ unlabelled examples that an algorithm can actively query, and $n_2$ examples labelled a priori.

no code implementations • NeurIPS 2021 • Pranjal Awasthi, Christoph Dann, Claudio Gentile, Ayush Sekhari, Zhilei Wang

We investigate the problem of active learning in the streaming setting in non-parametric regimes, where the labels are stochastically generated from a class of functions on which we make no assumptions whatsoever.

no code implementations • 20 May 2021 • Flavien Prost, Pranjal Awasthi, Nick Blumm, Aditee Kumthekar, Trevor Potter, Li Wei, Xuezhi Wang, Ed H. Chi, Jilin Chen, Alex Beutel

In this work we study the problem of measuring the fairness of a machine learning model under noisy information.

no code implementations • 4 May 2021 • Pranjal Awasthi, Anqi Mao, Mehryar Mohri, Yutao Zhong

Moreover, our calibration results, combined with the previous study of consistency by Awasthi et al. (2021), also lead to more general $H$-consistency results covering common hypothesis sets.

no code implementations • NeurIPS 2021 • Pranjal Awasthi, Natalie Frank, Anqi Mao, Mehryar Mohri, Yutao Zhong

We then give a characterization of H-calibration and prove that some surrogate losses are indeed H-calibrated for the adversarial loss, with these hypothesis sets.

no code implementations • 1 Mar 2021 • Jacob Abernethy, Pranjal Awasthi, Satyen Kale

This apparent lack of robustness has led researchers to propose methods that can help to prevent an adversary from having such capabilities.

no code implementations • 16 Feb 2021 • Pranjal Awasthi, Alex Beutel, Matthaeus Kleindessner, Jamie Morgenstern, Xuezhi Wang

An alternate approach that is commonly used is to separately train an attribute classifier on data with sensitive attribute information, and then use it later in the ML pipeline to evaluate the bias of a given classifier.

no code implementations • 1 Jan 2021 • Pranjal Awasthi, Sreenivas Gollapudi, Kostas Kollias, Apaar Sadhwani

We study the design of efficient online learning algorithms tolerant to adversarially corrupted rewards.

no code implementations • 1 Jan 2021 • Pranjal Awasthi, Abhimanyu Das, Sreenivas Gollapudi

Finally, we empirically demonstrate the effectiveness of our proposed architecture for a variety of graph problems.

no code implementations • NeurIPS 2020 • Pranjal Awasthi, Satyen Kale, Stefani Karp, Mehryar Mohri

We present a series of new PAC-Bayes learning guarantees for randomized algorithms with sample-dependent priors.

no code implementations • CVPR 2021 • Pranjal Awasthi, George Yu, Chun-Sung Ferng, Andrew Tomkins, Da-Cheng Juan

In this work we extend the above setting to consider the problem of training deep neural networks that can be made simultaneously robust to perturbations applied in multiple natural representation spaces.

no code implementations • 21 Aug 2020 • Pranjal Awasthi, Corinna Cortes, Yishay Mansour, Mehryar Mohri

In the adversarial setting, we design efficient algorithms with competitive ratio guarantees.

no code implementations • 21 Jul 2020 • Pranjal Awasthi, Natalie Frank, Mehryar Mohri

Linear predictors form a rich class of hypotheses used in a variety of learning algorithms.

no code implementations • NeurIPS 2020 • Pranjal Awasthi, Himanshu Jain, Ankit Singh Rawat, Aravindan Vijayaraghavan

Adversarial robustness measures the susceptibility of a classifier to imperceptible perturbations made to the inputs at test time.

1 code implementation • 11 Jun 2020 • Jacob Abernethy, Pranjal Awasthi, Matthäus Kleindessner, Jamie Morgenstern, Chris Russell, Jie Zhang

We propose simple active sampling and reweighting strategies for optimizing min-max fairness that can be applied to any classification or regression model learned via loss minimization.

no code implementations • 8 Jun 2020 • Matthäus Kleindessner, Pranjal Awasthi, Jamie Morgenstern

A common distinction in fair machine learning, in particular in fair classification, is between group fairness and individual fairness.

no code implementations • 31 May 2020 • Pranjal Awasthi, Xue Chen, Aravindan Vijayaraghavan

We design a computationally efficient algorithm that given corrupted data, recovers an estimate of the top-$r$ principal subspace with error that depends on a robustness parameter $\kappa$ that we identify.

no code implementations • ICML 2020 • Pranjal Awasthi, Natalie Frank, Mehryar Mohri

We give upper and lower bounds for the adversarial empirical Rademacher complexity of linear hypotheses with adversarial perturbations measured in $l_r$-norm for an arbitrary $r \geq 1$.

no code implementations • NeurIPS 2020 • Chicheng Zhang, Jie Shen, Pranjal Awasthi

Even in the presence of mild label noise, i.e., $\eta$ is a small constant, this is a challenging problem and only recently have label complexity bounds of the form $\tilde{O}\big(s \cdot \mathrm{polylog}(d, \frac{1}{\epsilon})\big)$ been established in [Zhang, 2018] for computationally efficient algorithms.

no code implementations • 4 Feb 2020 • Naman Agarwal, Pranjal Awasthi, Satyen Kale

We study the role of depth in training randomly initialized overparameterized neural networks.

no code implementations • 29 Nov 2019 • Pranjal Awasthi, Vaggos Chatziafratis, Xue Chen, Aravindan Vijayaraghavan

In particular, our adversarially robust PCA primitive leads to computationally efficient and robust algorithms for both unsupervised and supervised learning problems such as clustering and learning adversarially robust classifiers.

1 code implementation • NeurIPS 2019 • Pranjal Awasthi, Abhratanu Dutta, Aravindan Vijayaraghavan

In particular, we leverage this connection to (a) design computationally efficient robust algorithms with provable guarantees for a large class of hypotheses, namely linear classifiers and degree-2 polynomial threshold functions (PTFs), (b) give a precise characterization of the price of achieving robustness in a computationally efficient manner for these classes, and (c) design efficient algorithms to certify robustness and generate adversarial attacks in a principled manner for 2-layer neural networks.

2 code implementations • 7 Jun 2019 • Pranjal Awasthi, Matthäus Kleindessner, Jamie Morgenstern

We identify conditions on the perturbation that guarantee that the bias of a classifier is reduced even by running equalized odds with the perturbed attribute.

1 code implementation • 24 Jan 2019 • Matthäus Kleindessner, Pranjal Awasthi, Jamie Morgenstern

In data summarization we want to choose $k$ prototypes in order to summarize a data set.

1 code implementation • 24 Jan 2019 • Matthäus Kleindessner, Samira Samadi, Pranjal Awasthi, Jamie Morgenstern

Given the widespread popularity of spectral clustering (SC) for partitioning graph data, we study a version of constrained SC in which we try to incorporate the fairness notion proposed by Chierichetti et al. (2017).

no code implementations • ICML 2018 • Matthaeus Kleindessner, Pranjal Awasthi

Most existing works on crowdsourcing assume that the workers follow the Dawid-Skene model, or the one-coin model as its special case, where every worker makes mistakes independently of other workers and with the same error probability for every task.

no code implementations • 23 Apr 2018 • Pranjal Awasthi, Aravindan Vijayaraghavan

To address this question while circumventing the issue of non-identifiability, we study a natural semirandom model for dictionary learning where there are a large number of samples $y=Ax$ with arbitrary k-sparse supports for x, along with a few samples where the sparse supports are chosen uniformly at random.

no code implementations • ICML 2018 • Pranjal Awasthi, Aravindan Vijayaraghavan

Gaussian mixture models (GMM) are the most widely used statistical model for the $k$-means clustering problem and form a popular framework for clustering in machine learning and data analysis.

no code implementations • 21 Mar 2017 • Pranjal Awasthi, Avrim Blum, Nika Haghtalab, Yishay Mansour

When a noticeable fraction of the labelers are perfect, and the rest behave arbitrarily, we show that any $\mathcal{F}$ that can be efficiently learned in the traditional realizable PAC model can be learned in a computationally efficient manner by querying the crowd, despite high amounts of noise in the responses.

no code implementations • 2 Mar 2017 • Pranjal Awasthi, Ainesh Bakshi, Maria-Florina Balcan, Colin White, David Woodruff

In this work, we study the $k$-median and $k$-means clustering problems when the data is distributed across many servers and can contain outliers.

no code implementations • NeurIPS 2015 • Pranjal Awasthi, Andrej Risteski

The assumptions on the topic priors are related to the well known Dirichlet prior, introduced to the area of topic modeling by (Blei et al., 2003).

no code implementations • 12 Mar 2015 • Pranjal Awasthi, Maria-Florina Balcan, Nika Haghtalab, Ruth Urner

We provide the first polynomial time algorithm that can learn linear separators to arbitrarily small excess error in this noise model under the uniform distribution over the unit ball in $\Re^d$, for some constant value of $\eta$.

no code implementations • 7 Mar 2015 • Pranjal Awasthi, Moses Charikar, Kevin A. Lai, Andrej Risteski

We resolve an open question from (Christiano, 2014b) posed in COLT'14 regarding the optimal dependency of the regret achievable for online local learning on the size of the label set.

no code implementations • NeurIPS 2014 • Pranjal Awasthi, Avrim Blum, Or Sheffet, Aravindan Vijayaraghavan

We present the first polynomial time algorithm which provably learns the parameters of a mixture of two Mallows models.

no code implementations • 18 Aug 2014 • Pranjal Awasthi, Afonso S. Bandeira, Moses Charikar, Ravishankar Krishnaswamy, Soledad Villar, Rachel Ward

Under the same distributional model, the $k$-means LP relaxation fails to recover such clusters at separation as large as $\Delta = 4$.

no code implementations • 24 Dec 2013 • Pranjal Awasthi, Maria-Florina Balcan, Konstantin Voevodski

We study the design of interactive clustering algorithms for data sets satisfying natural stability assumptions.

no code implementations • 31 Jul 2013 • Pranjal Awasthi, Maria Florina Balcan, Philip M. Long

For malicious noise, where the adversary can corrupt both the label and the features, we provide a polynomial-time algorithm for learning linear separators in $\Re^d$ under isotropic log-concave distributions that can tolerate a nearly information-theoretically optimal noise rate of $\eta = \Omega(\epsilon)$.

no code implementations • 5 Nov 2012 • Pranjal Awasthi, Vitaly Feldman, Varun Kanade

We introduce a new model of membership query (MQ) learning, where the learning algorithm is restricted to query points that are close to random examples drawn from the underlying distribution.

no code implementations • NeurIPS 2010 • Pranjal Awasthi, Reza B. Zadeh

We also propose a dynamic model where the teacher sees a random subset of the points.

Papers With Code is a free resource with all data licensed under CC-BY-SA.