Search Results for author: Arya Mazumdar

Found 35 papers, 3 papers with code

Random Subgraph Detection Using Queries

no code implementations 2 Oct 2021 Wasim Huleihel, Arya Mazumdar, Soumyabrata Pal

Specifically, we show that any (possibly randomized) algorithm must make $\mathsf{Q} = \Omega(\frac{n^2}{k^2\chi^4(p||q)}\log^2n)$ adaptive queries (in expectation) to the adjacency matrix of the graph to detect the planted subgraph with probability more than $1/2$, where $\chi^2(p||q)$ is the Chi-Square distance.

Lower Bounds on the Total Variation Distance Between Mixtures of Two Gaussians

no code implementations 2 Sep 2021 Sami Davies, Arya Mazumdar, Soumyabrata Pal, Cyrus Rashtchian

Mixtures of high dimensional Gaussian distributions have been studied extensively in statistics and learning theory.

Learning Theory

Support Recovery in Universal One-bit Compressed Sensing

no code implementations 19 Jul 2021 Arya Mazumdar, Soumyabrata Pal

With universality, it is known that $\tilde{\Theta}(k^2)$ 1bCS measurements are necessary and sufficient for support recovery (where $k$ denotes the sparsity).

Quantization

Support Recovery of Sparse Signals from a Mixture of Linear Measurements

no code implementations NeurIPS 2021 Venkata Gandikota, Arya Mazumdar, Soumyabrata Pal

In this work, we study the number of measurements sufficient for recovering the supports of all the component vectors in a mixture in both these models.

Fuzzy Clustering with Similarity Queries

1 code implementation NeurIPS 2021 Wasim Huleihel, Arya Mazumdar, Soumyabrata Pal

In particular, we provide algorithms for fuzzy clustering in this setting that ask $O(\mathsf{poly}(k)\log n)$ similarity queries and run in polynomial time, where $n$ is the number of items.

Escaping Saddle Points in Distributed Newton's Method with Communication efficiency and Byzantine Resilience

no code implementations 17 Mar 2021 Avishek Ghosh, Raj Kumar Maity, Arya Mazumdar, Kannan Ramchandran

Furthermore, our algorithm resists the presence of Byzantine machines, which may create fake local minima near the saddle points of the loss function, also known as a saddle-point attack.

Session-Aware Query Auto-completion using Extreme Multi-label Ranking

1 code implementation 9 Dec 2020 Nishant Yadav, Rajat Sen, Daniel N. Hill, Arya Mazumdar, Inderjit S. Dhillon

Previous queries in the user session can provide useful context for the user's intent and can be leveraged to suggest auto-completions that are more relevant while adhering to the user's prefix.

Recovery of sparse linear classifiers from mixture of responses

no code implementations NeurIPS 2020 Venkata Gandikota, Arya Mazumdar, Soumyabrata Pal

We look at a hitherto unstudied problem of query complexity upper bound of recovering all the hyperplanes, especially for the case when the hyperplanes are sparse.

Quantization

Recovery of Sparse Signals from a Mixture of Linear Samples

no code implementations ICML 2020 Arya Mazumdar, Soumyabrata Pal

Mixture of linear regressions is a popular learning theoretic model that is used widely to represent heterogeneous data.

Experimental Design

Multilabel Classification by Hierarchical Partitioning and Data-dependent Grouping

1 code implementation NeurIPS 2020 Shashanka Ubaru, Sanjeeb Dash, Arya Mazumdar, Oktay Gunluk

We then present a hierarchical partitioning approach that exploits the label hierarchy in large scale problems to divide up the large label space and create smaller sub-problems, which can then be solved independently via the grouping approach.

Classification General Classification +1

Distributed Newton Can Communicate Less and Resist Byzantine Workers

no code implementations NeurIPS 2020 Avishek Ghosh, Raj Kumar Maity, Arya Mazumdar

We develop a distributed second order optimization algorithm that is communication-efficient as well as robust against Byzantine failures of the worker machines.

Distributed Optimization

Reliable Distributed Clustering with Redundant Data Assignment

no code implementations 20 Feb 2020 Venkata Gandikota, Arya Mazumdar, Ankit Singh Rawat

In this paper, we present distributed generalized clustering algorithms that can handle large scale data across multiple machines in spite of straggling or unreliable machines.

Dimensionality Reduction

Algebraic and Analytic Approaches for Parameter Learning in Mixture Models

no code implementations 19 Jan 2020 Akshay Krishnamurthy, Arya Mazumdar, Andrew Mcgregor, Soumyabrata Pal

Our second approach uses algebraic and combinatorial tools and applies to binomial mixtures with shared trial parameter $N$ and differing success parameters, as well as to mixtures of geometric distributions.

Sample Complexity of Learning Mixture of Sparse Linear Regressions

no code implementations NeurIPS 2019 Akshay Krishnamurthy, Arya Mazumdar, Andrew Mcgregor, Soumyabrata Pal

Our techniques are quite different from those in the previous work: for the noiseless case, we rely on a property of sparse polynomials and for the noisy case, we provide new connections to learning Gaussian mixtures and use ideas from the theory of

Communication-Efficient and Byzantine-Robust Distributed Learning with Error Feedback

no code implementations 21 Nov 2019 Avishek Ghosh, Raj Kumar Maity, Swanand Kadhe, Arya Mazumdar, Kannan Ramchandran

Moreover, we analyze the compressed gradient descent algorithm with error feedback (proposed in \cite{errorfeed}) in a distributed setting and in the presence of Byzantine worker machines.

vqSGD: Vector Quantized Stochastic Gradient Descent

no code implementations 18 Nov 2019 Venkata Gandikota, Daniel Kane, Raj Kumar Maity, Arya Mazumdar

In this work, we present a family of vector quantization schemes, vqSGD (Vector-Quantized Stochastic Gradient Descent), that provide an asymptotic reduction in the communication cost with convergence guarantees in first-order distributed optimization.

Distributed Optimization Quantization

Sample Complexity of Learning Mixtures of Sparse Linear Regressions

no code implementations 30 Oct 2019 Akshay Krishnamurthy, Arya Mazumdar, Andrew Mcgregor, Soumyabrata Pal

In the problem of learning mixtures of linear regressions, the goal is to learn a collection of signal vectors from a sequence of (possibly noisy) linear measurements, where each measurement is evaluated on an unknown signal drawn uniformly from this collection.

Same-Cluster Querying for Overlapping Clusters

no code implementations NeurIPS 2019 Wasim Huleihel, Arya Mazumdar, Muriel Médard, Soumyabrata Pal

In this paper, we look at the more practical scenario of overlapping clusters, and provide upper bounds (with algorithms) on the sufficient number of queries.

Learning Network Parameters in the ReLU Model

no code implementations NeurIPS Workshop Deep_Invers 2019 Arya Mazumdar, Ankit Singh Rawat

Rectified linear units, or ReLUs, have become a preferred activation function for artificial neural networks.

Semisupervised Clustering by Queries and Locally Encodable Source Coding

no code implementations 31 Mar 2019 Arya Mazumdar, Soumyabrata Pal

In this paper, we show that a recently popular model of semi-supervised clustering is equivalent to locally encodable source coding.

Data Compression

High Dimensional Discrete Integration over the Hypergrid

no code implementations 29 Jun 2018 Raj Kumar Maity, Arya Mazumdar, Soumyabrata Pal

Recently Ermon et al. (2013) pioneered a way to practically compute approximations to large scale counting or discrete integration problems by using random hashes.

Robust Gradient Descent via Moment Encoding with LDPC Codes

no code implementations 22 May 2018 Raj Kumar Maity, Ankit Singh Rawat, Arya Mazumdar

We, instead, propose to encode the second-moment of the data with a low density parity-check (LDPC) code.

Distributed Computing

Connectivity in Random Annulus Graphs and the Geometric Block Model

no code implementations 12 Apr 2018 Sainyam Galhotra, Arya Mazumdar, Soumyabrata Pal, Barna Saha

Our next contribution is in using the connectivity of random annulus graphs to provide necessary and sufficient conditions for efficient recovery of communities for the geometric block model (GBM).

Community Detection Stochastic Block Model

Representation Learning and Recovery in the ReLU Model

no code implementations 12 Mar 2018 Arya Mazumdar, Ankit Singh Rawat

Given a set of observation vectors $\mathbf{y}^i \in \mathbb{R}^d, i =1, 2, \dots , n$, we aim to recover $d\times k$ matrix $A$ and the latent vectors $\{\mathbf{c}^i\} \subset \mathbb{R}^k$ under the model $\mathbf{y}^i = \mathrm{ReLU}(A\mathbf{c}^i +\mathbf{b})$, where $\mathbf{b}\in \mathbb{R}^d$ is a random bias.

Dictionary Learning Representation Learning

Semisupervised Clustering, AND-Queries and Locally Encodable Source Coding

no code implementations NeurIPS 2017 Arya Mazumdar, Soumyabrata Pal

In this paper, we show that a recently popular model of semisupervised clustering is equivalent to locally encodable source coding.

Data Compression

The Geometric Block Model

no code implementations 16 Sep 2017 Sainyam Galhotra, Arya Mazumdar, Soumyabrata Pal, Barna Saha

To capture the inherent geometric features of many community detection problems, we propose to use a new random graph model of communities that we call a Geometric Block Model.

Community Detection Stochastic Block Model

Multilabel Classification with Group Testing and Codes

no code implementations ICML 2017 Shashanka Ubaru, Arya Mazumdar

In this work, we propose a novel approach based on group testing to solve such large multilabel classification problems with sparse label vectors.

Classification General Classification

Query Complexity of Clustering with Side Information

no code implementations NeurIPS 2017 Arya Mazumdar, Barna Saha

A natural noisy model is where similarity values are drawn independently from some arbitrary probability distribution $f_+$ when the underlying pair of elements belong to the same cluster, and from some $f_-$ otherwise.

Community Detection Stochastic Block Model

Clustering with Noisy Queries

no code implementations NeurIPS 2017 Arya Mazumdar, Barna Saha

In this paper, we provide the first information theoretic lower bound on the number of queries for clustering with noisy oracle in both situations.

Entity Resolution Stochastic Block Model

A Theoretical Analysis of First Heuristics of Crowdsourced Entity Resolution

no code implementations 3 Feb 2017 Arya Mazumdar, Barna Saha

Entity resolution (ER) is the task of identifying all records in a database that refer to the same underlying entity, and are therefore duplicates of each other.

Entity Resolution

Efficient Rank Aggregation via Lehmer Codes

no code implementations 28 Jan 2017 Pan Li, Arya Mazumdar, Olgica Milenkovic

We propose a novel rank aggregation method based on converting permutations into their corresponding Lehmer codes or other subdiagonal images.
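
For readers unfamiliar with the term, the sketch below illustrates only the conversion between a permutation and its Lehmer code, not the paper's aggregation method. It assumes one common convention (entry i counts the later elements smaller than perm[i]); the paper may use a different variant, and the function names are hypothetical.

```python
def lehmer_code(perm):
    """Lehmer code of a permutation of 0..n-1 (hypothetical helper).

    Assumed convention: entry i counts the positions j > i
    with perm[j] < perm[i].
    """
    n = len(perm)
    return [
        sum(1 for j in range(i + 1, n) if perm[j] < perm[i])
        for i in range(n)
    ]


def from_lehmer_code(code):
    """Invert lehmer_code: rebuild the permutation from its code."""
    remaining = list(range(len(code)))
    # Entry c selects the c-th smallest element not yet used.
    return [remaining.pop(c) for c in code]


if __name__ == "__main__":
    ranking = [2, 0, 1]
    code = lehmer_code(ranking)
    print(code)
    print(from_lehmer_code(code))
```

Under this convention each code entry at position i ranges over 0..n-1-i independently of the others, which is what makes coordinate-wise processing of the codes possible.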

Associative Memory using Dictionary Learning and Expander Decoding

no code implementations 29 Nov 2016 Arya Mazumdar, Ankit Singh Rawat

Designing an associative memory requires addressing two main tasks: 1) learning phase: given a dataset, learn a concise representation of the dataset in the form of a graphical model (or a neural network), 2) recall phase: given a noisy version of a message vector from the dataset, output the correct message vector via a neurally feasible algorithm over the network learnt during the learning phase.

Dictionary Learning

Clustering Via Crowdsourcing

no code implementations 7 Apr 2016 Arya Mazumdar, Barna Saha

A major contribution of this paper is to reduce the query complexity to linear or even sublinear in $n$ when mild side information is provided by a machine, even in the presence of crowd errors which are not correctable via resampling.

Entity Resolution

Low rank approximation and decomposition of large matrices using error correcting codes

no code implementations 30 Dec 2015 Shashanka Ubaru, Arya Mazumdar, Yousef Saad

In this paper, we show how matrices from error correcting codes can be used to find such low rank approximations and matrix decompositions, and extend the framework to linear least squares regression problems.

Associative Memory via a Sparse Recovery Model

no code implementations NeurIPS 2015 Arya Mazumdar, Ankit Singh Rawat

An associative memory is a structure learned from a dataset $\mathcal{M}$ of vectors (signals) in a way such that, given a noisy version of one of the vectors as input, the nearest valid vector from $\mathcal{M}$ (nearest neighbor) is provided as output, preferably via a fast iterative algorithm.
