no code implementations • 6 Feb 2023 • Yash Chandak, Shiv Shankar, Venkata Gandikota, Philip S. Thomas, Arya Mazumdar
We propose a first-order method for convex optimization, where instead of being restricted to the gradient from a single parameter, gradients from multiple parameters can be used during each step of gradient descent.
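A minimal sketch of the idea, assuming the gradients queried at multiple parameter points are simply averaged into one descent direction (the combination rule and the names below are illustrative, not the authors' actual update):

```python
import numpy as np

def multi_point_gradient_step(grad, thetas, lr=0.1):
    """One descent step that combines gradients evaluated at several
    parameter vectors, instead of only at the current iterate.

    grad:   function mapping a parameter vector to its gradient
    thetas: list of parameter vectors to query (current iterate first)
    """
    # Illustrative combination rule: a plain average of the queried gradients.
    g = np.mean([grad(t) for t in thetas], axis=0)
    return thetas[0] - lr * g

# Toy usage on f(x) = ||x||^2 / 2, whose gradient is x.
grad = lambda x: x
theta = np.ones(3)
for _ in range(50):
    # Query gradients at the current iterate and at a nearby perturbed point.
    theta = multi_point_gradient_step(grad, [theta, theta + 0.01])
print(theta)  # approaches the minimizer at the origin (up to the small perturbation)
```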
no code implementations • 17 Jan 2023 • Xiaofan Yu, Ludmila Cherkasova, Harsh Vardhan, Quanling Zhao, Emily Ekaireb, Xiyuan Zhang, Arya Mazumdar, Tajana Rosing
To fully unleash the potential of Async-HFL in convergence speed under system heterogeneity and stragglers, we design device selection at the gateway level and device-gateway association at the cloud level.
no code implementations • 29 Oct 2022 • Namiko Matsumoto, Arya Mazumdar, Soumyabrata Pal
A {\em universal} measurement matrix for 1bCS refers to a single set of measurements that works for all sparse signals.
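For context, a 1bCS measurement keeps only the sign of each linear measurement. The toy sketch below generates such measurements with a Gaussian matrix, which is an illustrative choice rather than the universal construction studied here:

```python
import numpy as np

rng = np.random.default_rng(0)

n, m, k = 100, 40, 3          # signal length, number of measurements, sparsity
x = np.zeros(n)
x[rng.choice(n, size=k, replace=False)] = rng.standard_normal(k)  # k-sparse signal

A = rng.standard_normal((m, n))   # measurement matrix (illustrative Gaussian choice)
y = np.sign(A @ x)                # 1-bit measurements: only the signs are retained
```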
no code implementations • 20 Oct 2022 • Harshvardhan, Avishek Ghosh, Arya Mazumdar
\texttt{SR-FCA} initializes each user as a singleton cluster, and then successively refines the cluster estimates by exploiting similar users belonging to the same cluster.
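A loose sketch of the initialize-as-singletons-then-merge idea, assuming clusters are merged whenever their averaged model estimates are close; the merging criterion and threshold are placeholders, not \texttt{SR-FCA}'s actual refinement rule:

```python
import numpy as np

def refine_clusters(user_models, thresh):
    """Start with one cluster per user, then greedily merge clusters whose
    (averaged) model estimates are within `thresh` of each other.
    This is only a schematic version of the initialize-then-refine idea."""
    clusters = [[i] for i in range(len(user_models))]   # singleton initialization
    merged = True
    while merged:
        merged = False
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                ca = np.mean([user_models[i] for i in clusters[a]], axis=0)
                cb = np.mean([user_models[i] for i in clusters[b]], axis=0)
                if np.linalg.norm(ca - cb) < thresh:
                    clusters[a] += clusters.pop(b)      # merge similar clusters
                    merged = True
                    break
            if merged:
                break
    return clusters
```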
no code implementations • 7 Jul 2022 • Namiko Matsumoto, Arya Mazumdar
Note that this dependence on $k$ and $\epsilon$ is optimal for any recovery method in 1-bit compressed sensing.
no code implementations • 22 Jun 2022 • Sainyam Galhotra, Arya Mazumdar, Soumyabrata Pal, Barna Saha
We show that a simple triangle-counting algorithm to detect communities in the geometric block model is near-optimal.
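The core primitive is counting the triangles through an edge, i.e. the common neighbors of its endpoints, and thresholding that count; the threshold in this sketch is a free parameter, not the one derived in the paper:

```python
def triangle_count_same_cluster(adj, u, v, thresh):
    """Decide whether the endpoints of edge (u, v) look like they belong to the
    same community by counting the triangles through the edge, i.e. the number
    of common neighbors of u and v, and comparing against a threshold.
    `adj` maps each vertex to its set of neighbors; `thresh` is illustrative."""
    common = len(adj[u] & adj[v])
    return common >= thresh

# Toy usage: edge (0, 1) has one common neighbor, vertex 2.
adj = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0}}
print(triangle_count_same_cluster(adj, 0, 1, thresh=1))  # True
```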
no code implementations • 31 May 2022 • Avishek Ghosh, Abishek Sankararaman, Kannan Ramchandran, Tara Javidi, Arya Mazumdar
We propose and analyze a decentralized and asynchronous learning algorithm, namely Decentralized Non-stationary Competing Bandits (\texttt{DNCB}), where the agents play (restrictive) successive elimination type learning algorithms to learn their preference over the arms.
no code implementations • 26 May 2022 • Avishek Ghosh, Arya Mazumdar, Soumyabrata Pal, Rajat Sen
In this paper, we show that a version of the popular alternating minimization (AM) algorithm finds the best-fit lines in a dataset even when a realizable model is not assumed, under some regularity conditions on the dataset and the initial points, and thereby provides a solution to the ERM problem.
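A minimal sketch of such an AM loop for fitting $k$ lines, assuming squared-error assignments and least-squares refits; it only illustrates the alternation, not the paper's regularity conditions or guarantees:

```python
import numpy as np

def am_best_fit_lines(X, y, w_init, iters=50):
    """Alternating minimization for fitting k lines to (X, y):
    (1) assign each point to the line with the smallest residual,
    (2) refit each line by least squares on its assigned points.
    The initialization `w_init` (k x d) matters, as the paper's conditions suggest."""
    W = np.array(w_init, dtype=float)
    for _ in range(iters):
        residuals = (X @ W.T - y[:, None]) ** 2          # n x k residual matrix
        assign = residuals.argmin(axis=1)                 # best line per point
        for j in range(W.shape[0]):
            idx = assign == j
            if idx.any():
                W[j], *_ = np.linalg.lstsq(X[idx], y[idx], rcond=None)
    return W
```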
no code implementations • 24 Feb 2022 • Arya Mazumdar, Soumyabrata Pal
Sparsity of parameter vectors is a natural constraint in a variety of settings, and support recovery is a major step towards parameter estimation.
no code implementations • 2 Oct 2021 • Wasim Huleihel, Arya Mazumdar, Soumyabrata Pal
Specifically, we show that any (possibly randomized) algorithm must make $\mathsf{Q} = \Omega(\frac{n^2}{k^2\chi^4(p||q)}\log^2n)$ adaptive queries (in expectation) to the adjacency matrix of the graph to detect the planted subgraph with probability more than $1/2$, where $\chi^2(p||q)$ is the Chi-Square distance.
no code implementations • 2 Sep 2021 • Sami Davies, Arya Mazumdar, Soumyabrata Pal, Cyrus Rashtchian
Mixtures of high dimensional Gaussian distributions have been studied extensively in statistics and learning theory.
no code implementations • 19 Jul 2021 • Arya Mazumdar, Soumyabrata Pal
With universality, it is known that $\tilde{\Theta}(k^2)$ 1bCS measurements are necessary and sufficient for support recovery (where $k$ denotes the sparsity).
no code implementations • NeurIPS 2021 • Venkata Gandikota, Arya Mazumdar, Soumyabrata Pal
In this work, we study the number of measurements sufficient for recovering the supports of all the component vectors in a mixture in both these models.
1 code implementation • NeurIPS 2021 • Wasim Huleihel, Arya Mazumdar, Soumyabrata Pal
In particular, we provide algorithms for fuzzy clustering in this setting that ask $O(\mathsf{poly}(k)\log n)$ similarity queries and run in polynomial time, where $n$ is the number of items.
no code implementations • 17 Mar 2021 • Avishek Ghosh, Raj Kumar Maity, Arya Mazumdar, Kannan Ramchandran
Moreover, we validate our theoretical findings with experiments using standard datasets and several types of Byzantine attacks, and obtain an improvement of $25\%$ with respect to first order methods in iteration complexity.
1 code implementation • 9 Dec 2020 • Nishant Yadav, Rajat Sen, Daniel N. Hill, Arya Mazumdar, Inderjit S. Dhillon
Previous queries in the user session can provide useful context for the user's intent and can be leveraged to suggest auto-completions that are more relevant while adhering to the user's prefix.
no code implementations • NeurIPS 2020 • Venkata Gandikota, Arya Mazumdar, Soumyabrata Pal
We study the hitherto unexplored problem of upper-bounding the query complexity of recovering all the hyperplanes, especially in the case when the hyperplanes are sparse.
no code implementations • ICML 2020 • Arya Mazumdar, Soumyabrata Pal
Mixture of linear regressions is a popular learning theoretic model that is used widely to represent heterogeneous data.
1 code implementation • NeurIPS 2020 • Shashanka Ubaru, Sanjeeb Dash, Arya Mazumdar, Oktay Gunluk
We then present a hierarchical partitioning approach that exploits the label hierarchy in large scale problems to divide up the large label space and create smaller sub-problems, which can then be solved independently via the grouping approach.
no code implementations • NeurIPS 2020 • Avishek Ghosh, Raj Kumar Maity, Arya Mazumdar
We develop a distributed second order optimization algorithm that is communication-efficient as well as robust against Byzantine failures of the worker machines.
no code implementations • 20 Feb 2020 • Venkata Gandikota, Arya Mazumdar, Ankit Singh Rawat
In this paper, we present distributed generalized clustering algorithms that can handle large scale data across multiple machines in spite of straggling or unreliable machines.
no code implementations • 19 Jan 2020 • Akshay Krishnamurthy, Arya Mazumdar, Andrew Mcgregor, Soumyabrata Pal
Our second approach uses algebraic and combinatorial tools and applies to binomial mixtures with shared trial parameter $N$ and differing success parameters, as well as to mixtures of geometric distributions.
no code implementations • NeurIPS 2019 • Akshay Krishnamurthy, Arya Mazumdar, Andrew Mcgregor, Soumyabrata Pal
Our techniques are quite different from those in the previous work: for the noiseless case, we rely on a property of sparse polynomials and for the noisy case, we provide new connections to learning Gaussian mixtures and use ideas from the theory of
no code implementations • 21 Nov 2019 • Avishek Ghosh, Raj Kumar Maity, Swanand Kadhe, Arya Mazumdar, Kannan Ramchandran
Moreover, we analyze the compressed gradient descent algorithm with error feedback (proposed in \cite{errorfeed}) in a distributed setting and in the presence of Byzantine worker machines.
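For reference, error feedback keeps the part of the gradient discarded by the compressor and adds it back in the next round. A small single-machine sketch with a top-$k$ compressor (the compressor choice is illustrative):

```python
import numpy as np

def top_k(v, k):
    """Keep the k largest-magnitude coordinates of v, zero out the rest."""
    out = np.zeros_like(v)
    idx = np.argsort(np.abs(v))[-k:]
    out[idx] = v[idx]
    return out

def ef_compressed_step(theta, grad, memory, lr=0.1, k=2):
    """One step of compressed gradient descent with error feedback:
    compress the gradient plus the accumulated compression error, apply the
    compressed direction, and carry the dropped part into the next round."""
    g = grad(theta) + memory
    g_compressed = top_k(g, k)
    memory = g - g_compressed          # error fed back at the next iteration
    return theta - lr * g_compressed, memory
```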
no code implementations • 18 Nov 2019 • Venkata Gandikota, Daniel Kane, Raj Kumar Maity, Arya Mazumdar
In this work, we present a family of vector quantization schemes \emph{vqSGD} (Vector-Quantized Stochastic Gradient Descent) that provide an asymptotic reduction in the communication cost with convergence guarantees in first-order distributed optimization.
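As a flavor of communication-efficient gradient quantization, the sketch below is an unbiased quantizer that maps a gradient to a single scaled coordinate vector, so only an index, a sign, and one scalar need to be communicated; it is illustrative and not necessarily the vqSGD construction:

```python
import numpy as np

def unbiased_coordinate_quantize(g, rng=np.random.default_rng()):
    """Unbiased quantization of a gradient g to a single scaled coordinate
    vector: only (index, sign, ||g||_1) must be communicated.
    An illustrative quantizer, not necessarily the vqSGD scheme."""
    l1 = np.abs(g).sum()
    if l1 == 0:
        return np.zeros_like(g)
    p = np.abs(g) / l1                       # sampling probabilities
    i = rng.choice(len(g), p=p)              # pick one coordinate
    q = np.zeros_like(g)
    q[i] = np.sign(g[i]) * l1                # scaled so that E[q] = g
    return q

# Averaging many quantized copies recovers the original gradient in expectation.
g = np.array([0.5, -1.0, 2.0])
est = np.mean([unbiased_coordinate_quantize(g) for _ in range(20000)], axis=0)
print(est)  # approximately [0.5, -1.0, 2.0]
```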
no code implementations • 30 Oct 2019 • Akshay Krishnamurthy, Arya Mazumdar, Andrew Mcgregor, Soumyabrata Pal
In the problem of learning mixtures of linear regressions, the goal is to learn a collection of signal vectors from a sequence of (possibly noisy) linear measurements, where each measurement is evaluated on an unknown signal drawn uniformly from this collection.
no code implementations • NeurIPS 2019 • Wasim Huleihel, Arya Mazumdar, Muriel Médard, Soumyabrata Pal
In this paper, we look at the more practical scenario of overlapping clusters, and provide upper bounds (with algorithms) on the sufficient number of queries.
no code implementations • NeurIPS Workshop Deep_Invers 2019 • Arya Mazumdar, Ankit Singh Rawat
Rectified linear units, or ReLUs, have become a preferred activation function for artificial neural networks.
no code implementations • 31 Mar 2019 • Arya Mazumdar, Soumyabrata Pal
In this paper, we show that a recently popular model of semi-supervised clustering is equivalent to locally encodable source coding.
no code implementations • 29 Jun 2018 • Raj Kumar Maity, Arya Mazumdar, Soumyabrata Pal
Recently Ermon et al. (2013) pioneered a way to practically compute approximations to large scale counting or discrete integration problems by using random hashes.
no code implementations • 22 May 2018 • Raj Kumar Maity, Ankit Singh Rawat, Arya Mazumdar
We, instead, propose to encode the second-moment of the data with a low density parity-check (LDPC) code.
no code implementations • 12 Apr 2018 • Sainyam Galhotra, Arya Mazumdar, Soumyabrata Pal, Barna Saha
Our next contribution is in using the connectivity of random annulus graphs to provide necessary and sufficient conditions for efficient recovery of communities for {\em the geometric block model} (GBM).
no code implementations • 12 Mar 2018 • Arya Mazumdar, Ankit Singh Rawat
Given a set of observation vectors $\mathbf{y}^i \in \mathbb{R}^d, i =1, 2, \dots , n$, we aim to recover the $d\times k$ matrix $A$ and the latent vectors $\{\mathbf{c}^i\} \subset \mathbb{R}^k$ under the model $\mathbf{y}^i = \mathrm{ReLU}(A\mathbf{c}^i +\mathbf{b})$, where $\mathbf{b}\in \mathbb{R}^d$ is a random bias.
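A minimal data-generation sketch of this observation model (dimensions and distributions chosen arbitrarily); it sets up the forward model only, not the recovery procedure:

```python
import numpy as np

rng = np.random.default_rng(1)

d, k, n = 20, 4, 1000
A = rng.standard_normal((d, k))            # unknown weight matrix to recover
b = rng.standard_normal(d)                 # random bias vector
C = rng.standard_normal((k, n))            # latent code vectors c^i

# Observations under the model y^i = ReLU(A c^i + b); recovering A and {c^i}
# from Y alone is the problem studied in the paper.
Y = np.maximum(A @ C + b[:, None], 0.0)
```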
no code implementations • NeurIPS 2017 • Arya Mazumdar, Soumyabrata Pal
In this paper, we show that a recently popular model of semi-supervised clustering is equivalent to locally encodable source coding.
no code implementations • 16 Sep 2017 • Sainyam Galhotra, Arya Mazumdar, Soumyabrata Pal, Barna Saha
To capture the inherent geometric features of many community detection problems, we propose to use a new random graph model of communities that we call a Geometric Block Model.
no code implementations • ICML 2017 • Shashanka Ubaru, Arya Mazumdar
In this work, we propose a novel approach based on group testing to solve such large multilabel classification problems with sparse label vectors.
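Schematically, a binary pooling matrix collapses the large label space into a small number of group labels, one binary classifier is trained per group, and a group-testing decoder recovers the sparse label set. The sketch below shows only the encode/decode reduction with a random pooling matrix (an illustrative choice):

```python
import numpy as np

rng = np.random.default_rng(2)

L, m = 1000, 60                       # number of labels, number of group tests
M = rng.random((m, L)) < 0.1          # illustrative random binary pooling matrix

def encode(z):
    """Collapse a sparse binary label vector z (length L) into m group labels:
    a group is positive if it contains any active label."""
    return (M.astype(int) @ z.astype(int)) > 0

def decode(t):
    """Recover candidate labels from group outcomes t: keep a label only if
    every group containing it is positive (standard group-testing decoder)."""
    return ~((M & ~t[:, None]).any(axis=0))

z = np.zeros(L, dtype=bool)
z[[3, 17, 256]] = True                       # a 3-sparse label vector
print(np.flatnonzero(decode(encode(z))))     # contains 3, 17, 256, plus possibly a few false positives
```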
no code implementations • NeurIPS 2017 • Arya Mazumdar, Barna Saha
A natural noisy model is where similarity values are drawn independently from some arbitrary probability distribution $f_+$ when the underlying pair of elements belong to the same cluster, and from some $f_-$ otherwise.
no code implementations • NeurIPS 2017 • Arya Mazumdar, Barna Saha
In this paper, we provide the first information theoretic lower bound on the number of queries for clustering with noisy oracle in both situations.
no code implementations • 3 Feb 2017 • Arya Mazumdar, Barna Saha
Entity resolution (ER) is the task of identifying all records in a database that refer to the same underlying entity, and are therefore duplicates of each other.
no code implementations • 28 Jan 2017 • Pan Li, Arya Mazumdar, Olgica Milenkovic
We propose a novel rank aggregation method based on converting permutations into their corresponding Lehmer codes or other subdiagonal images.
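A short sketch of the pipeline: encode each permutation as its Lehmer code, aggregate coordinate-wise, and decode back to a permutation. The coordinate-wise median used here is a schematic choice; the paper analyzes specific coordinate-wise estimators and their guarantees:

```python
import numpy as np

def lehmer_encode(perm):
    """Lehmer code: c[i] counts later entries smaller than perm[i]."""
    perm = list(perm)
    return [sum(perm[j] < perm[i] for j in range(i + 1, len(perm)))
            for i in range(len(perm))]

def lehmer_decode(code):
    """Invert the Lehmer code: pick the code[i]-th smallest unused value."""
    avail = sorted(range(len(code)))
    return [avail.pop(c) for c in code]

def aggregate(perms):
    """Aggregate rankings by a coordinate-wise median of Lehmer codes,
    then decode the aggregated code back into a permutation."""
    codes = np.array([lehmer_encode(p) for p in perms])
    median_code = np.median(codes, axis=0).astype(int)
    return lehmer_decode(median_code.tolist())

print(aggregate([[0, 1, 2, 3], [1, 0, 2, 3], [0, 1, 3, 2]]))  # [0, 1, 2, 3]
```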
no code implementations • 29 Nov 2016 • Arya Mazumdar, Ankit Singh Rawat
Designing an associative memory requires addressing two main tasks: 1) learning phase: given a dataset, learn a concise representation of the dataset in the form of a graphical model (or a neural network), 2) recall phase: given a noisy version of a message vector from the dataset, output the correct message vector via a neurally feasible algorithm over the network learnt during the learning phase.
no code implementations • 7 Apr 2016 • Arya Mazumdar, Barna Saha
A major contribution of this paper is to reduce the query complexity to linear or even sublinear in $n$ when mild side information is provided by a machine, even in the presence of crowd errors which are not correctable via resampling.
no code implementations • 30 Dec 2015 • Shashanka Ubaru, Arya Mazumdar, Yousef Saad
In this paper, we show how matrices from error correcting codes can be used to find such low rank approximations and matrix decompositions, and extend the framework to linear least squares regression problems.
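A loose sketch of the sketching pipeline, assuming a subsampled Hadamard (code) matrix as the sketching matrix; the paper's code-based constructions are more general than this illustrative choice:

```python
import numpy as np

def hadamard(n):
    """Sylvester construction of an n x n Hadamard matrix (n a power of two);
    its columns form a signed Hadamard code, used here as the sketching matrix."""
    H = np.array([[1.0]])
    while H.shape[0] < n:
        H = np.block([[H, H], [H, -H]])
    return H

def code_sketch_low_rank(A, s, rng=np.random.default_rng(0)):
    """Rank-s approximation of A via sketching with columns of a code matrix:
    form Y = A S, orthonormalize Y, and project A onto that subspace.
    A loose sketch of the code-matrix idea, not the paper's exact construction."""
    d = A.shape[1]
    n = 1 << (d - 1).bit_length()                   # next power of two >= d
    H = hadamard(n)[:d]                             # d x n signed code matrix
    S = H[:, rng.choice(n, size=s, replace=False)]  # subsample s code columns
    Q, _ = np.linalg.qr(A @ S)                      # orthonormal basis for the range
    return Q @ (Q.T @ A)                            # rank-<=s approximation of A

A = np.outer(np.arange(1, 7), np.arange(1, 5)).astype(float)   # rank-1 test matrix
print(np.linalg.norm(A - code_sketch_low_rank(A, s=2)))        # approximately 0
```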
no code implementations • NeurIPS 2015 • Arya Mazumdar, Ankit Singh Rawat
An associative memory is a structure learned from a dataset $\mathcal{M}$ of vectors (signals) in a way such that, given a noisy version of one of the vectors as input, the nearest valid vector from $\mathcal{M}$ (nearest neighbor) is provided as output, preferably via a fast iterative algorithm.