3 code implementations • 28 May 2019 • Ari Kobren, Barna Saha, Andrew McCallum
Automatically matching reviewers to papers is a crucial step of the peer review process for venues receiving thousands of submissions.
Data Structures and Algorithms • Digital Libraries
no code implementations • 12 Apr 2018 • Sainyam Galhotra, Arya Mazumdar, Soumyabrata Pal, Barna Saha
Our next contribution is in using the connectivity of random annulus graphs to provide necessary and sufficient conditions for efficient recovery of communities for the geometric block model (GBM).
no code implementations • 16 Sep 2017 • Sainyam Galhotra, Arya Mazumdar, Soumyabrata Pal, Barna Saha
To capture the inherent geometric features of many community detection problems, we propose to use a new random graph model of communities that we call a Geometric Block Model.
no code implementations • NeurIPS 2017 • Arya Mazumdar, Barna Saha
A natural noisy model is where similarity values are drawn independently from some arbitrary probability distribution $f_+$ when the underlying pair of elements belong to the same cluster, and from some $f_-$ otherwise.
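The noisy model in this excerpt can be illustrated with a minimal sketch. It assumes the two distributions are passed in as sampling callables; the names `sample_similarities`, `draw_same`, and `draw_diff` are hypothetical and not taken from the paper, and the Gaussians in the usage lines are only placeholders for $f_+$ and $f_-$.

```python
import random


def sample_similarities(labels, draw_same, draw_diff, seed=0):
    """Draw one similarity value per unordered pair of elements.

    labels[i] is the (hidden) cluster of element i. draw_same stands in
    for f_+ and draw_diff for f_-; both are hypothetical callables that
    take a random.Random instance and return a float.
    """
    rng = random.Random(seed)
    n = len(labels)
    sims = {}
    for i in range(n):
        for j in range(i + 1, n):
            # Pairs in the same cluster are sampled from f_+, others from f_-.
            draw = draw_same if labels[i] == labels[j] else draw_diff
            sims[(i, j)] = draw(rng)
    return sims


# Placeholder choice of distributions: two Gaussians with different means.
labels = [0, 0, 1, 1]
sims = sample_similarities(
    labels,
    draw_same=lambda rng: rng.gauss(1.0, 0.5),
    draw_diff=lambda rng: rng.gauss(-1.0, 0.5),
)
```

Each pair is sampled independently, so a clustering algorithm in this model sees only `sims` and never `labels`.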
no code implementations • NeurIPS 2017 • Arya Mazumdar, Barna Saha
In this paper, we provide the first information-theoretic lower bound on the number of queries for clustering with a noisy oracle in both situations.
no code implementations • 3 Feb 2017 • Arya Mazumdar, Barna Saha
Entity resolution (ER) is the task of identifying all records in a database that refer to the same underlying entity, and are therefore duplicates of each other.
no code implementations • 7 Apr 2016 • Arya Mazumdar, Barna Saha
A major contribution of this paper is to reduce the query complexity to linear or even sublinear in $n$ when mild side information is provided by a machine, even in the presence of crowd errors that are not correctable via resampling.
1 code implementation • 14 Aug 2019 • Barna Saha, Sanjay Subramanian
An algorithm in this setting has access to an oracle with full knowledge of an optimal clustering, and the algorithm can ask the oracle queries of the form, "Does the optimal clustering put vertices $u$ and $v$ in the same cluster?"
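The query interface described in this excerpt might look like the following sketch. The class `SameClusterOracle` and the function `cluster_with_queries` are hypothetical names, and the clustering loop is a naive baseline that compares each vertex against one representative per cluster; it is not the algorithm from the paper.

```python
class SameClusterOracle:
    """Answers same-cluster queries against a fixed reference clustering."""

    def __init__(self, labels):
        self._labels = list(labels)  # hidden optimal clustering (assumed given)
        self.num_queries = 0

    def same_cluster(self, u, v):
        """Does the reference clustering put u and v in the same cluster?"""
        self.num_queries += 1
        return self._labels[u] == self._labels[v]


def cluster_with_queries(n, oracle):
    """Naive baseline: compare each vertex with one representative per cluster."""
    clusters = []
    for v in range(n):
        for cluster in clusters:
            if oracle.same_cluster(cluster[0], v):
                cluster.append(v)
                break
        else:
            # No existing cluster matched, so v starts a new one.
            clusters.append([v])
    return clusters


oracle = SameClusterOracle([0, 1, 0, 2, 1])
clusters = cluster_with_queries(5, oracle)
```

With $n$ vertices and $k$ clusters this baseline asks at most $nk$ queries, which is the kind of cost that work in this setting tries to reduce.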
no code implementations • 10 Feb 2020 • Saba Ahmadi, Sainyam Galhotra, Barna Saha, Roy Schwartz
We consider two variations of fairness constraint for the problem of correlation clustering where each node has a color, and the goal is to form clusters that do not over-represent vertices of any color.
no code implementations • 24 Jul 2020 • Tomasz Kociumaka, Barna Saha
For the gap edit distance problem, we give a greedy algorithm that distinguishes in time $\tilde{O}(\frac{n}{k}+k^2)$ between length-$n$ input strings with edit distance at most $k$ and those with edit distance more than $4k^2$.
Data Structures and Algorithms
no code implementations • 12 May 2021 • Raghavendra Addanki, Sainyam Galhotra, Barna Saha
Metric-based comparison operations such as finding the maximum, the nearest neighbor, and the farthest neighbor are fundamental to studying various clustering techniques such as $k$-center clustering and agglomerative hierarchical clustering.
no code implementations • 22 Jun 2022 • Sainyam Galhotra, Arya Mazumdar, Soumyabrata Pal, Barna Saha
We show that a simple triangle-counting algorithm to detect communities in the geometric block model is near-optimal.
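As a rough illustration of the triangle-counting idea, the sketch below counts the common neighbors of each edge's endpoints, keeps only edges that close enough triangles, and reports the connected components of what remains. The single cutoff `threshold` and the function names are assumptions made for this example; the paper derives its own decision rule from the model parameters, and that rule is not reproduced here.

```python
def common_neighbor_counts(adj):
    """Map each edge (u, v) with u < v to the number of triangles it lies in."""
    counts = {}
    for u in adj:
        for v in adj[u]:
            if u < v:
                counts[(u, v)] = len(adj[u] & adj[v])
    return counts


def communities_by_triangles(adj, threshold):
    """Keep edges in at least `threshold` triangles; return the components."""
    kept = {u: set() for u in adj}
    for (u, v), count in common_neighbor_counts(adj).items():
        if count >= threshold:
            kept[u].add(v)
            kept[v].add(u)

    seen, components = set(), []
    for start in sorted(adj):
        if start in seen:
            continue
        # Depth-first search over the surviving edges.
        stack, component = [start], []
        seen.add(start)
        while stack:
            x = stack.pop()
            component.append(x)
            for y in kept[x]:
                if y not in seen:
                    seen.add(y)
                    stack.append(y)
        components.append(sorted(component))
    return components


# Toy graph: two triangles joined by the single edge (2, 3).
adj = {
    0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3},
    3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4},
}
communities = communities_by_triangles(adj, threshold=1)
```

In the toy graph the bridging edge lies in no triangle, so it is dropped and the two triangles come out as separate communities.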
no code implementations • 12 Feb 2024 • Barna Saha, Christopher Ye
This leads to the main question of our work: Is FlashAttention I/O optimal for all values of $M$?