Search Results for author: Sanjoy Dasgupta

Found 36 papers, 5 papers with code

Explainable k-Means and k-Medians Clustering

no code implementations ICML 2020 Michal Moshkovitz, Sanjoy Dasgupta, Cyrus Rashtchian, Nave Frost

In terms of negative results, we show that popular top-down decision tree algorithms may lead to clusterings with arbitrarily large cost, and we prove that any explainable clustering must incur an $\Omega(\log k)$ approximation compared to the optimal clustering.

Clustering

Online nearest neighbor classification

no code implementations 3 Jul 2023 Sanjoy Dasgupta, Geelon So

We study an instance of online non-parametric classification in the realizable setting.

Classification

Active learning using region-based sampling

no code implementations 5 Mar 2023 Sanjoy Dasgupta, Yoav Freund

We present a general-purpose active learning scheme for data in metric spaces.

Active Learning

Data-Copying in Generative Models: A Formal Framework

no code implementations 25 Feb 2023 Robi Bhattacharjee, Sanjoy Dasgupta, Kamalika Chaudhuri

There has been some recent interest in detecting and addressing memorization of training data by deep neural networks.

Memorization

Streaming Encoding Algorithms for Scalable Hyperdimensional Computing

no code implementations 20 Sep 2022 Anthony Thomas, Behnam Khaleghi, Gopi Krishna Jha, Sanjoy Dasgupta, Nageen Himayat, Ravi Iyer, Nilesh Jain, Tajana Rosing

Hyperdimensional computing (HDC) is a paradigm for data representation and learning originating in computational neuroscience.

Convergence of online $k$-means

no code implementations 22 Feb 2022 Sanjoy Dasgupta, Gaurav Mahajan, Geelon So

We prove asymptotic convergence for a general class of $k$-means algorithms performed over streaming data from a distribution: the centers asymptotically converge to the set of stationary points of the $k$-means cost function.
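
For concreteness, the sketch below is standard online $k$-means (MacQueen-style updates), presumably one instance of the general class considered; the seeding from the first $k$ points and the $1/\text{count}$ step size are illustrative choices, not the paper's full generality.

```python
import numpy as np

def online_kmeans(stream, k, d):
    """Minimal online k-means sketch: seed centers from the first k points,
    then move the nearest center toward each new point with a 1/count step."""
    centers = np.empty((k, d))
    counts = np.zeros(k)
    for t, x in enumerate(stream):
        x = np.asarray(x, dtype=float)
        if t < k:                                   # the first k points become the initial centers
            centers[t], counts[t] = x, 1
            continue
        j = int(np.argmin(((centers - x) ** 2).sum(axis=1)))   # nearest current center
        counts[j] += 1
        centers[j] += (x - centers[j]) / counts[j]              # decaying per-center step size
    return centers

# toy usage: a stream drawn from three well-separated 2-D Gaussians
rng = np.random.default_rng(1)
means = rng.choice([-5.0, 0.0, 5.0], size=10000)
data = np.stack([rng.normal(means, 0.5), rng.normal(means, 0.5)], axis=1)
print(online_kmeans(data, k=3, d=2))
```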

Framework for Evaluating Faithfulness of Local Explanations

no code implementations 1 Feb 2022 Sanjoy Dasgupta, Nave Frost, Michal Moshkovitz

We study the faithfulness of an explanation system to the underlying prediction model.

Algorithmic insights on continual learning from fruit flies

1 code implementation 15 Jul 2021 Yang Shen, Sanjoy Dasgupta, Saket Navlakha

We discovered a two-layer neural circuit in the fruit fly olfactory system that addresses this challenge by uniquely combining sparse coding and associative learning.

Continual Learning
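
A rough sketch of those two ingredients in isolation: a random expand-and-winner-take-all sparse code followed by a per-class associative weight update. The dimensions, sparsity level, and update rule below are illustrative stand-ins, not the circuit model from the paper.

```python
import numpy as np

def sparse_code(x, proj, top_k):
    """Expand x with a random projection, then keep only the top_k largest
    responses (winner-take-all), giving a sparse binary code."""
    h = proj @ x
    code = np.zeros_like(h)
    code[np.argsort(h)[-top_k:]] = 1.0
    return code

class AssociativeMemory:
    """Per-class weights that are strengthened only on the active (sparse) units,
    so earlier classes are largely left untouched -- an illustrative sketch."""
    def __init__(self, n_classes, dim, lr=0.1):
        self.W = np.zeros((n_classes, dim))
        self.lr = lr

    def update(self, code, label):
        self.W[label] += self.lr * code        # Hebbian-style strengthening

    def predict(self, code):
        return int(np.argmax(self.W @ code))

# toy usage with hypothetical dimensions
rng = np.random.default_rng(0)
proj = rng.standard_normal((2000, 50))          # expand 50 -> 2000 dimensions
mem = AssociativeMemory(n_classes=10, dim=2000)
x, y = rng.standard_normal(50), 3
mem.update(sparse_code(x, proj, top_k=100), y)
print(mem.predict(sparse_code(x, proj, top_k=100)))
```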

Online $k$-means Clustering on Arbitrary Data Streams

no code implementations 18 Feb 2021 Robi Bhattacharjee, Jacob Imola, Michal Moshkovitz, Sanjoy Dasgupta

We propose a data parameter, $\Lambda(X)$, such that for any algorithm maintaining $O(k\,\text{poly}(\log n))$ centers at time $n$, there exists a data stream $X$ for which a loss of $\Omega(\Lambda(X))$ is inevitable.

Clustering

A Theoretical Perspective on Hyperdimensional Computing

no code implementations 14 Oct 2020 Anthony Thomas, Sanjoy Dasgupta, Tajana Rosing

Hyperdimensional (HD) computing is a set of neurally inspired methods for obtaining high-dimensional, low-precision, distributed representations of data.
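
For concreteness, here is a small sketch of one common encoding scheme in this family: random bipolar codes per symbol, bundled into a single high-dimensional, low-precision representation by majority vote. The dimension and the particular operations are illustrative, not the specific setting analyzed in the paper.

```python
import numpy as np

D = 10_000                      # hyperdimensional code length (illustrative)
rng = np.random.default_rng(0)

def random_hv():
    """A random bipolar (+1/-1) hypervector."""
    return rng.choice([-1, 1], size=D)

def bundle(hvs):
    """Superpose several hypervectors by elementwise majority (sign of the sum)."""
    s = np.sign(np.sum(hvs, axis=0))
    s[s == 0] = 1
    return s

def similarity(a, b):
    """Normalized dot product; near 0 for unrelated random hypervectors."""
    return float(a @ b) / D

# encode a set of symbolic features as the bundle of their codes
codebook = {f: random_hv() for f in ["red", "round", "small", "heavy"]}
apple = bundle([codebook["red"], codebook["round"], codebook["small"]])
print(similarity(apple, codebook["red"]), similarity(apple, codebook["heavy"]))
```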

Expressivity of expand-and-sparsify representations

no code implementations 5 Jun 2020 Sanjoy Dasgupta, Christopher Tosh

The linear functions can be specified explicitly and are easy to learn, and we give bounds on how large $m$ needs to be as a function of the input dimension $d$ and the smoothness of the target function.
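
A minimal sketch of the representation in question, assuming a Gaussian random expansion from $d$ to $m$ dimensions followed by top-$k$ sparsification, with a ridge-regularized linear fit on the sparse code; the specific map, sparsity level, and regularizer are illustrative choices.

```python
import numpy as np

def expand_and_sparsify(X, proj, top_k):
    """Random expansion followed by winner-take-all sparsification."""
    H = X @ proj.T                                   # (n, m) expanded responses
    Z = np.zeros_like(H)
    idx = np.argsort(H, axis=1)[:, -top_k:]          # indices of the top_k responses per row
    np.put_along_axis(Z, idx, 1.0, axis=1)
    return Z

rng = np.random.default_rng(0)
d, m, top_k, n = 5, 2000, 40, 500
proj = rng.standard_normal((m, d))

X = rng.uniform(-1, 1, size=(n, d))
y = np.sin(3 * X[:, 0]) + 0.5 * X[:, 1]              # a smooth target function

Z = expand_and_sparsify(X, proj, top_k)
w = np.linalg.solve(Z.T @ Z + 1e-3 * np.eye(m), Z.T @ y)   # ridge-regularized linear fit
print(np.mean((Z @ w - y) ** 2))                     # training error of the linear fit
```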

A Non-Parametric Test to Detect Data-Copying in Generative Models

1 code implementation 12 Apr 2020 Casey Meehan, Kamalika Chaudhuri, Sanjoy Dasgupta

Detecting overfitting in generative models is an important challenge in machine learning.

BIG-bench Machine Learning

Robust Learning from Discriminative Feature Feedback

no code implementations 9 Mar 2020 Sanjoy Dasgupta, Sivan Sabato

We show how such errors can be handled algorithmically, in both an adversarial and a stochastic setting.

Explainable $k$-Means and $k$-Medians Clustering

3 code implementations 28 Feb 2020 Sanjoy Dasgupta, Nave Frost, Michal Moshkovitz, Cyrus Rashtchian

In terms of negative results, we show, first, that popular top-down decision tree algorithms may lead to clusterings with arbitrarily large cost, and second, that any tree-induced clustering must in general incur an $\Omega(\log k)$ approximation factor compared to the optimal clustering.

Clustering
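
On the positive side, the explainable clusterings studied here are small threshold trees with one leaf per cluster, built on top of reference centers. The sketch below builds such a tree with a simple greedy axis-aligned split that separates the centers; it illustrates the tree-induced clustering format, not the paper's algorithm or its approximation guarantee.

```python
import numpy as np

def build_tree(centers, ids):
    """Recursively separate reference centers with axis-aligned threshold cuts
    until each leaf holds one center (a greedy illustrative rule, not the paper's)."""
    if len(ids) == 1:
        return ("leaf", ids[0])
    C = centers[ids]
    feat = int(np.argmax(C.max(axis=0) - C.min(axis=0)))   # coordinate with the widest spread
    vals = np.sort(C[:, feat])
    g = int(np.argmax(np.diff(vals)))                      # split at the largest gap
    thresh = 0.5 * (vals[g] + vals[g + 1])
    left = [i for i in ids if centers[i, feat] <= thresh]
    right = [i for i in ids if centers[i, feat] > thresh]
    return ("split", feat, thresh, build_tree(centers, left), build_tree(centers, right))

def assign(tree, x):
    """Route a point down the threshold tree to its cluster label."""
    while tree[0] == "split":
        _, feat, thresh, left, right = tree
        tree = left if x[feat] <= thresh else right
    return tree[1]

# usage: the reference centers would normally come from an ordinary k-means run
centers = np.array([[0.0, 0.0], [5.0, 0.0], [0.0, 5.0], [5.0, 5.0]])
tree = build_tree(centers, list(range(len(centers))))
print(assign(tree, np.array([4.6, 0.3])))    # routed to the leaf for center 1
```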

Interactive Topic Modeling with Anchor Words

no code implementations 18 Jun 2019 Sanjoy Dasgupta, Stefanos Poulis, Christopher Tosh

The formalism of anchor words has enabled the development of fast topic modeling algorithms with provable guarantees.

Topic Models

An adaptive nearest neighbor rule for classification

1 code implementation NeurIPS 2019 Akshay Balsubramani, Sanjoy Dasgupta, Yoav Freund, Shay Moran

We introduce a variant of the $k$-nearest neighbor classifier in which $k$ is chosen adaptively for each query, rather than supplied as a parameter.

Classification, General Classification +1
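
A rough sketch of the query-adaptive idea: instead of fixing $k$, grow the neighborhood around the query until one label dominates by a confidence margin that shrinks like $1/\sqrt{k}$. The specific stopping rule below is an illustrative heuristic for binary labels, not the admissibility condition analyzed in the paper.

```python
import numpy as np

def adaptive_nn_predict(X, y, query, margin_const=1.0, k_max=None):
    """Grow k until the leading label's vote share beats 1/2 plus a
    1/sqrt(k)-style margin (illustrative stopping rule, binary labels in {0, 1})."""
    n = len(X)
    k_max = k_max or n
    order = np.argsort(((X - query) ** 2).sum(axis=1))    # neighbors sorted by distance
    for k in range(1, k_max + 1):
        share = y[order[:k]].mean()                       # fraction of label-1 votes
        if abs(share - 0.5) > margin_const / np.sqrt(k):
            return int(share > 0.5), k                    # confident prediction at this k
    return int(y[order[:k_max]].mean() > 0.5), k_max      # fall back to k_max-NN

# toy usage: two 1-D clusters with labels 0 and 1
rng = np.random.default_rng(0)
X = np.concatenate([rng.normal(-2, 1, size=(200, 1)), rng.normal(2, 1, size=(200, 1))])
y = np.array([0] * 200 + [1] * 200)
print(adaptive_nn_predict(X, y, np.array([1.5])))         # (predicted label, chosen k)
```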

What relations are reliably embeddable in Euclidean space?

no code implementations 13 Mar 2019 Robi Bhattacharjee, Sanjoy Dasgupta

We consider the problem of embedding a relation, represented as a directed graph, into Euclidean space.

Knowledge Graphs, Relation

Learning from discriminative feature feedback

no code implementations NeurIPS 2018 Sanjoy Dasgupta, Akansha Dey, Nicholas Roberts, Sivan Sabato

We consider the problem of learning a multi-class classifier from labels as well as simple explanations that we call "discriminative features".

Interactive Structure Learning with Structural Query-by-Committee

no code implementations NeurIPS 2018 Christopher Tosh, Sanjoy Dasgupta

In this work, we introduce interactive structure learning, a framework that unifies many different interactive learning tasks.

Active Learning

Structural query-by-committee

no code implementations 17 Mar 2018 Christopher Tosh, Sanjoy Dasgupta

In this work, we describe a framework that unifies many different interactive learning tasks.

Active Learning

Comparison Based Learning from Weak Oracles

no code implementations 20 Feb 2018 Ehsan Kazemi, Lin Chen, Sanjoy Dasgupta, Amin Karbasi

More specifically, we aim to devise efficient algorithms that locate a target object in a database equipped with a dissimilarity metric via invocation of the weak comparison oracle.

Learning from partial correction

no code implementations 23 May 2017 Sanjoy Dasgupta, Michael Luby

We introduce a new model of interactive learning in which an expert examines the predictions of a learner and partially fixes them if they are wrong.

Generalization Bounds

Diameter-Based Active Learning

no code implementations ICML 2017 Christopher Tosh, Sanjoy Dasgupta

To date, the tightest upper and lower-bounds for the active learning of general concept classes have been in terms of a parameter of the learning problem called the splitting index.

Active Learning

An algorithm for L1 nearest neighbor search via monotonic embedding

no code implementations NeurIPS 2016 Xinan Wang, Sanjoy Dasgupta

Fast algorithms for nearest neighbor (NN) search have in large part focused on L2 distance.

Interactive Bayesian Hierarchical Clustering

no code implementations 10 Feb 2016 Sharad Vikram, Sanjoy Dasgupta

Clustering is a powerful tool in data analysis, but it is often difficult to find a grouping that aligns with a user's needs.

Clustering

A cost function for similarity-based hierarchical clustering

no code implementations 16 Oct 2015 Sanjoy Dasgupta

The development of algorithms for hierarchical clustering has been hampered by a shortage of precise objective functions.

Clustering
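
The cost function proposed here charges each pair of points $i, j$ their similarity $w_{ij}$ times the number of leaves under the subtree rooted at their lowest common ancestor, so a good tree merges similar pairs low down. A small sketch of evaluating that cost for a binary tree given as nested tuples:

```python
import numpy as np

def dasgupta_cost(tree, W):
    """Cost of a hierarchical clustering: sum over internal nodes of
    (number of leaves below the node) * (total similarity crossing the split)."""
    def walk(node):
        if isinstance(node, int):            # a leaf is just the point's index
            return [node], 0.0
        left, right = node
        L, cost_l = walk(left)
        R, cost_r = walk(right)
        cross = sum(W[i, j] for i in L for j in R)
        return L + R, cost_l + cost_r + (len(L) + len(R)) * cross
    return walk(tree)[1]

# usage: 4 points, pairwise similarities, two candidate binary trees
W = np.array([[0, 9, 1, 1],
              [9, 0, 1, 1],
              [1, 1, 0, 9],
              [1, 1, 9, 0]], dtype=float)
good = ((0, 1), (2, 3))          # merges the similar pairs first -> lower cost
bad = ((0, 2), (1, 3))
print(dasgupta_cost(good, W), dasgupta_cost(bad, W))
```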

The Fast Convergence of Incremental PCA

no code implementations NeurIPS 2013 Akshay Balsubramani, Sanjoy Dasgupta, Yoav Freund

We consider a situation in which we see samples in $\mathbb{R}^d$ drawn i.i.d.
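
The incremental estimators in this line of work update a candidate direction one sample at a time. Below is a minimal Oja-style sketch, assuming zero-mean samples and a $c/(t+1)$ step size; both choices are illustrative rather than the exact schedule analyzed in the paper.

```python
import numpy as np

def incremental_top_eigvec(samples, d, c=1.0):
    """Estimate the top principal direction from a stream of zero-mean samples
    using an Oja-style update with step size c / (t + 1)."""
    rng = np.random.default_rng(0)
    w = rng.standard_normal(d)
    w /= np.linalg.norm(w)
    for t, x in enumerate(samples):
        eta = c / (t + 1)
        w += eta * x * (x @ w)          # move toward the sample's direction
        w /= np.linalg.norm(w)          # keep the iterate on the unit sphere
    return w

# toy usage: covariance with a dominant direction along (1, 1)/sqrt(2)
rng = np.random.default_rng(1)
A = np.array([[2.0, 1.5], [1.5, 2.0]])
samples = rng.multivariate_normal(np.zeros(2), A, size=20000)
print(incremental_top_eigvec(samples, d=2))
```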

Optimal rates for k-NN density and mode estimation

no code implementations NeurIPS 2014 Sanjoy Dasgupta, Samory Kpotufe

We present two related contributions of independent interest: (1) high-probability finite sample rates for $k$-NN density estimation, and (2) practical mode estimators -- based on $k$-NN -- which attain minimax-optimal rates under surprisingly general distributional conditions.

Density Estimation
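
The $k$-NN density estimate behind both contributions has a closed form: $\hat f_k(x) = k / (n \, v_d \, r_k(x)^d)$, where $r_k(x)$ is the distance from $x$ to its $k$-th nearest sample point and $v_d$ is the volume of the unit ball in $\mathbb{R}^d$. A small sketch; the naive argmax mode estimator at the end is only illustrative, the paper's estimators are more refined.

```python
import numpy as np
from math import gamma, pi

def knn_density(X, queries, k):
    """k-NN density estimate: f_hat(x) = k / (n * v_d * r_k(x)^d)."""
    n, d = X.shape
    v_d = pi ** (d / 2) / gamma(d / 2 + 1)                 # volume of the unit ball in R^d
    dists = np.linalg.norm(queries[:, None, :] - X[None, :, :], axis=2)
    r_k = np.sort(dists, axis=1)[:, k - 1]                 # distance to the k-th nearest sample
    return k / (n * v_d * r_k ** d)                        # (a query that is itself a sample counts as its own neighbor)

# usage: density at the sample points themselves; the argmax is a naive mode estimate
rng = np.random.default_rng(0)
X = rng.normal(0, 1, size=(1000, 2))
f_hat = knn_density(X, X, k=25)
print(X[np.argmax(f_hat)])          # should land near the origin
```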

Rates of Convergence for Nearest Neighbor Classification

no code implementations NeurIPS 2014 Kamalika Chaudhuri, Sanjoy Dasgupta

Nearest neighbor methods are a popular class of nonparametric estimators with several desirable properties, such as adaptivity to different distance scales in different regions of space.

Classification, General Classification

Incremental Clustering: The Case for Extra Clusters

no code implementations NeurIPS 2014 Margareta Ackerman, Sanjoy Dasgupta

The explosion in the amount of data available for analysis often necessitates a transition from batch to incremental clustering methods, which process one element at a time and typically store only a small subset of the data.

Clustering

Consistent procedures for cluster tree estimation and pruning

no code implementations 5 Jun 2014 Kamalika Chaudhuri, Sanjoy Dasgupta, Samory Kpotufe, Ulrike von Luxburg

For a density $f$ on $\mathbb{R}^d$, a high-density cluster is any connected component of $\{x: f(x) \geq \lambda\}$, for some $\lambda > 0$.

Clustering
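
A rough empirical analogue of this definition, loosely in the spirit of the single-linkage-style procedures studied here: keep the sample points whose estimated density is at least $\lambda$ and take connected components of a radius-$r$ graph on them. The density proxy, $\lambda$, and $r$ in the usage below are illustrative choices.

```python
import numpy as np

def level_set_clusters(X, f_hat, lam, r):
    """Connected components of {x_i : f_hat(x_i) >= lam} under a radius-r graph --
    an empirical stand-in for the high-density clusters of the underlying density."""
    keep = np.where(f_hat >= lam)[0]
    labels = {i: None for i in keep}
    comp = 0
    for seed in keep:
        if labels[seed] is not None:
            continue
        stack = [seed]                          # flood-fill one component
        labels[seed] = comp
        while stack:
            i = stack.pop()
            for j in keep:
                if labels[j] is None and np.linalg.norm(X[i] - X[j]) <= r:
                    labels[j] = comp
                    stack.append(j)
        comp += 1
    return labels                                # point index -> component id

# usage: two separated blobs; the density proxy is the negative distance to the
# 10th nearest neighbor (a monotone stand-in for a proper density estimate)
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-3, 0.5, size=(100, 2)), rng.normal(3, 0.5, size=(100, 2))])
d10 = np.sort(np.linalg.norm(X[:, None] - X[None, :], axis=2), axis=1)[:, 10]
labels = level_set_clusters(X, -d10, lam=np.quantile(-d10, 0.3), r=1.0)
print(len(set(labels.values())))                # expect 2 components
```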

Moment-based Uniform Deviation Bounds for k-means and Friends

no code implementations NeurIPS 2013 Matus J. Telgarsky, Sanjoy Dasgupta

Suppose $k$ centers are fit to $m$ points by heuristically minimizing the $k$-means cost; what is the corresponding fit over the source distribution?

Clustering

Moment-based Uniform Deviation Bounds for $k$-means and Friends

1 code implementation 8 Nov 2013 Matus Telgarsky, Sanjoy Dasgupta

Suppose $k$ centers are fit to $m$ points by heuristically minimizing the $k$-means cost; what is the corresponding fit over the source distribution?

Clustering
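
The question in both versions of this paper concerns the gap between the empirical $k$-means cost and the cost over the source distribution. A tiny sketch that measures that gap directly on a held-out sample; the paper's contribution is the uniform deviation bound, not this estimator, and the Lloyd's-heuristic fit below is only used to produce some centers.

```python
import numpy as np

def kmeans_cost(X, centers):
    """Average squared distance from each point to its nearest center."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d2.min(axis=1).mean()

def lloyd(X, k, iters=50, seed=0):
    """Plain Lloyd's heuristic, used here only to produce fitted centers."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        assign = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2).argmin(axis=1)
        centers = np.array([X[assign == j].mean(axis=0) if np.any(assign == j)
                            else centers[j] for j in range(k)])
    return centers

rng = np.random.default_rng(1)
source = lambda m: rng.normal(0, 1, size=(m, 5))
train, heldout = source(200), source(100000)
centers = lloyd(train, k=10)
print(kmeans_cost(train, centers), kmeans_cost(heldout, centers))   # empirical vs. (approximate) population cost
```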

Rates of convergence for the cluster tree

no code implementations NeurIPS 2010 Kamalika Chaudhuri, Sanjoy Dasgupta

For a density $f$ on $\mathbb{R}^d$, a high-density cluster is any connected component of $\{x: f(x) \geq c\}$, for some $c > 0$.

A learning framework for nearest neighbor search

no code implementations NeurIPS 2007 Lawrence Cayton, Sanjoy Dasgupta

Can we leverage learning techniques to build a fast nearest-neighbor (NN) retrieval data structure?

Retrieval

Learning the structure of manifolds using random projections

no code implementations NeurIPS 2007 Yoav Freund, Sanjoy Dasgupta, Mayank Kabra, Nakul Verma

We present a simple variant of the k-d tree which automatically adapts to intrinsic low dimensional structure in data.
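
One way such a variant can work is to split along a random direction rather than a coordinate axis. The sketch below shows only that basic recursion; the median split, stopping rule, and toy data are illustrative choices and omit the additional split types used in the paper.

```python
import numpy as np

def build_rp_tree(X, idx, min_leaf, rng):
    """Recursively split the points along a random direction at the median
    projection -- the basic split of a random-projection tree (simplified)."""
    if len(idx) <= min_leaf:
        return ("leaf", idx)
    u = rng.standard_normal(X.shape[1])
    u /= np.linalg.norm(u)                       # random unit direction
    proj = X[idx] @ u
    thresh = np.median(proj)
    left = idx[proj <= thresh]
    right = idx[proj > thresh]
    if len(left) == 0 or len(right) == 0:        # degenerate split; stop here
        return ("leaf", idx)
    return ("node", u, thresh,
            build_rp_tree(X, left, min_leaf, rng),
            build_rp_tree(X, right, min_leaf, rng))

# usage: data near a 1-D curve embedded in 10 dimensions
rng = np.random.default_rng(0)
t = rng.uniform(0, 1, size=2000)
X = np.stack([np.sin(2 * np.pi * t), np.cos(2 * np.pi * t)]
             + [0.01 * rng.standard_normal(2000) for _ in range(8)], axis=1)
tree = build_rp_tree(X, np.arange(len(X)), min_leaf=50, rng=rng)
```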
