no code implementations • 16 Jun 2022 • Peter Bartlett, Piotr Indyk, Tal Wagner
Our techniques are general and provide generalization bounds for many other recently proposed data-driven algorithms in numerical linear algebra, covering both sketching-based and multigrid-based methods.
no code implementations • 9 Jun 2022 • Yi Zhang, Arturs Backurs, Sébastien Bubeck, Ronen Eldan, Suriya Gunasekar, Tal Wagner
We propose a synthetic task, LEGO (Learning Equality and Group Operations), that encapsulates the problem of following a chain of reasoning, and we study how the transformer architecture learns this task.
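For intuition, here is a toy generator for LEGO-style chains, assuming a clause format like `a=+1; b=-a; c=+b; ...` in which each variable is plus or minus the previous one; the paper's exact task format may differ:

```python
import random

def lego_chain(length, seed=0):
    """Generate a toy LEGO-style chain: the first variable is +1 or -1, and
    each later variable is plus or minus the previous one."""
    rng = random.Random(seed)
    names = [chr(ord('a') + i) for i in range(length)]  # illustration: length <= 26
    val = rng.choice([+1, -1])
    values = {names[0]: val}
    clauses = [f"{names[0]}={'+' if val > 0 else '-'}1"]
    for prev, cur in zip(names, names[1:]):
        sign = rng.choice([+1, -1])
        values[cur] = sign * values[prev]
        clauses.append(f"{cur}={'+' if sign > 0 else '-'}{prev}")
    return "; ".join(clauses), values

text, truth = lego_chain(5)
print(text)   # e.g. "a=+1; b=-a; c=+b; d=-c; e=+d"
print(truth)  # ground-truth values, recoverable only by following the chain
```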
no code implementations • ICLR 2022 • Justin Y. Chen, Talya Eden, Piotr Indyk, Honghao Lin, Shyam Narayanan, Ronitt Rubinfeld, Sandeep Silwal, Tal Wagner, David P. Woodruff, Michael Zhang
We propose data-driven one-pass streaming algorithms for estimating the number of triangles and four cycles, two fundamental problems in graph analytics that are widely studied in the graph data stream literature.
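As a point of reference, a one-pass counter that stores every edge is exact; sublinear-space streaming algorithms instead keep only a sample of the edges, and data-driven variants bias that sample using predictions. A minimal sketch of the exact baseline (not the paper's algorithm):

```python
from collections import defaultdict

def count_triangles_one_pass(edge_stream):
    """Exact one-pass triangle count: each arriving edge (u, v) closes one
    triangle per common neighbor of u and v seen so far. Stores all edges,
    which is exactly what streaming algorithms avoid."""
    adj = defaultdict(set)
    triangles = 0
    for u, v in edge_stream:
        triangles += len(adj[u] & adj[v])
        adj[u].add(v)
        adj[v].add(u)
    return triangles

print(count_triangles_one_pass([(1, 2), (2, 3), (1, 3), (3, 4), (2, 4)]))  # 2
```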
no code implementations • NeurIPS 2021 • Piotr Indyk, Tal Wagner, David Woodruff
Recently, data-driven and learning-based algorithms for low rank matrix approximation were shown to outperform classical data-oblivious algorithms by wide margins in terms of accuracy.
no code implementations • ICLR 2021 • Talya Eden, Piotr Indyk, Shyam Narayanan, Ronitt Rubinfeld, Sandeep Silwal, Tal Wagner
We consider the problem of estimating the number of distinct elements in a large data set (or, equivalently, the support size of the distribution induced by the data set) from a random sample of its elements.
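To see why this is hard, consider two classical baselines (not the paper's estimator): the plug-in estimate undercounts whenever rare elements are missed, and the Good-Turing statistic quantifies how much probability mass the sample never saw:

```python
from collections import Counter

def plugin_support_estimate(sample):
    """Plug-in estimate: distinct values seen in the sample. A lower bound on
    the true support size; it undercounts badly when many elements are rare."""
    return len(set(sample))

def good_turing_unseen_mass(sample):
    """Good-Turing estimate of the probability mass of never-sampled elements:
    the fraction of the sample made up of elements seen exactly once."""
    counts = Counter(sample)
    return sum(1 for c in counts.values() if c == 1) / len(sample)

sample = [1, 1, 2, 3, 3, 3, 4]
print(plugin_support_estimate(sample))   # 4
print(good_turing_unseen_mass(sample))   # 2/7: elements 2 and 4 are singletons
```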
no code implementations • 16 Feb 2021 • Arturs Backurs, Piotr Indyk, Cameron Musco, Tal Wagner
In particular, we consider estimating the sum of kernel matrix entries, along with its top eigenvalue and eigenvector.
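A sketch of both estimation targets using textbook tools, with a Gaussian kernel as a stand-in and uniform pair sampling plus power iteration; the paper's algorithms are more refined samplers with provable guarantees:

```python
import numpy as np

def gaussian_kernel(x, y, sigma=1.0):
    # Stand-in kernel; the paper covers broader kernel families.
    return np.exp(-np.sum((x - y) ** 2) / (2 * sigma ** 2))

def sampled_kernel_sum(X, num_samples=1000, seed=0):
    """Unbiased Monte Carlo estimate of sum_{i,j} k(x_i, x_j) from uniformly
    sampled index pairs; the variance can be large, which is what smarter
    sampling schemes address."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X)
    n = len(X)
    rows = rng.integers(n, size=num_samples)
    cols = rng.integers(n, size=num_samples)
    return n * n * np.mean([gaussian_kernel(X[i], X[j]) for i, j in zip(rows, cols)])

def top_eigenpair(K, iters=100):
    """Power iteration for the top eigenvalue/eigenvector of a PSD kernel matrix."""
    v = np.ones(K.shape[0]) / np.sqrt(K.shape[0])
    for _ in range(iters):
        v = K @ v
        v /= np.linalg.norm(v)
    return float(v @ K @ v), v
```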
1 code implementation • NeurIPS 2019 • Arturs Backurs, Piotr Indyk, Tal Wagner
We instantiate our framework with the Laplacian and Exponential kernels, two popular kernels which possess the aforementioned property.
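Under their standard definitions (the paper's parameterization may differ), the two kernels are:

```python
import numpy as np

def laplacian_kernel(x, y, sigma=1.0):
    # k(x, y) = exp(-||x - y||_1 / sigma)
    return np.exp(-np.abs(np.asarray(x) - np.asarray(y)).sum() / sigma)

def exponential_kernel(x, y, sigma=1.0):
    # k(x, y) = exp(-||x - y||_2 / sigma)
    return np.exp(-np.linalg.norm(np.asarray(x) - np.asarray(y)) / sigma)
```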
1 code implementation • ICML 2020 • Arturs Backurs, Yihe Dong, Piotr Indyk, Ilya Razenshteyn, Tal Wagner
Our extensive experiments, on real-world text and image datasets, show that Flowtree improves over various baselines and existing methods in either running time or accuracy.
no code implementations • 2 Jun 2019 • Piotr Indyk, Ali Vakilian, Tal Wagner, David Woodruff
Recent work by Bakshi and Woodruff (NeurIPS 2018) showed it is possible to compute a rank-$k$ approximation of a distance matrix in time $O((n+m)^{1+\gamma}) \cdot \mathrm{poly}(k, 1/\epsilon)$, where $\epsilon>0$ is an error parameter and $\gamma>0$ is an arbitrarily small constant.
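For contrast, the classical approach reads the entire matrix and computes the best rank-$k$ approximation via truncated SVD (Eckart-Young), which the sublinear-time algorithms above avoid:

```python
import numpy as np

def best_rank_k(A, k):
    """Best rank-k approximation via truncated SVD (Eckart-Young). Reads all
    of A, so it costs time polynomial in the full matrix size."""
    U, s, Vt = np.linalg.svd(np.asarray(A, float), full_matrices=False)
    return (U[:, :k] * s[:k]) @ Vt[:k]
```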
1 code implementation • 10 Feb 2019 • Arturs Backurs, Piotr Indyk, Krzysztof Onak, Baruch Schieber, Ali Vakilian, Tal Wagner
In the fair variant of $k$-median, the points are colored, and the goal is to minimize the same average distance objective while ensuring that all clusters have an "approximately equal" number of points of each color.
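A sketch of the objective together with one plausible formalization of "approximately equal" color balance; the paper's exact balance notion may differ:

```python
import numpy as np

def k_median_cost(points, centers):
    """Average distance of each point to its nearest center, plus the assignment."""
    points, centers = np.asarray(points, float), np.asarray(centers, float)
    d = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
    return d.min(axis=1).mean(), d.argmin(axis=1)

def approximately_balanced(assignment, colors, num_clusters, tol=0.2):
    """One plausible reading of 'approximately equal': within every cluster,
    each color's fraction is within `tol` of its global fraction."""
    colors = np.asarray(colors)
    assignment = np.asarray(assignment)
    global_frac = {c: float(np.mean(colors == c)) for c in np.unique(colors)}
    for j in range(num_clusters):
        members = colors[assignment == j]
        if members.size and any(abs(float(np.mean(members == c)) - f) > tol
                                for c, f in global_frac.items()):
            return False
    return True
```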
1 code implementation • ICLR 2020 • Yihe Dong, Piotr Indyk, Ilya Razenshteyn, Tal Wagner
Space partitions of $\mathbb{R}^d$ underlie a vast and important class of fast nearest neighbor search (NNS) algorithms.
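A toy example of a partition-based index, using a uniform grid as the space partition; real NNS partitions, including the learned ones studied here, are designed to reduce the chance that a query and its true neighbor land in different parts:

```python
from collections import defaultdict
import numpy as np

class GridPartition:
    """Toy partition-based NNS index: points are bucketed by grid cell, and a
    query is answered from candidates in its own cell only."""

    def __init__(self, points, cell=1.0):
        self.cell = cell
        self.points = np.asarray(points, float)
        self.buckets = defaultdict(list)
        for i, p in enumerate(self.points):
            self.buckets[tuple(np.floor(p / cell).astype(int))].append(i)

    def query(self, q):
        q = np.asarray(q, float)
        idx = self.buckets.get(tuple(np.floor(q / self.cell).astype(int)), [])
        if not idx:
            return None  # true neighbor fell in a different part: the failure mode
        cand = self.points[idx]
        return idx[int(np.argmin(np.linalg.norm(cand - q, axis=1)))]
```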
no code implementations • ICML 2018 • Tal Wagner, Sudipto Guha, Shiva Kasiviswanathan, Nina Mishra
We consider the problem of labeling points on a fast-moving data stream when only a small number of labeled examples are available.
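As a naive baseline (not the paper's method), one can label each arriving point by its nearest labeled example; on a fast-moving stream this static rule degrades as the distribution drifts, which is the difficulty being addressed:

```python
import numpy as np

def label_stream(stream, seed_points, seed_labels):
    """Label each arriving point with the label of its nearest seed example."""
    X = np.asarray(seed_points, float)
    y = list(seed_labels)
    for p in stream:
        p = np.asarray(p, float)
        yield y[int(np.argmin(np.linalg.norm(X - p, axis=1)))]

# Example: two labeled seeds, three streamed points.
print(list(label_stream([[0.1, 0], [0.9, 1], [0.4, 0.4]],
                        seed_points=[[0, 0], [1, 1]],
                        seed_labels=['a', 'b'])))  # ['a', 'b', 'a']
```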
no code implementations • NeurIPS 2017 • Piotr Indyk, Ilya Razenshteyn, Tal Wagner
We introduce a new distance-preserving compact representation of multi-dimensional point-sets.
no code implementations • NeurIPS 2017 • Noga Alon, Daniel Reichman, Igor Shinkar, Tal Wagner, Sebastian Musslick, Jonathan D. Cohen, Tom Griffiths, Biswadip Dey, Kayhan Ozcimder
A key feature of neural network architectures is their ability to support the simultaneous interaction among large numbers of units in the learning and processing of representations.
no code implementations • NeurIPS 2012 • Koby Crammer, Tal Wagner
We introduce a large-volume box classifier for binary prediction, which maintains a subset of weight vectors, and specifically axis-aligned boxes.
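A minimal sketch under an assumed prediction rule: given an axis-aligned box of weight vectors, one natural choice (hypothetical here, not necessarily the paper's rule) is to classify with the box's center:

```python
import numpy as np

def box_center_predict(x, lower, upper):
    """Classify with the center of an axis-aligned box [lower, upper] of weight
    vectors (one natural rule; an assumption, not necessarily the paper's)."""
    w = (np.asarray(lower, float) + np.asarray(upper, float)) / 2.0
    return 1 if float(w @ np.asarray(x, float)) >= 0 else -1

print(box_center_predict([1.0, -0.5], lower=[0.0, 0.0], upper=[2.0, 1.0]))  # 1
```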