no code implementations • ICML 2020 • Benjamin Coleman, Anshumali Shrivastava, Richard Baraniuk
We present the first sublinear memory sketch that can be queried to find the nearest neighbors in a dataset.
no code implementations • 15 Feb 2024 • Noveen Sachdeva, Benjamin Coleman, Wang-Cheng Kang, Jianmo Ni, Lichan Hong, Ed H. Chi, James Caverlee, Julian McAuley, Derek Zhiyuan Cheng
The training of large language models (LLMs) is expensive.
no code implementations • 22 Nov 2023 • Shabnam Daghaghi, Benjamin Coleman, Benito Geordie, Anshumali Shrivastava
To address this problem, we propose a novel sampling distribution based on nonparametric kernel regression that learns an effective importance score as the neural network trains.
1 code implementation • 29 Aug 2023 • Gaurav Gupta, Jonah Yi, Benjamin Coleman, Chen Luo, Vihan Lakshman, Anshumali Shrivastava
With the surging popularity of approximate near-neighbor search (ANNS), driven by advances in neural representation learning, the ability to serve queries accompanied by a set of constraints has become an area of intense interest.
no code implementations • 26 May 2023 • Benjamin Coleman, David Torres Ramos, Vihan Lakshman, Chen Luo, Anshumali Shrivastava
Lookup tables are a fundamental structure in many data processing and systems applications.
no code implementations • NeurIPS 2023 • Benjamin Coleman, Wang-Cheng Kang, Matthew Fahrbach, Ruoxi Wang, Lichan Hong, Ed H. Chi, Derek Zhiyuan Cheng
Learning high-quality feature embeddings efficiently and effectively is critical for the performance of web-scale machine learning systems.
2 code implementations • 30 Mar 2023 • Nicholas Meisburger, Vihan Lakshman, Benito Geordie, Joshua Engels, David Torres Ramos, Pratik Pranav, Benjamin Coleman, Benjamin Meisburger, Shubh Gupta, Yashwanth Adunukota, Tharun Medini, Anshumali Shrivastava
Efficient large-scale neural network training and inference on commodity CPU hardware is of immense practical significance in democratizing deep learning (DL) capabilities.
Ranked #2 on Node Classification on Yelp-Fraud
no code implementations • 29 Sep 2021 • Gaurav Gupta, Benjamin Coleman, John Chen, Anshumali Shrivastava
To this end, we propose STORM, an online sketching-based method for empirical risk minimization.
no code implementations • 21 Jun 2021 • Zichang Liu, Benjamin Coleman, Anshumali Shrivastava
Large machine learning models achieve unprecedented performance on various tasks and have become the go-to technique.
no code implementations • 24 Feb 2021 • Aditya Desai, Benjamin Coleman, Anshumali Shrivastava
We introduce Density sketches (DS): a succinct online summary of the data distribution.
no code implementations • 21 Jul 2020 • Louis Abraham, Gary Bécigneul, Benjamin Coleman, Bernhard Schölkopf, Anshumali Shrivastava, Alexander Smola
Group testing is a well-studied problem with several appealing solutions, but recent biological studies impose practical constraints for COVID-19 that are incompatible with traditional methods.
no code implementations • 25 Jun 2020 • Benjamin Coleman, Gaurav Gupta, John Chen, Anshumali Shrivastava
To this end, we propose STORM, an online sketch for empirical risk minimization.
no code implementations • 16 Jun 2020 • Benjamin Coleman, Anshumali Shrivastava
Existing methods for differentially private (DP) kernel density estimation scale poorly, often becoming exponentially slower as the number of dimensions increases.
no code implementations • 4 Dec 2019 • Benjamin Coleman, Anshumali Shrivastava
We evaluate our method on real-world high-dimensional datasets and show that our sketch achieves 10x better compression than competing methods.
1 code implementation • 10 Oct 2019 • Gaurav Gupta, Minghao Yan, Benjamin Coleman, Bryce Kille, R. A. Leo Elworth, Tharun Medini, Todd Treangen, Anshumali Shrivastava
Interestingly, it is a count-min sketch type arrangement of a membership testing utility (Bloom Filter in our case).
1 code implementation • 7 Oct 2019 • Gaurav Gupta, Benjamin Coleman, Tharun Medini, Vijai Mohan, Anshumali Shrivastava
A simple array of Bloom Filters can achieve that.
no code implementations • 18 Feb 2019 • Benjamin Coleman, Richard G. Baraniuk, Anshumali Shrivastava
We present the first sublinear memory sketch that can be queried to find the nearest neighbors in a dataset.