no code implementations • 19 Feb 2025 • Haike Xu, Magdalen Dobson Manohar, Philip A. Bernstein, Badrish Chandramouli, Richard Wen, Harsha Vardhan Simhadri
However, it is challenging to update such a graph index at a high rate while supporting stable recall after many updates.
no code implementations • 30 Oct 2024 • Anders Aamand, Alexandr Andoni, Justin Y. Chen, Piotr Indyk, Shyam Narayanan, Sandeep Silwal, Haike Xu
In particular, if an algorithm uses $O(n/\log^c k)$ samples for some constant $c>0$ and polynomial space, then the query time of the data structure must be at least $k^{1-O(1)/\log \log k}$, i.e., close to linear in the number of distributions $k$.
no code implementations • 15 Jun 2024 • Haike Xu, Zongyu Lin, Yizhou Sun, Kai-Wei Chang, Piotr Indyk
Our experiments demonstrate the efficacy of our approach not only in contradiction retrieval, with accuracy improvements of more than 30% on MSMARCO and HotpotQA across different model architectures, but also in applications such as cleaning corrupted corpora to restore high-quality QA retrieval.
1 code implementation • 5 Jun 2024 • Haike Xu, Sandeep Silwal, Piotr Indyk
In both cases we show that, as long as the proxy metric used to construct the data structure approximates the ground-truth metric up to a bounded factor, our data structure achieves arbitrarily good approximation guarantees with respect to the ground-truth metric.
1 code implementation • NeurIPS 2023 • Piotr Indyk, Haike Xu
Graph-based approaches to nearest neighbor search are popular and powerful tools for handling large datasets in practice, but they have limited theoretical guarantees.
1 code implementation • 15 Nov 2022 • Haike Xu, Zongyu Lin, Jing Zhou, Yanan Zheng, Zhilin Yang
In the finetuning setting, our approach also achieves new state-of-the-art results on a wide range of NLP tasks, with only 1/4 of the parameters of previous methods.
no code implementations • 19 Nov 2021 • Talya Eden, Piotr Indyk, Haike Xu
In particular, we consider heuristics induced by norm embeddings and distance labeling schemes, and provide lower bounds for the tradeoffs between the number of dimensions or bits used to represent each graph node, and the running time of the A* algorithm.
no code implementations • 12 Jun 2021 • Haike Xu, Jian Li
Our algorithm achieves an (approximation) regret bound of $\tilde{O}\left(d\sqrt{KT}\right)$.
no code implementations • 9 Feb 2021 • Haike Xu, Tengyu Ma, Simon S. Du
We further show that for general MDPs, AMB suffers an additional $\frac{|Z_{mul}|}{\Delta_{min}}$ regret, where $Z_{mul}$ is the set of state-action pairs $(s, a)$ such that $a$ is a non-unique optimal action for $s$.