no code implementations • 18 Apr 2025 • Quentin Romero Lauro, Shreya Shankar, Sepanta Zeighami, Aditya Parameswaran
Retrieval-augmented generation (RAG) pipelines have become the de facto approach for building AI assistants with access to external, domain-specific knowledge.
no code implementations • 18 Feb 2025 • Sepanta Zeighami, Yiming Lin, Shreya Shankar, Aditya Parameswaran
Such data systems do as they are told, but fail to understand and leverage what the LLM is being asked to do (i.e., the underlying operations, which may be error-prone), the data the LLM is operating on (e.g., long, complex documents), or what the user really needs.
no code implementations • 9 Nov 2024 • Sepanta Zeighami, Cyrus Shahabi
Machine learning models have demonstrated substantial performance enhancements over non-learned alternatives in various fundamental data management operations, including indexing (locating items in an array), cardinality estimation (estimating the number of matching records in a database), and range-sum estimation (estimating aggregate attribute values for query-matched records).
no code implementations • 9 Nov 2024 • Sepanta Zeighami, Cyrus Shahabi
Use of machine learning to perform database operations, such as indexing, cardinality estimation, and sorting, is shown to provide substantial performance benefits.
1 code implementation • 4 Sep 2024 • Sepanta Zeighami, Zac Wellmer, Aditya Parameswaran
Existing approaches either fine-tune the pre-trained model itself or, more efficiently, but at the cost of accuracy, train adaptor models to transform the output of the pre-trained model.
no code implementations • 17 Feb 2024 • Sepanta Zeighami, Cyrus Shahabi
We show that statistical debiasing, although in some cases useful, often fails to improve accuracy.
no code implementations • 19 Jun 2023 • Sepanta Zeighami, Cyrus Shahabi
In this paper, we significantly strengthen this result, showing that under mild assumptions on data distribution, and the same space complexity as non-learned methods, learned indexes can answer queries in $O(\log\log n)$ expected query time.
1 code implementation • 14 Dec 2020 • Sirisha Rambhatla, Sepanta Zeighami, Kameron Shahabi, Cyrus Shahabi, Yan Liu
As countries look towards re-opening of economic activities amidst the ongoing COVID-19 pandemic, ensuring public health has been challenging.
no code implementations • 25 Sep 2019 • Zac Wellmer, Sepanta Zeighami, James Kwok
However, decision-time planning with implicit dynamics models in continuous action space has proven to be a difficult problem.
3 code implementations • 18 Oct 2018 • Sepanta Zeighami, Raymond Chi-Wing Wong
This problem takes into account the probability distribution of the users and considers the satisfaction (ratio) of all users, which is more reasonable in practice, compared with the existing studies that only consider the worst-case satisfaction (ratio) of the users, which may not reflect the whole population and is not useful in some applications.