no code implementations • 4 Mar 2025 • Joyce Cahoon, Prerna Singh, Nick Litombe, Jonathan Larson, Ha Trinh, Yiwen Zhu, Andreas Mueller, Fotis Psallidas, Carlo Curino
In this work, we benchmark various graph-based retrieval-augmented generation (RAG) systems across a broad spectrum of query types, including OLTP-style (fact-based) and OLAP-style (thematic) queries, to address the complex demands of open-domain question answering (QA).
no code implementations • 15 Feb 2025 • Dasha Metropolitansky, Jonathan Larson
A common strategy for fact-checking long-form content generated by Large Language Models (LLMs) is extracting simple claims that can be verified independently.
no code implementations • 24 Jan 2025 • Cencheng Shen, Yuexiao Dong, Carey E. Priebe, Jonathan Larson, Ha Trinh, Youngser Park
We prove that the population principal graph encoder embedding preserves the conditional density of the vertex labels and that the population community score successfully distinguishes the principal communities.
no code implementations • 24 Jan 2025 • Cencheng Shen, Darren Edge, Jonathan Larson, Carey E. Priebe
This graph covariance quantifies temporal changes in dependence structures within categorical data and is established as a consistent dependence measure under the Bernoulli distribution.
no code implementations • 21 May 2024 • Cencheng Shen, Jonathan Larson, Ha Trinh, Carey E. Priebe
This paper introduces a refined graph encoder embedding method, enhancing the original graph encoder embedding through linear transformation, self-training, and hidden community recovery within observed communities.
3 code implementations • 24 Apr 2024 • Darren Edge, Ha Trinh, Newman Cheng, Joshua Bradley, Alex Chao, Apurva Mody, Steven Truitt, Jonathan Larson
To combine the strengths of these contrasting methods, we propose a Graph RAG approach to question answering over private text corpora that scales with both the generality of user questions and the quantity of source text to be indexed.
1 code implementation • 15 Dec 2023 • Xin Jin, Jonathan Larson, Weiwei Yang, Zhiqiang Lin
Binary code summarization, while invaluable for understanding code semantics, is challenging due to its labor-intensive nature.
2 code implementations • 28 Nov 2023 • Harsha Nori, Yin Tat Lee, Sheng Zhang, Dean Carignan, Richard Edgar, Nicolo Fusi, Nicholas King, Jonathan Larson, Yuanzhi Li, Weishung Liu, Renqian Luo, Scott Mayer McKinney, Robert Osazuwa Ness, Hoifung Poon, Tao Qin, Naoto Usuyama, Chris White, Eric Horvitz
We find that prompting innovation can unlock deeper specialist capabilities and show that GPT-4 easily tops prior leading results for medical benchmarks.
Ranked #3 on
Question Answering
on MedQA
(using extra training data)
1 code implementation • 3 May 2023 • Cencheng Shen, Jonathan Larson, Ha Trinh, Xihan Qin, Youngser Park, Carey E. Priebe
Analyzing large-scale time-series network data, such as social media and email communications, poses a significant challenge in understanding social dynamics, detecting anomalies, and predicting trends.
1 code implementation • 31 Mar 2023 • Cencheng Shen, Carey E. Priebe, Jonathan Larson, Ha Trinh
In this paper, we introduce a method called graph fusion embedding, designed for multi-graph embedding with shared vertex sets.
no code implementations • 1 Apr 2021 • Tiona Zuzul, Emily Cox Pahnke, Jonathan Larson, Patrick Bourke, Nicholas Caurvina, Neha Parikh Shah, Fereshteh Amini, Jeffrey Weston, Youngser Park, Joshua Vogelstein, Christopher White, Carey E. Priebe
Workplace communications around the world were drastically altered by Covid-19, related work-from-home orders, and the rise of remote work.
no code implementations • 16 Mar 2021 • Vivek Kurien George, Vikash Morar, Weiwei Yang, Jonathan Larson, Bryan Tower, Shweti Mahajan, Arkin Gupta, Christopher White, Gabriel A. Silva
The success of state-of-the-art machine learning is essentially all based on different variations of gradient descent algorithms that minimize some version of a cost or loss function.
1 code implementation • 23 Aug 2020 • Guodong Chen, Jesús Arroyo, Avanti Athreya, Joshua Cape, Joshua T. Vogelstein, Youngser Park, Chris White, Jonathan Larson, Weiwei Yang, Carey E. Priebe
We examine two related, complementary inference tasks: the detection of anomalous graphs within a time series, and the detection of temporally anomalous vertices.
Methodology
2 code implementations • 20 May 2020 • Hayden S. Helm, Amitabh Basu, Avanti Athreya, Youngser Park, Joshua T. Vogelstein, Carey E. Priebe, Michael Winding, Marta Zlatic, Albert Cardona, Patrick Bourke, Jonathan Larson, Marah Abdin, Piali Choudhury, Weiwei Yang, Christopher W. White
Learning to rank -- producing a ranked list of items specific to a query and with respect to a set of supervisory items -- is a problem of general interest.
1 code implementation • 27 Apr 2020 • Joshua T. Vogelstein, Jayanta Dey, Hayden S. Helm, Will LeVine, Ronak D. Mehta, Ali Geisa, Haoyin Xu, Gido M. van de Ven, Emily Chang, Chenyu Gao, Weiwei Yang, Bryan Tower, Jonathan Larson, Christopher M. White, Carey E. Priebe
But striving to avoid forgetting sets the goal unnecessarily low: the goal of lifelong learning, whether biological or artificial, should be to improve performance on all tasks (including past and future) with any new data.
no code implementations • 6 May 2019 • Joshua Agterberg, Youngser Park, Jonathan Larson, Christopher White, Carey E. Priebe, Vince Lyzinski
Given a pair of graphs $G_1$ and $G_2$ and a vertex set of interest in $G_1$, the vertex nomination (VN) problem seeks to find the corresponding vertices of interest in $G_2$ (if they exist) and produce a rank list of the vertices in $G_2$, with the corresponding vertices of interest in $G_2$ concentrating, ideally, at the top of the rank list.