no code implementations • 7 Jan 2025 • Zelin Zhou, Simone Conia, Daniel Lee, Min Li, Shenglei Huang, Umar Farooq Minhas, Saloni Potdar, Henry Xiao, Yunyao Li
Multilingual knowledge graphs (KGs) provide high-quality relational and textual information for various NLP applications, but they are often incomplete, especially in non-English languages.
no code implementations • 1 Nov 2024 • Jason Mohoney, Anil Pacaci, Shihabur Rahman Chowdhury, Umar Farooq Minhas, Jeffrey Pound, Cedric Renggli, Nima Reyhani, Ihab F. Ilyas, Theodoros Rekatsinas, Shivaram Venkataraman
The prevalence of vector similarity search in modern machine learning applications and the continuously changing nature of data processed by these applications necessitate efficient and effective index maintenance techniques for vector search indexes.
no code implementations • 17 Oct 2024 • Simone Conia, Daniel Lee, Min Li, Umar Farooq Minhas, Saloni Potdar, Yunyao Li
Translating text that contains entity names is a challenging task, as culture-related references can vary significantly across languages.
no code implementations • 2 Apr 2024 • Junxiong Wang, Ali Mousavi, Omar Attia, Ronak Pradeep, Saloni Potdar, Alexander M. Rush, Umar Farooq Minhas, Yunyao Li
Existing generative approaches demonstrate improved accuracy compared to classification approaches under the standardized ZELDA benchmark.
Ranked #1 on Entity Linking on KORE50 (Micro-F1 strong metric)
1 code implementation • 27 Nov 2023 • Simone Conia, Min Li, Daniel Lee, Umar Farooq Minhas, Ihab Ilyas, Yunyao Li
Recent work in Natural Language Processing and Computer Vision has been using textual information -- e.g., entity names and descriptions -- available in knowledge graphs to ground neural models to high-quality structured data.
no code implementations • 16 May 2023 • Ihab F. Ilyas, JP Lacerda, Yunyao Li, Umar Farooq Minhas, Ali Mousavi, Jeffrey Pound, Theodoros Rekatsinas, Chiraag Sumanth
We then describe how our platform, including graph embeddings, can be leveraged to create a Semantic Annotation service that links unstructured Web documents to entities in our KG.
no code implementations • 4 Apr 2023 • Jason Mohoney, Anil Pacaci, Shihabur Rahman Chowdhury, Ali Mousavi, Ihab F. Ilyas, Umar Farooq Minhas, Jeffrey Pound, Theodoros Rekatsinas
Motivated by the tasks of finding related KG queries and entities for past KG query workloads, we focus on hybrid vector similarity search (hybrid queries for short) where part of the query corresponds to vector similarity search and part of the query corresponds to predicates over relational attributes associated with the underlying data vectors.
no code implementations • 29 Nov 2021 • Benjamin Spector, Andreas Kipf, Kapil Vaidya, Chi Wang, Umar Farooq Minhas, Tim Kraska
RSS achieves this by using the minimal string prefix to sufficiently distinguish the data, unlike most learned approaches, which index the entire string.
no code implementations • 22 Apr 2020 • Zongheng Yang, Badrish Chandramouli, Chi Wang, Johannes Gehrke, Yi-Nan Li, Umar Farooq Minhas, Per-Åke Larson, Donald Kossmann, Rajeev Acharya
For a given workload, however, such techniques are unable to optimize for the important metric of the number of blocks accessed by a query.
no code implementations • 21 May 2019 • Jialin Ding, Umar Farooq Minhas, Jia Yu, Chi Wang, Jaeyoung Do, Yi-Nan Li, Hantian Zhang, Badrish Chandramouli, Johannes Gehrke, Donald Kossmann, David Lomet, Tim Kraska
The original work by Kraska et al. shows that a learned index beats a B+Tree by a factor of up to three in search time and by an order of magnitude in memory footprint.