Search Results for author: Yash Govind

Found 3 papers, 2 papers with code

Sparkly: A Simple yet Surprisingly Strong TF/IDF Blocker for Entity Matching

1 code implementation • Proceedings of the VLDB Endowment 2023 • Derek Paulsen, Yash Govind, AnHai Doan

We develop Sparkly, which uses Lucene to perform top-k tf/idf blocking in a distributed share-nothing fashion on a Spark cluster.
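To illustrate what top-k tf/idf blocking means, here is a minimal single-machine sketch in plain Python. It is not Sparkly's implementation: it uses neither Lucene nor Spark, it scores with a simple cosine TF/IDF rather than Lucene's scoring, and the names `tokenize`, `build_index`, and `top_k_block` are hypothetical helpers introduced only for this example.

```python
import heapq
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Lowercased whitespace tokens; a real analyzer would do more.
    return text.lower().split()


def build_index(records):
    """Build an inverted index with TF/IDF weights over `records` (id -> text)."""
    n = len(records)
    tfs = {rid: Counter(tokenize(text)) for rid, text in records.items()}
    # Document frequency: number of records containing each token.
    df = Counter(tok for tf in tfs.values() for tok in tf)
    idf = {tok: math.log(1 + n / d) for tok, d in df.items()}
    index = defaultdict(list)  # token -> [(record id, normalized weight)]
    for rid, tf in tfs.items():
        norm = math.sqrt(sum((c * idf[t]) ** 2 for t, c in tf.items()))
        for t, c in tf.items():
            index[t].append((rid, c * idf[t] / norm))
    return index, idf


def top_k_block(query, index, idf, k):
    """Return up to k (record id, score) pairs with the highest cosine TF/IDF score."""
    # Tokens unseen in the indexed table carry no weight and are dropped.
    tf = Counter(t for t in tokenize(query) if t in idf)
    norm = math.sqrt(sum((c * idf[t]) ** 2 for t, c in tf.items()))
    scores = defaultdict(float)
    for t, c in tf.items():
        for rid, w in index[t]:
            scores[rid] += w * c * idf[t] / norm
    return heapq.nlargest(k, scores.items(), key=lambda kv: kv[1])
```

In this sketch, one table is indexed once with `build_index`, and each record of the other table is passed to `top_k_block` as a query; the returned ids are the candidate pairs handed to a downstream matcher. Because each query is scored independently, the probing step could in principle be spread across workers.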

Blocking

Toward a System Building Agenda for Data Integration

no code implementations • 29 Sep 2017 • AnHai Doan, Adel Ardalan, Jeffrey R. Ballard, Sanjib Das, Yash Govind, Pradap Konda, Han Li, Erik Paulson, Paul Suganthan G. C., Haojun Zhang

They provide tools to address the "pain points" of the steps, and tools are built on top of the Python data science and Big Data ecosystem (PyData).

Databases
