1 code implementation • 23 Apr 2024 • Nirupan Ananthamurugan, Dat Duong, Philip George, Ankita Gupta, Sandeep Tata, Beliz Gunel
Summarizing comparative opinions about entities (e.g., hotels, phones) from a set of source reviews, often referred to as contrastive summarization, can considerably aid users in decision making.
no code implementations • 25 Mar 2024 • Beliz Gunel, James B. Wendt, Jing Xie, Yichao Zhou, Nguyen Vo, Zachary Fisher, Sandeep Tata
Users often struggle with decision-making between two options (A vs B), as it usually requires time-consuming research across multiple web pages.
no code implementations • 20 Dec 2022 • Jing Xie, James B. Wendt, Yichao Zhou, Seth Ebner, Sandeep Tata
Many business workflows require extracting important fields from form-like documents (e.g., bank statements, bills of lading, and purchase orders).
no code implementations • 15 Nov 2022 • Zilong Wang, Yichao Zhou, Wei Wei, Chen-Yu Lee, Sandeep Tata
Understanding visually rich business documents to extract structured data and automate business workflows has been receiving attention in both academia and industry.
no code implementations • 28 Oct 2022 • Yichao Zhou, James B. Wendt, Navneet Potti, Jing Xie, Sandeep Tata
A key bottleneck in building automatic extraction models for visually rich documents like invoices is the cost of acquiring the several thousand high-quality labeled documents that are needed to train a model with acceptable accuracy.
no code implementations • 7 Jan 2022 • Beliz Gunel, Navneet Potti, Sandeep Tata, James B. Wendt, Marc Najork, Jing Xie
Automating information extraction from form-like documents at scale is a pressing need due to its potential impact on automating business workflows across many industries like financial services, insurance, and healthcare.
2 code implementations • 7 Jan 2021 • Yichao Zhou, Ying Sheng, Nguyen Vo, Nick Edmonds, Sandeep Tata
There has been a steady need to precisely extract structured knowledge from the web (i.e., HTML documents).
no code implementations • 21 Oct 2020 • Bill Yuchen Lin, Ying Sheng, Nguyen Vo, Sandeep Tata
By combining these stages, FreeDOM is able to generalize to unseen sites after training on a small number of seed sites from that vertical without requiring expensive hand-crafted features over visual renderings of the page.
1 code implementation • ACL 2020 • Bodhisattwa Majumder, Navneet Potti, Sandeep Tata, James B. Wendt, Qi Zhao, Marc Najork
We propose a novel approach using representation learning for tackling the problem of extracting structured information from form-like document images.
no code implementations • 23 May 2020 • Abbas Kazerouni, Qi Zhao, Jing Xie, Sandeep Tata, Marc Najork
Furthermore, there is usually only a small amount of initial training data available when building machine-learned models to solve such problems.
no code implementations • 12 Mar 2011 • Jun Rao, Eugene J. Shekita, Sandeep Tata
Compared to an eventually consistent datastore, we show that Spinnaker can be as fast or even faster on reads and only 5% to 10% slower on writes.
Databases • Distributed, Parallel, and Cluster Computing