Search Results for author: Sam Madden

Found 13 papers, 4 papers with code

SEED: Domain-Specific Data Curation With Large Language Models

no code implementations • 1 Oct 2023 • Zui Chen, Lei Cao, Sam Madden, Tim Kraska, Zeyuan Shang, Ju Fan, Nan Tang, Zihui Gu, Chunwei Liu, Michael Cafarella

SEED uses these generated modules to process most of the data records and dynamically decides when the LLM should step in to directly process some individual records, possibly using the data-access modules to retrieve relevant information from the data sources to assist the LLM in solving the task.

Code Generation • Imputation • +1

Lingua Manga: A Generic Large Language Model Centric System for Data Curation

no code implementations • 20 Jun 2023 • Zui Chen, Lei Cao, Sam Madden

Data curation is a wide-ranging area which contains many critical but time-consuming data processing tasks.

Language Modelling • Large Language Model

RoTaR: Efficient Row-Based Table Representation Learning via Teacher-Student Training

no code implementations • 20 Jun 2023 • Zui Chen, Lei Cao, Sam Madden

In addition to the row-based architecture, we introduce several techniques to improve the performance of the RoTaR model: cell-aware position embedding, a teacher-student training paradigm, and selective backward.

Position • Representation Learning

Interleaving Pre-Trained Language Models and Large Language Models for Zero-Shot NL2SQL Generation

1 code implementation • 15 Jun 2023 • Zihui Gu, Ju Fan, Nan Tang, Songyue Zhang, Yuxin Zhang, Zui Chen, Lei Cao, Guoliang Li, Sam Madden, Xiaoyong Du

PLMs can perform well in schema alignment but struggle with complex reasoning, while LLMs are superior in complex reasoning tasks but cannot achieve precise schema alignment.

Self-Supervised Multi-Object Tracking with Cross-Input Consistency

1 code implementation • NeurIPS 2021 • Favyen Bastani, Songtao He, Sam Madden

In this paper, we propose a self-supervised learning procedure for training a robust multi-object tracking (MOT) model given only unlabeled video.

Multi-Object Tracking • Self-Supervised Learning

Updating Street Maps using Changes Detected in Satellite Imagery

no code implementations • 13 Oct 2021 • Favyen Bastani, Songtao He, Satvat Jagwani, Mohammad Alizadeh, Hari Balakrishnan, Sanjay Chawla, Sam Madden, Mohammad Amin Sadeghi

To address this challenge, much work has studied automatically processing geospatial data sources such as GPS trajectories and satellite images to reduce the cost of maintaining digital maps.

Beyond Road Extraction: A Dataset for Map Update using Aerial Images

1 code implementation • ICCV 2021 • Favyen Bastani, Sam Madden

The increasing availability of satellite and aerial imagery has sparked substantial interest in automatically updating street maps by processing aerial images.

RPT: Relational Pre-trained Transformer Is Almost All You Need towards Democratizing Data Preparation

no code implementations • 4 Dec 2020 • Nan Tang, Ju Fan, Fangyi Li, Jianhong Tu, Xiaoyong Du, Guoliang Li, Sam Madden, Mourad Ouzzani

RPT is pre-trained for a tuple-to-tuple model by corrupting the input tuple and then learning a model to reconstruct the original tuple.

Denoising • Entity Resolution • +4

Inferring and Improving Street Maps with Data-Driven Automation

no code implementations • 2 Oct 2019 • Favyen Bastani, Songtao He, Satvat Jagwani, Edward Park, Sofiane Abbar, Mohammad Alizadeh, Hari Balakrishnan, Sanjay Chawla, Sam Madden, Mohammad Amin Sadeghi

Through an evaluation on a large-scale dataset including satellite imagery, GPS trajectories, and ground-truth map data in forty cities, we show that Mapster makes automation practical for map editing, and enables the curation of map datasets that are more complete and up-to-date at less cost.

Machine-Assisted Map Editing

no code implementations • 17 Jun 2019 • Favyen Bastani, Songtao He, Sofiane Abbar, Mohammad Alizadeh, Hari Balakrishnan, Sanjay Chawla, Sam Madden

Systems to automatically infer road network graphs from aerial imagery and GPS trajectories have been proposed to improve coverage of road maps.

graph construction

TabulaROSA: Tabular Operating System Architecture for Massively Parallel Heterogeneous Compute Engines

no code implementations • 14 Jul 2018 • Jeremy Kepner, Ron Brightwell, Alan Edelman, Vijay Gadepally, Hayden Jananthan, Michael Jones, Sam Madden, Peter Michaleas, Hamed Okhravi, Kevin Pedretti, Albert Reuther, Thomas Sterling, Mike Stonebraker

In this context, an operating system can be viewed as software that brokers and tracks the resources of the compute engines and is akin to a database management system.

Distributed, Parallel, and Cluster Computing • Databases • Operating Systems • Performance
