Search Results for author: Raul Castro Fernandez

Found 6 papers, 3 papers with code

A Data-Centric Online Market for Machine Learning: From Discovery to Pricing

no code implementations • 27 Oct 2023 • Minbiao Han, Jonathan Light, Steven Xia, Sainyam Galhotra, Raul Castro Fernandez, Haifeng Xu

We envision that the synergy of our data and model discovery algorithm and pricing mechanism will be an important step towards building a new data-centric online market that serves ML users effectively.

Model Discovery

Addressing Budget Allocation and Revenue Allocation in Data Market Environments Using an Adaptive Sampling Algorithm

1 code implementation • 5 Jun 2023 • Boxin Zhao, Boxiang Lyu, Raul Castro Fernandez, Mladen Kolar

Data markets help with identifying valuable training data: model consumers pay to train a model, the market uses that budget to identify data and train the model (the budget allocation problem), and finally the market compensates data providers according to their data contribution (revenue allocation problem).

Fraud Detection

METAM: Goal-Oriented Data Discovery

no code implementations • 18 Apr 2023 • Sainyam Galhotra, Yue Gong, Raul Castro Fernandez

Data is a central component of machine learning and causal inference tasks.

Causal Inference

Solo: Data Discovery Using Natural Language Questions Via A Self-Supervised Approach

2 code implementations • 9 Jan 2023 • Qiming Wang, Raul Castro Fernandez

All in all, the technique is a stepping stone towards building learned discovery systems.

ARDA: Automatic Relational Data Augmentation for Machine Learning

1 code implementation • 21 Mar 2020 • Nadiia Chepurko, Ryan Marcus, Emanuel Zgraggen, Raul Castro Fernandez, Tim Kraska, David Karger

Our system has two distinct components: (1) a framework to search and join data with the input data, based on various attributes of the input, and (2) an efficient feature selection algorithm that prunes out noisy or irrelevant features from the resulting join.

BIG-bench Machine Learning, Data Augmentation

Smallify: Learning Network Size while Training

no code implementations • 10 Jun 2018 • Guillaume Leclerc, Manasi Vartak, Raul Castro Fernandez, Tim Kraska, Samuel Madden

As neural networks become widely deployed in different applications and on different hardware, it has become increasingly important to optimize inference time and model size along with model accuracy.
