Many ground-breaking advancements in machine learning can be attributed to the availability of a large volume of rich data.
We also introduce a novel tabular data augmentation method for self- and semi-supervised learning frameworks.
The clinical time-series setting poses a unique combination of challenges to data modeling and sharing.
While much attention has been given to the problem of estimating the effect of discrete interventions from observational data, relatively little work has been done in the setting of continuous-valued interventions, such as treatments associated with a dosage parameter.
Identifying when to give treatments to patients, and how to select among multiple treatments over time, are important medical problems for which few solutions exist.
In addition, patient recruitment can be complicated by the fact that clinical trials do not aim to provide a benefit to any given patient in the trial.
The second benefit is that, through analysis that we provide in the paper, we can derive tighter differential privacy guarantees when several queries are made to this mechanism.
The predictor network uses the observations selected by the selector network to predict a label, providing feedback to the selector network (well-selected variables should be predictive of the label).
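The selector–predictor feedback loop described above can be sketched as a toy example (the data, the least-squares "predictor", and the REINFORCE-style update rule here are all illustrative assumptions, not the authors' implementation): a selector keeps per-feature inclusion probabilities, a simple predictor is fit on the sampled subset of features, and the predictor's accuracy is fed back as a reward so that predictive features are selected more often.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: the label depends only on features 0 and 1.
n, d = 512, 6
X = rng.normal(size=(n, d))
y = (X[:, 0] + X[:, 1] > 0).astype(float)

def predictor_accuracy(mask):
    """Fit a least-squares 'predictor' on the selected features and
    return its training accuracy (the feedback signal)."""
    Xm = X[:, mask.astype(bool)]
    if Xm.shape[1] == 0:
        return 0.5  # no features selected: chance level
    w, *_ = np.linalg.lstsq(Xm, y - 0.5, rcond=None)
    return float(np.mean((Xm @ w > 0) == (y > 0.5)))

# Selector: per-feature inclusion probabilities, updated by a
# REINFORCE-style rule with the predictor's accuracy as the reward.
p = np.full(d, 0.5)
baseline = 0.5
for _ in range(300):
    mask = (rng.random(d) < p).astype(float)
    reward = predictor_accuracy(mask)
    p += 0.05 * (reward - baseline) * (mask - p)
    p = np.clip(p, 0.01, 0.99)
    baseline = 0.9 * baseline + 0.1 * reward

print(np.round(p, 2))  # the informative features 0 and 1 should score highest
```

After a few hundred rounds the selection probabilities for the two informative features rise well above those of the noise features, which is the feedback effect the sentence describes.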
In this paper, we present Lifelong Bayesian Optimization (LBO), an online, multitask Bayesian optimization (BO) algorithm designed to solve the problem of model selection for datasets arriving and evolving over time.
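For readers unfamiliar with the Bayesian optimization (BO) building block that LBO extends, a minimal single-task BO loop looks like the sketch below (a hypothetical 1-D "validation loss", a tiny numpy Gaussian-process surrogate, and a lower-confidence-bound acquisition rule; this is generic BO, not the LBO algorithm itself):

```python
import numpy as np

def f(x):
    """Hypothetical 'validation loss' of a model with hyperparameter x."""
    return np.sin(3 * x) + 0.1 * x ** 2

def rbf(a, b, length=0.5):
    """Squared-exponential kernel between two 1-D point sets."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length) ** 2)

def gp_posterior(X, yv, Xs, noise=1e-4):
    """GP posterior mean and std at query points Xs given data (X, yv)."""
    K = rbf(X, X) + noise * np.eye(len(X))
    Ks = rbf(X, Xs)
    Kinv = np.linalg.inv(K)
    mu = Ks.T @ Kinv @ yv
    var = np.diag(rbf(Xs, Xs) - Ks.T @ Kinv @ Ks)
    return mu, np.sqrt(np.maximum(var, 1e-12))

# BO loop: fit the GP surrogate, pick the next point by the
# lower confidence bound (we are minimizing), evaluate, repeat.
X = np.array([-2.0, 0.0, 2.0])
yv = f(X)
grid = np.linspace(-2.5, 2.5, 200)
for _ in range(10):
    mu, sd = gp_posterior(X, yv, grid)
    x_next = grid[np.argmin(mu - 2.0 * sd)]
    X = np.append(X, x_next)
    yv = np.append(yv, f(x_next))

best = X[np.argmin(yv)]
```

LBO's contribution is to carry surrogate information across a sequence of such problems as datasets arrive and evolve, rather than restarting this loop from scratch each time.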
Machine learning has the potential to help many communities make use of the large datasets that are increasingly available to them.
The advent of big data brings with it datasets of ever higher dimensionality, and thus a growing need to efficiently select which features to use for a variety of problems.
We demonstrate the capability of our model to perform feature selection, showing that it performs as well as the originally proposed knockoff generation model in the Gaussian setting and that it outperforms the original model in non-Gaussian settings, including on a real-world dataset.
Accordingly, we call our method Generative Adversarial Imputation Nets (GAIN).
Training complex machine learning models for prediction often requires a large amount of data that is not always readily available.
Estimating individualized treatment effects (ITE) is a challenging task due to the need for an individual's potential outcomes to be learned from biased data and without having access to the counterfactuals.