Search Results for author: Stephen Mussmann

Found 17 papers, 8 papers with code

Active Learning with Expected Error Reduction

no code implementations 17 Nov 2022 Stephen Mussmann, Julia Reisler, Daniel Tsai, Ehsan Mousavi, Shayne O'Brien, Moises Goldszmidt

In this paper we reformulate EER under the lens of Bayesian active learning and derive a computationally efficient version that can use any Bayesian parameter sampling method (such as arXiv:1506.02142).

Active Learning

Comparing the Value of Labeled and Unlabeled Data in Method-of-Moments Latent Variable Estimation

1 code implementation 3 Mar 2021 Mayee F. Chen, Benjamin Cohen-Wang, Stephen Mussmann, Frederic Sala, Christopher Ré

We apply our decomposition framework to three scenarios -- well-specified, misspecified, and corrected models -- to 1) choose between labeled and unlabeled data and 2) learn from their combination.

On the Importance of Adaptive Data Collection for Extremely Imbalanced Pairwise Tasks

1 code implementation Findings of the Association for Computational Linguistics 2020 Stephen Mussmann, Robin Jia, Percy Liang

Many pairwise classification tasks, such as paraphrase detection and open-domain question answering, naturally have extreme label imbalance (e.g., $99.99\%$ of examples are negatives).

Active Learning, Open-Domain Question Answering, +1

Concept Bottleneck Models

4 code implementations ICML 2020 Pang Wei Koh, Thao Nguyen, Yew Siang Tang, Stephen Mussmann, Emma Pierson, Been Kim, Percy Liang

We seek to learn models that we can interact with using high-level concepts: if the model did not think there was a bone spur in the x-ray, would it still predict severe arthritis?

Selection via Proxy: Efficient Data Selection for Deep Learning

1 code implementation ICLR 2020 Cody Coleman, Christopher Yeh, Stephen Mussmann, Baharan Mirzasoleiman, Peter Bailis, Percy Liang, Jure Leskovec, Matei Zaharia

By removing hidden layers from the target model, using smaller architectures, and training for fewer epochs, we create proxies that are an order of magnitude faster to train.

Active Learning, Computational Efficiency

A Tight Analysis of Greedy Yields Subexponential Time Approximation for Uniform Decision Tree

no code implementations 26 Jun 2019 Ray Li, Percy Liang, Stephen Mussmann

The greedy algorithm's $O(\log n)$ approximation ratio was the best known, but the largest approximation ratio known to be NP-hard is $4-\varepsilon$.

Active Learning

Select Via Proxy: Efficient Data Selection For Training Deep Networks

no code implementations ICLR 2019 Cody Coleman, Stephen Mussmann, Baharan Mirzasoleiman, Peter Bailis, Percy Liang, Jure Leskovec, Matei Zaharia

In our approach, we first train a small proxy model quickly, which we then use to estimate the utility of individual training data points, and then select the most informative ones for training the large target model.

BIG-bench Machine Learning, Image Classification, +1

Uncertainty Sampling is Preconditioned Stochastic Gradient Descent on Zero-One Loss

1 code implementation NeurIPS 2018 Stephen Mussmann, Percy Liang

Uncertainty sampling, a popular active learning algorithm, is used to reduce the amount of data required to learn a classifier, but it has been observed in practice to converge to different parameters depending on the initialization and sometimes to even better parameters than standard training on all the data.

Active Learning

The price of debiasing automatic metrics in natural language evaluation

no code implementations ACL 2018 Arun Chaganty, Stephen Mussmann, Percy Liang

For evaluating generation systems, automatic metrics such as BLEU cost nothing to run but have been shown to correlate poorly with human judgment, leading to systematic bias against certain model improvements.

Abstractive Text Summarization, Image Captioning, +1

On the Relationship between Data Efficiency and Error for Uncertainty Sampling

1 code implementation ICML 2018 Stephen Mussmann, Percy Liang

While active learning offers potential cost savings, the actual data efficiency---the reduction in amount of labeled data needed to obtain the same error rate---observed in practice is mixed.

Active Learning, Regression

Generalized Binary Search For Split-Neighborly Problems

no code implementations 27 Feb 2018 Stephen Mussmann, Percy Liang

In sequential hypothesis testing, Generalized Binary Search (GBS) greedily chooses the test with the highest information gain at each step.

Two-sample testing

Understanding Trajectory Behavior: A Motion Pattern Approach

no code implementations 4 Jan 2015 Mahdi M. Kalayeh, Stephen Mussmann, Alla Petrakova, Niels da Vitoria Lobo, Mubarak Shah

In the second phase, via a K-means clustering approach, we create motion components by clustering the flow vectors with respect to their location and velocity.

Clustering, Trajectory Clustering
