2 code implementations • 26 Mar 2021 • Curtis G. Northcutt, Anish Athalye, Jonas Mueller
Errors in test sets are numerous and widespread: we estimate an average of at least 3.3% errors across the 10 datasets, where for example label errors comprise at least 6% of the ImageNet validation set.
no code implementations • INLG (ACL) 2020 • Nikola I. Nikolov, Eric Malmi, Curtis G. Northcutt, Loreto Parisi
The ability to combine symbols to generate language is a defining characteristic of human intelligence, particularly in the context of artistic story-telling through lyrics.
1 code implementation • 27 Feb 2020 • Curtis G. Northcutt, Kimberly A. Leon, Naichun Chen
We conducted a double-blind, small-scale evaluation experiment requiring subjects to select between the top 5 comments of a diversified ranking and a baseline ranking ordered by score.
4 code implementations • 31 Oct 2019 • Curtis G. Northcutt, Lu Jiang, Isaac L. Chuang
Confident learning (CL) is an alternative approach which focuses instead on label quality by characterizing and identifying label errors in datasets, based on the principles of pruning noisy data, counting with probabilistic thresholds to estimate noise, and ranking examples to train with confidence.
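The counting-with-thresholds idea at the core of CL can be sketched in a few lines. This is a minimal, simplified illustration (the authors' full method is implemented in their `cleanlab` library): compute a per-class confidence threshold as the mean predicted probability among examples given that label, flag examples whose confidently predicted class disagrees with their given label, and rank the flagged examples by self-confidence. The toy probabilities below are hypothetical.

```python
import numpy as np

def find_label_issues(labels, pred_probs):
    """Flag likely label errors via confident-learning-style counting.

    A simplified sketch of the CL recipe: count with per-class
    probabilistic thresholds, then rank flagged examples by
    self-confidence (lowest first).
    """
    n, k = pred_probs.shape
    # Per-class threshold: average predicted probability among
    # examples *labeled* as that class.
    thresholds = np.array(
        [pred_probs[labels == j, j].mean() for j in range(k)]
    )
    issues = []
    for i in range(n):
        # Classes the model is "confident" about for example i.
        confident = np.flatnonzero(pred_probs[i] >= thresholds)
        if confident.size == 0:
            continue
        # Most likely confident class; disagreement flags a label issue.
        j_hat = confident[np.argmax(pred_probs[i, confident])]
        if j_hat != labels[i]:
            issues.append(i)
    # Lowest self-confidence first = most likely to be a label error.
    return sorted(issues, key=lambda i: pred_probs[i, labels[i]])

# Toy demo: example 2 is labeled class 0, but the model
# confidently predicts class 1, so it is flagged.
probs = np.array([[0.9, 0.1],
                  [0.8, 0.2],
                  [0.1, 0.9],
                  [0.2, 0.8]])
labels = np.array([0, 0, 0, 1])
print(find_label_issues(labels, probs))  # → [2]
```

The threshold step is what makes the counting robust: an example only counts toward another class when the model's confidence exceeds that class's typical confidence level, rather than whenever its probability is merely the argmax.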
2 code implementations • 4 May 2017 • Curtis G. Northcutt, Tailin Wu, Isaac L. Chuang
To highlight, RP with a CNN classifier can predict if an MNIST digit is a "one" or "not" with only 0.25% error, and 0.46% error across all digits, even when 50% of positive examples are mislabeled and 50% of observed positive labels are mislabeled negative examples.
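The pruning step behind RP can be sketched as follows. This is a simplified illustration, not the paper's full algorithm: it assumes the binary noise rates are already known (the paper estimates them from predicted probabilities), ranks examples in each noisy class by a classifier's predicted probability, and prunes the least trustworthy fraction before retraining on the survivors. All names and the toy data are illustrative.

```python
import numpy as np

def rank_prune(noisy_labels, pred_probs, noise_frac_pos, noise_frac_neg):
    """Return indices of examples kept after rank pruning.

    Sketch: rank each noisy class by predicted probability of being
    positive, then prune the assumed-mislabeled fraction -- the
    least-positive-looking noisy positives and the most-positive-
    looking noisy negatives.
    """
    keep = []
    pos = np.flatnonzero(noisy_labels == 1)
    neg = np.flatnonzero(noisy_labels == 0)
    # Prune the lowest-confidence tail of the noisy positives ...
    order = pos[np.argsort(pred_probs[pos])]  # ascending P(y=1|x)
    keep.extend(order[int(round(noise_frac_pos * len(pos))):])
    # ... and the highest-confidence tail of the noisy negatives.
    order = neg[np.argsort(pred_probs[neg])]
    n_prune = int(round(noise_frac_neg * len(neg)))
    keep.extend(order[: len(neg) - n_prune])
    return sorted(keep)

# Toy demo: index 3 looks negative despite its positive label, and
# index 7 looks positive despite its negative label; both are pruned.
noisy = np.array([1, 1, 1, 1, 0, 0, 0, 0])
probs = np.array([0.9, 0.8, 0.7, 0.1, 0.2, 0.1, 0.3, 0.95])
print(rank_prune(noisy, probs, 0.25, 0.25))  # → [0, 1, 2, 4, 5, 6]
```

A classifier retrained on only the kept examples then sees a much cleaner effective training set, which is what lets performance hold up under heavy label noise.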