# Linear Classification and Selective Sampling Under Low Noise Conditions

We provide a new analysis of an efficient margin-based algorithm for selective sampling in classification problems. Using the so-called Tsybakov low noise condition to parametrize the instance distribution, we show bounds on the convergence rate to the Bayes risk of both the fully supervised and the selective sampling versions of the basic algorithm. Our analysis reveals that, excluding logarithmic factors, the average risk of the selective sampler converges to the Bayes risk at rate $n^{-(1+\alpha)/(3+\alpha)}$, with labels being sampled at the same rate (here $n$ denotes the sample size, and $\alpha > 0$ is the exponent in the low noise condition). We compare this convergence rate to the rate $n^{-(1+\alpha)/(2+\alpha)}$ achieved by the fully supervised algorithm using all labels. Experiments on textual data reveal that simple variants of the proposed selective sampler perform much better than popular and similarly efficient competitors.

PDF Abstract

## Code Add Remove Mark official

No code implementations yet. Submit your code now

## Datasets

Add Datasets introduced or used in this paper

## Results from the Paper Add Remove

Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.