Two-sample testing

76 papers with code • 5 benchmarks • 1 dataset

In statistical hypothesis testing, a two-sample test is performed on two random samples, each drawn independently from a different population. Its purpose is to determine whether the difference between the two populations is statistically significant. The statistics used in two-sample tests also appear in many machine learning problems, such as domain adaptation, covariate shift detection and the training of generative adversarial networks.
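As a concrete illustration of the definition above, the following is a minimal sketch of one possible two-sample test: a permutation test on the difference in sample means. The function name `permutation_test`, the choice of test statistic, the number of permutations and the synthetic Gaussian samples are all assumptions made for this example; none of them come from the papers listed on this page.

```python
import random


def sample_mean(values):
    return sum(values) / len(values)


def permutation_test(x, y, n_permutations=2000, seed=0):
    """Two-sided permutation test for a difference in means (illustrative)."""
    rng = random.Random(seed)
    observed = abs(sample_mean(x) - sample_mean(y))
    pooled = list(x) + list(y)
    n_x = len(x)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        # Under the null hypothesis the group labels are exchangeable,
        # so reshuffling the pooled data simulates the null distribution.
        rng.shuffle(pooled)
        stat = abs(sample_mean(pooled[:n_x]) - sample_mean(pooled[n_x:]))
        if stat >= observed:
            at_least_as_extreme += 1
    # The add-one correction keeps the p-value strictly positive.
    return (at_least_as_extreme + 1) / (n_permutations + 1)


# Synthetic data (assumed for the example): two samples from the same
# distribution, and one sample whose mean is shifted by 1.5.
data_rng = random.Random(1)
same_a = [data_rng.gauss(0.0, 1.0) for _ in range(50)]
same_b = [data_rng.gauss(0.0, 1.0) for _ in range(50)]
shifted = [data_rng.gauss(1.5, 1.0) for _ in range(50)]

p_same = permutation_test(same_a, same_b)
p_shifted = permutation_test(same_a, shifted)
print(f"same population: p = {p_same:.4f}")
print(f"shifted population: p = {p_shifted:.4f}")
```

A small p-value is evidence against the null hypothesis that both samples come from the same population. A mean-difference statistic only detects shifts in location; other statistics would be needed to detect differences in spread or shape.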

Most implemented papers

A Meta-Analysis of the Anomaly Detection Problem

yaroslav-moiseev/evidence-based-possibly-best-practices-in-classical-ML 3 Mar 2015

The intended contributions of this article are many; in addition to providing a large publicly-available corpus of anomaly detection benchmarks, we provide an ontology for describing anomaly detection contexts, a methodology for controlling various aspects of benchmark creation, guidelines for future experimental design and a discussion of the many potential pitfalls of trying to measure success in this field.

Sequential Nonparametric Testing with the Law of the Iterated Logarithm

sshekhar17/nonparametric-testing-by-betting 10 Jun 2015

It is novel in several ways: (a) it takes linear time and constant space to compute on the fly, (b) it has the same power guarantee as a non-sequential version of the test with the same computational constraints up to a small factor, and (c) it accesses only as many samples as are required - its stopping time adapts to the unknown difficulty of the problem.

Fast Two-Sample Testing with Analytic Representations of Probability Measures

kacperChwialkowski/analyticMeanEmbeddings NeurIPS 2015

The new tests are consistent against a larger class of alternatives than the previous linear-time tests based on the (non-smoothed) empirical characteristic functions, while being much faster than the current state-of-the-art quadratic-time kernel-based or energy distance-based tests.

On Wasserstein Two Sample Testing and Related Families of Nonparametric Tests

marcodeangelis/Area-metric 8 Sep 2015

In this work, our central object is the Wasserstein distance, as we form a chain of connections from univariate methods like the Kolmogorov-Smirnov test, PP/QQ plots and ROC/ODC curves, to multivariate tests involving energy statistics and kernel based maximum mean discrepancy.
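To make the univariate end of that chain concrete, here is a small sketch comparing two statistics computed from the same empirical CDFs: the Kolmogorov-Smirnov statistic (the largest gap between the CDFs) and the 1-Wasserstein distance (which, in one dimension, equals the area between the CDFs). The helper names and the equal-sample-size restriction are assumptions of this example, not code from the paper or its repository.

```python
from bisect import bisect_right


def ks_statistic(x, y):
    """Largest absolute gap between the two empirical CDFs."""
    xs, ys = sorted(x), sorted(y)
    # The gap is piecewise constant and only changes at data points,
    # so it is enough to evaluate it there.
    return max(
        abs(bisect_right(xs, t) / len(xs) - bisect_right(ys, t) / len(ys))
        for t in xs + ys
    )


def wasserstein_1d(x, y):
    """1-Wasserstein distance between equal-size univariate samples.

    For equal sample sizes the optimal coupling matches sorted values,
    so the distance is the mean absolute difference of order statistics.
    """
    if len(x) != len(y):
        raise ValueError("this sketch assumes equal sample sizes")
    xs, ys = sorted(x), sorted(y)
    return sum(abs(a - b) for a, b in zip(xs, ys)) / len(xs)


x = [0.0, 1.0, 2.0]
y = [1.0, 2.0, 3.0]
ks = ks_statistic(x, y)
w1 = wasserstein_1d(x, y)
print(ks, w1)
```

Shifting `y` further to the right would increase the Wasserstein distance in proportion to the shift, while the Kolmogorov-Smirnov statistic is capped at 1 once the samples no longer overlap.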

Interpretability of Multivariate Brain Maps in Brain Decoding: Definition and Quantification

smkia/interpretability 29 Mar 2016

In this paper, first, we present a theoretical definition of interpretability in brain decoding; we show that the interpretability of multivariate brain maps can be decomposed into their reproducibility and representativeness.

A U-statistic Approach to Hypothesis Testing for Structure Discovery in Undirected Graphical Models

wbounliphone/Ustatistics_Approach_For_SD 6 Apr 2016

For some class of probability distributions, an edge between two variables is present if and only if the corresponding entry in the precision matrix is non-zero.
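The multivariate Gaussian is a standard example of such a class. The sketch below assumes a three-variable Gaussian chain with a known covariance matrix (chosen for this example), inverts it, and reads the edges off the non-zero off-diagonal entries of the precision matrix. It works at the population level only: the hypothesis test needed to decide whether an estimated entry is non-zero is not shown, and the helper names are hypothetical.

```python
def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination."""
    n = len(matrix)
    # Augment the matrix with the identity: [A | I] -> [I | A^-1].
    aug = [
        [float(v) for v in row] + [float(i == j) for j in range(n)]
        for i, row in enumerate(matrix)
    ]
    for col in range(n):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def edges_from_precision(precision, tol=1e-8):
    """Pairs (i, j), i < j, whose precision entry is non-zero up to tol."""
    n = len(precision)
    return [
        (i, j)
        for i in range(n)
        for j in range(i + 1, n)
        if abs(precision[i][j]) > tol
    ]


# Covariance of a chain 0 - 1 - 2: every pair of variables is correlated,
# yet variables 0 and 2 are conditionally independent given variable 1.
covariance = [
    [0.75, 0.50, 0.25],
    [0.50, 1.00, 0.50],
    [0.25, 0.50, 0.75],
]
precision = invert(covariance)
edges = edges_from_precision(precision)
print(edges)
```

Every entry of the covariance matrix is non-zero, but the precision matrix has a zero in position (0, 2), so the recovered graph contains only the edges 0-1 and 1-2.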

Efficient Nonparametric Smoothness Estimation

sss1/SobolevEstimation NeurIPS 2016

Sobolev quantities (norms, inner products, and distances) of probability density functions are important in the theory of nonparametric statistics, but have rarely been used in practice, partly due to a lack of practical estimators.

Statistical comparison of classifiers through Bayesian hierarchical modelling

BayesianTestsML/tutorial 28 Sep 2016

Usually one compares the accuracy of two competing classifiers via null hypothesis significance tests (nhst).

Priv'IT: Private and Sample Efficient Identity Testing

hoonose/privit 29 Mar 2017

We develop differentially private hypothesis testing methods for the small sample regime.

Data-adaptive statistics for multiple hypothesis testing in high-dimensional settings

wilsoncai1992/adaptest 24 Apr 2017

Current statistical inference problems in areas like astronomy, genomics, and marketing routinely involve the simultaneous testing of thousands -- even millions -- of null hypotheses.
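For context on what simultaneous testing involves, the sketch below implements the classical Benjamini-Hochberg step-up procedure, a standard baseline for controlling the false discovery rate. It is not the data-adaptive method of the paper above, and the p-values are made up for the example.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans, True where the null hypothesis is rejected."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k (1-based) with p_(k) <= k * alpha / m.
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * alpha / m:
            cutoff = rank
    # Reject every hypothesis ranked at or below the cutoff.
    rejected = [False] * m
    for idx in order[:cutoff]:
        rejected[idx] = True
    return rejected


# Made-up p-values for ten hypotheses.
p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
rejected = benjamini_hochberg(p_values, alpha=0.05)
print(sum(rejected), "of", len(p_values), "hypotheses rejected")
```

With these values the procedure rejects the two smallest p-values, whereas testing each hypothesis separately at the 0.05 level would reject five.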