We present the NeurIPS 2021 consistency experiment, a larger-scale variant of the 2014 NeurIPS experiment in which 10% of conference submissions were reviewed by two independent committees to quantify the randomness in the review process.

In a top-tier computer science conference (NeurIPS 2021) with more than 23,000 submitting authors and 9,000 submitted papers, we survey the authors on three questions: (i) their predicted probability of acceptance for each of their papers, (ii) their perceived ranking of their own papers based on scientific contribution, and (iii) the change in their perception about their own papers after seeing the reviews.

Reproducibility, that is, obtaining results similar to those presented in a paper or talk using the same code and data (when available), is a necessary step in verifying the reliability of research findings.

Under the more challenging weak linear separability condition, we design an efficient algorithm with a mistake bound of $\min (2^{\widetilde{O}(K \log^2 (1/\gamma))}, 2^{\widetilde{O}(\sqrt{1/\gamma} \log K)})$.

We design and study a Contextual Memory Tree (CMT), a learning memory controller that inserts new memories into an experience store of unbounded size.

We present a systematic approach for achieving fairness in a binary classification setting.

An efficient bandit algorithm for $\sqrt{T}$-regret in online multiclass prediction?

The regret bound holds simultaneously with respect to a family of loss functions parameterized by $\eta$, for a range of $\eta$ restricted by the norm of the competitor.

We investigate active learning with access to two distinct oracles: Label (which is standard) and Search (which is not).

We extend the theory of boosting for regression problems to the online learning setting.

We study online boosting, the task of converting any weak online learner into a strong online learner.

We provide a summary of the mathematical and computational techniques that have enabled learning reductions to effectively address a wide class of problems, and show that this approach to solving machine learning problems can be broadly useful.

Can we effectively learn a nonlinear representation in time comparable to linear learning?

We consider the problem of estimating the conditional probability of a label in time $O(\log n)$, where $n$ is the number of possible labels.
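
One way to see how logarithmic-time estimation can work is to place the labels at the leaves of a balanced binary tree and multiply binary branch probabilities along the root-to-leaf path. The sketch below is illustrative only and is not taken from the paper: the function `label_probability` and the callback `node_prob_right` (a stand-in for a learned binary estimator at each internal node) are hypothetical names introduced here.

```python
def label_probability(x, label, n_labels, node_prob_right):
    """Estimate P(label | x) with about log2(n_labels) binary estimator calls.

    node_prob_right(x, lo, hi) is a hypothetical stand-in for a learned
    binary estimator: given context x, it returns the probability that the
    label lies in the right half of the label range [lo, hi).
    """
    lo, hi = 0, n_labels
    prob = 1.0
    steps = 0
    while hi - lo > 1:
        mid = (lo + hi) // 2
        p_right = node_prob_right(x, lo, hi)
        if label >= mid:
            prob *= p_right        # descend into the right subtree
            lo = mid
        else:
            prob *= 1.0 - p_right  # descend into the left subtree
            hi = mid
        steps += 1
    return prob, steps


# Usage: with uninformative estimators, each of 8 labels has probability 1/8,
# reached after 3 binary decisions.
prob, steps = label_probability(None, 5, 8, lambda x, lo, hi: 0.5)
```

Because the two branch probabilities at every node sum to one, the estimates over all labels sum to one regardless of how the node estimators are chosen.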

We present and analyze an agnostic active learning algorithm that works without keeping a version space.

We show that the Offset Tree is an optimal reduction to binary classification.

