no code implementations • 21 Nov 2023 • Caelan Atamanchuk, Luc Devroye, Gabor Lugosi
We also show that, without any condition on the density, a consistent estimator of $d$ exists when $n r_n^d \to \infty$ and $r_n = o(1)$.
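The scaling $n r_n^d \to \infty$ reflects that in a random geometric graph the expected number of edges grows like $r^d$. A crude illustrative sketch (not the paper's estimator) reads $d$ off the ratio of edge counts at two radii, since doubling the radius multiplies the expected edge count by roughly $2^d$:

```python
import itertools
import math
import random

def edge_count(points, r):
    """Count pairs of points within Euclidean distance r."""
    return sum(
        1
        for p, q in itertools.combinations(points, 2)
        if math.dist(p, q) <= r
    )

def estimate_dimension(points, r):
    """Since E[edges] scales like r^d (ignoring boundary effects),
    log2(edges at 2r / edges at r) is roughly d."""
    e_r = edge_count(points, r)
    e_2r = edge_count(points, 2 * r)
    return math.log2(e_2r / e_r)

# Hypothetical setup: 1500 uniform points in the unit square (true d = 2).
rng = random.Random(0)
d = 2
points = [tuple(rng.random() for _ in range(d)) for _ in range(1500)]
est = estimate_dimension(points, 0.04)
print(est)
```

Boundary effects bias the ratio slightly below $2^d$, so the estimate lands a bit under the true dimension; the condition $r_n = o(1)$ in the abstract is what keeps such effects asymptotically negligible.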
no code implementations • 2 Jun 2023 • Simon Briend, Luc Devroye, Gabor Lugosi
The parents' bits are flipped with probability $p$, and a majority vote is taken.
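A minimal simulation of this broadcasting model, under the assumption of a complete binary tree of fixed depth (the tree shape and parameters here are illustrative, not taken from the paper):

```python
import random

def broadcast(bit, depth, p, rng):
    """Propagate a bit down a complete binary tree: each child's copy
    of its parent's bit is flipped independently with probability p.
    Returns the list of leaf bits."""
    if depth == 0:
        return [bit]
    leaves = []
    for _ in range(2):
        child = bit ^ (rng.random() < p)
        leaves.extend(broadcast(child, depth - 1, p, rng))
    return leaves

def majority(bits):
    """Majority vote over the leaf bits."""
    return int(sum(bits) * 2 > len(bits))

rng = random.Random(1)
p = 0.1        # flip probability, chosen well below 1/2
trials = 200
correct = sum(
    majority(broadcast(root, 10, p, rng)) == root
    for root in (rng.randrange(2) for _ in range(trials))
)
acc = correct / trials
print(acc)
```

For small $p$ the majority of the leaves recovers the root bit with probability bounded away from $1/2$, which is the regime where reconstruction is possible.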
no code implementations • 3 May 2021 • Luc Devroye, Alex Dytso
In particular, under the assumptions that the probability measure $\mu$ of the observation is atomic, and the map from $f$ to $\mu$ is bijective, it is shown that there exists an estimator $f_n$ such that, for every density $f$, $\lim_{n\to \infty} \mathbb{E} \left[ \int |f_n -f | \right]=0$.
no code implementations • 25 Feb 2021 • Luc Devroye, László Györfi
We revisit the problem of the estimation of the differential entropy $H(f)$ of a random vector $X$ in $\mathbb{R}^d$ with density $f$, assuming that $H(f)$ exists and is finite.
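One standard nonparametric approach to this problem is the Kozachenko–Leonenko nearest-neighbour estimator; the sketch below is a generic one-dimensional version for illustration, not necessarily the estimator analyzed in the paper:

```python
import math
import random

def kl_entropy_1d(sample):
    """Kozachenko-Leonenko 1-NN estimate of differential entropy for a
    1-d sample:
        H_hat = (1/n) * sum(log eps_i) + log(2 * (n - 1)) + gamma,
    where eps_i is the distance from x_i to its nearest neighbour and
    gamma is the Euler-Mascheroni constant."""
    xs = sorted(sample)
    n = len(xs)
    nn = (
        [xs[1] - xs[0]]
        + [min(xs[i] - xs[i - 1], xs[i + 1] - xs[i]) for i in range(1, n - 1)]
        + [xs[-1] - xs[-2]]
    )
    gamma = 0.5772156649015329
    return sum(math.log(e) for e in nn) / n + math.log(2 * (n - 1)) + gamma

# Sanity check on a standard normal, whose entropy is (1/2) log(2*pi*e).
rng = random.Random(0)
sample = [rng.gauss(0.0, 1.0) for _ in range(20000)]
true_h = 0.5 * math.log(2 * math.pi * math.e)
err = abs(kl_entropy_1d(sample) - true_h)
print(err)
```

The estimate converges to $H(f)$ as $n \to \infty$ whenever the entropy exists, which is exactly the finiteness assumption stated above.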
Statistics Theory
no code implementations • 22 Oct 2020 • Luc Devroye, Silvio Lattanzi, Gabor Lugosi, Nikita Zhivotovskiy
We study the problem of estimating the common mean $\mu$ of $n$ independent symmetric random variables with different and unknown standard deviations $\sigma_1 \le \sigma_2 \le \cdots \le\sigma_n$.
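The difficulty in this setting is that a few observations may have enormous standard deviations. A toy simulation (illustrative only; the paper's estimators are more refined than the plain median) shows why the empirical mean is a poor choice while a robust statistic is not:

```python
import random
import statistics

rng = random.Random(42)
mu = 3.0
# Hypothetical scales: half the observations are tight, half are wild.
sigmas = [1.0] * 50 + [1000.0] * 50

def sample():
    """One draw of n independent symmetric observations around mu."""
    return [mu + rng.gauss(0.0, s) for s in sigmas]

trials = 200
mean_err = statistics.mean(
    abs(statistics.mean(sample()) - mu) for _ in range(trials)
)
med_err = statistics.mean(
    abs(statistics.median(sample()) - mu) for _ in range(trials)
)
print(mean_err, med_err)
```

The empirical mean inherits the huge variances, while the median is anchored by the well-concentrated half of the sample; estimators adapted to unknown $\sigma_1 \le \cdots \le \sigma_n$ exploit this kind of heterogeneity.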
no code implementations • 14 Dec 2018 • Luc Devroye, Tommy Reddad
We propose a simple recursive data-based partitioning scheme which produces piecewise-constant or piecewise-linear density estimates on intervals, and show how this scheme can determine the optimal $L_1$ minimax rate for some discrete nonparametric classes.
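A bare-bones version of such a scheme, given only as a hedged sketch of the piecewise-constant case (the split rule and stopping threshold here are illustrative choices, not the paper's):

```python
import random

def partition_density(sample, lo, hi, total, min_pts=100):
    """Recursive data-based partitioning: split a cell at its empirical
    median until it holds at most min_pts points, then estimate the
    density as constant on the cell. Returns a list of (lo, hi, height)."""
    pts = [x for x in sample if lo <= x < hi]
    if len(pts) <= min_pts:
        return [(lo, hi, len(pts) / (total * (hi - lo)))]
    mid = sorted(pts)[len(pts) // 2]          # data-based split point
    if mid <= lo or mid >= hi:                # degenerate split: stop
        return [(lo, hi, len(pts) / (total * (hi - lo)))]
    return (
        partition_density(pts, lo, mid, total, min_pts)
        + partition_density(pts, mid, hi, total, min_pts)
    )

# Skewed test density on (0, 1): X = U^2 has density 1 / (2 * sqrt(x)).
rng = random.Random(0)
n = 5000
sample = [rng.random() ** 2 for _ in range(n)]
pieces = partition_density(sample, 0.0, 1.0, n)
mass = sum(h * (b - a) for a, b, h in pieces)
print(len(pieces), mass)
```

The median split adapts the cell widths to the data, so cells are narrow where the density is large; the resulting estimate always integrates to one, which is convenient for $L_1$ analysis.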
no code implementations • 18 Jun 2018 • Luc Devroye, Abbas Mehrabian, Tommy Reddad
Let $G$ be an undirected graph with $m$ edges and $d$ vertices.
no code implementations • 20 Jan 2013 • Gérard Biau, Luc Devroye
The cellular tree classifier model addresses a fundamental problem in the design of classifiers for a parallel or distributed computing world: Given a data set, is it sufficient to apply a majority rule for classification, or shall one split the data into two or more parts and send each part to a potentially different computer (or cell) for further processing?
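The two options in that question can be caricatured in a few lines: a cell either answers with a majority vote over its data, or splits the data and forwards the query to one half, as if to another machine. The stopping rule and median split below are illustrative choices, not the paper's classifier:

```python
import statistics

def cellular_classify(data, query, depth=0, max_depth=8):
    """Toy cellular decision on 1-d data [(x, label), ...]: a cell either
    answers with the majority label of its data, or splits at the median
    of the feature and forwards the query to the matching half."""
    labels = [y for _, y in data]
    # Stop when the cell is small, pure, or deep enough; else split.
    if depth >= max_depth or len(data) < 10 or len(set(labels)) == 1:
        return statistics.mode(labels)
    cut = sorted(x for x, _ in data)[len(data) // 2]
    half = [(x, y) for x, y in data if (x < cut) == (query < cut)]
    if not half or len(half) == len(data):
        return statistics.mode(labels)
    return cellular_classify(half, query, depth + 1, max_depth)

# Hypothetical example: label is 1 iff the feature exceeds 0.5.
data = [(i / 200, int(i / 200 > 0.5)) for i in range(200)]
print(cellular_classify(data, 0.9), cellular_classify(data, 0.1))
```

Each recursive call touches only the data routed to one cell, which is the point of the model: no cell ever needs the whole data set.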