Search Results for author: Yi Hao

Found 12 papers, 2 papers with code

Unsupervised Embedding of Hierarchical Structure in Euclidean Space

1 code implementation 30 Oct 2020 Jinyu Zhao, Yi Hao, Cyrus Rashtchian

To learn the embedding, we revisit using a variational autoencoder with a Gaussian mixture prior, and we show that rescaling the latent space embedding and then applying Ward's linkage-based algorithm leads to improved results for both dendrogram purity and the Moseley-Wang cost function.
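A minimal sketch of the clustering step described above, assuming `Z` already holds latent codes produced by a trained VAE with a Gaussian-mixture prior; the norm-based rescaling below is only a placeholder, not the paper's exact transformation.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(0)
Z = rng.normal(size=(200, 16))   # stand-in for VAE latent embeddings

# Hypothetical rescaling of the latent space: stretch each point by its own norm.
Z_rescaled = Z * np.linalg.norm(Z, axis=1, keepdims=True)

# Ward's linkage-based agglomerative clustering on the rescaled embedding.
links = linkage(Z_rescaled, method="ward")
labels = fcluster(links, t=10, criterion="maxclust")
print(labels[:20])
```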

Clustering

The Broad Optimality of Profile Maximum Likelihood

1 code implementation NeurIPS 2019 Yi Hao, Alon Orlitsky

In particular, for every alphabet size $k$ and desired accuracy $\varepsilon$:

$\textbf{Distribution estimation}$: Under $\ell_1$ distance, PML yields optimal $\Theta(k/(\varepsilon^2\log k))$ sample complexity for sorted-distribution estimation, and a PML-based estimator empirically outperforms the Good-Turing estimator on the actual distribution.

$\textbf{Additive property estimation}$: For a broad class of additive properties, the PML plug-in estimator uses just four times the sample size required by the best estimator to achieve roughly twice its error, with exponentially higher confidence.

$\boldsymbol{\alpha}\textbf{-R\'enyi entropy estimation}$: For integer $\alpha>1$, the PML plug-in estimator has optimal $k^{1-1/\alpha}$ sample complexity; for non-integer $\alpha>3/4$, the PML plug-in estimator has sample complexity lower than the state of the art.

$\textbf{Identity testing}$: In testing whether an unknown distribution is equal to, or at least $\varepsilon$ far from, a given distribution in $\ell_1$ distance, a PML-based tester achieves the optimal sample complexity up to logarithmic factors of $k$.
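For readers unfamiliar with profiles: a sample's profile records how many distinct symbols appear with each multiplicity, and PML selects the distribution maximizing the probability of that profile. Below is a minimal sketch of the profile computation only; the PML optimization itself is hard and is typically approximated in practice.

```python
from collections import Counter

def profile(sample):
    """Return {multiplicity: number of distinct symbols with that multiplicity}."""
    counts = Counter(sample)               # symbol -> multiplicity
    return dict(Counter(counts.values()))  # multiplicity -> number of symbols

print(profile("abracadabra"))   # {5: 1, 2: 2, 1: 2}
```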

On Learning Markov Chains

no code implementations NeurIPS 2018 Yi Hao, Alon Orlitsky, Venkatadheeraj Pichapati

We consider two problems related to the min-max risk (expected loss) of estimating an unknown $k$-state Markov chain from its $n$ sequential samples: predicting the conditional distribution of the next sample with respect to the KL-divergence, and estimating the transition matrix with respect to a natural loss induced by KL or a more general $f$-divergence measure.
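An illustrative baseline for the two problems above, not the paper's minimax-optimal estimator: estimate the transition matrix of a small chain from one sample path with add-one (Laplace) smoothing, and score the next-state prediction with KL divergence against the true conditional row. The chain, path length, and smoothing constant are arbitrary choices for the sketch.

```python
import numpy as np

def estimate_transitions(path, k, alpha=1.0):
    """Add-alpha smoothed transition-matrix estimate from one sample path."""
    counts = np.full((k, k), alpha)
    for s, t in zip(path[:-1], path[1:]):
        counts[s, t] += 1
    return counts / counts.sum(axis=1, keepdims=True)

def kl(p, q):
    return float(np.sum(p * np.log(p / q)))

rng = np.random.default_rng(1)
P = np.array([[0.9, 0.1], [0.3, 0.7]])          # true 2-state chain
path = [0]
for _ in range(500):
    path.append(rng.choice(2, p=P[path[-1]]))

P_hat = estimate_transitions(path, k=2)
print(kl(P[path[-1]], P_hat[path[-1]]))          # next-sample prediction loss
```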

Maxing and Ranking with Few Assumptions

no code implementations NeurIPS 2017 Moein Falahatgar, Yi Hao, Alon Orlitsky, Venkatadheeraj Pichapati, Vaishakh Ravindrakumar

PAC maximum selection (maxing) and ranking of $n$ elements via random pairwise comparisons have diverse applications and have been studied under many models and assumptions.
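A naive sketch of the pairwise-comparison setting: each query returns a noisy duel outcome, and a sequential-elimination routine keeps the empirical winner of repeated duels. The preference matrix and duel budget here are purely illustrative; the paper's algorithms use far fewer comparisons and weaker assumptions than this sketch.

```python
import random

def compare(i, j, p):
    """Return True if i beats j; p[i][j] is the probability that i wins a duel."""
    return random.random() < p[i][j]

def naive_max(n, p, duels_per_pair=50):
    champ = 0
    for challenger in range(1, n):
        wins = sum(compare(challenger, champ, p) for _ in range(duels_per_pair))
        if wins > duels_per_pair / 2:
            champ = challenger
    return champ

# Toy preference matrix where higher-indexed elements tend to win.
n = 5
p = [[0.5 + 0.1 * (i - j) for j in range(n)] for i in range(n)]
print(naive_max(n, p))   # most likely prints 4
```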

Data Amplification: Instance-Optimal Property Estimation

no code implementations ICML 2020 Yi Hao, Alon Orlitsky

For a large variety of distribution properties, including four of the most popular ones, and for every underlying distribution, the proposed estimators achieve the accuracy that the empirical-frequency plug-in estimators would attain using a logarithmic factor more samples.
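For reference, a sketch of the empirical-frequency plug-in baseline mentioned above, shown for Shannon entropy: estimate each symbol's probability by its empirical frequency and plug the estimates into the property. The sample string is arbitrary.

```python
import math
from collections import Counter

def plugin_entropy(sample):
    """Empirical-frequency plug-in estimate of Shannon entropy (in nats)."""
    n = len(sample)
    freqs = [c / n for c in Counter(sample).values()]
    return -sum(f * math.log(f) for f in freqs)

print(plugin_entropy("aabbbccddddeeee"))
```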

Data Amplification: A Unified and Competitive Approach to Property Estimation

no code implementations NeurIPS 2018 Yi Hao, Alon Orlitsky, Ananda T. Suresh, Yihong Wu

We design the first unified, linear-time, competitive property estimator that, for a wide class of properties and all underlying distributions, uses just $2n$ samples to achieve the performance attained by the empirical estimator with $n\sqrt{\log n}$ samples.

Unified Sample-Optimal Property Estimation in Near-Linear Time

no code implementations NeurIPS 2019 Yi Hao, Alon Orlitsky

We consider the fundamental learning problem of estimating properties of distributions over large domains.

SURF: A Simple, Universal, Robust, Fast Distribution Learning Algorithm

no code implementations NeurIPS 2020 Yi Hao, Ayush Jain, Alon Orlitsky, Vaishakh Ravindrakumar

Sample- and computationally-efficient distribution estimation is a fundamental tenet in statistics and machine learning.

Optimal Prediction of the Number of Unseen Species with Multiplicity

no code implementations NeurIPS 2020 Yi Hao, Ping Li

Based on a sample of size $n$, we consider estimating the number of symbols that appear at least $\mu$ times in an independent sample of size $a \cdot n$, where $a$ is a given parameter.
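A small simulation of the target quantity, not the paper's estimator: draw an observed sample of size $n$ and an independent sample of size $a \cdot n$ from the same distribution, then count the symbols that appear at least $\mu$ times in the latter. The Dirichlet-drawn distribution and parameter values below are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(2)
k, n, a, mu = 1000, 500, 3, 2
dist = rng.dirichlet(np.ones(k))               # unknown underlying distribution

observed = rng.choice(k, size=n, p=dist)       # the sample the estimator sees
future = rng.choice(k, size=a * n, p=dist)     # independent sample of size a*n

counts = np.bincount(future, minlength=k)
print(int(np.sum(counts >= mu)))               # symbols appearing >= mu times
```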

TURF: A Two-factor, Universal, Robust, Fast Distribution Learning Algorithm

no code implementations15 Feb 2022 Yi Hao, Ayush Jain, Alon Orlitsky, Vaishakh Ravindrakumar

We derive a near-linear-time and essentially sample-optimal estimator that establishes $c_{t, d}=2$ for all $(t, d)\ne(1, 0)$.
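Here $c_{t, d}$ measures, roughly, how competitive the estimator is against the best $t$-piece, degree-$d$ polynomial approximation of the underlying density (per the paper's framing). Below is a minimal piecewise-constant ($d = 0$) sketch of that approximation class on $[0, 1]$; the equal-width histogram is only illustrative and is not the paper's near-linear-time, sample-optimal estimator.

```python
import numpy as np

def histogram_density(sample, t):
    """Equal-width t-piece piecewise-constant density estimate on [0, 1]."""
    counts, edges = np.histogram(sample, bins=t, range=(0.0, 1.0))
    heights = counts / (len(sample) * np.diff(edges))   # normalize to a density
    return edges, heights

rng = np.random.default_rng(3)
sample = rng.beta(2, 5, size=2000)      # stand-in for draws from an unknown density
edges, heights = histogram_density(sample, t=8)
print(heights)
```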

