no code implementations • ICML 2020 • Yuchao Cai, Hanyuan Hang, Hanfang Yang, Zhouchen Lin
In this paper, we propose a boosting algorithm for regression problems called \textit{boosted histogram transform for regression} (BHTR) based on histogram transforms composed of random rotations, stretchings, and translations.
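A minimal sketch of one such random histogram transform (rotation, stretching, translation, then flooring to bin indices), assuming numpy; the helper name `random_histogram_transform` and the bin-width range are illustrative, not from the paper:

```python
import numpy as np

def random_histogram_transform(X, rng, bin_low=0.5, bin_high=1.5):
    """Apply one random histogram transform to the rows of X.

    Illustrative composition of a random rotation R, a diagonal
    stretching S (random bin widths per axis), and a random
    translation b; flooring yields each point's histogram bin index.
    """
    d = X.shape[1]
    # Random rotation: orthogonalize a Gaussian matrix via QR.
    Q, _ = np.linalg.qr(rng.normal(size=(d, d)))
    Q[:, 0] *= np.sign(np.linalg.det(Q))  # force det = +1 (proper rotation)
    # Random stretching: independent bin widths on each axis.
    S = np.diag(1.0 / rng.uniform(bin_low, bin_high, size=d))
    # Random translation within one bin.
    b = rng.uniform(0.0, 1.0, size=d)
    # Floor to obtain the bin index of each transformed point.
    return np.floor(X @ (S @ Q).T + b).astype(int)
```

Averaging regressors built on many independent draws of this transform yields the ensemble; boosting then fits each new transform to the current residuals.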
no code implementations • 12 Dec 2023 • Hongwei Wen, Annika Betken, Hanyuan Hang
In covariate shift adaptation, where differences in data distribution arise from variations in feature probabilities, existing approaches naturally address the problem via \textit{feature probability matching} (\textit{FPM}).
no code implementations • 2 Dec 2023 • Yuchao Cai, Yuheng Ma, Hanfang Yang, Hanyuan Hang
We consider the paradigm of unsupervised anomaly detection, which involves the identification of anomalies within a dataset in the absence of labeled examples.
no code implementations • 18 Oct 2022 • Hanyuan Hang
On the theoretical side, we show that with a properly chosen number of nearest neighbors $k_D$ in the bagged $k$-distance, the sub-sample size $s$, the bagging rounds $B$, and the number of nearest neighbors $k_L$ for the localized level sets, BDMBC can achieve optimal convergence rates for mode estimation.
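The bagged $k$-distance can be sketched as follows; only $k_D$, $s$, and $B$ come from the text above, while the function name and uniform subsampling are assumptions for illustration:

```python
import numpy as np

def bagged_k_distance(X, query, k_D=2, s=20, B=50, seed=0):
    """Average the k_D-nearest-neighbor distance from `query` over
    B subsamples of size s drawn without replacement from X.

    A small bagged k-distance indicates a high-density region, which
    is what mode-based clustering exploits.
    """
    rng = np.random.default_rng(seed)
    dists = np.linalg.norm(X - query, axis=1)
    total = 0.0
    for _ in range(B):
        sub = rng.choice(len(X), size=s, replace=False)
        # k_D-th smallest distance within this subsample (1-indexed).
        total += np.sort(dists[sub])[k_D - 1]
    return total / B
```

Level sets of this statistic, localized with $k_L$ neighbors, then supply the mode estimates.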
no code implementations • 5 Dec 2021 • Hanyuan Hang
In this paper, we propose a gradient boosting algorithm called \textit{adaptive boosting histogram transform} (\textit{ABHT}) for regression to illustrate the local adaptivity of gradient boosting algorithms in histogram transform ensemble learning.
no code implementations • 1 Sep 2021 • Hanyuan Hang, Yuchao Cai, Hanfang Yang, Zhouchen Lin
In this paper, we propose an ensemble learning algorithm called \textit{under-bagging $k$-nearest neighbors} (\textit{under-bagging $k$-NN}) for imbalanced classification problems.
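A compact sketch of the under-bagging idea for imbalanced data, assuming numpy; the function name, brute-force neighbor search, and defaults are illustrative, not the paper's implementation:

```python
import numpy as np

def under_bagging_knn_predict(X_train, y_train, X_test, k=5, n_bags=10, seed=0):
    """Under-bagging k-NN: in each bag, every class is subsampled down
    to the minority-class size, a k-NN vote is taken within the bag,
    and votes are aggregated across bags."""
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(y_train, return_counts=True)
    n_min = counts.min()
    votes = np.zeros((len(X_test), len(classes)))
    for _ in range(n_bags):
        # Under-sample each class to the minority-class size.
        idx = np.concatenate([
            rng.choice(np.flatnonzero(y_train == c), size=n_min, replace=False)
            for c in classes
        ])
        Xb, yb = X_train[idx], y_train[idx]
        # Brute-force k-NN vote within this balanced bag.
        for i, x in enumerate(X_test):
            nn = np.argsort(np.linalg.norm(Xb - x, axis=1))[:k]
            for c_idx, c in enumerate(classes):
                votes[i, c_idx] += np.mean(yb[nn] == c)
    return classes[votes.argmax(axis=1)]
```

Balancing inside each bag keeps the minority class from being outvoted, while aggregation over bags recovers stability despite the aggressive subsampling.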
no code implementations • 10 Jun 2021 • Jingyi Cui, Hanyuan Hang, Yisen Wang, Zhouchen Lin
In this paper, we propose a density estimation algorithm called \textit{Gradient Boosting Histogram Transform} (GBHT), where we adopt the \textit{Negative Log Likelihood} as the loss function to make the boosting procedure available for the unsupervised tasks.
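A 1D toy sketch of boosting a histogram density under the negative log-likelihood, assuming numpy; for brevity the randomized histogram transforms are replaced by a single fixed grid, so this illustrates only the loss and the functional gradient step, not the full GBHT algorithm:

```python
import numpy as np

def gbht_density_1d(X, n_bins=16, n_rounds=100, lr=0.5):
    """Boost a log-density value F_b per bin: the NLL gradient with
    respect to F_b is (model bin mass - empirical bin mass), so each
    round steps F toward the empirical histogram."""
    lo, hi = float(X.min()), float(X.max())
    width = (hi - lo) / n_bins
    idx = np.clip(((X - lo) / width).astype(int), 0, n_bins - 1)
    emp = np.bincount(idx, minlength=n_bins) / len(X)  # empirical bin mass
    F = np.zeros(n_bins)  # unnormalized log-density per bin
    for _ in range(n_rounds):
        mass = np.exp(F) * width
        mass /= mass.sum()        # current model's bin mass
        F += lr * (emp - mass)    # negative NLL gradient step
    Z = (np.exp(F) * width).sum()
    return np.exp(F) / Z, lo, width  # normalized density per bin
```

In the actual algorithm each round fits a fresh randomized histogram transform to this gradient, which is what makes the boosting procedure available without labels.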
1 code implementation • 10 Jun 2021 • Hongwei Wen, Jingyi Cui, Hanyuan Hang, Jiabin Liu, Yisen Wang, Zhouchen Lin
As an important branch of weakly supervised learning, partial label learning deals with data where each instance is assigned a set of candidate labels, of which only one is true.

no code implementations • 3 Jun 2021 • Hanyuan Hang, Tao Huang, Yuchao Cai, Hanfang Yang, Zhouchen Lin
In this paper, we propose a gradient boosting algorithm for large-scale regression problems called \textit{Gradient Boosted Binary Histogram Ensemble} (GBBHE) based on binary histogram partition and ensemble learning.
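One way a binary histogram partition can be sketched, assuming numpy: starting from the bounding box, cells are halved along randomly chosen axes for a fixed number of rounds, and each point receives a cell code. The function name and the choice of one split axis per round are assumptions for illustration; the paper's construction may differ in detail:

```python
import numpy as np

def binary_histogram_bins(X, depth=3, seed=0):
    """Assign each row of X a binary-histogram cell index by `depth`
    rounds of midpoint splits along randomly chosen axes."""
    rng = np.random.default_rng(seed)
    lo = np.tile(X.min(axis=0).astype(float), (len(X), 1))  # per-point cell bounds
    hi = np.tile(X.max(axis=0).astype(float), (len(X), 1))
    codes = np.zeros(len(X), dtype=int)
    for _ in range(depth):
        axis = rng.integers(X.shape[1])
        mid = (lo[:, axis] + hi[:, axis]) / 2
        right = X[:, axis] > mid
        codes = codes * 2 + right        # append one bit per split
        lo[right, axis] = mid[right]     # right half keeps upper bounds
        hi[~right, axis] = mid[~right]   # left half keeps lower bounds
    return codes
```

Piecewise-constant regressors on such random partitions serve as the weak learners, with ensembling over independent partitions reducing variance.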
no code implementations • 1 Jan 2021 • Jiabin Liu, Hanyuan Hang, Bo Wang, Xin Shen, Zhouchen Lin
Learning from label proportions (LLP), where the training data are arranged in the form of groups with only label proportions provided instead of exact labels, is an important weakly supervised learning paradigm in machine learning.
no code implementations • ICLR 2020 • Tao Huang, Zhen Han, Xu Jia, Hanyuan Hang
In this paper, we propose a novel kind of kernel, random forest kernel, to enhance the empirical performance of MMD GAN.
no code implementations • 8 Dec 2019 • Hanyuan Hang, Zhouchen Lin, Xiaoyu Liu, Hongwei Wen
Instead, we apply kernel histogram transforms (KHT) equipped with smoother regressors such as support vector machines (SVMs), and it turns out that both single and ensemble KHT enjoy almost optimal convergence rates.
no code implementations • 24 Nov 2019 • Hanyuan Hang
We investigate a density estimator named \textit{histogram transform ensembles} (HTE), whose effectiveness is supported by both solid theoretical analysis and significant experimental performance.
no code implementations • 24 Jun 2019 • Hanyuan Hang, Yuchao Cai, Hanfang Yang
The single-level density-based approach has long been acknowledged as a conceptually and mathematically convincing clustering method.
no code implementations • 27 May 2019 • Hanyuan Hang, Xiaoyu Liu, Ingo Steinwart
We propose an algorithm named best-scored random forest for binary classification problems.
no code implementations • 9 May 2019 • Hanyuan Hang, Hongwei Wen
Thirdly, convergence rates under the $L_{\infty}$-norm are presented.
no code implementations • 9 May 2019 • Hanyuan Hang, Yingyi Chen, Johan A. K. Suykens
We propose a novel method designed for large-scale regression problems, namely the two-stage best-scored random forest (TBRF).
no code implementations • 4 Oct 2018 • Hanyuan Hang, Ingo Steinwart
This paper investigates the nonparametric regression problem using SVMs with anisotropic Gaussian RBF kernels.
no code implementations • 13 Jul 2016 • Hanyuan Hang, Ingo Steinwart, Yunlong Feng, Johan A. K. Suykens
We study the density estimation problem with observations generated by certain dynamical systems that admit a unique underlying invariant Lebesgue density.
no code implementations • 10 May 2016 • Hanyuan Hang, Yunlong Feng, Ingo Steinwart, Johan A. K. Suykens
We show that when the stochastic processes satisfy a generalized Bernstein-type inequality, a unified treatment of learning schemes with various mixing processes can be conducted, and a sharp oracle inequality can be established for generic regularized empirical risk minimization schemes.