no code implementations • CVPR 2013 • Hu Ding, Branislav Stojkovic, Ronald Berezney, Jinhui Xu
In this paper, we introduce a novel algorithmic tool for investigating association patterns of chromosome territories in a population of cells.
no code implementations • NeurIPS 2013 • Hu Ding, Ronald Berezney, Jinhui Xu
In this paper, we study the following new variant of prototype learning, called {\em $k$-prototype learning problem for 3D rigid structures}: Given a set of 3D rigid structures, find a set of $k$ rigid structures so that each of them is a prototype for a cluster of the given rigid structures and the total cost (or dissimilarity) is minimized.
no code implementations • 4 Sep 2018 • Hu Ding
For the balanced $k$-center clustering, we provide a $4$-approximation algorithm that improves the existing approximation factors; for the balanced $k$-median and $k$-means clusterings, our algorithms yield constant and $(1+\epsilon)$-approximation factors with any $\epsilon>0$.
no code implementations • 2 Oct 2018 • Hu Ding, Jinhui Xu
To overcome the difficulty caused by the loss of locality, we present in this paper a unified framework, called {\em Peeling-and-Enclosing (PnE)}, to iteratively solve two variants of the constrained clustering problems, {\em constrained $k$-means clustering} ($k$-CMeans) and {\em constrained $k$-median clustering} ($k$-CMedian).
no code implementations • 19 Nov 2018 • Hu Ding, Mingquan Ye
In the real world, many problems can be formulated as the alignment of two geometric patterns.
no code implementations • 24 Jan 2019 • Hu Ding, Haikuo Yu, Zixiu Wang
Our idea is inspired by Gonzalez's greedy algorithm for the ordinary $k$-center clustering problem.
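Gonzalez's greedy method is a classic 2-approximation for ordinary $k$-center; a minimal sketch is below. This is the generic textbook version, not the variant developed in the paper, and the function name is illustrative.

```python
import math

def gonzalez_k_center(points, k):
    """Greedy 2-approximation for k-center: repeatedly add the point
    farthest from the current set of centers (Gonzalez, 1985)."""
    centers = [points[0]]                      # arbitrary first center
    # dist[j] = distance from points[j] to its nearest chosen center
    dist = [math.dist(p, centers[0]) for p in points]
    while len(centers) < k:
        i = max(range(len(points)), key=lambda j: dist[j])
        centers.append(points[i])
        for j, p in enumerate(points):         # refresh nearest-center distances
            dist[j] = min(dist[j], math.dist(p, points[i]))
    return centers, max(dist)                  # radius of the resulting clustering
```

Each iteration costs $O(n)$ distance updates, so the whole run is $O(nk)$.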
no code implementations • 8 Apr 2019 • Hu Ding
Though the problem has been extensively studied before, most of the existing algorithms need at least linear time (in the number of input points $n$ and the dimensionality $d$) to achieve a $(1+\epsilon)$-approximation.
no code implementations • 24 May 2019 • Hu Ding, Jiawei Huang, Haikuo Yu
The experiments suggest that the uniform sampling method achieves clustering results comparable to those of existing methods, while greatly reducing the running time.
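The uniform-sampling idea can be illustrated with a toy pipeline: run plain Lloyd's $k$-means on a uniform sample and evaluate the resulting centers on the full data. This sketch is a generic illustration with hypothetical function names, not the paper's algorithm.

```python
import math
import random

def lloyd_kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's algorithm with random initialization."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:                       # assign each point to its nearest center
            i = min(range(k), key=lambda c: math.dist(p, centers[c]))
            clusters[i].append(p)
        for i, cl in enumerate(clusters):      # recompute centers as cluster means
            if cl:
                centers[i] = tuple(sum(x) / len(cl) for x in zip(*cl))
    return centers

def kmeans_on_uniform_sample(points, k, sample_size, seed=0):
    """Cluster a uniform sample instead of the full data set."""
    rng = random.Random(seed)
    sample = rng.sample(points, min(sample_size, len(points)))
    return lloyd_kmeans(sample, k)

def kmeans_cost(points, centers):
    """Sum of squared distances from each point to its nearest center."""
    return sum(min(math.dist(p, c) ** 2 for c in centers) for p in points)
```

On well-separated data, centers fitted on a modest sample already give near-optimal cost on the full set.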
no code implementations • 27 Feb 2020 • Hu Ding, Ruizhe Qin, Jiawei Huang
We focus on two fundamental optimization problems: {\em SVM with outliers} and {\em $k$-center clustering with outliers}.
no code implementations • 14 Jun 2020 • Hu Ding, Fan Yang, Jiawei Huang
For the data sanitization defense, we link it to the intrinsic dimensionality of data; in particular, we provide a sampling theorem in doubling metrics for explaining the effectiveness of DBSCAN (as a density-based outlier removal method) for defending against poisoning attacks.
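The density-based removal idea can be shown with a toy version of DBSCAN's noise rule: a point is discarded if too few points lie within distance `eps` of it. This is a simplified illustration (parameter names are generic), not the paper's sampling theorem.

```python
import math

def density_outliers(points, eps, min_pts):
    """Flag points with fewer than min_pts neighbors (self included)
    within distance eps -- the 'noise' rule used by DBSCAN."""
    outliers = []
    for i, p in enumerate(points):
        neighbors = sum(1 for q in points if math.dist(p, q) <= eps)
        if neighbors < min_pts:
            outliers.append(i)
    return outliers
```

Isolated poisoned points fail the density test, while points inside genuine clusters survive.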
no code implementations • 16 Nov 2020 • Guannan Hu, Wu Zhang, Hu Ding, Wenhao Zhu
Catastrophic forgetting in continual learning is a common destructive phenomenon in gradient-based neural networks that learn sequential tasks, and it differs greatly from forgetting in humans, who can learn and accumulate knowledge throughout their lives.
1 code implementation • 28 Feb 2021 • Jiawei Huang, Wenjie Liu, Hu Ding
Real-world datasets often contain outliers, and their presence can make clustering problems much more challenging.
no code implementations • NeurIPS 2021 • Zixiu Wang, Yiwen Guo, Hu Ding
In this paper, we propose a novel robust coreset method for the {\em continuous-and-bounded learning} problems (with outliers) which includes a broad range of popular optimization objectives in machine learning, {\em e.g.,} logistic regression and $k$-means clustering.
no code implementations • NeurIPS 2021 • Ruizhe Qin, Mengying Li, Hu Ding
Clustering ensemble is one of the most important problems in ensemble learning.
no code implementations • 5 Dec 2021 • Jiawei Huang, Ruomin Huang, Wenjie Liu, Nikolaos M. Freris, Hu Ding
A wide range of optimization problems arising in machine learning can be solved by gradient descent algorithms, and a central question in this area is how to efficiently compress a large-scale dataset so as to reduce the computational complexity.
1 code implementation • 7 Sep 2022 • Hu Ding, Wenjie Liu, Mingquan Ye
Our framework is a ``data-dependent'' approach whose complexity depends on the intrinsic dimension of the input data.
1 code implementation • 9 Oct 2022 • Ruomin Huang, Jiawei Huang, Wenjie Liu, Hu Ding
Though it is challenging to obtain a conventional coreset for \textsf{WDRO} due to the uncertainty issue of ambiguous data, we show that we can compute a ``dual coreset'' by using the strong duality property of \textsf{WDRO}.
1 code implementation • 9 Oct 2022 • Jiaxiang Chen, Qingyuan Yang, Ruomin Huang, Hu Ding
A coreset is a small set that can approximately preserve the structure of the original input data set.
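The coreset idea can be illustrated with one of the simplest constructions: snap points to a grid and weight each occupied cell by the number of points it absorbs. This is a generic textbook-style example, not the construction studied in the paper.

```python
import math
from collections import Counter

def grid_coreset(points, cell):
    """Snap each point to the center of its grid cell; a cell's weight is
    the number of input points it absorbs.  Each point moves by at most
    cell * sqrt(d) / 2, so weighted clustering costs on the coreset
    approximate costs on the full data up to a cell-size-controlled error."""
    counts = Counter(
        tuple((math.floor(x / cell) + 0.5) * cell for x in p) for p in points
    )
    return [(p, w) for p, w in counts.items()]
```

The total weight always equals the input size, so weighted objectives remain comparable to the originals.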
no code implementations • 7 Jan 2023 • Hu Ding
Under the stability assumption, we present two sampling algorithms for computing radius-approximate MEB with sample complexities independent of the number of input points $n$.
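For context, radius-approximate MEB can be computed by the classic Badoiu-Clarkson iteration, sketched below. This standard method is shown only as background; it is not the paper's sampling algorithms, whose point is precisely to avoid touching all $n$ points.

```python
import math

def approx_meb(points, iters=100):
    """Badoiu-Clarkson style iteration for the minimum enclosing ball:
    start at any input point and move the center a 1/(t+1) fraction toward
    the current farthest point; after O(1/eps^2) steps the radius is
    within a (1+eps) factor of optimal."""
    c = list(points[0])
    for t in range(1, iters + 1):
        far = max(points, key=lambda p: math.dist(c, p))
        c = [ci + (fi - ci) / (t + 1) for ci, fi in zip(c, far)]
    radius = max(math.dist(c, p) for p in points)
    return tuple(c), radius
```

Each step needs a full farthest-point scan, which is exactly the linear-time cost the sampling approach tries to beat.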
1 code implementation • 7 Jan 2023 • Hu Ding, Ruomin Huang, Kai Liu, Haikuo Yu, Zixiu Wang
Though a number of methods have been developed over the past decades, it is still quite challenging to design a quality-guaranteed algorithm with low complexity for this problem.
1 code implementation • 27 Oct 2023 • Xiaoyang Xu, Hu Ding
Optimal transport is a fundamental topic that has attracted a great deal of attention from the optimization community over the past decades.
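A common entry point to computational optimal transport is the entropy-regularized formulation solved by Sinkhorn iterations; a minimal sketch is below. This generic baseline is shown for context only and is not the method proposed in the paper.

```python
import math

def sinkhorn(cost, a, b, reg=0.1, iters=500):
    """Entropy-regularized optimal transport: alternately rescale the rows
    and columns of K = exp(-cost/reg) until the plan's marginals match
    the source distribution a and target distribution b."""
    n, m = len(cost), len(cost[0])
    K = [[math.exp(-cost[i][j] / reg) for j in range(m)] for i in range(n)]
    u, v = [1.0] * n, [1.0] * m
    for _ in range(iters):
        u = [a[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    # transport plan P[i][j] = u[i] * K[i][j] * v[j]
    return [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]
```

With a small regularizer the plan concentrates on the cheapest matching; larger `reg` trades accuracy for faster, more stable convergence.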
no code implementations • 20 Apr 2024 • Qingyuan Yang, Hu Ding
First, we investigate the relation between $k$-sparse WB with outliers and the clustering (with outliers) problems.