Search Results for author: Hu Ding

Found 22 papers, 6 papers with code

Approximate Algorithms For $k$-Sparse Wasserstein Barycenter With Outliers

no code implementations • 20 Apr 2024 • Qingyuan Yang, Hu Ding

First, we investigate the relation between $k$-sparse WB with outliers and the clustering (with outliers) problems.

A Novel Skip Orthogonal List for Dynamic Optimal Transport Problem

1 code implementation • 27 Oct 2023 • Xiaoyang Xu, Hu Ding

Optimal transport is a fundamental topic that has attracted a great amount of attention from the optimization community in the past decades.

Randomized Greedy Algorithms and Composable Coreset for k-Center Clustering with Outliers

1 code implementation • 7 Jan 2023 • Hu Ding, Ruomin Huang, Kai Liu, Haikuo Yu, Zixiu Wang

Though a number of methods have been developed in the past decades, it is still quite challenging to design a quality-guaranteed algorithm with low complexity for this problem.

Clustering

Sublinear Time Algorithms for Several Geometric Optimization (With Outliers) Problems In Machine Learning

no code implementations • 7 Jan 2023 • Hu Ding

Under the stability assumption, we present two sampling algorithms for computing radius-approximate MEB with sample complexities independent of the number of input points $n$.

LEMMA

Coresets for Wasserstein Distributionally Robust Optimization Problems

1 code implementation • 9 Oct 2022 • Ruomin Huang, Jiawei Huang, Wenjie Liu, Hu Ding

Though it is challenging to obtain a conventional coreset for WDRO due to the uncertainty issue of ambiguous data, we show that we can compute a "dual coreset" by using the strong duality property of WDRO.

Coresets for Relational Data and The Applications

1 code implementation • 9 Oct 2022 • Jiaxiang Chen, Qingyuan Yang, Ruomin Huang, Hu Ding

A coreset is a small set that can approximately preserve the structure of the original input data set.

Relational Reasoning

A Data-dependent Approach for High Dimensional (Robust) Wasserstein Alignment

1 code implementation • 7 Sep 2022 • Hu Ding, Wenjie Liu, Mingquan Ye

Our framework is a "data-dependent" approach whose complexity depends on the intrinsic dimension of the input data.

Vocal Bursts Intensity Prediction

A Novel Sequential Coreset Method for Gradient Descent Algorithms

no code implementations • 5 Dec 2021 • Jiawei Huang, Ruomin Huang, Wenjie Liu, Nikolaos M. Freris, Hu Ding

A wide range of optimization problems arising in machine learning can be solved by gradient descent algorithms, and a central question in this area is how to efficiently compress a large-scale dataset so as to reduce the computational complexity.

Data Compression

Robust and Fully-Dynamic Coreset for Continuous-and-Bounded Learning (With Outliers) Problems

no code implementations • NeurIPS 2021 • Zixiu Wang, Yiwen Guo, Hu Ding

In this paper, we propose a novel robust coreset method for continuous-and-bounded learning problems (with outliers), which include a broad range of popular optimization objectives in machine learning, e.g., logistic regression and $k$-means clustering.

BIG-bench Machine Learning

Is Simple Uniform Sampling Effective for Center-Based Clustering with Outliers: When and Why?

1 code implementation • 28 Feb 2021 • Jiawei Huang, Wenjie Liu, Hu Ding

Real-world datasets often contain outliers, and the presence of outliers can make clustering problems much more challenging.

Clustering

Gradient Episodic Memory with a Soft Constraint for Continual Learning

no code implementations • 16 Nov 2020 • Guannan Hu, Wu Zhang, Hu Ding, Wenhao Zhu

Catastrophic forgetting in continual learning is a common destructive phenomenon in gradient-based neural networks that learn sequential tasks, and it is much different from forgetting in humans, who can learn and accumulate knowledge throughout their whole lives.

Continual Learning

Defending SVMs against Poisoning Attacks: the Hardness and DBSCAN Approach

no code implementations • 14 Jun 2020 • Hu Ding, Fan Yang, Jiawei Huang

For the data sanitization defense, we link it to the intrinsic dimensionality of data; in particular, we provide a sampling theorem in doubling metrics for explaining the effectiveness of DBSCAN (as a density-based outlier removal method) for defending against poisoning attacks.

The Effectiveness of Uniform Sampling for Center-Based Clustering with Outliers

no code implementations • 24 May 2019 • Hu Ding, Jiawei Huang, Haikuo Yu

The experiments suggest that the uniform sampling method can achieve clustering results comparable to those of other existing methods while greatly reducing the running time.

Clustering

Minimum Enclosing Ball Revisited: Stability and Sub-linear Time Algorithms

no code implementations • 8 Apr 2019 • Hu Ding

Though the problem has been extensively studied before, most of the existing algorithms need at least linear time (in the number of input points $n$ and the dimensionality $d$) to achieve a $(1+\epsilon)$-approximation.

Greedy Strategy Works for $k$-Center Clustering with Outliers and Coreset Construction

no code implementations • 24 Jan 2019 • Hu Ding, Haikuo Yu, Zixiu Wang

Our idea is inspired by the greedy method, Gonzalez's algorithm, for solving the problem of ordinary $k$-center clustering.

Clustering

On Geometric Alignment in Low Doubling Dimension

no code implementations • 19 Nov 2018 • Hu Ding, Mingquan Ye

In the real world, many problems can be formulated as the alignment between two geometric patterns.

A Unified Framework for Clustering Constrained Data without Locality Property

no code implementations • 2 Oct 2018 • Hu Ding, Jinhui Xu

To overcome the difficulty caused by the loss of locality, we present in this paper a unified framework, called Peeling-and-Enclosing (PnE), to iteratively solve two variants of the constrained clustering problems, constrained $k$-means clustering ($k$-CMeans) and constrained $k$-median clustering ($k$-CMedian).

Constrained Clustering LEMMA

Faster Balanced Clusterings in High Dimension

no code implementations • 4 Sep 2018 • Hu Ding

For the balanced $k$-center clustering, we provide a $4$-approximation algorithm that improves the existing approximation factors; for the balanced $k$-median and $k$-means clusterings, our algorithms yield constant and $(1+\epsilon)$-approximation factors with any $\epsilon>0$.

Constrained Clustering Vocal Bursts Intensity Prediction

k-Prototype Learning for 3D Rigid Structures

no code implementations • NeurIPS 2013 • Hu Ding, Ronald Berezney, Jinhui Xu

In this paper, we study the following new variant of prototype learning, called the $k$-prototype learning problem for 3D rigid structures: Given a set of 3D rigid structures, find a set of $k$ rigid structures so that each of them is a prototype for a cluster of the given rigid structures and the total cost (or dissimilarity) is minimized.

Clustering

Gauging Association Patterns of Chromosome Territories via Chromatic Median

no code implementations • CVPR 2013 • Hu Ding, Branislav Stojkovic, Ronald Berezney, Jinhui Xu

In this paper, we introduce a novel algorithmic tool for investigating association patterns of chromosome territories in a population of cells.
