Search Results for author: Yudong Chen

Found 71 papers, 13 papers with code

Prelimit Coupling and Steady-State Convergence of Constant-stepsize Nonsmooth Contractive SA

no code implementations 9 Apr 2024 Yixuan Zhang, Dongyan Huo, Yudong Chen, Qiaomin Xie

Motivated by Q-learning, we study nonsmooth contractive stochastic approximation (SA) with constant stepsize.

Q-Learning

Span-Based Optimal Sample Complexity for Weakly Communicating and General Average Reward MDPs

no code implementations 18 Mar 2024 Matthew Zurek, Yudong Chen

Our result is the first that is minimax optimal (up to log factors) in all parameters $S, A, H$ and $\epsilon$, improving on existing work that either assumes uniformly bounded mixing times for all policies or has suboptimal dependence on the parameters.

Entry-Specific Bounds for Low-Rank Matrix Completion under Highly Non-Uniform Sampling

no code implementations 29 Feb 2024 Xumei Xi, Christina Lee Yu, Yudong Chen

Our bounds characterize the hardness of estimating each entry as a function of the localized sampling probabilities.

Low-Rank Matrix Completion

Effectiveness of Constant Stepsize in Markovian LSA and Statistical Inference

no code implementations 18 Dec 2023 Dongyan Huo, Yudong Chen, Qiaomin Xie

Our procedure leverages the fast mixing property of constant-stepsize LSA for better covariance estimation and employs Richardson-Romberg (RR) extrapolation to reduce the bias induced by constant stepsize and Markovian data.

Span-Based Optimal Sample Complexity for Average Reward MDPs

no code implementations 22 Nov 2023 Matthew Zurek, Yudong Chen

Our result is the first that is minimax optimal (up to log factors) in all parameters $S, A, H$ and $\varepsilon$, improving on existing work that either assumes uniformly bounded mixing times for all policies or has suboptimal dependence on the parameters.

GTP-ViT: Efficient Vision Transformers via Graph-based Token Propagation

1 code implementation 6 Nov 2023 Xuwei Xu, Sen Wang, Yudong Chen, Yanping Zheng, Zhewei Wei, Jiajun Liu

Vision Transformers (ViTs) have revolutionized the field of computer vision, yet their deployments on resource-constrained devices remain challenging due to high computational demands.

Efficient ViTs

Minimally Modifying a Markov Game to Achieve Any Nash Equilibrium and Value

no code implementations 1 Nov 2023 Young Wu, Jeremy McMahan, Yiding Chen, Yudong Chen, Xiaojin Zhu, Qiaomin Xie

We study the game modification problem, where a benevolent game designer or a malevolent adversary modifies the reward function of a zero-sum Markov game so that a target deterministic or stochastic policy profile becomes the unique Markov perfect Nash equilibrium and has a value within a target range, in a way that minimizes the modification cost.

Understanding the Effects of Projectors in Knowledge Distillation

1 code implementation 26 Oct 2023 Yudong Chen, Sen Wang, Jiajun Liu, Xuwei Xu, Frank de Hoog, Brano Kusy, Zi Huang

Interestingly, we discovered that even if the student and the teacher have the same feature dimensions, adding a projector still helps to improve the distillation performance.

Knowledge Distillation

Plug n' Play: Channel Shuffle Module for Enhancing Tiny Vision Transformers

no code implementations 9 Oct 2023 Xuwei Xu, Sen Wang, Yudong Chen, Jiajun Liu

Inspired by the channel shuffle design in ShuffleNetV2 (Ma et al., 2018), our module expands the feature channels of a tiny ViT and partitions the channels into two groups: the Attended and Idle groups.

No Token Left Behind: Efficient Vision Transformer via Dynamic Token Idling

1 code implementation 9 Oct 2023 Xuwei Xu, Changlin Li, Yudong Chen, Xiaojun Chang, Jiajun Liu, Sen Wang

By allowing the idle tokens to be re-selected in the following layers, IdleViT mitigates the negative impact of improper pruning in the early stages.

Towards Joint Modeling of Dialogue Response and Speech Synthesis based on Large Language Model

1 code implementation 20 Sep 2023 Xinyu Zhou, Delong Chen, Yudong Chen

This paper explores the potential of constructing an AI spoken dialogue system that "thinks how to respond" and "thinks how to speak" simultaneously, which more closely aligns with the human speech production process compared to the current cascade pipeline of independent chatbot and Text-to-Speech (TTS) modules.

Chatbot Language Modelling +3

Clustering Without an Eigengap

no code implementations 29 Aug 2023 Matthew Zurek, Yudong Chen

Our gap-free clustering procedure also leads to improved algorithms for recursive clustering.

Clustering Graph Clustering +1

VISER: A Tractable Solution Concept for Games with Information Asymmetry

1 code implementation 18 Jul 2023 Jeremy McMahan, Young Wu, Yudong Chen, Xiaojin Zhu, Qiaomin Xie

Many real-world games suffer from information asymmetry: one player is only aware of their own payoffs while the other player has the full game information.

Multi-agent Reinforcement Learning

Stochastic Methods in Variational Inequalities: Ergodicity, Bias and Refinements

no code implementations 28 Jun 2023 Emmanouil-Vasileios Vlatakis-Gkaragkounis, Angeliki Giannou, Yudong Chen, Qiaomin Xie

Our work endeavors to elucidate and quantify the probabilistic structures intrinsic to these algorithms.

Learning to Stabilize Online Reinforcement Learning in Unbounded State Spaces

no code implementations 2 Jun 2023 Brahma S. Pavse, Matthew Zurek, Yudong Chen, Qiaomin Xie, Josiah P. Hanna

This latter objective is called stability and is especially important when the state space is unbounded, such that the states can be arbitrarily far from each other and the agent can drift far away from the desired states.

Attribute reinforcement-learning +1

Matrix Estimation for Offline Reinforcement Learning with Low-Rank Structure

no code implementations 24 May 2023 Xumei Xi, Christina Lee Yu, Yudong Chen

We consider offline Reinforcement Learning (RL), where the agent does not interact with the environment and must rely on offline data collected using a behavior policy.

Matrix Completion reinforcement-learning +1

Improved Feature Distillation via Projector Ensemble

1 code implementation 27 Oct 2022 Yudong Chen, Sen Wang, Jiajun Liu, Xuwei Xu, Frank de Hoog, Zi Huang

Motivated by the positive effect of the projector in feature distillation, we propose an ensemble of projectors to further improve the quality of student features.

Knowledge Distillation Multi-Task Learning

Bias and Extrapolation in Markovian Linear Stochastic Approximation with Constant Stepsizes

no code implementations 3 Oct 2022 Dongyan Huo, Yudong Chen, Qiaomin Xie

We consider Linear Stochastic Approximation (LSA) with a constant stepsize and Markovian data.

Asymmetric Transfer Hashing with Adaptive Bipartite Graph Learning

no code implementations 25 Jun 2022 Jianglin Lu, Jie Zhou, Yudong Chen, Witold Pedrycz, Kwok-Wai Hung

Specifically, ATH characterizes the domain distribution gap by the discrepancy between two asymmetric hash functions, and minimizes the feature gap with the help of a novel adaptive bipartite graph constructed on cross-domain data.

Graph Learning Retrieval +1

Overcoming the Long Horizon Barrier for Sample-Efficient Reinforcement Learning with Latent Low-Rank Structure

no code implementations 7 Jun 2022 Tyler Sam, Yudong Chen, Christina Lee Yu

The practicality of reinforcement learning algorithms has been limited due to poor scaling with respect to the problem size, as the sample complexity of learning an $\epsilon$-optimal policy is $\tilde{\Omega}\left(|S||A|H^3 / \epsilon^2\right)$ over worst case instances of an MDP with state space $S$, action space $A$, and horizon $H$.

Algorithmic Regularization in Model-free Overparametrized Asymmetric Matrix Factorization

no code implementations 6 Mar 2022 Liwei Jiang, Yudong Chen, Lijun Ding

We study the asymmetric matrix factorization problem under a natural nonconvex formulation with arbitrary overparametrization.

A Geometric Approach to $k$-means

no code implementations 13 Jan 2022 Jiazhen Hong, Wei Qian, Yudong Chen, Yuqian Zhang

This framework consists of alternating between the following two steps iteratively: (i) detect mis-specified clusters in a local solution and (ii) improve the current local solution by non-local operations.

Curriculum Disentangled Recommendation with Noisy Multi-feedback

1 code implementation NeurIPS 2021 Hong Chen, Yudong Chen, Xin Wang, Ruobing Xie, Rui Wang, Feng Xia, Wenwu Zhu

However, learning such disentangled representations from multi-feedback data is challenging because (i) multi-feedback is complex: there exist complex relations among different types of feedback (e.g., click, unclick, and dislike) as well as various user intentions, and (ii) multi-feedback is noisy: there exists noisy (useless) information in both features and labels, which may degrade recommendation performance.

Denoising Representation Learning

Exponential Bellman Equation and Improved Regret Bounds for Risk-Sensitive Reinforcement Learning

no code implementations NeurIPS 2021 Yingjie Fei, Zhuoran Yang, Yudong Chen, Zhaoran Wang

The exponential Bellman equation inspires us to develop a novel analysis of Bellman backup procedures in risk-sensitive RL algorithms, and further motivates the design of a novel exploration mechanism.

reinforcement-learning Reinforcement Learning (RL)

Rank Overspecified Robust Matrix Recovery: Subgradient Method and Exact Recovery

no code implementations NeurIPS 2021 Lijun Ding, Liwei Jiang, Yudong Chen, Qing Qu, Zhihui Zhu

We study the robust recovery of a low-rank matrix from sparsely and grossly corrupted Gaussian measurements, with no prior knowledge on the intrinsic rank.

Deep Self-Adaptive Hashing for Image Retrieval

no code implementations 16 Aug 2021 Qinghong Lin, Xiaojun Chen, Qin Zhang, Shangxuan Tian, Yudong Chen

Secondly, we measure the priorities of data pairs with PIC and assign adaptive weights to them, which relies on the assumption that more dissimilar data pairs contain more discriminative information for hash learning.

Deep Hashing Image Retrieval

Curriculum Meta-Learning for Next POI Recommendation

no code implementations KDD 2021 Yudong Chen, Xin Wang, Miao Fan, Jizhou Huang, Shengwen Yang, Wenwu Zhu

Next point-of-interest (POI) recommendation is an active research field in which a recently emerging scenario, next POI to search recommendation, has been deployed in many online map services such as Baidu Maps.

Meta-Learning

MetaDelta: A Meta-Learning System for Few-shot Image Classification

1 code implementation 22 Feb 2021 Yudong Chen, Chaoyu Guan, Zhikun Wei, Xin Wang, Wenwu Zhu

Meta-learning aims at learning quickly on novel tasks with limited data by transferring generic experience learned from previous tasks.

Classification Few-Shot Image Classification +2

Towards a Unified Quadrature Framework for Large-Scale Kernel Machines

no code implementations 3 Nov 2020 Fanghui Liu, Xiaolin Huang, Yudong Chen, Johan A. K. Suykens

In this paper, we develop a quadrature framework for large-scale kernel machines via a numerical integration representation.

Numerical Integration

A Survey on Curriculum Learning

no code implementations 25 Oct 2020 Xin Wang, Yudong Chen, Wenwu Zhu

We discuss works on curriculum learning within a general CL framework, elaborating on how to design a manually predefined curriculum or an automatic curriculum.

Active Learning BIG-bench Machine Learning +3

Local Minima Structures in Gaussian Mixture Models

no code implementations 28 Sep 2020 Yudong Chen, Dogyoon Song, Xumei Xi, Yuqian Zhang

As the objective function is non-convex, there can be multiple local minima that are not globally optimal, even for well-separated mixture models.

valid

Low-rank matrix recovery with non-quadratic loss: projected gradient method and regularity projection oracle

no code implementations 31 Aug 2020 Lijun Ding, Yuqian Zhang, Yudong Chen

Existing results for low-rank matrix recovery largely focus on quadratic loss, which enjoys favorable properties such as restricted strong convexity/smoothness (RSC/RSM) and well conditioning over all low rank matrices.

Matrix Completion

Risk-Sensitive Reinforcement Learning: Near-Optimal Risk-Sample Tradeoff in Regret

no code implementations NeurIPS 2020 Yingjie Fei, Zhuoran Yang, Yudong Chen, Zhaoran Wang, Qiaomin Xie

We study risk-sensitive reinforcement learning in episodic Markov decision processes with unknown transition kernels, where the goal is to optimize the total reward under the risk measure of exponential utility.

Q-Learning reinforcement-learning +1

Random Features for Kernel Approximation: A Survey on Algorithms, Theory, and Beyond

no code implementations 23 Apr 2020 Fanghui Liu, Xiaolin Huang, Yudong Chen, Johan A. K. Suykens

This survey may serve as a gentle introduction to this topic, and as a users' guide for practitioners interested in applying the representative algorithms and understanding theoretical results under various technical assumptions.

High-dimensional, multiscale online changepoint detection

no code implementations 7 Mar 2020 Yudong Chen, Tengyao Wang, Richard J. Samworth

We introduce a new method for high-dimensional, online changepoint detection in settings where a $p$-variate Gaussian data stream may undergo a change in mean.

Vocal Bursts Intensity Prediction

Learning Zero-Sum Simultaneous-Move Markov Games Using Function Approximation and Correlated Equilibrium

no code implementations 17 Feb 2020 Qiaomin Xie, Yudong Chen, Zhaoran Wang, Zhuoran Yang

In the offline setting, we control both players and aim to find the Nash Equilibrium by minimizing the duality gap.

Structures of Spurious Local Minima in $k$-means

no code implementations 16 Feb 2020 Wei Qian, Yuqian Zhang, Yudong Chen

Our theoretical results corroborate existing empirical observations and provide justification for several improved algorithms for $k$-means clustering.

Clustering

Random Fourier Features via Fast Surrogate Leverage Weighted Sampling

no code implementations 20 Nov 2019 Fanghui Liu, Xiaolin Huang, Yudong Chen, Jie Yang, Johan A. K. Suykens

In this paper, we propose a fast surrogate leverage weighted sampling strategy to generate refined random Fourier features for kernel approximation.

Factor Group-Sparse Regularization for Efficient Low-Rank Matrix Recovery

no code implementations NeurIPS 2019 Jicong Fan, Lijun Ding, Yudong Chen, Madeleine Udell

Compared to the max norm and the factored formulation of the nuclear norm, factor group-sparse regularizers are more efficient, accurate, and robust to the initial guess of rank.

Low-Rank Matrix Completion

Global Convergence of Least Squares EM for Demixing Two Log-Concave Densities

1 code implementation NeurIPS 2019 Wei Qian, Yuqian Zhang, Yudong Chen

This work studies the location estimation problem for a mixture of two rotation invariant log-concave densities.

Clustering Degree-Corrected Stochastic Block Model with Outliers

no code implementations 7 Jun 2019 Xin Qian, Yudong Chen, Andreea Minca

For the degree corrected stochastic block model in the presence of arbitrary or even adversarial outliers, we develop a convex-optimization-based clustering algorithm that includes a penalization term depending on the positive deviation of a node from the expected number of edges to other inliers.

Clustering Stochastic Block Model

Achieving the Bayes Error Rate in Synchronization and Block Models by SDP, Robustly

no code implementations 21 Apr 2019 Yingjie Fei, Yudong Chen

We study the statistical performance of semidefinite programming (SDP) relaxations for clustering under random graph models.

Clustering Stochastic Block Model +1

Convex Relaxation Methods for Community Detection

no code implementations 30 Sep 2018 Xiao-Dong Li, Yudong Chen, Jiaming Xu

We introduce some important theoretical techniques and results for establishing the consistency of convex community detection under various statistical models.

Community Detection

Defending Against Saddle Point Attack in Byzantine-Robust Distributed Learning

no code implementations 14 Jun 2018 Dong Yin, Yudong Chen, Kannan Ramchandran, Peter Bartlett

In this setting, the Byzantine machines may create fake local minima near a saddle point that is far away from any true local minimum, even when robust gradient estimators are used.

Tensor Robust Principal Component Analysis with A New Tensor Nuclear Norm

1 code implementation 10 Apr 2018 Canyi Lu, Jiashi Feng, Yudong Chen, Wei Liu, Zhouchen Lin, Shuicheng Yan

Equipped with the new tensor nuclear norm, we then solve the TRPCA problem by solving a convex program and provide the theoretical guarantee for the exact recovery.

Leave-one-out Approach for Matrix Completion: Primal and Dual Analysis

no code implementations 20 Mar 2018 Lijun Ding, Yudong Chen

In this paper, we introduce a powerful technique based on Leave-one-out analysis to the study of low-rank matrix completion problems.

Low-Rank Matrix Completion

Hidden Integrality and Semi-random Robustness of SDP Relaxation for Sub-Gaussian Mixture Model

no code implementations 17 Mar 2018 Yingjie Fei, Yudong Chen

The error of the integer program, and hence that of the SDP, is further shown to decay exponentially in the signal-to-noise ratio.

Clustering

Byzantine-Robust Distributed Learning: Towards Optimal Statistical Rates

1 code implementation ICML 2018 Dong Yin, Yudong Chen, Kannan Ramchandran, Peter Bartlett

In particular, these algorithms are shown to achieve order-optimal statistical error rates for strongly convex losses.

Harnessing Structures in Big Data via Guaranteed Low-Rank Matrix Estimation

no code implementations 23 Feb 2018 Yudong Chen, Yuejie Chi

Low-rank modeling plays a pivotal role in signal processing and machine learning, with applications ranging from collaborative filtering, video surveillance, and medical imaging to dimensionality reduction and adaptive filtering.

Collaborative Filtering Dimensionality Reduction

Tensor Robust Principal Component Analysis: Exact Recovery of Corrupted Low-Rank Tensors via Convex Optimization

no code implementations CVPR 2016 Canyi Lu, Jiashi Feng, Yudong Chen, Wei Liu, Zhouchen Lin, Shuicheng Yan

In this work, we prove that under certain suitable assumptions, we can recover both the low-rank and the sparse components exactly by simply solving a convex program whose objective is a weighted combination of the tensor nuclear norm and the $\ell_1$-norm, i.e., $\min_{\mathcal{L},\ \mathcal{E}} \ \|\mathcal{L}\|_* + \lambda\|\mathcal{E}\|_1, \ \text{s.t.} \ \dots$

Image Denoising

Exponential error rates of SDP for block models: Beyond Grothendieck's inequality

no code implementations 23 May 2017 Yingjie Fei, Yudong Chen

In this paper we consider the cluster estimation problem under the Stochastic Block Model.

Stochastic Block Model

Distributed Statistical Machine Learning in Adversarial Settings: Byzantine Gradient Descent

2 code implementations 16 May 2017 Yudong Chen, Lili Su, Jiaming Xu

The total computational complexity of our algorithm is $O((Nd/m) \log N)$ at each working machine and $O(md + kd \log^3 N)$ at the central server, and the total communication cost is $O(m d \log N)$.

BIG-bench Machine Learning Federated Learning

Fast Algorithms for Robust PCA via Gradient Descent

no code implementations NeurIPS 2016 Xinyang Yi, Dohyung Park, Yudong Chen, Constantine Caramanis

For the partially observed case, we show the complexity of our algorithm is no more than $\mathcal{O}(r^4d \log d \log(1/\varepsilon))$.

Matrix Completion

Convexified Modularity Maximization for Degree-corrected Stochastic Block Models

no code implementations 28 Dec 2015 Yudong Chen, Xiao-Dong Li, Jiaming Xu

We establish non-asymptotic theoretical guarantees for both approximate clustering and perfect clustering.

Clustering Community Detection +1

Fast low-rank estimation by projected gradient descent: General statistical and algorithmic guarantees

no code implementations 10 Sep 2015 Yudong Chen, Martin J. Wainwright

We provide a simple set of conditions under which projected gradient descent, when given a suitable initialization, converges geometrically to a statistically useful solution.

Clustering Graph Clustering +2

Clustering from Labels and Time-Varying Graphs

no code implementations NeurIPS 2014 Shiau Hong Lim, Yudong Chen, Huan Xu

Our theoretical results cover and subsume a wide range of existing graph clustering results including planted partition, weighted clustering and partially observed graphs.

Clustering Graph Clustering

A Convex Formulation for Mixed Regression with Two Components: Minimax Optimal Rates

no code implementations 25 Dec 2013 Yudong Chen, Xinyang Yi, Constantine Caramanis

We consider the mixed regression problem with two components, under adversarial and stochastic noise.

regression

Incoherence-Optimal Matrix Completion

no code implementations 1 Oct 2013 Yudong Chen

We show that it is not necessary to assume joint incoherence, which is a standard but unintuitive and restrictive condition that is imposed by previous studies.

Clustering Matrix Completion

Completing Any Low-rank Matrix, Provably

no code implementations 12 Jun 2013 Yudong Chen, Srinadh Bhojanapalli, Sujay Sanghavi, Rachel Ward

Matrix completion, i.e., the exact and provable recovery of a low-rank matrix from a small subset of its elements, is currently only known to be possible if the matrix satisfies a restrictive structural constraint, known as incoherence, on its row and column spaces.

Matrix Completion

Detecting Overlapping Temporal Community Structure in Time-Evolving Networks

no code implementations 28 Mar 2013 Yudong Chen, Vikas Kawadia, Rahul Urgaonkar

We present a principled approach for detecting overlapping temporal community structure in dynamic networks.

Combinatorial Optimization

Clustering Sparse Graphs

no code implementations NeurIPS 2012 Yudong Chen, Sujay Sanghavi, Huan Xu

We develop a new algorithm to cluster sparse unweighted graphs, i.e., to partition the nodes into disjoint clusters so that the density is higher within clusters and lower across clusters.

Clustering Stochastic Block Model

Improved Graph Clustering

no code implementations 11 Oct 2012 Yudong Chen, Sujay Sanghavi, Huan Xu

We show that, in the classic stochastic block model setting, it outperforms existing methods by polynomial factors when the cluster size is allowed to have general scalings.

Clustering Graph Clustering +1

Orthogonal Matching Pursuit with Noisy and Missing Data: Low and High Dimensional Results

no code implementations 5 Jun 2012 Yudong Chen, Constantine Caramanis

Many models for sparse regression assume that the covariates are known completely and without noise.

regression

Clustering Partially Observed Graphs via Convex Optimization

no code implementations 25 Apr 2011 Yudong Chen, Ali Jalali, Sujay Sanghavi, Huan Xu

This paper considers the problem of clustering a partially observed unweighted graph, i.e., one where for some node pairs we know there is an edge between them, for some others we know there is no edge, and for the remaining we do not know whether or not there is an edge.

Clustering Stochastic Block Model

Matrix completion with column manipulation: Near-optimal sample-robustness-rank tradeoffs

no code implementations 10 Feb 2011 Yudong Chen, Huan Xu, Constantine Caramanis, Sujay Sanghavi

Moreover, we show by an information-theoretic argument that our guarantees are nearly optimal in terms of the fraction of sampled entries on the authentic columns, the fraction of corrupted columns, and the rank of the underlying matrix.

Collaborative Filtering Matrix Completion
