Search Results for author: Zhihui Zhu

Found 49 papers, 23 papers with code

Guaranteed Nonconvex Factorization Approach for Tensor Train Recovery

no code implementations • 5 Jan 2024 • Zhen Qin, Michael B. Wakin, Zhihui Zhu

We first delve into the TT factorization problem and establish the local linear convergence of RGD.
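
For context, here is a minimal numpy sketch of what a TT factorization computes, via the classical TT-SVD procedure (sequential truncated SVDs). This is illustration only, not the Riemannian gradient descent (RGD) method analyzed in the paper.

```python
import numpy as np

def tt_svd(tensor, ranks):
    """Factor a d-way tensor into TT cores via sequential truncated SVDs
    (classical TT-SVD); cores[k] has shape (r_{k-1}, n_k, r_k)."""
    dims = tensor.shape
    d = len(dims)
    cores, r_prev, C = [], 1, tensor
    for k in range(d - 1):
        C = C.reshape(r_prev * dims[k], -1)
        U, s, Vt = np.linalg.svd(C, full_matrices=False)
        r = min(ranks[k], len(s))
        cores.append(U[:, :r].reshape(r_prev, dims[k], r))
        C = s[:r, None] * Vt[:r]       # carry the remainder to the next core
        r_prev = r
    cores.append(C.reshape(r_prev, dims[-1], 1))
    return cores
```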

OTOv3: Automatic Architecture-Agnostic Neural Network Training and Compression from Structured Pruning to Erasing Operators

1 code implementation • 15 Dec 2023 • Tianyi Chen, Tianyu Ding, Zhihui Zhu, Zeyu Chen, HsiangTao Wu, Ilya Zharkov, Luming Liang

Compressing a predefined deep neural network (DNN) into a compact sub-network with competitive performance is crucial for efficient machine learning.

Neural Architecture Search

The Efficiency Spectrum of Large Language Models: An Algorithmic Survey

1 code implementation • 1 Dec 2023 • Tianyu Ding, Tianyi Chen, Haidong Zhu, Jiachen Jiang, Yiqi Zhong, Jinxin Zhou, Guangzhi Wang, Zhihui Zhu, Ilya Zharkov, Luming Liang

The rapid growth of Large Language Models (LLMs) has been a driving force in transforming various domains, reshaping the artificial general intelligence landscape.

Model Compression

DREAM: Diffusion Rectification and Estimation-Adaptive Models

1 code implementation • 30 Nov 2023 • Jinxin Zhou, Tianyu Ding, Tianyi Chen, Jiachen Jiang, Ilya Zharkov, Zhihui Zhu, Luming Liang

We present DREAM, a novel training framework representing Diffusion Rectification and Estimation Adaptive Models, requiring minimal code changes (just three lines) yet significantly enhancing the alignment of training with sampling in diffusion models.
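
As a rough sketch of the rectification idea as we read the abstract: detach a first noise estimate from the model itself and blend it into the training target, so that training sees inputs closer to what sampling produces. The blending weight lam, its schedule, and the model signature model(x_t, t) are placeholder assumptions, not the paper's exact formulation.

```python
import torch

def dream_like_loss(model, x0, t, alpha_bar, lam=1.0):
    """Sketch of a rectified diffusion training step (epsilon-prediction).
    `lam` and the blending rule are illustrative placeholders."""
    eps = torch.randn_like(x0)
    a = alpha_bar[t].view(-1, 1, 1, 1)
    with torch.no_grad():                          # self-estimate, no gradient
        x_t = a.sqrt() * x0 + (1 - a).sqrt() * eps
        eps_hat = model(x_t, t)
    eps_adapt = eps + lam * (eps_hat - eps)        # rectified training noise
    x_t = a.sqrt() * x0 + (1 - a).sqrt() * eps_adapt
    return ((model(x_t, t) - eps_adapt) ** 2).mean()
```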

Image Super-Resolution

Convergence Analysis for Learning Orthonormal Deep Linear Neural Networks

no code implementations • 24 Nov 2023 • Zhen Qin, Xuwei Tan, Zhihui Zhu

Enforcing orthonormal or isometric property for the weight matrices has been shown to enhance the training of deep neural networks by mitigating gradient exploding/vanishing and increasing the robustness of the learned networks.
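
A common way to make this constraint concrete is to retract the weights back onto the Stiefel manifold (matrices with orthonormal columns) after each update; below is a minimal numpy sketch of that primitive, not the specific training scheme analyzed here.

```python
import numpy as np

def polar_retraction(W):
    """Nearest matrix with orthonormal columns (polar factor of W, m >= n)."""
    U, _, Vt = np.linalg.svd(W, full_matrices=False)
    return U @ Vt

# toy usage: a gradient step followed by retraction keeps W^T W = I
W = polar_retraction(np.random.randn(8, 4))
W = polar_retraction(W - 0.1 * np.random.randn(8, 4))   # placeholder gradient
assert np.allclose(W.T @ W, np.eye(4), atol=1e-8)
```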

Understanding Deep Representation Learning via Layerwise Feature Compression and Discrimination

1 code implementation • 6 Nov 2023 • Peng Wang, Xiao Li, Can Yaras, Zhihui Zhu, Laura Balzano, Wei Hu, Qing Qu

To the best of our knowledge, this is the first quantitative characterization of feature evolution in hierarchical representations of deep linear networks.

Feature Compression • Multi-class Classification • +2

Generalized Neural Collapse for a Large Number of Classes

no code implementations • 9 Oct 2023 • Jiachen Jiang, Jinxin Zhou, Peng Wang, Qing Qu, Dustin Mixon, Chong You, Zhihui Zhu

However, most of the existing empirical and theoretical studies in neural collapse focus on the case that the number of classes is small relative to the dimension of the feature space.

Face Recognition • Retrieval

The Law of Parsimony in Gradient Descent for Learning Deep Linear Networks

1 code implementation • 1 Jun 2023 • Can Yaras, Peng Wang, Wei Hu, Zhihui Zhu, Laura Balzano, Qing Qu

Second, it allows us to better understand deep representation learning by elucidating the linear progressive separation and concentration of representations from shallow to deep layers.

Representation Learning

OTOV2: Automatic, Generic, User-Friendly

1 code implementation • 13 Mar 2023 • Tianyi Chen, Luming Liang, Tianyu Ding, Zhihui Zhu, Ilya Zharkov

We propose the second generation of Only-Train-Once (OTOv2), which first automatically trains and compresses a general DNN only once from scratch to produce a more compact model with competitive performance without fine-tuning.

Model Compression

A Provable Splitting Approach for Symmetric Nonnegative Matrix Factorization

no code implementations • 25 Jan 2023 • Xiao Li, Zhihui Zhu, Qiuwei Li, Kai Liu

The symmetric Nonnegative Matrix Factorization (NMF), a special but important class of the general NMF, has found numerous applications in data analysis such as various clustering tasks.
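
The splitting in the title can be stated compactly: replace the symmetric factorization by an asymmetric one and penalize the gap between the two factors (for a sufficiently large penalty weight λ, this line of work shows the reformulation is faithful to the symmetric problem):

```latex
\min_{U \ge 0} \; \|X - U U^{\top}\|_F^2
\quad\longrightarrow\quad
\min_{U \ge 0,\, V \ge 0} \; \|X - U V^{\top}\|_F^2 + \lambda \,\|U - V\|_F^2
```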

Clustering • Image Clustering • +1

Principled and Efficient Transfer Learning of Deep Models via Neural Collapse

no code implementations • 23 Dec 2022 • Xiao Li, Sheng Liu, Jinxin Zhou, Xinyu Lu, Carlos Fernandez-Granda, Zhihui Zhu, Qing Qu

As model size continues to grow and access to labeled training data remains limited, transfer learning has become a popular approach in many scientific and engineering fields.

Data Augmentation • Self-Supervised Learning • +1

Revisiting Sparse Convolutional Model for Visual Recognition

1 code implementation • 24 Oct 2022 • Xili Dai, Mingyang Li, Pengyuan Zhai, Shengbang Tong, Xingjian Gao, Shao-Lun Huang, Zhihui Zhu, Chong You, Yi Ma

We show that such models have equally strong empirical performance on CIFAR-10, CIFAR-100, and ImageNet datasets when compared to conventional neural networks.

Image Classification

Are All Losses Created Equal: A Neural Collapse Perspective

no code implementations • 4 Oct 2022 • Jinxin Zhou, Chong You, Xiao Li, Kangning Liu, Sheng Liu, Qing Qu, Zhihui Zhu

We extend such results and show, through global solution and landscape analyses, that a broad family of loss functions, including the commonly used label smoothing (LS) and focal loss (FL), exhibits Neural Collapse.
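
For reference, the two losses named above, in single-example numpy form (standard definitions, independent of this paper's analysis):

```python
import numpy as np

def log_softmax(z):
    z = z - z.max()                       # numerically stable log-softmax
    return z - np.log(np.exp(z).sum())

def label_smoothing_loss(logits, y, alpha=0.1):
    """Cross-entropy against the smoothed target (1-alpha)*onehot + alpha/K."""
    K = logits.shape[0]
    target = np.full(K, alpha / K)
    target[y] += 1.0 - alpha
    return -(target * log_softmax(logits)).sum()

def focal_loss(logits, y, gamma=2.0):
    """Focal loss: scales CE by (1 - p_y)^gamma to down-weight easy examples."""
    logp = log_softmax(logits)
    p_y = np.exp(logp[y])
    return -((1.0 - p_y) ** gamma) * logp[y]
```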

A Validation Approach to Over-parameterized Matrix and Image Recovery

no code implementations • 21 Sep 2022 • Lijun Ding, Zhen Qin, Liwei Jiang, Jinxin Zhou, Zhihui Zhu

In this paper, we study the problem of recovering a low-rank matrix from a number of noisy random linear measurements.

Image Restoration

Neural Collapse with Normalized Features: A Geometric Analysis over the Riemannian Manifold

1 code implementation • 19 Sep 2022 • Can Yaras, Peng Wang, Zhihui Zhu, Laura Balzano, Qing Qu

When training overparameterized deep networks for classification tasks, it has been widely observed that the learned features exhibit a so-called "neural collapse" phenomenon.

Multi-class Classification • Representation Learning • +1

Sparsity-guided Network Design for Frame Interpolation

1 code implementation • 9 Sep 2022 • Tianyu Ding, Luming Liang, Zhihui Zhu, Tianyi Chen, Ilya Zharkov

As a result, we achieve a considerable performance gain with a quarter of the size of the original AdaCoF.

Error Analysis of Tensor-Train Cross Approximation

no code implementations • 9 Jul 2022 • Zhen Qin, Alexander Lidiak, Zhexuan Gong, Gongguo Tang, Michael B. Wakin, Zhihui Zhu

Tensor train decomposition is widely used in machine learning and quantum physics due to its concise representation of high-dimensional tensors, overcoming the curse of dimensionality.

On the Optimization Landscape of Neural Collapse under MSE Loss: Global Optimality with Unconstrained Features

no code implementations • 2 Mar 2022 • Jinxin Zhou, Xiao Li, Tianyu Ding, Chong You, Qing Qu, Zhihui Zhu

When training deep neural networks for classification tasks, an intriguing empirical phenomenon has been widely observed in the last-layer classifiers and features, where (i) the class means and the last-layer classifiers all collapse to the vertices of a Simplex Equiangular Tight Frame (ETF) up to scaling, and (ii) cross-example within-class variability of last-layer activations collapses to zero.
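
Concretely, a K-class Simplex ETF is, up to rotation and scaling, the matrix below; distinct columns meet at the maximal equal angle, with pairwise cosine -1/(K-1):

```latex
\mathbf{M} \;=\; \sqrt{\tfrac{K}{K-1}}\;\mathbf{P}\Bigl(\mathbf{I}_K - \tfrac{1}{K}\,\mathbf{1}_K\mathbf{1}_K^{\top}\Bigr),
\qquad \mathbf{P} \in \mathbb{R}^{d \times K},\;\; \mathbf{P}^{\top}\mathbf{P} = \mathbf{I}_K
```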

Robust Training under Label Noise by Over-parameterization

1 code implementation • 28 Feb 2022 • Sheng Liu, Zhihui Zhu, Qing Qu, Chong You

In this work, we propose a principled approach for robust training of over-parameterized deep networks in classification tasks where a proportion of training labels are corrupted.
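
One way to realize this idea, sketched here with a squared loss standing in for the actual training objective and all shapes and rates as placeholder assumptions: give every sample its own corruption variable s_i = u_i ⊙ u_i - v_i ⊙ v_i that can absorb a wrong label, and let gradient descent's implicit bias keep the learned corruption sparse.

```python
import numpy as np

rng = np.random.default_rng(0)
n, K = 512, 10                         # hypothetical sample/class counts
# one corruption variable per sample, s_i = u_i*u_i - v_i*v_i, started near 0
u = 1e-3 * rng.standard_normal((n, K))
v = 1e-3 * rng.standard_normal((n, K))

def sop_step(probs, onehot, idx, lr=1.0):
    """One gradient step on 0.5*||f(x_i) + s_i - y_i||^2 w.r.t. (u, v) only.
    The elementwise-squared parameterization of s_i, under gradient descent,
    implicitly biases the learned corruption toward sparsity."""
    r = probs + u[idx] * u[idx] - v[idx] * v[idx] - onehot   # residual
    u[idx] -= lr * 2.0 * r * u[idx]
    v[idx] += lr * 2.0 * r * v[idx]
    return r
```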

Learning with noisy labels

Rank Overspecified Robust Matrix Recovery: Subgradient Method and Exact Recovery

no code implementations • NeurIPS 2021 • Lijun Ding, Liwei Jiang, Yudong Chen, Qing Qu, Zhihui Zhu

We study the robust recovery of a low-rank matrix from sparsely and grossly corrupted Gaussian measurements, with no prior knowledge on the intrinsic rank.
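
A generic sketch of a subgradient method for this setting, with the factorization rank deliberately over-specified and a geometrically decaying, normalized step size (the paper's precise step-size rule and initialization may differ):

```python
import numpy as np

def robust_recovery_subgradient(A, y, n, r_over, iters=500, step0=0.1, q=0.99):
    """Subgradient method on the l1 loss  F -> ||A(F F^T) - y||_1.

    A: (m, n*n) measurement matrix acting on vec(X); y: (m,) measurements;
    r_over: over-specified rank (>= true rank)."""
    rng = np.random.default_rng(0)
    F = 1e-3 * rng.standard_normal((n, r_over))
    for k in range(iters):
        residual = A @ (F @ F.T).ravel() - y
        # subgradient of the l1 loss, pulled back through X = F F^T
        G = (A.T @ np.sign(residual)).reshape(n, n)
        grad_F = (G + G.T) @ F
        F -= step0 * (q ** k) * grad_F / max(np.linalg.norm(grad_F), 1e-12)
    return F @ F.T
```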

Only Train Once: A One-Shot Neural Network Training And Pruning Framework

1 code implementation • NeurIPS 2021 • Tianyi Chen, Bo Ji, Tianyu Ding, Biyi Fang, Guanyi Wang, Zhihui Zhu, Luming Liang, Yixin Shi, Sheng Yi, Xiao Tu

Structured pruning is a commonly used technique in deploying deep neural networks (DNNs) onto resource-constrained devices.

A Geometric Analysis of Neural Collapse with Unconstrained Features

1 code implementation • NeurIPS 2021 • Zhihui Zhu, Tianyu Ding, Jinxin Zhou, Xiao Li, Chong You, Jeremias Sulam, Qing Qu

In contrast to existing landscape analyses for deep neural networks, which are often disconnected from practice, our analysis of the simplified model not only explains what kind of features are learned in the last layer but also shows why they can be efficiently optimized in the simplified settings, matching the empirical observations in practical deep network architectures.

CDFI: Compression-Driven Network Design for Frame Interpolation

1 code implementation • CVPR 2021 • Tianyu Ding, Luming Liang, Zhihui Zhu, Ilya Zharkov

DNN-based frame interpolation, which generates the intermediate frames given two consecutive frames, typically relies on heavy model architectures with a huge number of features, preventing them from being deployed on systems with limited resources, e.g., mobile devices.

Ranked #1 on Video Frame Interpolation on Middlebury (LPIPS metric)

Video Frame Interpolation

Convolutional Normalization: Improving Deep Convolutional Network Robustness and Training

1 code implementation • NeurIPS 2021 • Sheng Liu, Xiao Li, Yuexiang Zhai, Chong You, Zhihui Zhu, Carlos Fernandez-Granda, Qing Qu

Furthermore, we show that our ConvNorm can reduce the layerwise spectral norm of the weight matrices and hence improve the Lipschitzness of the network, leading to easier training and improved robustness for deep ConvNets.

Generative Adversarial Network

A Half-Space Stochastic Projected Gradient Method for Group Sparsity Regularization

no code implementations • 1 Jan 2021 • Tianyi Chen, Guanyi Wang, Tianyu Ding, Bo Ji, Sheng Yi, Zhihui Zhu

Optimizing with group sparsity is significant in enhancing model interpretability in machine learning applications, e.g., feature selection, compressed sensing and model compression.
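
For background, the primitive underlying group-sparsity regularization is the proximal operator of the group l2 penalty, which zeroes whole groups at once; a minimal numpy sketch, separate from this paper's half-space projection scheme.

```python
import numpy as np

def prox_group_l2(x, groups, lam):
    """Proximal operator of lam * sum_g ||x_g||_2 (block soft-thresholding):
    shrinks each group's norm by lam, zeroing whole groups at once."""
    out = x.copy()
    for g in groups:                       # g: index array of one group
        norm = np.linalg.norm(x[g])
        out[g] = 0.0 if norm <= lam else (1.0 - lam / norm) * x[g]
    return out

# whole groups go to zero, which is what enables structured pruning
x = np.array([0.1, -0.2, 3.0, 4.0])
print(prox_group_l2(x, [np.array([0, 1]), np.array([2, 3])], lam=0.5))
```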

feature selection • Model Compression • +1

Robust Recovery via Implicit Bias of Discrepant Learning Rates for Double Over-parameterization

1 code implementation • NeurIPS 2020 • Chong You, Zhihui Zhu, Qing Qu, Yi Ma

This paper shows that with a double over-parameterization for both the low-rank matrix and the sparse corruption, gradient descent with discrepant learning rates provably recovers the underlying matrix even without prior knowledge of either the rank of the matrix or the sparsity of the corruption.
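
A hedged sketch of the construction: parameterize the low-rank part as U U^T with over-specified rank and the sparse corruption as g ⊙ g - h ⊙ h, then run plain gradient descent with different learning rates on the two blocks. The initialization scales and the ratio of the two rates below are placeholders; the paper prescribes how they must be chosen.

```python
import numpy as np

def double_overparam_gd(Y, r_over, iters=2000, lr_lowrank=0.05, lr_sparse=0.005):
    """Gradient descent on 0.5*||U U^T + g*g - h*h - Y||_F^2 with *discrepant*
    learning rates on the low-rank and sparse blocks (illustrative sketch)."""
    rng = np.random.default_rng(0)
    n = Y.shape[0]
    U = 1e-3 * rng.standard_normal((n, r_over))
    g = 1e-3 * np.ones_like(Y)
    h = 1e-3 * np.ones_like(Y)
    for _ in range(iters):
        R = U @ U.T + g * g - h * h - Y      # residual
        U -= lr_lowrank * (R + R.T) @ U
        g -= lr_sparse * 2.0 * R * g
        h += lr_sparse * 2.0 * R * h
    return U @ U.T, g * g - h * h            # low-rank and sparse estimates
```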

Recovery and Generalization in Over-Realized Dictionary Learning

no code implementations • 11 Jun 2020 • Jeremias Sulam, Chong You, Zhihui Zhu

We thoroughly demonstrate this observation in practice and provide an analysis of this phenomenon by tying recovery measures to generalization bounds.

Dictionary Learning • Generalization Bounds

Orthant Based Proximal Stochastic Gradient Method for $\ell_1$-Regularized Optimization

1 code implementation • 7 Apr 2020 • Tianyi Chen, Tianyu Ding, Bo Ji, Guanyi Wang, Jing Tian, Yixin Shi, Sheng Yi, Xiao Tu, Zhihui Zhu

Sparsity-inducing regularization problems are ubiquitous in machine learning applications, ranging from feature selection to model compression.

feature selection • Model Compression

Finding the Sparsest Vectors in a Subspace: Theory, Algorithms, and Applications

no code implementations • 20 Jan 2020 • Qing Qu, Zhihui Zhu, Xiao Li, Manolis C. Tsakiris, John Wright, René Vidal

The problem of finding the sparsest vector (direction) in a low dimensional subspace can be considered as a homogeneous variant of the sparse recovery problem, which finds applications in robust subspace recovery, dictionary learning, sparse blind deconvolution, and many other problems in signal processing and machine learning.

Dictionary Learning • Representation Learning

Analysis of the Optimization Landscapes for Overcomplete Representation Learning

no code implementations • 5 Dec 2019 • Qing Qu, Yuexiang Zhai, Xiao Li, Yuqian Zhang, Zhihui Zhu

In this work, we show these problems can be formulated as $\ell^4$-norm optimization problems with spherical constraint, and study the geometric properties of their nonconvex optimization landscapes.
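
To make the formulation concrete for a single direction q on the sphere, a simple power-type ascent iteration for the l4 objective looks as follows (an illustrative solver only; the paper's contribution is the landscape analysis, not this method):

```python
import numpy as np

def l4_max_on_sphere(Y, iters=100, seed=0):
    """Power-type iteration for  max_q ||Y^T q||_4^4  s.t. ||q||_2 = 1.
    Each step moves along the gradient direction Y (Y^T q)^3 and projects
    back to the unit sphere; the objective is convex, so each step ascends."""
    rng = np.random.default_rng(seed)
    q = rng.standard_normal(Y.shape[0])
    q /= np.linalg.norm(q)
    for _ in range(iters):
        q = Y @ (Y.T @ q) ** 3
        q /= np.linalg.norm(q)
    return q
```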

Representation Learning

Viewpoint-Aware Loss with Angular Regularization for Person Re-Identification

1 code implementation • 3 Dec 2019 • Zhihui Zhu, Xinyang Jiang, Feng Zheng, Xiaowei Guo, Feiyue Huang, Wei-Shi Zheng, Xing Sun

Instead of one subspace for each viewpoint, our method projects the feature from different viewpoints into a unified hypersphere and effectively models the feature distribution on both the identity-level and the viewpoint-level.

Ranked #5 on Person Re-Identification on Market-1501 (using extra training data)

Person Re-Identification

Distributed Low-rank Matrix Factorization With Exact Consensus

1 code implementation • NeurIPS 2019 • Zhihui Zhu, Qiuwei Li, Xinshuo Yang, Gongguo Tang, Michael B. Wakin

Low-rank matrix factorization is a problem of broad importance, owing to the ubiquity of low-rank models in machine learning contexts.

Weakly Convex Optimization over Stiefel Manifold Using Riemannian Subgradient-Type Methods

1 code implementation • 12 Nov 2019 • Xiao Li, Shixiang Chen, Zengde Deng, Qing Qu, Zhihui Zhu, Anthony Man-Cho So

To the best of our knowledge, these are the first convergence guarantees for using Riemannian subgradient-type methods to optimize a class of nonconvex nonsmooth functions over the Stiefel manifold.
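
The method class is easy to state: project a Euclidean subgradient onto the tangent space of the Stiefel manifold, take a step, and retract. A minimal numpy sketch of one such step, using the polar retraction:

```python
import numpy as np

def stiefel_subgradient_step(X, G, step):
    """One Riemannian subgradient-type step on St(n, p) for a point X with
    orthonormal columns and a Euclidean subgradient G (generic sketch)."""
    # tangent-space projection: G - X sym(X^T G)
    sym = 0.5 * (X.T @ G + G.T @ X)
    xi = G - X @ sym
    # polar retraction back onto the manifold
    U, _, Vt = np.linalg.svd(X - step * xi, full_matrices=False)
    return U @ Vt
```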

Dictionary Learning • Vocal Bursts Type Prediction

A Nonconvex Approach for Exact and Efficient Multichannel Sparse Blind Deconvolution

1 code implementation • NeurIPS 2019 • Qing Qu, Xiao Li, Zhihui Zhu

We study the multi-channel sparse blind deconvolution (MCS-BD) problem, whose task is to simultaneously recover a kernel $\mathbf a$ and multiple sparse inputs $\{\mathbf x_i\}_{i=1}^p$ from their circulant convolution $\mathbf y_i = \mathbf a \circledast \mathbf x_i $ ($i=1,\cdots, p$).
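
The measurement model is cheap to simulate, since circulant convolution diagonalizes under the FFT; a small numpy sketch of generating a toy MCS-BD instance:

```python
import numpy as np

def circulant_conv(a, x):
    """Circulant (cyclic) convolution y = a ⊛ x via the FFT."""
    return np.real(np.fft.ifft(np.fft.fft(a) * np.fft.fft(x)))

# toy instance of the data model: one kernel, p sparse channels
rng = np.random.default_rng(0)
n, p = 64, 8
a = rng.standard_normal(n)                                    # shared kernel
X = rng.standard_normal((p, n)) * (rng.random((p, n)) < 0.1)  # sparse inputs
Y = np.stack([circulant_conv(a, x) for x in X])               # observations
```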

Computational Efficiency

Provable Bregman-divergence based Methods for Nonconvex and Non-Lipschitz Problems

no code implementations • 22 Apr 2019 • Qiuwei Li, Zhihui Zhu, Gongguo Tang, Michael B. Wakin

Therefore, this work not only develops guaranteed optimization methods for non-Lipschitz smooth problems but also solves an open problem by establishing second-order convergence guarantees for these alternating minimization methods.

Dual Principal Component Pursuit: Probability Analysis and Efficient Algorithms

no code implementations • 24 Dec 2018 • Zhihui Zhu, Yifan Wang, Daniel P. Robinson, Daniel Q. Naiman, Rene Vidal, Manolis C. Tsakiris

However, its geometric analysis is based on quantities that are difficult to interpret and are not amenable to statistical analysis.

Dual Principal Component Pursuit: Improved Analysis and Efficient Algorithms

no code implementations • NeurIPS 2018 • Zhihui Zhu, Yifan Wang, Daniel Robinson, Daniel Naiman, Rene Vidal, Manolis Tsakiris

However, its geometric analysis is based on quantities that are difficult to interpret and are not amenable to statistical analysis.

Dropping Symmetry for Fast Symmetric Nonnegative Matrix Factorization

no code implementations • NeurIPS 2018 • Zhihui Zhu, Xiao Li, Kai Liu, Qiuwei Li

Symmetric nonnegative matrix factorization (NMF), a special but important class of the general NMF, is demonstrated to be useful for data analysis and in particular for various clustering tasks.

Clustering • Image Clustering

Global Optimality in Distributed Low-rank Matrix Factorization

no code implementations • 7 Nov 2018 • Zhihui Zhu, Qiuwei Li, Xinshuo Yang, Gongguo Tang, Michael B. Wakin

We study the convergence of a variant of distributed gradient descent (DGD) on a distributed low-rank matrix approximation problem wherein some optimization variables are used for consensus (as in classical DGD) and some optimization variables appear only locally at a single node in the network.

Nonconvex Robust Low-rank Matrix Recovery

no code implementations • 24 Sep 2018 • Xiao Li, Zhihui Zhu, Anthony Man-Cho So, Rene Vidal

In this paper we study the problem of recovering a low-rank matrix from a number of random linear measurements that are corrupted by outliers taking arbitrary values.

Information Theory

The Global Optimization Geometry of Shallow Linear Neural Networks

no code implementations • 13 May 2018 • Zhihui Zhu, Daniel Soudry, Yonina C. Eldar, Michael B. Wakin

We examine the squared error loss landscape of shallow linear neural networks.

Optimized Structured Sparse Sensing Matrices for Compressive Sensing

no code implementations • 19 Sep 2017 • Tao Hong, Xiao Li, Zhihui Zhu, Qiuwei Li

We consider designing a robust structured sparse sensing matrix consisting of a sparse matrix with a few non-zero entries per row and a dense base matrix for capturing signals efficiently. We design the robust structured sparse sensing matrix by minimizing the distance between the Gram matrix of the equivalent dictionary and a target Gram matrix with small mutual coherence.
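
A sketch of the design criterion as stated (evaluation only; the paper's solver and choice of target Gram are not reproduced here, and the identity target below is just the most incoherent choice):

```python
import numpy as np

def gram_design_objective(Phi, D, G_target):
    """|| (Phi D)^T (Phi D) - G_target ||_F^2 for a projection matrix Phi and
    dictionary D, after column-normalizing the equivalent dictionary."""
    E = Phi @ D
    E = E / np.linalg.norm(E, axis=0, keepdims=True)   # unit-norm columns
    G = E.T @ E
    return np.sum((G - G_target) ** 2)

# toy usage with random Phi, D and the identity as target Gram
m, n, L = 20, 64, 128
rng = np.random.default_rng(0)
Phi, D = rng.standard_normal((m, n)), rng.standard_normal((n, L))
print(gram_design_objective(Phi, D, np.eye(L)))
```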

Compressive Sensing • Image Compression

Geometry of Factored Nuclear Norm Regularization

no code implementations • 5 Apr 2017 • Qiuwei Li, Zhihui Zhu, Gongguo Tang

In spite of the nonconvexity of the factored formulation, we prove that when the convex loss function $f(X)$ is $(2r, 4r)$-restricted well-conditioned, each critical point of the factored problem either corresponds to the optimal solution $X^\star$ of the original convex optimization or is a strict saddle point where the Hessian matrix has a strictly negative eigenvalue.
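
For context, the factored formulation referred to here rests on the standard variational characterization of the nuclear norm, which turns the convex regularized problem into a program over the factors:

```latex
\|X\|_* \;=\; \min_{U V^{\top} = X} \tfrac{1}{2}\bigl(\|U\|_F^2 + \|V\|_F^2\bigr)
\quad\Longrightarrow\quad
\min_{U,\,V} \; f(U V^{\top}) + \tfrac{\lambda}{2}\bigl(\|U\|_F^2 + \|V\|_F^2\bigr)
```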

Online Learning Sensing Matrix and Sparsifying Dictionary Simultaneously for Compressive Sensing

1 code implementation • 4 Jan 2017 • Tao Hong, Zhihui Zhu

The simulation results on natural images demonstrate the effectiveness of the suggested online algorithm compared with the existing methods.

Compressive Sensing

An Efficient Method for Robust Projection Matrix Design

1 code implementation • 27 Sep 2016 • Tao Hong, Zhihui Zhu

Without requiring training data, we can efficiently design the robust projection matrix and apply it to most CS systems, such as a CS system for image processing with a conventional wavelet dictionary, for which the SRE matrix is generally not available.

Compressive Sensing
