Search Results for author: Chong You

Found 33 papers, 13 papers with code

HiRE: High Recall Approximate Top-$k$ Estimation for Efficient LLM Inference

no code implementations14 Feb 2024 Yashas Samaga B L, Varun Yerram, Chong You, Srinadh Bhojanapalli, Sanjiv Kumar, Prateek Jain, Praneeth Netrapalli

Autoregressive decoding with generative Large Language Models (LLMs) on accelerators (GPUs/TPUs) is often memory-bound, with most of the time spent on transferring model parameters from high-bandwidth memory (HBM) to cache.
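
A minimal sketch of the generic two-stage idea behind approximate top-$k$ estimation with high recall: score all outputs cheaply with a proxy, keep an enlarged shortlist, and rescore only the shortlist exactly. This is a general illustration, not necessarily the exact mechanism HiRE uses; `W_proxy` and `oversample` are hypothetical names.

```python
# Generic two-stage approximate top-k: cheap proxy scores -> shortlist -> exact rescoring.
import numpy as np

def approx_topk(x, W, W_proxy, k, oversample=4):
    """x: (d,), W: (d, n) exact weights, W_proxy: (d, n) cheap proxy (e.g. low-rank/quantized)."""
    approx_scores = x @ W_proxy                       # cheap pass over all n outputs
    m = min(oversample * k, approx_scores.shape[0] - 1)
    shortlist = np.argpartition(-approx_scores, m)[:m]  # keep an enlarged candidate set
    exact_scores = x @ W[:, shortlist]                # exact pass only on the shortlist
    return shortlist[np.argsort(-exact_scores)[:k]]   # final top-k indices
```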

It's an Alignment, Not a Trade-off: Revisiting Bias and Variance in Deep Models

no code implementations13 Oct 2023 Lin Chen, Michal Lukasik, Wittawat Jitkrittum, Chong You, Sanjiv Kumar

Classical wisdom in machine learning holds that the generalization error can be decomposed into bias and variance, and these two terms exhibit a trade-off.
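
For reference, the classical decomposition referred to above, in its standard textbook form for squared error (with target function $f$, estimator $\hat{f}_{\mathcal{D}}$ trained on a random dataset $\mathcal{D}$, and label-noise variance $\sigma^2$):

$$\mathbb{E}\big[(y - \hat{f}_{\mathcal{D}}(x))^2\big] = \underbrace{\big(\mathbb{E}_{\mathcal{D}}[\hat{f}_{\mathcal{D}}(x)] - f(x)\big)^2}_{\text{bias}^2} + \underbrace{\mathbb{E}_{\mathcal{D}}\big[\big(\hat{f}_{\mathcal{D}}(x) - \mathbb{E}_{\mathcal{D}}[\hat{f}_{\mathcal{D}}(x)]\big)^2\big]}_{\text{variance}} + \underbrace{\sigma^2}_{\text{noise}}$$

The classical trade-off is the claim that shrinking one of the first two terms tends to inflate the other.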

Generalized Neural Collapse for a Large Number of Classes

no code implementations9 Oct 2023 Jiachen Jiang, Jinxin Zhou, Peng Wang, Qing Qu, Dustin Mixon, Chong You, Zhihui Zhu

However, most existing empirical and theoretical studies of neural collapse focus on the case where the number of classes is small relative to the dimension of the feature space.

Face Recognition, Retrieval

Functional Interpolation for Relative Positions Improves Long Context Transformers

no code implementations6 Oct 2023 Shanda Li, Chong You, Guru Guruganesh, Joshua Ainslie, Santiago Ontanon, Manzil Zaheer, Sumit Sanghai, Yiming Yang, Sanjiv Kumar, Srinadh Bhojanapalli

Preventing the performance decay of Transformers on inputs longer than those used for training has been an important challenge in extending the context length of these models.

Language Modelling, Position

Revisiting Sparse Convolutional Model for Visual Recognition

1 code implementation24 Oct 2022 Xili Dai, Mingyang Li, Pengyuan Zhai, Shengbang Tong, Xingjian Gao, Shao-Lun Huang, Zhihui Zhu, Chong You, Yi Ma

We show that such models have equally strong empirical performance on CIFAR-10, CIFAR-100, and ImageNet datasets when compared to conventional neural networks.

Image Classification

The Lazy Neuron Phenomenon: On Emergence of Activation Sparsity in Transformers

no code implementations12 Oct 2022 Zonglin Li, Chong You, Srinadh Bhojanapalli, Daliang Li, Ankit Singh Rawat, Sashank J. Reddi, Ke Ye, Felix Chern, Felix Yu, Ruiqi Guo, Sanjiv Kumar

This paper studies the curious phenomenon that the activation maps of machine learning models with Transformer architectures are sparse.
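
A minimal sketch of the measurement behind the claim above: count the fraction of exactly-zero entries after the ReLU in a Transformer-style MLP block. The toy block and random input below are stand-ins for a real pretrained Transformer layer, where the emergent sparsity is observed.

```python
# Measure activation sparsity (fraction of zeros after the nonlinearity) in an MLP block.
import torch
import torch.nn as nn

d_model, d_ff = 256, 1024
mlp = nn.Sequential(nn.Linear(d_model, d_ff), nn.ReLU(), nn.Linear(d_ff, d_model))

x = torch.randn(32, 128, d_model)                 # (batch, tokens, d_model)
with torch.no_grad():
    hidden = mlp[1](mlp[0](x))                    # post-ReLU hidden activations
print("fraction of zero activations:", (hidden == 0).float().mean().item())
```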

Are All Losses Created Equal: A Neural Collapse Perspective

no code implementations4 Oct 2022 Jinxin Zhou, Chong You, Xiao Li, Kangning Liu, Sheng Liu, Qing Qu, Zhihui Zhu

We extend such results and show through global solution and landscape analyses that a broad family of loss functions, including the commonly used label smoothing (LS) and focal loss (FL), exhibits Neural Collapse.
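
For reference, the two losses named above in their standard forms (the smoothing rate and focal exponent below are illustrative defaults, not the paper's settings):

```python
# Standard formulations: label smoothing mixes the one-hot target with a uniform
# distribution; focal loss down-weights well-classified examples by (1 - p_t)^gamma.
import torch
import torch.nn.functional as F

def label_smoothing_ce(logits, targets, eps=0.1):
    log_p = F.log_softmax(logits, dim=-1)
    loss = -((1 - eps) * log_p.gather(-1, targets.unsqueeze(-1)).squeeze(-1)
             + eps * log_p.mean(dim=-1))
    return loss.mean()

def focal_loss(logits, targets, gamma=2.0):
    log_p = F.log_softmax(logits, dim=-1)
    log_pt = log_p.gather(-1, targets.unsqueeze(-1)).squeeze(-1)
    pt = log_pt.exp()
    return (-(1 - pt) ** gamma * log_pt).mean()
```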

Teacher Guided Training: An Efficient Framework for Knowledge Transfer

no code implementations14 Aug 2022 Manzil Zaheer, Ankit Singh Rawat, Seungyeon Kim, Chong You, Himanshu Jain, Andreas Veit, Rob Fergus, Sanjiv Kumar

In this paper, we propose the teacher-guided training (TGT) framework for training a high-quality compact model that leverages the knowledge acquired by pretrained generative models, while obviating the need to go through a large volume of data.

Generalization Bounds, Image Classification +4

On the Optimization Landscape of Neural Collapse under MSE Loss: Global Optimality with Unconstrained Features

no code implementations2 Mar 2022 Jinxin Zhou, Xiao Li, Tianyu Ding, Chong You, Qing Qu, Zhihui Zhu

When training deep neural networks for classification tasks, an intriguing empirical phenomenon has been widely observed in the last-layer classifiers and features, where (i) the class means and the last-layer classifiers all collapse to the vertices of a Simplex Equiangular Tight Frame (ETF) up to scaling, and (ii) cross-example within-class variability of last-layer activations collapses to zero.
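
A small numerical illustration of the Simplex ETF structure in (i): $K$ unit vectors whose pairwise inner products all equal $-1/(K-1)$, obtained by the standard construction that projects the canonical basis vectors away from their mean.

```python
# Construct a K-class simplex ETF and verify its defining properties.
import numpy as np

K = 5                                                  # number of classes (illustrative)
M = np.sqrt(K / (K - 1)) * (np.eye(K) - np.ones((K, K)) / K)   # columns = ETF vectors

gram = M.T @ M
print(np.allclose(np.diag(gram), 1.0))                 # unit norm: True
off_diag = gram[~np.eye(K, dtype=bool)]
print(np.allclose(off_diag, -1.0 / (K - 1)))           # equiangular: True
# note: the K vectors span only a (K-1)-dimensional subspace of R^K
```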

Robust Training under Label Noise by Over-parameterization

1 code implementation28 Feb 2022 Sheng Liu, Zhihui Zhu, Qing Qu, Chong You

In this work, we propose a principled approach for robust training of over-parameterized deep networks in classification tasks where a proportion of training labels are corrupted.

Learning with noisy labels

Learning a Self-Expressive Network for Subspace Clustering

1 code implementation CVPR 2021 Shangzhi Zhang, Chong You, René Vidal, Chun-Guang Li

We show that our SENet can not only learn the self-expressive coefficients with desired properties on the training data, but also handle out-of-sample data.

Clustering

ReduNet: A White-box Deep Network from the Principle of Maximizing Rate Reduction

2 code implementations21 May 2021 Kwan Ho Ryan Chan, Yaodong Yu, Chong You, Haozhi Qi, John Wright, Yi Ma

This work attempts to provide a plausible theoretical framework for interpreting modern deep (convolutional) networks from the principles of data compression and discriminative representation.

Data Compression

A Geometric Analysis of Neural Collapse with Unconstrained Features

1 code implementation NeurIPS 2021 Zhihui Zhu, Tianyu Ding, Jinxin Zhou, Xiao Li, Chong You, Jeremias Sulam, Qing Qu

In contrast to existing landscape analyses for deep neural networks, which are often disconnected from practice, our analysis of the simplified model not only explains what kind of features are learned in the last layer, but also shows why they can be efficiently optimized in the simplified settings, matching the empirical observations in practical deep network architectures.

Convolutional Normalization: Improving Deep Convolutional Network Robustness and Training

1 code implementation NeurIPS 2021 Sheng Liu, Xiao Li, Yuexiang Zhai, Chong You, Zhihui Zhu, Carlos Fernandez-Granda, Qing Qu

Furthermore, we show that our ConvNorm can reduce the layerwise spectral norm of the weight matrices and hence improve the Lipschitzness of the network, leading to easier training and improved robustness for deep ConvNets.

Generative Adversarial Network
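
One common way to estimate the layerwise spectral norm mentioned above is power iteration on the convolution kernel reshaped to a matrix; this is a sketch of that estimate (the reshaping is a standard approximation, since the exact operator norm of a convolution also depends on the input spatial size), not ConvNorm itself.

```python
# Estimate the largest singular value of a conv layer's reshaped weight by power iteration.
import torch

def spectral_norm_estimate(weight, n_iters=50):
    W = weight.reshape(weight.size(0), -1)             # (C_out, C_in * kH * kW)
    v = torch.randn(W.size(1))
    for _ in range(n_iters):                           # power iteration on W^T W
        u = W @ v
        u = u / (u.norm() + 1e-12)
        v = W.t() @ u
        v = v / (v.norm() + 1e-12)
    return (W @ v).norm().item()

conv = torch.nn.Conv2d(64, 128, kernel_size=3)
print(spectral_norm_estimate(conv.weight.detach()))
```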

Incremental Learning via Rate Reduction

no code implementations CVPR 2021 Ziyang Wu, Christina Baek, Chong You, Yi Ma

Current deep learning architectures suffer from catastrophic forgetting, a failure to retain knowledge of previously learned classes when incrementally trained on new classes.

Incremental Learning

Deep Networks from the Principle of Rate Reduction

3 code implementations27 Oct 2020 Kwan Ho Ryan Chan, Yaodong Yu, Chong You, Haozhi Qi, John Wright, Yi Ma

The layered architectures, linear and nonlinear operators, and even parameters of the network are all explicitly constructed layer-by-layer in a forward propagation fashion by emulating the gradient scheme.
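
A conceptual sketch of the idea described above: each "layer" applies one projected gradient-ascent step on a coding-rate objective, so its operator is computed in closed form during the forward construction rather than learned by back-propagation. This omits the class-wise compression terms and other details of the actual construction; eps and eta are illustrative.

```python
# One "layer" = one gradient-ascent step on R(Z) = 0.5 * logdet(I + d/(n eps^2) Z Z^T).
import numpy as np

def expansion_operator(Z, eps=0.5):
    # gradient of R(Z) w.r.t. Z is E @ Z with E = alpha * (I + alpha Z Z^T)^{-1}
    d, n = Z.shape
    alpha = d / (n * eps ** 2)
    return alpha * np.linalg.inv(np.eye(d) + alpha * Z @ Z.T)

def rate_reduction_layer(Z, eta=0.5, eps=0.5):
    Z = Z + eta * expansion_operator(Z, eps) @ Z              # gradient-ascent step
    return Z / np.linalg.norm(Z, axis=0, keepdims=True)       # project features to the sphere
```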

A Critique of Self-Expressive Deep Subspace Clustering

no code implementations ICLR 2021 Benjamin D. Haeffele, Chong You, René Vidal

To extend this approach to data supported on a union of non-linear manifolds, numerous studies have proposed learning an embedding of the original data with a neural network that is regularized by a self-expressive loss in the embedded space, so as to encourage a union-of-linear-subspaces prior on the embedded data.

Clustering

Deep Isometric Learning for Visual Recognition

1 code implementation ICML 2020 Haozhi Qi, Chong You, Xiaolong Wang, Yi Ma, Jitendra Malik

Initialization, normalization, and skip connections are believed to be three indispensable techniques for training very deep convolutional neural networks and obtaining state-of-the-art performance.

Robust Recovery via Implicit Bias of Discrepant Learning Rates for Double Over-parameterization

1 code implementation NeurIPS 2020 Chong You, Zhihui Zhu, Qing Qu, Yi Ma

This paper shows that with a double over-parameterization for both the low-rank matrix and the sparse corruption, gradient descent with discrepant learning rates provably recovers the underlying matrix even without prior knowledge of either the rank of the matrix or the sparsity of the corruption.
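
A rough sketch of the recipe as described in the excerpt: over-parameterize the low-rank part as $UV^\top$, write the sparse corruption as a difference of elementwise squares, and run plain gradient descent with a different step size on each block. Shapes, initialization scale, and step sizes below are illustrative only, not the paper's settings.

```python
# Double over-parameterization with discrepant learning rates (conceptual sketch, PyTorch).
import torch

n = 50
Y = torch.randn(n, n)                          # observed matrix = low-rank + sparse corruption

U = 1e-3 * torch.randn(n, n, requires_grad=True)
V = 1e-3 * torch.randn(n, n, requires_grad=True)
g = 1e-3 * torch.ones(n, n, requires_grad=True)
h = 1e-3 * torch.ones(n, n, requires_grad=True)

lr_lowrank, lr_sparse = 1e-2, 1e-1             # the ratio of the two rates is the tuning knob
for _ in range(1000):
    residual = Y - U @ V.T - (g * g - h * h)   # low-rank part U V^T, sparse part g*g - h*h
    loss = (residual ** 2).sum()
    loss.backward()
    with torch.no_grad():
        U -= lr_lowrank * U.grad; V -= lr_lowrank * V.grad
        g -= lr_sparse * g.grad;  h -= lr_sparse * h.grad
        for p in (U, V, g, h):
            p.grad.zero_()
```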

Learning Diverse and Discriminative Representations via the Principle of Maximal Coding Rate Reduction

2 code implementations NeurIPS 2020 Yaodong Yu, Kwan Ho Ryan Chan, Chong You, Chaobing Song, Yi Ma

To learn intrinsic low-dimensional structures from high-dimensional data that most discriminate between classes, we propose the principle of Maximal Coding Rate Reduction ($\text{MCR}^2$), an information-theoretic measure that maximizes the coding rate difference between the whole dataset and the sum of each individual class.

Clustering, Contrastive Learning +1
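
A minimal NumPy sketch of the $\text{MCR}^2$ objective described above: the coding rate of the whole feature set minus the weighted sum of per-class coding rates, where $\epsilon$ is the allowed distortion. The value of $\epsilon$ and the feature dimensions are illustrative.

```python
# MCR^2: maximize R(Z) - sum_c (n_c / n) * R(Z_c), with R the coding rate of a feature set.
import numpy as np

def coding_rate(Z, eps):
    # R(Z) = 0.5 * logdet(I + d / (n * eps^2) * Z Z^T), with Z of shape (d, n)
    d, n = Z.shape
    return 0.5 * np.linalg.slogdet(np.eye(d) + (d / (n * eps ** 2)) * Z @ Z.T)[1]

def mcr2(Z, labels, eps=0.5):
    d, n = Z.shape
    expand = coding_rate(Z, eps)
    compress = 0.0
    for c in np.unique(labels):
        Zc = Z[:, labels == c]
        compress += (Zc.shape[1] / n) * coding_rate(Zc, eps)
    return expand - compress          # large when the whole set is spread out but each class is compact
```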

Recovery and Generalization in Over-Realized Dictionary Learning

no code implementations11 Jun 2020 Jeremias Sulam, Chong You, Zhihui Zhu

We thoroughly demonstrate this observation in practice and provide an analysis of this phenomenon by tying recovery measures to generalization bounds.

Dictionary Learning, Generalization Bounds

Self-Representation Based Unsupervised Exemplar Selection in a Union of Subspaces

no code implementations7 Jun 2020 Chong You, Chi Li, Daniel P. Robinson, Rene Vidal

When the dataset is drawn from a union of independent subspaces, our method is able to select sufficiently many representatives from each subspace.

Clustering

Is an Affine Constraint Needed for Affine Subspace Clustering?

no code implementations ICCV 2019 Chong You, Chun-Guang Li, Daniel P. Robinson, Rene Vidal

Specifically, our analysis provides conditions that guarantee the correctness of affine subspace clustering methods both with and without the affine constraint, and shows that these conditions are satisfied for high-dimensional data.

Clustering, Face Clustering +1

Stochastic Sparse Subspace Clustering

no code implementations CVPR 2020 Ying Chen, Chun-Guang Li, Chong You

State-of-the-art subspace clustering methods are based on the self-expressive model, which represents each data point as a linear combination of other data points.

Clustering
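
A generic sketch of the self-expressive model described above (not the stochastic variant this paper proposes): each point is written as a sparse linear combination of the other points, and the magnitudes of the coefficients define an affinity for spectral clustering. The Lasso penalty below is illustrative.

```python
# Self-expressive coefficients via l1-regularized regression, then spectral clustering.
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.cluster import SpectralClustering

def self_expressive_affinity(X, alpha=0.01):
    """X: (n_samples, n_features). Returns an (n, n) symmetric affinity matrix."""
    n = X.shape[0]
    C = np.zeros((n, n))
    for j in range(n):
        others = np.delete(np.arange(n), j)            # exclude x_j itself (c_jj = 0)
        lasso = Lasso(alpha=alpha, fit_intercept=False, max_iter=5000)
        lasso.fit(X[others].T, X[j])                   # x_j ~ sum_i c_ij * x_i
        C[others, j] = lasso.coef_
    return np.abs(C) + np.abs(C).T

# X, n_clusters = ...                                  # data and number of subspaces
# A = self_expressive_affinity(X)
# labels = SpectralClustering(n_clusters, affinity="precomputed").fit_predict(A)
```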

Rethinking Bias-Variance Trade-off for Generalization of Neural Networks

1 code implementation ICML 2020 Zitong Yang, Yaodong Yu, Chong You, Jacob Steinhardt, Yi Ma

We provide a simple explanation for this by measuring the bias and variance of neural networks: while the bias is monotonically decreasing as in the classical theory, the variance is unimodal or bell-shaped: it increases then decreases with the width of the network.
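
A minimal sketch in the spirit of the measurement described above: train several models of the same width on independent splits of the training data and estimate the variance of their predictions on held-out points; sweeping the width then traces out the unimodal variance curve. `make_model` and `train` are hypothetical placeholders for an actual architecture and training loop, and this simple estimator stands in for the paper's exact one.

```python
# Estimate prediction variance from independently trained copies of the same model.
import numpy as np

def estimate_variance(make_model, train, splits, X_test):
    """splits: list of (X_train, y_train) drawn independently from the same distribution."""
    preds = []
    for X_tr, y_tr in splits:
        model = train(make_model(), X_tr, y_tr)              # independently trained copy
        preds.append(model.predict_proba(X_test))            # (n_test, n_classes)
    preds = np.stack(preds)                                   # (n_splits, n_test, n_classes)
    # average squared deviation of each run's prediction from the mean prediction
    return ((preds - preds.mean(axis=0, keepdims=True)) ** 2).sum(axis=-1).mean()
```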

Basis Pursuit and Orthogonal Matching Pursuit for Subspace-preserving Recovery: Theoretical Analysis

no code implementations30 Dec 2019 Daniel P. Robinson, Rene Vidal, Chong You

The goal is to have the representation $c$ correctly identify the subspace, i.e., the nonzero entries of $c$ should correspond to columns of $A$ that are in the subspace $\mathcal{S}_0$.
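
For reference, the two programs compared in the title, in the notation of the excerpt above: basis pursuit solves

$$\min_{c} \|c\|_1 \quad \text{s.t.} \quad x = Ac,$$

while orthogonal matching pursuit builds the support of $c$ greedily, at each step adding the column of $A$ most correlated with the current residual and refitting $c$ by least squares on the selected columns. Subspace-preserving recovery asks that, either way, the selected columns all lie in $\mathcal{S}_0$.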

Self-Supervised Convolutional Subspace Clustering Network

no code implementations CVPR 2019 Junjian Zhang, Chun-Guang Li, Chong You, Xianbiao Qi, Honggang Zhang, Jun Guo, Zhouchen Lin

However, the applicability of subspace clustering has been limited because practical visual data in raw form do not necessarily lie in such linear subspaces.

Clustering, Image Clustering

Scalable Exemplar-based Subspace Clustering on Class-Imbalanced Data

no code implementations ECCV 2018 Chong You, Chi Li, Daniel P. Robinson, Rene Vidal

Our experiments demonstrate that the proposed method outperforms state-of-the-art subspace clustering methods in two large-scale image datasets that are imbalanced.

Clustering, Image Classification

On Geometric Analysis of Affine Sparse Subspace Clustering

no code implementations17 Aug 2018 Chun-Guang Li, Chong You, René Vidal

In this paper, we develop a novel geometric analysis for a variant of SSC, named affine SSC (ASSC), for the problem of clustering data from a union of affine subspaces.

Clustering

Provable Self-Representation Based Outlier Detection in a Union of Subspaces

no code implementations CVPR 2017 Chong You, Daniel P. Robinson, René Vidal

While outlier detection methods based on robust statistics have existed for decades, only recently have methods based on sparse and low-rank representation been developed along with guarantees of correct outlier detection when the inliers lie in one or more low-dimensional subspaces.

Outlier Detection

Structured Sparse Subspace Clustering: A Joint Affinity Learning and Subspace Clustering Framework

no code implementations17 Oct 2016 Chun-Guang Li, Chong You, René Vidal

In this paper, we propose a joint optimization framework, Structured Sparse Subspace Clustering (S$^3$C), for learning both the affinity and the segmentation.

Clustering, Motion Segmentation +1

Oracle Based Active Set Algorithm for Scalable Elastic Net Subspace Clustering

1 code implementation CVPR 2016 Chong You, Chun-Guang Li, Daniel P. Robinson, Rene Vidal

Our geometric analysis also provides a theoretical justification and a geometric interpretation for the balance between the connectedness (due to $\ell_2$ regularization) and subspace-preserving (due to $\ell_1$ regularization) properties for elastic net subspace clustering.

Ranked #7 on Image Clustering on COIL-100 (Accuracy metric)

Clustering, Image Clustering

Scalable Sparse Subspace Clustering by Orthogonal Matching Pursuit

2 code implementations CVPR 2016 Chong You, Daniel P. Robinson, Rene Vidal

Subspace clustering methods based on $\ell_1$, $\ell_2$ or nuclear norm regularization have become very popular due to their simplicity, theoretical guarantees and empirical success.

Clustering, Face Clustering
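
A sketch of OMP-based self-expression as named in the title: each point is approximated by at most $k$ greedily chosen other points, which scales better than solving a full $\ell_1$ program per point. The sparsity level $k$ and the use of scikit-learn's `OrthogonalMatchingPursuit` are illustrative choices, not the paper's exact implementation.

```python
# Build sparse self-expressive coefficients with OMP; |C| + |C|^T can feed spectral clustering.
import numpy as np
from sklearn.linear_model import OrthogonalMatchingPursuit

def ssc_omp_coefficients(X, k=5):
    """X: (n_samples, n_features). Returns an (n, n) sparse coefficient matrix C."""
    n = X.shape[0]
    C = np.zeros((n, n))
    for j in range(n):
        others = np.delete(np.arange(n), j)            # never use x_j to represent itself
        omp = OrthogonalMatchingPursuit(n_nonzero_coefs=k, fit_intercept=False)
        omp.fit(X[others].T, X[j])                     # greedy selection of at most k neighbors
        C[others, j] = omp.coef_
    return C
```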
