Search Results for author: Weijie J. Su

Found 59 papers, 28 papers with code

Evaluating the Unseen Capabilities: How Many Theorems Do LLMs Know?

no code implementations • 1 Jun 2025 • Xiang Li, Jiayi Xin, Qi Long, Weijie J. Su

Accurate evaluation of large language models (LLMs) is crucial for understanding their capabilities and guiding their development.

Diversity • Information Retrieval

Fundamental Limits of Game-Theoretic LLM Alignment: Smith Consistency and Preference Matching

no code implementations • 27 May 2025 • Zhekun Shi, Kaizhao Liu, Qi Long, Weijie J. Su, Jiancong Xiao

However, using raw preference as the payoff in the game severely limits the potential of the game-theoretic LLM alignment framework.

Diversity

Statistical Impossibility and Possibility of Aligning LLMs with Human Preferences: From Condorcet Paradox to Nash Equilibrium

1 code implementation • 14 Mar 2025 • Kaizhao Liu, Qi Long, Zhekun Shi, Weijie J. Su, Jiancong Xiao

As a blessing, we prove that this condition holds with high probability under the probabilistic preference model, thereby highlighting the statistical possibility of preserving minority preferences without explicit regularization in aligning LLMs.
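
As background on the impossibility side, the textbook Condorcet cycle (the classic example, not necessarily the paper's exact construction) shows why "align to the majority preference" can be ill-posed: three individuals with preferences $a \succ b \succ c$, $b \succ c \succ a$, and $c \succ a \succ b$ produce pairwise majorities

\[
a \succ_{\mathrm{maj}} b, \qquad b \succ_{\mathrm{maj}} c, \qquad c \succ_{\mathrm{maj}} a,
\]

an intransitive cycle that no single ranking, and hence no aligned model, can match.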

Fairness

An Overview of Large Language Models for Statisticians

no code implementations • 25 Feb 2025 • Wenlong Ji, Weizhe Yuan, Emily Getzen, Kyunghyun Cho, Michael I. Jordan, Song Mei, Jason E Weston, Weijie J. Su, Jing Xu, Linjun Zhang

Large Language Models (LLMs) have emerged as transformative tools in artificial intelligence (AI), exhibiting remarkable capabilities across diverse tasks such as text generation, reasoning, and decision-making.

Causal Inference • Decision Making • +3

Robust Detection of Watermarks for Large Language Models Under Human Edits

1 code implementation • 21 Nov 2024 • Xiang Li, Feng Ruan, Huiyuan Wang, Qi Long, Weijie J. Su

We prove that the Tr-GoF test achieves optimality in robust detection of the Gumbel-max watermark in a certain asymptotic regime of substantial text modifications and vanishing watermark signals.
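
As background, here is a minimal sketch of the Gumbel-max watermark and the baseline sum-based detector that robust tests improve upon; the function names are illustrative, and the Tr-GoF statistic itself is a truncated goodness-of-fit test defined in the paper, not the plain sum shown here.

import numpy as np

def gumbel_max_token(probs, uniforms):
    # Watermarked decoding: given key-derived pseudorandom uniforms U_v,
    # emit the token maximizing U_v^(1/p_v); the output distribution still
    # matches probs, but the U's at selected tokens are pushed toward 1.
    return int(np.argmax(uniforms ** (1.0 / probs)))

def sum_detector(selected_uniforms):
    # Under unwatermarked (null) text, the U's at realized tokens are
    # i.i.d. Uniform(0,1), so -log(1 - U) is Exp(1); an unusually large
    # sum signals the presence of the watermark.
    return float(np.sum(-np.log(1.0 - np.asarray(selected_uniforms))))

Human edits replace watermarked tokens with unwatermarked ones, which is what degrades such sum statistics and motivates the robust Tr-GoF alternative.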

Debiasing Watermarks for Large Language Models via Maximal Coupling

1 code implementation • 17 Nov 2024 • Yangxinyu Xie, Xiang Li, Tanwi Mallick, Weijie J. Su, Ruixun Zhang

Watermarking language models is essential for distinguishing between human and machine-generated text and thus maintaining the integrity and trustworthiness of digital communication.

The 2020 United States Decennial Census Is More Private Than You (Might) Think

2 code implementations • 11 Oct 2024 • Buxin Su, Weijie J. Su, Chendi Wang

The U.S. Decennial Census serves as the foundation for many high-profile policy decision-making processes, including federal funding allocation and redistricting.

A Statistical Viewpoint on Differential Privacy: Hypothesis Testing, Representation and Blackwell's Theorem

no code implementations • 14 Sep 2024 • Weijie J. Su

We review techniques that render $f$-differential privacy a unified framework for analyzing privacy bounds in data analysis and machine learning.
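
For reference, the hypothesis-testing formulation: writing $\alpha_\phi$ and $\beta_\phi$ for the type I and type II errors of a rejection rule $\phi$ distinguishing $P$ from $Q$, the trade-off function is

\[
T(P, Q)(\alpha) \;=\; \inf\{\beta_\phi : \alpha_\phi \le \alpha\},
\]

and a mechanism $M$ is $f$-differentially private if $T(M(S), M(S')) \ge f$ for all neighboring datasets $S, S'$. Gaussian differential privacy is the special case $f = G_\mu$ with $G_\mu(\alpha) = \Phi(\Phi^{-1}(1-\alpha) - \mu)$.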

Informativeness • Privacy Preserving

A Law of Next-Token Prediction in Large Language Models

1 code implementation • 24 Aug 2024 • Hangfeng He, Weijie J. Su

Large language models (LLMs) have been widely employed across various application domains, yet their black-box nature poses significant challenges to understanding how these models process input data internally to make predictions.

Mamba

A Peek into Token Bias: Large Language Models Are Not Yet Genuine Reasoners

1 code implementation • 16 Jun 2024 • Bowen Jiang, Yangxinyu Xie, Zhuoqun Hao, Xiaomeng Wang, Tanwi Mallick, Weijie J. Su, Camillo J. Taylor, Dan Roth

This study introduces a hypothesis-testing framework to assess whether large language models (LLMs) possess genuine reasoning abilities or primarily depend on token bias.

Logical Reasoning

Bridging the Gap: Rademacher Complexity in Robust and Standard Generalization

no code implementations • 8 Jun 2024 • Jiancong Xiao, Ruoyu Sun, Qi Long, Weijie J. Su

We aim to construct a new cover that possesses two properties: 1) compatibility with adversarial examples, and 2) precision comparable to covers used in standard settings.

Tackling Copyright Issues in AI Image Generation Through Originality Estimation and Genericization

1 code implementation • 5 Jun 2024 • Hiroaki Chiba-Okabe, Weijie J. Su

As a practical implementation, we introduce PREGen (Prompt Rewriting-Enhanced Genericization), which combines our genericization method with an existing mitigation technique.

Image Generation

Towards Rationality in Language and Multimodal Agents: A Survey

1 code implementation • 1 Jun 2024 • Bowen Jiang, Yangxinyu Xie, Xiaomeng Wang, Yuan Yuan, Zhuoqun Hao, Xinyi Bai, Weijie J. Su, Camillo J. Taylor, Tanwi Mallick

This work discusses how to build more rational language and multimodal agents and what criteria define rationality in intelligent systems.

Decision Making • Survey

On the Algorithmic Bias of Aligning Large Language Models with RLHF: Preference Collapse and Matching Regularization

1 code implementation • 26 May 2024 • Jiancong Xiao, Ziniu Li, Xingyu Xie, Emily Getzen, Cong Fang, Qi Long, Weijie J. Su

To mitigate this algorithmic bias, we introduce preference matching (PM) RLHF, a novel approach that provably aligns LLMs with the preference distribution of the reward model under the Bradley-Terry-Luce/Plackett-Luce model.
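
For concreteness, the Bradley-Terry-Luce pairwise model referenced here posits, for a reward function $r$,

\[
\mathbb{P}(y_1 \succ y_2 \mid x) \;=\; \frac{e^{r(x, y_1)}}{e^{r(x, y_1)} + e^{r(x, y_2)}},
\]

and preference collapse is the failure mode in which the aligned policy piles nearly all its mass on the majority-preferred response instead of matching these probabilities.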

Decision Making • Text Generation

Neural Collapse Meets Differential Privacy: Curious Behaviors of NoisyGD with Near-perfect Representation Learning

no code implementations • 14 May 2024 • Chendi Wang, Yuqing Zhu, Weijie J. Su, Yu-Xiang Wang

A recent study by De et al. (2022) has reported that large-scale representation learning through pre-training on a public dataset significantly enhances differentially private (DP) learning in downstream tasks, despite the high dimensionality of the feature space.

Dimensionality Reduction • Representation Learning • +1

An Economic Solution to Copyright Challenges of Generative AI

no code implementations • 22 Apr 2024 • Jiachen T. Wang, Zhun Deng, Hiroaki Chiba-Okabe, Boaz Barak, Weijie J. Su

Generative artificial intelligence (AI) systems are trained on large data corpora to generate new pieces of text, images, videos, and other media.

Provable Multi-Party Reinforcement Learning with Diverse Human Feedback

no code implementations • 8 Mar 2024 • Huiying Zhong, Zhun Deng, Weijie J. Su, Zhiwei Steven Wu, Linjun Zhang

Our work initiates the theoretical study of multi-party RLHF that explicitly models the diverse preferences of multiple individuals.

Fairness • Meta-Learning • +2

What Should Data Science Education Do with Large Language Models?

no code implementations • 6 Jul 2023 • Xinming Tu, James Zou, Weijie J. Su, Linjun Zhang

LLMs can also play a significant role in the classroom as interactive teaching and learning tools, contributing to personalized education.

DP-HyPO: An Adaptive Private Hyperparameter Optimization Framework

no code implementations • 9 Jun 2023 • Hua Wang, Sheng Gao, Huanyu Zhang, Weijie J. Su, Milan Shen

In our paper, we introduce DP-HyPO, a pioneering framework for "adaptive" private hyperparameter optimization, aiming to bridge the gap between private and non-private hyperparameter optimization.

Hyperparameter Optimization • Privacy Preserving

Reward Collapse in Aligning Large Language Models

1 code implementation • 28 May 2023 • Ziang Song, Tianle Cai, Jason D. Lee, Weijie J. Su

This insight allows us to derive closed-form expressions for the reward distribution associated with a set of utility functions in an asymptotic regime.

The Implicit Regularization of Dynamical Stability in Stochastic Gradient Descent

no code implementations • 27 May 2023 • Lei Wu, Weijie J. Su

By contrast, for gradient descent (GD), stability imposes a similar constraint but only on the largest eigenvalue of the Hessian.

Isotonic Mechanism for Exponential Family Estimation in Machine Learning Peer Review

no code implementations • 21 Apr 2023 • Yuling Yan, Weijie J. Su, Jianqing Fan

In 2023, the International Conference on Machine Learning (ICML) required authors with multiple submissions to rank their submissions based on perceived quality.

A Law of Data Separation in Deep Learning

2 code implementations • 31 Oct 2022 • Hangfeng He, Weijie J. Su

While deep learning has enabled significant advances in many areas of science, its black-box nature hinders architecture design for future artificial intelligence applications and interpretation for high-stakes decision making.

Deep Learning

On Quantum Speedups for Nonconvex Optimization via Quantum Tunneling Walks

1 code implementation • 29 Sep 2022 • Yizhou Liu, Weijie J. Su, Tongyang Li

Classical algorithms are often not effective for solving nonconvex optimization problems where local minima are separated by high barriers.

Analytical Composition of Differential Privacy via the Edgeworth Accountant

1 code implementation • 9 Jun 2022 • Hua Wang, Sheng Gao, Huanyu Zhang, Milan Shen, Weijie J. Su

Many modern machine learning algorithms are composed of simple private algorithms; thus, an increasingly important problem is to efficiently compute the overall privacy loss under composition.

FIFA: Making Fairness More Generalizable in Classifiers Trained on Imbalanced Data

no code implementations • 6 Jun 2022 • Zhun Deng, Jiayao Zhang, Linjun Zhang, Ting Ye, Yates Coley, Weijie J. Su, James Zou

Specifically, FIFA encourages both classification and fairness generalization and can be flexibly combined with many existing fair learning methods with logits-based losses.

Classification • Fairness

ROCK: Causal Inference Principles for Reasoning about Commonsense Causality

1 code implementation • 31 Jan 2022 • Jiayao Zhang, Hongming Zhang, Weijie J. Su, Dan Roth

Commonsense causality reasoning (CCR) aims at identifying plausible causes and effects in natural language descriptions that are deemed reasonable by an average person.

Causal Inference

Envisioning Future Deep Learning Theories: Some Basic Concepts and Characteristics

no code implementations • 17 Dec 2021 • Weijie J. Su

To advance deep learning methodologies in the next decade, a theoretical framework for reasoning about modern neural networks is needed.

Deep Learning • Learning Theory

You Are the Best Reviewer of Your Own Papers: An Owner-Assisted Scoring Mechanism

no code implementations • 27 Oct 2021 • Weijie J. Su

To address this withholding of information, in this paper, I introduce the Isotonic Mechanism, a simple and efficient approach to improving imprecise raw scores by leveraging certain information that the owner is incentivized to provide.
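
A minimal sketch of the idea, assuming raw scores are adjusted by least-squares isotonic regression under the owner's claimed ranking (solvable by the pool-adjacent-violators algorithm); variable names are illustrative:

import numpy as np
from sklearn.isotonic import IsotonicRegression

def isotonic_mechanism(raw_scores, claimed_ranking):
    # claimed_ranking[i] is the index of the paper the owner ranks i-th best.
    # Solve: minimize ||x - raw_scores||^2 subject to x decreasing along the
    # claimed ranking, i.e., project the scores onto the monotone cone.
    y = np.asarray(raw_scores, dtype=float)[claimed_ranking]
    adjusted = IsotonicRegression(increasing=False).fit_transform(np.arange(len(y)), y)
    out = np.empty_like(y)
    out[claimed_ranking] = adjusted
    return out

# Four papers; the owner claims paper 3 is best, then 0, then 1, then 2.
print(isotonic_mechanism([6.0, 7.5, 5.0, 8.0], [3, 0, 1, 2]))
# -> [6.75 6.75 5.   8.  ]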

Imitating Deep Learning Dynamics via Locally Elastic Stochastic Differential Equations

1 code implementation • NeurIPS 2021 • Jiayao Zhang, Hua Wang, Weijie J. Su

Our main finding uncovers a sharp phase transition phenomenon regarding the intra-class impact: if the SDEs are locally elastic in the sense that the impact is more significant on samples from the same class as the input, the features of the training data become linearly separable, meaning vanishing training loss; otherwise, the features are not separable, regardless of how long the training time is.

Deep Learning

An Unconstrained Layer-Peeled Perspective on Neural Collapse

no code implementations • ICLR 2022 • Wenlong Ji, Yiping Lu, Yiliang Zhang, Zhun Deng, Weijie J. Su

We prove that gradient flow on this model converges to critical points of a minimum-norm separation problem exhibiting neural collapse in its global minimizer.

Weighted Training for Cross-Task Learning

1 code implementation • ICLR 2022 • Shuxiao Chen, Koby Crammer, Hangfeng He, Dan Roth, Weijie J. Su

In this paper, we introduce Target-Aware Weighted Training (TAWT), a weighted training algorithm for cross-task learning based on minimizing a representation-based task distance between the source and target tasks.

Chunking • named-entity-recognition • +6

Characterizing the SLOPE Trade-off: A Variational Perspective and the Donoho-Tanner Limit

1 code implementation • 27 May 2021 • Zhiqi Bu, Jason Klusowski, Cynthia Rush, Weijie J. Su

Sorted l1 regularization has been incorporated into many methods for solving high-dimensional statistical estimation problems, including the SLOPE estimator in linear regression.
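
For reference, SLOPE is penalized least squares with the sorted $\ell_1$ norm:

\[
\hat{\beta} \;=\; \operatorname*{argmin}_{b \in \mathbb{R}^p} \; \tfrac{1}{2}\lVert y - Xb \rVert_2^2 + \sum_{i=1}^{p} \lambda_i \lvert b \rvert_{(i)}, \qquad \lambda_1 \ge \cdots \ge \lambda_p \ge 0,
\]

where $\lvert b \rvert_{(1)} \ge \cdots \ge \lvert b \rvert_{(p)}$ are the absolute coordinates of $b$ in decreasing order; the lasso is the special case of equal $\lambda_i$.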

Variable Selection

Oneshot Differentially Private Top-k Selection

no code implementations • 18 May 2021 • Gang Qiao, Weijie J. Su, Li Zhang

Being able to efficiently and accurately select the top-$k$ elements with differential privacy is an integral component of various private data analysis tasks.
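
A minimal noise-then-report sketch of one-shot selection, assuming a Laplace perturbation of each count; the exact noise calibration achieving the paper's privacy guarantee should be taken from the paper itself.

import numpy as np

def oneshot_topk(counts, k, noise_scale, rng=None):
    # Perturb every count once, then release the indices of the k largest
    # noisy counts: a single pass with no iterative re-querying of the data.
    rng = rng or np.random.default_rng()
    noisy = np.asarray(counts, dtype=float) + rng.laplace(scale=noise_scale, size=len(counts))
    return np.argsort(noisy)[::-1][:k].tolist()

print(oneshot_topk([120, 95, 97, 40, 88], k=3, noise_scale=2.0))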

Rejoinder: Gaussian Differential Privacy

no code implementations • 5 Apr 2021 • Jinshuo Dong, Aaron Roth, Weijie J. Su

In this rejoinder, we aim to address two broad issues that cover most comments made in the discussion.

Privacy Preserving

A Central Limit Theorem for Differentially Private Query Answering

no code implementations • NeurIPS 2021 • Jinshuo Dong, Weijie J. Su, Linjun Zhang

The central question, therefore, is to understand which noise distribution optimizes the privacy-accuracy trade-off, especially when the dimension of the answer vector is high.

A Theorem of the Alternative for Personalized Federated Learning

no code implementations • 2 Mar 2021 • Shuxiao Chen, Qinqing Zheng, Qi Long, Weijie J. Su

A widely recognized difficulty in federated learning arises from the statistical heterogeneity among clients: local datasets often come from different but not entirely unrelated distributions, and personalization is, therefore, necessary to achieve optimal results from each individual's perspective.

Personalized Federated Learning

Federated $f$-Differential Privacy

1 code implementation • 22 Feb 2021 • Qinqing Zheng, Shuxiao Chen, Qi Long, Weijie J. Su

Federated learning (FL) is a training paradigm where the clients collaboratively learn models by repeatedly sharing information without compromising much on the privacy of their local sensitive data.

Federated Learning

Exploring Deep Neural Networks via Layer-Peeled Model: Minority Collapse in Imbalanced Training

1 code implementation • 29 Jan 2021 • Cong Fang, Hangfeng He, Qi Long, Weijie J. Su

More importantly, when moving to the imbalanced case, our analysis of the Layer-Peeled Model reveals a hitherto unknown phenomenon that we term Minority Collapse, which fundamentally limits the performance of deep learning models on the minority classes.
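
As a rough sketch (my paraphrase; the paper's exact normalization of the norm constraints may differ), the Layer-Peeled Model peels off all layers but the last and optimizes over the classifier $W$ and freely chosen features $H$:

\[
\min_{W,\,H} \; \frac{1}{N} \sum_{k=1}^{K} \sum_{i=1}^{n_k} \mathcal{L}\bigl(W h_{k,i},\, y_k\bigr)
\quad \text{s.t.} \quad
\frac{1}{K}\sum_{k=1}^{K} \lVert w_k \rVert_2^2 \le E_W, \qquad
\frac{1}{N}\sum_{k, i} \lVert h_{k,i} \rVert_2^2 \le E_H,
\]

and imbalance in the class sizes $n_k$ is what drives Minority Collapse in its solutions.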

Toward Better Generalization Bounds with Locally Elastic Stability

no code implementations • 27 Oct 2020 • Zhun Deng, Hangfeng He, Weijie J. Su

Given that, we propose locally elastic stability as a weaker and distribution-dependent stability notion, which still yields exponential generalization bounds.

Generalization Bounds • Learning Theory

Precise High-Dimensional Asymptotics for Quantifying Heterogeneous Transfers

no code implementations • 22 Oct 2020 • Fan Yang, Hongyang R. Zhang, Sen Wu, Christopher Ré, Weijie J. Su

For example, we can identify a phase transition in the high-dimensional linear regression setting from positive transfer to negative transfer under a model shift between the source and target tasks.

Multi-Task Learning • text-classification • +1

Label-Aware Neural Tangent Kernel: Toward Better Generalization and Local Elasticity

1 code implementation • NeurIPS 2020 • Shuxiao Chen, Hangfeng He, Weijie J. Su

As a popular approach to modeling the dynamics of training overparametrized neural networks (NNs), the neural tangent kernel (NTK) is known to fall behind real-world NNs in generalization ability.

Towards Understanding the Dynamics of the First-Order Adversaries

no code implementations • ICML 2020 • Zhun Deng, Hangfeng He, Jiaoyang Huang, Weijie J. Su

An acknowledged weakness of neural networks is their vulnerability to adversarial perturbations to the inputs.

The Complete Lasso Tradeoff Diagram

2 code implementations • NeurIPS 2020 • Hua Wang, Yachong Yang, Zhiqi Bu, Weijie J. Su

A fundamental problem in high-dimensional regression is to understand the tradeoff between type I and type II errors or, equivalently, false discovery rate (FDR) and power in variable selection.

Statistics Theory • Information Theory

On Learning Rates and Schrödinger Operators

no code implementations • 15 Apr 2020 • Bin Shi, Weijie J. Su, Michael I. Jordan

In this paper, we present a general theoretical analysis of the effect of the learning rate in stochastic gradient descent (SGD).

Sharp Composition Bounds for Gaussian Differential Privacy via Edgeworth Expansion

1 code implementation • ICML 2020 • Qinqing Zheng, Jinshuo Dong, Qi Long, Weijie J. Su

To address this question, we introduce a family of analytical and sharp privacy bounds under composition using the Edgeworth expansion in the framework of the recently proposed f-differential privacy.

Deep Learning with Gaussian Differential Privacy

3 code implementations • 26 Nov 2019 • Zhiqi Bu, Jinshuo Dong, Qi Long, Weijie J. Su

Leveraging the appealing properties of $f$-differential privacy in handling composition and subsampling, this paper derives analytically tractable expressions for the privacy guarantees of both stochastic gradient descent and Adam used in training deep neural networks, without the need to develop sophisticated techniques as [3] did.
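
A minimal sketch of the noisy gradient step whose privacy such analyses track (per-example clipping plus Gaussian noise); parameter names are illustrative, and calibrating noise_multiplier to a target guarantee is exactly what the $f$-DP machinery makes tractable.

import numpy as np

def noisy_clipped_gradient(per_example_grads, clip_norm, noise_multiplier, rng):
    # Clip each per-example gradient to L2 norm at most clip_norm, average,
    # then add Gaussian noise scaled to the clipping threshold.
    clipped = [g * min(1.0, clip_norm / max(np.linalg.norm(g), 1e-12))
               for g in per_example_grads]
    mean = np.mean(clipped, axis=0)
    sigma = noise_multiplier * clip_norm / len(per_example_grads)
    return mean + rng.normal(0.0, sigma, size=mean.shape)

rng = np.random.default_rng(0)
grads = [np.array([3.0, 4.0]), np.array([0.5, -0.5])]
print(noisy_clipped_gradient(grads, clip_norm=1.0, noise_multiplier=1.1, rng=rng))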

Deep Learning • General Classification • +4

The Local Elasticity of Neural Networks

1 code implementation • ICLR 2020 • Hangfeng He, Weijie J. Su

This phenomenon is shown to persist for neural networks with nonlinear activation functions through extensive simulations on real-life and synthetic datasets, whereas this is not observed in linear classifiers.

Clustering

Gaussian Differential Privacy

3 code implementations • 7 May 2019 • Jinshuo Dong, Aaron Roth, Weijie J. Su

More precisely, the privacy guarantees of any hypothesis-testing-based definition of privacy (including original DP) converge to GDP in the limit under composition.
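
One identity that makes GDP convenient in practice: composing $n$ mechanisms that are $\mu_1$-, $\ldots$, $\mu_n$-GDP yields a mechanism that is exactly Gaussian again,

\[
G_{\mu_1} \otimes G_{\mu_2} \otimes \cdots \otimes G_{\mu_n} \;=\; G_{\sqrt{\mu_1^2 + \cdots + \mu_n^2}},
\]

so privacy degrades under composition simply by adding the $\mu_i^2$.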

Two-sample testing

Acceleration via Symplectic Discretization of High-Resolution Differential Equations

no code implementations • NeurIPS 2019 • Bin Shi, Simon S. Du, Weijie J. Su, Michael I. Jordan

We study first-order optimization methods obtained by discretizing ordinary differential equations (ODEs) corresponding to Nesterov's accelerated gradient methods (NAGs) and Polyak's heavy-ball method.

Vocal Bursts Intensity Prediction

Understanding the Acceleration Phenomenon via High-Resolution Differential Equations

no code implementations • 21 Oct 2018 • Bin Shi, Simon S. Du, Michael I. Jordan, Weijie J. Su

We also show that these ODEs are more accurate surrogates for the underlying algorithms; in particular, they not only distinguish between NAG-SC and Polyak's heavy-ball method, but also allow the identification of a term that we refer to as the "gradient correction", which is present in NAG-SC but not in the heavy-ball method and is responsible for the qualitative difference in convergence of the two methods.
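
Up to notation, the paper's high-resolution ODE for NAG-SC (step size $s$, $\mu$-strongly convex $f$) takes the form

\[
\ddot{X}(t) + 2\sqrt{\mu}\,\dot{X}(t) + \sqrt{s}\,\nabla^2 f\bigl(X(t)\bigr)\dot{X}(t) + \bigl(1 + \sqrt{\mu s}\bigr)\nabla f\bigl(X(t)\bigr) = 0,
\]

and the heavy-ball ODE is the same equation without the $\sqrt{s}\,\nabla^2 f(X)\dot{X}$ term; that term is the gradient correction responsible for the difference in convergence.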

Vocal Bursts Intensity Prediction

Differentially Private False Discovery Rate Control

no code implementations • 11 Jul 2018 • Cynthia Dwork, Weijie J. Su, Li Zhang

Differential privacy provides a rigorous framework for privacy-preserving data analysis.

Privacy Preserving • Two-sample testing

Robust Inference Under Heteroskedasticity via the Hadamard Estimator

1 code implementation • 1 Jul 2018 • Edgar Dobriban, Weijie J. Su

In this paper, we propose methods that are robust to large and unequal noise in different observational units (i.e., heteroskedasticity) for statistical inference in linear regression.

Statistics Theory • Methodology

HiGrad: Uncertainty Quantification for Online Learning and Stochastic Approximation

no code implementations • 13 Feb 2018 • Weijie J. Su, Yuancheng Zhu

Stochastic gradient descent (SGD) is an immensely popular approach for online learning in settings where data arrives in a stream or data sizes are very large.

Uncertainty Quantification

When Is the First Spurious Variable Selected by Sequential Regression Procedures?

no code implementations • 10 Aug 2017 • Weijie J. Su

In a regime of certain sparsity levels, however, three examples of sequential procedures (forward stepwise, the lasso, and least angle regression) are shown to include the first spurious variable unexpectedly early.

Model Selection • regression

Detecting Multiple Replicating Signals using Adaptive Filtering Procedures

1 code implementation • 11 Oct 2016 • Jingshu Wang, Lin Gui, Weijie J. Su, Chiara Sabatti, Art B. Owen

Replicability is a fundamental quality of scientific discoveries: we are interested in those signals that are detectable in different laboratories, study populations, across time, etc.

Methodology
