1 code implementation • 9 Apr 2025 • Li An, Yujian Liu, Yepeng Liu, Yang Zhang, Yuheng Bu, Shiyu Chang
We identify two core challenges that make defending against spoofing difficult: (1) the need for watermarks to be both sensitive to semantic-distorting changes and insensitive to semantic-preserving edits, and (2) the contradiction between the need to detect global semantic shifts and the local, auto-regressive nature of most watermarking schemes.
no code implementations • 15 Feb 2025 • Yepeng Liu, Xuandong Zhao, Dawn Song, Yuheng Bu
Retrieval-Augmented Generation (RAG) has become an effective method for enhancing large language models (LLMs) with up-to-date knowledge.
no code implementations • 27 Jan 2025 • Haiyun He, Yepeng Liu, Ziqiao Wang, Yongyi Mao, Yuheng Bu
This paper introduces a novel problem, distributional information embedding, motivated by the practical demands of multi-bit watermarking for large language models (LLMs).
1 code implementation • 7 Oct 2024 • Yepeng Liu, Yiren Song, Hai Ci, Yu Zhang, Haofan Wang, Mike Zheng Shou, Yuheng Bu
This scheme adds varying numbers of noise steps to the latent representation of the watermarked image, followed by a controlled denoising process starting from this noisy latent representation.
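Below is a minimal sketch of this noise-then-denoise procedure, assuming a diffusers-style DDPM scheduler API and a hypothetical `denoiser` model; it is an illustration of the idea, not the paper's implementation.

```python
import torch

def noise_and_denoise(latent, denoiser, scheduler, num_noise_steps):
    # Forward process: push the watermarked latent `num_noise_steps`
    # steps toward Gaussian noise.
    t = torch.tensor([num_noise_steps])
    noise = torch.randn_like(latent)
    noisy_latent = scheduler.add_noise(latent, noise, t)

    # Reverse process: denoise step by step back to a clean latent,
    # starting from the noisy latent rather than from pure noise.
    for step in reversed(range(num_noise_steps)):
        eps = denoiser(noisy_latent, torch.tensor([step]))  # predicted noise
        noisy_latent = scheduler.step(eps, step, noisy_latent).prev_sample
    return noisy_latent
```

Varying `num_noise_steps` trades off how much of the original watermarked structure survives the regeneration.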
no code implementations • 3 Oct 2024 • Haiyun He, Yepeng Liu, Ziqiao Wang, Yongyi Mao, Yuheng Bu
Our approach focuses on maximizing detection performance while maintaining control over the worst-case Type-I error and text distortion.
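For context, worst-case Type-I error in a green-list style detector (in the spirit of Kirchenbauer et al., not necessarily the scheme analyzed here) can be controlled by thresholding the green-token count against its binomial null distribution; a minimal sketch:

```python
from scipy.stats import binom

def detection_threshold(n_tokens, gamma, alpha):
    """Smallest green-token count k such that, under the no-watermark
    null (count ~ Binomial(n_tokens, gamma)), P(count >= k) <= alpha."""
    return int(binom.isf(alpha, n_tokens, gamma)) + 1

# Example: 200 tokens, green-list fraction 0.5, Type-I error capped at 1e-3.
print(detection_threshold(200, 0.5, 1e-3))
```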
1 code implementation • 13 Sep 2024 • Dingyi Zhuang, Yuheng Bu, Guang Wang, Shenhao Wang, Jinhua Zhao
Quantifying uncertainty is crucial for robust and reliable predictions.
no code implementations • 30 Apr 2024 • Chongyang Shi, Yuheng Bu, Jie Fu
The goal of the observer is to infer some secret, represented by a random variable, from its partial observations, while the goal of the planning agent is to make the secret maximally opaque to the observer while achieving a satisfactory total return.
1 code implementation • 9 Feb 2024 • Maohao Shen, J. Jon Ryu, Soumya Ghosh, Yuheng Bu, Prasanna Sattigeri, Subhro Das, Gregory W. Wornell
This paper questions the effectiveness of a modern predictive uncertainty quantification approach, called evidential deep learning (EDL), in which a single neural network model is trained to learn a meta distribution over the predictive distribution by minimizing a specific objective function.
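As background, a minimal sketch of the EDL setup (following Sensoy et al.; helper names ours): the network outputs non-negative evidence, which parameterizes a Dirichlet meta-distribution over class probabilities.

```python
import torch
import torch.nn.functional as F

def edl_predict(logits):
    evidence = F.softplus(logits)              # non-negative evidence
    alpha = evidence + 1.0                     # Dirichlet concentration
    strength = alpha.sum(-1, keepdim=True)     # total evidence
    probs = alpha / strength                   # mean of the Dirichlet
    uncertainty = logits.shape[-1] / strength  # vacuity: K / total evidence
    return probs, uncertainty
```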
1 code implementation • 6 Feb 2024 • J. Jon Ryu, Xiangxiang Xu, H. S. Melihcan Erol, Yuheng Bu, Lizhong Zheng, Gregory W. Wornell
Computing eigenvalue decomposition (EVD) of a given linear operator, or finding its leading eigenvalues and eigenfunctions, is a fundamental task in many machine learning and scientific computing problems.
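As classical context for this task, a minimal NumPy sketch of subspace iteration for the leading eigenpairs of a symmetric matrix (the paper itself develops a learning-based approach for general operators):

```python
import numpy as np

def subspace_iteration(A, k, iters=500, seed=0):
    rng = np.random.default_rng(seed)
    Q = np.linalg.qr(rng.standard_normal((A.shape[0], k)))[0]
    for _ in range(iters):
        Q = np.linalg.qr(A @ Q)[0]      # power step + re-orthogonalization
    eigvals = np.diag(Q.T @ A @ Q)      # Rayleigh quotients
    return eigvals, Q

A = np.cov(np.random.default_rng(1).standard_normal((5, 200)))
vals, vecs = subspace_iteration(A, k=2)  # two leading eigenpairs of A
```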
1 code implementation • 25 Jan 2024 • Yepeng Liu, Yuheng Bu
The advancement of Large Language Models (LLMs) has led to increasing concerns about the misuse of AI-generated text, and watermarking for LLM-generated text has emerged as a potential solution.
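For background, a minimal sketch of one popular watermarking scheme (the green/red list approach of Kirchenbauer et al.), shown only as context, not the scheme proposed here:

```python
import torch

def watermark_logits(logits, prev_token, vocab_size, gamma=0.5, delta=2.0):
    """Bias next-token logits (1-D over the vocabulary) toward a
    pseudo-random 'green' list seeded by the previous token."""
    g = torch.Generator().manual_seed(int(prev_token))
    green = torch.randperm(vocab_size, generator=g)[: int(gamma * vocab_size)]
    logits = logits.clone()
    logits[green] += delta   # make green tokens more likely to be sampled
    return logits
```

Detection then reduces to counting how many generated tokens fall in their green lists.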
no code implementations • 5 Jan 2024 • Firas Laakom, Yuheng Bu, Moncef Gabbouj
Existing generalization theories of supervised learning typically take a holistic approach and provide bounds for the expected generalization over the whole data distribution, which implicitly assumes that the model generalizes similarly for all the classes.
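One way to formalize the gap the authors point to (notation ours) is a per-class generalization error,

$$\overline{\operatorname{gen}}_c \;=\; \mathbb{E}_{W,S}\bigl[L_{\mu_c}(W) - \hat{L}_{S_c}(W)\bigr],$$

where $\mu_c$ is the data distribution conditioned on class $c$, $S_c$ denotes the training samples of class $c$, and $W$ is the learned hypothesis; holistic bounds control only the average of these quantities over classes.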
no code implementations • 8 Jun 2023 • Haobo Chen, Yuheng Bu, Gregory W. Wornell
Double descent refers to the unexpected drop in the test loss of a learning algorithm beyond the interpolation threshold under over-parameterization, a behavior that classical forms of information criteria fail to predict due to limitations of the standard asymptotic approach.
no code implementations • 31 May 2023 • Bo Hu, Yuheng Bu, José C. Príncipe
This paper proposes the Hierarchical Functional Maximal Correlation Algorithm (HFMCA), a hierarchical methodology that characterizes dependencies across two hierarchical levels in multiview systems.
no code implementations • 14 May 2023 • Amir Weiss, Alejandro Lancho, Yuheng Bu, Gregory W. Wornell
A bilateral (i.e., upper and lower) bound on the mean-square error under a general model mismatch is developed.
1 code implementation • 30 Apr 2023 • Maohao Shen, Soumya Ghosh, Prasanna Sattigeri, Subhro Das, Yuheng Bu, Gregory Wornell
Due to privacy or commercial constraints, large pre-trained language models (PLMs) are often offered as black-box APIs.
no code implementations • 27 Apr 2023 • Yuheng Bu, Harsha Vardhan Tetali, Gholamali Aminian, Miguel Rodrigues, Gregory Wornell
We analyze the generalization ability of joint-training meta learning algorithms via the Gibbs algorithm.
no code implementations • 16 Feb 2023 • Abhin Shah, Maohao Shen, Jongha Jon Ryu, Subhro Das, Prasanna Sattigeri, Yuheng Bu, Gregory W. Wornell
To overcome this limitation, we propose a bootstrap-based algorithm that achieves the target level of fairness despite the uncertainty in sensitive attributes.
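A minimal sketch of the bootstrap idea, with hypothetical inputs and a demographic-parity-style gap; the paper's actual algorithm and fairness criterion may differ:

```python
import numpy as np

def conservative_fairness_gap(y_pred, sensitive_prob, n_boot=1000, q=0.95, seed=0):
    """Estimate a conservative (q-quantile) fairness gap when the sensitive
    attribute is only known through predicted probabilities."""
    rng = np.random.default_rng(seed)
    n, gaps = len(y_pred), []
    for _ in range(n_boot):
        idx = rng.integers(0, n, n)                  # bootstrap resample
        s = rng.random(n) < sensitive_prob[idx]      # sample attributes
        g0, g1 = y_pred[idx][~s], y_pred[idx][s]
        if len(g0) and len(g1):
            gaps.append(abs(g0.mean() - g1.mean()))  # group mean gap
    return np.quantile(gaps, q)
```

Acting on a conservative quantile rather than a point estimate is what lets such a procedure hit a target fairness level despite attribute uncertainty.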
1 code implementation • 14 Dec 2022 • Maohao Shen, Yuheng Bu, Prasanna Sattigeri, Soumya Ghosh, Subhro Das, Gregory Wornell
It is well known that neural networks tend to be over-confident when the output label distribution is used directly to generate uncertainty measures.
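As simple context for quantifying this over-confidence, a minimal expected calibration error (ECE) computation, a standard diagnostic rather than the paper's method:

```python
import numpy as np

def expected_calibration_error(conf, correct, n_bins=10):
    """conf: predicted confidences in (0, 1]; correct: 0/1 outcomes."""
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(bins[:-1], bins[1:]):
        mask = (conf > lo) & (conf <= hi)
        if mask.any():
            # |mean confidence - accuracy| weighted by the bin's mass
            ece += mask.mean() * abs(conf[mask].mean() - correct[mask].mean())
    return ece
```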
no code implementations • 15 Oct 2022 • Haiyun He, Gholamali Aminian, Yuheng Bu, Miguel Rodrigues, Vincent Y. F. Tan
Our findings offer the new insight that the generalization performance of SSL with pseudo-labeling is affected not only by the information between the output hypothesis and the input training data but also by the information shared between the labeled and pseudo-labeled data samples.
1 code implementation • 11 Sep 2022 • Alejandro Lancho, Amir Weiss, Gary C. F. Lee, Jennifer Tang, Yuheng Bu, Yury Polyanskiy, Gregory W. Wornell
We study the potential of data-driven deep learning methods for separation of two communication signals from an observation of their mixture.
1 code implementation • 22 Aug 2022 • Gary C. F. Lee, Amir Weiss, Alejandro Lancho, Jennifer Tang, Yuheng Bu, Yury Polyanskiy, Gregory W. Wornell
We study the problem of single-channel source separation (SCSS), focusing on cyclostationary signals, which arise in a variety of application domains.
no code implementations • Entropy 2022 • Joshua Lee, Yuheng Bu, Prasanna Sattigeri, Rameswar Panda, Gregory W. Wornell, Leonid Karlinsky, Rogerio Schmidt Feris
As machine learning algorithms grow in popularity and diversify to many industries, ethical and legal concerns regarding their fairness have become increasingly relevant.
no code implementations • 24 Feb 2022 • Gholamali Aminian, Yuheng Bu, Gregory Wornell, Miguel Rodrigues
Due to the convexity of the information measures, the proposed bounds in terms of Wasserstein distance and total variation distance are shown to be tighter than their counterparts based on individual samples in the literature.
1 code implementation • 1 Feb 2022 • Maohao Shen, Yuheng Bu, Gregory Wornell
Due to privacy, storage, and other constraints, there is a growing need for unsupervised domain adaptation techniques in machine learning that do not require access to the data used to train a collection of source models.
no code implementations • NeurIPS 2021 • Gholamali Aminian, Yuheng Bu, Laura Toni, Miguel Rodrigues, Gregory Wornell
Various approaches have been developed to upper bound the generalization error of a supervised learning algorithm.
no code implementations • 2 Nov 2021 • Yuheng Bu, Gholamali Aminian, Laura Toni, Miguel Rodrigues, Gregory Wornell
We provide an information-theoretic analysis of the generalization ability of Gibbs-based transfer learning algorithms by focusing on two popular transfer learning approaches, $\alpha$-weighted-ERM and two-stage-ERM.
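For concreteness, the $\alpha$-weighted-ERM objective mixes the target and source empirical risks (notation ours),

$$\hat{\theta}_{\alpha} \;=\; \arg\min_{\theta}\; \alpha\,\hat{R}_{T}(\theta) + (1-\alpha)\,\hat{R}_{S}(\theta), \qquad \alpha \in [0,1],$$

where $\hat{R}_{S}$ and $\hat{R}_{T}$ are the empirical risks on the source and target samples, while two-stage-ERM first minimizes the source risk and then refines the solution on the target samples.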
1 code implementation • 28 Oct 2021 • Abhin Shah, Yuheng Bu, Joshua Ka-Wing Lee, Subhro Das, Rameswar Panda, Prasanna Sattigeri, Gregory W. Wornell
Selective regression allows abstention from prediction if the confidence to make an accurate prediction is not sufficient.
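A minimal sketch of the abstention mechanism, using ensemble disagreement as the confidence proxy (one common choice, not necessarily the paper's):

```python
import numpy as np

def selective_predict(ensemble_preds, tau):
    """ensemble_preds: (n_models, n_samples) array of predictions.
    Predict the ensemble mean only where the models agree closely."""
    mean = ensemble_preds.mean(axis=0)
    std = ensemble_preds.std(axis=0)   # disagreement as uncertainty proxy
    accept = std < tau                 # abstain where uncertainty is high
    return np.where(accept, mean, np.nan), accept
```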
no code implementations • 28 Jul 2021 • Gholamali Aminian, Yuheng Bu, Laura Toni, Miguel R. D. Rodrigues, Gregory Wornell
As a result, they may fail to characterize the exact generalization ability of a learning algorithm.
no code implementations • 30 Dec 2020 • Joshua Lee, Yuheng Bu, Prasanna Sattigeri, Rameswar Panda, Gregory Wornell, Leonid Karlinsky, Rogerio Feris
As machine learning algorithms grow in popularity and diversify to many industries, ethical and legal concerns regarding their fairness have become increasingly relevant.
no code implementations • 4 Apr 2019 • Craig Wilson, Yuheng Bu, Venugopal Veeravalli
We extend the framework introduced in [3] for solving a sequence of stochastic optimization problems with bounded changes in the minimizers, and apply it to machine learning problems such as regression and classification.
1 code implementation • 27 Jan 2019 • Yuheng Bu, Weihao Gao, Shaofeng Zou, Venugopal V. Veeravalli
We show that model compression can improve the population risk of a pre-trained model, by studying the tradeoff between the decrease in the generalization error and the increase in the empirical risk with model compression.
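The tradeoff can be read off the standard decomposition (notation ours),

$$\underbrace{L_{\mu}(w)}_{\text{population risk}} \;=\; \underbrace{L_{E}(w)}_{\text{empirical risk}} \;+\; \underbrace{\operatorname{gen}(w)}_{\text{generalization error}},$$

so compression improves the population risk whenever the resulting decrease in the generalization error outweighs the increase in the empirical risk.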
no code implementations • 15 Jan 2019 • Yuheng Bu, Shaofeng Zou, Venugopal V. Veeravalli
The bound is derived under more general conditions on the loss function than in existing studies; nevertheless, it provides a tighter characterization of the generalization error.
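For context, the individual-sample mutual information bound from this line of work takes the form (for a $\sigma$-sub-Gaussian loss; notation ours)

$$\Bigl|\mathbb{E}\bigl[\operatorname{gen}(\mu, P_{W|S})\bigr]\Bigr| \;\le\; \frac{1}{n}\sum_{i=1}^{n} \sqrt{2\sigma^{2}\, I(W; Z_i)},$$

where $W$ is the output hypothesis and $Z_i$ the $i$-th training sample; for independent samples, this is never looser than the corresponding bound based on the full-sample mutual information $I(W; S)$.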
no code implementations • 30 Nov 2018 • Yuheng Bu, Kevin Small
While recommendation systems generally observe user behavior passively, there has been an increased interest in directly querying users to learn their specific preferences.
no code implementations • 19 Nov 2018 • Yuheng Bu, Jiaxun Lu, Venugopal V. Veeravalli
The goal is to detect whether the change in the model is significant, i.e., whether the difference between the pre-change parameter and the post-change parameter $\|\theta-\theta'\|_2$ is larger than a pre-determined threshold $\rho$.
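In other words, the task is the composite hypothesis test (in the notation of the excerpt)

$$\mathcal{H}_0:\ \|\theta - \theta'\|_2 \le \rho \qquad \text{versus} \qquad \mathcal{H}_1:\ \|\theta - \theta'\|_2 > \rho.$$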
no code implementations • 29 May 2018 • Yuheng Bu, Jiaxun Lu, Venugopal V. Veeravalli
Furthermore, we construct an estimator of the change in the learning problems from the active learning samples, which yields an adaptive sample-size selection rule that guarantees the excess risk remains bounded for a sufficiently large number of time steps.
no code implementations • 21 Jan 2017 • Yuheng Bu, Shaofeng Zou, Venugopal V. Veeravalli
A sequence is considered outlying if its observations are generated by a distribution different from the one generating the observations in the majority of the sequences.
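A minimal sketch of one classical test for this setting over a finite alphabet: flag the sequence whose empirical distribution is farthest, in total KL divergence, from those of the others. This is illustrative context; the paper studies the universal (distribution-free) regime.

```python
import numpy as np

def kl(p, q, eps=1e-12):
    p, q = p + eps, q + eps          # smoothing to avoid log(0)
    return float(np.sum(p * np.log(p / q)))

def find_outlier(sequences, alphabet_size):
    # Empirical pmf of each integer-valued sequence.
    pmfs = [np.bincount(s, minlength=alphabet_size) / len(s) for s in sequences]
    scores = [sum(kl(pi, pj) for j, pj in enumerate(pmfs) if j != i)
              for i, pi in enumerate(pmfs)]
    return int(np.argmax(scores))    # index of the outlying sequence
```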