no code implementations • 26 Mar 2025 • Ryumei Nakada, Wenlong Ji, Tianxi Cai, James Zou, Linjun Zhang
Prompt engineering has emerged as a powerful technique for guiding large language models (LLMs) toward desired responses, significantly enhancing their performance across diverse tasks.
no code implementations • 25 Feb 2025 • Wenlong Ji, Weizhe Yuan, Emily Getzen, Kyunghyun Cho, Michael I. Jordan, Song Mei, Jason E Weston, Weijie J. Su, Jing Xu, Linjun Zhang
Large Language Models (LLMs) have emerged as transformative tools in artificial intelligence (AI), exhibiting remarkable capabilities across diverse tasks such as text generation, reasoning, and decision-making.
no code implementations • 25 Feb 2025 • Tianze Wang, Dongnan Gui, Yifan Hu, Shuhang Lin, Linjun Zhang
Reinforcement Learning from Human Feedback (RLHF) has shown promise in aligning large language models (LLMs).
no code implementations • 16 Feb 2025 • Tianci Liu, Haoxiang Jiang, Tianze Wang, Ran Xu, Yue Yu, Linjun Zhang, Tuo Zhao, Haoyu Wang
Large language models (LLMs) have achieved impressive performance but face high computational costs and latency, limiting their deployment in resource-constrained settings.
no code implementations • 5 Jan 2025 • Yinpeng Cai, Lexin Li, Linjun Zhang
To address this issue, we propose embedding watermarks into the copyrighted training data and formulating the detection of data misappropriation as a hypothesis testing problem.
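The paper frames data-misappropriation detection as a hypothesis test on planted watermarks. As a minimal illustration of that framing (not the paper's actual test statistic or watermarking scheme), one can count how often a model reproduces planted watermark tokens and run a one-sided z-test against the chance rate; the chance rate `p0` here is an assumed placeholder.

```python
import math

def watermark_test(matches: int, n: int, p0: float = 0.5) -> float:
    """One-sided z-test sketch: under H0 (no misappropriation) the model
    reproduces planted watermarks at the chance rate p0; a small p-value
    suggests the copyrighted training data was used."""
    se = math.sqrt(p0 * (1 - p0) / n)
    z = (matches / n - p0) / se
    # Standard normal survival function via the complementary error function.
    return 0.5 * math.erfc(z / math.sqrt(2))

# 70 of 100 planted watermarks reproduced vs. a 50% chance rate.
p_value = watermark_test(70, 100)
```

With 70 matches out of 100 the z-statistic is 4, so the p-value is far below conventional significance levels; at exactly the chance rate the test returns 0.5, as it should.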
no code implementations • 9 Dec 2024 • Xinyu Yang, Jixuan Leng, Geyang Guo, Jiawei Zhao, Ryumei Nakada, Linjun Zhang, Huaxiu Yao, Beidi Chen
Utilizing this key insight, we propose a family of Structured Sparse Fine-Tuning (S$^{2}$FT) methods for LLMs, which concurrently achieve state-of-the-art fine-tuning performance, training efficiency, and inference scalability.
no code implementations • 20 Nov 2024 • Cynthia Dwork, Pranay Tankala, Linjun Zhang
We provide sharp theoretical estimates of the error of several well-studied differentially private algorithms for robust linear regression and logistic regression, including output perturbation, objective perturbation, and noisy stochastic gradient descent, in the proportional dimensionality regime.
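Output perturbation, one of the three mechanisms analyzed, releases a non-private estimator plus calibrated noise. The sketch below shows the generic Gaussian-mechanism version for least squares; the sensitivity bound is an assumed input (deriving it is the problem-specific part, and the paper's sharp proportional-regime analysis is not reproduced here).

```python
import numpy as np

def dp_ols_output_perturbation(X, y, epsilon, delta, sensitivity, seed=0):
    """Output perturbation sketch: fit ordinary least squares non-privately,
    then add Gaussian noise whose scale is set by an externally supplied
    L2-sensitivity bound and the (epsilon, delta) privacy budget."""
    beta_hat = np.linalg.lstsq(X, y, rcond=None)[0]
    sigma = sensitivity * np.sqrt(2 * np.log(1.25 / delta)) / epsilon
    rng = np.random.default_rng(seed)
    return beta_hat + rng.normal(0.0, sigma, size=beta_hat.shape)

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 2))
beta = np.array([1.0, 2.0])
y = X @ beta                      # noiseless toy regression
beta_priv = dp_ols_output_perturbation(X, y, epsilon=1.0, delta=1e-5,
                                       sensitivity=0.05)
```

As the privacy budget epsilon grows, the noise scale shrinks and the private estimate converges to the ordinary least-squares solution.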
no code implementations • 4 Nov 2024 • Fan Nie, Xiaotian Hou, Shuhang Lin, James Zou, Huaxiu Yao, Linjun Zhang
The propensity of Large Language Models (LLMs) to generate hallucinations and non-factual content undermines their reliability in high-stakes domains, where rigorous control over Type I errors (the conditional probability of incorrectly classifying hallucinations as truthful content) is essential.
no code implementations • 21 Oct 2024 • Xiaotian Hou, Linjun Zhang
Algorithmic fairness in machine learning has recently garnered significant attention.
1 code implementation • 16 Oct 2024 • Peng Xia, Kangyu Zhu, Haoran Li, Tianze Wang, Weijia Shi, Sheng Wang, Linjun Zhang, James Zou, Huaxiu Yao
Artificial Intelligence (AI) has demonstrated significant potential in healthcare, particularly in disease diagnosis and treatment planning.
no code implementations • 2 Oct 2024 • Yibo Zhong, Haoxiang Jiang, Lincan Li, Ryumei Nakada, Tianci Liu, Linjun Zhang, Huaxiu Yao, Haoyu Wang
The nonlinear approximation directly models the cumulative updates, effectively capturing complex and non-linear structures in the weight updates.
1 code implementation • 6 Jul 2024 • Peng Xia, Kangyu Zhu, Haoran Li, Hongtu Zhu, Yun Li, Gang Li, Linjun Zhang, Huaxiu Yao
Second, in cases where the model originally responds correctly, applying RAG can lead to an over-reliance on retrieved contexts, resulting in incorrect answers.
no code implementations • 23 Jun 2024 • Zexing Xu, Linjun Zhang, Sitan Yang, Rasoul Etesami, Hanghang Tong, Huan Zhang, Jiawei Han
In this paper, we propose a novel approach that leverages strategically chosen proxy data reflective of potential sales patterns from similar entities during non-peak periods, enriched by features learned from a graph neural network (GNN)-based forecasting model, to predict demand during peak events.
1 code implementation • 5 Jun 2024 • Ryumei Nakada, Yichen Xu, Lexin Li, Linjun Zhang
In the context of imbalanced data, LLMs are used to oversample underrepresented groups and have shown promising improvements.
1 code implementation • 4 Jun 2024 • Reid McIlroy-Young, Katrina Brown, Conlan Olson, Linjun Zhang, Cynthia Dwork
One problematic inconsistency when LLMs are used to answer multiple-choice questions or analyze multiple inputs is order dependency: the output of an LLM can (and often does) change significantly when sub-sequences are swapped, despite both orderings being semantically identical.
1 code implementation • 23 May 2024 • Yiyang Zhou, Zhiyuan Fan, Dongjie Cheng, Sihan Yang, Zhaorun Chen, Chenhang Cui, Xiyao Wang, Yun Li, Linjun Zhang, Huaxiu Yao
In the reward modeling, we employ a step-wise strategy and incorporate visual constraints into the self-rewarding process to place greater emphasis on visual input.
Ranked #127 on Visual Question Answering on MM-Vet
no code implementations • 3 May 2024 • Lujing Zhang, Aaron Roth, Linjun Zhang
This paper introduces a framework for post-processing machine learning models so that their predictions satisfy multi-group fairness guarantees.
no code implementations • 25 Apr 2024 • Zhe Zhang, Ryumei Nakada, Linjun Zhang
Differentially private federated learning is crucial for maintaining privacy in distributed environments.
no code implementations • 14 Apr 2024 • Xiufan Yu, Linjun Zhang, Arun Srinivasan, Min-ge Xie, Lingzhou Xue
Compared to the existing $p$-value combination methods, including the vanilla Cauchy combination method, the proposed combination framework can handle the dependence accurately and utilizes the information efficiently to construct tests with accurate size and enhanced power.
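The vanilla Cauchy combination method mentioned here as the baseline is simple to state: map each p-value to a standard Cauchy variate, average, and convert back. The sketch below implements that baseline only (equal weights by default), not the paper's proposed dependence-aware framework.

```python
import math

def cauchy_combination(p_values, weights=None):
    """Vanilla Cauchy combination test: the weighted average of
    tan((0.5 - p_i) * pi) is approximately standard Cauchy in the tail
    even under dependence among the individual tests, so it converts
    directly back to a combined p-value."""
    k = len(p_values)
    w = weights or [1.0 / k] * k
    t = sum(wi * math.tan((0.5 - pi) * math.pi)
            for wi, pi in zip(w, p_values))
    return 0.5 - math.atan(t) / math.pi

combined = cauchy_combination([0.01, 0.30, 0.80])
```

A single strong p-value dominates the average (the tangent transform is heavy-tailed), which is why the method is powerful against sparse signals.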
1 code implementation • 2 Apr 2024 • Sai Li, Linjun Zhang
Machine learning methods often assume that the test data have the same distribution as the training data.
no code implementations • 22 Mar 2024 • Tianxi Cai, Feiqing Huang, Ryumei Nakada, Linjun Zhang, Doudou Zhou
To accommodate the statistical analysis of multimodal EHR data, in this paper, we propose a novel multimodal feature embedding generative model and design a multimodal contrastive loss to obtain the multimodal EHR feature representation.
no code implementations • 8 Mar 2024 • Huiying Zhong, Zhun Deng, Weijie J. Su, Zhiwei Steven Wu, Linjun Zhang
Our work initiates the theoretical study of multi-party RLHF that explicitly models the diverse preferences of multiple individuals.
no code implementations • 25 Feb 2024 • Qichuan Yin, Zexian Wang, Junzhou Huang, Huaxiu Yao, Linjun Zhang
As federated learning gains increasing importance in real-world applications due to its capacity for decentralized data training, addressing fairness concerns across demographic groups becomes critically important.
no code implementations • 13 Feb 2024 • Zongbo Han, Yifeng Yang, Changqing Zhang, Linjun Zhang, Joey Tianyi Zhou, Qinghua Hu
The objective can be understood as seeking a model that fits the ground-truth labels by increasing the confidence while also maximizing the entropy of predicted probabilities by decreasing the confidence.
no code implementations • 16 Jan 2024 • Xintao Xia, Linjun Zhang, Zhanrui Cai
Privacy preservation has become a critical concern in high-dimensional data analysis due to the growing prevalence of data-driven applications.
no code implementations • 3 Jan 2024 • Haonan Wang, James Zou, Michael Mozer, Anirudh Goyal, Alex Lamb, Linjun Zhang, Weijie J. Su, Zhun Deng, Michael Qizhe Xie, Hannah Brown, Kenji Kawaguchi
With the rise of advanced generative AI models capable of tasks once reserved for human creativity, the study of AI's creative potential becomes imperative for its responsible development and application.
1 code implementation • 6 Nov 2023 • Chenhang Cui, Yiyang Zhou, Xinyu Yang, Shirley Wu, Linjun Zhang, James Zou, Huaxiu Yao
To bridge this gap, we introduce a new benchmark, namely, the Bias and Interference Challenges in Visual Language Models (Bingo).
3 code implementations • 10 Oct 2023 • Jianguo Huang, Huajun Xi, Linjun Zhang, Huaxiu Yao, Yue Qiu, Hongxin Wei
Conformal prediction is a statistical framework that generates prediction sets containing ground-truth labels with a desired coverage guarantee.
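The coverage guarantee described here is easiest to see in the split-conformal version for regression, sketched below under the usual exchangeability assumption (the paper's contribution is a specific method built on this framework, which this generic sketch does not reproduce).

```python
import math

def split_conformal_radius(cal_residuals, alpha):
    """Split conformal prediction for regression: take the
    ceil((n+1)(1-alpha))-th smallest calibration residual |y - yhat|.
    Intervals yhat +/- qhat then cover a fresh label with probability
    at least 1 - alpha, assuming exchangeability."""
    n = len(cal_residuals)
    k = math.ceil((n + 1) * (1 - alpha))   # finite-sample corrected rank
    scores = sorted(cal_residuals)
    return scores[min(k, n) - 1]

# 9 calibration residuals, 90% target coverage -> rank k = ceil(10 * 0.9) = 9.
qhat = split_conformal_radius(
    [0.1, 0.5, 0.2, 0.9, 0.3, 0.7, 0.4, 0.8, 0.6], alpha=0.1)
```

The `(n+1)` correction is what turns an empirical quantile into a finite-sample coverage guarantee rather than an asymptotic one.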
1 code implementation • 1 Oct 2023 • Yiyang Zhou, Chenhang Cui, Jaehong Yoon, Linjun Zhang, Zhun Deng, Chelsea Finn, Mohit Bansal, Huaxiu Yao
Large vision-language models (LVLMs) have shown remarkable abilities in understanding visual information with human languages.
1 code implementation • 18 Sep 2023 • Sai Li, Linjun Zhang
In conventional statistical and machine learning methods, it is typically assumed that the test data are identically distributed with the training data.
no code implementations • 6 Jul 2023 • Xinming Tu, James Zou, Weijie J. Su, Linjun Zhang
LLMs can also play a significant role in the classroom as interactive teaching and learning tools, contributing to personalized education.
1 code implementation • 13 Jun 2023 • Alyssa Huang, Peihan Liu, Ryumei Nakada, Linjun Zhang, Wanrong Zhang
The surge in multimodal AI's success has sparked concerns over data privacy in vision-and-language tasks.
1 code implementation • 1 May 2023 • Shirley Wu, Mert Yuksekgonul, Linjun Zhang, James Zou
Deep neural networks often rely on spurious correlations to make predictions, which hinders generalization beyond training environments.
no code implementations • 13 Mar 2023 • T. Tony Cai, Yichen Wang, Linjun Zhang
The score attack method is based on the tracing attack concept in differential privacy and can be applied to any statistical model with a well-defined score statistic.
no code implementations • 8 Mar 2023 • Zhun Deng, Cynthia Dwork, Linjun Zhang
Fairness is captured by incorporating demographic subgroups into the class of functions $\mathcal{C}$.
1 code implementation • 13 Feb 2023 • Ryumei Nakada, Halil Ibrahim Gulluk, Zhun Deng, Wenlong Ji, James Zou, Linjun Zhang
We show that the algorithm can detect the ground-truth pairs and improve performance by fully exploiting unpaired datasets.
1 code implementation • 28 Nov 2022 • Puheng Li, James Zou, Linjun Zhang
Several group fairness notions and algorithms have been proposed.
no code implementations • 8 Nov 2022 • Zhun Deng, He Sun, Zhiwei Steven Wu, Linjun Zhang, David C. Parkes
AI methods are used in societally important settings, ranging from credit to employment to housing, and it is crucial to provide fairness in regard to algorithmic decision making.
1 code implementation • 20 Oct 2022 • Haotian Ye, James Zou, Linjun Zhang
This opens a promising strategy to first train a feature learner rather than a classifier, and then perform linear probing (last layer retraining) in the test environment.
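Linear probing (last-layer retraining) as described here means freezing the feature extractor and refitting only a linear head on data from the test environment. A minimal sketch, with the "frozen features" simulated by a toy array and the head trained by plain logistic-regression gradient descent (hyperparameters are illustrative):

```python
import numpy as np

def linear_probe(features, labels, lr=0.5, steps=500):
    """Last-layer retraining sketch: the feature extractor is frozen
    (here, `features` are precomputed), and only a linear head is fit
    via full-batch logistic-regression gradient descent."""
    n, d = features.shape
    w, b = np.zeros(d), 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(features @ w + b)))   # sigmoid
        grad = p - labels                                # dL/dlogits
        w -= lr * features.T @ grad / n
        b -= lr * grad.mean()
    return w, b

# Toy "frozen features": one coordinate is predictive in the new environment.
rng = np.random.default_rng(0)
feats = rng.normal(size=(200, 3))
ys = (feats[:, 0] > 0).astype(float)
w, b = linear_probe(feats, ys)
acc = (((feats @ w + b) > 0) == (ys > 0.5)).mean()
```

Because only the `d + 1` head parameters are retrained, this adaptation step is cheap even when the underlying feature learner is large.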
1 code implementation • 11 Oct 2022 • Huaxiu Yao, Yiping Wang, Linjun Zhang, James Zou, Chelsea Finn
In this paper, we propose a simple yet powerful algorithm, C-Mixup, to improve generalization on regression tasks.
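C-Mixup's core idea is to bias the choice of mixing partner toward examples with nearby labels. The sketch below follows that idea with a Gaussian kernel on label distance and a Beta-distributed mixing weight; the kernel bandwidth and `alpha` are illustrative hyperparameters, not the paper's settings.

```python
import numpy as np

def c_mixup_batch(X, y, alpha=2.0, bandwidth=1.0, seed=0):
    """C-Mixup-style sampling sketch for regression: each example draws
    a mixing partner with probability proportional to a Gaussian kernel
    on the label distance, then inputs and labels are interpolated with
    a Beta(alpha, alpha) weight."""
    rng = np.random.default_rng(seed)
    n = len(y)
    X_mix = np.empty_like(X, dtype=float)
    y_mix = np.empty_like(y, dtype=float)
    for i in range(n):
        w = np.exp(-((y - y[i]) ** 2) / (2 * bandwidth ** 2))
        w /= w.sum()
        j = rng.choice(n, p=w)          # closer labels mix more often
        lam = rng.beta(alpha, alpha)
        X_mix[i] = lam * X[i] + (1 - lam) * X[j]
        y_mix[i] = lam * y[i] + (1 - lam) * y[j]
    return X_mix, y_mix

X = np.arange(10, dtype=float).reshape(5, 2)
y = np.array([0.0, 0.1, 0.2, 5.0, 5.1])
Xm, ym = c_mixup_batch(X, y)
```

Compared with vanilla mixup's uniform partner choice, the label-aware kernel avoids blending examples whose targets are far apart, which is what makes the augmentation sensible for regression.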
no code implementations • 6 Jun 2022 • Zhun Deng, Jiayao Zhang, Linjun Zhang, Ting Ye, Yates Coley, Weijie J. Su, James Zou
Specifically, FIFA encourages both classification and fairness generalization and can be flexibly combined with many existing fair learning methods with logits-based losses.
3 code implementations • 2 Jan 2022 • Huaxiu Yao, Yu Wang, Sai Li, Linjun Zhang, Weixin Liang, James Zou, Chelsea Finn
Machine learning algorithms typically assume that training and test examples are drawn from the same distribution.
no code implementations • 4 Nov 2021 • Maya Burhanpurkar, Zhun Deng, Cynthia Dwork, Linjun Zhang
Predictors map individual instances in a population to the interval $[0, 1]$.
no code implementations • 6 Oct 2021 • Wenlong Ji, Zhun Deng, Ryumei Nakada, James Zou, Linjun Zhang
Contrastive learning has achieved state-of-the-art performance in various self-supervised learning tasks and even outperforms its supervised counterpart.
no code implementations • 28 Jun 2021 • Kenji Kawaguchi, Linjun Zhang, Zhun Deng
Representation learning allows us to automatically discover suitable representations from raw sensory data.
no code implementations • NeurIPS 2021 • Zhun Deng, Linjun Zhang, Kailas Vodrahalli, Kenji Kawaguchi, James Zou
Recent works empirically demonstrate that adversarial training in the source data can improve the ability of models to transfer to new domains.
1 code implementation • ICLR 2022 • Huaxiu Yao, Linjun Zhang, Chelsea Finn
Meta-learning enables algorithms to quickly learn a newly encountered task with just a few labeled examples by transferring previously learned knowledge.
no code implementations • 1 Apr 2021 • Zhe Zhang, Linjun Zhang
In this paper, we develop a general framework to design differentially private expectation-maximization (EM) algorithms in high-dimensional latent variable models, based on the noisy iterative hard-thresholding.
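The noisy iterative hard-thresholding primitive underlying the framework can be sketched in one step: perturb the update with Gaussian noise (for privacy), then keep only the `s` largest coordinates in magnitude. The noise scale below is a placeholder; in the paper it would be calibrated to the privacy budget.

```python
import numpy as np

def noisy_hard_threshold(v, s, sigma, rng):
    """One noisy iterative hard-thresholding step: add Gaussian noise to
    the update, then zero out all but the s largest-magnitude coordinates,
    preserving sparsity across EM iterations."""
    noisy = v + rng.normal(0.0, sigma, size=v.shape)
    out = np.zeros_like(noisy)
    keep = np.argsort(np.abs(noisy))[-s:]   # indices of the top-s entries
    out[keep] = noisy[keep]
    return out

rng = np.random.default_rng(0)
v = np.array([5.0, 0.1, -4.0, 0.2, 0.05])
sparse = noisy_hard_threshold(v, s=2, sigma=0.01, rng=rng)
```

Truncating to the top-`s` support after each noisy step is what keeps the error from the added noise proportional to the sparsity level rather than the ambient dimension.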
no code implementations • NeurIPS 2021 • Jinshuo Dong, Weijie J. Su, Linjun Zhang
The central question, therefore, is to understand which noise distribution optimizes the privacy-accuracy trade-off, especially when the dimension of the answer vector is high.
no code implementations • 11 Feb 2021 • Linjun Zhang, Zhun Deng, Kenji Kawaguchi, James Zou
In addition, we study how Mixup improves calibration in semi-supervised learning.
no code implementations • 8 Nov 2020 • T. Tony Cai, Yichen Wang, Linjun Zhang
We propose differentially private algorithms for parameter estimation in both low-dimensional and high-dimensional sparse generalized linear models (GLMs) by constructing private versions of projected gradient descent.
no code implementations • 6 Nov 2020 • Linjun Zhang, Rong Ma, T. Tony Cai, Hongzhe Li
Based on the iterative estimators, we further construct debiased estimators and establish their asymptotic normality.
no code implementations • ICLR 2021 • Linjun Zhang, Zhun Deng, Kenji Kawaguchi, Amirata Ghorbani, James Zou
For robustness, we show that minimizing the Mixup loss corresponds to approximately minimizing an upper bound of the adversarial loss.
no code implementations • ICML 2020 • Zhun Deng, Cynthia Dwork, Jialiang Wang, Linjun Zhang
Robust optimization is widely used in modern data science, especially in adversarial training.
no code implementations • 7 Sep 2020 • Jin Cao, Yibo Zhao, Linjun Zhang, Jason Li
The key to our approach is a computationally lightweight forward-addition algorithm that we developed to recursively extract functional dependencies between table columns; it scales to tables with many columns.
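The single-pair primitive that any such algorithm builds on is the functional-dependency check A → B: every value in column A must map to exactly one value in column B. The sketch below shows only that primitive (the `zip`/`state`/`name` table is a made-up example, and the paper's recursive forward-addition procedure over column sets is not reproduced).

```python
def determines(rows, a, b):
    """Functional-dependency check A -> B over a list of dict rows:
    returns True iff each value of column `a` maps to a single value
    of column `b`."""
    seen = {}
    for row in rows:
        # setdefault records the first b-value seen for this a-value;
        # any later mismatch falsifies the dependency.
        if seen.setdefault(row[a], row[b]) != row[b]:
            return False
    return True

table = [
    {"zip": "08901", "state": "NJ", "name": "A"},
    {"zip": "08901", "state": "NJ", "name": "B"},
    {"zip": "10001", "state": "NY", "name": "C"},
]
```

A forward-addition strategy would start from single columns like this and grow determinant sets only when smaller ones fail, which keeps the search tractable for wide tables.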
1 code implementation • 26 Jul 2020 • Huaxiu Yao, Long-Kai Huang, Linjun Zhang, Ying Wei, Li Tian, James Zou, Junzhou Huang, Zhenhui Li
Moreover, both MetaMix and Channel Shuffle outperform state-of-the-art results by a large margin across many datasets and are compatible with existing meta-learning algorithms.
no code implementations • 15 Jun 2020 • Zhun Deng, Linjun Zhang, Amirata Ghorbani, James Zou
In this work, we investigate how adversarial robustness can be enhanced by leveraging out-of-domain unlabeled data.
no code implementations • 12 Feb 2019 • T. Tony Cai, Yichen Wang, Linjun Zhang
By refining the "tracing adversary" technique for lower bounds in the theoretical computer science literature, we formulate a general lower bound argument for minimax risks with differential privacy constraints, and apply this argument to high-dimensional mean estimation and linear regression problems.
no code implementations • 16 Feb 2016 • T. Tony Cai, Linjun Zhang
We discuss a clustering method for Gaussian mixture model based on the sparse principal component analysis (SPCA) method and compare it with the IF-PCA method.