Search Results for author: Chao Du

Found 62 papers, 46 papers with code

Improving Your Model Ranking on Chatbot Arena by Vote Rigging

1 code implementation 29 Jan 2025 Rui Min, Tianyu Pang, Chao Du, Qian Liu, Minhao Cheng, Min Lin

We first introduce a straightforward target-only rigging strategy that focuses on new battles involving $m_{t}$, identifying it via watermarking or a binary classifier, and exclusively voting for $m_{t}$ wins.

Chatbot
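The target-only strategy described in the excerpt can be pictured in a few lines. This is an illustrative sketch only: the battle format is invented, and identification of the target model $m_t$ (done via watermarking or a binary classifier in the paper) is assumed to be perfect.

```python
def rig_votes(battles, target):
    """Target-only rigging sketch: skip battles that do not involve
    the target model, and always vote that the target wins.
    `battles` is a list of (model_a, model_b) pairs; identification
    of the target is assumed perfect here."""
    votes = []
    for model_a, model_b in battles:
        if target in (model_a, model_b):
            votes.append((model_a, model_b, target))  # vote: target wins
    return votes

battles = [("m_t", "gpt"), ("llama", "gpt"), ("claude", "m_t")]
print(rig_votes(battles, "m_t"))
# → [('m_t', 'gpt', 'm_t'), ('claude', 'm_t', 'm_t')]
```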

Human-like conceptual representations emerge from language prediction

no code implementations 21 Jan 2025 Ningyu Xu, Qi Zhang, Chao Du, Qiang Luo, Xipeng Qiu, Xuanjing Huang, Menghan Zhang

Recent advances in large language models (LLMs) provide a new opportunity to address the long-standing question of how concepts are represented and organized in the mind, which is central to unravelling the nature of human cognition.

Prediction Reverse Dictionary

Orient Anything: Learning Robust Object Orientation Estimation from Rendering 3D Models

1 code implementation 24 Dec 2024 Zehan Wang, Ziang Zhang, Tianyu Pang, Chao Du, Hengshuang Zhao, Zhou Zhao

Orientation is a key attribute of objects, crucial for understanding their spatial pose and arrangement in images.

Attribute

Real-time Identity Defenses against Malicious Personalization of Diffusion Models

1 code implementation 13 Dec 2024 Hanzhong Guo, Shen Nie, Chao Du, Tianyu Pang, Hao Sun, Chongxuan Li

Personalized generative diffusion models, capable of synthesizing highly realistic images based on a few reference portraits, may pose substantial social, ethical, and legal risks via identity replication.

Image Compression

When Precision Meets Position: BFloat16 Breaks Down RoPE in Long-Context Training

1 code implementation 20 Nov 2024 Haonan Wang, Qian Liu, Chao Du, Tongyao Zhu, Cunxiao Du, Kenji Kawaguchi, Tianyu Pang

To address this, we develop AnchorAttention, a plug-and-play attention method that alleviates numerical issues caused by BFloat16, improves long-context capabilities, and speeds up training.

Computational Efficiency Position
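The numerical issue this paper targets can be reproduced without any deep-learning dependencies: bfloat16 keeps only 7 explicit mantissa bits, so large position ids (and RoPE angles derived from them) collapse together. The sketch below emulates bfloat16 by truncating a float32 encoding, a simplification of real round-to-nearest-even rounding, but enough to show the precision loss.

```python
import struct

def to_bfloat16(x):
    """Emulate bfloat16 by zeroing the low 16 bits of the float32
    encoding (truncation rather than round-to-nearest-even; close
    enough to demonstrate the loss of precision)."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFF0000))[0]

# Distinct long-context position ids become indistinguishable once
# they exceed the 7-bit mantissa's resolution, which is what breaks
# relative position encoding when RoPE is computed in BFloat16.
print(to_bfloat16(8190.0), to_bfloat16(8191.0))  # both map to 8160.0
```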

Sample-Efficient Alignment for LLMs

1 code implementation 3 Nov 2024 Zichen Liu, Changyu Chen, Chao Du, Wee Sun Lee, Min Lin

The results demonstrate that SEA achieves highly sample-efficient alignment with oracle's preferences, outperforming recent active exploration methods for LLMs.

Thompson Sampling
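The "Thompson Sampling" tag names the exploration mechanic behind sample-efficient alignment. A generic Beta–Bernoulli sketch of that mechanic (not the paper's algorithm, which operates over LLM responses and an oracle's preferences) looks like:

```python
import random

def thompson_choose(successes, failures, rng=random):
    """Sample a plausible win rate for each arm from its Beta
    posterior and pick the arm with the highest sample."""
    draws = [rng.betavariate(s + 1, f + 1) for s, f in zip(successes, failures)]
    return max(range(len(draws)), key=draws.__getitem__)

# Tiny simulation: arm 1 truly "wins" 80% of the time, so Thompson
# sampling quickly concentrates its queries on it.
random.seed(0)
true_p, succ, fail = [0.2, 0.8], [0, 0], [0, 0]
for _ in range(500):
    arm = thompson_choose(succ, fail)
    if random.random() < true_p[arm]:
        succ[arm] += 1
    else:
        fail[arm] += 1
print(succ[1] + fail[1] > succ[0] + fail[0])  # the better arm gets most pulls
```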

Scaling up Masked Diffusion Models on Text

1 code implementation 24 Oct 2024 Shen Nie, Fengqi Zhu, Chao Du, Tianyu Pang, Qian Liu, Guangtao Zeng, Min Lin, Chongxuan Li

Masked diffusion models (MDMs) have shown promise in language modeling, yet their scalability and effectiveness in core language tasks, such as text generation and language understanding, remain underexplored.

GSM8K Language Modeling +3

SimLayerKV: A Simple Framework for Layer-Level KV Cache Reduction

1 code implementation 17 Oct 2024 Xuan Zhang, Cunxiao Du, Chao Du, Tianyu Pang, Wei Gao, Min Lin

To mitigate this issue, we present SimLayerKV, a simple yet effective method that reduces inter-layer KV cache redundancies by selectively dropping cache in identified lazy layers.

Quantization
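The layer-level idea can be sketched as follows; how "lazy" layers are identified and the real tensor layout of a KV cache are abstracted away, so this mirrors the shape of the reduction rather than the paper's exact criterion.

```python
def trim_kv_cache(kv_cache, lazy_layers, window=4):
    """Keep only the most recent `window` cache entries in layers
    flagged as lazy; other layers retain their full cache. Each
    layer's cache is modeled here as a simple list of entries."""
    return {
        layer: cache[-window:] if layer in lazy_layers else cache
        for layer, cache in kv_cache.items()
    }

cache = {0: list(range(10)), 1: list(range(10))}
trimmed = trim_kv_cache(cache, lazy_layers={1}, window=3)
print(len(trimmed[0]), len(trimmed[1]))  # 10 3
```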

Meta-Unlearning on Diffusion Models: Preventing Relearning Unlearned Concepts

1 code implementation 16 Oct 2024 Hongcheng Gao, Tianyu Pang, Chao Du, Taihang Hu, Zhijie Deng, Min Lin

With the rapid progress of diffusion-based content generation, significant efforts are being made to unlearn harmful or copyrighted concepts from pretrained diffusion models (DMs) to prevent potential model misuse.

Improving Long-Text Alignment for Text-to-Image Diffusion Models

1 code implementation 15 Oct 2024 Luping Liu, Chao Du, Tianyu Pang, Zehan Wang, Chongxuan Li, Dong Xu

To tackle these issues, we propose LongAlign, which includes a segment-level encoding method for processing long texts and a decomposed preference optimization method for effective alignment training.
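The segment-level encoding step amounts to chunking a long prompt so each piece fits a length-limited text encoder. The 77-token default (CLIP's context length) and the downstream merging of per-segment embeddings are assumptions for illustration, not details from the excerpt.

```python
def segment_tokens(tokens, max_len=77):
    """Split a long token sequence into consecutive segments that
    each fit the encoder's context window; each segment would be
    encoded separately and the embeddings merged downstream."""
    return [tokens[i:i + max_len] for i in range(0, len(tokens), max_len)]

print([len(s) for s in segment_tokens(list(range(200)))])  # [77, 77, 46]
```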

When Attention Sink Emerges in Language Models: An Empirical View

1 code implementation 14 Oct 2024 Xiangming Gu, Tianyu Pang, Chao Du, Qian Liu, Fengzhuo Zhang, Cunxiao Du, Ye Wang, Min Lin

In this work, we first demonstrate that attention sinks exist universally in LMs with various inputs, even in small models.

Quantization

Denial-of-Service Poisoning Attacks against Large Language Models

1 code implementation 14 Oct 2024 Kuofeng Gao, Tianyu Pang, Chao Du, Yong Yang, Shu-Tao Xia, Min Lin

To overcome this limitation, we propose poisoning-based DoS (P-DoS) attacks for LLMs, demonstrating that injecting a single poisoned sample designed for DoS purposes can break the output length limit.

16k

A Closer Look at Machine Unlearning for Large Language Models

1 code implementation 10 Oct 2024 Xiaojian Yuan, Tianyu Pang, Chao Du, Kejiang Chen, Weiming Zhang, Min Lin

Specifically, the behavior that untargeted unlearning attempts to approximate is unpredictable and may involve hallucinations, and existing regularization is insufficient for targeted unlearning.

Diversity Machine Unlearning +1

Cheating Automatic LLM Benchmarks: Null Models Achieve High Win Rates

1 code implementation 9 Oct 2024 Xiaosen Zheng, Tianyu Pang, Chao Du, Qian Liu, Jing Jiang, Min Lin

Achieving high win rates on these benchmarks can significantly boost the promotional impact of newly released language models.

AI Delegates with a Dual Focus: Ensuring Privacy and Strategic Self-Disclosure

no code implementations 26 Sep 2024 Xi Chen, Zhiyang Zhang, Fangkai Yang, Xiaoting Qin, Chao Du, Xi Cheng, Hangxin Liu, Qingwei Lin, Saravan Rajmohan, Dongmei Zhang, Qi Zhang

Large language model (LLM)-based AI delegates are increasingly utilized to act on behalf of users, assisting them with a wide range of tasks through conversational interfaces.

Language Modeling Language Modelling +1

Revisiting Backdoor Attacks against Large Vision-Language Models from Domain Shift

no code implementations 27 Jun 2024 Siyuan Liang, Jiawei Liang, Tianyu Pang, Chao Du, Aishan Liu, Mingli Zhu, Xiaochun Cao, Dacheng Tao

Instruction tuning enhances large vision-language models (LVLMs) but increases their vulnerability to backdoor attacks due to their open design.

Backdoor Attack Domain Generalization

Bootstrapping Language Models with DPO Implicit Rewards

1 code implementation 14 Jun 2024 Changyu Chen, Zichen Liu, Chao Du, Tianyu Pang, Qian Liu, Arunesh Sinha, Pradeep Varakantham, Min Lin

In this work, we make a novel observation that this implicit reward model can by itself be used in a bootstrapping fashion to further align the LLM.

Chain of Preference Optimization: Improving Chain-of-Thought Reasoning in LLMs

1 code implementation 13 Jun 2024 Xuan Zhang, Chao Du, Tianyu Pang, Qian Liu, Wei Gao, Min Lin

The recent development of chain-of-thought (CoT) decoding has enabled large language models (LLMs) to generate explicit logical reasoning paths for complex problem-solving.

Arithmetic Reasoning Fact Verification +2

Improved Few-Shot Jailbreaking Can Circumvent Aligned Language Models and Their Defenses

1 code implementation 3 Jun 2024 Xiaosen Zheng, Tianyu Pang, Chao Du, Qian Liu, Jing Jiang, Min Lin

In addition, we conduct comprehensive and elaborate (e.g., making sure to use correct system prompts) evaluations against other aligned LLMs and advanced defenses, where our method consistently achieves nearly 100% ASRs.

Improved Techniques for Optimization-Based Jailbreaking on Large Language Models

1 code implementation 31 May 2024 Xiaojun Jia, Tianyu Pang, Chao Du, Yihao Huang, Jindong Gu, Yang Liu, Xiaochun Cao, Min Lin

Many red-teaming efforts aim to jailbreak LLMs, where among these efforts, the Greedy Coordinate Gradient (GCG) attack's success has led to a growing interest in the study of optimization-based jailbreaking techniques.

Red Teaming

Graph Diffusion Policy Optimization

1 code implementation 26 Feb 2024 Yijing Liu, Chao Du, Tianyu Pang, Chongxuan Li, Min Lin, Wei Chen

Recent research has made significant progress in optimizing diffusion models for downstream objectives, which is an important pursuit in fields such as graph generation for drug design.

Drug Design Graph Generation

Purifying Large Language Models by Ensembling a Small Language Model

no code implementations 19 Feb 2024 Tianlin Li, Qian Liu, Tianyu Pang, Chao Du, Qing Guo, Yang Liu, Min Lin

The emerging success of large language models (LLMs) heavily relies on collecting abundant training data from external (untrusted) sources.

Data Poisoning Language Modeling +1

Your Large Language Model is Secretly a Fairness Proponent and You Should Prompt it Like One

no code implementations 19 Feb 2024 Tianlin Li, Xiaoyu Zhang, Chao Du, Tianyu Pang, Qian Liu, Qing Guo, Chao Shen, Yang Liu

Building on this insight and observation, we develop FairThinking, a pipeline designed to automatically generate roles that enable LLMs to articulate diverse perspectives for fair expressions.

Fairness Language Modeling +2

Test-Time Backdoor Attacks on Multimodal Large Language Models

1 code implementation 13 Feb 2024 Dong Lu, Tianyu Pang, Chao Du, Qian Liu, Xianjun Yang, Min Lin

Backdoor attacks are commonly executed by contaminating training data, such that a trigger can activate predetermined harmful effects during the test phase.

Backdoor Attack

Agent Smith: A Single Image Can Jailbreak One Million Multimodal LLM Agents Exponentially Fast

1 code implementation 13 Feb 2024 Xiangming Gu, Xiaosen Zheng, Tianyu Pang, Chao Du, Qian Liu, Ye Wang, Jing Jiang, Min Lin

A multimodal large language model (MLLM) agent can receive instructions, capture images, retrieve histories from memory, and decide which tools to use.

Language Modelling Large Language Model +2

Weak-to-Strong Jailbreaking on Large Language Models

1 code implementation 30 Jan 2024 Xuandong Zhao, Xianjun Yang, Tianyu Pang, Chao Du, Lei Li, Yu-Xiang Wang, William Yang Wang

In this paper, we propose the weak-to-strong jailbreaking attack, an efficient method to attack aligned LLMs to produce harmful text.

Locality Sensitive Sparse Encoding for Learning World Models Online

no code implementations 23 Jan 2024 Zichen Liu, Chao Du, Wee Sun Lee, Min Lin

Unfortunately, NN-based models need re-training on all accumulated data at every interaction step to achieve FTL, which is computationally expensive for lifelong agents.

Continual Learning Model-based Reinforcement Learning

Benchmarking Large Multimodal Models against Common Corruptions

1 code implementation 22 Jan 2024 Jiawei Zhang, Tianyu Pang, Chao Du, Yi Ren, Bo Li, Min Lin

This technical report aims to fill a deficiency in the assessment of large multimodal models (LMMs) by specifically examining the self-consistency of their outputs when subjected to common corruptions.

Benchmarking Image to text +1

Contrastive Learning with Negative Sampling Correction

no code implementations 13 Jan 2024 Lu Wang, Chao Du, Pu Zhao, Chuan Luo, Zhangchi Zhu, Bo Qiao, Wei Zhang, Qingwei Lin, Saravan Rajmohan, Dongmei Zhang, Qi Zhang

To correct the negative sampling bias, we propose a novel contrastive learning method named Positive-Unlabeled Contrastive Learning (PUCL).

Contrastive Learning Data Augmentation +2

TaskWeaver: A Code-First Agent Framework

1 code implementation 29 Nov 2023 Bo Qiao, Liqun Li, Xu Zhang, Shilin He, Yu Kang, Chaoyun Zhang, Fangkai Yang, Hang Dong, Jue Zhang, Lu Wang, Minghua Ma, Pu Zhao, Si Qin, Xiaoting Qin, Chao Du, Yong Xu, Qingwei Lin, Saravan Rajmohan, Dongmei Zhang

TaskWeaver provides support for rich data structures, flexible plugin usage, and dynamic plugin selection, and leverages LLM coding capabilities for complex logic.

Natural Language Understanding

Finetuning Text-to-Image Diffusion Models for Fairness

1 code implementation 11 Nov 2023 Xudong Shen, Chao Du, Tianyu Pang, Min Lin, Yongkang Wong, Mohan Kankanhalli

The rapid adoption of text-to-image diffusion models in society underscores an urgent need to address their biases.

Fairness

On Memorization in Diffusion Models

2 code implementations 4 Oct 2023 Xiangming Gu, Chao Du, Tianyu Pang, Chongxuan Li, Min Lin, Ye Wang

Looking into this, we first observe that memorization behaviors tend to occur on smaller-sized datasets, which motivates our definition of effective model memorization (EMM), a metric measuring the maximum size of training data at which a learned diffusion model approximates its theoretical optimum.

Denoising Memorization

Robust Positive-Unlabeled Learning via Noise Negative Sample Self-correction

1 code implementation 1 Aug 2023 Zhangchi Zhu, Lu Wang, Pu Zhao, Chao Du, Wei Zhang, Hang Dong, Bo Qiao, Qingwei Lin, Saravan Rajmohan, Dongmei Zhang

To mitigate the impact of label uncertainty and improve the robustness of learning with positive and unlabeled data, we propose a new robust PU learning method with a training strategy motivated by the nature of human learning: easy cases should be learned first.

LoraHub: Efficient Cross-Task Generalization via Dynamic LoRA Composition

2 code implementations 25 Jul 2023 Chengsong Huang, Qian Liu, Bill Yuchen Lin, Tianyu Pang, Chao Du, Min Lin

This paper investigates LoRA composability for cross-task generalization and introduces LoraHub, a simple framework devised for the purposive assembly of LoRA modules trained on diverse given tasks, with the objective of achieving adaptable performance on unseen tasks.

In-Context Learning
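At its core, the "purposive assembly" of LoRA modules is a weighted combination of their weight deltas, with coefficients searched (e.g., by gradient-free optimization) on a few examples of the unseen task. A flat-vector sketch, with the coefficient search left out:

```python
def compose_loras(deltas, weights):
    """Element-wise weighted sum of LoRA parameter deltas, each delta
    represented here as a flat list of floats. The per-task coefficient
    search that LoraHub performs is out of scope for this sketch."""
    assert deltas and len(deltas) == len(weights)
    n = len(deltas[0])
    return [sum(w * d[i] for w, d in zip(weights, deltas)) for i in range(n)]

print(compose_loras([[1.0, 0.0], [0.0, 2.0]], [0.5, 0.25]))  # [0.5, 0.5]
```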

AdAM: Few-Shot Image Generation via Adaptation-Aware Kernel Modulation

no code implementations 4 Jul 2023 Yunqing Zhao, Keshigeyan Chandrasegaran, Milad Abdollahzadeh, Chao Du, Tianyu Pang, Ruoteng Li, Henghui Ding, Ngai-Man Cheung

However, a major limitation of existing methods is that their knowledge preserving criteria consider only source domain/task and fail to consider target domain/adaptation in selecting source knowledge, casting doubt on their suitability for setups of different proximity between source and target domain.

Domain Adaptation Image Generation

Adversarial Constrained Bidding via Minimax Regret Optimization with Causality-Aware Reinforcement Learning

no code implementations 12 Jun 2023 Haozhe Wang, Chao Du, Panyan Fang, Li He, Liang Wang, Bo Zheng

In this regard, we explore the problem of constrained bidding in adversarial bidding environments, which assumes no knowledge about the adversarial factors.

Meta-Learning reinforcement-learning

Exploring Model Dynamics for Accumulative Poisoning Discovery

1 code implementation 6 Jun 2023 Jianing Zhu, Xiawei Guo, Jiangchao Yao, Chao Du, Li He, Shuo Yuan, Tongliang Liu, Liang Wang, Bo Han

In this paper, we dive into the perspective of model dynamics and propose a novel information measure, namely, Memorization Discrepancy, to explore the defense via the model-level information.

Memorization model

Efficient Diffusion Policies for Offline Reinforcement Learning

1 code implementation NeurIPS 2023 Bingyi Kang, Xiao Ma, Chao Du, Tianyu Pang, Shuicheng Yan

2) It is incompatible with maximum likelihood-based RL algorithms (e.g., policy gradient methods) as the likelihood of diffusion models is intractable.

D4RL Offline RL +4

On Evaluating Adversarial Robustness of Large Vision-Language Models

1 code implementation NeurIPS 2023 Yunqing Zhao, Tianyu Pang, Chao Du, Xiao Yang, Chongxuan Li, Ngai-Man Cheung, Min Lin

Large vision-language models (VLMs) such as GPT-4 have achieved unprecedented performance in response generation, especially with visual inputs, enabling more creative and adaptable interaction than large language models such as ChatGPT.

Adversarial Robustness multimodal generation +1

Nonparametric Generative Modeling with Conditional Sliced-Wasserstein Flows

2 code implementations 3 May 2023 Chao Du, Tianbo Li, Tianyu Pang, Shuicheng Yan, Min Lin

Sliced-Wasserstein Flow (SWF) is a promising approach to nonparametric generative modeling but has not been widely adopted due to its suboptimal generative quality and lack of conditional modeling capabilities.

Exploring Incompatible Knowledge Transfer in Few-shot Image Generation

1 code implementation CVPR 2023 Yunqing Zhao, Chao Du, Milad Abdollahzadeh, Tianyu Pang, Min Lin, Shuicheng Yan, Ngai-Man Cheung

To this end, we propose knowledge truncation to mitigate this issue in FSIG, which is a complementary operation to knowledge preservation and is implemented by a lightweight pruning-based method.

Image Generation Transfer Learning

CoSDA: Continual Source-Free Domain Adaptation

1 code implementation 13 Apr 2023 Haozhe Feng, Zhaorui Yang, Hesun Chen, Tianyu Pang, Chao Du, Minfeng Zhu, Wei Chen, Shuicheng Yan

Recently, SFDA has gained popularity due to the need to protect the data privacy of the source domain, but it suffers from catastrophic forgetting on the source domain due to the lack of data.

Source-Free Domain Adaptation

A Recipe for Watermarking Diffusion Models

1 code implementation 17 Mar 2023 Yunqing Zhao, Tianyu Pang, Chao Du, Xiao Yang, Ngai-Man Cheung, Min Lin

Diffusion models (DMs) have demonstrated advantageous potential on generative tasks.

On Calibrating Diffusion Probabilistic Models

1 code implementation NeurIPS 2023 Tianyu Pang, Cheng Lu, Chao Du, Min Lin, Shuicheng Yan, Zhijie Deng

In this work, we observe that the stochastic reverse process of data scores is a martingale, from which concentration bounds and the optional stopping theorem for data scores can be derived.

Better Diffusion Models Further Improve Adversarial Training

4 code implementations 9 Feb 2023 Zekai Wang, Tianyu Pang, Chao Du, Min Lin, Weiwei Liu, Shuicheng Yan

Under the $\ell_\infty$-norm threat model with $\epsilon=8/255$, our models achieve $70.69\%$ and $42.67\%$ robust accuracy on CIFAR-10 and CIFAR-100, respectively, i.e., improving upon previous state-of-the-art models by $+4.58\%$ and $+8.03\%$.

Denoising

BAFFLE: A Baseline of Backpropagation-Free Federated Learning

1 code implementation 28 Jan 2023 Haozhe Feng, Tianyu Pang, Chao Du, Wei Chen, Shuicheng Yan, Min Lin

BAFFLE is 1) memory-efficient and easily fits uploading bandwidth; 2) compatible with inference-only hardware optimization and model quantization or pruning; and 3) well-suited to trusted execution environments, because the clients in BAFFLE only execute forward propagation and return a set of scalars to the server.

Federated Learning Quantization

Exploration in Online Advertising Systems with Deep Uncertainty-Aware Learning

1 code implementation 25 Nov 2020 Chao Du, Zhifeng Gao, Shuo Yuan, Lining Gao, Ziyan Li, Yifan Zeng, Xiaoqiang Zhu, Jian Xu, Kun Gai, Kuang-Chih Lee

In this paper, we propose a novel Deep Uncertainty-Aware Learning (DUAL) method to learn CTR models based on Gaussian processes, which can provide predictive uncertainty estimations while maintaining the flexibility of deep neural networks.

Click-Through Rate Prediction Gaussian Processes

Rethinking Softmax Cross-Entropy Loss for Adversarial Robustness

2 code implementations ICLR 2020 Tianyu Pang, Kun Xu, Yinpeng Dong, Chao Du, Ning Chen, Jun Zhu

Previous work shows that adversarially robust generalization requires larger sample complexity, and the same dataset, e.g., CIFAR-10, which enables good standard accuracy may not suffice to train robust models.

Adversarial Robustness

Improving Adversarial Robustness via Promoting Ensemble Diversity

6 code implementations 25 Jan 2019 Tianyu Pang, Kun Xu, Chao Du, Ning Chen, Jun Zhu

Though deep neural networks have achieved significant progress on various tasks, often enhanced by model ensemble, existing high-performance models can be vulnerable to adversarial attacks.

Adversarial Robustness Diversity

To Relieve Your Headache of Training an MRF, Take AdVIL

no code implementations ICLR 2020 Chongxuan Li, Chao Du, Kun Xu, Max Welling, Jun Zhu, Bo Zhang

We propose a black-box algorithm called {\it Adversarial Variational Inference and Learning} (AdVIL) to perform inference and learning on a general Markov random field (MRF).

Variational Inference

Max-Mahalanobis Linear Discriminant Analysis Networks

2 code implementations ICML 2018 Tianyu Pang, Chao Du, Jun Zhu

In this paper, we show that a properly designed classifier can improve robustness to adversarial attacks and lead to better prediction results.

Towards Robust Detection of Adversarial Examples

1 code implementation NeurIPS 2018 Tianyu Pang, Chao Du, Yinpeng Dong, Jun Zhu

Although the recent progress is substantial, deep learning methods can be vulnerable to the maliciously generated adversarial examples.

Field-Programmable Crossbar Array (FPCA) for Reconfigurable Computing

no code implementations 9 Dec 2016 Mohammed A. Zidan, YeonJoo Jeong, Jong Hong Shin, Chao Du, Zhengya Zhang, Wei D. Lu

The proposed computing architecture is based on a uniform, physical, resistive, memory-centric fabric that can be optimally reconfigured and utilized to perform different computing and data storage tasks in a massively parallel approach.

Interactive Visual Hull Refinement for Specular and Transparent Object Surface Reconstruction

no code implementations ICCV 2015 Xinxin Zuo, Chao Du, Sen Wang, Jiangbin Zheng, Ruigang Yang

We discovered that these internal contours, which are results of convex parts on an object's surface, can lead to a tighter fit than the original visual hull.

Contour Detection Surface Reconstruction +2

Learning Deep Generative Models with Doubly Stochastic MCMC

no code implementations 15 Jun 2015 Chao Du, Jun Zhu, Bo Zhang

We present doubly stochastic gradient MCMC, a simple and generic method for (approximate) Bayesian inference of deep generative models (DGMs) in a collapsed continuous parameter space.

Bayesian Inference Density Estimation +1

Inner Product Similarity Search using Compositional Codes

no code implementations 19 Jun 2014 Chao Du, Jingdong Wang

This paper addresses the nearest neighbor search problem under inner product similarity and introduces a compact code-based approach.
