Search Results for author: Hanze Dong

Found 26 papers, 11 papers with code

Faster Sampling via Stochastic Gradient Proximal Sampler

no code implementations • 27 May 2024 • Xunpeng Huang, Difan Zou, Yi-An Ma, Hanze Dong, Tong Zhang

Stochastic gradients have been widely integrated into Langevin-based methods to improve their scalability and efficiency in solving large-scale sampling problems.

Reverse Transition Kernel: A Flexible Framework to Accelerate Diffusion Inference

no code implementations • 26 May 2024 • Xunpeng Huang, Difan Zou, Hanze Dong, Yi Zhang, Yi-An Ma, Tong Zhang

To generate data from trained diffusion models, most inference algorithms, such as DDPM, DDIM, and other variants, rely on discretizing the reverse SDEs or their equivalent ODEs.

Denoising

RLHF Workflow: From Reward Modeling to Online RLHF

3 code implementations • 13 May 2024 • Hanze Dong, Wei Xiong, Bo Pang, Haoxiang Wang, Han Zhao, Yingbo Zhou, Nan Jiang, Doyen Sahoo, Caiming Xiong, Tong Zhang

We present the workflow of Online Iterative Reinforcement Learning from Human Feedback (RLHF) in this technical report, which is widely reported to outperform its offline counterpart by a large margin in the recent large language model (LLM) literature.

Chatbot, Language Modelling, +1

An Improved Analysis of Langevin Algorithms with Prior Diffusion for Non-Log-Concave Sampling

no code implementations • 10 Mar 2024 • Xunpeng Huang, Hanze Dong, Difan Zou, Tong Zhang

Along this line, Freund et al. (2022) suggest that the modified Langevin algorithm with prior diffusion is able to converge dimension independently for strongly log-concave target distributions.

Faster Sampling without Isoperimetry via Diffusion-based Monte Carlo

no code implementations • 12 Jan 2024 • Xunpeng Huang, Difan Zou, Hanze Dong, Yian Ma, Tong Zhang

Specifically, DMC follows the reverse SDE of a diffusion process that transforms the target distribution to the standard Gaussian, utilizing a non-parametric score estimation.

MLLM-Protector: Ensuring MLLM's Safety without Hurting Performance

1 code implementation • 5 Jan 2024 • Renjie Pi, Tianyang Han, Jianshu Zhang, Yueqi Xie, Rui Pan, Qing Lian, Hanze Dong, Jipeng Zhang, Tong Zhang

The deployment of multimodal large language models (MLLMs) has brought forth a unique vulnerability: susceptibility to malicious attacks through visual inputs.

Iterative Preference Learning from Human Feedback: Bridging Theory and Practice for RLHF under KL-Constraint

3 code implementations • 18 Dec 2023 • Wei Xiong, Hanze Dong, Chenlu Ye, Ziqi Wang, Han Zhong, Heng Ji, Nan Jiang, Tong Zhang

We investigate its behavior in three distinct settings -- offline, online, and hybrid -- and propose efficient algorithms with finite-sample theoretical guarantees.

Language Modelling, Large Language Model

Spurious Feature Diversification Improves Out-of-distribution Generalization

no code implementations • 29 Sep 2023 • Yong Lin, Lu Tan, Yifan Hao, Honam Wong, Hanze Dong, Weizhong Zhang, Yujiu Yang, Tong Zhang

Contrary to the conventional wisdom that focuses on learning invariant features for better OOD performance, our findings suggest that incorporating a large number of diverse spurious features weakens their individual contributions, leading to improved overall OOD generalization performance.

Out-of-Distribution Generalization

Mitigating the Alignment Tax of RLHF

no code implementations • 12 Sep 2023 • Yong Lin, Hangyu Lin, Wei Xiong, Shizhe Diao, Jianmeng Liu, Jipeng Zhang, Rui Pan, Haoxiang Wang, Wenbin Hu, Hanning Zhang, Hanze Dong, Renjie Pi, Han Zhao, Nan Jiang, Heng Ji, Yuan YAO, Tong Zhang

Building on the analysis and the observation that averaging different layers of the transformer leads to significantly different reward-tax trade-offs, we propose Adaptive Model Averaging (AMA) to adaptively find various combination ratios of model layers.

Common Sense Reasoning, Continual Learning

Reverse Diffusion Monte Carlo

no code implementations • 5 Jul 2023 • Xunpeng Huang, Hanze Dong, Yifan Hao, Yi-An Ma, Tong Zhang

We propose a Monte Carlo sampler from the reverse diffusion process.

LMFlow: An Extensible Toolkit for Finetuning and Inference of Large Foundation Models

1 code implementation • 21 Jun 2023 • Shizhe Diao, Rui Pan, Hanze Dong, Ka Shun Shum, Jipeng Zhang, Wei Xiong, Tong Zhang

As the number of available foundation models and specialized tasks keeps growing, the job of training scientific language models becomes highly nontrivial.

DetGPT: Detect What You Need via Reasoning

1 code implementation • 23 May 2023 • Renjie Pi, Jiahui Gao, Shizhe Diao, Rui Pan, Hanze Dong, Jipeng Zhang, Lewei Yao, Jianhua Han, Hang Xu, Lingpeng Kong, Tong Zhang

Overall, our proposed paradigm and DetGPT demonstrate the potential for more sophisticated and intuitive interactions between humans and machines.

Autonomous Driving Object +2

RAFT: Reward rAnked FineTuning for Generative Foundation Model Alignment

3 code implementations • 13 Apr 2023 • Hanze Dong, Wei Xiong, Deepanshu Goyal, Yihan Zhang, Winnie Chow, Rui Pan, Shizhe Diao, Jipeng Zhang, Kashun Shum, Tong Zhang

Utilizing a reward model and a sufficient number of samples, our approach selects the high-quality samples, discards those that exhibit undesired behavior, and subsequently enhances the model by fine-tuning on these filtered samples.

Ethics

PAPAL: A Provable PArticle-based Primal-Dual ALgorithm for Mixed Nash Equilibrium

no code implementations • 2 Mar 2023 • Shihong Ding, Hanze Dong, Cong Fang, Zhouchen Lin, Tong Zhang

To circumvent this difficulty, we examine the problem of identifying a mixed Nash equilibrium, where strategies are randomized and characterized by probability distributions over continuous domains. To this end, we propose the PArticle-based Primal-dual ALgorithm (PAPAL), tailored for weakly entropy-regularized min-max optimization over probability distributions.

Vocabulary-informed Zero-shot and Open-set Learning

1 code implementation • 3 Jan 2023 • Yanwei Fu, Xiaomei Wang, Hanze Dong, Yu-Gang Jiang, Meng Wang, xiangyang xue, Leonid Sigal

Despite significant progress in object categorization in recent years, a number of important challenges remain, mainly the ability to learn from limited labeled data and to recognize object classes within a large, potentially open, set of labels.

Object Categorization, Open Set Learning, +1

Particle-based Variational Inference with Preconditioned Functional Gradient Flow

no code implementations • 25 Nov 2022 • Hanze Dong, Xi Wang, Yong Lin, Tong Zhang

With the popularity of Stein variational gradient descent (SVGD), the focus of particle-based VI algorithms has been on the properties of functions in Reproducing Kernel Hilbert Space (RKHS) to approximate the gradient flow.

Variational Inference

Normalizing Flow with Variational Latent Representation

1 code implementation • 21 Nov 2022 • Hanze Dong, Shizhe Diao, Weizhong Zhang, Tong Zhang

The resulting method is significantly more powerful than the standard normalizing flow approach for generating data distributions with multiple modes.

Bayesian Invariant Risk Minimization

no code implementations • CVPR 2022 • Yong Lin, Hanze Dong, Hao Wang, Tong Zhang

Generalization under distributional shift is an open challenge for machine learning.

Bayesian Inference

Local Augmentation for Graph Neural Networks

1 code implementation • 8 Sep 2021 • Songtao Liu, Rex Ying, Hanze Dong, Lanqing Li, Tingyang Xu, Yu Rong, Peilin Zhao, Junzhou Huang, Dinghao Wu

To address this, we propose a simple and efficient data augmentation strategy, local augmentation, to learn the distribution of the node features of the neighbors conditioned on the central node's feature and enhance GNN's expressive power with generated features.

Open-Ended Question Answering

Mathematical Models of Overparameterized Neural Networks

1 code implementation • 27 Dec 2020 • Cong Fang, Hanze Dong, Tong Zhang

Deep learning has achieved considerable empirical success in recent years.

Weakly Supervised Disentangled Generative Causal Representation Learning

1 code implementation • 6 Oct 2020 • Xinwei Shen, Furui Liu, Hanze Dong, Qing Lian, Zhitang Chen, Tong Zhang

This paper proposes a Disentangled gEnerative cAusal Representation (DEAR) learning method under appropriate supervised information.

Disentanglement

Higher-order Weighted Graph Convolutional Networks

no code implementations • 11 Nov 2019 • Songtao Liu, Lingwei Chen, Hanze Dong, ZiHao Wang, Dinghao Wu, Zengfeng Huang

Graph Convolution Network (GCN) has been recognized as one of the most effective graph models for semi-supervised learning, but it extracts only first-order or few-order neighborhood information through information propagation and suffers a performance drop-off for deeper structures.

Node Classification

Over Parameterized Two-level Neural Networks Can Learn Near Optimal Feature Representations

no code implementations • 25 Oct 2019 • Cong Fang, Hanze Dong, Tong Zhang

Recently, over-parameterized neural networks have been extensively analyzed in the literature.

Learning the Compositional Spaces for Generalized Zero-shot Learning

no code implementations • ICLR 2019 • Hanze Dong, Yanwei Fu, Sung Ju Hwang, Leonid Sigal, xiangyang xue

This paper studies the problem of Generalized Zero-shot Learning (G-ZSL), whose goal is to classify instances belonging to both seen and unseen classes at test time.

Generalized Zero-Shot Learning, Open Set Learning

Vocabulary-informed Extreme Value Learning

no code implementations • 28 May 2017 • Yanwei Fu, Hanze Dong, Yu-feng Ma, Zhengjun Zhang, xiangyang xue

To solve this problem, we propose the Extreme Value Learning (EVL) formulation to learn the mapping from visual features to the semantic space.

Open Set Learning
