Search Results for author: Meihua Dang

Found 7 papers, 4 papers with code

Diffusion Model Alignment Using Direct Preference Optimization

no code implementations • 21 Nov 2023 • Bram Wallace, Meihua Dang, Rafael Rafailov, Linqi Zhou, Aaron Lou, Senthil Purushwalkam, Stefano Ermon, Caiming Xiong, Shafiq Joty, Nikhil Naik

Large language models (LLMs) are fine-tuned using human comparison data with Reinforcement Learning from Human Feedback (RLHF) methods to make them better aligned with users' preferences.
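For reference, a minimal sketch of the standard DPO objective that this paper adapts from language-model alignment to diffusion models; the log-probability inputs are illustrative placeholders, not the paper's diffusion-specific formulation.

```python
# Minimal sketch of the Direct Preference Optimization (DPO) loss on one
# preference pair; log-probs below are toy placeholders, not the paper's code.
import numpy as np

def dpo_loss(logp_w, logp_l, ref_logp_w, ref_logp_l, beta=0.1):
    """Per-pair DPO loss: prefer the 'winning' sample y_w over the 'losing' y_l.

    logp_*    : log-prob of the sample under the policy being fine-tuned
    ref_logp_*: log-prob of the same sample under the frozen reference model
    """
    margin = beta * ((logp_w - ref_logp_w) - (logp_l - ref_logp_l))
    return -np.log(1.0 / (1.0 + np.exp(-margin)))  # -log(sigmoid(margin))

# Example: the policy favors the preferred sample more than the reference does,
# so the loss drops below log(2).
print(dpo_loss(logp_w=-10.0, logp_l=-12.0, ref_logp_w=-11.0, ref_logp_l=-11.5))
```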

Scaling Pareto-Efficient Decision Making Via Offline Multi-Objective RL

1 code implementation • 30 Apr 2023 • Baiting Zhu, Meihua Dang, Aditya Grover

In this work, we propose a new data-driven setup for offline MORL, where we wish to learn a preference-agnostic policy agent using only a finite dataset of offline demonstrations of other agents and their preferences.

Decision Making • Multi-Objective Reinforcement Learning
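One hedged way to picture this setup: a fixed offline dataset in which every demonstration carries the preference weights of the agent that produced it, and a single policy conditioned on those weights is fit by supervised imitation. The names and the linear policy below are hypothetical illustrations, not the paper's method.

```python
# Toy rendering of the offline multi-objective RL setup described above.
import numpy as np

rng = np.random.default_rng(0)

# Each record: (state, demonstrated action, preference vector over 2 objectives)
dataset = [
    (rng.normal(size=4), rng.normal(size=2), np.array([0.7, 0.3])),
    (rng.normal(size=4), rng.normal(size=2), np.array([0.2, 0.8])),
]

def policy(state, preference, W):
    """Linear policy conditioned on both the state and the preference weights."""
    return W @ np.concatenate([state, preference])

# Supervised (behavior-cloning style) objective over the offline data: one
# policy must imitate demonstrators that acted under different preferences.
W = rng.normal(size=(2, 6)) * 0.1
loss = np.mean([np.sum((policy(s, p, W) - a) ** 2) for s, a, p in dataset])
print(f"offline imitation loss: {loss:.3f}")
```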

Tractable Control for Autoregressive Language Generation

1 code implementation • 15 Apr 2023 • Honghua Zhang, Meihua Dang, Nanyun Peng, Guy Van Den Broeck

To overcome this challenge, we propose to use tractable probabilistic models (TPMs) to impose lexical constraints in autoregressive text generation models, which we refer to as GeLaTo (Generating Language with Tractable Constraints).

Text Generation
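A hedged sketch of the core idea: reweight the autoregressive model's next-token distribution by the probability, computed with a tractable probabilistic model, that the lexical constraint can still be satisfied after each candidate token. The toy numbers below stand in for the paper's HMM and GPT components.

```python
# Constraint-aware next-token reweighting, GeLaTo-style (toy stand-in values).
import numpy as np

vocab = ["the", "frisbee", "dog", "caught"]
p_lm = np.array([0.4, 0.05, 0.35, 0.2])        # p_LM(next token | prefix)
p_constraint = np.array([0.3, 1.0, 0.2, 0.9])  # p_TPM(constraint holds | prefix + token)

# p(next token | prefix, constraint) ∝ p_LM(token | prefix) * p_TPM(constraint | prefix, token)
p_joint = p_lm * p_constraint
p_joint /= p_joint.sum()
print(dict(zip(vocab, p_joint.round(3))))  # mass shifts toward constraint-satisfying tokens
```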

Sparse Probabilistic Circuits via Pruning and Growing

1 code implementation • 22 Nov 2022 • Meihua Dang, Anji Liu, Guy Van Den Broeck

The growing operation increases model capacity by enlarging the latent space.

Model Compression
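An illustrative toy rendering of the two operations on a single sum node: pruning drops low-weight edges to sparsify the circuit, and growing duplicates the remaining children (with small perturbations) to enlarge the latent space. This is an assumption-laden sketch, not the paper's algorithm.

```python
# Prune-and-grow on one sum node's mixture weights (toy numpy sketch).
import numpy as np

weights = np.array([0.50, 0.30, 0.15, 0.04, 0.01])  # mixture weights of one sum node

# Prune: remove edges whose weight falls below a threshold, then renormalize.
keep = weights >= 0.05
pruned = weights[keep] / weights[keep].sum()

# Grow: split each surviving child into two perturbed copies, doubling the
# node's latent space; weights are halved, jittered, and renormalized.
jitter = np.exp(np.random.default_rng(0).normal(0.0, 0.01, 2 * pruned.size))
grown = np.repeat(pruned / 2, 2) * jitter
grown /= grown.sum()

print(pruned.round(3), grown.round(3))
```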

Strudel: Learning Structured-Decomposable Probabilistic Circuits

1 code implementation • 18 Jul 2020 • Meihua Dang, Antonio Vergari, Guy Van Den Broeck

Probabilistic circuits (PCs) represent a probability distribution as a computational graph.

Density Estimation
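To make that one-line definition concrete, here is a minimal, hand-built probabilistic circuit: Bernoulli leaves feed product nodes (factorizations) and a weighted sum node (a mixture), and evaluating the graph bottom-up returns the probability of a full assignment. This is a generic toy example, not Strudel's learned structure or API.

```python
# A tiny probabilistic circuit as a computational graph (toy example).
import math

def leaf(var, theta):
    # Bernoulli leaf over a single variable
    return lambda a: theta if a[var] == 1 else 1.0 - theta

def product(children):
    # Product node: factorization over disjoint sets of variables
    return lambda a: math.prod(c(a) for c in children)

def weighted_sum(weights, children):
    # Sum node: mixture of its children
    return lambda a: sum(w * c(a) for w, c in zip(weights, children))

# Circuit over variables X and Y: a mixture of two fully factorized components.
pc = weighted_sum(
    [0.6, 0.4],
    [product([leaf("X", 0.9), leaf("Y", 0.2)]),
     product([leaf("X", 0.1), leaf("Y", 0.7)])],
)
print(pc({"X": 1, "Y": 0}))  # p(X=1, Y=0) = 0.6*0.9*0.8 + 0.4*0.1*0.3 = 0.444
```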
