no code implementations • 15 Feb 2024 • Dexun Li, Cong Zhang, Kuicai Dong, Derrick Goh Xin Deik, Ruiming Tang, Yong Liu
In this paper, we introduce the Distributional Preference Reward Model (DPRM), a simple yet effective framework to align large language models with a diverse set of human preferences.
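The core idea of a distributional reward model can be illustrated with a minimal sketch: instead of predicting a single scalar, the model outputs a categorical distribution over discrete preference levels, which is then collapsed into a scalar reward for alignment training. The level values and probabilities below are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def expected_reward(class_probs, class_values):
    """Collapse a categorical preference distribution into a scalar reward.

    class_probs: predicted probabilities over discrete preference levels
    class_values: scalar value assigned to each level (an assumed mapping)
    """
    class_probs = np.asarray(class_probs, dtype=float)
    class_values = np.asarray(class_values, dtype=float)
    assert np.isclose(class_probs.sum(), 1.0), "probabilities must sum to 1"
    return float(class_probs @ class_values)

# A response rated mostly "good" (levels 0..4, higher = more preferred):
r = expected_reward([0.05, 0.05, 0.1, 0.3, 0.5], [0, 1, 2, 3, 4])  # -> 3.15
```

Keeping the full distribution (rather than only its mean) is what lets such a model represent disagreement across a diverse set of annotators.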
no code implementations • 30 Sep 2023 • Dexun Li, Pradeep Varakantham
Unsupervised Environment Design (UED) is a paradigm for automatically generating a curriculum of training environments, enabling agents trained in these environments to develop general capabilities, i.e., achieving good zero-shot transfer performance.
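A common curriculum rule in UED is to train next on the environment with the largest estimated regret (the gap between attainable and current return). The sketch below illustrates only that selection step; the level names and the two return functions are stand-in assumptions for rollout-based estimates, not the paper's method.

```python
def pick_next_level(levels, agent_return, optimal_return):
    """Select the training environment with the largest estimated regret.

    regret(level) = optimal_return(level) - agent_return(level)
    Both return functions stand in for Monte Carlo rollout estimates.
    """
    return max(levels, key=lambda lvl: optimal_return(lvl) - agent_return(lvl))

# Illustrative values: the agent already solves the easy maze, and the
# hard maze is barely solvable even by a strong policy.
levels = ["maze_easy", "maze_medium", "maze_hard"]
agent = {"maze_easy": 0.9, "maze_medium": 0.5, "maze_hard": 0.1}.__getitem__
opt = {"maze_easy": 1.0, "maze_medium": 1.0, "maze_hard": 0.3}.__getitem__
chosen = pick_next_level(levels, agent, opt)  # -> "maze_medium"
```

Regret-based selection naturally favors levels at the frontier of the agent's ability: solved levels and unsolvable levels both score low.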
no code implementations • 4 Feb 2023 • Dexun Li, Wenjun Li, Pradeep Varakantham
In this paper, we aim to introduce diversity in the Unsupervised Environment Design (UED) framework.
no code implementations • 19 Jan 2023 • Wenjun Li, Pradeep Varakantham, Dexun Li
Agent decision making using Reinforcement Learning (RL) heavily relies on either a model or simulator of the environment (e.g., moving in an 8x8 maze with three rooms, playing Chess on an 8x8 board).
no code implementations • 18 Dec 2022 • Dexun Li
Using historical data to predict future events has many real-world applications, such as stock price prediction and robot localization.
no code implementations • 27 Jul 2022 • Dexun Li, Pradeep Varakantham
To avoid starvation in the executed interventions across individuals/regions/communities, we first provide a soft fairness constraint and then provide an approach to enforce the soft fairness constraint in RMABs.
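One common way to soften a hard top-k intervention rule, so that no arm is permanently starved, is to sample arms with probability increasing in their priority score rather than always picking the top k. The softmax form and the names below are illustrative assumptions, not the paper's exact fairness mechanism.

```python
import math
import random

def soft_select(indices, k, temperature=1.0, rng=None):
    """Sample k arms without replacement, with probability increasing in
    each arm's priority score (softmax). Every arm keeps a nonzero chance
    of selection, so no arm is starved of interventions forever.

    indices: maps arm id -> priority score (e.g., a Whittle index).
    """
    rng = rng or random.Random()
    remaining = dict(indices)
    chosen = []
    for _ in range(min(k, len(remaining))):
        arms = list(remaining)
        weights = [math.exp(remaining[a] / temperature) for a in arms]
        pick = rng.choices(arms, weights=weights, k=1)[0]
        chosen.append(pick)
        del remaining[pick]
    return chosen

# Two of three arms are intervened on; the low-index arm "c" is unlikely
# but never impossible to be picked.
picked = soft_select({"a": 2.0, "b": 1.0, "c": 0.1}, k=2,
                     rng=random.Random(0))
```

The temperature controls the fairness/value trade-off: high temperature approaches uniform selection, while temperature near zero recovers the greedy top-k rule.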
no code implementations • 8 Jun 2022 • Dexun Li, Pradeep Varakantham
In this paper, we are interested in ensuring that RMAB decision making is also fair to different arms while maximizing expected value.
no code implementations • 8 Jul 2021 • Dexun Li, Meghna Lowalekar, Pradeep Varakantham
Influence maximization is the problem of finding a small subset of nodes in a network that can maximize the diffusion of information.
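The standard baseline for this problem is the greedy algorithm under the independent cascade model: repeatedly add the node with the largest marginal gain in expected spread, estimated by Monte Carlo simulation. This is a sketch of that classic approach, not the method proposed in the paper; the graph and parameters are illustrative.

```python
import random

def simulate_ic(graph, seeds, p, rng):
    """One independent-cascade run: each newly activated node gets one
    chance to activate each out-neighbour with probability p."""
    active, frontier = set(seeds), list(seeds)
    while frontier:
        nxt = []
        for u in frontier:
            for v in graph.get(u, []):
                if v not in active and rng.random() < p:
                    active.add(v)
                    nxt.append(v)
        frontier = nxt
    return len(active)

def greedy_im(graph, k, p=0.1, runs=200, seed=0):
    """Greedy influence maximization: add the node with the largest
    Monte Carlo estimate of marginal spread, k times."""
    rng = random.Random(seed)
    nodes = set(graph) | {v for vs in graph.values() for v in vs}
    chosen = []
    for _ in range(k):
        best, best_spread = None, -1.0
        for v in sorted(nodes - set(chosen)):
            spread = sum(simulate_ic(graph, chosen + [v], p, rng)
                         for _ in range(runs)) / runs
            if spread > best_spread:
                best, best_spread = v, spread
        chosen.append(best)
    return chosen

# On a star graph, the hub is the clear first pick.
seeds = greedy_im({"hub": ["a", "b", "c"]}, k=1, p=0.9, runs=100)
```

Because the spread function is monotone and submodular under independent cascade, this greedy procedure carries the well-known (1 - 1/e) approximation guarantee.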