Search Results for author: Paul Weng

Found 29 papers, 8 papers with code

Learning Fair Policies in Multi-Objective (Deep) Reinforcement Learning with Average and Discounted Rewards

no code implementations ICML 2020 Umer Siddique, Paul Weng, Matthieu Zimmer

During this analysis, we notably derive a new result in the standard RL setting that is of independent interest: a novel bound on how far the average reward of a policy that is optimal for the discounted reward can fall from the optimal average reward.

Fairness Reinforcement Learning (RL)

Revisiting Data Augmentation in Deep Reinforcement Learning

1 code implementation 19 Feb 2024 Jianshu Hu, Yunpeng Jiang, Paul Weng

To tackle this question, we analyze existing methods to better understand them and to uncover how they are connected.

Data Augmentation reinforcement-learning

INViT: A Generalizable Routing Problem Solver with Invariant Nested View Transformer

no code implementations 4 Feb 2024 Han Fang, Zhihao Song, Paul Weng, Yutong Ban

Recently, deep reinforcement learning has shown promising results for learning fast heuristics to solve routing problems.

A Survey of Reinforcement Learning from Human Feedback

no code implementations 22 Dec 2023 Timo Kaufmann, Paul Weng, Viktor Bengs, Eyke Hüllermeier

Reinforcement learning from human feedback (RLHF) is a variant of reinforcement learning (RL) that learns from human feedback instead of relying on an engineered reward function.

reinforcement-learning Reinforcement Learning (RL)

Learning Rewards to Optimize Global Performance Metrics in Deep Reinforcement Learning

no code implementations 16 Mar 2023 Junqi Qian, Paul Weng, Chenmien Tan

LR4GPM alternates between two phases: (1) learning a (possibly vector) reward function used to fit the performance metric, and (2) training a policy to optimize an approximation of this performance metric based on the learned rewards.

Autonomous Driving reinforcement-learning +1

A Survey on Interpretable Reinforcement Learning

no code implementations 24 Dec 2021 Claire Glanois, Paul Weng, Matthieu Zimmer, Dong Li, Tianpei Yang, Jianye Hao, Wulong Liu

To that end, we distinguish between interpretability (as a property of a model) and explainability (as a post-hoc operation, with the intervention of a proxy) and discuss them in the context of RL with an emphasis on the former notion.

Autonomous Driving Decision Making +2

Generalization in Deep RL for TSP Problems via Equivariance and Local Search

no code implementations 7 Oct 2021 Wenbin Ouyang, Yisen Wang, Paul Weng, Shaochen Han

Since training on large instances is impractical, we design a novel deep RL approach with a focus on generalizability.

reinforcement-learning Reinforcement Learning (RL)

Improving Generalization of Deep Reinforcement Learning-based TSP Solvers

no code implementations 6 Oct 2021 Wenbin Ouyang, Yisen Wang, Shaochen Han, Zhejian Jin, Paul Weng

In this work, we propose a novel approach named MAGIC that includes a deep learning architecture and a DRL training method.

reinforcement-learning Reinforcement Learning (RL)

Learning Symbolic Rules for Interpretable Deep Reinforcement Learning

no code implementations 15 Mar 2021 Zhihao Ma, Yuzheng Zhuang, Paul Weng, Hankz Hankui Zhuo, Dong Li, Wulong Liu, Jianye Hao

To address this challenge and improve transparency, we propose a Neural Symbolic Reinforcement Learning framework by introducing symbolic logic into DRL.

reinforcement-learning Reinforcement Learning (RL)

Safe Distributional Reinforcement Learning

no code implementations 26 Feb 2021 Jianyi Zhang, Paul Weng

Safety in reinforcement learning (RL) is a key property in both training and execution in many domains such as autonomous driving or finance.

Autonomous Driving Distributional Reinforcement Learning +2

Differentiable Logic Machines

no code implementations 23 Feb 2021 Matthieu Zimmer, Xuening Feng, Claire Glanois, Zhaohui Jiang, Jianyi Zhang, Paul Weng, Dong Li, Jianye Hao, Wulong Liu

As a step in this direction, we propose a novel neural-logic architecture, called differentiable logic machine (DLM), that can solve both inductive logic programming (ILP) and reinforcement learning (RL) problems, where the solution can be interpreted as a first-order logic program.

Decision Making Inductive logic programming +1

Analytics and Machine Learning in Vehicle Routing Research

no code implementations 19 Feb 2021 Ruibin Bai, Xinan Chen, Zhi-Long Chen, Tianxiang Cui, Shuhui Gong, Wentao He, Xiaoping Jiang, Huan Jin, Jiahuan Jin, Graham Kendall, Jiawei Li, Zheng Lu, Jianfeng Ren, Paul Weng, Ning Xue, Huayan Zhang

The Vehicle Routing Problem (VRP) is one of the most intensively studied combinatorial optimisation problems for which numerous models and algorithms have been proposed.

BIG-bench Machine Learning

Learning Fair Policies in Decentralized Cooperative Multi-Agent Reinforcement Learning

3 code implementations 17 Dec 2020 Matthieu Zimmer, Claire Glanois, Umer Siddique, Paul Weng

As a solution method, we propose a novel neural network architecture, which is composed of two sub-networks specifically designed for taking into account the two aspects of fairness.

Fairness Multi-agent Reinforcement Learning +2

Hyperparameter Auto-tuning in Self-Supervised Robotic Learning

2 code implementations 16 Oct 2020 Jiancong Huang, Juan Rojas, Matthieu Zimmer, Hongmin Wu, Yisheng Guan, Paul Weng

Insufficient learning (due to convergence to local optima) results in under-performing policies whilst redundant learning wastes time and resources.

Multi-Task Learning reinforcement-learning +1

Learning Fair Policies in Multiobjective (Deep) Reinforcement Learning with Average and Discounted Rewards

1 code implementation 18 Aug 2020 Umer Siddique, Paul Weng, Matthieu Zimmer

Since learning with discounted rewards is generally easier, this discussion further justifies finding a fair policy for the average reward by learning a fair policy for the discounted reward.

Fairness reinforcement-learning +1

Reinforcement Learning

no code implementations 29 May 2020 Olivier Buffet, Olivier Pietquin, Paul Weng

Reinforcement learning (RL) is a general framework for adaptive control, which has proven to be efficient in many domains, e.g., board games, video games or autonomous vehicles.

Autonomous Vehicles Board Games +3

Towards More Sample Efficiency in Reinforcement Learning with Data Augmentation

1 code implementation 19 Oct 2019 Yijiong Lin, Jiancong Huang, Matthieu Zimmer, Juan Rojas, Paul Weng

Deep reinforcement learning (DRL) is a promising approach for adaptive robot control, but its application to robotics is currently hindered by high sample requirements.

Data Augmentation reinforcement-learning +1

Invariant Transform Experience Replay: Data Augmentation for Deep Reinforcement Learning

1 code implementation 24 Sep 2019 Yijiong Lin, Jiancong Huang, Matthieu Zimmer, Yisheng Guan, Juan Rojas, Paul Weng

Our work demonstrates that invariant transformations on RL trajectories are a promising methodology to speed up learning in deep RL.

Data Augmentation OpenAI Gym +2

Fairness in Reinforcement Learning

no code implementations 24 Jul 2019 Paul Weng

Decision support systems (e.g., for ecological conservation) and autonomous systems (e.g., adaptive controllers in smart cities) are starting to be deployed in real applications.

BIG-bench Machine Learning Fairness +2

Exploiting the Sign of the Advantage Function to Learn Deterministic Policies in Continuous Domains

1 code implementation 10 Jun 2019 Matthieu Zimmer, Paul Weng

In the context of learning deterministic policies in continuous domains, we revisit an approach that was first proposed in Continuous Actor Critic Learning Automaton (CACLA) and later extended in Neural Fitted Actor Critic (NFAC).

Multi-objective Bandits: Optimizing the Generalized Gini Index

no code implementations ICML 2017 Robert Busa-Fekete, Balazs Szorenyi, Paul Weng, Shie Mannor

We study the multi-armed bandit (MAB) problem where the agent receives a vectorial feedback that encodes many possibly competing objectives to be optimized.

Finding Risk-Averse Shortest Path with Time-dependent Stochastic Costs

no code implementations 3 Jan 2017 Dajian Li, Paul Weng, Orkun Karabasoglu

We also present a case study of our algorithm on the Manhattan, NYC, transportation network.

From Preference-Based to Multiobjective Sequential Decision-Making

no code implementations 3 Jan 2017 Paul Weng

In this paper, we present a link between preference-based and multiobjective sequential decision-making.

Decision Making

Optimizing Quantiles in Preference-based Markov Decision Processes

no code implementations 1 Dec 2016 Hugo Gilbert, Paul Weng, Yan Xu

In the Markov decision process model, policies are usually evaluated by expected cumulative rewards.

Quantile Reinforcement Learning

no code implementations 3 Nov 2016 Hugo Gilbert, Paul Weng

In reinforcement learning, the standard criterion to evaluate policies in a state is the expectation of the (discounted) sum of rewards.

reinforcement-learning Reinforcement Learning (RL)

Approximation of Lorenz-Optimal Solutions in Multiobjective Markov Decision Processes

no code implementations 26 Sep 2013 Patrice Perny, Paul Weng, Judy Goldsmith, Josiah Hanna

This paper is devoted to fair optimization in Multiobjective Markov Decision Processes (MOMDPs).
