Search Results for author: Long Yang

Found 25 papers, 8 papers with code

TARGO: Benchmarking Target-driven Object Grasping under Occlusions

no code implementations · 8 Jul 2024 · Yan Xia, Ran Ding, Ziyuan Qin, Guanqi Zhan, Kaichen Zhou, Long Yang, Hao Dong, Daniel Cremers

We also generate a large-scale training dataset via a scalable pipeline, which can be used to boost grasping performance under occlusion and generalizes to the real world.

Benchmarking Object +1

Off-OAB: Off-Policy Policy Gradient Method with Optimal Action-Dependent Baseline

no code implementations · 4 May 2024 · Wenjia Meng, Qian Zheng, Long Yang, Yilong Yin, Gang Pan

In this paper, we propose an off-policy policy gradient method with the optimal action-dependent baseline (Off-OAB) to mitigate this variance issue.

Computational Efficiency OpenAI Gym +1

rFaceNet: An End-to-End Network for Enhanced Physiological Signal Extraction through Identity-Specific Facial Contours

no code implementations · 14 Mar 2024 · Dali Zhu, Wenli Zhang, Hualin Zeng, Xiaohao Liu, Long Yang, Jiaqi Zheng

Remote photoplethysmography (rPPG) technique extracts blood volume pulse (BVP) signals from subtle pixel changes in video frames.

Heart rate estimation

A General Perspective on Objectives of Reinforcement Learning

no code implementations · 5 Jun 2023 · Long Yang

In this lecture, we present a general perspective on reinforcement learning (RL) objectives, showing three versions of the objective.

reinforcement-learning Reinforcement Learning +1

Policy Representation via Diffusion Probability Model for Reinforcement Learning

1 code implementation · 22 May 2023 · Long Yang, Zhixiong Huang, Fenghao Lei, Yucun Zhong, Yiming Yang, Cong Fang, Shiting Wen, Binbin Zhou, Zhouchen Lin

Popular reinforcement learning (RL) algorithms tend to produce a unimodal policy distribution, which weakens the expressiveness of complicated policies and degrades exploration.

continuous-control Continuous Control +3

Constrained Update Projection Approach to Safe Policy Optimization

3 code implementations · 15 Sep 2022 · Long Yang, Jiaming Ji, Juntao Dai, Linrui Zhang, Binbin Zhou, Pengfei Li, Yaodong Yang, Gang Pan

Compared to previous safe RL methods, CUP generalizes the surrogate functions to the generalized advantage estimator (GAE), leading to strong empirical performance.

Reinforcement Learning (RL) Safe Reinforcement Learning
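The GAE that CUP builds its surrogate functions on can be sketched in a few lines; the following is a generic, illustrative implementation (function and variable names are my own, not taken from the paper's code):

```python
import numpy as np

def gae(rewards, values, gamma=0.99, lam=0.95):
    """Generalized advantage estimator: an exponentially weighted
    average of k-step TD advantages, controlled by lambda."""
    T = len(rewards)
    adv = np.zeros(T)
    last = 0.0
    for t in reversed(range(T)):
        # one-step TD error; values has length T+1 (bootstrap value at the end)
        delta = rewards[t] + gamma * values[t + 1] - values[t]
        last = delta + gamma * lam * last
        adv[t] = last
    return adv

rewards = np.array([1.0, 0.0, 1.0])
values = np.array([0.5, 0.4, 0.6, 0.0])  # V(s_0..s_3); terminal bootstrap is 0
print(gae(rewards, values))
```

Setting lam=0 recovers one-step TD advantages; lam=1 recovers Monte Carlo returns minus the baseline.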

Penalized Proximal Policy Optimization for Safe Reinforcement Learning

no code implementations · 24 May 2022 · Linrui Zhang, Li Shen, Long Yang, Shixiang Chen, Bo Yuan, Xueqian Wang, DaCheng Tao

Safe reinforcement learning aims to learn the optimal policy while satisfying safety constraints, which is essential in real-world applications.

reinforcement-learning Reinforcement Learning +2

A Review of Safe Reinforcement Learning: Methods, Theory and Applications

1 code implementation · 20 May 2022 · Shangding Gu, Long Yang, Yali Du, Guang Chen, Florian Walter, Jun Wang, Alois Knoll

To establish a good foundation for future safe RL research, in this paper, we provide a review of safe RL from the perspectives of methods, theories, and applications.

Autonomous Driving Decision Making +4

CUP: A Conservative Update Policy Algorithm for Safe Reinforcement Learning

1 code implementation · 15 Feb 2022 · Long Yang, Jiaming Ji, Juntao Dai, Yu Zhang, Pengfei Li, Gang Pan

Although using bounds as surrogate functions to design safe RL algorithms has appeared in some existing works, we develop them in at least three aspects: (i) we provide a rigorous theoretical analysis that extends the surrogate functions to the generalized advantage estimator (GAE).

reinforcement-learning Reinforcement Learning +3

Secure Transmission for IRS-Assisted MIMO MmWave Systems

no code implementations · 9 Jan 2022 · Long Yang, Jiangtao Wang, Xuan Xue, Jia Shi, Yongchao Wang

In this paper, we investigate the secure beamforming design in an intelligent reflection surface (IRS) assisted millimeter wave (mmWave) system, where the hybrid beamforming (HB) and the passive beamforming (PB) are employed by the transmitter and the IRS, respectively.

Thompson Sampling for Unimodal Bandits

no code implementations · 15 Jun 2021 · Long Yang, Zhao Li, Zehong Hu, Shasha Ruan, Shijian Li, Gang Pan, Hongyang Chen

In this paper, we propose a Thompson Sampling algorithm for \emph{unimodal} bandits, where the expected reward is unimodal over the partially ordered arms.

Thompson Sampling
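For context, plain Beta-Bernoulli Thompson Sampling looks like the sketch below; this is the generic algorithm only — the paper's unimodal variant additionally exploits the partial order over arms, which this sketch does not:

```python
import random

def thompson_step(successes, failures):
    """One round of Beta-Bernoulli Thompson Sampling: sample a mean for
    each arm from its Beta posterior, then play the argmax."""
    samples = [random.betavariate(s + 1, f + 1)
               for s, f in zip(successes, failures)]
    return max(range(len(samples)), key=samples.__getitem__)

# simulate a 3-armed Bernoulli bandit
true_means = [0.2, 0.5, 0.8]
succ, fail = [0, 0, 0], [0, 0, 0]
random.seed(0)
for _ in range(2000):
    arm = thompson_step(succ, fail)
    reward = random.random() < true_means[arm]
    (succ if reward else fail)[arm] += 1
print("pulls per arm:", [s + f for s, f in zip(succ, fail)])
```

Over the 2000 rounds, the posterior concentrates and the best arm accumulates the vast majority of pulls.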

On Convergence of Gradient Expected Sarsa($λ$)

no code implementations · 14 Dec 2020 · Long Yang, Gang Zheng, Yu Zhang, Qian Zheng, Pengfei Li, Gang Pan

We study the convergence of $\mathtt{Expected~Sarsa}(\lambda)$ with linear function approximation.

Sample Complexity of Policy Gradient Finding Second-Order Stationary Points

no code implementations · 2 Dec 2020 · Long Yang, Qian Zheng, Gang Pan

However, due to the inherent non-concavity of its objective, convergence to a first-order stationary point (FOSP) cannot guarantee that policy gradient methods find a maximal point.

Policy Gradient Methods Reinforcement Learning (RL)
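A standard two-dimensional example of why a FOSP need not be a maximizer (illustrative, not from the paper):

```latex
% In a maximization problem, \nabla f(\theta^*) = 0 (a FOSP) does not
% imply that \theta^* is a maximizer: the origin below is a saddle point.
f(\theta_1, \theta_2) = \theta_1^2 - \theta_2^2, \qquad
\nabla f(0, 0) = (0, 0), \qquad
\nabla^2 f(0, 0) = \begin{pmatrix} 2 & 0 \\ 0 & -2 \end{pmatrix}
```

The indefinite Hessian is what second-order stationarity rules out.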

Gradient Q$(σ, λ)$: A Unified Algorithm with Function Approximation for Reinforcement Learning

no code implementations · 6 Sep 2019 · Long Yang, Yu Zhang, Qian Zheng, Pengfei Li, Gang Pan

To address the above problem, we propose GQ$(\sigma,\lambda)$, which extends tabular Q$(\sigma,\lambda)$ with linear function approximation.

Q-Learning Reinforcement Learning +1

FiDi-RL: Incorporating Deep Reinforcement Learning with Finite-Difference Policy Search for Efficient Learning of Continuous Control

no code implementations · 1 Jul 2019 · Longxiang Shi, Shijian Li, Longbing Cao, Long Yang, Gang Zheng, Gang Pan

Alternatively, derivative-free methods treat the optimization process as a black box and show robustness and stability in learning continuous control tasks, but they are not data-efficient.

continuous-control Continuous Control +3

Expected Sarsa($λ$) with Control Variate for Variance Reduction

no code implementations · 25 Jun 2019 · Long Yang, Yu Zhang, Jun Wen, Qian Zheng, Pengfei Li, Gang Pan

In this paper, to reduce the variance, we introduce the control variate technique into $\mathtt{Expected}$ $\mathtt{Sarsa}$($\lambda$) and propose a tabular $\mathtt{ES}$($\lambda$)-$\mathtt{CV}$ algorithm.

Off-policy evaluation Reinforcement Learning
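The control variate idea itself can be demonstrated on a toy Monte Carlo problem; this sketch shows generic variance reduction, not the paper's ES(λ)-CV algorithm (all names and constants are illustrative):

```python
import math
import random
import statistics

def estimator_variances(n_trials=200, n_samples=100, seed=1):
    """Estimate E[exp(U)], U ~ Uniform(0,1), two ways: a plain Monte Carlo
    average, and one that subtracts the control variate c*(U - 1/2), whose
    mean is known to be zero so the estimator stays unbiased. Returns the
    empirical variance of each estimator across trials."""
    rng = random.Random(seed)
    c = 1.69  # near the optimal coefficient Cov(exp(U), U) / Var(U)
    plain, cv = [], []
    for _ in range(n_trials):
        us = [rng.random() for _ in range(n_samples)]
        xs = [math.exp(u) for u in us]
        plain.append(sum(xs) / n_samples)
        cv.append(sum(x - c * (u - 0.5) for x, u in zip(xs, us)) / n_samples)
    return statistics.pvariance(plain), statistics.pvariance(cv)

v_plain, v_cv = estimator_variances()
print(f"plain: {v_plain:.2e}  with control variate: {v_cv:.2e}")
```

Because exp(U) and U are highly correlated, subtracting the zero-mean term cancels most of the sampling noise.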

TBQ($σ$): Improving Efficiency of Trace Utilization for Off-Policy Reinforcement Learning

no code implementations · 17 May 2019 · Longxiang Shi, Shijian Li, Longbing Cao, Long Yang, Gang Pan

However, existing off-policy learning methods based on probabilistic policy measurement are inefficient at utilizing traces under a greedy target policy, which limits their effectiveness for control problems.

reinforcement-learning Reinforcement Learning +1

Beetle Swarm Optimization Algorithm: Theory and Application

1 code implementation · 1 Aug 2018 · Tiantian Wang, Long Yang

In this paper, a new meta-heuristic algorithm, called beetle swarm optimization algorithm, is proposed by enhancing the performance of swarm optimization through beetle foraging principles.
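The beetle antennae search step that such foraging-inspired algorithms build on can be sketched as follows, assuming a minimization objective; the greedy acceptance rule and parameter values here are my own simplifications, not the paper's method:

```python
import math
import random

def bas_step(x, f, d=0.3, step=0.2, rng=random):
    """One beetle antennae search step (minimization): sense the objective
    at two antennae along a random unit direction, then move toward the
    side with the smaller objective value."""
    b = [rng.gauss(0, 1) for _ in x]
    norm = math.sqrt(sum(v * v for v in b)) or 1.0
    b = [v / norm for v in b]
    left = [xi - d * bi for xi, bi in zip(x, b)]
    right = [xi + d * bi for xi, bi in zip(x, b)]
    s = 1.0 if f(left) > f(right) else -1.0  # move toward smaller f
    return [xi + step * s * bi for xi, bi in zip(x, b)]

sphere = lambda p: sum(v * v for v in p)
random.seed(42)
x = [2.0, -2.0]
for i in range(200):
    nx = bas_step(x, sphere, d=0.3, step=0.95 ** i)
    if sphere(nx) < sphere(x):  # greedy acceptance keeps the sketch stable
        x = nx
print(sphere(x))
```

The swarm version of the paper runs many such beetles and shares information between them, as in particle swarm optimization.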

Qualitative Measurements of Policy Discrepancy for Return-Based Deep Q-Network

no code implementations · 14 Jun 2018 · Wenjia Meng, Qian Zheng, Long Yang, Pengfei Li, Gang Pan

In this paper, we propose a general framework to combine DQN and most of the return-based reinforcement learning algorithms, named R-DQN.

OpenAI Gym reinforcement-learning +2

A Unified Approach for Multi-step Temporal-Difference Learning with Eligibility Traces in Reinforcement Learning

no code implementations · 9 Feb 2018 · Long Yang, Minhao Shi, Qian Zheng, Wenjia Meng, Gang Pan

Results show that, with an intermediate value of $\sigma$, $Q(\sigma ,\lambda)$ creates a mixture of the existing algorithms that can learn the optimal value significantly faster than the extreme end ($\sigma=0$, or $1$).

Reinforcement Learning
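The σ-mixture described above can be shown with a one-step backup target; a minimal sketch with illustrative names (the paper's Q$(\sigma,\lambda)$ additionally uses eligibility traces):

```python
def q_sigma_target(reward, q_next, next_action, policy_probs, sigma, gamma=0.99):
    """One-step Q(sigma) backup target: sigma=1 gives the Sarsa target
    (sampled next action), sigma=0 gives the Expected Sarsa target
    (expectation over the policy); intermediate sigma mixes the two."""
    sarsa = q_next[next_action]
    expected = sum(p * q for p, q in zip(policy_probs, q_next))
    return reward + gamma * (sigma * sarsa + (1 - sigma) * expected)

q_next = [1.0, 2.0]     # Q(s', a) for both actions
probs = [0.25, 0.75]    # pi(a | s')
print(q_sigma_target(0.0, q_next, 1, probs, sigma=1.0))  # Sarsa endpoint
print(q_sigma_target(0.0, q_next, 1, probs, sigma=0.0))  # Expected Sarsa endpoint
```

An intermediate σ (e.g. 0.5) averages the sampled and expected backups, which is the mixture the abstract reports learning fastest.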

Distinguishing the Indistinguishable: Exploring Structural Ambiguities via Geodesic Context

1 code implementation · CVPR 2017 · Qingan Yan, Long Yang, Ling Zhang, Chunxia Xiao

A perennial problem in structure from motion (SfM) is visual ambiguity posed by repetitive structures.
