Search Results for author: Junyu Zhang

Found 16 papers, 0 papers with code

Provably Efficient Gauss-Newton Temporal Difference Learning Method with Function Approximation

no code implementations25 Feb 2023 Zhifa Ke, Zaiwen Wen, Junyu Zhang

Compared to popular Temporal Difference (TD) learning, which can be viewed as taking a single gradient descent step on FQI's subproblem per iteration, the Gauss-Newton step of GNTD better retains the structure of FQI and hence leads to better convergence.
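As a small illustration of this comparison (a hypothetical linear-function-approximation setup, not the paper's implementation): with the targets frozen, FQI's subproblem is a least-squares fit, a TD-style update takes one gradient step on it, and a Gauss-Newton step solves the linearized subproblem exactly.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n, gamma, lr = 4, 64, 0.9, 0.1

phi = rng.normal(size=(n, d))        # features phi(s) of sampled states
phi_next = rng.normal(size=(n, d))   # features phi(s') of successor states
r = rng.normal(size=n)               # sampled rewards
theta = np.zeros(d)                  # current value-function parameters

# FQI subproblem: fit phi @ theta to frozen targets y = r + gamma * V_old(s')
y = r + gamma * phi_next @ theta

# TD-style update: a single gradient-descent step on the subproblem
residual = phi @ theta - y
theta_td = theta - lr * phi.T @ residual / n

# Gauss-Newton step: solve the (linear) least-squares subproblem exactly
theta_gn, *_ = np.linalg.lstsq(phi, y, rcond=None)

loss = lambda th: np.mean((phi @ th - y) ** 2)
```

In this linear setting the Gauss-Newton step reaches the subproblem's minimizer in one shot, while the TD-style step only makes partial progress toward it.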

Offline RL

A Near-Optimal Primal-Dual Method for Off-Policy Learning in CMDP

no code implementations13 Jul 2022 Fan Chen, Junyu Zhang, Zaiwen Wen

As an important framework for safe Reinforcement Learning, the Constrained Markov Decision Process (CMDP) has been extensively studied in the recent literature.
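In one common discounted instantiation of the framework (notation ours), the CMDP problem maximizes reward subject to constraints on auxiliary utilities $u_i$:

$$\max_{\pi}\ \mathbb{E}_\pi\Big[\sum_{t=0}^{\infty}\gamma^t r(s_t,a_t)\Big] \quad \text{s.t.} \quad \mathbb{E}_\pi\Big[\sum_{t=0}^{\infty}\gamma^t u_i(s_t,a_t)\Big] \ge c_i, \quad i=1,\dots,m.$$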

Safe Reinforcement Learning

On the Sample Complexity and Metastability of Heavy-tailed Policy Search in Continuous Control

no code implementations15 Jun 2021 Amrit Singh Bedi, Anjaly Parayil, Junyu Zhang, Mengdi Wang, Alec Koppel

To close this gap, we step towards persistent exploration in continuous space through policy parameterizations defined by heavier-tailed distributions with tail-index parameter $\alpha$, which increase the likelihood of long jumps in state space.
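A quick numeric illustration of why heavier tails aid exploration (a generic sketch, not the paper's algorithm): a Cauchy perturbation, the $\alpha = 1$ stable distribution, produces far more large jumps than a Gaussian one.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 10_000

light = rng.normal(size=n)            # Gaussian exploration noise (light tails)
heavy = rng.standard_cauchy(size=n)   # Cauchy noise, tail-index alpha = 1

# fraction of "long jumps": perturbations larger than 5 scale units
frac_light = np.mean(np.abs(light) > 5.0)
frac_heavy = np.mean(np.abs(heavy) > 5.0)
```

For the Gaussian, $P(|X| > 5) \approx 6 \times 10^{-7}$, while for the Cauchy it is about $0.13$, so the heavy-tailed policy jumps to distant states orders of magnitude more often.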

Continuous Control Decision Making

MARL with General Utilities via Decentralized Shadow Reward Actor-Critic

no code implementations29 May 2021 Junyu Zhang, Amrit Singh Bedi, Mengdi Wang, Alec Koppel

DSAC augments the classic critic step by requiring agents to (i) estimate their local occupancy measure in order to (ii) estimate the derivative of the local utility with respect to their occupancy measure, i.e., the "shadow reward".
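A toy sketch of steps (i)-(ii) on a hypothetical four-state example, with entropy standing in for a general utility: estimate the occupancy measure from visit counts, then differentiate the utility with respect to it to obtain the shadow reward.

```python
import numpy as np

# (i) estimate the local occupancy measure from (hypothetical) state-visit counts
counts = np.array([40.0, 30.0, 20.0, 10.0])
lam = counts / counts.sum()

# a general utility of the occupancy measure: here, its entropy
def utility(lam):
    return -np.sum(lam * np.log(lam))

# (ii) the "shadow reward" is the derivative of the utility w.r.t. lam:
# d/d lam_s of -sum(lam * log(lam)) = -(log(lam_s) + 1)
shadow_reward = -(np.log(lam) + 1.0)
```

Once the shadow reward is in hand, it can be fed to an otherwise standard actor-critic update in place of an environment reward.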

Multi-agent Reinforcement Learning

On the Convergence and Sample Efficiency of Variance-Reduced Policy Gradient Method

no code implementations NeurIPS 2021 Junyu Zhang, Chengzhuo Ni, Zheng Yu, Csaba Szepesvari, Mengdi Wang

By assuming the overparameterization of the policy and exploiting the hidden convexity of the problem, we further show that TSIVR-PG converges to a global $\epsilon$-optimal policy with $\tilde{\mathcal{O}}(\epsilon^{-2})$ samples.

Reinforcement Learning (RL)

Variational Policy Gradient Method for Reinforcement Learning with General Utilities

no code implementations NeurIPS 2020 Junyu Zhang, Alec Koppel, Amrit Singh Bedi, Csaba Szepesvari, Mengdi Wang

Analogously to the Policy Gradient Theorem (Sutton et al., 2000) available for RL with cumulative rewards, we derive a new Variational Policy Gradient Theorem for RL with general utilities, which establishes that the parametrized policy gradient may be obtained as the solution of a stochastic saddle point problem involving the Fenchel dual of the utility function.
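A sketch of the structure behind this statement (notation ours, using the concave conjugate $F_*$): writing the objective as a concave utility $F$ of the occupancy measure $\lambda^{\pi_\theta}$,

$$R(\theta) = F(\lambda^{\pi_\theta}) = \inf_{z}\ \langle \lambda^{\pi_\theta}, z\rangle - F_*(z),$$

where $\langle \lambda^{\pi_\theta}, z\rangle$ is an ordinary cumulative-reward value function with instantaneous reward $z$. Maximizing over $\theta$ while minimizing over the dual variable $z$ yields the stochastic saddle point problem referenced above.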

Reinforcement Learning (RL) +1

Wireless Communication Based on Microwave Photon-Level Detection With Superconducting Devices: Achievable Rate Prediction

no code implementations25 Jun 2020 Junyu Zhang, Chen Gong, Shangbin Li, Rui Ni, Chengjie Zuo, Jinkang Zhu, Ming Zhao, Zhengyuan Xu

Future wireless communication systems embrace physical-layer signal detection with high sensitivity, especially at the microwave photon level.

On the Divergence of Decentralized Non-Convex Optimization

no code implementations20 Jun 2020 Mingyi Hong, Siliang Zeng, Junyu Zhang, Haoran Sun

However, by constructing counter-examples, we show that when certain local Lipschitz conditions (LLC) on the local gradients $\nabla f_i$ are not satisfied, most existing decentralized algorithms diverge, even if the global Lipschitz condition (GLC) is satisfied, i.e., the sum function $f$ has a Lipschitz gradient.
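For reference, a minimal decentralized gradient descent (DGD) sketch on two quadratic local functions, which do satisfy LLC, so the method behaves well here; the paper's counter-examples arise precisely when LLC fails.

```python
import numpy as np

# local objectives f1(x) = (x-1)^2, f2(x) = (x+1)^2; f = f1 + f2 is minimized at 0
grads = [lambda x: 2 * (x - 1), lambda x: 2 * (x + 1)]
W = np.array([[0.5, 0.5], [0.5, 0.5]])   # doubly stochastic mixing matrix
x = np.array([5.0, -5.0])                # each agent's local iterate
alpha = 0.1                              # constant stepsize

for _ in range(100):
    mixed = W @ x                                              # consensus step
    local = np.array([g(xi) for g, xi in zip(grads, x)])       # local gradients
    x = mixed - alpha * local                                  # DGD update
```

Note that even in this benign case, constant-stepsize DGD converges only to a neighborhood of the optimum (here $x_i \to \pm 1/6$ rather than $0$), a separate and well-known effect.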

Weak Radio Frequency Signal Detection Based on Piezo-Opto-Electro-Mechanical System: Architecture Design and Sensitivity Prediction

no code implementations29 Mar 2020 Shanchi Wu, Chen Gong, Chengjie Zuo, Shangbin Li, Junyu Zhang, Zhongbin Dai, Kai Yang, Ming Zhao, Rui Ni, Zhengyuan Xu, Jinkang Zhu

We propose a novel radio-frequency (RF) receiving architecture based on micro-electro-mechanical system (MEMS) and optical coherent detection module.

Cautious Reinforcement Learning via Distributional Risk in the Dual Domain

no code implementations27 Feb 2020 Junyu Zhang, Amrit Singh Bedi, Mengdi Wang, Alec Koppel

To ameliorate this issue, we propose a new definition of risk, which we call caution, as a penalty function added to the dual objective of the linear programming (LP) formulation of reinforcement learning.
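Concretely (notation ours, with $\xi$ the initial-state distribution), the cautious problem adds a penalty $\rho$ on the occupancy measure $\lambda$ to the dual LP of discounted RL:

$$\max_{\lambda \ge 0}\ \langle r, \lambda\rangle - \rho(\lambda) \quad \text{s.t.}\quad \sum_{a} \lambda(s,a) = (1-\gamma)\,\xi(s) + \gamma \sum_{s',a'} P(s \mid s',a')\,\lambda(s',a') \quad \forall s,$$

so that risk is expressed directly as a distributional penalty in the dual domain rather than on returns.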

Reinforcement Learning (RL)

Multi-Level Composite Stochastic Optimization via Nested Variance Reduction

no code implementations29 Aug 2019 Junyu Zhang, Lin Xiao

We consider multi-level composite optimization problems where each mapping in the composition is the expectation over a family of random smooth mappings or the sum of some finite number of smooth mappings.
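In symbols, the problem class is the nested composition

$$\min_{x}\ F(x) := f_K\big(f_{K-1}(\cdots f_1(x) \cdots)\big), \qquad f_i(y) = \mathbb{E}_{\xi_i}\big[f_{i,\xi_i}(y)\big] \ \ \text{or} \ \ f_i(y) = \frac{1}{N_i}\sum_{j=1}^{N_i} f_{i,j}(y),$$

where each level $f_i$ is a smooth mapping given only through random samples or a finite sum.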

Stochastic Optimization

From low probability to high confidence in stochastic convex optimization

no code implementations31 Jul 2019 Damek Davis, Dmitriy Drusvyatskiy, Lin Xiao, Junyu Zhang

Standard results in stochastic convex optimization bound the number of samples that an algorithm needs to generate a point with small function value in expectation.
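A generic illustration of the low-probability-to-high-confidence idea (a standard boosting sketch, not necessarily the paper's specific procedure): run a cheap stochastic method several times and keep the candidate with the best value on fresh samples.

```python
import numpy as np

def sgd_run(seed, steps=300, lr=0.05):
    # SGD on f(x) = E[(x - z)^2] with z ~ N(0,1); the minimizer is x* = 0
    rng = np.random.default_rng(seed)
    x = 5.0
    for _ in range(steps):
        z = rng.normal()
        x -= lr * 2 * (x - z)          # stochastic gradient of (x - z)^2
    return x

# each run alone is only accurate "in expectation" / with constant probability
candidates = [sgd_run(s) for s in range(10)]

# confidence-boosting step: estimate f on fresh samples, keep the best candidate
eval_rng = np.random.default_rng(123)
def f_hat(x, m=2000):
    z = eval_rng.normal(size=m)
    return np.mean((x - z) ** 2)

best = min(candidates, key=f_hat)
```

Since the probability that all independent runs fail decays geometrically in the number of runs, the selected candidate is good with high confidence at only a logarithmic overhead.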

Stochastic Optimization

A Stochastic Composite Gradient Method with Incremental Variance Reduction

no code implementations NeurIPS 2019 Junyu Zhang, Lin Xiao

We show that this method achieves the same orders of complexity as the best known first-order methods for minimizing expected-value and finite-sum nonconvex functions, despite the additional outer composition which renders the composite gradient estimator biased.
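A quick numeric illustration (a hypothetical toy, not the paper's estimator) of why the outer composition biases the plug-in gradient estimator: for outer map $f(y) = y^3$ and inner expectation $g(x) = \mathbb{E}[x + \xi]$, we have $\mathbb{E}[f'(x+\xi)] \neq f'(\mathbb{E}[x+\xi])$.

```python
import numpy as np

rng = np.random.default_rng(3)
x, sigma = 1.0, 2.0

# inner map g(x) = E[x + xi] = x with xi ~ N(0, sigma^2); outer f(y) = y^3
# true gradient of f(g(x)) is f'(g(x)) * g'(x) = 3 * x**2 = 3.0
true_grad = 3 * x**2

# plug-in estimator: average f'(g_hat) over noisy inner samples g_hat = x + xi
g_hat = x + sigma * rng.normal(size=200_000)
plug_in = np.mean(3 * g_hat**2)   # concentrates near 3*(x^2 + sigma^2) = 15, not 3
```

The gap $3\sigma^2$ does not vanish with more samples, which is why composite methods need variance reduction on the inner mapping itself rather than simply averaging more gradients.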

Highly accurate model for prediction of lung nodule malignancy with CT scans

no code implementations6 Feb 2018 Jason Causey, Junyu Zhang, Shiqian Ma, Bo Jiang, Jake Qualls, David G. Politte, Fred Prior, Shuzhong Zhang, Xiuzhen Huang

Here we present NoduleX, a systematic approach to predicting lung nodule malignancy from CT data, based on deep convolutional neural networks (CNNs).

Computed Tomography (CT)

Primal-Dual Optimization Algorithms over Riemannian Manifolds: an Iteration Complexity Analysis

no code implementations5 Oct 2017 Junyu Zhang, Shiqian Ma, Shuzhong Zhang

For prohibitively large tensor or machine learning models, we present a sampling-based stochastic algorithm with the same iteration complexity bound in expectation.

BIG-bench Machine Learning Community Detection +1
