Search Results for author: Wenpeng Zhang

Found 11 papers, 3 papers with code

Marketing Budget Allocation with Offline Constrained Deep Reinforcement Learning

no code implementations 6 Sep 2023 Tianchi Cai, Jiyan Jiang, Wenpeng Zhang, Shiji Zhou, Xierui Song, Li Yu, Lihong Gu, Xiaodong Zeng, Jinjie Gu, Guannan Zhang

We further show that this method is guaranteed to converge to the optimal policy, which cannot be achieved by previous value-based reinforcement learning methods for marketing budget allocation.

Marketing reinforcement-learning

Model-free Reinforcement Learning with Stochastic Reward Stabilization for Recommender Systems

no code implementations 25 Aug 2023 Tianchi Cai, Shenliao Bao, Jiyan Jiang, Shiji Zhou, Wenpeng Zhang, Lihong Gu, Jinjie Gu, Guannan Zhang

Model-free RL-based recommender systems have recently received increasing research attention due to their capability to handle partial feedback and long-term rewards.

Recommendation Systems reinforcement-learning

Hierarchical Disentanglement-Alignment Network for Robust SAR Vehicle Recognition

1 code implementation 7 Apr 2023 Weijie Li, Wei Yang, Wenpeng Zhang, Tianpeng Liu, Yongxiang Liu, Li Liu

However, robustly recognizing vehicle targets is a challenging task in SAR due to the large intraclass variations and small interclass variations.

Data Augmentation Disentanglement

Discovering and Explaining the Non-Causality of Deep Learning in SAR ATR

2 code implementations 3 Apr 2023 Weijie Li, Wei Yang, Li Liu, Wenpeng Zhang, Yongxiang Liu

Therefore, the degree of overfitting for clutter reflects the non-causality of deep learning in SAR ATR.

Selection bias

Asynchronous Decentralized Online Learning

no code implementations NeurIPS 2021 Jiyan Jiang, Wenpeng Zhang, Jinjie Gu, Wenwu Zhu

To overcome this problem, we study decentralized online learning in the asynchronous setting, which allows different learners to work at their own pace.

Group-based Interleaved Pipeline Parallelism for Large-scale DNN Training

1 code implementation ICLR 2022 Pengcheng Yang, XiaoMing Zhang, Wenpeng Zhang, Ming Yang, Hong Wei

The recent trend of using large-scale deep neural networks (DNN) to boost performance has propelled the development of the parallel pipelining technique for efficient DNN training, which has resulted in the development of several prominent pipelines such as GPipe, PipeDream, and PipeDream-2BW.

Meta Learning with Minimax Regularization

no code implementations 29 Sep 2021 Lianzhe Wang, Shiji Zhou, Shanghang Zhang, Wenpeng Zhang, Heng Chang, Wenwu Zhu

Even though meta-learning has attracted wide research attention in recent years, the generalization problem of meta-learning is still not well addressed.

Few-Shot Learning

Multi-Objective Online Learning

no code implementations 29 Sep 2021 Jiyan Jiang, Wenpeng Zhang, Shiji Zhou, Lihong Gu, Xiaodong Zeng, Wenwu Zhu

This paper presents a systematic study of multi-objective online learning.

A Policy Efficient Reduction Approach to Convex Constrained Deep Reinforcement Learning

no code implementations 29 Aug 2021 Tianchi Cai, Wenpeng Zhang, Lihong Gu, Xiaodong Zeng, Jinjie Gu

To apply value-based methods to CRL, a recent groundbreaking line of game-theoretic approaches uses the mixed policy that randomizes among a set of carefully generated policies to converge to the desired constraint-satisfying policy.

General Reinforcement Learning reinforcement-learning

Online Compact Convexified Factorization Machine

no code implementations 5 Feb 2018 Wenpeng Zhang, Xiao Lin, Peilin Zhao

To address this subsequent challenge, we follow the general projection-free algorithmic framework of Online Conditional Gradient and propose an Online Compact Convex Factorization Machine (OCCFM) algorithm that eschews the projection operation with efficient linear optimization steps.

Binary Classification Feature Engineering

Projection-free Distributed Online Learning in Networks

no code implementations ICML 2017 Wenpeng Zhang, Peilin Zhao, Wenwu Zhu, Steven C. H. Hoi, Tong Zhang

The conditional gradient algorithm has regained a surge of research interest in recent years due to its high efficiency in handling large-scale machine learning problems.
