Search Results for author: Zhaopeng Meng

Found 25 papers, 9 papers with code

Qibo: A Large Language Model for Traditional Chinese Medicine

no code implementations • 24 Mar 2024 • Heyi Zhang, Xin Wang, Zhaopeng Meng, Yongzhe Jia, Dawei Xu

Furthermore, we develop the Qibo-benchmark, a specialized tool for evaluating the performance of LLMs in the TCM domain.

Language Modelling Large Language Model

Learning with Noisy Labels Using Collaborative Sample Selection and Contrastive Semi-Supervised Learning

no code implementations • 24 Oct 2023 • Qing Miao, Xiaohe Wu, Chao Xu, Yanli Ji, WangMeng Zuo, Yiwen Guo, Zhaopeng Meng

By incorporating auxiliary information from CLIP and utilizing prompt fine-tuning, we effectively eliminate noisy samples from the clean set and mitigate confirmation bias during training.

Learning with noisy labels

Improving Offline-to-Online Reinforcement Learning with Q-Ensembles

no code implementations • 12 Jun 2023 • Kai Zhao, Yi Ma, Jianye Hao, Jinyi Liu, Yan Zheng, Zhaopeng Meng

Offline reinforcement learning (RL) is a learning paradigm where an agent learns from a fixed dataset of experience.

Offline RL reinforcement-learning +1

HIPODE: Enhancing Offline Reinforcement Learning with High-Quality Synthetic Data from a Policy-Decoupled Approach

no code implementations • 10 Jun 2023 • Shixi Lian, Yi Ma, Jinyi Liu, Yan Zheng, Zhaopeng Meng

Offline reinforcement learning (ORL) has gained attention as a means of training reinforcement learning models using pre-collected static data.

D4RL Data Augmentation +1

ERL-Re$^2$: Efficient Evolutionary Reinforcement Learning with Shared State Representation and Individual Policy Representation

1 code implementation • 26 Oct 2022 • Jianye Hao, Pengyi Li, Hongyao Tang, Yan Zheng, Xian Fu, Zhaopeng Meng

The state representation conveys expressive common features of the environment learned by all the agents collectively; the linear policy representation provides a favorable space for efficient policy optimization, where novel behavior-level crossover and mutation operations can be performed.

Continuous Control Evolutionary Algorithms +2

PAnDR: Fast Adaptation to New Environments from Offline Experiences via Decoupling Policy and Environment Representations

1 code implementation • 6 Apr 2022 • Tong Sang, Hongyao Tang, Yi Ma, Jianye Hao, Yan Zheng, Zhaopeng Meng, Boyan Li, Zhen Wang

In the online adaptation phase, with the environment context inferred from a few experiences collected in new environments, the policy is optimized by gradient ascent with respect to the PDVF.

Contrastive Learning Decision Making

A Hierarchical Reinforcement Learning Based Optimization Framework for Large-scale Dynamic Pickup and Delivery Problems

no code implementations • NeurIPS 2021 • Yi Ma, Xiaotian Hao, Jianye Hao, Jiawen Lu, Xing Liu, Tong Xialiang, Mingxuan Yuan, Zhigang Li, Jie Tang, Zhaopeng Meng

To address this problem, existing methods partition the overall DPDP into fixed-size sub-problems by caching online-generated orders and solve each sub-problem, or additionally use predicted future orders to optimize each sub-problem further.

Hierarchical Reinforcement Learning

Uncertainty-aware Low-Rank Q-Matrix Estimation for Deep Reinforcement Learning

no code implementations • 19 Nov 2021 • Tong Sang, Hongyao Tang, Jianye Hao, Yan Zheng, Zhaopeng Meng

Such a reconstruction exploits the underlying structure of the value matrix to improve the value approximation, thus leading to a more efficient learning process for the value function.

Continuous Control reinforcement-learning +1

Exploration in Deep Reinforcement Learning: From Single-Agent to Multiagent Domain

no code implementations • 14 Sep 2021 • Jianye Hao, Tianpei Yang, Hongyao Tang, Chenjia Bai, Jinyi Liu, Zhaopeng Meng, Peng Liu, Zhen Wang

In addition to algorithmic analysis, we provide a comprehensive and unified empirical comparison of different exploration methods for DRL on a set of commonly used benchmarks.

Autonomous Vehicles Efficient Exploration +3

HyAR: Addressing Discrete-Continuous Action Reinforcement Learning via Hybrid Action Representation

1 code implementation • ICLR 2022 • Boyan Li, Hongyao Tang, Yan Zheng, Jianye Hao, Pengyi Li, Zhen Wang, Zhaopeng Meng, Li Wang

Discrete-continuous hybrid action space is a natural setting in many practical problems, such as robot control and game AI.

reinforcement-learning Reinforcement Learning (RL)

Free-form tumor synthesis in computed tomography images via richer generative adversarial network

1 code implementation • 20 Apr 2021 • Qiangguo Jin, Hui Cui, Changming Sun, Zhaopeng Meng, Ran Su

The network is composed of a new richer convolutional feature enhanced dilated-gated generator (RicherDG) and a hybrid loss function.

Computed Tomography (CT) Generative Adversarial Network

Domain adaptation based self-correction model for COVID-19 infection segmentation in CT images

1 code implementation • 20 Apr 2021 • Qiangguo Jin, Hui Cui, Changming Sun, Zhaopeng Meng, Leyi Wei, Ran Su

DASC-Net consists of a novel attention and feature domain enhanced domain adaptation model (AFD-DA) to solve the domain shifts and a self-correction learning process to refine segmentation results.

Domain Adaptation Segmentation

Foresee then Evaluate: Decomposing Value Estimation with Latent Future Prediction

1 code implementation • 3 Mar 2021 • Hongyao Tang, Jianye Hao, Guangyong Chen, Pengfei Chen, Chen Chen, Yaodong Yang, Luo Zhang, Wulong Liu, Zhaopeng Meng

The value function is the central notion of Reinforcement Learning (RL).

Continuous Control Future prediction +2

Addressing Action Oscillations through Learning Policy Inertia

no code implementations • 3 Mar 2021 • Chen Chen, Hongyao Tang, Jianye Hao, Wulong Liu, Zhaopeng Meng

We propose Nested Policy Iteration as a general training algorithm for PIC-augmented policy which ensures monotonically non-decreasing updates under some mild conditions.

Atari Games Autonomous Driving +1

What About Inputing Policy in Value Function: Policy Representation and Policy-extended Value Function Approximator

1 code implementation • NeurIPS 2021 • Hongyao Tang, Zhaopeng Meng, Jianye Hao, Chen Chen, Daniel Graves, Dong Li, Changmin Yu, Hangyu Mao, Wulong Liu, Yaodong Yang, Wenyuan Tao, Li Wang

We study Policy-extended Value Function Approximator (PeVFA) in Reinforcement Learning (RL), which extends conventional value function approximator (VFA) to take as input not only the state (and action) but also an explicit policy representation.

Continuous Control Contrastive Learning +3

Transfer among Agents: An Efficient Multiagent Transfer Learning Framework

no code implementations • 28 Sep 2020 • Tianpei Yang, Jianye Hao, Weixun Wang, Hongyao Tang, Zhaopeng Meng, Hangyu Mao, Dong Li, Wulong Liu, Yujing Hu, Yingfeng Chen, Changjie Fan

In many cases, the agents' experiences are inconsistent with one another, which causes the option-value estimation to oscillate and become inaccurate.

Open-Ended Question Answering Reinforcement Learning (RL) +1

What About Taking Policy as Input of Value Function: Policy-extended Value Function Approximator

no code implementations • 28 Sep 2020 • Hongyao Tang, Zhaopeng Meng, Jianye Hao, Chen Chen, Daniel Graves, Dong Li, Wulong Liu, Yaodong Yang

The value function, which defines the long-term evaluation of a policy in a given state, lies at the heart of Reinforcement Learning (RL).

Continuous Control Contrastive Learning +2

Continuous Multiagent Control using Collective Behavior Entropy for Large-Scale Home Energy Management

no code implementations • 14 May 2020 • Jianwen Sun, Yan Zheng, Jianye Hao, Zhaopeng Meng, Yang Liu

With the increasing popularity of electric vehicles and of distributed energy generation and storage facilities in smart grid systems, efficient Demand-Side Management (DSM) is urgently needed for energy savings and peak load reduction.

energy management Management

Efficient Deep Reinforcement Learning via Adaptive Policy Transfer

1 code implementation • 19 Feb 2020 • Tianpei Yang, Jianye Hao, Zhaopeng Meng, Zongzhang Zhang, Yujing Hu, Yingfeng Cheng, Changjie Fan, Weixun Wang, Wulong Liu, Zhaodong Wang, Jiajie Peng

Transfer Learning (TL) has shown great potential to accelerate Reinforcement Learning (RL) by leveraging prior knowledge from past learned policies of relevant tasks.

reinforcement-learning Reinforcement Learning (RL) +1

Disentangling Dynamics and Returns: Value Function Decomposition with Future Prediction

no code implementations • 27 May 2019 • Hongyao Tang, Jianye Hao, Guangyong Chen, Pengfei Chen, Zhaopeng Meng, Yaodong Yang, Li Wang

Value functions are crucial for model-free Reinforcement Learning (RL) to obtain a policy implicitly or guide the policy updates.

Continuous Control Future prediction +1

A Deep Bayesian Policy Reuse Approach Against Non-Stationary Agents

no code implementations • NeurIPS 2018 • Yan Zheng, Zhaopeng Meng, Jianye Hao, Zongzhang Zhang, Tianpei Yang, Changjie Fan

In multiagent domains, coping with non-stationary agents that change behaviors from time to time is a challenging problem, where an agent is usually required to be able to quickly detect the other agent's policy during online interaction, and then adapt its own policy accordingly.

RA-UNet: A hybrid deep attention-aware network to extract liver and tumor in CT scans

1 code implementation • 4 Nov 2018 • Qiangguo Jin, Zhaopeng Meng, Changming Sun, Leyi Wei, Ran Su

Automatic extraction of liver and tumor from CT volumes is a challenging task due to their heterogeneous and diffusive shapes.

Brain Tumor Segmentation Deep Attention +3

DUNet: A deformable network for retinal vessel segmentation

no code implementations • 3 Nov 2018 • Qiangguo Jin, Zhaopeng Meng, Tuan D. Pham, Qi Chen, Leyi Wei, Ran Su

Results show that more detailed vessels are extracted by DUNet and it exhibits state-of-the-art performance for retinal vessel segmentation with a global accuracy of 0.9697/0.9722/0.9724 and AUC of 0.9856/0.9868/0.9863 on DRIVE, STARE and CHASE_DB1, respectively.

Ranked #4 on Retinal Vessel Segmentation on STARE

Retinal Vessel Segmentation Segmentation

Hierarchical Deep Multiagent Reinforcement Learning with Temporal Abstraction

no code implementations • 25 Sep 2018 • Hongyao Tang, Jianye Hao, Tangjie Lv, Yingfeng Chen, Zongzhang Zhang, Hangtian Jia, Chunxu Ren, Yan Zheng, Zhaopeng Meng, Changjie Fan, Li Wang

Besides, we propose a new experience replay mechanism to alleviate the issue of the sparse transitions at the high level of abstraction and the non-stationarity of multiagent learning.

reinforcement-learning Reinforcement Learning (RL)

Towards Efficient Detection and Optimal Response against Sophisticated Opponents

no code implementations • 12 Sep 2018 • Tianpei Yang, Zhaopeng Meng, Jianye Hao, Chongjie Zhang, Yan Zheng, Ze Zheng

This paper proposes a novel approach called Bayes-ToMoP which can efficiently detect the strategy of opponents using either stationary or higher-level reasoning strategies.

Multiagent Systems
