Search Results for author: Zhaopeng Meng

Found 25 papers, 9 papers with code

Qibo: A Large Language Model for Traditional Chinese Medicine

no code implementations 24 Mar 2024 Heyi Zhang, Xin Wang, Zhaopeng Meng, Yongzhe Jia, Dawei Xu

Furthermore, we develop the Qibo-benchmark, a specialized tool for evaluating the performance of LLMs in the TCM domain.

Language Modelling Large Language Model

Learning with Noisy Labels Using Collaborative Sample Selection and Contrastive Semi-Supervised Learning

no code implementations 24 Oct 2023 Qing Miao, Xiaohe Wu, Chao Xu, Yanli Ji, WangMeng Zuo, Yiwen Guo, Zhaopeng Meng

By incorporating auxiliary information from CLIP and utilizing prompt fine-tuning, we effectively eliminate noisy samples from the clean set and mitigate confirmation bias during training.

Learning with noisy labels

Improving Offline-to-Online Reinforcement Learning with Q-Ensembles

no code implementations 12 Jun 2023 Kai Zhao, Yi Ma, Jianye Hao, Jinyi Liu, Yan Zheng, Zhaopeng Meng

Offline reinforcement learning (RL) is a learning paradigm where an agent learns from a fixed dataset of experience.

Offline RL reinforcement-learning +1

HIPODE: Enhancing Offline Reinforcement Learning with High-Quality Synthetic Data from a Policy-Decoupled Approach

no code implementations 10 Jun 2023 Shixi Lian, Yi Ma, Jinyi Liu, Yan Zheng, Zhaopeng Meng

Offline reinforcement learning (ORL) has gained attention as a means of training reinforcement learning models using pre-collected static data.

D4RL Data Augmentation +1

ERL-Re$^2$: Efficient Evolutionary Reinforcement Learning with Shared State Representation and Individual Policy Representation

1 code implementation 26 Oct 2022 Jianye Hao, Pengyi Li, Hongyao Tang, Yan Zheng, Xian Fu, Zhaopeng Meng

The state representation conveys expressive common features of the environment learned by all the agents collectively; the linear policy representation provides a favorable space for efficient policy optimization, where novel behavior-level crossover and mutation operations can be performed.

Continuous Control Evolutionary Algorithms +2

PAnDR: Fast Adaptation to New Environments from Offline Experiences via Decoupling Policy and Environment Representations

1 code implementation 6 Apr 2022 Tong Sang, Hongyao Tang, Yi Ma, Jianye Hao, Yan Zheng, Zhaopeng Meng, Boyan Li, Zhen Wang

In the online adaptation phase, with the environment context inferred from a few experiences collected in new environments, the policy is optimized by gradient ascent with respect to the PDVF.

Contrastive Learning Decision Making

A Hierarchical Reinforcement Learning Based Optimization Framework for Large-scale Dynamic Pickup and Delivery Problems

no code implementations NeurIPS 2021 Yi Ma, Xiaotian Hao, Jianye Hao, Jiawen Lu, Xing Liu, Tong Xialiang, Mingxuan Yuan, Zhigang Li, Jie Tang, Zhaopeng Meng

To address this problem, existing methods partition the overall DPDP into fixed-size sub-problems by caching online generated orders and solve each sub-problem, or, on this basis, use the predicted future orders to optimize each sub-problem further.

Hierarchical Reinforcement Learning

Uncertainty-aware Low-Rank Q-Matrix Estimation for Deep Reinforcement Learning

no code implementations 19 Nov 2021 Tong Sang, Hongyao Tang, Jianye Hao, Yan Zheng, Zhaopeng Meng

Such a reconstruction exploits the underlying structure of the value matrix to improve the value approximation, thus leading to a more efficient learning process for the value function.

Continuous Control reinforcement-learning +1

Exploration in Deep Reinforcement Learning: From Single-Agent to Multiagent Domain

no code implementations 14 Sep 2021 Jianye Hao, Tianpei Yang, Hongyao Tang, Chenjia Bai, Jinyi Liu, Zhaopeng Meng, Peng Liu, Zhen Wang

In addition to algorithmic analysis, we provide a comprehensive and unified empirical comparison of different exploration methods for DRL on a set of commonly used benchmarks.

Autonomous Vehicles Efficient Exploration +3

Domain adaptation based self-correction model for COVID-19 infection segmentation in CT images

1 code implementation 20 Apr 2021 Qiangguo Jin, Hui Cui, Changming Sun, Zhaopeng Meng, Leyi Wei, Ran Su

DASC-Net consists of a novel attention and feature domain enhanced domain adaptation model (AFD-DA) to solve the domain shifts and a self-correction learning process to refine segmentation results.

Domain Adaptation Segmentation

Free-form tumor synthesis in computed tomography images via richer generative adversarial network

1 code implementation 20 Apr 2021 Qiangguo Jin, Hui Cui, Changming Sun, Zhaopeng Meng, Ran Su

The network is composed of a new richer convolutional feature enhanced dilated-gated generator (RicherDG) and a hybrid loss function.

Computed Tomography (CT) Generative Adversarial Network

Addressing Action Oscillations through Learning Policy Inertia

no code implementations 3 Mar 2021 Chen Chen, Hongyao Tang, Jianye Hao, Wulong Liu, Zhaopeng Meng

We propose Nested Policy Iteration as a general training algorithm for PIC-augmented policy which ensures monotonically non-decreasing updates under some mild conditions.

Atari Games Autonomous Driving +1

What About Inputing Policy in Value Function: Policy Representation and Policy-extended Value Function Approximator

1 code implementation NeurIPS 2021 Hongyao Tang, Zhaopeng Meng, Jianye Hao, Chen Chen, Daniel Graves, Dong Li, Changmin Yu, Hangyu Mao, Wulong Liu, Yaodong Yang, Wenyuan Tao, Li Wang

We study Policy-extended Value Function Approximator (PeVFA) in Reinforcement Learning (RL), which extends conventional value function approximator (VFA) to take as input not only the state (and action) but also an explicit policy representation.

Continuous Control Contrastive Learning +3

Continuous Multiagent Control using Collective Behavior Entropy for Large-Scale Home Energy Management

no code implementations 14 May 2020 Jianwen Sun, Yan Zheng, Jianye Hao, Zhaopeng Meng, Yang Liu

With the increasing popularity of electric vehicles, distributed energy generation and storage facilities in smart grid systems, efficient Demand-Side Management (DSM) is urgently needed for energy savings and peak load reduction.

energy management Management

Efficient Deep Reinforcement Learning via Adaptive Policy Transfer

1 code implementation 19 Feb 2020 Tianpei Yang, Jianye Hao, Zhaopeng Meng, Zongzhang Zhang, Yujing Hu, Yingfeng Cheng, Changjie Fan, Weixun Wang, Wulong Liu, Zhaodong Wang, Jiajie Peng

Transfer Learning (TL) has shown great potential to accelerate Reinforcement Learning (RL) by leveraging prior knowledge from past learned policies of relevant tasks.

reinforcement-learning Reinforcement Learning (RL) +1

A Deep Bayesian Policy Reuse Approach Against Non-Stationary Agents

no code implementations NeurIPS 2018 Yan Zheng, Zhaopeng Meng, Jianye Hao, Zongzhang Zhang, Tianpei Yang, Changjie Fan

In multiagent domains, coping with non-stationary agents that change behaviors from time to time is a challenging problem, where an agent is usually required to be able to quickly detect the other agent's policy during online interaction, and then adapt its own policy accordingly.

RA-UNet: A hybrid deep attention-aware network to extract liver and tumor in CT scans

1 code implementation 4 Nov 2018 Qiangguo Jin, Zhaopeng Meng, Changming Sun, Leyi Wei, Ran Su

Automatic extraction of liver and tumor from CT volumes is a challenging task due to their heterogeneous and diffusive shapes.

Brain Tumor Segmentation Deep Attention +3

DUNet: A deformable network for retinal vessel segmentation

no code implementations 3 Nov 2018 Qiangguo Jin, Zhaopeng Meng, Tuan D. Pham, Qi Chen, Leyi Wei, Ran Su

Results show that more detailed vessels are extracted by DUNet and it exhibits state-of-the-art performance for retinal vessel segmentation with a global accuracy of 0.9697/0.9722/0.9724 and AUC of 0.9856/0.9868/0.9863 on DRIVE, STARE and CHASE_DB1 respectively.

Retinal Vessel Segmentation Segmentation

Hierarchical Deep Multiagent Reinforcement Learning with Temporal Abstraction

no code implementations 25 Sep 2018 Hongyao Tang, Jianye Hao, Tangjie Lv, Yingfeng Chen, Zongzhang Zhang, Hangtian Jia, Chunxu Ren, Yan Zheng, Zhaopeng Meng, Changjie Fan, Li Wang

Besides, we propose a new experience replay mechanism to alleviate the issue of the sparse transitions at the high level of abstraction and the non-stationarity of multiagent learning.

reinforcement-learning Reinforcement Learning (RL)

Towards Efficient Detection and Optimal Response against Sophisticated Opponents

no code implementations 12 Sep 2018 Tianpei Yang, Zhaopeng Meng, Jianye Hao, Chongjie Zhang, Yan Zheng, Ze Zheng

This paper proposes a novel approach called Bayes-ToMoP which can efficiently detect the strategy of opponents using either stationary or higher-level reasoning strategies.

Multiagent Systems
