Search Results for author: Huaimin Wang

Found 20 papers, 5 papers with code

Uncertainty-Penalized Reinforcement Learning from Human Feedback with Diverse Reward LoRA Ensembles

no code implementations • 30 Dec 2023 • Yuanzhao Zhai, Han Zhang, Yu Lei, Yue Yu, Kele Xu, Dawei Feng, Bo Ding, Huaimin Wang

Reinforcement learning from human feedback (RLHF) emerges as a promising paradigm for aligning large language models (LLMs).

Uncertainty Quantification

Intelligent Computing: The Latest Advances, Challenges and Future

no code implementations • 21 Nov 2022 • Shiqiang Zhu, Ting Yu, Tao Xu, Hongyang Chen, Schahram Dustdar, Sylvain Gigan, Deniz Gunduz, Ekram Hossain, Yaochu Jin, Feng Lin, Bo Liu, Zhiguo Wan, Ji Zhang, Zhifeng Zhao, Wentao Zhu, Zuoning Chen, Tariq Durrani, Huaimin Wang, Jiangxing Wu, Tongyi Zhang, Yunhe Pan

In recent years, we have witnessed the emergence of intelligent computing, a new computing paradigm that is reshaping traditional computing and promoting digital revolution in the era of big data, artificial intelligence and internet-of-things with new computing theories, architectures, methods, systems, and applications.

Self-Supervised Exploration via Temporal Inconsistency in Reinforcement Learning

no code implementations • 24 Aug 2022 • Zijian Gao, Kele Xu, Yuanzhao Zhai, Dawei Feng, Bo Ding, XinJun Mao, Huaimin Wang

Our method involves training a self-supervised prediction model, saving snapshots of the model parameters, and using the nuclear norm to evaluate the temporal inconsistency between the predictions of different snapshots as intrinsic rewards.

reinforcement-learning, Reinforcement Learning (RL)

Dynamic Memory-based Curiosity: A Bootstrap Approach for Exploration

no code implementations • 24 Aug 2022 • Zijian Gao, Yiying Li, Kele Xu, Yuanzhao Zhai, Dawei Feng, Bo Ding, XinJun Mao, Huaimin Wang

Curiosity is aroused if the memorized information cannot deal with the current state; the information gap between the dual learners is formulated as the intrinsic reward for agents, and such state information is then consolidated into the dynamic memory.

Reinforcement Learning (RL)

Trusted Multi-Scale Classification Framework for Whole Slide Image

no code implementations • 12 Jul 2022 • Ming Feng, Kele Xu, Nanhui Wu, Weiquan Huang, Yan Bai, Changjian Wang, Huaimin Wang

Leveraging the Vision Transformer as the backbone for multiple branches, our framework can jointly perform classification modeling, estimate the uncertainty of each microscope magnification, and integrate the evidence from different magnifications.


Nuclear Norm Maximization Based Curiosity-Driven Learning

no code implementations • 21 May 2022 • Chao Chen, Zijian Gao, Kele Xu, Sen Yang, Yiying Li, Bo Ding, Dawei Feng, Huaimin Wang

To handle the sparsity of extrinsic rewards in reinforcement learning, researchers have proposed intrinsic rewards, which enable the agent to learn skills that might come in handy when pursuing rewards in the future, for example by encouraging the agent to visit novel states.

Atari Games

Multi-task Pre-training Language Model for Semantic Network Completion

1 code implementation • 13 Jan 2022 • Da Li, Sen Yang, Kele Xu, Ming Yi, Yukai He, Huaimin Wang

To demonstrate the effectiveness of our method, we conduct extensive experiments on three widely-used datasets, WN18RR, FB15k-237, and UMLS.

Contrastive Learning, Data Augmentation, +3

KnowSR: Knowledge Sharing among Homogeneous Agents in Multi-agent Reinforcement Learning

no code implementations • 25 May 2021 • Zijian Gao, Kele Xu, Bo Ding, Huaimin Wang, Yiying Li, Hongda Jia

In this paper, we present KnowSR, an adaptation method for the majority of multi-agent reinforcement learning (MARL) algorithms that takes advantage of the differences in learning between agents.

Knowledge Distillation, Multi-agent Reinforcement Learning, +2

KnowRU: Knowledge Reusing via Knowledge Distillation in Multi-agent Reinforcement Learning

no code implementations • 27 Mar 2021 • Zijian Gao, Kele Xu, Bo Ding, Huaimin Wang, Yiying Li, Hongda Jia

In this paper, we propose a method named "KnowRU" for knowledge reuse, which can be easily deployed in the majority of multi-agent reinforcement learning algorithms without complicated hand-coded design.

Knowledge Distillation, Multi-agent Reinforcement Learning, +2

FedH2L: Federated Learning with Model and Statistical Heterogeneity

no code implementations • 27 Jan 2021 • Yiying Li, Wei Zhou, Huaimin Wang, Haibo Mi, Timothy M. Hospedales

Federated learning (FL) enables distributed participants to collectively learn a strong global model without sacrificing their individual data privacy.

Federated Learning

Audio Tagging by Cross Filtering Noisy Labels

no code implementations • 16 Jul 2020 • Boqing Zhu, Kele Xu, Qiuqiang Kong, Huaimin Wang, Yuxing Peng

Yet, it is labor-intensive to accurately annotate large amounts of audio data, and the dataset may contain noisy labels in practical settings.

Audio Tagging, Memorization, +1

Attention-based Fault-tolerant Approach for Multi-agent Reinforcement Learning Systems

no code implementations • 5 Oct 2019 • Mingyang Geng, Kele Xu, Yiying Li, Shuqi Liu, Bo Ding, Huaimin Wang

The aim of multi-agent reinforcement learning systems is to provide interacting agents with the ability to collaboratively learn and adapt to the behavior of other agents.

Multi-agent Reinforcement Learning, reinforcement-learning, +1

Predicting tongue motion in unlabeled ultrasound videos using convolutional LSTM neural network

1 code implementation • 19 Feb 2019 • Chaojie Zhao, Peng Zhang, Jian Zhu, Chengrui Wu, Huaimin Wang, Kele Xu

A challenge in speech production research is to predict future tongue movements based on a short period of past tongue movements.

motion prediction

Unsupervised Learning-based Depth Estimation aided Visual SLAM Approach

no code implementations • 22 Jan 2019 • Mingyang Geng, Suning Shang, Bo Ding, Huaimin Wang, Pengfei Zhang, Lei Zhang

Furthermore, we successfully exploit our unsupervised learning framework to assist the traditional ORB-SLAM system when the initialization module of the ORB-SLAM method cannot match enough features.

Depth And Camera Motion, Image Reconstruction, +2

Learning data augmentation policies using augmented random search

no code implementations • 12 Nov 2018 • Mingyang Geng, Kele Xu, Bo Ding, Huaimin Wang, Lei Zhang

AutoAugment searches for the augmentation policies in a discrete search space, which may lead to a sub-optimal solution.

Data Augmentation, reinforcement-learning, +1

Collaborative Deep Learning Across Multiple Data Centers

no code implementations • 16 Oct 2018 • Kele Xu, Haibo Mi, Dawei Feng, Huaimin Wang, Chuan Chen, Zibin Zheng, Xu Lan

Valuable training data is often owned by independent organizations and located in multiple data centers.
