Search Results for author: Hanjing Wang

Found 12 papers, 2 papers with code

ReMA: Learning to Meta-think for LLMs with Multi-Agent Reinforcement Learning

no code implementations 12 Mar 2025 Ziyu Wan, Yunxiang Li, Yan Song, Hanjing Wang, Linyi Yang, Mark Schmidt, Jun Wang, Weinan Zhang, Shuyue Hu, Ying Wen

Recent research on reasoning in Large Language Models (LLMs) has sought to further enhance their performance by integrating meta-thinking -- enabling models to monitor, evaluate, and control their reasoning processes for more adaptive and effective problem-solving.

Multi-agent Reinforcement Learning, reinforcement-learning

Boost, Disentangle, and Customize: A Robust System2-to-System1 Pipeline for Code Generation

no code implementations 18 Feb 2025 Kounianhua Du, Hanjing Wang, Jianxing Liu, Jizheng Chen, Xinyi Dai, Yasheng Wang, Ruiming Tang, Yong Yu, Jun Wang, Weinan Zhang

This work lays the groundwork for advancing LLM capabilities in complex reasoning tasks, offering a novel System2-to-System1 solution.

Code Generation

Epistemic Uncertainty Quantification For Pre-trained Neural Network

no code implementations 15 Apr 2024 Hanjing Wang, Qiang Ji

Specifically, we propose a gradient-based approach to assess epistemic uncertainty, analyzing the gradients of outputs relative to model parameters, and thereby indicating necessary model adjustments to accurately represent the inputs.

Active Learning, Out-of-Distribution Detection

Epistemic Uncertainty Quantification For Pre-Trained Neural Networks

no code implementations CVPR 2024 Hanjing Wang, Qiang Ji

Specifically, we propose a gradient-based approach to assess epistemic uncertainty, analyzing the gradients of outputs relative to model parameters, and thereby indicating necessary model adjustments to accurately represent the inputs.

Active Learning, Out-of-Distribution Detection

GEAR: A GPU-Centric Experience Replay System for Large Reinforcement Learning Models

1 code implementation 8 Oct 2023 Hanjing Wang, Man-Kit Sit, Congjie He, Ying Wen, Weinan Zhang, Jun Wang, Yaodong Yang, Luo Mai

This paper introduces a distributed, GPU-centric experience replay system, GEAR, designed to perform scalable reinforcement learning (RL) with large sequence models (such as transformers).

Reinforcement Learning (RL)

Body Knowledge and Uncertainty Modeling for Monocular 3D Human Body Reconstruction

no code implementations ICCV 2023 Yufei Zhang, Hanjing Wang, Jeffrey O. Kephart, Qiang Ji

While 3D body reconstruction methods have made remarkable progress recently, it remains difficult to acquire the sufficiently accurate and numerous 3D supervisions required for training.

3D Reconstruction

Large Sequence Models for Sequential Decision-Making: A Survey

no code implementations 24 Jun 2023 Muning Wen, Runji Lin, Hanjing Wang, Yaodong Yang, Ying Wen, Luo Mai, Jun Wang, Haifeng Zhang, Weinan Zhang

Transformer architectures have facilitated the development of large-scale and general-purpose sequence models for prediction tasks in natural language processing and computer vision, e.g., GPT-3 and Swin Transformer.

Decision Making, Sequential Decision Making

MALib: A Parallel Framework for Population-based Multi-agent Reinforcement Learning

1 code implementation 5 Jun 2021 Ming Zhou, Ziyu Wan, Hanjing Wang, Muning Wen, Runzhe Wu, Ying Wen, Yaodong Yang, Weinan Zhang, Jun Wang

Our framework is comprised of three key components: (1) a centralized task dispatching model, which supports the self-generated tasks and scalable training with heterogeneous policy combinations; (2) a programming architecture named Actor-Evaluator-Learner, which achieves high parallelism for both training and sampling, and meets the evaluation requirement of auto-curriculum learning; (3) a higher-level abstraction of MARL training paradigms, which enables efficient code reuse and flexible deployments on different distributed computing paradigms.

Atari Games, Distributed Computing

Towards Efficient Discrete Integration via Adaptive Quantile Queries

no code implementations 13 Oct 2019 Fan Ding, Hanjing Wang, Ashish Sabharwal, Yexiang Xue

On a suite of UAI inference challenge benchmarks, it saves 81.5% of WISH queries while retaining the quality of results.

DCNN-GAN: Reconstructing Realistic Image from fMRI

no code implementations 13 Jan 2019 Yunfeng Lin, Jiangbei Li, Hanjing Wang

Visualizing the perceptual content by analyzing human functional magnetic resonance imaging (fMRI) has been an active research area.

Generative Adversarial Network
