no code implementations • 12 Mar 2025 • Ziyu Wan, Yunxiang Li, Yan Song, Hanjing Wang, Linyi Yang, Mark Schmidt, Jun Wang, Weinan Zhang, Shuyue Hu, Ying Wen
Recent research on reasoning in Large Language Models (LLMs) has sought to further enhance their performance by integrating meta-thinking -- enabling models to monitor, evaluate, and control their reasoning processes for more adaptive and effective problem-solving.
Multi-agent Reinforcement Learning
reinforcement-learning
no code implementations • 18 Feb 2025 • Kounianhua Du, Hanjing Wang, Jianxing Liu, Jizheng Chen, Xinyi Dai, Yasheng Wang, Ruiming Tang, Yong Yu, Jun Wang, Weinan Zhang
This work lays the groundwork for advancing LLM capabilities in complex reasoning tasks, offering a novel System2-to-System1 solution.
no code implementations • 15 Apr 2024 • Hanjing Wang, Qiang Ji
Specifically, we propose a gradient-based approach to assess epistemic uncertainty, analyzing the gradients of outputs relative to model parameters, and thereby indicating necessary model adjustments to accurately represent the inputs.
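A minimal sketch of this idea, assuming a PyTorch classifier: the score below uses the norm of the gradient of the predicted class's log-probability with respect to the parameters as an epistemic-uncertainty proxy. The function name is hypothetical and the paper's exact estimator may differ.

```python
import torch
import torch.nn.functional as F

def gradient_uncertainty_score(model, x):
    """Epistemic-uncertainty proxy: the norm of the gradient of the
    predicted class's log-probability w.r.t. the model parameters.
    A large norm suggests the parameters would need substantial
    adjustment to represent this input well."""
    model.zero_grad()
    logits = model(x.unsqueeze(0))                  # x: a single unbatched input
    log_prob = F.log_softmax(logits, dim=-1).max()  # log-prob of predicted class
    log_prob.backward()
    sq_norm = sum((p.grad ** 2).sum()
                  for p in model.parameters() if p.grad is not None)
    return torch.sqrt(sq_norm).item()
```

Because the gradients are taken per input, scoring a batch means looping over its examples or using per-sample gradient machinery.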
no code implementations • CVPR 2024 • Hanjing Wang, Qiang Ji
Specifically, we propose a gradient-based approach to assess epistemic uncertainty, analyzing the gradients of outputs relative to model parameters, and thereby indicating necessary model adjustments to accurately represent the inputs.
1 code implementation • 8 Oct 2023 • Hanjing Wang, Man-Kit Sit, Congjie He, Ying Wen, Weinan Zhang, Jun Wang, Yaodong Yang, Luo Mai
This paper introduces a distributed, GPU-centric experience replay system, GEAR, designed to perform scalable reinforcement learning (RL) with large sequence models (such as transformers).
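GEAR itself is distributed and GPU-resident; the toy buffer below only illustrates the basic unit such a system must serve, storing whole trajectories and sampling fixed-length segments for a sequence model such as a transformer. All names are hypothetical.

```python
import random
from collections import deque

class TrajectoryReplayBuffer:
    """Stores whole trajectories and samples fixed-length contiguous
    segments, the unit a transformer policy consumes. Illustrative only:
    GEAR shards storage and sampling across GPUs and machines."""

    def __init__(self, capacity, segment_len):
        self.trajectories = deque(maxlen=capacity)  # oldest evicted first
        self.segment_len = segment_len

    def add(self, trajectory):
        # trajectory: a list of (observation, action, reward) steps
        if len(trajectory) >= self.segment_len:
            self.trajectories.append(trajectory)

    def sample(self, batch_size):
        batch = []
        for _ in range(batch_size):
            traj = random.choice(self.trajectories)
            start = random.randrange(len(traj) - self.segment_len + 1)
            batch.append(traj[start:start + self.segment_len])
        return batch
```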
no code implementations • ICCV 2023 • Yufei Zhang, Hanjing Wang, Jeffrey O. Kephart, Qiang Ji
While 3D body reconstruction methods have made remarkable progress recently, it remains difficult to acquire 3D supervision that is both accurate and plentiful enough for training.
no code implementations • 24 Jun 2023 • Muning Wen, Runji Lin, Hanjing Wang, Yaodong Yang, Ying Wen, Luo Mai, Jun Wang, Haifeng Zhang, Weinan Zhang
Transformer architectures have facilitated the development of large-scale and general-purpose sequence models for prediction tasks in natural language processing and computer vision, e.g., GPT-3 and Swin Transformer.
no code implementations • CVPR 2023 • Hanjing Wang, Dhiraj Joshi, Shiqiang Wang, Qiang Ji
Predictions made by deep learning models are prone to data perturbations, adversarial attacks, and out-of-distribution inputs.
no code implementations • CVPR 2022 • Hongji Guo, Hanjing Wang, Qiang Ji
The model prediction uncertainty is used to improve both training and inference.
1 code implementation • 5 Jun 2021 • Ming Zhou, Ziyu Wan, Hanjing Wang, Muning Wen, Runzhe Wu, Ying Wen, Yaodong Yang, Weinan Zhang, Jun Wang
Our framework comprises three key components: (1) a centralized task-dispatching model, which supports self-generated tasks and scalable training with heterogeneous policy combinations; (2) a programming architecture named Actor-Evaluator-Learner, which achieves high parallelism for both training and sampling and meets the evaluation requirements of auto-curriculum learning; and (3) a higher-level abstraction of MARL training paradigms, which enables efficient code reuse and flexible deployment on different distributed computing paradigms.
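A toy sketch of the actor/learner decoupling in such an architecture, using Python multiprocessing: all names are hypothetical, the evaluator role is reduced to a comment, and the real system adds distributed scheduling, heterogeneous policy combinations, and auto-curriculum logic on top.

```python
import multiprocessing as mp

def actor(task_queue, rollout_queue):
    """Actor: pulls tasks and produces rollouts in parallel with learning."""
    for task in iter(task_queue.get, None):              # None is the stop signal
        rollout_queue.put({"task": task, "return": 0.0}) # placeholder rollout

def learner(rollout_queue, num_rollouts):
    """Learner: consumes rollouts; a real learner would update the policy here."""
    for _ in range(num_rollouts):
        batch = rollout_queue.get()
        # ... apply a policy-gradient update using `batch` ...

if __name__ == "__main__":
    tasks, rollouts = mp.Queue(), mp.Queue()
    actors = [mp.Process(target=actor, args=(tasks, rollouts)) for _ in range(2)]
    for p in actors:
        p.start()
    for t in range(4):
        tasks.put(t)           # self-generated tasks would be dispatched here
    learner(rollouts, num_rollouts=4)
    # An evaluator process would score policy combinations here to drive
    # auto-curriculum task selection; omitted for brevity.
    for _ in actors:
        tasks.put(None)
    for p in actors:
        p.join()
```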
no code implementations • 13 Oct 2019 • Fan Ding, Hanjing Wang, Ashish Sabharwal, Yexiang Xue
On a suite of UAI inference challenge benchmarks, it saves 81.5% of WISH queries while retaining the quality of results.
no code implementations • 13 Jan 2019 • Yunfeng Lin, Jiangbei Li, Hanjing Wang
Visualizing perceptual content by analyzing human functional magnetic resonance imaging (fMRI) data has been an active research area.