Search Results for author: Lang Feng

Found 11 papers, 5 papers with code

TimeMaster: Training Time-Series Multimodal LLMs to Reason via Reinforcement Learning

1 code implementation 16 Jun 2025 Junru Zhang, Lang Feng, Xu Guo, Yuhan Wu, Yabo Dong, Duanqing Xu

Time-series reasoning remains a significant challenge in multimodal large language models (MLLMs) due to the dynamic temporal patterns, ambiguous semantics, and lack of temporal priors.

Reinforcement Learning (RL), Time Series

Bidirectional Distillation: A Mixed-Play Framework for Multi-Agent Generalizable Behaviors

no code implementations 16 May 2025 Lang Feng, Jiahao Lin, Dong Xing, Li Zhang, De Ma, Gang Pan

Population-population generalization is a challenging problem in multi-agent reinforcement learning (MARL), particularly when agents encounter unseen co-players.

Knowledge Distillation, Multi-agent Reinforcement Learning

Group-in-Group Policy Optimization for LLM Agent Training

1 code implementation 16 May 2025 Lang Feng, Zhenghai Xue, Tingcong Liu, Bo An

In this work, we propose Group-in-Group Policy Optimization (GiGPO), a novel RL algorithm that achieves fine-grained credit assignment for LLM agents while preserving the appealing properties of group-based RL: critic-free, low memory, and stable convergence.

Mathematical Reasoning, Reinforcement Learning (RL)

Policy Regularization on Globally Accessible States in Cross-Dynamics Reinforcement Learning

no code implementations 10 Mar 2025 Zhenghai Xue, Lang Feng, Jiacheng Xu, Kang Kang, Xiang Wen, Bo An, Shuicheng Yan

Additionally, as the environment dynamics change, certain expert states may become inaccessible, rendering their distributions less valuable for imitation.

Imitation Learning, Offline RL

Diverse Intra- and Inter-Domain Activity Style Fusion for Cross-Person Generalization in Activity Recognition

no code implementations 7 Jun 2024 Junru Zhang, Lang Feng, Zhidan Liu, Yuhan Wu, Yang He, Yabo Dong, Duanqing Xu

We instantiate this concept using a conditional diffusion model and introduce a style-fused sampling strategy to enhance data generation diversity.

Diversity, Domain Generalization

Resisting Stochastic Risks in Diffusion Planners with the Trajectory Aggregation Tree

1 code implementation 28 May 2024 Lang Feng, Pengjie Gu, Bo An, Gang Pan

As the structure evolves with the integration of new trajectories, unreliable states are marginalized, and the most impactful nodes are prioritized for decision-making.

Decision Making

Temporal Convolutional Explorer Helps Understand 1D-CNN's Learning Behavior in Time Series Classification from Frequency Domain

1 code implementation 9 Oct 2023 Junru Zhang, Lang Feng, Yang He, Yuhan Wu, Yabo Dong

While one-dimensional convolutional neural networks (1D-CNNs) have been empirically proven effective in time series classification tasks, we find that there remain undesirable outcomes that could arise in their application, motivating us to further investigate and understand their underlying mechanisms.

Time Series, Time Series Classification

FP3O: Enabling Proximal Policy Optimization in Multi-Agent Cooperation with Parameter-Sharing Versatility

no code implementations 8 Oct 2023 Lang Feng, Dong Xing, Junru Zhang, Gang Pan

Existing multi-agent PPO algorithms lack compatibility with different types of parameter sharing when extending the theoretical guarantee of PPO to cooperative multi-agent reinforcement learning (MARL).

MuJoCo, Multi-agent Reinforcement Learning

Multi-Level Firing with Spiking DS-ResNet: Enabling Better and Deeper Directly-Trained Spiking Neural Networks

1 code implementation 12 Oct 2022 Lang Feng, Qianhui Liu, Huajin Tang, De Ma, Gang Pan

Spiking neural networks (SNNs) are bio-inspired neural networks with asynchronous discrete and sparse characteristics, which have increasingly manifested their superiority in low energy consumption.

GANDSE: Generative Adversarial Network based Design Space Exploration for Neural Network Accelerator Design

no code implementations 1 Aug 2022 Lang Feng, Wenjian Liu, Chuliang Guo, Ke Tang, Cheng Zhuo, Zhongfeng Wang

To improve the design quality while saving the cost, design automation for neural network accelerators was proposed, where design space exploration algorithms are used to automatically search the optimized accelerator design within a design space.

Deep Learning, Deep Reinforcement Learning

Toward Taming the Overhead Monster for Data-Flow Integrity

no code implementations 19 Feb 2021 Lang Feng, Jiayi Huang, Jeff Huang, Jiang Hu

Data-Flow Integrity (DFI) is a well-known approach to effectively detecting a wide range of software attacks.

Hardware Architecture
