1 code implementation • 16 Jun 2025 • Junru Zhang, Lang Feng, Xu Guo, Yuhan Wu, Yabo Dong, Duanqing Xu
Time-series reasoning remains a significant challenge in multimodal large language models (MLLMs) due to the dynamic temporal patterns, ambiguous semantics, and lack of temporal priors.
no code implementations • 16 May 2025 • Lang Feng, Jiahao Lin, Dong Xing, Li Zhang, De Ma, Gang Pan
Population-population generalization is a challenging problem in multi-agent reinforcement learning (MARL), particularly when agents encounter unseen co-players.
1 code implementation • 16 May 2025 • Lang Feng, Zhenghai Xue, Tingcong Liu, Bo An
In this work, we propose Group-in-Group Policy Optimization (GiGPO), a novel RL algorithm that achieves fine-grained credit assignment for LLM agents while preserving the appealing properties of group-based RL: critic-free, low memory, and stable convergence.
no code implementations • 10 Mar 2025 • Zhenghai Xue, Lang Feng, Jiacheng Xu, Kang Kang, Xiang Wen, Bo An, Shuicheng Yan
Additionally, as the environment dynamics change, certain expert states may become inaccessible, rendering their distributions less valuable for imitation.
no code implementations • 7 Jun 2024 • Junru Zhang, Lang Feng, Zhidan Liu, Yuhan Wu, Yang He, Yabo Dong, Duanqing Xu
We instantiate this concept using a conditional diffusion model and introduce a style-fused sampling strategy to enhance data generation diversity.
1 code implementation • 28 May 2024 • Lang Feng, Pengjie Gu, Bo An, Gang Pan
As the structure evolves with the integration of new trajectories, unreliable states are marginalized, and the most impactful nodes are prioritized for decision-making.
1 code implementation • 9 Oct 2023 • Junru Zhang, Lang Feng, Yang He, Yuhan Wu, Yabo Dong
While one-dimensional convolutional neural networks (1D-CNNs) have been empirically shown to be effective in time series classification tasks, we find that their application can still produce undesirable outcomes, motivating us to further investigate and understand their underlying mechanisms.
no code implementations • 8 Oct 2023 • Lang Feng, Dong Xing, Junru Zhang, Gang Pan
Existing multi-agent PPO algorithms lack compatibility with different types of parameter sharing when extending the theoretical guarantee of PPO to cooperative multi-agent reinforcement learning (MARL).
1 code implementation • 12 Oct 2022 • Lang Feng, Qianhui Liu, Huajin Tang, De Ma, Gang Pan
Spiking neural networks (SNNs) are bio-inspired neural networks with asynchronous, discrete, and sparse characteristics, which have increasingly demonstrated their superiority in low energy consumption.
no code implementations • 1 Aug 2022 • Lang Feng, Wenjian Liu, Chuliang Guo, Ke Tang, Cheng Zhuo, Zhongfeng Wang
To improve design quality while reducing cost, design automation for neural network accelerators was proposed, where design space exploration algorithms automatically search for an optimized accelerator design within a design space.
no code implementations • 19 Feb 2021 • Lang Feng, Jiayi Huang, Jeff Huang, Jiang Hu
Data-Flow Integrity (DFI) is a well-known approach to effectively detecting a wide range of software attacks.
Hardware Architecture