Search Results for author: Kun Shao

Found 28 papers, 9 papers with code

ViMo: A Generative Visual GUI World Model for App Agents

no code implementations15 Apr 2025 Dezhao Luo, Bohan Tang, Kang Li, Georgios Papoudakis, Jifei Song, Shaogang Gong, Jianye Hao, Jun Wang, Kun Shao

We propose a novel data representation, the Symbolic Text Representation (STR), which overlays text content with symbolic placeholders while preserving graphics.
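
No code is released for ViMo yet. As a rough illustration of the STR idea only (a minimal sketch, not ViMo's actual pipeline; the `TextBox` fields and `to_str_overlay` helper are hypothetical), replacing OCR-detected text spans with symbolic placeholders might look like this:

```python
from dataclasses import dataclass

@dataclass
class TextBox:
    x: int
    y: int
    w: int
    h: int
    text: str  # OCR-detected text content at this screen region

def to_str_overlay(boxes: list[TextBox]) -> tuple[list[TextBox], dict[str, str]]:
    """Swap each detected text span for a symbolic placeholder, keeping its
    bounding box so icons, layout, and other graphics are left untouched."""
    mapping: dict[str, str] = {}
    for i, box in enumerate(boxes):
        symbol = f"<T{i}>"
        mapping[symbol] = box.text  # remember the original text for recovery
        box.text = symbol           # the rendered frame shows the placeholder
    return boxes, mapping
```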

ThinkBench: Dynamic Out-of-Distribution Evaluation for Robust LLM Reasoning

no code implementations22 Feb 2025 Shulin Huang, Linyi Yang, Yan Song, Shuang Chen, Leyang Cui, Ziyu Wan, Qingcheng Zeng, Ying Wen, Kun Shao, Weinan Zhang, Jun Wang, Yue Zhang

Evaluating large language models (LLMs) poses significant challenges, particularly due to issues of data contamination and the leakage of correct answers.

Advancing Autonomous VLM Agents via Variational Subgoal-Conditioned Reinforcement Learning

no code implementations11 Feb 2025 Qingyuan Wu, Jianheng Liu, Jianye Hao, Jun Wang, Kun Shao

State-of-the-art (SOTA) reinforcement learning (RL) methods have enabled vision-language model (VLM) agents to learn from interaction with online environments without human supervision.

Tasks: Decision Making, Reinforcement Learning +3

GUI Agents with Foundation Models: A Comprehensive Survey

no code implementations7 Nov 2024 Shuai Wang, Weiwen Liu, Jingxuan Chen, Yuqi Zhou, Weinan Gan, Xingshan Zeng, Yuhan Che, Shuai Yu, Xinlong Hao, Kun Shao, Bin Wang, Chuhan Wu, Yasheng Wang, Ruiming Tang, Jianye Hao

Recent advances in foundation models, particularly Large Language Models (LLMs) and Multimodal Large Language Models (MLLMs), have facilitated the development of intelligent agents capable of performing complex tasks.

Tasks: Survey

Lightweight Neural App Control

no code implementations23 Oct 2024 Filippos Christianos, Georgios Papoudakis, Thomas Coste, Jianye Hao, Jun Wang, Kun Shao

This paper introduces a novel mobile phone control architecture, termed "app agents", for efficient interaction and control across various Android apps.

Tasks: Decision Making, Language Modeling +2

SPA-Bench: A Comprehensive Benchmark for SmartPhone Agent Evaluation

1 code implementation19 Oct 2024 Jingxuan Chen, Derek Yuen, Bin Xie, Yuhao Yang, Gongwei Chen, Zhihao Wu, Li Yixing, Xurui Zhou, Weiwen Liu, Shuai Wang, Kaiwen Zhou, Rui Shao, Liqiang Nie, Yasheng Wang, Jianye Hao, Jun Wang, Kun Shao

SPA-Bench offers three key contributions:
(1) a diverse set of tasks covering system and third-party apps in both English and Chinese, focusing on features commonly used in daily routines;
(2) a plug-and-play framework enabling real-time agent interaction with Android devices, integrating over ten agents with the flexibility to add more;
(3) a novel evaluation pipeline that automatically assesses agent performance across multiple dimensions, encompassing seven metrics related to task completion and resource consumption.

Tasks: AI Agent, Benchmarking +2

DistRL: An Asynchronous Distributed Reinforcement Learning Framework for On-Device Control Agents

1 code implementation18 Oct 2024 Taiyi Wang, Zhihao Wu, Jianheng Liu, Jianye Hao, Jun Wang, Kun Shao

This paper introduces DistRL, a novel framework designed to enhance the efficiency of online RL fine-tuning for mobile device control agents.
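
DistRL's actual architecture is described in the paper and its released code. For orientation only, here is a minimal sketch of the generic asynchronous actor-learner pattern such frameworks build on (illustrative names, not DistRL's API):

```python
import queue
import threading

def actor_loop(collect_rollout, traj_queue, stop):
    """Device-side actor: gather rollout fragments and ship them to the learner."""
    while not stop.is_set():
        traj_queue.put(collect_rollout())  # e.g. a fixed-length trajectory chunk

def learner_loop(apply_update, traj_queue, stop):
    """Central learner: consume trajectories as they arrive (off-policy)."""
    while not stop.is_set():
        try:
            trajectory = traj_queue.get(timeout=1.0)
        except queue.Empty:
            continue  # no data yet; keep waiting
        apply_update(trajectory)  # one gradient step on the freshest data

# Hypothetical wiring: several actors feed one learner through a shared queue.
# stop = threading.Event(); q = queue.Queue()
# threading.Thread(target=actor_loop, args=(my_collect_fn, q, stop)).start()
```

Decoupling collection from learning this way is what lets slow, real-device interaction proceed without blocking gradient updates.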

Learning Precise Affordances from Egocentric Videos for Robotic Manipulation

no code implementations19 Aug 2024 Gen Li, Nikolaos Tsagkas, Jifei Song, Ruaridh Mon-Williams, Sethu Vijayakumar, Kun Shao, Laura Sevilla-Lara

In this paper, we present a streamlined affordance learning system that encompasses data collection, effective model training, and robot deployment.

Tasks: Grasp Generation

ROS-LLM: A ROS framework for embodied AI with task feedback and structured reasoning

1 code implementation28 Jun 2024 Christopher E. Mower, Yuhui Wan, Hongzhan Yu, Antoine Grosnit, Jonas Gonzalez-Billandon, Matthieu Zimmer, Jinlong Wang, Xinyu Zhang, Yao Zhao, Anbang Zhai, Puze Liu, Daniel Palenicek, Davide Tateo, Cesar Cadena, Marco Hutter, Jan Peters, Guangjian Tian, Yuzheng Zhuang, Kun Shao, Xingyue Quan, Jianye Hao, Jun Wang, Haitham Bou-Ammar

Key features of the framework include:
- integration of ROS with an AI agent connected to a plethora of open-source and commercial LLMs;
- automatic extraction of a behavior from the LLM output and execution of ROS actions/services;
- support for three behavior modes (sequence, behavior tree, state machine);
- imitation learning for adding new robot actions to the library of possible actions;
- LLM reflection via human and environment feedback.

Tasks: AI Agent, Imitation Learning

Vim-F: Visual State Space Model Benefiting from Learning in the Frequency Domain

1 code implementation29 May 2024 Juntao Zhang, Shaogeng Liu, Kun Bian, You Zhou, Pei Zhang, Wenbo An, Jun Zhou, Kun Shao

In recent years, State Space Models (SSMs) with efficient hardware-aware designs, known as Mamba deep learning models, have made significant progress in long-sequence modeling tasks such as language understanding.

Tasks: Mamba, State Space Models

Distilling Morphology-Conditioned Hypernetworks for Efficient Universal Morphology Control

1 code implementation9 Feb 2024 Zheng Xiong, Risto Vuorio, Jacob Beck, Matthieu Zimmer, Kun Shao, Shimon Whiteson

Learning a universal policy across different robot morphologies can significantly improve learning efficiency and enable zero-shot generalization to unseen morphologies.

Tasks: Zero-shot Generalization

A survey on algorithms for Nash equilibria in finite normal-form games

no code implementations18 Dec 2023 Hanyu Li, Wenhan Huang, Zhijian Duan, David Henry Mguni, Kun Shao, Jun Wang, Xiaotie Deng

This paper reviews algorithms for computing Nash equilibria and their approximate solutions in finite normal-form games, from both theoretical and empirical perspectives.

ChessGPT: Bridging Policy Learning and Language Modeling

1 code implementation NeurIPS 2023 Xidong Feng, Yicheng Luo, Ziyan Wang, Hongrui Tang, Mengyue Yang, Kun Shao, David Mguni, Yali Du, Jun Wang

Thus, we propose ChessGPT, a GPT model bridging policy learning and language modeling by integrating data from these two sources in Chess games.

Tasks: Decision Making, Language Modeling +1

Traj-MAE: Masked Autoencoders for Trajectory Prediction

no code implementations ICCV 2023 Hao Chen, Jiaze Wang, Kun Shao, Furui Liu, Jianye Hao, Chenyong Guan, Guangyong Chen, Pheng-Ann Heng

Specifically, our Traj-MAE employs diverse masking strategies to pre-train the trajectory encoder and map encoder, capturing social and temporal information among agents while leveraging environmental effects at multiple granularities.
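
As a hedged illustration of MAE-style trajectory masking (a generic sketch, not Traj-MAE's exact masking strategies), randomly hiding timesteps before encoding could look like:

```python
import torch

def mask_trajectory(traj: torch.Tensor, mask_ratio: float = 0.5):
    """Randomly mask a fraction of timesteps in a (T, D) trajectory.

    Returns the visible points plus the kept/masked index sets, which a
    decoder would use to reconstruct the hidden timesteps.
    """
    T = traj.shape[0]
    n_keep = max(1, int(T * (1 - mask_ratio)))
    perm = torch.randperm(T)
    keep_idx = perm[:n_keep].sort().values   # visible timesteps
    mask_idx = perm[n_keep:].sort().values   # timesteps to reconstruct
    return traj[keep_idx], keep_idx, mask_idx
```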

Tasks: Autonomous Driving, Prediction +1

Taming Multi-Agent Reinforcement Learning with Estimator Variance Reduction

no code implementations2 Sep 2022 Taher Jafferjee, Juliusz Ziomek, Tianpei Yang, Zipeng Dai, Jianhong Wang, Matthew Taylor, Kun Shao, Jun Wang, David Mguni

Centralised training with decentralised execution (CT-DE) serves as the foundation of many leading multi-agent reinforcement learning (MARL) algorithms.

Tasks: MuJoCo, Multi-agent Reinforcement Learning +5

Timing is Everything: Learning to Act Selectively with Costly Actions and Budgetary Constraints

no code implementations31 May 2022 David Mguni, Aivar Sootla, Juliusz Ziomek, Oliver Slumbers, Zipeng Dai, Kun Shao, Jun Wang

In this paper, we introduce a reinforcement learning (RL) framework, the Learnable Impulse Control Reinforcement Algorithm (LICRA), for learning to optimally select both when to act and which actions to take when actions incur costs.
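
A minimal sketch of the when-to-act/which-action split, under assumed interfaces (illustrative only, not LICRA's implementation):

```python
def impulse_control_step(gate_policy, action_policy, state, action_cost):
    """Split the decision into two policies: a gate that decides *whether*
    acting is worth the cost, and an action policy that decides *which*
    action to take. Abstaining is free."""
    if gate_policy(state):                # when to act
        action = action_policy(state)     # which action to take
        return action, -action_cost      # acting incurs the per-action cost
    return None, 0.0                      # abstain: no action, no cost
```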

Tasks: Reinforcement Learning (RL)

Learning Explicit Credit Assignment for Multi-agent Joint Q-learning

no code implementations29 Sep 2021 Hangyu Mao, Jianye Hao, Dong Li, Jun Wang, Weixun Wang, Xiaotian Hao, Bin Wang, Kun Shao, Zhen Xiao, Wulong Liu

In contrast, we formulate an explicit credit assignment problem where each agent gives its suggestion about how to weight individual Q-values to explicitly maximize the joint Q-value, while also guaranteeing the Bellman optimality of the joint Q-value.
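
For intuition, a state-conditioned weighted mixer over per-agent Q-values (a generic sketch with assumed names, not the paper's exact network) could be:

```python
import torch
import torch.nn as nn

class WeightedJointQ(nn.Module):
    """Mix per-agent Q-values into a joint Q-value with explicit,
    state-conditioned weights, so each agent's credit is visible."""

    def __init__(self, n_agents: int, state_dim: int, hidden: int = 64):
        super().__init__()
        self.weight_net = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, n_agents), nn.Softplus(),  # non-negative weights
        )

    def forward(self, agent_qs: torch.Tensor, state: torch.Tensor) -> torch.Tensor:
        # agent_qs: (batch, n_agents) chosen-action Q-values per agent
        weights = self.weight_net(state)          # explicit per-agent credit
        return (weights * agent_qs).sum(dim=-1)   # joint Q as a weighted sum
```

Non-negative weights keep the joint Q-value monotone in each agent's Q-value, one common way to keep decentralized greedy action selection consistent with the joint maximum.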

Tasks: Q-Learning

Cooperative Multi-Agent Transfer Learning with Level-Adaptive Credit Assignment

no code implementations1 Jun 2021 Tianze Zhou, Fubiao Zhang, Kun Shao, Kai Li, Wenhan Huang, Jun Luo, Weixun Wang, Yaodong Yang, Hangyu Mao, Bin Wang, Dong Li, Wulong Liu, Jianye Hao

In addition, we use a novel agent network named Population Invariant agent with Transformer (PIT) to realize coordination transfer across a wider variety of scenarios.

Tasks: Management, Multi-agent Reinforcement Learning +3

Robust Multi-Agent Reinforcement Learning Driven by Correlated Equilibrium

no code implementations1 Jan 2021 Yizheng Hu, Kun Shao, Dong Li, Jianye Hao, Wulong Liu, Yaodong Yang, Jun Wang, Zhanxing Zhu

Therefore, to achieve robust CMARL, we introduce novel strategies to encourage agents to learn a correlated equilibrium while preserving, as far as possible, the convenience of decentralized execution.

Tasks: Adversarial Robustness, Reinforcement Learning +3

Multi-Agent Determinantal Q-Learning

1 code implementation ICML 2020 Yaodong Yang, Ying Wen, Li-Heng Chen, Jun Wang, Kun Shao, David Mguni, Wei-Nan Zhang

Though practical, current methods rely on restrictive assumptions to decompose the centralized value function across agents for execution.

Tasks: Q-Learning

A Survey of Deep Reinforcement Learning in Video Games

no code implementations23 Dec 2019 Kun Shao, Zhentao Tang, Yuanheng Zhu, Nannan Li, Dongbin Zhao

In this paper, we survey the progress of DRL methods, including value-based, policy gradient, and model-based algorithms, and compare their main techniques and properties.

Tasks: Deep Reinforcement Learning, Real-Time Strategy Games +3

StarCraft Micromanagement with Reinforcement Learning and Curriculum Transfer Learning

1 code implementation3 Apr 2018 Kun Shao, Yuanheng Zhu, Dongbin Zhao

With reinforcement learning and curriculum transfer learning, our units are able to learn appropriate strategies in StarCraft micromanagement scenarios.

Tasks: Reinforcement Learning +3
