Search Results for author: Longxi Gao

Found 3 papers, 2 papers with code

UIShift: Enhancing VLM-based GUI Agents through Self-supervised Reinforcement Learning

no code implementations • 18 May 2025 • Longxi Gao, Li Zhang, Mengwei Xu

Using this training task, we propose UIShift, a framework for enhancing VLM-based GUI agents through self-supervised reinforcement learning (RL).

Reinforcement Learning (RL)

Does Chain-of-Thought Reasoning Help Mobile GUI Agent? An Empirical Study

1 code implementation • 21 Mar 2025 • Li Zhang, Longxi Gao, Mengwei Xu

Reasoning capabilities have significantly improved the performance of vision-language models (VLMs) in domains such as mathematical problem-solving, coding, and visual question-answering.

Attribute, Mathematical Problem-Solving

LlamaTouch: A Faithful and Scalable Testbed for Mobile UI Task Automation

1 code implementation • 12 Apr 2024 • Li Zhang, Shihe Wang, Xianqing Jia, Zhihan Zheng, Yunhe Yan, Longxi Gao, Yuanchun Li, Mengwei Xu

LlamaTouch comprises three key techniques: (1) on-device task execution, which enables mobile agents to interact with realistic mobile environments.
