Search Results for author: Kaiyu Tang

Found 4 papers, 2 papers with code

A Navigation Framework Utilizing Vision-Language Models

1 code implementation · 11 Jun 2025 · Yicheng Duan, Kaiyu Tang

Vision-and-Language Navigation (VLN) presents a complex challenge in embodied AI, requiring agents to interpret natural language instructions and navigate through visually rich, unfamiliar environments.

Navigate, Prompt Engineering, +1

R1-Reward: Training Multimodal Reward Model Through Stable Reinforcement Learning

1 code implementation · 5 May 2025 · Yi-Fan Zhang, Xingyu Lu, Xiao Hu, Chaoyou Fu, Bin Wen, Tianke Zhang, Changyi Liu, Kaiyu Jiang, Kaibing Chen, Kaiyu Tang, Haojie Ding, Jiankang Chen, Fan Yang, Zhang Zhang, Tingting Gao, Liang Wang

Our reward model, R1-Reward, trained using the StableReinforce algorithm on this dataset, significantly improves performance on multimodal reward modeling benchmarks.

Reinforcement Learning (RL)

VLM as Policy: Common-Law Content Moderation Framework for Short Video Platform

no code implementations · 21 Apr 2025 · Xingyu Lu, Tianke Zhang, Chang Meng, Xiaobei Wang, Jinpeng Wang, Yifan Zhang, Shisong Tang, Changyi Liu, Haojie Ding, Kaiyu Jiang, Kaiyu Tang, Bin Wen, Hai-Tao Zheng, Fan Yang, Tingting Gao, Di Zhang, Kun Gai

Offline experiments and a large-scale online A/B test demonstrate the superiority of KuaiMod: KuaiMod achieves the best moderation performance on our benchmark.

Kwai-STaR: Transform LLMs into State-Transition Reasoners

no code implementations · 7 Nov 2024 · Xingyu Lu, Yuhang Hu, Changyi Liu, Tianke Zhang, Zhenyu Yang, Zhixiang Ding, Shengsheng Qian, Meng Du, Ruiwen Kang, Kaiyu Tang, Fan Yang, Tingting Gao, Di Zhang, Hai-Tao Zheng, Bin Wen

In this work, we define mathematical problem-solving as a process of transitioning from an initial unsolved state to the final resolved state, and propose the Kwai-STaR framework, which transforms LLMs into State-Transition Reasoners to improve their intuitive reasoning capabilities.

GSM8K, Mathematical Problem-Solving, +1
