Search Results for author: Weizhou Shen

Found 19 papers, 14 papers with code

Writing-RL: Advancing Long-form Writing via Adaptive Curriculum Reinforcement Learning

no code implementations | 6 Jun 2025 | Xuanyu Lei, Chenliang Li, Yuning Wu, Kaiming Liu, Weizhou Shen, Peng Li, Ming Yan, Ji Zhang, Fei Huang, Yang Liu

Recent advances in Large Language Models (LLMs) have enabled strong performance in long-form writing, yet existing supervised fine-tuning (SFT) approaches suffer from limitations such as data saturation and restricted learning capacity bounded by teacher signals.

MUSEG: Reinforcing Video Temporal Understanding via Timestamp-Aware Multi-Segment Grounding

no code implementations | 27 May 2025 | Fuwen Luo, Shengfeng Lou, Chi Chen, Ziyue Wang, Chenliang Li, Weizhou Shen, Jiyue Guo, Peng Li, Ming Yan, Ji Zhang, Fei Huang, Yang Liu

Video temporal understanding is crucial for multimodal large language models (MLLMs) to reason over events in videos.

QwenLong-L1: Towards Long-Context Large Reasoning Models with Reinforcement Learning

1 code implementation | 23 May 2025 | Fanqi Wan, Weizhou Shen, Shengyi Liao, Yingcheng Shi, Chenliang Li, ZiYi Yang, Ji Zhang, Fei Huang, Jingren Zhou, Ming Yan

To bridge this gap, we first formalize the paradigm of long-context reasoning RL, and identify key challenges in suboptimal training efficiency and unstable optimization process.

Question Answering, Reinforcement Learning (RL)

QwenLong-CPRS: Towards $\infty$-LLMs with Dynamic Context Optimization

no code implementations | 23 May 2025 | Weizhou Shen, Chenliang Li, Fanqi Wan, Shengyi Liao, Shaopeng Lai, Bo Zhang, Yingcheng Shi, Yuning Wu, Gang Fu, Zhansheng Li, Bin Yang, Ji Zhang, Fei Huang, Jingren Zhou, Ming Yan

This technical report presents QwenLong-CPRS, a context compression framework designed for explicit long-context optimization, addressing prohibitive computation overhead during the prefill stage and the "lost in the middle" performance degradation of large language models (LLMs) during long sequence processing.

4k, Language Modeling (+2 more)

Mutual-Taught for Co-adapting Policy and Reward Models

no code implementations | 17 May 2025 | Tianyuan Shi, Canbin Huang, Fanqi Wan, Longguang Zhong, ZiYi Yang, Weizhou Shen, Xiaojun Quan, Ming Yan

During the preference optimization of large language models (LLMs), distribution shifts may arise between newly generated model samples and the data used to train the reward model (RM).

SoLoPO: Unlocking Long-Context Capabilities in LLMs via Short-to-Long Preference Optimization

no code implementations | 16 May 2025 | Huashan Sun, Shengyi Liao, Yansen Han, Yu Bai, Yang Gao, Cheng Fu, Weizhou Shen, Fanqi Wan, Ming Yan, Ji Zhang, Fei Huang

SoLoPO is compatible with mainstream preference optimization algorithms, while substantially improving the efficiency of data construction and training processes.

Domain Generalization

Mobile-Agent-v2: Mobile Device Operation Assistant with Effective Navigation via Multi-Agent Collaboration

2 code implementations | 3 Jun 2024 | Junyang Wang, Haiyang Xu, Haitao Jia, Xi Zhang, Ming Yan, Weizhou Shen, Ji Zhang, Fei Huang, Jitao Sang

However, the two major navigation challenges in mobile device operation tasks, task progress navigation and focus content navigation, are significantly complicated under the single-agent architecture of existing work.

SocialBench: Sociality Evaluation of Role-Playing Conversational Agents

2 code implementations | 20 Mar 2024 | Hongzhan Chen, Hehong Chen, Ming Yan, Wenshen Xu, Xing Gao, Weizhou Shen, Xiaojun Quan, Chenliang Li, Ji Zhang, Fei Huang, Jingren Zhou

In this paper, we introduce SocialBench, the first benchmark designed to systematically evaluate the sociality of role-playing conversational agents at both individual and group levels of social interactions.

Mobile-Agent: Autonomous Multi-Modal Mobile Device Agent with Visual Perception

1 code implementation | 29 Jan 2024 | Junyang Wang, Haiyang Xu, Jiabo Ye, Ming Yan, Weizhou Shen, Ji Zhang, Fei Huang, Jitao Sang

To assess the performance of Mobile-Agent, we introduced Mobile-Eval, a benchmark for evaluating mobile device operations.

Small LLMs Are Weak Tool Learners: A Multi-LLM Agent

1 code implementation | 14 Jan 2024 | Weizhou Shen, Chenliang Li, Hongzhan Chen, Ming Yan, Xiaojun Quan, Hehong Chen, Ji Zhang, Fei Huang

Each component is implemented by a single LLM that focuses on a specific capability and collaborates with others to accomplish the task.

Language Modelling, Large Language Model (+1 more)

Retrieval-Generation Alignment for End-to-End Task-Oriented Dialogue System

1 code implementation | 13 Oct 2023 | Weizhou Shen, Yingqi Gao, Canbin Huang, Fanqi Wan, Xiaojun Quan, Wei Bi

The results demonstrate that when combined with meta knowledge, the response generator can effectively leverage high-quality knowledge records from the retriever and enhance the quality of generated responses.

Response Generation, Retrieval (+1 more)

ModelScope-Agent: Building Your Customizable Agent System with Open-source Large Language Models

3 code implementations | 2 Sep 2023 | Chenliang Li, Hehong Chen, Ming Yan, Weizhou Shen, Haiyang Xu, Zhikai Wu, Zhicheng Zhang, Wenmeng Zhou, Yingda Chen, Chen Cheng, Hongzhu Shi, Ji Zhang, Fei Huang, Jingren Zhou

Large language models (LLMs) have recently demonstrated remarkable capabilities to comprehend human intentions, engage in reasoning, and design planning-like behavior.

Multi-Grained Knowledge Retrieval for End-to-End Task-Oriented Dialog

1 code implementation | 17 May 2023 | Fanqi Wan, Weizhou Shen, Ke Yang, Xiaojun Quan, Wei Bi

Retrieving proper domain knowledge from an external database lies at the heart of end-to-end task-oriented dialog systems to generate informative responses.

Attribute, Response Generation (+1 more)

Generic Dependency Modeling for Multi-Party Conversation

1 code implementation | 21 Feb 2023 | Weizhou Shen, Xiaojun Quan, Ke Yang

To model the dependencies between utterances in multi-party conversations, we propose a simple and generic framework based on the dependency parsing results of utterances.

Dependency Parsing

Joint Generator-Ranker Learning for Natural Language Generation

2 code implementations | 28 Jun 2022 | Weizhou Shen, Yeyun Gong, Yelong Shen, Song Wang, Xiaojun Quan, Nan Duan, Weizhu Chen

Generate-then-rank is a widely used mechanism for text generation, where a generator produces multiple text candidates and a ranker chooses the best one among the text candidates.

Question Generation (+2 more)

Directed Acyclic Graph Network for Conversational Emotion Recognition

1 code implementation | ACL 2021 | Weizhou Shen, Siyue Wu, Yunyi Yang, Xiaojun Quan

In this paper, we put forward a novel idea of encoding the utterances with a directed acyclic graph (DAG) to better model the intrinsic structure within a conversation, and design a directed acyclic neural network, namely DAG-ERC, to implement this idea.

Emotion Recognition in Conversation

DialogXL: All-in-One XLNet for Multi-Party Conversation Emotion Recognition

4 code implementations | 16 Dec 2020 | Weizhou Shen, Junqing Chen, Xiaojun Quan, Zhixian Xie

Specifically, we first modify the recurrence mechanism of XLNet from segment-level to utterance-level in order to better model the conversational data.

All, Emotion Recognition in Conversation

Constituency Lattice Encoding for Aspect Term Extraction

1 code implementation | COLING 2020 | Yunyi Yang, Kun Li, Xiaojun Quan, Weizhou Shen, Qinliang Su

One of the remaining challenges for aspect term extraction in sentiment analysis resides in the extraction of phrase-level aspect terms, where determining the boundaries of such terms is non-trivial.

Aspect Term Extraction and Sentiment Classification, Sentence (+1 more)
