Search Results for author: Fei Huang

Found 313 papers, 190 papers with code

S^4-Tuning: A Simple Cross-lingual Sub-network Tuning Method

no code implementations ACL 2022 Runxin Xu, Fuli Luo, Baobao Chang, Songfang Huang, Fei Huang

The emergence of multilingual pre-trained language models makes it possible to adapt to target languages with only a few labeled examples. However, vanilla fine-tuning tends to produce degenerate and unstable results, owing to Language Interference among different languages and Parameter Overload under few-sample transfer learning scenarios. To address these two problems, we propose S^4-Tuning, a Simple Cross-lingual Sub-network Tuning method.

Transfer Learning

Alibaba Speech Translation Systems for IWSLT 2018

no code implementations IWSLT (EMNLP) 2018 Nguyen Bach, Hongjie Chen, Kai Fan, Cheung-Chi Leung, Bo Li, Chongjia Ni, Rong Tong, Pei Zhang, Boxing Chen, Bin Ma, Fei Huang

This work describes the En→De Alibaba speech translation system developed for the evaluation campaign of the International Workshop on Spoken Language Translation (IWSLT) 2018.

Reranking Sentence +1

PALM: Pre-training an Autoencoding & Autoregressive Language Model for Context-conditioned Generation

no code implementations EMNLP 2020 Bin Bi, Chenliang Li, Chen Wu, Ming Yan, Wei Wang, Songfang Huang, Fei Huang, Luo Si

An extensive set of experiments shows that PALM achieves new state-of-the-art results on a variety of language generation benchmarks covering generative question answering (Rank 1 on the official MARCO leaderboard), abstractive summarization on CNN/DailyMail as well as Gigaword, question generation on SQuAD, and conversational response generation on Cornell Movie Dialogues.

Abstractive Text Summarization Conversational Response Generation +9

Bilingual Methods for Adaptive Training Data Selection for Machine Translation

no code implementations AMTA 2016 Boxing Chen, Roland Kuhn, George Foster, Colin Cherry, Fei Huang

In this paper, we propose a new data selection method which uses semi-supervised convolutional neural networks based on bitokens (Bi-SSCNNs) for training machine translation systems from a large bilingual corpus.

Machine Translation NMT +2

Rethinking Denoised Auto-Encoding in Language Pre-Training

no code implementations EMNLP 2021 Fuli Luo, Pengcheng Yang, Shicheng Li, Xuancheng Ren, Xu Sun, Songfang Huang, Fei Huang

Pre-trained self-supervised models such as BERT have achieved striking success in learning sequence representations, especially for natural language processing.

Natural Language Understanding Sentence

TimeHC-RL: Temporal-aware Hierarchical Cognitive Reinforcement Learning for Enhancing LLMs' Social Intelligence

1 code implementation 30 May 2025 Guiyang Hou, Xing Gao, Yuchuan Wu, Xiang Huang, Wenqi Zhang, Zhe Zheng, Yongliang Shen, Jialu Du, Fei Huang, Yongbin Li

Recognizing that the social world follows a distinct timeline and requires a richer blend of cognitive modes (from intuitive reactions (System 1) and surface-level thinking to deliberate thinking (System 2)) than mathematics, which primarily relies on System 2 cognition (careful, step-by-step reasoning), we introduce Temporal-aware Hierarchical Cognitive Reinforcement Learning (TimeHC-RL) for enhancing LLMs' social intelligence.

VRAG-RL: Empower Vision-Perception-Based RAG for Visually Rich Information Understanding via Iterative Reasoning with Reinforcement Learning

1 code implementation 28 May 2025 Qiuchen Wang, Ruixue Ding, Yu Zeng, Zehui Chen, Lin Chen, Shihang Wang, Pengjun Xie, Fei Huang, Feng Zhao

As RL has been proven to be beneficial for model reasoning, we introduce VRAG-RL, a novel RL framework tailored for complex reasoning across visually rich information.

RAG

Scaling External Knowledge Input Beyond Context Windows of LLMs via Multi-Agent Collaboration

1 code implementation 27 May 2025 Zijun Liu, Zhennan Wan, Peng Li, Ming Yan, Ji Zhang, Fei Huang, Yang Liu

However, the limited context window of LLMs obstructs scaling the amount of external knowledge input, prohibiting further improvement, especially for tasks requiring a significant amount of external knowledge.

Multi-hop Question Answering Question Answering

OmniCharacter: Towards Immersive Role-Playing Agents with Seamless Speech-Language Personality Interaction

1 code implementation 26 May 2025 Haonan Zhang, Run Luo, Xiong Liu, Yuchuan Wu, Ting-En Lin, Pengpeng Zeng, Qiang Qu, Feiteng Fang, Min Yang, Lianli Gao, Jingkuan Song, Fei Huang, Yongbin Li

Role-Playing Agents (RPAs), which benefit from large language models, are an emerging class of interactive AI systems that simulate roles or characters with diverse personalities.

MASKSEARCH: A Universal Pre-Training Framework to Enhance Agentic Search Capability

1 code implementation 26 May 2025 Weiqi Wu, Xin Guan, Shen Huang, Yong Jiang, Pengjun Xie, Fei Huang, Jiuxin Cao, Hai Zhao, Jingren Zhou

In the pre-training stage, we introduce the Retrieval Augmented Mask Prediction (RAMP) task, in which the model learns to leverage search tools to fill masked spans over a large amount of pre-training data, thus acquiring universal retrieval and reasoning capabilities for LLMs.

Multi-hop Question Answering Question Answering +2

Marginal Fairness: Fair Decision-Making under Risk Measures

no code implementations 24 May 2025 Fei Huang, Silvana M. Pesenti

This paper introduces marginal fairness, a new individual fairness notion for equitable decision-making in the presence of protected attributes such as gender, race, and religion.

Attribute Decision Making +1

QwenLong-L1: Towards Long-Context Large Reasoning Models with Reinforcement Learning

1 code implementation 23 May 2025 Fanqi Wan, Weizhou Shen, Shengyi Liao, Yingcheng Shi, Chenliang Li, ZiYi Yang, Ji Zhang, Fei Huang, Jingren Zhou, Ming Yan

To bridge this gap, we first formalize the paradigm of long-context reasoning RL, and identify key challenges in suboptimal training efficiency and an unstable optimization process.

Question Answering Reinforcement Learning (RL)

Mobile-Agent-V: A Video-Guided Approach for Effortless and Efficient Operational Knowledge Injection in Mobile Automation

no code implementations 20 May 2025 Junyang Wang, Haiyang Xu, Xi Zhang, Ming Yan, Ji Zhang, Fei Huang, Jitao Sang

The exponential rise in mobile device usage necessitates streamlined automation for effective task management, yet many AI frameworks fall short due to inadequate operational expertise.

Management

SoLoPO: Unlocking Long-Context Capabilities in LLMs via Short-to-Long Preference Optimization

no code implementations 16 May 2025 Huashan Sun, Shengyi Liao, Yansen Han, Yu Bai, Yang Gao, Cheng Fu, Weizhou Shen, Fanqi Wan, Ming Yan, Ji Zhang, Fei Huang

SoLoPO is compatible with mainstream preference optimization algorithms, while substantially improving the efficiency of data construction and training processes.

Domain Generalization

WorldPM: Scaling Human Preference Modeling

1 code implementation 15 May 2025 Binghai Wang, Runji Lin, Keming Lu, Le Yu, Zhenru Zhang, Fei Huang, Chujie Zheng, Kai Dang, Yang Fan, Xingzhang Ren, An Yang, Binyuan Hui, Dayiheng Liu, Tao Gui, Qi Zhang, Xuanjing Huang, Yu-Gang Jiang, Bowen Yu, Jingren Zhou, Junyang Lin

Motivated by scaling laws in language modeling that demonstrate how test loss scales as a power law with model and dataset sizes, we find that similar laws exist in preference modeling.
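The power-law form referenced above can be sketched as follows; this is the generic scaling-law shape from language modeling, with the exponent $\alpha$ and constant $D_c$ as illustrative placeholders rather than values reported by the paper:

```latex
% Generic power-law scaling of test loss with dataset size D
% (\alpha and D_c are hypothetical placeholders for illustration)
\mathcal{L}(D) \approx \left(\frac{D_c}{D}\right)^{\alpha}
% Taking logarithms yields a straight line, which is how such laws
% are typically identified on log-log plots:
\log \mathcal{L}(D) \approx \alpha \log D_c - \alpha \log D
```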

Language Modeling Language Modelling

Gated Attention for Large Language Models: Non-linearity, Sparsity, and Attention-Sink-Free

1 code implementation 10 May 2025 Zihan Qiu, Zekun Wang, Bo Zheng, Zeyu Huang, Kaiyue Wen, Songlin Yang, Rui Men, Le Yu, Fei Huang, Suozhi Huang, Dayiheng Liu, Jingren Zhou, Junyang Lin

Gating mechanisms have been widely utilized, from early models like LSTMs and Highway Networks to recent state space models, linear attention, and also softmax attention.

Attribute Mixture-of-Experts +1

ZeroSearch: Incentivize the Search Capability of LLMs without Searching

1 code implementation 7 May 2025 Hao Sun, Zile Qiao, Jiayan Guo, Xuanbo Fan, Yingyan Hou, Yong Jiang, Pengjun Xie, Yan Zhang, Fei Huang, Jingren Zhou

To address these challenges, we introduce ZeroSearch, a novel RL framework that incentivizes the search capability of LLMs without interacting with a real search engine, using simulated searches during training.

Reinforcement Learning (RL) Retrieval

Adaptive Thinking via Mode Policy Optimization for Social Language Agents

1 code implementation 4 May 2025 Minzheng Wang, Yongbin Li, Haobo Wang, Xinghua Zhang, Nan Xu, Bingli Wu, Fei Huang, Haiyang Yu, Wenji Mao

To address this, we propose an Adaptive Mode Learning (AML) framework in this paper, aiming to improve the adaptive thinking ability of language agents in dynamic social interactions.

Supervised Optimism Correction: Be Confident When LLMs Are Sure

no code implementations 10 Apr 2025 Junjie Zhang, Rushuai Yang, Shunyu Liu, Ting-En Lin, Fei Huang, Yi Chen, Yongbin Li, DaCheng Tao

In this work, we establish a novel theoretical connection between supervised fine-tuning and offline reinforcement learning under the token-level Markov decision process, revealing that large language models indeed learn an implicit $Q$-function for inference.
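The connection stated above can be made concrete in the token-level view, where the state is the prompt plus the generated prefix and the action is the next token; under a softmax policy the model's logits play the role of Q-values. The following is a generic sketch of that standard construction, using illustrative notation rather than the paper's exact formulation:

```latex
% Token-level MDP: state s_t = (x, y_{<t}) is the prompt plus the
% generated prefix; action a_t = y_t is the next token.
% A softmax policy over the logits z_\theta identifies them with an
% implicit Q-function (generic construction, illustrative notation):
\pi_\theta(a_t \mid s_t)
  = \frac{\exp\bigl(Q_\theta(s_t, a_t)\bigr)}
         {\sum_{a'} \exp\bigl(Q_\theta(s_t, a')\bigr)},
\qquad
Q_\theta(s_t, a_t) \equiv z_\theta(s_t)[a_t]
```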

GSM8K Math +1

SynWorld: Virtual Scenario Synthesis for Agentic Action Knowledge Refinement

1 code implementation 4 Apr 2025 Runnan Fang, Xiaobin Wang, Yuan Liang, Shuofei Qiao, Jialong Wu, Zekun Xi, Ningyu Zhang, Yong Jiang, Pengjun Xie, Fei Huang, Huajun Chen

In the interaction between agents and their environments, agents expand their capabilities by planning and executing actions.

Navigate

Agentic Knowledgeable Self-awareness

1 code implementation 4 Apr 2025 Shuofei Qiao, Zhisong Qiu, Baochang Ren, Xiaobin Wang, Xiangyuan Ru, Ningyu Zhang, Xiang Chen, Yong Jiang, Pengjun Xie, Fei Huang, Huajun Chen

Our experiments demonstrate that KnowSelf can outperform various strong baselines on different tasks and models with minimal use of external knowledge.

Decision Making

Thinking Longer, Not Larger: Enhancing Software Engineering Agents via Scaling Test-Time Compute

1 code implementation 31 Mar 2025 Yingwei Ma, Yongbin Li, Yihong Dong, Xue Jiang, Rongyu Cao, Jue Chen, Fei Huang, Binhua Li

Recent advancements in software engineering agents have demonstrated promising capabilities in automating program improvements.

Fault localization

Long Is More Important Than Difficult for Training Reasoning Models

no code implementations 23 Mar 2025 Si Shen, Fei Huang, Zhixiao Zhao, Chang Liu, Tiansheng Zheng, Danhao Zhu

Second, we identify a scaling law on reasoning length, showing that model performance increases in a log-linear fashion as the reasoning data length grows.
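The log-linear relationship described above can be written as a simple fitted form; the coefficients $a$ and $b$ are illustrative placeholders, not values reported by the paper:

```latex
% Log-linear scaling of model performance P with reasoning-data
% length L: performance grows linearly in \log L
% (a and b are hypothetical fit coefficients for illustration)
P(L) \approx a + b \log L
```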

Math

WritingBench: A Comprehensive Benchmark for Generative Writing

1 code implementation 7 Mar 2025 Yuning Wu, Jiahao Mei, Ming Yan, Chenliang Li, Shaopeng Lai, Yuran Ren, Zijia Wang, Ji Zhang, Mengyue Wu, Qin Jin, Fei Huang

Recent advancements in large language models (LLMs) have significantly enhanced text generation capabilities, yet evaluating their performance in generative writing remains a challenge.

Text Generation

AutoHete: An Automatic and Efficient Heterogeneous Training System for LLMs

1 code implementation 27 Feb 2025 Zihao Zeng, Chubo Liu, Xin He, Juan Hu, Yong Jiang, Fei Huang, Kenli Li, Wei Yang Bryan Lim

Transformer-based large language models (LLMs) have demonstrated exceptional capabilities in sequence modeling and text generation, with improvements scaling proportionally with model size.

Scheduling Text Generation

Detecting Knowledge Boundary of Vision Large Language Models by Sampling-Based Inference

no code implementations 25 Feb 2025 Zhuo Chen, Xinyu Wang, Yong Jiang, Zhen Zhang, Xinyu Geng, Pengjun Xie, Fei Huang, Kewei Tu

To mitigate the dependence on retrieval and simultaneously maintain, or even improve, the performance benefits provided by retrieval, we propose a method to detect the knowledge boundary of VLLMs, allowing for more efficient use of techniques like RAG.

Question Answering RAG +3

AutoLogi: Automated Generation of Logic Puzzles for Evaluating Reasoning Abilities of Large Language Models

1 code implementation 24 Feb 2025 Qin Zhu, Fei Huang, Runyu Peng, Keming Lu, Bowen Yu, Qinyuan Cheng, Xipeng Qiu, Xuanjing Huang, Junyang Lin

While logical reasoning evaluation of Large Language Models (LLMs) has attracted significant attention, existing benchmarks predominantly rely on multiple-choice formats that are vulnerable to random guessing, leading to overestimated performance and substantial performance fluctuations.

Logical Reasoning Multiple-choice

Enhancing Language Multi-Agent Learning with Multi-Agent Credit Re-Assignment for Interactive Environment Generalization

1 code implementation 20 Feb 2025 Zhitao He, Zijun Liu, Peng Li, May Fung, Ming Yan, Ji Zhang, Fei Huang, Yang Liu

To address these issues, we propose CollabUIAgents, a multi-agent reinforcement learning framework with a novel multi-agent credit re-assignment (CR) strategy. The strategy assigns process rewards with LLMs rather than environment-specific rewards and learns from synthesized preference data, in order to foster generalizable, collaborative behaviors among the role-free agents' policies.

Multi-agent Reinforcement Learning

PC-Agent: A Hierarchical Multi-Agent Collaboration Framework for Complex Task Automation on PC

1 code implementation 20 Feb 2025 Haowei Liu, Xi Zhang, Haiyang Xu, Yuyang Wanyan, Junyang Wang, Ming Yan, Ji Zhang, Chunfeng Yuan, Changsheng Xu, Weiming Hu, Fei Huang

From the decision-making perspective, to handle complex user instructions and interdependent subtasks more effectively, we propose a hierarchical multi-agent collaboration architecture that decomposes decision-making processes into Instruction-Subtask-Action levels.

Decision Making

EPO: Explicit Policy Optimization for Strategic Reasoning in LLMs via Reinforcement Learning

no code implementations 18 Feb 2025 Xiaoqian Liu, Ke Wang, Yongbin Li, Yuchuan Wu, Wentao Ma, Aobo Kong, Fei Huang, Jianbin Jiao, Junge Zhang

Large Language Models (LLMs) have shown impressive reasoning capabilities in well-defined problems with clear solutions, such as mathematics and coding.

Navigate Reinforcement Learning (RL)

LaRA: Benchmarking Retrieval-Augmented Generation and Long-Context LLMs - No Silver Bullet for LC or RAG Routing

no code implementations 14 Feb 2025 Kuan Li, Liwen Zhang, Yong Jiang, Pengjun Xie, Fei Huang, Shuai Wang, Minhao Cheng

However, the advancements in context window size for LLMs offer an alternative approach, raising the question of whether RAG remains necessary for effectively handling external knowledge.

Benchmarking RAG +1

FishBargain: An LLM-Empowered Bargaining Agent for Online Fleamarket Platform Sellers

no code implementations 22 Jan 2025 Dexin Kong, Xu Yan, Ming Chen, Shuguang Han, Jufeng Chen, Fei Huang

Different from traditional Business-to-Consumer e-commerce platforms (e.g., Amazon), online fleamarket platforms (e.g., Craigslist) mainly focus on individual sellers, who lack the time and business proficiency of professional merchants.

Debate Helps Weak-to-Strong Generalization

no code implementations 21 Jan 2025 Hao Lang, Fei Huang, Yongbin Li

Specifically, we investigate ways of improving human supervision with a strong pretrained model and then supervise the strong model with enhanced weak human supervision.

Mobile-Agent-E: Self-Evolving Mobile Assistant for Complex Tasks

1 code implementation 20 Jan 2025 Zhenhailong Wang, Haiyang Xu, Junyang Wang, Xi Zhang, Ming Yan, Ji Zhang, Fei Huang, Heng Ji

By hierarchical, we mean an explicit separation of high-level planning and low-level action execution.

WebWalker: Benchmarking LLMs in Web Traversal

2 code implementations 13 Jan 2025 Jialong Wu, Wenbiao Yin, Yong Jiang, Zhenglin Wang, Zekun Xi, Runnan Fang, Linhai Zhang, Yulan He, Deyu Zhou, Pengjun Xie, Fei Huang

Extensive experimental results show that WebWalkerQA is challenging and demonstrates the effectiveness of RAG combined with WebWalker, through the horizontal and vertical integration in real-world scenarios.

Benchmarking Open-Domain Question Answering +3

Enabling Scalable Oversight via Self-Evolving Critic

no code implementations 10 Jan 2025 Zhengyang Tang, Ziniu Li, Zhenyang Xiao, Tian Ding, Ruoyu Sun, Benyou Wang, Dayiheng Liu, Fei Huang, Tianyu Liu, Bowen Yu, Junyang Lin

Despite their remarkable performance, the development of Large Language Models (LLMs) faces a critical challenge in scalable oversight: providing effective feedback for tasks where human evaluation is difficult or where LLMs outperform humans.

OpenOmni: Large Language Models Pivot Zero-shot Omnimodal Alignment across Language with Real-time Self-Aware Emotional Speech Synthesis

1 code implementation 8 Jan 2025 Run Luo, Ting-En Lin, Haonan Zhang, Yuchuan Wu, Xiong Liu, Min Yang, Yongbin Li, Longze Chen, Jiaming Li, Lei Zhang, Yangyi Chen, Hamid Alinejad-Rokny, Fei Huang

In the alignment phase, a pre-trained speech model is further trained on text-image tasks to generalize from vision to speech in a (near) zero-shot manner, outperforming models trained on tri-modal datasets.

Decoder Emotional Speech Synthesis +3

SDPO: Segment-Level Direct Preference Optimization for Social Agents

1 code implementation 3 Jan 2025 Aobo Kong, Wentao Ma, Shiwan Zhao, Yongbin Li, Yuchuan Wu, Ke Wang, Xiaoqian Liu, Qicheng Li, Yong Qin, Fei Huang

To address these limitations, we propose Segment-Level Direct Preference Optimization (SDPO), which focuses on specific key segments within interactions to optimize multi-turn agent behavior while minimizing training noise.

Unfolding the Headline: Iterative Self-Questioning for News Retrieval and Timeline Summarization

1 code implementation 1 Jan 2025 Weiqi Wu, Shen Huang, Yong Jiang, Pengjun Xie, Fei Huang, Hai Zhao

In the fast-changing realm of information, the capacity to construct coherent timelines from extensive event-related content has become increasingly significant and challenging.

News Retrieval Retrieval +1

DISC: Plug-and-Play Decoding Intervention with Similarity of Characters for Chinese Spelling Check

no code implementations 17 Dec 2024 Ziheng Qiao, Houquan Zhou, Yumeng Liu, Zhenghua Li, Min Zhang, Bo Zhang, Chen Li, Ji Zhang, Fei Huang

One key characteristic of the Chinese spelling check (CSC) task is that incorrect characters are usually similar to the correct ones in either phonetics or glyph.

DEMO: Reframing Dialogue Interaction with Fine-grained Element Modeling

1 code implementation 6 Dec 2024 Minzheng Wang, Xinghua Zhang, Kun Chen, Nan Xu, Haiyang Yu, Fei Huang, Wenji Mao, Yongbin Li

Despite the large volume of dialogue-related studies, there is a lack of benchmarks that encompass comprehensive dialogue elements, which hinders precise modeling, generation, and systematic evaluation.

Dialogue Generation Imitation Learning

LLMs as Continuous Learners: Improving the Reproduction of Defective Code in Software Issues

no code implementations 21 Nov 2024 Yalan Lin, Yingwei Ma, Rongyu Cao, Binhua Li, Fei Huang, Xiaodong Gu, Yongbin Li

Reproducing buggy code is the first and crucially important step in issue resolution, as it aids in identifying the underlying problems and validating that generated patches resolve them.

SymDPO: Boosting In-Context Learning of Large Multimodal Models with Symbol Demonstration Direct Preference Optimization

1 code implementation CVPR 2025 Hongrui Jia, Chaoya Jiang, Haiyang Xu, Wei Ye, Mengfan Dong, Ming Yan, Ji Zhang, Fei Huang, Shikun Zhang

This forces the model to carefully understand the demonstration images and establish a relationship between the images and the symbols to answer questions correctly.

In-Context Learning

P-MMEval: A Parallel Multilingual Multitask Benchmark for Consistent Evaluation of LLMs

no code implementations 14 Nov 2024 Yidan Zhang, Yu Wan, Boyi Deng, Baosong Yang, Haoran Wei, Bowen Yu, Junyang Lin, Fei Huang, Jingren Zhou

Recent advancements in large language models (LLMs) showcase varied multilingual capabilities across tasks like translation, code generation, and reasoning.

Code Generation Transfer Learning

Exploring Knowledge Boundaries in Large Language Models for Retrieval Judgment

no code implementations 9 Nov 2024 Zhen Zhang, Xinyu Wang, Yong Jiang, Zhuo Chen, Feiteng Mu, Mengting Hu, Pengjun Xie, Fei Huang

Actually, we find that the impact of RAG on the question answering capabilities of LLMs can be categorized into three groups: beneficial, neutral, and harmful.

Question Answering RAG +2

IOPO: Empowering LLMs with Complex Instruction Following via Input-Output Preference Optimization

1 code implementation 9 Nov 2024 Xinghua Zhang, Haiyang Yu, Cheng Fu, Fei Huang, Yongbin Li

In the realm of large language models (LLMs), the ability of models to accurately follow instructions is paramount as more agents and applications leverage LLMs for construction, where the complexity of instructions is rapidly increasing.

Instruction Following

Benchmarking Multimodal Retrieval Augmented Generation with Dynamic VQA Dataset and Self-adaptive Planning Agent

1 code implementation 5 Nov 2024 Yangning Li, Yinghui Li, Xinyu Wang, Yong Jiang, Zhen Zhang, Xinran Zheng, Hui Wang, Hai-Tao Zheng, Pengjun Xie, Philip S. Yu, Fei Huang, Jingren Zhou

To bridge the dataset gap, we first construct Dyn-VQA dataset, consisting of three types of "dynamic" questions, which require complex knowledge retrieval strategies variable in query, tool, and time: (1) Questions with rapidly changing answers.

Benchmarking Hallucination +4

Lingma SWE-GPT: An Open Development-Process-Centric Language Model for Automated Software Improvement

1 code implementation 1 Nov 2024 Yingwei Ma, Rongyu Cao, Yongchang Cao, Yue Zhang, Jue Chen, Yibo Liu, Yuchen Liu, Binhua Li, Fei Huang, Yongbin Li

The results demonstrate that Lingma SWE-GPT 72B successfully resolves 30.20% of the GitHub issues, marking a significant improvement in automatic issue resolution (22.76% relative improvement compared to Llama 3.1 405B), approaching the performance of closed-source models (31.80% of issues resolved by GPT-4o).

Language Modeling Language Modelling

EvoCodeBench: An Evolving Code Generation Benchmark with Domain-Specific Evaluations

no code implementations 30 Oct 2024 Jia Li, Ge Li, Xuanming Zhang, YunFei Zhao, Yihong Dong, Zhi Jin, Binhua Li, Fei Huang, Yongbin Li

These evaluations help practitioners select superior LLMs in specific domains and discover the shortcomings of existing LLMs.

Code Generation Fairness

An Adaptive Framework for Generating Systematic Explanatory Answer in Online Q&A Platforms

1 code implementation 23 Oct 2024 Ziyang Chen, Xiaobin Wang, Yong Jiang, Jinzhi Liao, Pengjun Xie, Fei Huang, Xiang Zhao

The naive RAG models, although effective in information retrieval, struggle with complex questions that require comprehensive and in-depth answers.

Answer Generation Information Retrieval +2

IPL: Leveraging Multimodal Large Language Models for Intelligent Product Listing

no code implementations 22 Oct 2024 Kang Chen, Qingheng Zhang, Chengbao Lian, Yixin Ji, Xuwei Liu, Shuguang Han, Guoqiang Wu, Fei Huang, Jufeng Chen

To this end, we develop IPL, an Intelligent Product Listing tool tailored to generate descriptions using various product attributes such as category, brand, color, condition, etc.

Hallucination RAG +1

On the Role of Attention Heads in Large Language Model Safety

1 code implementation 17 Oct 2024 Zhenhong Zhou, Haiyang Yu, Xinghua Zhang, Rongwu Xu, Fei Huang, Kun Wang, Yang Liu, Junfeng Fang, Yongbin Li

In light of this, recent research on safety mechanisms has emerged, revealing that when safety representations or components are suppressed, the safety capabilities of LLMs are compromised.

Attribute Language Modeling +2

Benchmarking Agentic Workflow Generation

1 code implementation 10 Oct 2024 Shuofei Qiao, Runnan Fang, Zhisong Qiu, Xiaobin Wang, Ningyu Zhang, Yong Jiang, Pengjun Xie, Fei Huang, Huajun Chen

Large Language Models (LLMs), with their exceptional ability to handle a wide range of tasks, have driven significant advancements in tackling reasoning and planning tasks, wherein decomposing complex problems into executable workflows is a crucial step.

Benchmarking

A Simple yet Effective Training-free Prompt-free Approach to Chinese Spelling Correction Based on Large Language Models

1 code implementation 5 Oct 2024 Houquan Zhou, Zhenghua Li, Bo Zhang, Chen Li, Shaopeng Lai, Ji Zhang, Fei Huang, Min Zhang

This work proposes a simple training-free prompt-free approach to leverage large language models (LLMs) for the Chinese spelling correction (CSC) task, which is totally different from all previous CSC approaches.

Language Modeling Language Modelling +2

Codev-Bench: How Do LLMs Understand Developer-Centric Code Completion?

1 code implementation 2 Oct 2024 Zhenyu Pan, Rongyu Cao, Yongchang Cao, Yingwei Ma, Binhua Li, Fei Huang, Han Liu, Yongbin Li

Code completion, a key downstream task in code generation, is one of the most frequent and impactful methods for enhancing developer productivity in software development.

Code Completion Code Generation

In-Context Transfer Learning: Demonstration Synthesis by Transferring Similar Tasks

1 code implementation 2 Oct 2024 Dingzirui Wang, Xuanliang Zhang, Qiguang Chen, Longxu Dou, Xiao Xu, Rongyu Cao, Yingwei Ma, Qingfu Zhu, Wanxiang Che, Binhua Li, Fei Huang, Yongbin Li

To address this, inspired by transfer learning, we propose In-Context Transfer Learning (ICTL), which synthesizes target task demonstrations by transferring labeled demonstrations from similar source tasks.

In-Context Learning Transfer Learning

The Imperative of Conversation Analysis in the Era of LLMs: A Survey of Tasks, Techniques, and Trends

no code implementations 21 Sep 2024 Xinghua Zhang, Haiyang Yu, Yongbin Li, Minzheng Wang, Longze Chen, Fei Huang

In the era of large language models (LLMs), a vast number of conversation logs will accumulate thanks to the rapid development of language UIs.

MMEvol: Empowering Multimodal Large Language Models with Evol-Instruct

no code implementations 9 Sep 2024 Run Luo, Haonan Zhang, Longze Chen, Ting-En Lin, Xiong Liu, Yuchuan Wu, Min Yang, Minzheng Wang, Pengpeng Zeng, Lianli Gao, Heng Tao Shen, Yunshui Li, Xiaobo Xia, Fei Huang, Jingkuan Song, Yongbin Li

This framework iteratively improves data quality through a refined combination of fine-grained perception, cognitive reasoning, and interaction evolution, generating a more complex and diverse image-text instruction dataset that empowers MLLMs with enhanced capabilities.

Diversity Visual Reasoning

mPLUG-DocOwl2: High-resolution Compressing for OCR-free Multi-page Document Understanding

1 code implementation 5 Sep 2024 Anwen Hu, Haiyang Xu, Liang Zhang, Jiabo Ye, Ming Yan, Ji Zhang, Qin Jin, Fei Huang, Jingren Zhou

Multimodal Large Language Models (MLLMs) have achieved promising OCR-free document understanding performance by increasing the supported resolution of document images.

document understanding Optical Character Recognition (OCR) +1

Platypus: A Generalized Specialist Model for Reading Text in Various Forms

1 code implementation 27 Aug 2024 Peng Wang, Zhaohai Li, Jun Tang, Humen Zhong, Fei Huang, Zhibo Yang, Cong Yao

Recently, generalist models (such as GPT-4V), trained on tremendous data in a unified way, have shown enormous potential in reading text in various scenarios, but with the drawbacks of limited accuracy and low efficiency.

Handwritten Text Recognition Scene Text Recognition

MaVEn: An Effective Multi-granularity Hybrid Visual Encoding Framework for Multimodal Large Language Model

no code implementations 22 Aug 2024 Chaoya Jiang, Jia Hongrui, Haiyang Xu, Wei Ye, Mengfan Dong, Ming Yan, Ji Zhang, Fei Huang, Shikun Zhang

This paper presents MaVEn, an innovative Multi-granularity Visual Encoding framework designed to enhance the capabilities of Multimodal Large Language Models (MLLMs) in multi-image reasoning.

Language Modeling Language Modelling +3

mGTE: Generalized Long-Context Text Representation and Reranking Models for Multilingual Text Retrieval

no code implementations 29 Jul 2024 Xin Zhang, Yanzhao Zhang, Dingkun Long, Wen Xie, Ziqi Dai, Jialong Tang, Huan Lin, Baosong Yang, Pengjun Xie, Fei Huang, Meishan Zhang, Wenjie Li, Min Zhang

We first introduce a text encoder (base size) enhanced with RoPE and unpadding, pre-trained with a native 8192-token context (longer than the 512 tokens of previous multilingual encoders).

Contrastive Learning Reranking +2

MIBench: Evaluating Multimodal Large Language Models over Multiple Images

no code implementations 21 Jul 2024 Haowei Liu, Xi Zhang, Haiyang Xu, Yaya Shi, Chaoya Jiang, Ming Yan, Ji Zhang, Fei Huang, Chunfeng Yuan, Bing Li, Weiming Hu

However, most existing MLLMs and benchmarks primarily focus on single-image input scenarios, leaving the performance of MLLMs when handling realistic multiple images underexplored.

In-Context Learning Multiple-choice

Visual Text Generation in the Wild

1 code implementation 19 Jul 2024 Yuanzhi Zhu, Jiawei Liu, Feiyu Gao, Wenyu Liu, Xinggang Wang, Peng Wang, Fei Huang, Cong Yao, Zhibo Yang

However, it is still challenging to render high-quality text images in real-world scenarios, as three critical criteria must be satisfied: (1) Fidelity: the generated text images should be photo-realistic, and their contents should match the given conditions; (2) Reasonability: the regions and contents of the generated text should cohere with the scene; (3) Utility: the generated text images should facilitate related tasks (e.g., text detection and recognition).

Language Modelling Large Language Model +3

Retrieved In-Context Principles from Previous Mistakes

no code implementations 8 Jul 2024 Hao Sun, Yong Jiang, Bo Wang, Yingyan Hou, Yan Zhang, Pengjun Xie, Fei Huang

In RICP, the teacher model analyzes mistakes from the student model to generate reasons and insights for preventing similar mistakes.

In-Context Learning

ProductAgent: Benchmarking Conversational Product Search Agent with Asking Clarification Questions

no code implementations 1 Jul 2024 Jingheng Ye, Yong Jiang, Xiaobin Wang, Yinghui Li, Yangning Li, Hai-Tao Zheng, Pengjun Xie, Fei Huang

To address this task, we propose ProductAgent, a conversational information seeking agent equipped with abilities of strategic clarification question generation and dynamic product retrieval.

Benchmarking Question Generation +2

Leave No Document Behind: Benchmarking Long-Context LLMs with Extended Multi-Doc QA

1 code implementation 25 Jun 2024 Minzheng Wang, Longze Chen, Cheng Fu, Shengyi Liao, Xinghua Zhang, Bingli Wu, Haiyang Yu, Nan Xu, Lei Zhang, Run Luo, Yunshui Li, Min Yang, Fei Huang, Yongbin Li

Long-context modeling capabilities have garnered widespread attention, leading to the emergence of Large Language Models (LLMs) with ultra-context windows.

Benchmarking Long-Context Understanding +3

FlowBench: Revisiting and Benchmarking Workflow-Guided Planning for LLM-based Agents

no code implementations 21 Jun 2024 Ruixuan Xiao, Wentao Ma, Ke Wang, Yuchuan Wu, Junbo Zhao, Haobo Wang, Fei Huang, Yongbin Li

Motivated by this, we formalize different formats of workflow knowledge and present FlowBench, the first benchmark for workflow-guided planning.

Benchmarking

Query Routing for Homogeneous Tools: An Instantiation in the RAG Scenario

no code implementations 18 Jun 2024 Feiteng Mu, Yong Jiang, Liwen Zhang, Chu Liu, Wenjie Li, Pengjun Xie, Fei Huang

Current research on tool learning primarily focuses on selecting the most effective tool from a wide array of options, often overlooking cost-effectiveness, a crucial factor in human problem-solving.

RAG

Supportiveness-based Knowledge Rewriting for Retrieval-augmented Language Modeling

no code implementations 12 Jun 2024 Zile Qiao, Wei Ye, Yong Jiang, Tong Mo, Pengjun Xie, Weiping Li, Fei Huang, Shikun Zhang

Retrieval-augmented language models (RALMs) have recently shown great potential in mitigating the limitations of implicit knowledge in LLMs, such as untimely updating of the latest expertise and unreliable retention of long-tail knowledge.

Language Modeling Language Modelling +1

Mobile-Agent-v2: Mobile Device Operation Assistant with Effective Navigation via Multi-Agent Collaboration

2 code implementations 3 Jun 2024 Junyang Wang, Haiyang Xu, Haitao Jia, Xi Zhang, Ming Yan, Weizhou Shen, Ji Zhang, Fei Huang, Jitao Sang

However, the two major navigation challenges in mobile device operation tasks, task progress navigation and focus content navigation, are significantly complicated under the single-agent architecture of existing work.

Alibaba LingmaAgent: Improving Automated Issue Resolution via Comprehensive Repository Exploration

no code implementations 3 Jun 2024 Yingwei Ma, Qingping Yang, Rongyu Cao, Binhua Li, Fei Huang, Yongbin Li

This paper presents Alibaba LingmaAgent, a novel Automated Software Engineering method designed to comprehensively understand and utilize whole software repositories for issue resolution.

Language Modelling Large Language Model

Online Merging Optimizers for Boosting Rewards and Mitigating Tax in Alignment

1 code implementation 28 May 2024 Keming Lu, Bowen Yu, Fei Huang, Yang Fan, Runji Lin, Chang Zhou

Effectively aligning Large Language Models (LLMs) with human-centric values while preventing the degradation of abilities acquired through Pre-training and Supervised Fine-tuning (SFT) poses a central challenge in Reinforcement Learning from Human Feedback (RLHF).

Agent Planning with World Knowledge Model

1 code implementation 23 May 2024 Shuofei Qiao, Runnan Fang, Ningyu Zhang, Yuqi Zhu, Xiang Chen, Shumin Deng, Yong Jiang, Pengjun Xie, Fei Huang, Huajun Chen

Imitating the human mental world knowledge model, which provides global prior knowledge before a task and maintains local dynamic knowledge during it, in this paper we introduce a parametric World Knowledge Model (WKM) to facilitate agent planning.

model World Knowledge

RaFe: Ranking Feedback Improves Query Rewriting for RAG

no code implementations 23 May 2024 Shengyu Mao, Yong Jiang, Boli Chen, Xiao Li, Peng Wang, Xinyu Wang, Pengjun Xie, Fei Huang, Huajun Chen, Ningyu Zhang

As Large Language Models (LLMs) and Retrieval Augmentation Generation (RAG) techniques have evolved, query rewriting has been widely incorporated into the RAG system for downstream tasks like open-domain QA.

RAG Retrieval

WISE: Rethinking the Knowledge Memory for Lifelong Model Editing of Large Language Models

1 code implementation 23 May 2024 Peng Wang, Zexi Li, Ningyu Zhang, Ziwen Xu, Yunzhi Yao, Yong Jiang, Pengjun Xie, Fei Huang, Huajun Chen

In WISE, we design a dual parametric memory scheme, which consists of the main memory for the pretrained knowledge and a side memory for the edited knowledge.

Hallucination Model Editing +2

Why You Should Not Trust Interpretations in Machine Learning: Adversarial Attacks on Partial Dependence Plots

no code implementations 29 Apr 2024 Xi Xin, Giles Hooker, Fei Huang

The adoption of artificial intelligence (AI) across industries has led to the widespread use of complex black-box models and interpretation tools for decision making.

Decision Making Management

A Survey on Self-Evolution of Large Language Models

1 code implementation 22 Apr 2024 Zhengwei Tao, Ting-En Lin, Xiancai Chen, Hangyu Li, Yuchuan Wu, Yongbin Li, Zhi Jin, Fei Huang, DaCheng Tao, Jingren Zhou

To address this issue, self-evolution approaches that enable LLM to autonomously acquire, refine, and learn from experiences generated by the model itself are rapidly growing.

Diversity Survey

Exploring Key Point Analysis with Pairwise Generation and Graph Partitioning

1 code implementation 17 Apr 2024 Xiao Li, Yong Jiang, Shen Huang, Pengjun Xie, Gong Cheng, Fei Huang

Our objective is to train a generative model that can simultaneously provide a score indicating the presence of a shared key point between a pair of arguments and generate that shared key point.

Argument Mining graph partitioning +2

Improving Retrieval Augmented Open-Domain Question-Answering with Vectorized Contexts

1 code implementation 2 Apr 2024 Zhuo Chen, Xinyu Wang, Yong Jiang, Pengjun Xie, Fei Huang, Kewei Tu

With our method, the original language models can cover contexts several times longer while keeping computing requirements close to the baseline.

In-Context Learning Language Modeling +4

Fine-Tuning Language Models with Reward Learning on Policy

1 code implementation 28 Mar 2024 Hao Lang, Fei Huang, Yongbin Li

RLHF consists of three steps, i.e., human preference collection, reward learning, and policy optimization, which are usually performed serially.

Multi-View Learning

OmniParser: A Unified Framework for Text Spotting, Key Information Extraction and Table Recognition

1 code implementation 28 Mar 2024 Jianqiang Wan, Sibo Song, Wenwen Yu, Yuliang Liu, Wenqing Cheng, Fei Huang, Xiang Bai, Cong Yao, Zhibo Yang

Recently, visually-situated text parsing (VsTP) has experienced notable advancements, driven by the increasing demand for automated document understanding and the emergence of Generative Large Language Models (LLMs) capable of processing document-based questions.

Decoder document understanding +4

ReAct Meets ActRe: When Language Agents Enjoy Training Data Autonomy

no code implementations 21 Mar 2024 Zonghan Yang, Peng Li, Ming Yan, Ji Zhang, Fei Huang, Yang Liu

In WebShop, the 1-shot performance of the A$^3$T agent matches the human average, and 4 rounds of iterative refinement bring its performance close to that of human experts.

Policy Gradient Methods

SocialBench: Sociality Evaluation of Role-Playing Conversational Agents

2 code implementations 20 Mar 2024 Hongzhan Chen, Hehong Chen, Ming Yan, Wenshen Xu, Xing Gao, Weizhou Shen, Xiaojun Quan, Chenliang Li, Ji Zhang, Fei Huang, Jingren Zhou

In this paper, we introduce SocialBench, the first benchmark designed to systematically evaluate the sociality of role-playing conversational agents at both individual and group levels of social interactions.

mPLUG-DocOwl 1.5: Unified Structure Learning for OCR-free Document Understanding

1 code implementation 19 Mar 2024 Anwen Hu, Haiyang Xu, Jiabo Ye, Ming Yan, Liang Zhang, Bo Zhang, Chen Li, Ji Zhang, Qin Jin, Fei Huang, Jingren Zhou

In this work, we emphasize the importance of structure information in Visual Document Understanding and propose the Unified Structure Learning to boost the performance of MLLMs.

document understanding Optical Character Recognition (OCR)

Scaling Data Diversity for Fine-Tuning Language Models in Human Alignment

1 code implementation 17 Mar 2024 Feifan Song, Bowen Yu, Hao Lang, Haiyang Yu, Fei Huang, Houfeng Wang, Yongbin Li

Additionally, the concept of diversity for prompts can be more complex than that of responses, which are typically quantified by single digits.

Data Augmentation Diversity

Improving Cross-lingual Representation for Semantic Retrieval with Code-switching

no code implementations 3 Mar 2024 Mieradilijiang Maimaiti, Yuanhang Zheng, Ji Zhang, Fei Huang, Yue Zhang, Wenpei Luo, Kaiyu Huang

Semantic Retrieval (SR) has become an indispensable part of the FAQ system in the task-oriented question-answering (QA) dialogue scenario.

Question Answering Retrieval +3

Let LLMs Take on the Latest Challenges! A Chinese Dynamic Question Answering Benchmark

1 code implementation 29 Feb 2024 Zhikun Xu, Yinghui Li, Ruixue Ding, Xinyu Wang, Boli Chen, Yong Jiang, Hai-Tao Zheng, Wenlian Lu, Pengjun Xie, Fei Huang

To promote the improvement of Chinese LLMs' ability to answer dynamic questions, in this paper, we introduce CDQA, a Chinese Dynamic QA benchmark containing question-answer pairs related to the latest news on the Chinese Internet.

Question Answering

Training-Free Long-Context Scaling of Large Language Models

1 code implementation 27 Feb 2024 Chenxin An, Fei Huang, Jun Zhang, Shansan Gong, Xipeng Qiu, Chang Zhou, Lingpeng Kong

The ability of Large Language Models (LLMs) to process and generate coherent text is markedly weakened when the number of input tokens exceeds their pretraining length.

16k

Unifying Latent and Lexicon Representations for Effective Video-Text Retrieval

no code implementations 26 Feb 2024 Haowei Liu, Yaya Shi, Haiyang Xu, Chunfeng Yuan, Qinghao Ye, Chenliang Li, Ming Yan, Ji Zhang, Fei Huang, Bing Li, Weiming Hu

In this work, we propose the UNIFY framework, which learns lexicon representations to capture fine-grained semantics and combines the strengths of latent and lexicon representations for video-text retrieval.

Text Retrieval Video-Text Retrieval

Budget-Constrained Tool Learning with Planning

2 code implementations 25 Feb 2024 Yuanhang Zheng, Peng Li, Ming Yan, Ji Zhang, Fei Huang, Yang Liu

Despite intensive efforts devoted to tool learning, the problem of budget-constrained tool learning, which focuses on resolving user queries within a specific budget constraint, has been widely overlooked.

Self-Retrieval: End-to-End Information Retrieval with One Large Language Model

2 code implementations 23 Feb 2024 Qiaoyu Tang, Jiawei Chen, Zhuoqun Li, Bowen Yu, Yaojie Lu, Cheng Fu, Haiyang Yu, Hongyu Lin, Fei Huang, Ben He, Xianpei Han, Le Sun, Yongbin Li

However, current interactions between IR systems and LLMs remain limited, with LLMs merely serving as part of components within IR systems, and IR systems being constructed independently of LLMs.

Information Retrieval Language Modeling +6

PANDA: Preference Adaptation for Enhancing Domain-Specific Abilities of LLMs

1 code implementation 20 Feb 2024 An Liu, Zonghan Yang, Zhenhe Zhang, Qingyuan Hu, Peng Li, Ming Yan, Ji Zhang, Fei Huang, Yang Liu

While Large language models (LLMs) have demonstrated considerable capabilities across various natural language tasks, they often fall short of the performance achieved by domain-specific state-of-the-art models.

text-classification Text Classification

Model Composition for Multimodal Large Language Models

1 code implementation 20 Feb 2024 Chi Chen, Yiyang Du, Zheng Fang, Ziyue Wang, Fuwen Luo, Peng Li, Ming Yan, Ji Zhang, Fei Huang, Maosong Sun, Yang Liu

In this paper, we propose a new paradigm through the model composition of existing MLLMs to create a new model that retains the modal understanding capabilities of each original model.

model

Browse and Concentrate: Comprehending Multimodal Content via prior-LLM Context Fusion

1 code implementation 19 Feb 2024 Ziyue Wang, Chi Chen, Yiqi Zhu, Fuwen Luo, Peng Li, Ming Yan, Ji Zhang, Fei Huang, Maosong Sun, Yang Liu

With the bloom of Large Language Models (LLMs), Multimodal Large Language Models (MLLMs) that incorporate LLMs with pre-trained vision models have recently demonstrated impressive performance across diverse vision-language tasks.

Enabling Weak LLMs to Judge Response Reliability via Meta Ranking

no code implementations 19 Feb 2024 Zijun Liu, Boqun Kou, Peng Li, Ming Yan, Ji Zhang, Fei Huang, Yang Liu

In model cascading, we combine open- and closed-source LLMs to achieve performance comparable to GPT-4-turbo with lower costs.

Hallucination In-Context Learning

AI Hospital: Benchmarking Large Language Models in a Multi-agent Medical Interaction Simulator

1 code implementation 15 Feb 2024 Zhihao Fan, Jialong Tang, Wei Chen, Siyuan Wang, Zhongyu Wei, Jun Xi, Fei Huang, Jingren Zhou

Artificial intelligence has significantly advanced healthcare, particularly through large language models (LLMs) that excel in medical question answering benchmarks.

Benchmarking Diagnostic +1

Mobile-Agent: Autonomous Multi-Modal Mobile Device Agent with Visual Perception

1 code implementation 29 Jan 2024 Junyang Wang, Haiyang Xu, Jiabo Ye, Ming Yan, Weizhou Shen, Ji Zhang, Fei Huang, Jitao Sang

To assess the performance of Mobile-Agent, we introduced Mobile-Eval, a benchmark for evaluating mobile device operations.

Small LLMs Are Weak Tool Learners: A Multi-LLM Agent

1 code implementation 14 Jan 2024 Weizhou Shen, Chenliang Li, Hongzhan Chen, Ming Yan, Xiaojun Quan, Hehong Chen, Ji Zhang, Fei Huang

Each component is implemented by a single LLM that focuses on a specific capability and collaborates with others to accomplish the task.

Language Modelling Large Language Model +1

Efficient Vision-and-Language Pre-training with Text-Relevant Image Patch Selection

no code implementations 11 Jan 2024 Wei Ye, Chaoya Jiang, Haiyang Xu, Chenhao Ye, Chenliang Li, Ming Yan, Shikun Zhang, Songhang Huang, Fei Huang

Vision Transformers (ViTs) have become increasingly popular in large-scale Vision and Language Pre-training (VLP) models.

LORE++: Logical Location Regression Network for Table Structure Recognition with Pre-training

no code implementations 3 Jan 2024 Rujiao Long, Hangdi Xing, Zhibo Yang, Qi Zheng, Zhi Yu, Cong Yao, Fei Huang

We model TSR as a logical location regression problem and propose a new TSR framework called LORE, standing for LOgical location REgression network, which for the first time regresses logical location as well as spatial location of table cells in a unified network.

regression

DialCLIP: Empowering CLIP as Multi-Modal Dialog Retriever

no code implementations 2 Jan 2024 Zhichao Yin, Binyuan Hui, Min Yang, Fei Huang, Yongbin Li

Recently, substantial advancements in pre-trained vision-language models have greatly enhanced the capabilities of multi-modal dialog systems.

Language Modelling Retrieval

Unifying Structured Data as Graph for Data-to-Text Pre-Training

1 code implementation 2 Jan 2024 Shujie Li, Liang Li, Ruiying Geng, Min Yang, Binhua Li, Guanghu Yuan, Wanwei He, Shao Yuan, Can Ma, Fei Huang, Yongbin Li

In this paper, we unify different types of structured data (i.e., tables, key-value data, and knowledge graphs) into the graph format and cast different data-to-text generation tasks as graph-to-text generation.

Data-to-Text Generation

OmniParser: A Unified Framework for Text Spotting, Key Information Extraction and Table Recognition

1 code implementation CVPR 2024 Jianqiang Wan, Sibo Song, Wenwen Yu, Yuliang Liu, Wenqing Cheng, Fei Huang, Xiang Bai, Cong Yao, Zhibo Yang

Recently, visually-situated text parsing (VsTP) has experienced notable advancements, driven by the increasing demand for automated document understanding and the emergence of Generative Large Language Models (LLMs) capable of processing document-based questions.

Decoder document understanding +4

EcomGPT-CT: Continual Pre-training of E-commerce Large Language Models with Semi-structured Data

no code implementations 25 Dec 2023 Shirong Ma, Shen Huang, Shulin Huang, Xiaobin Wang, Yangning Li, Hai-Tao Zheng, Pengjun Xie, Fei Huang, Yong Jiang

Experimental results demonstrate the effectiveness of continual pre-training of E-commerce LLMs and the efficacy of our devised data mixing strategy.

In-Context Learning

One-Shot Learning as Instruction Data Prospector for Large Language Models

1 code implementation 16 Dec 2023 Yunshui Li, Binyuan Hui, Xiaobo Xia, Jiaxi Yang, Min Yang, Lei Zhang, Shuzheng Si, Ling-Hao Chen, Junhao Liu, Tongliang Liu, Fei Huang, Yongbin Li

Contemporary practices in instruction tuning often hinge on enlarging data scaling without a clear strategy for ensuring data quality, inadvertently introducing noise that may compromise model performance.

One-Shot Learning

Hallucination Augmented Contrastive Learning for Multimodal Large Language Model

1 code implementation CVPR 2024 Chaoya Jiang, Haiyang Xu, Mengfan Dong, Jiaxing Chen, Wei Ye, Ming Yan, Qinghao Ye, Ji Zhang, Fei Huang, Shikun Zhang

We first analyzed the representation distribution of textual and visual tokens in MLLM, revealing two important findings: 1) there is a significant gap between textual and visual representations, indicating unsatisfactory cross-modal representation alignment; 2) representations of texts that contain and do not contain hallucinations are entangled, making it challenging to distinguish them.

Contrastive Learning Hallucination +6

Jointly spatial-temporal representation learning for individual trajectories

no code implementations 7 Dec 2023 Fei Huang, Jianrong Lv, Yang Yue

The proposed ST-GraphRL consists of three components: (i) a weighted directed spatial-temporal graph to explicitly construct mobility interactions in both space and time dimensions; (ii) a two-stage joint encoder (i.e., decoupling and fusion) to learn entangled spatial-temporal dependencies by independently decomposing and jointly aggregating space and time information; (iii) a decoder that guides ST-GraphRL to learn explicit mobility regularities by simulating the spatial-temporal distributions of trajectories.

Decoder Representation Learning

Fortify the Shortest Stave in Attention: Enhancing Context Awareness of Large Language Models for Effective Tool Use

1 code implementation 7 Dec 2023 Yuhan Chen, Ang Lv, Ting-En Lin, Changyu Chen, Yuchuan Wu, Fei Huang, Yongbin Li, Rui Yan

Specifically, crucial information in the context is liable to be overlooked by the model when it is positioned in the trough zone of the attention waveform, leading to decreased performance.

RAG Trajectory Planning

mPLUG-PaperOwl: Scientific Diagram Analysis with the Multimodal Large Language Model

1 code implementation 30 Nov 2023 Anwen Hu, Yaya Shi, Haiyang Xu, Jiabo Ye, Qinghao Ye, Ming Yan, Chenliang Li, Qi Qian, Ji Zhang, Fei Huang

In this work, towards a more versatile copilot for academic paper writing, we mainly focus on strengthening the multi-modal diagram analysis ability of Multimodal LLMs.

Language Modeling Language Modelling +2

Speculative Contrastive Decoding

no code implementations 15 Nov 2023 Hongyi Yuan, Keming Lu, Fei Huang, Zheng Yuan, Chang Zhou

Large language models (LLMs) exhibit exceptional performance in language tasks, yet their auto-regressive inference is limited by high computational requirements and is sub-optimal due to exposure bias.

Language Models are Super Mario: Absorbing Abilities from Homologous Models as a Free Lunch

2 code implementations 6 Nov 2023 Le Yu, Bowen Yu, Haiyang Yu, Fei Huang, Yongbin Li

We experiment with encoder- and decoder-based LMs, showing that: (1) SFT delta parameter value ranges are typically small (within 0.002) with extreme redundancy, and DARE can effortlessly eliminate 90% or even 99% of them; (2) DARE can merge multiple task-specific LMs into one LM with diverse capabilities.

Decoder GSM8K +1
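The DARE operation summarized in this entry (drop delta parameters, then rescale the survivors) can be sketched in a few lines. This is a toy illustration on scalar weights, not the paper's implementation; `dare_merge` and the example weight dictionaries are hypothetical names introduced here.

```python
import random

def dare_merge(base, finetuned, drop_rate, seed=0):
    """Drop-And-REscale sketch: for each parameter, compute the SFT delta
    (finetuned - base), drop it with probability `drop_rate`, and rescale
    surviving deltas by 1 / (1 - drop_rate) before adding them back."""
    rng = random.Random(seed)  # fixed seed for a reproducible drop mask
    merged = {}
    for name, w0 in base.items():
        delta = finetuned[name] - w0
        if rng.random() < drop_rate:
            merged[name] = w0  # delta dropped entirely
        else:
            merged[name] = w0 + delta / (1.0 - drop_rate)  # rescaled delta
    return merged

# Toy "model": three scalar parameters with tiny SFT deltas.
base = {"w1": 1.0, "w2": -0.5, "w3": 0.25}
sft = {"w1": 1.001, "w2": -0.499, "w3": 0.252}
print(dare_merge(base, sft, drop_rate=0.9))
```

With `drop_rate=0.0` the merge reproduces the fine-tuned weights; with high drop rates most deltas vanish and the survivors are amplified, which is the redundancy property the abstract describes.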

Improving Factual Consistency of Text Summarization by Adversarially Decoupling Comprehension and Embellishment Abilities of LLMs

no code implementations 30 Oct 2023 Huawen Feng, Yan Fan, Xiong Liu, Ting-En Lin, Zekun Yao, Yuchuan Wu, Fei Huang, Yongbin Li, Qianli Ma

Despite the recent progress in text summarization made by large language models (LLMs), they often generate summaries that are factually inconsistent with original articles, known as "hallucinations" in text generation.

Text Generation Text Summarization

Diversify Question Generation with Retrieval-Augmented Style Transfer

1 code implementation 23 Oct 2023 Qi Gou, Zehua Xia, Bowen Yu, Haiyang Yu, Fei Huang, Yongbin Li, Nguyen Cam-Tu

Given a textual passage and an answer, humans are able to ask questions with various expressions, but this ability is still challenging for most question generation (QG) systems.

Diversity Question Answering +5

Improving Seq2Seq Grammatical Error Correction via Decoding Interventions

1 code implementation 23 Oct 2023 Houquan Zhou, Yumeng Liu, Zhenghua Li, Min Zhang, Bo Zhang, Chen Li, Ji Zhang, Fei Huang

In this paper, we propose a unified decoding intervention framework that employs an external critic to assess the appropriateness of the token to be generated incrementally, and then dynamically influence the choice of the next token.

Decoder Grammatical Error Correction +2

Improving Question Generation with Multi-level Content Planning

1 code implementation 20 Oct 2023 Zehua Xia, Qi Gou, Bowen Yu, Haiyang Yu, Fei Huang, Yongbin Li, Cam-Tu Nguyen

Previous studies have suggested that key phrase selection is essential for question generation (QG), yet it is still challenging to connect such disjointed phrases into meaningful questions, particularly for long context.

Answer Generation Question Generation +2

FactCHD: Benchmarking Fact-Conflicting Hallucination Detection

1 code implementation 18 Oct 2023 Xiang Chen, Duanzheng Song, Honghao Gui, Chenxi Wang, Ningyu Zhang, Yong Jiang, Fei Huang, Chengfei Lv, Dan Zhang, Huajun Chen

Despite their impressive generative capabilities, LLMs are hindered by fact-conflicting hallucinations in real-world applications.

Benchmarking Hallucination

Editing Personality for Large Language Models

1 code implementation 3 Oct 2023 Shengyu Mao, Xiaohan Wang, Mengru Wang, Yong Jiang, Pengjun Xie, Fei Huang, Ningyu Zhang

This task seeks to adjust the models' responses to opinion-related questions on specified topics since an individual's personality often manifests in the form of their expressed opinions, thereby showcasing different personality traits.

Model Editing

VDialogUE: A Unified Evaluation Benchmark for Visually-grounded Dialogue

no code implementations 14 Sep 2023 Yunshui Li, Binyuan Hui, Zhaochao Yin, Wanwei He, Run Luo, Yuxing Long, Min Yang, Fei Huang, Yongbin Li

Visually-grounded dialog systems, which integrate multiple modes of communication such as text and visual inputs, have become an increasingly popular area of investigation.

Geo-Encoder: A Chunk-Argument Bi-Encoder Framework for Chinese Geographic Re-Ranking

1 code implementation 4 Sep 2023 Yong Cao, Ruixue Ding, Boli Chen, Xianzhi Li, Min Chen, Daniel Hershcovich, Pengjun Xie, Fei Huang

The Chinese geographic re-ranking task aims to find the most relevant addresses among retrieved candidates, which is crucial for location-related services such as navigation maps.

Chunking Multi-Task Learning +1

ModelScope-Agent: Building Your Customizable Agent System with Open-source Large Language Models

3 code implementations 2 Sep 2023 Chenliang Li, Hehong Chen, Ming Yan, Weizhou Shen, Haiyang Xu, Zhikai Wu, Zhicheng Zhang, Wenmeng Zhou, Yingda Chen, Chen Cheng, Hongzhu Shi, Ji Zhang, Fei Huang, Jingren Zhou

Large language models (LLMs) have recently demonstrated remarkable capabilities to comprehend human intentions, engage in reasoning, and design planning-like behavior.

EcomGPT: Instruction-tuning Large Language Models with Chain-of-Task Tasks for E-commerce

1 code implementation 14 Aug 2023 Yangning Li, Shirong Ma, Xiaobin Wang, Shen Huang, Chengyue Jiang, Hai-Tao Zheng, Pengjun Xie, Fei Huang, Yong Jiang

EcomInstruct scales up data size and task diversity by constructing atomic tasks from basic E-commerce data types, such as product information and user reviews.

Diversity Instruction Following +3

A Preliminary Study of the Intrinsic Relationship between Complexity and Alignment

1 code implementation 10 Aug 2023 Yingxiu Zhao, Bowen Yu, Binyuan Hui, Haiyang Yu, Fei Huang, Yongbin Li, Nevin L. Zhang

Training large language models (LLMs) with open-domain instruction data has yielded remarkable success in aligning to end tasks and human preferences.

Wider and Deeper LLM Networks are Fairer LLM Evaluators

1 code implementation 3 Aug 2023 Xinghua Zhang, Bowen Yu, Haiyang Yu, Yangyu Lv, Tingwen Liu, Fei Huang, Hongbo Xu, Yongbin Li

Each perspective corresponds to the role of a specific LLM neuron in the first layer.

CValues: Measuring the Values of Chinese Large Language Models from Safety to Responsibility

2 code implementations 19 Jul 2023 Guohai Xu, Jiayi Liu, Ming Yan, Haotian Xu, Jinghui Si, Zhuoran Zhou, Peng Yi, Xing Gao, Jitao Sang, Rong Zhang, Ji Zhang, Chao Peng, Fei Huang, Jingren Zhou

In this paper, we present CValues, the first Chinese human values evaluation benchmark to measure the alignment ability of LLMs in terms of both safety and responsibility criteria.

BUS: Efficient and Effective Vision-language Pre-training with Bottom-Up Patch Summarization

no code implementations 17 Jul 2023 Chaoya Jiang, Haiyang Xu, Wei Ye, Qinghao Ye, Chenliang Li, Ming Yan, Bin Bi, Shikun Zhang, Fei Huang, Songfang Huang

Specifically, we incorporate a Text-Semantics-Aware Patch Selector (TSPS) into the ViT backbone to perform coarse-grained visual token extraction, and then attach a flexible Transformer-based Patch Abstraction Decoder (PAD) to the backbone for top-level visual abstraction.

Decoder Text Summarization

DecompEval: Evaluating Generated Texts as Unsupervised Decomposed Question Answering

1 code implementation 13 Jul 2023 Pei Ke, Fei Huang, Fei Mi, Yasheng Wang, Qun Liu, Xiaoyan Zhu, Minlie Huang

Existing evaluation metrics for natural language generation (NLG) tasks face challenges in generalization ability and interpretability.

Dialogue Generation nlg evaluation +3

mPLUG-DocOwl: Modularized Multimodal Large Language Model for Document Understanding

1 code implementation 4 Jul 2023 Jiabo Ye, Anwen Hu, Haiyang Xu, Qinghao Ye, Ming Yan, Yuhao Dan, Chenlin Zhao, Guohai Xu, Chenliang Li, Junfeng Tian, Qian Qi, Ji Zhang, Fei Huang

Nevertheless, without in-domain training, these models tend to ignore fine-grained OCR features, such as sophisticated tables or large blocks of text, which are essential for OCR-free document understanding.

document understanding Language Modeling +4

Preference Ranking Optimization for Human Alignment

1 code implementation 30 Jun 2023 Feifan Song, Bowen Yu, Minghao Li, Haiyang Yu, Fei Huang, Yongbin Li, Houfeng Wang

In this manner, PRO effectively transforms human alignment into aligning the probability ranking of n responses generated by the LLM with the human preference ranking over these responses.
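A common way to formalize this kind of list-wise ranking alignment is an iterative softmax loss, where at each step the top remaining response competes against all lower-ranked ones. The sketch below illustrates that idea on raw model scores; `pro_loss` and the example scores are hypothetical names for illustration, not the paper's exact objective.

```python
import math

def pro_loss(scores):
    """List-wise ranking loss sketch: `scores` are model scores for n
    responses ordered from most to least preferred (n >= 2). At step k,
    the k-th best response is scored against all lower-ranked ones via a
    softmax; the loss is the averaged negative log-likelihood."""
    loss = 0.0
    for k in range(len(scores) - 1):
        denom = sum(math.exp(s) for s in scores[k:])
        loss += -math.log(math.exp(scores[k]) / denom)
    return loss / (len(scores) - 1)

# A model that scores the preferred responses higher incurs a lower loss.
print(pro_loss([2.0, 1.0, 0.0]))
print(pro_loss([0.0, 1.0, 2.0]))
```

Because each step conditions only on the responses not yet ranked, the loss decreases monotonically as the model's score ordering agrees more closely with the human preference ordering.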

Unified Language Representation for Question Answering over Text, Tables, and Images

no code implementations 29 Jun 2023 Bowen Yu, Cheng Fu, Haiyang Yu, Fei Huang, Yongbin Li

When trying to answer complex questions, people often rely on multiple sources of information, such as visual, textual, and tabular data.

Question Answering Retrieval

DiffDTM: A conditional structure-free framework for bioactive molecules generation targeted for dual proteins

no code implementations 24 Jun 2023 Lei Huang, Zheng Yuan, Huihui Yan, Rong Sheng, Linjing Liu, Fuzhou Wang, Weidun Xie, Nanjun Chen, Fei Huang, Songfang Huang, Ka-Chun Wong, Yaoyun Zhang

However, molecule generation targeted for dual protein targets still faces formidable challenges including protein 3D structure data requisition for model training, auto-regressive sampling, and model generalization for unseen targets.

CATS: A Pragmatic Chinese Answer-to-Sequence Dataset with Large Scale and High Quality

no code implementations 20 Jun 2023 Liang Li, Ruiying Geng, Chengyang Fang, Bing Li, Can Ma, Rongyu Cao, Binhua Li, Fei Huang, Yongbin Li

To alleviate these limitations, in this paper, we present CATS, a pragmatic Chinese answer-to-sequence dataset with large scale and high quality.

Universal Information Extraction with Meta-Pretrained Self-Retrieval

no code implementations 18 Jun 2023 Xin Cong, Bowen Yu, Mengcheng Fang, Tingwen Liu, Haiyang Yu, Zhongkai Hu, Fei Huang, Yongbin Li, Bin Wang

Inspired by the fact that a large amount of knowledge is stored in pretrained language models (PLMs) and can be retrieved explicitly, in this paper we propose MetaRetriever to retrieve task-specific knowledge from PLMs to enhance universal IE.

Retrieval

Youku-mPLUG: A 10 Million Large-scale Chinese Video-Language Dataset for Pre-training and Benchmarks

1 code implementation 7 Jun 2023 Haiyang Xu, Qinghao Ye, Xuan Wu, Ming Yan, Yuan Miao, Jiabo Ye, Guohai Xu, Anwen Hu, Yaya Shi, Guangwei Xu, Chenliang Li, Qi Qian, Maofei Que, Ji Zhang, Xiao Zeng, Fei Huang

In addition, to facilitate a comprehensive evaluation of video-language models, we carefully build the largest human-annotated Chinese benchmarks covering three popular video-language tasks of cross-modal retrieval, video captioning, and video category classification.

Cross-Modal Retrieval Language Modelling +4

Multimodal Recommendation Dialog with Subjective Preference: A New Challenge and Benchmark

no code implementations 26 May 2023 Yuxing Long, Binyuan Hui, Caixia Yuan, Fei Huang, Yongbin Li, Xiaojie Wang

Existing multimodal task-oriented dialog data fails to demonstrate the diverse expressions of user subjective preferences and recommendation acts in the real-life shopping scenario.

Diversity Multimodal Recommendation

NaSGEC: a Multi-Domain Chinese Grammatical Error Correction Dataset from Native Speaker Texts

1 code implementation 25 May 2023 Yue Zhang, Bo Zhang, Haochen Jiang, Zhenghua Li, Chen Li, Fei Huang, Min Zhang

We introduce NaSGEC, a new dataset to facilitate research on Chinese grammatical error correction (CGEC) for native speaker texts from multiple domains.

Grammatical Error Correction

PaCE: Unified Multi-modal Dialogue Pre-training with Progressive and Compositional Experts

1 code implementation 24 May 2023 Yunshui Li, Binyuan Hui, Zhichao Yin, Min Yang, Fei Huang, Yongbin Li

It utilizes a combination of several fundamental experts to accommodate multiple dialogue-related tasks and can be pre-trained using limited dialogue and extensive non-dialogue multi-modal data.

Dialogue State Tracking Image Retrieval +4

Optimizing Non-Autoregressive Transformers with Contrastive Learning

no code implementations 23 May 2023 Chenxin An, Jiangtao Feng, Fei Huang, Xipeng Qiu, Lingpeng Kong

In this paper, we propose to ease the difficulty of modality learning via sampling from the model distribution instead of the data distribution.

Contrastive Learning Machine Translation +2

SpokenWOZ: A Large-Scale Speech-Text Benchmark for Spoken Task-Oriented Dialogue Agents

1 code implementation NeurIPS 2023 Shuzheng Si, Wentao Ma, Haoyu Gao, Yuchuan Wu, Ting-En Lin, Yinpei Dai, Hangyu Li, Rui Yan, Fei Huang, Yongbin Li

SpokenWOZ further incorporates common spoken characteristics such as word-by-word processing and reasoning in spoken language.

Iterative Forward Tuning Boosts In-Context Learning in Language Models

1 code implementation 22 May 2023 Jiaxi Yang, Binyuan Hui, Min Yang, Bailin Wang, Bowen Li, Binhua Li, Fei Huang, Yongbin Li

Despite the advancements in in-context learning (ICL) for large language models (LLMs), current research centers on specific prompt engineering, such as demonstration selection, with the expectation that a single iteration of demonstrations processing can generalize effectively to a given test sample.

Decision Making In-Context Learning +2

Speech-Text Dialog Pre-training for Spoken Dialog Understanding with Explicit Cross-Modal Alignment

1 code implementation 19 May 2023 Tianshu Yu, Haoyu Gao, Ting-En Lin, Min Yang, Yuchuan Wu, Wentao Ma, Chao Wang, Fei Huang, Yongbin Li

In this paper, we propose Speech-text dialog Pre-training for spoken dialog understanding with ExpliCiT cRoss-Modal Alignment (SPECTRA), which is the first-ever speech-text dialog pre-training model.

Ranked #2 on Multimodal Sentiment Analysis on CMU-MOSI (Acc-2 metric, using extra training data)

cross-modal alignment Emotion Recognition in Conversation +2

Causal Document-Grounded Dialogue Pre-training

1 code implementation 18 May 2023 Yingxiu Zhao, Bowen Yu, Haiyang Yu, Bowen Li, Jinyang Li, Chao Wang, Fei Huang, Yongbin Li, Nevin L. Zhang

To tackle this issue, we are the first to present a causally-complete dataset construction strategy for building million-level DocGD pre-training corpora.

Knowledge Rumination for Pre-trained Language Models

1 code implementation 15 May 2023 Yunzhi Yao, Peng Wang, Shengyu Mao, Chuanqi Tan, Fei Huang, Huajun Chen, Ningyu Zhang

Previous studies have revealed that vanilla pre-trained language models (PLMs) lack the capacity to handle knowledge-intensive NLP tasks alone; thus, several works have attempted to integrate external knowledge into PLMs.

Language Modeling Language Modelling

Distinguish Before Answer: Generating Contrastive Explanation as Knowledge for Commonsense Question Answering

no code implementations14 May 2023 Qianglong Chen, Guohai Xu, Ming Yan, Ji Zhang, Fei Huang, Luo Si, Yin Zhang

Existing knowledge-enhanced methods have achieved remarkable results in certain QA tasks via obtaining diverse knowledge from different knowledge bases.

Explanation Generation Question Answering

Domain Incremental Lifelong Learning in an Open World

1 code implementation11 May 2023 Yi Dai, Hao Lang, Yinhe Zheng, Bowen Yu, Fei Huang, Yongbin Li

Specifically, we dedicate task-level prompts to capture task-specific knowledge to retain high LL performances and maintain instance-level prompts to learn knowledge shared across input samples to improve the model's generalization performance.

Language Modelling Lifelong learning

Long-Tailed Question Answering in an Open World

1 code implementation11 May 2023 Yi Dai, Hao Lang, Yinhe Zheng, Fei Huang, Yongbin Li

A retrieve-then-rerank framework is further introduced to select in-context examples, which guide the LM to generate text that expresses knowledge for QA tasks.

Knowledge Distillation Language Modelling +1
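The retrieve-then-rerank idea mentioned above can be sketched in a few lines. This is a hypothetical illustration only: the function names, the lexical-overlap retriever, and the shared-prefix reranker are stand-ins (a real system would typically use a learned retriever and cross-encoder), not the paper's actual components.

```python
# Illustrative retrieve-then-rerank pipeline for picking in-context examples.
# Stage 1 is a cheap, broad filter; stage 2 is a finer scorer over the survivors.

def retrieve(query, pool, top_n=4):
    # first stage: cheap lexical (Jaccard) overlap score
    def overlap(ex):
        q, e = set(query.split()), set(ex.split())
        return len(q & e) / max(len(q | e), 1)
    return sorted(pool, key=overlap, reverse=True)[:top_n]

def rerank(query, candidates, top_k=2):
    # second stage: a finer score; shared-prefix length stands in
    # for what would normally be a learned cross-encoder
    def finer(ex):
        n = 0
        for a, b in zip(query, ex):
            if a != b:
                break
            n += 1
        return n
    return sorted(candidates, key=finer, reverse=True)[:top_k]

pool = ["what is ood detection", "long tailed question answering",
        "what is knowledge distillation", "open world lifelong learning"]
examples = rerank("what is knowledge distillation",
                  retrieve("what is knowledge distillation", pool))
```

The two-stage split matters because the expensive scorer only ever sees the small candidate set the cheap scorer lets through.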

GeoGLUE: A GeoGraphic Language Understanding Evaluation Benchmark

no code implementations11 May 2023 Dongyang Li, Ruixue Ding, Qiang Zhang, Zheng Li, Boli Chen, Pengjun Xie, Yao Xu, Xin Li, Ning Guo, Fei Huang, Xiaofeng He

With the fast development of geographic applications, it is essential to design automated and intelligent models to handle the large volume of information.

Entity Alignment Natural Language Understanding

Out-of-Domain Intent Detection Considering Multi-Turn Dialogue Contexts

no code implementations5 May 2023 Hao Lang, Yinhe Zheng, Binyuan Hui, Fei Huang, Yongbin Li

Out-of-Domain (OOD) intent detection is vital for practical dialogue systems, and it usually requires considering multi-turn dialogue contexts.

Intent Detection

A Survey on Out-of-Distribution Detection in NLP

no code implementations5 May 2023 Hao Lang, Yinhe Zheng, Yixuan Li, Jian Sun, Fei Huang, Yongbin Li

Out-of-distribution (OOD) detection is essential for the reliable and safe deployment of machine learning systems in the real world.

Out-of-Distribution Detection Out of Distribution (OOD) Detection +1
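For readers unfamiliar with the OOD detection task the survey covers, a common baseline is maximum softmax probability (MSP): inputs on which the classifier's top softmax probability is low are flagged as likely out-of-distribution. The sketch below is a generic illustration of that baseline, not a method from the survey; the 0.5 threshold is an arbitrary placeholder that would be tuned in practice.

```python
import math

def msp_score(logits):
    # maximum softmax probability, computed stably by shifting by the max logit;
    # a low MSP suggests the model is unsure, i.e. the input may be OOD
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    return max(exps) / sum(exps)

def is_ood(logits, threshold=0.5):
    # flag inputs whose top softmax probability falls below the threshold
    return msp_score(logits) < threshold
```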

Can LLM Already Serve as A Database Interface? A BIg Bench for Large-Scale Database Grounded Text-to-SQLs

1 code implementation NeurIPS 2023 Jinyang Li, Binyuan Hui, Ge Qu, Jiaxi Yang, Binhua Li, Bowen Li, Bailin Wang, Bowen Qin, Rongyu Cao, Ruiying Geng, Nan Huo, Xuanhe Zhou, Chenhao Ma, Guoliang Li, Kevin C. C. Chang, Fei Huang, Reynold Cheng, Yongbin Li

Our emphasis on database values highlights the new challenges of dirty database contents, external knowledge between NL questions and database contents, and SQL efficiency, particularly in the context of massive databases.

SQL Parsing Text to SQL +1

Transforming Visual Scene Graphs to Image Captions

1 code implementation3 May 2023 Xu Yang, Jiawei Peng, Zihua Wang, Haiyang Xu, Qinghao Ye, Chenliang Li, Songfang Huang, Fei Huang, Zhangzikang Li, Yu Zhang

In TSG, we apply multi-head attention (MHA) to design the Graph Neural Network (GNN) for embedding scene graphs.

Attribute Decoder +3
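The snippet above says multi-head attention is used as the building block of a GNN over scene graphs. As a rough, pure-Python illustration of the underlying mechanism (a single attention head aggregating a node's neighbours; the dimensions and the scene-graph framing are assumptions, not the paper's architecture):

```python
import math

def attend(query, keys, values):
    # one attention head in plain Python: scores = q.k / sqrt(d),
    # softmax over neighbours, then a weighted sum of their values
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
              for key in keys]
    m = max(scores)
    w = [math.exp(s - m) for s in scores]
    z = sum(w)
    w = [x / z for x in w]
    return [sum(wi * v[j] for wi, v in zip(w, values))
            for j in range(len(values[0]))]

# a node embedding updated from its scene-graph neighbours
node = [1.0, 0.0]
neighbours = [[1.0, 0.0], [0.0, 1.0]]
updated = attend(node, neighbours, neighbours)
```

In a multi-head version, several such heads run in parallel with different projections and their outputs are concatenated; restricting `keys`/`values` to graph neighbours is what turns attention into a GNN message-passing step.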

Directed Acyclic Transformer Pre-training for High-quality Non-autoregressive Text Generation

1 code implementation24 Apr 2023 Fei Huang, Pei Ke, Minlie Huang

Non-AutoRegressive (NAR) text generation models have drawn much attention because of their significantly faster decoding speed and good generation quality in machine translation.

Machine Translation Text Generation

VECO 2.0: Cross-lingual Language Model Pre-training with Multi-granularity Contrastive Learning

no code implementations17 Apr 2023 Zhen-Ru Zhang, Chuanqi Tan, Songfang Huang, Fei Huang

Recent studies have demonstrated the potential of cross-lingual transferability by training a unified Transformer encoder for multiple languages.

Contrastive Learning Language Modeling +2

ChatPLUG: Open-Domain Generative Dialogue System with Internet-Augmented Instruction Tuning for Digital Human

1 code implementation16 Apr 2023 Junfeng Tian, Hehong Chen, Guohai Xu, Ming Yan, Xing Gao, Jianhai Zhang, Chenliang Li, Jiayi Liu, Wenshen Xu, Haiyang Xu, Qi Qian, Wei Wang, Qinghao Ye, Jiejing Zhang, Ji Zhang, Fei Huang, Jingren Zhou

In this paper, we present ChatPLUG, a Chinese open-domain dialogue system for digital human applications that instruction finetunes on a wide range of dialogue tasks in a unified internet-augmented format.

World Knowledge

RRHF: Rank Responses to Align Language Models with Human Feedback without tears

1 code implementation11 Apr 2023 Zheng Yuan, Hongyi Yuan, Chuanqi Tan, Wei Wang, Songfang Huang, Fei Huang

Reinforcement Learning from Human Feedback (RLHF) facilitates the alignment of large language models with human preferences, significantly enhancing the quality of interactions between humans and models.

Language Modelling Large Language Model
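The RRHF title suggests aligning a model by ranking its candidate responses against reward-model scores rather than running full RL. As a hypothetical sketch of that ranking idea only (the snippet does not spell out the paper's objective, so this hinge-style pairwise loss is an assumption, not RRHF's exact loss):

```python
def rrhf_ranking_loss(logprobs, rewards):
    """Pairwise ranking loss over k candidate responses.

    logprobs: length-normalised log-probabilities the model assigns
              to each candidate; rewards: the reward model's scores.
    Whenever a higher-rewarded response gets a lower model log-probability
    than a lower-rewarded one, the margin is penalised."""
    loss = 0.0
    for i in range(len(rewards)):
        for j in range(len(rewards)):
            if rewards[i] > rewards[j]:
                loss += max(0.0, logprobs[j] - logprobs[i])
    return loss
```

When the model's preference order already matches the reward order, the loss is zero, so training only pushes on misranked pairs.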
