Search Results for author: Wei Bi

Found 62 papers, 35 papers with code

Event Extraction as Machine Reading Comprehension

no code implementations EMNLP 2020 Jian Liu, Yubo Chen, Kang Liu, Wei Bi, Xiaojiang Liu

ii) Our model excels in the data-scarce scenario, for example, obtaining 49.8% F1 for event argument extraction with only 1% of the data, compared with 2.2% for the previous method.
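
The MRC framing can be pictured with off-the-shelf extractive QA: ask one role-specific question per argument slot and take the answer spans as arguments. A minimal sketch, assuming a generic Hugging Face QA pipeline and hypothetical question templates (the paper constructs its queries and trains the reader itself):

```python
from transformers import pipeline

# Hypothetical role-to-question templates for an "Attack" event;
# the paper derives its own queries rather than using fixed strings.
ROLE_QUESTIONS = {
    "Attacker": "Who carried out the attack?",
    "Target": "Who or what was attacked?",
    "Place": "Where did the attack happen?",
}

qa = pipeline("question-answering")  # any extractive QA model works here

def extract_arguments(sentence: str) -> dict:
    """Cast event argument extraction as reading comprehension:
    one QA query per argument role; answers become the arguments."""
    return {role: qa(question=q, context=sentence)["answer"]
            for role, q in ROLE_QUESTIONS.items()}

print(extract_arguments("Rebels attacked a military convoy near Kabul on Friday."))
```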

Event Argument Extraction Event Extraction +5

GSM-Plus: A Comprehensive Benchmark for Evaluating the Robustness of LLMs as Mathematical Problem Solvers

1 code implementation 29 Feb 2024 Qintong Li, Leyang Cui, Xueliang Zhao, Lingpeng Kong, Wei Bi

Large language models (LLMs) have achieved impressive performance across various mathematical reasoning benchmarks.

GSM8K Math +1

Retrieval is Accurate Generation

no code implementations 27 Feb 2024 Bowen Cao, Deng Cai, Leyang Cui, Xuxin Cheng, Wei Bi, Yuexian Zou, Shuming Shi

To address this, we propose to initialize the training oracles using linguistic heuristics and, more importantly, bootstrap the oracles through iterative self-reinforcement.

Language Modelling Retrieval +1

FuseChat: Knowledge Fusion of Chat Models

1 code implementation 25 Feb 2024 Fanqi Wan, ZiYi Yang, Longguang Zhong, Xiaojun Quan, Xinting Huang, Wei Bi

Recently, FuseLLM introduced the concept of knowledge fusion to transfer the collective knowledge of multiple structurally varied LLMs into a target LLM through lightweight continual training.

Diffusion of Thoughts: Chain-of-Thought Reasoning in Diffusion Language Models

1 code implementation 12 Feb 2024 Jiacheng Ye, Shansan Gong, Liheng Chen, Lin Zheng, Jiahui Gao, Han Shi, Chuan Wu, Zhenguo Li, Wei Bi, Lingpeng Kong

This work explores the integration of diffusion models and Chain-of-Thought (CoT), a well-established technique for improving the reasoning ability of autoregressive language models.

Math

Knowledge Fusion of Large Language Models

1 code implementation 19 Jan 2024 Fanqi Wan, Xinting Huang, Deng Cai, Xiaojun Quan, Wei Bi, Shuming Shi

In this paper, we introduce the notion of knowledge fusion for LLMs, aimed at combining the capabilities of existing LLMs and transferring them into a single LLM.
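
As a rough picture of the training objective, one can distill the target model toward a distribution fused from the source models while keeping the standard LM loss. A minimal sketch, assuming the source logits are already aligned to the target vocabulary and using a plain average as the fusion function (the paper studies more careful fusion strategies):

```python
import torch
import torch.nn.functional as F

def knowledge_fusion_loss(target_logits, source_logits_list, token_ids, lam=0.9):
    """target_logits: (N, V) from the target LLM; source_logits_list:
    list of (N, V) tensors from source LLMs with aligned vocabularies;
    token_ids: (N,) gold next tokens. lam trades distillation vs. LM loss."""
    # fuse the source predictions into one distribution (simple average here)
    fused = torch.stack([F.softmax(s, dim=-1) for s in source_logits_list]).mean(0)
    distill = F.kl_div(F.log_softmax(target_logits, dim=-1), fused,
                       reduction="batchmean")
    lm = F.cross_entropy(target_logits, token_ids)
    return lam * distill + (1.0 - lam) * lm
```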

Code Generation

Knowledge Verification to Nip Hallucination in the Bud

1 code implementation 19 Jan 2024 Fanqi Wan, Xinting Huang, Leyang Cui, Xiaojun Quan, Wei Bi, Shuming Shi

While large language models (LLMs) have demonstrated exceptional performance across various tasks following human alignment, they may still generate responses that sound plausible but contradict factual knowledge, a phenomenon known as hallucination.

Hallucination World Knowledge

Alleviating Hallucinations of Large Language Models through Induced Hallucinations

2 code implementations 25 Dec 2023 Yue Zhang, Leyang Cui, Wei Bi, Shuming Shi

Experimental results on both discrimination-based and generation-based hallucination evaluation benchmarks, such as TruthfulQA and FActScore, demonstrate that our proposed ICD methods can effectively enhance the factuality of LLMs across various model sizes and families.
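
For context, ICD works by first inducing a hallucination-prone variant of the model and then contrasting against it at decoding time. A minimal single-step sketch under that reading; the adaptive plausibility cutoff is borrowed from contrastive decoding, and alpha/tau are illustrative values:

```python
import numpy as np

def icd_next_token(logits_base, logits_hallu, alpha=1.0, tau=0.1):
    """logits_base: next-token logits of the original LLM;
    logits_hallu: logits of the induced, hallucination-prone copy."""
    logp_base = logits_base - np.logaddexp.reduce(logits_base)
    logp_hallu = logits_hallu - np.logaddexp.reduce(logits_hallu)
    # reward tokens the factual model prefers over the hallucinating copy
    scores = logp_base + alpha * (logp_base - logp_hallu)
    # only consider tokens the base model itself finds plausible
    scores[logp_base < np.log(tau) + logp_base.max()] = -np.inf
    return int(np.argmax(scores))
```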

Hallucination Hallucination Evaluation

Collaborative Evaluation: Exploring the Synergy of Large Language Models and Humans for Open-ended Generation Evaluation

1 code implementation 30 Oct 2023 Qintong Li, Leyang Cui, Lingpeng Kong, Wei Bi

To explore the synergy between humans and LLM-based evaluators, and to address the inconsistent evaluation criteria in open-ended NLG tasks, we propose CoEval, a collaborative evaluation pipeline involving the design of a checklist of task-specific criteria and the detailed evaluation of texts, in which the LLM generates initial ideas and humans then engage in scrutiny.

Text Generation

TRAMS: Training-free Memory Selection for Long-range Language Modeling

1 code implementation 24 Oct 2023 Haofei Yu, Cunxiang Wang, Yue Zhang, Wei Bi

The Transformer architecture is crucial for numerous AI models, but it still faces challenges in long-range language modeling.

Language Modelling

SEGO: Sequential Subgoal Optimization for Mathematical Problem-Solving

no code implementations 19 Oct 2023 Xueliang Zhao, Xinting Huang, Wei Bi, Lingpeng Kong

Large Language Models (LLMs) have driven substantial progress in artificial intelligence in recent years, exhibiting impressive capabilities across a wide range of tasks, including mathematical problem-solving.

GSM8K Math

Explore-Instruct: Enhancing Domain-Specific Instruction Coverage through Active Exploration

1 code implementation 13 Oct 2023 Fanqi Wan, Xinting Huang, Tao Yang, Xiaojun Quan, Wei Bi, Shuming Shi

Instruction-tuning can be substantially optimized through enhanced diversity, resulting in models capable of handling a broader spectrum of tasks.

Retrieval-Generation Alignment for End-to-End Task-Oriented Dialogue System

1 code implementation 13 Oct 2023 Weizhou Shen, Yingqi Gao, Canbin Huang, Fanqi Wan, Xiaojun Quan, Wei Bi

The results demonstrate that when combined with meta knowledge, the response generator can effectively leverage high-quality knowledge records from the retriever and enhance the quality of generated responses.

Response Generation Retrieval +1

RobustGEC: Robust Grammatical Error Correction Against Subtle Context Perturbation

1 code implementation 11 Oct 2023 Yue Zhang, Leyang Cui, Enbo Zhao, Wei Bi, Shuming Shi

In this paper, we introduce RobustGEC, a benchmark designed to evaluate the context robustness of GEC systems.

Grammatical Error Correction Sentence

A Benchmark for Text Expansion: Datasets, Metrics, and Baselines

no code implementations 17 Sep 2023 Yi Chen, Haiyun Jiang, Wei Bi, Rui Wang, Longyue Wang, Shuming Shi, Ruifeng Xu

This work presents a new task of Text Expansion (TE), which aims to insert fine-grained modifiers into proper locations of the plain text to concretize or vivify human writings.

Informativeness Text Infilling

Siren's Song in the AI Ocean: A Survey on Hallucination in Large Language Models

1 code implementation 3 Sep 2023 Yue Zhang, Yafu Li, Leyang Cui, Deng Cai, Lemao Liu, Tingchen Fu, Xinting Huang, Enbo Zhao, Yu Zhang, Yulong Chen, Longyue Wang, Anh Tuan Luu, Wei Bi, Freda Shi, Shuming Shi

While large language models (LLMs) have demonstrated remarkable capabilities across a range of downstream tasks, a significant concern revolves around their propensity to exhibit hallucinations: LLMs occasionally generate content that diverges from the user input, contradicts previously generated context, or misaligns with established world knowledge.

Hallucination World Knowledge

Pre-training Multi-party Dialogue Models with Latent Discourse Inference

1 code implementation 24 May 2023 Yiyang Li, Xinting Huang, Wei Bi, Hai Zhao

Multi-party dialogues are more difficult for models to understand than one-to-one two-party dialogues, since they involve multiple interlocutors, resulting in interweaving reply-to relations and information flows.

Multi-Task Instruction Tuning of LLaMa for Specific Scenarios: A Preliminary Study on Writing Assistance

no code implementations 22 May 2023 Yue Zhang, Leyang Cui, Deng Cai, Xinting Huang, Tao Fang, Wei Bi

Proprietary Large Language Models (LLMs), such as ChatGPT, have garnered significant attention due to their exceptional capabilities in handling a diverse range of tasks.

Instruction Following

Deepfake Text Detection in the Wild

1 code implementation 22 May 2023 Yafu Li, Qintong Li, Leyang Cui, Wei Bi, Longyue Wang, Linyi Yang, Shuming Shi, Yue Zhang

In practical scenarios, the detector faces texts from various domains or LLMs without knowing their sources.

Face Swapping Story Generation +1

A Frustratingly Simple Decoding Method for Neural Text Generation

1 code implementation 22 May 2023 Haoran Yang, Deng Cai, Huayang Li, Wei Bi, Wai Lam, Shuming Shi

We introduce a frustratingly simple, super efficient and surprisingly effective decoding method, which we call Frustratingly Simple Decoding (FSD), for neural text generation.
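
The idea behind FSD, as a sketch: build an "anti-LM" on the fly from the prefix generated so far, here a simple n-gram model, and subtract its predictions from the base LM's so that self-repetition gets demoted. Hyperparameter values below are illustrative, not the paper's:

```python
from collections import Counter, defaultdict
import numpy as np

def fsd_rescore(probs_lm, prefix_ids, alpha=0.4, k=5, n=3):
    """probs_lm: (V,) next-token probabilities from the base LM;
    prefix_ids: tokens generated so far. Returns rescored top-k candidates."""
    # build an n-gram anti-LM from the generated prefix itself
    cont = defaultdict(Counter)
    for i in range(len(prefix_ids) - n + 1):
        cont[tuple(prefix_ids[i:i + n - 1])][prefix_ids[i + n - 1]] += 1
    ctx = tuple(prefix_ids[-(n - 1):])
    total = sum(cont[ctx].values()) or 1
    top = np.argsort(probs_lm)[-k:]
    # subtract the anti-LM probability: repeated continuations lose score
    return {int(t): float(probs_lm[t]) - alpha * cont[ctx][int(t)] / total
            for t in top}
```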

Language Modelling Text Generation

Multi-Grained Knowledge Retrieval for End-to-End Task-Oriented Dialog

1 code implementation 17 May 2023 Fanqi Wan, Weizhou Shen, Ke Yang, Xiaojun Quan, Wei Bi

Retrieving proper domain knowledge from an external database lies at the heart of end-to-end task-oriented dialog systems to generate informative responses.

Attribute Response Generation +1

Explanation Regeneration via Information Bottleneck

1 code implementation 19 Dec 2022 Qintong Li, Zhiyong Wu, Lingpeng Kong, Wei Bi

Explaining the black-box predictions of NLP models naturally and accurately is an important open problem in natural language generation.

Explanation Generation Language Modelling +2

Effidit: Your AI Writing Assistant

no code implementations 3 Aug 2022 Shuming Shi, Enbo Zhao, Duyu Tang, Yan Wang, Piji Li, Wei Bi, Haiyun Jiang, Guoping Huang, Leyang Cui, Xinting Huang, Cong Zhou, Yong Dai, Dongyang Ma

In Effidit, we significantly expand the capacities of a writing assistant by providing functions in five categories: text completion, error checking, text polishing, keywords to sentences (K2S), and cloud input methods (cloud IME).

Keywords to Sentences Retrieval +3

Spatial Entropy as an Inductive Bias for Vision Transformers

1 code implementation 9 Jun 2022 Elia Peruzzo, Enver Sangineto, Yahui Liu, Marco De Nadai, Wei Bi, Bruno Lepri, Nicu Sebe

In this work, we propose a different and complementary direction, in which a local bias is introduced using an auxiliary self-supervised task, performed jointly with standard supervised training.

Inductive Bias Semantic Segmentation

Masked Jigsaw Puzzle: A Versatile Position Embedding for Vision Transformers

1 code implementation CVPR 2023 Bin Ren, Yahui Liu, Yue Song, Wei Bi, Rita Cucchiara, Nicu Sebe, Wei Wang

In particular, MJP first shuffles the selected patches via our block-wise random jigsaw puzzle shuffle algorithm, and their corresponding PEs are occluded.
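
A simplified picture of that step: select a subset of patch tokens, permute them, and occlude their position embeddings. The sketch below shuffles individual patches rather than the paper's block-wise jigsaw and uses zeroing as the occlusion:

```python
import torch

def masked_jigsaw(patches, pos_emb, ratio=0.25):
    """patches: (B, N, D) patch embeddings; pos_emb: (1, N, D) position
    embeddings. Returns tokens with a shuffled subset and occluded PEs."""
    B, N, D = patches.shape
    k = max(1, int(ratio * N))
    out, pos = patches.clone(), pos_emb.expand(B, N, D).clone()
    for b in range(B):
        idx = torch.randperm(N)[:k]                        # selected patches
        out[b, idx] = patches[b, idx[torch.randperm(k)]]   # jigsaw shuffle
        pos[b, idx] = 0.0                                  # occlude position info
    return out + pos
```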

Federated Learning Position

Lexical Knowledge Internalization for Neural Dialog Generation

1 code implementation ACL 2022 Zhiyong Wu, Wei Bi, Xiang Li, Lingpeng Kong, Ben Kao

We propose knowledge internalization (KI), which aims to build lexical knowledge into neural dialog models.

Contrastive Learning

A Model-Agnostic Data Manipulation Method for Persona-based Dialogue Generation

1 code implementation ACL 2022 Yu Cao, Wei Bi, Meng Fang, Shuming Shi, DaCheng Tao

To alleviate the above data issues, we propose a data manipulation method that is model-agnostic and can be packed with any persona-based dialogue generation model to improve its performance.

Dialogue Generation

Event Transition Planning for Open-ended Text Generation

1 code implementation Findings (ACL) 2022 Qintong Li, Piji Li, Wei Bi, Zhaochun Ren, Yuxuan Lai, Lingpeng Kong

Open-ended text generation tasks, such as dialogue generation and story completion, require models to generate a coherent continuation given limited preceding context.

Dialogue Generation Story Completion

Efficient Training of Visual Transformers with Small Datasets

1 code implementation NeurIPS 2021 Yahui Liu, Enver Sangineto, Wei Bi, Nicu Sebe, Bruno Lepri, Marco De Nadai

This task encourages the VTs to learn spatial relations within an image and makes the VT training much more robust when training data are scarce.
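
The auxiliary task can be pictured as: sample pairs of tokens from the final feature grid and regress the normalized offset between their positions. A minimal sketch of such a dense relative-localization head (the sampling scheme and loss details here are illustrative):

```python
import torch
import torch.nn as nn

class RelLocHead(nn.Module):
    """Predicts the normalized (dy, dx) between two sampled grid tokens."""
    def __init__(self, dim):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(2 * dim, dim), nn.ReLU(),
                                 nn.Linear(dim, 2))

    def loss(self, grid):
        """grid: (B, H, W, D) final-layer token embeddings."""
        B, H, W, _ = grid.shape
        ys, xs = torch.randint(0, H, (B, 2)), torch.randint(0, W, (B, 2))
        u = grid[torch.arange(B), ys[:, 0], xs[:, 0]]      # (B, D)
        v = grid[torch.arange(B), ys[:, 1], xs[:, 1]]
        target = torch.stack([(ys[:, 0] - ys[:, 1]) / H,
                              (xs[:, 0] - xs[:, 1]) / W], dim=1)
        return (self.mlp(torch.cat([u, v], dim=1)) - target).abs().mean()
```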

Inductive Bias

REAM♯: An Enhancement Approach to Reference-based Evaluation Metrics for Open-domain Dialog Generation

no code implementations 30 May 2021 Jun Gao, Wei Bi, Ruifeng Xu, Shuming Shi

We first clarify an assumption about reference-based metrics: if more high-quality references are added to the reference set, the reliability of the metric will increase.

Open-Domain Dialog

Good for Misconceived Reasons: An Empirical Revisiting on the Need for Visual Context in Multimodal Machine Translation

no code implementations ACL 2021 Zhiyong Wu, Lingpeng Kong, Wei Bi, Xiang Li, Ben Kao

A neural multimodal machine translation (MMT) system is one that aims to perform better translation by extending conventional text-only translation models with multimodal information.

Multimodal Machine Translation Translation

Learning from My Friends: Few-Shot Personalized Conversation Systems via Social Networks

no code implementations 21 May 2021 Zhiliang Tian, Wei Bi, Zihan Zhang, Dongkyu Lee, Yiping Song, Nevin L. Zhang

The task requires models to generate personalized responses for a speaker given a few conversations from the speaker and a social network.

Meta-Learning

Predicting Events in MOBA Games: Prediction, Attribution, and Evaluation

no code implementations 17 Dec 2020 Zelong Yang, Yan Wang, Piji Li, Shaobin Lin, Shuming Shi, Shao-Lun Huang, Wei Bi

Multiplayer online battle arena (MOBA) games have become increasingly popular in recent years.

TableGPT: Few-shot Table-to-Text Generation with Table Structure Reconstruction and Content Matching

1 code implementation COLING 2020 Heng Gong, Yawei Sun, Xiaocheng Feng, Bing Qin, Wei Bi, Xiaojiang Liu, Ting Liu

Although neural table-to-text models have achieved remarkable progress with the help of large-scale datasets, they suffer from an insufficient-learning problem when training data is limited.

Few-Shot Learning Language Modelling +2

Dual Dynamic Memory Network for End-to-End Multi-turn Task-oriented Dialog Systems

1 code implementation COLING 2020 Jian Wang, Junhao Liu, Wei Bi, Xiaojiang Liu, Kejing He, Ruifeng Xu, Min Yang

To overcome these limitations, we propose a Dual Dynamic Memory Network (DDMN) for multi-turn dialog generation, which maintains two core components: a dialog memory manager and a KB memory manager.

Response-Anticipated Memory for On-Demand Knowledge Integration in Response Generation

no code implementations ACL 2020 Zhiliang Tian, Wei Bi, Dongkyu Lee, Lanqing Xue, Yiping Song, Xiaojiang Liu, Nevin L. Zhang

In previous work, the external document is utilized by (1) creating a context-aware document memory that integrates information from the document and the conversational context, and then (2) generating responses referring to the memory.

Informativeness Response Generation

A Batch Normalized Inference Network Keeps the KL Vanishing Away

1 code implementation ACL 2020 Qile Zhu, Jianlin Su, Wei Bi, Xiaojiang Liu, Xiyao Ma, Xiaolin Li, Dapeng Wu

Variational Autoencoder (VAE) is widely used as a generative model to approximate a model's posterior on latent variables by combining the amortized variational inference and deep neural networks.
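
As a sketch of why normalizing the approximate posterior can keep the KL term from vanishing (my reading of the batch-normalization idea; the paper's exact derivation may differ): for a Gaussian posterior, the KL to a standard normal prior is bounded below by the squared posterior means, and batch normalization pins their second moment.

```latex
% Prior p(z) = N(0, I), posterior q(z|x) = N(\mu, \mathrm{diag}(\sigma^2)):
\mathrm{KL}\big(q(z\mid x)\,\|\,p(z)\big)
  = \frac{1}{2}\sum_{i=1}^{d}\left(\mu_i^{2}+\sigma_i^{2}-\log\sigma_i^{2}-1\right)
  \;\geq\; \frac{1}{2}\sum_{i=1}^{d}\mu_i^{2},
% since \sigma^2 - \log\sigma^2 - 1 \ge 0 for all \sigma^2 > 0.
```

If each μ_i is batch-normalized with fixed scale γ and shift β, the batch mean of μ_i² is γ² + β², so the batch-expected KL is at least d(γ² + β²)/2 > 0, keeping it away from collapse.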

Dialogue Generation Language Modelling +3

Learning to Select Bi-Aspect Information for Document-Scale Text Content Manipulation

1 code implementation 24 Feb 2020 Xiaocheng Feng, Yawei Sun, Bing Qin, Heng Gong, Yibo Sun, Wei Bi, Xiaojiang Liu, Ting Liu

In this paper, we focus on a new practical task, document-scale text content manipulation, which is the opposite of text style transfer and aims to preserve text styles while altering the content.

Sentence Style Transfer +2

Improving Knowledge-aware Dialogue Generation via Knowledge Base Question Answering

1 code implementation 16 Dec 2019 Jian Wang, Junhao Liu, Wei Bi, Xiaojiang Liu, Kejing He, Ruifeng Xu, Min Yang

In this paper, we propose a novel knowledge-aware dialogue generation model (called TransDG), which transfers question representation and knowledge matching abilities from knowledge base question answering (KBQA) task to facilitate the utterance understanding and factual knowledge selection for dialogue generation.

Dialogue Generation Knowledge Base Question Answering +1

Relevance-Promoting Language Model for Short-Text Conversation

no code implementations 26 Nov 2019 Xin Li, Piji Li, Wei Bi, Xiaojiang Liu, Wai Lam

In this paper, we propose to formulate the STC task as a language modeling problem and tailor-make a training strategy to adapt a language model for response generation.

Language Modelling Response Generation +1

A Discrete CVAE for Response Generation on Short-Text Conversation

no code implementations IJCNLP 2019 Jun Gao, Wei Bi, Xiaojiang Liu, Junhui Li, Guodong Zhou, Shuming Shi

In this paper, we introduce a discrete latent variable with an explicit semantic meaning to improve the CVAE on short-text conversation.

Response Generation Short-Text Conversation +1

Retrieval-guided Dialogue Response Generation via a Matching-to-Generation Framework

no code implementations IJCNLP 2019 Deng Cai, Yan Wang, Wei Bi, Zhaopeng Tu, Xiaojiang Liu, Shuming Shi

End-to-end sequence generation is a popular technique for developing open-domain dialogue systems, though such systems suffer from the safe response problem.

Response Generation Retrieval

Learning to Customize Model Structures for Few-shot Dialogue Generation Tasks

1 code implementation ACL 2020 Yiping Song, Zequn Liu, Wei Bi, Rui Yan, Ming Zhang

Training generative models with a minimal corpus is one of the critical challenges for building open-domain dialogue systems.

Dialogue Generation Language Modelling +1

Subword ELMo

no code implementations 18 Sep 2019 Jiangtong Li, Hai Zhao, Zuchao Li, Wei Bi, Xiaojiang Liu

Embedding from Language Models (ELMo) has been shown to be effective for improving many natural language processing (NLP) tasks, and ELMo composes word representations from character information to train language models. However, the character is an insufficient and unnatural linguistic unit for word representation. We therefore introduce Embedding from Subword-aware Language Models (ESuLMo), which learns word representations from subwords obtained by unsupervised segmentation over words. We show that ESuLMo can enhance four benchmark NLP tasks more effectively than ELMo, including syntactic dependency parsing, semantic role labeling, implicit discourse relation recognition, and textual entailment.

Dependency Parsing Natural Language Inference +1

Fine-Grained Sentence Functions for Short-Text Conversation

no code implementations ACL 2019 Wei Bi, Jun Gao, Xiaojiang Liu, Shuming Shi

Classification models are trained on this dataset to (i) recognize the sentence function of new data in a large corpus of short-text conversations; (ii) estimate a proper sentence function of the response given a test query.

Information Retrieval Retrieval +2

Are Training Samples Correlated? Learning to Generate Dialogue Responses with Multiple References

no code implementations ACL 2019 Lisong Qiu, Juntao Li, Wei Bi, Dongyan Zhao, Rui Yan

Due to its potential applications, open-domain dialogue generation has become popular and achieved remarkable progress in recent years, but sometimes suffers from generic responses.

Dialogue Generation valid

Unsupervised Rewriter for Multi-Sentence Compression

no code implementations ACL 2019 Yang Zhao, Xiaoyu Shen, Wei Bi, Akiko Aizawa

First, the word graph approach that simply concatenates fragments from multiple sentences may yield non-fluent or ungrammatical compression.

Sentence Sentence Compression

Learning to Abstract for Memory-augmented Conversational Response Generation

1 code implementation ACL 2019 Zhiliang Tian, Wei Bi, Xiaopeng Li, Nevin L. Zhang

In this work, we propose a memory-augmented generative model, which learns to abstract from the training corpus and saves useful information to the memory to assist response generation.

Conversational Response Generation Informativeness +2

Generating Multiple Diverse Responses for Short-Text Conversation

no code implementations 14 Nov 2018 Jun Gao, Wei Bi, Xiaojiang Liu, Junhui Li, Shuming Shi

In this paper, we propose a novel response generation model, which considers a set of responses jointly and generates multiple diverse responses simultaneously.

Informativeness Response Generation +1

Towards Less Generic Responses in Neural Conversation Models: A Statistical Re-weighting Method

1 code implementation EMNLP 2018 Yahui Liu, Wei Bi, Jun Gao, Xiaojiang Liu, Jian Yao, Shuming Shi

We observe that in conversation tasks, each query can have multiple responses, forming a 1-to-n or m-to-n relationship across the whole corpus.
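
The training-side consequence of that observation is a per-pair weight on the usual NLL, down-weighting responses that recur across many different queries (the generic ones). A sketch of the plumbing only; the paper's actual weighting scheme is more elaborate:

```python
import torch
import torch.nn.functional as F

def reweighted_nll(logits, targets, weights):
    """logits: (B, T, V) decoder outputs; targets: (B, T) gold tokens;
    weights: (B,) per-pair weights, e.g. lower for frequent generic responses."""
    per_token = F.cross_entropy(logits.transpose(1, 2), targets,
                                reduction="none")          # (B, T)
    return (weights * per_token.mean(dim=1)).mean()
```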

Dialogue Generation Machine Translation +1

Language Style Transfer from Sentences with Arbitrary Unknown Styles

no code implementations 13 Aug 2018 Yanpeng Zhao, Wei Bi, Deng Cai, Xiaojiang Liu, Kewei Tu, Shuming Shi

Then, by recombining the content with the target style, we decode a sentence aligned in the target domain.

Sentence Sentence ReWriting +1

Mandatory Leaf Node Prediction in Hierarchical Multilabel Classification

no code implementations NeurIPS 2012 Wei Bi, James T. Kwok

However, while there are many MLNP methods for hierarchical multiclass classification, performing MLNP in hierarchical multilabel classification is much more difficult.

Classification General Classification
