Search Results for author: Xipeng Qiu

Found 223 papers, 125 papers with code

Star-Transformer

2 code implementations NAACL 2019 Qipeng Guo, Xipeng Qiu, PengFei Liu, Yunfan Shao, xiangyang xue, Zheng Zhang

Although Transformer has achieved great successes on many NLP tasks, its heavy structure with fully-connected attention connections leads to dependencies on large training data.

Named Entity Recognition (NER) Natural Language Inference +2

L-Eval: Instituting Standardized Evaluation for Long Context Language Models

3 code implementations20 Jul 2023 Chenxin An, Shansan Gong, Ming Zhong, Xingjian Zhao, Mukai Li, Jun Zhang, Lingpeng Kong, Xipeng Qiu

Recently, there has been growing interest in extending the context length of large language models (LLMs), aiming to effectively process long inputs of one turn or conversations with more extensive histories.

Instruction Following

InternLM2 Technical Report

1 code implementation26 Mar 2024 Zheng Cai, Maosong Cao, Haojiong Chen, Kai Chen, Keyu Chen, Xin Chen, Xun Chen, Zehui Chen, Zhi Chen, Pei Chu, Xiaoyi Dong, Haodong Duan, Qi Fan, Zhaoye Fei, Yang Gao, Jiaye Ge, Chenya Gu, Yuzhe Gu, Tao Gui, Aijia Guo, Qipeng Guo, Conghui He, Yingfan Hu, Ting Huang, Tao Jiang, Penglong Jiao, Zhenjiang Jin, Zhikai Lei, Jiaxing Li, Jingwen Li, Linyang Li, Shuaibin Li, Wei Li, Yining Li, Hongwei Liu, Jiangning Liu, Jiawei Hong, Kaiwen Liu, Kuikun Liu, Xiaoran Liu, Chengqi Lv, Haijun Lv, Kai Lv, Li Ma, Runyuan Ma, Zerun Ma, Wenchang Ning, Linke Ouyang, Jiantao Qiu, Yuan Qu, FuKai Shang, Yunfan Shao, Demin Song, Zifan Song, Zhihao Sui, Peng Sun, Yu Sun, Huanze Tang, Bin Wang, Guoteng Wang, Jiaqi Wang, Jiayu Wang, Rui Wang, Yudong Wang, Ziyi Wang, Xingjian Wei, Qizhen Weng, Fan Wu, Yingtong Xiong, Chao Xu, Ruiliang Xu, Hang Yan, Yirong Yan, Xiaogui Yang, Haochen Ye, Huaiyuan Ying, JIA YU, Jing Yu, Yuhang Zang, Chuyu Zhang, Li Zhang, Pan Zhang, Peng Zhang, Ruijie Zhang, Shuo Zhang, Songyang Zhang, Wenjian Zhang, Wenwei Zhang, Xingcheng Zhang, Xinyue Zhang, Hui Zhao, Qian Zhao, Xiaomeng Zhao, Fengzhe Zhou, Zaida Zhou, Jingming Zhuo, Yicheng Zou, Xipeng Qiu, Yu Qiao, Dahua Lin

The evolution of Large Language Models (LLMs) like ChatGPT and GPT-4 has sparked discussions on the advent of Artificial General Intelligence (AGI).

4k Long-Context Understanding

TENER: Adapting Transformer Encoder for Named Entity Recognition

7 code implementations10 Nov 2019 Hang Yan, Bocao Deng, Xiaonan Li, Xipeng Qiu

The Bidirectional long short-term memory networks (BiLSTM) have been widely used as an encoder in models solving the named entity recognition (NER) task.

Chinese Named Entity Recognition Named Entity Recognition

BERT-ATTACK: Adversarial Attack Against BERT Using BERT

4 code implementations EMNLP 2020 Linyang Li, Ruotian Ma, Qipeng Guo, xiangyang xue, Xipeng Qiu

Adversarial attacks for discrete data (such as texts) have been proved significantly more challenging than continuous data (such as images) since it is difficult to generate adversarial samples with gradient-based methods.

Adversarial Attack

Pre-trained Models for Natural Language Processing: A Survey

3 code implementations18 Mar 2020 Xipeng Qiu, Tianxiang Sun, Yige Xu, Yunfan Shao, Ning Dai, Xuanjing Huang

Recently, the emergence of pre-trained models (PTMs) has brought natural language processing (NLP) to a new era.

Representation Learning

FLAT: Chinese NER Using Flat-Lattice Transformer

1 code implementation ACL 2020 Xiaonan Li, Hang Yan, Xipeng Qiu, Xuanjing Huang

Recently, the character-word lattice structure has been proved to be effective for Chinese named entity recognition (NER) by incorporating the word information.

Chinese Named Entity Recognition named-entity-recognition +3

Full Parameter Fine-tuning for Large Language Models with Limited Resources

1 code implementation16 Jun 2023 Kai Lv, Yuqing Yang, Tengxiao Liu, Qinghui Gao, Qipeng Guo, Xipeng Qiu

Large Language Models (LLMs) have revolutionized Natural Language Processing (NLP) but demand massive GPU resources for training.

AdaLomo: Low-memory Optimization with Adaptive Learning Rate

1 code implementation16 Oct 2023 Kai Lv, Hang Yan, Qipeng Guo, Haijun Lv, Xipeng Qiu

Building on this insight, we introduce the low-memory optimization with adaptive learning rate (AdaLomo), which offers an adaptive learning rate for each parameter.

CDEvalSumm: An Empirical Study of Cross-Dataset Evaluation for Neural Summarization Systems

2 code implementations Findings of the Association for Computational Linguistics 2020 Yiran Chen, PengFei Liu, Ming Zhong, Zi-Yi Dou, Danqing Wang, Xipeng Qiu, Xuanjing Huang

In this paper, we perform an in-depth analysis of characteristics of different datasets and investigate the performance of different summarization models under a cross-dataset setting, in which a summarizer trained on one corpus will be evaluated on a range of out-of-domain corpora.

Text Summarization

SpeechGPT: Empowering Large Language Models with Intrinsic Cross-Modal Conversational Abilities

1 code implementation18 May 2023 Dong Zhang, ShiMin Li, Xin Zhang, Jun Zhan, Pengyu Wang, Yaqian Zhou, Xipeng Qiu

Multi-modal large language models are regarded as a crucial step towards Artificial General Intelligence (AGI) and have garnered significant interest with the emergence of ChatGPT.

Language Modelling Large Language Model +2

SpeechGPT-Gen: Scaling Chain-of-Information Speech Generation

1 code implementation24 Jan 2024 Dong Zhang, Xin Zhang, Jun Zhan, ShiMin Li, Yaqian Zhou, Xipeng Qiu

It comprises an autoregressive model based on LLM for semantic information modeling and a non-autoregressive model employing flow matching for perceptual information modeling.

Voice Conversion

SpeechAlign: Aligning Speech Generation to Human Preferences

2 code implementations8 Apr 2024 Dong Zhang, Zhaowei Li, ShiMin Li, Xin Zhang, Pengyu Wang, Yaqian Zhou, Xipeng Qiu

However, the integration of human feedback to align speech outputs to human preferences is often neglected.

Language Modelling

fastHan: A BERT-based Multi-Task Toolkit for Chinese NLP

1 code implementation ACL 2021 Zhichao Geng, Hang Yan, Xipeng Qiu, Xuanjing Huang

The joint-model is trained and evaluated on 13 corpora of four tasks, yielding near state-of-the-art (SOTA) performance in dependency parsing and NER, achieving SOTA performance in CWS and POS.

Chinese Word Segmentation Dependency Parsing +6

How to Fine-Tune BERT for Text Classification?

16 code implementations14 May 2019 Chi Sun, Xipeng Qiu, Yige Xu, Xuanjing Huang

Language model pre-training has proven to be useful in learning universal language representations.

General Classification Language Modelling +2

EduChat: A Large-Scale Language Model-based Chatbot System for Intelligent Education

1 code implementation5 Aug 2023 Yuhao Dan, Zhikai Lei, Yiyang Gu, Yong Li, Jianghao Yin, Jiaju Lin, Linhao Ye, Zhiyan Tie, Yougen Zhou, Yilei Wang, Aimin Zhou, Ze Zhou, Qin Chen, Jie zhou, Liang He, Xipeng Qiu

Currently, EduChat is available online as an open-source project, with its code, data, and model parameters available on platforms (e. g., GitHub https://github. com/icalk-nlp/EduChat, Hugging Face https://huggingface. co/ecnu-icalk ).

Chatbot Language Modelling +1

CPT: A Pre-Trained Unbalanced Transformer for Both Chinese Language Understanding and Generation

1 code implementation13 Sep 2021 Yunfan Shao, Zhichao Geng, Yitao Liu, Junqi Dai, Hang Yan, Fei Yang, Li Zhe, Hujun Bao, Xipeng Qiu

In this paper, we take the advantage of previous pre-trained models (PTMs) and propose a novel Chinese Pre-trained Unbalanced Transformer (CPT).

Denoising Language Modelling +3

Evaluating the Performance of Large Language Models on GAOKAO Benchmark

1 code implementation21 May 2023 Xiaotian Zhang, Chunyang Li, Yi Zong, Zhengyu Ying, Liang He, Xipeng Qiu

Large Language Models(LLMs) have demonstrated remarkable performance across various natural language processing tasks; however, how to comprehensively and accurately assess their performance becomes an urgent issue to be addressed.

AnyGPT: Unified Multimodal LLM with Discrete Sequence Modeling

1 code implementation19 Feb 2024 Jun Zhan, Junqi Dai, Jiasheng Ye, Yunhua Zhou, Dong Zhang, Zhigeng Liu, Xin Zhang, Ruibin Yuan, Ge Zhang, Linyang Li, Hang Yan, Jie Fu, Tao Gui, Tianxiang Sun, Yugang Jiang, Xipeng Qiu

We introduce AnyGPT, an any-to-any multimodal language model that utilizes discrete representations for the unified processing of various modalities, including speech, text, images, and music.

Language Modelling Large Language Model

CoNT: Contrastive Neural Text Generation

2 code implementations29 May 2022 Chenxin An, Jiangtao Feng, Kai Lv, Lingpeng Kong, Xipeng Qiu, Xuanjing Huang

We validate CoNT on five generation tasks with ten benchmarks, including machine translation, summarization, code comment generation, data-to-text generation and commonsense generation.

Code Comment Generation Comment Generation +4

CoLLiE: Collaborative Training of Large Language Models in an Efficient Way

1 code implementation1 Dec 2023 Kai Lv, Shuo Zhang, Tianle Gu, Shuhao Xing, Jiawei Hong, Keyu Chen, Xiaoran Liu, Yuqing Yang, Honglin Guo, Tengxiao Liu, Yu Sun, Qipeng Guo, Hang Yan, Xipeng Qiu

This paper introduces CoLLiE, an efficient library that facilitates collaborative training of large language models using 3D parallelism, parameter-efficient fine-tuning (PEFT) methods, and optimizers such as Lion, Adan, Sophia, LOMO and AdaLomo.

Character-LLM: A Trainable Agent for Role-Playing

1 code implementation16 Oct 2023 Yunfan Shao, Linyang Li, Junqi Dai, Xipeng Qiu

Large language models (LLMs) can be used to serve as agents to simulate human behaviors, given the powerful ability to understand human instructions and provide high-quality generated texts.

SpeechTokenizer: Unified Speech Tokenizer for Speech Large Language Models

3 code implementations31 Aug 2023 Xin Zhang, Dong Zhang, ShiMin Li, Yaqian Zhou, Xipeng Qiu

Therefore, we propose SpeechTokenizer, a unified speech tokenizer for speech large language models.

Language Modelling Quantization

Black-Box Tuning for Language-Model-as-a-Service

2 code implementations10 Jan 2022 Tianxiang Sun, Yunfan Shao, Hong Qian, Xuanjing Huang, Xipeng Qiu

In such a scenario, which we call Language-Model-as-a-Service (LMaaS), the gradients of PTMs are usually unavailable.

In-Context Learning Language Modelling

BBTv2: Towards a Gradient-Free Future with Large Language Models

1 code implementation23 May 2022 Tianxiang Sun, Zhengfu He, Hong Qian, Yunhua Zhou, Xuanjing Huang, Xipeng Qiu

By contrast, gradient-free methods only require the forward computation of the PTM to tune the prompt, retaining the benefits of efficient tuning and deployment.

Few-Shot Learning Language Modelling

Heterogeneous Graph Neural Networks for Extractive Document Summarization

1 code implementation ACL 2020 Danqing Wang, PengFei Liu, Yining Zheng, Xipeng Qiu, Xuanjing Huang

An intuitive way is to put them in the graph-based neural network, which has a more complex structure for capturing inter-sentence relationships.

Document Summarization Extractive Document Summarization +3

A Unified Generative Framework for Various NER Subtasks

1 code implementation ACL 2021 Hang Yan, Tao Gui, Junqi Dai, Qipeng Guo, Zheng Zhang, Xipeng Qiu

To that end, we propose to formulate the NER subtasks as an entity span sequence generation task, which can be solved by a unified sequence-to-sequence (Seq2Seq) framework.

named-entity-recognition Named Entity Recognition +2

InternLM-Math: Open Math Large Language Models Toward Verifiable Reasoning

1 code implementation9 Feb 2024 Huaiyuan Ying, Shuo Zhang, Linyang Li, Zhejian Zhou, Yunfan Shao, Zhaoye Fei, Yichuan Ma, Jiawei Hong, Kuikun Liu, Ziyi Wang, Yudong Wang, Zijian Wu, Shuaibin Li, Fengzhe Zhou, Hongwei Liu, Songyang Zhang, Wenwei Zhang, Hang Yan, Xipeng Qiu, Jiayu Wang, Kai Chen, Dahua Lin

We further explore how to use LEAN to solve math problems and study its performance under the setting of multi-task learning which shows the possibility of using LEAN as a unified platform for solving and proving in math.

Data Augmentation GSM8K +3

Training-Free Long-Context Scaling of Large Language Models

1 code implementation27 Feb 2024 Chenxin An, Fei Huang, Jun Zhang, Shansan Gong, Xipeng Qiu, Chang Zhou, Lingpeng Kong

The ability of Large Language Models (LLMs) to process and generate coherent text is markedly weakened when the number of input tokens exceeds their pretraining length.

16k

Pre-training Multilingual Neural Machine Translation by Leveraging Alignment Information

1 code implementation EMNLP 2020 Zehui Lin, Xiao Pan, Mingxuan Wang, Xipeng Qiu, Jiangtao Feng, Hao Zhou, Lei LI

We investigate the following question for machine translation (MT): can we develop a single universal MT model to serve as the common seed and obtain derivative and improved models on arbitrary language pairs?

Ranked #3 on Machine Translation on WMT2014 English-French (using extra training data)

Machine Translation Translation

A Survey of Neural Code Intelligence: Paradigms, Advances and Beyond

1 code implementation21 Mar 2024 Qiushi Sun, Zhirui Chen, Fangzhi Xu, Kanzhi Cheng, Chang Ma, Zhangyue Yin, Jianing Wang, Chengcheng Han, Renyu Zhu, Shuai Yuan, Qipeng Guo, Xipeng Qiu, Pengcheng Yin, XiaoLi Li, Fei Yuan, Lingpeng Kong, Xiang Li, Zhiyong Wu

Building on our examination of the developmental trajectories, we further investigate the emerging synergies between code intelligence and broader machine intelligence, uncovering new cross-domain opportunities and illustrating the substantial influence of code intelligence across various domains.

Reinforced Mnemonic Reader for Machine Reading Comprehension

3 code implementations8 May 2017 Minghao Hu, Yuxing Peng, Zhen Huang, Xipeng Qiu, Furu Wei, Ming Zhou

In this paper, we introduce the Reinforced Mnemonic Reader for machine reading comprehension tasks, which enhances previous attentive readers in two aspects.

Machine Reading Comprehension Question Answering +2

Does syntax matter? A strong baseline for Aspect-based Sentiment Analysis with RoBERTa

1 code implementation NAACL 2021 Junqi Dai, Hang Yan, Tianxiang Sun, PengFei Liu, Xipeng Qiu

In this paper, we firstly compare the induced trees from PTMs and the dependency parsing trees on several popular models for the ABSA task, showing that the induced tree from fine-tuned RoBERTa (FT-RoBERTa) outperforms the parser-provided tree.

Aspect-Based Sentiment Analysis Aspect-Based Sentiment Analysis (ABSA) +1

Overview of the NLPCC 2017 Shared Task: Chinese News Headline Categorization

1 code implementation9 Jun 2017 Xipeng Qiu, Jingjing Gong, Xuanjing Huang

In this paper, we give an overview for the shared task at the CCF Conference on Natural Language Processing \& Chinese Computing (NLPCC 2017): Chinese News Headline Categorization.

CoLAKE: Contextualized Language and Knowledge Embedding

1 code implementation COLING 2020 Tianxiang Sun, Yunfan Shao, Xipeng Qiu, Qipeng Guo, Yaru Hu, Xuanjing Huang, Zheng Zhang

With the emerging branch of incorporating factual knowledge into pre-trained language models such as BERT, most existing models consider shallow, static, and separately pre-trained entity embeddings, which limits the performance gains of these models.

Entity Embeddings Knowledge Graph Completion +1

QMSum: A New Benchmark for Query-based Multi-domain Meeting Summarization

1 code implementation NAACL 2021 Ming Zhong, Da Yin, Tao Yu, Ahmad Zaidi, Mutethia Mutuma, Rahul Jha, Ahmed Hassan Awadallah, Asli Celikyilmaz, Yang Liu, Xipeng Qiu, Dragomir Radev

As increasing numbers of meetings are recorded and transcribed, meeting summaries have become essential to remind those who may or may not have attended the meetings about the key decisions made and the tasks to be completed.

Meeting Summarization

Learning Sparse Sharing Architectures for Multiple Tasks

1 code implementation12 Nov 2019 Tianxiang Sun, Yunfan Shao, Xiaonan Li, PengFei Liu, Hang Yan, Xipeng Qiu, Xuanjing Huang

Most existing deep multi-task learning models are based on parameter sharing, such as hard sharing, hierarchical sharing, and soft sharing.

Multi-Task Learning

CycleGT: Unsupervised Graph-to-Text and Text-to-Graph Generation via Cycle Training

2 code implementations ACL (WebNLG, INLG) 2020 Qipeng Guo, Zhijing Jin, Xipeng Qiu, Wei-Nan Zhang, David Wipf, Zheng Zhang

Due to the difficulty and high cost of data collection, the supervised data available in the two fields are usually on the magnitude of tens of thousands, for example, 18K in the WebNLG~2017 dataset after preprocessing, which is far fewer than the millions of data for other tasks such as machine translation.

Graph Generation Knowledge Graphs +2

Fork or Fail: Cycle-Consistent Training with Many-to-One Mappings

1 code implementation14 Dec 2020 Qipeng Guo, Zhijing Jin, Ziyu Wang, Xipeng Qiu, Weinan Zhang, Jun Zhu, Zheng Zhang, David Wipf

Cycle-consistent training is widely used for jointly learning a forward and inverse mapping between two domains of interest without the cumbersome requirement of collecting matched pairs within each domain.

Knowledge Graphs Text Generation

Do Large Language Models Know What They Don't Know?

1 code implementation29 May 2023 Zhangyue Yin, Qiushi Sun, Qipeng Guo, Jiawen Wu, Xipeng Qiu, Xuanjing Huang

Large language models (LLMs) have a wealth of knowledge that allows them to excel in various Natural Language Processing (NLP) tasks.

In-Context Learning

SpeechAgents: Human-Communication Simulation with Multi-Modal Multi-Agent Systems

1 code implementation8 Jan 2024 Dong Zhang, Zhaowei Li, Pengyu Wang, Xin Zhang, Yaqian Zhou, Xipeng Qiu

In this paper, we propose SpeechAgents, a multi-modal LLM based multi-agent system designed for simulating human communication.

Language Modelling Large Language Model

Data Mixing Laws: Optimizing Data Mixtures by Predicting Language Modeling Performance

1 code implementation25 Mar 2024 Jiasheng Ye, Peiju Liu, Tianxiang Sun, Yunhua Zhou, Jun Zhan, Xipeng Qiu

Pretraining data of large language models composes multiple domains (e. g., web texts, academic papers, codes), whose mixture proportions crucially impact the competence of outcome models.

Language Modelling

Soft-Labeled Contrastive Pre-training for Function-level Code Representation

1 code implementation18 Oct 2022 Xiaonan Li, Daya Guo, Yeyun Gong, Yun Lin, Yelong Shen, Xipeng Qiu, Daxin Jiang, Weizhu Chen, Nan Duan

In this paper, we present \textbf{SCodeR}, a \textbf{S}oft-labeled contrastive pre-training framework with two positive sample construction methods to learn functional-level \textbf{Code} \textbf{R}epresentation.

Scaling Laws of RoPE-based Extrapolation

1 code implementation8 Oct 2023 Xiaoran Liu, Hang Yan, Shuo Zhang, Chenxin An, Xipeng Qiu, Dahua Lin

The extrapolation capability of Large Language Models (LLMs) based on Rotary Position Embedding is currently a topic of considerable interest.

16k

Improving BERT Fine-Tuning via Self-Ensemble and Self-Distillation

1 code implementation24 Feb 2020 Yige Xu, Xipeng Qiu, Ligao Zhou, Xuanjing Huang

Fine-tuning pre-trained language models like BERT has become an effective way in NLP and yields state-of-the-art results on many downstream tasks.

Natural Language Inference text-classification +1

Towards Efficient NLP: A Standard Evaluation and A Strong Baseline

1 code implementation NAACL 2022 Xiangyang Liu, Tianxiang Sun, Junliang He, Jiawen Wu, Lingling Wu, Xinyu Zhang, Hao Jiang, Zhao Cao, Xuanjing Huang, Xipeng Qiu

ELUE is dedicated to depict the Pareto Frontier for various language understanding tasks, such that it can tell whether and how much a method achieves Pareto improvement.

Alignment for Honesty

1 code implementation12 Dec 2023 Yuqing Yang, Ethan Chern, Xipeng Qiu, Graham Neubig, PengFei Liu

Recent research has made significant strides in applying alignment techniques to enhance the helpfulness and harmlessness of large language models (LLMs) in accordance with human intentions.

Contrast and Generation Make BART a Good Dialogue Emotion Recognizer

1 code implementation21 Dec 2021 ShiMin Li, Hang Yan, Xipeng Qiu

Meanwhile, we utilize an auxiliary response generation task to enhance the model's ability of handling context information, thereby forcing the model to recognize emotions with similar semantics in diverse contexts.

Contrastive Learning Emotion Recognition in Conversation +1

Paradigm Shift in Natural Language Processing

1 code implementation26 Sep 2021 Tianxiang Sun, Xiangyang Liu, Xipeng Qiu, Xuanjing Huang

In this paper, we review such phenomenon of paradigm shifts in recent years, highlighting several paradigms that have the potential to solve different NLP tasks.

Chunking NER +3

Information Aggregation via Dynamic Routing for Sequence Encoding

2 code implementations COLING 2018 Jingjing Gong, Xipeng Qiu, Shaojing Wang, Xuanjing Huang

The dynamic routing policy is dynamically deciding that what and how much information need be transferred from each word to the final encoding of the text sequence.

Sentiment Analysis text-classification +1

Can AI Assistants Know What They Don't Know?

1 code implementation24 Jan 2024 Qinyuan Cheng, Tianxiang Sun, Xiangyang Liu, Wenwei Zhang, Zhangyue Yin, ShiMin Li, Linyang Li, Zhengfu He, Kai Chen, Xipeng Qiu

To answer this question, we construct a model-specific "I don't know" (Idk) dataset for an assistant, which contains its known and unknown questions, based on existing open-domain question answering datasets.

Math Open-Domain Question Answering +1

A Graph-based Model for Joint Chinese Word Segmentation and Dependency Parsing

1 code implementation TACL 2020 Hang Yan, Xipeng Qiu, Xuanjing Huang

Our graph-based joint model achieves better performance than previous joint models and state-of-the-art results in both Chinese word segmentation and dependency parsing.

Chinese Word Segmentation Dependency Parsing +3

Improving Contrastive Learning of Sentence Embeddings from AI Feedback

1 code implementation3 May 2023 Qinyuan Cheng, Xiaogui Yang, Tianxiang Sun, Linyang Li, Xipeng Qiu

Our method utilizes AI feedback from large pre-trained language models (LLMs) to construct sample pairs with fine-grained sample similarity scores to improve contrastive learning.

Contrastive Learning Data Augmentation +5

Improving Image Captioning with Better Use of Captions

1 code implementation21 Jun 2020 Zhan Shi, Xu Zhou, Xipeng Qiu, Xiaodan Zhu

Image captioning is a multimodal problem that has drawn extensive attention in both the natural language processing and computer vision community.

Caption Generation Image Captioning +3

Accelerating BERT Inference for Sequence Labeling via Early-Exit

1 code implementation ACL 2021 Xiaonan Li, Yunfan Shao, Tianxiang Sun, Hang Yan, Xipeng Qiu, Xuanjing Huang

To alleviate this problem, we extend the recent successful early-exit mechanism to accelerate the inference of PTMs for sequence labeling tasks.

Sentence

Unified Demonstration Retriever for In-Context Learning

1 code implementation7 May 2023 Xiaonan Li, Kai Lv, Hang Yan, Tianyang Lin, Wei Zhu, Yuan Ni, Guotong Xie, Xiaoling Wang, Xipeng Qiu

To train UDR, we cast various tasks' training signals into a unified list-wise ranking formulation by language model's feedback.

In-Context Learning Language Modelling +1

TAVAT: Token-Aware Virtual Adversarial Training for Language Understanding

1 code implementation30 Apr 2020 Linyang Li, Xipeng Qiu

Gradient-based adversarial training is widely used in improving the robustness of neural networks, while it cannot be easily adapted to natural language processing tasks since the embedding space is discrete.

Natural Language Understanding text-classification +1

CodeIE: Large Code Generation Models are Better Few-Shot Information Extractors

1 code implementation9 May 2023 Peng Li, Tianxiang Sun, Qiong Tang, Hang Yan, Yuanbin Wu, Xuanjing Huang, Xipeng Qiu

A common practice is to recast the task into a text-to-text format such that generative LLMs of natural language (NL-LLMs) like GPT-3 can be prompted to solve it.

Code Generation Few-Shot Learning +4

Enhancing Scientific Papers Summarization with Citation Graph

1 code implementation7 Apr 2021 Chenxin An, Ming Zhong, Yiran Chen, Danqing Wang, Xipeng Qiu, Xuanjing Huang

Previous work for text summarization in scientific domain mainly focused on the content of the input document, but seldom considering its citation network.

Text Summarization

COLO: A Contrastive Learning based Re-ranking Framework for One-Stage Summarization

1 code implementation COLING 2022 Chenxin An, Ming Zhong, Zhiyong Wu, Qin Zhu, Xuanjing Huang, Xipeng Qiu

Traditional training paradigms for extractive and abstractive summarization systems always only use token-level or sentence-level training objectives.

Abstractive Text Summarization Contrastive Learning +2

LLatrieval: LLM-Verified Retrieval for Verifiable Generation

1 code implementation14 Nov 2023 Xiaonan Li, Changtai Zhu, Linyang Li, Zhangyue Yin, Tianxiang Sun, Xipeng Qiu

Thus, the LLM can iteratively provide feedback to retrieval and facilitate the retrieval result to fully support verifiable generation.

Language Modelling Large Language Model +1

LongWanjuan: Towards Systematic Measurement for Long Text Quality

1 code implementation21 Feb 2024 Kai Lv, Xiaoran Liu, Qipeng Guo, Hang Yan, Conghui He, Xipeng Qiu, Dahua Lin

The quality of training data are crucial for enhancing the long-text capabilities of foundation models.

Language Modelling

Plan, Verify and Switch: Integrated Reasoning with Diverse X-of-Thoughts

1 code implementation23 Oct 2023 Tengxiao Liu, Qipeng Guo, Yuqing Yang, Xiangkun Hu, Yue Zhang, Xipeng Qiu, Zheng Zhang

As large language models (LLMs) have shown effectiveness with different prompting methods, such as Chain of Thought, Program of Thought, we find that these methods have formed a great complementarity to each other on math reasoning tasks.

Logical Reasoning Math

KNN-BERT: Fine-Tuning Pre-Trained Models with KNN Classifier

1 code implementation6 Oct 2021 Linyang Li, Demin Song, Ruotian Ma, Xipeng Qiu, Xuanjing Huang

Pre-trained models are widely used in fine-tuning downstream tasks with linear classifiers optimized by the cross-entropy loss, which might face robustness and stability problems.

Contrastive Learning text-classification +1

An AMR-based Link Prediction Approach for Document-level Event Argument Extraction

1 code implementation30 May 2023 Yuqing Yang, Qipeng Guo, Xiangkun Hu, Yue Zhang, Xipeng Qiu, Zheng Zhang

Motivated by the fact that all event structures can be inferred from AMR, this work reformulates EAE as a link prediction problem on AMR graphs.

Event Argument Extraction Link Prediction +1

From Hypergraph Energy Functions to Hypergraph Neural Networks

1 code implementation16 Jun 2023 Yuxin Wang, Quan Gan, Xipeng Qiu, Xuanjing Huang, David Wipf

Hypergraphs are a powerful abstraction for representing higher-order interactions between entities of interest.

Bilevel Optimization Node Classification

GAOKAO-MM: A Chinese Human-Level Benchmark for Multimodal Models Evaluation

1 code implementation24 Feb 2024 Yi Zong, Xipeng Qiu

The Large Vision-Language Models (LVLMs) have demonstrated great abilities in image perception and language understanding.

A Concise Model for Multi-Criteria Chinese Word Segmentation with Transformer Encoder

1 code implementation Findings of the Association for Computational Linguistics 2020 Xipeng Qiu, Hengzhi Pei, Hang Yan, Xuanjing Huang

Multi-criteria Chinese word segmentation (MCCWS) aims to exploit the relations among the multiple heterogeneous segmentation criteria and further improve the performance of each single criterion.

Chinese Word Segmentation Multi-Task Learning +1

AutoTrans: Automating Transformer Design via Reinforced Architecture Search

3 code implementations4 Sep 2020 Wei Zhu, Xiaoling Wang, Xipeng Qiu, Yuan Ni, Guotong Xie

Though the transformer architectures have shown dominance in many natural language understanding tasks, there are still unsolved issues for the training of transformer models, especially the need for a principled way of warm-up which has shown importance for stable training of a transformer, as well as whether the task at hand prefer to scale the attention product or not.

Natural Language Understanding Navigate

Watermarking LLMs with Weight Quantization

1 code implementation17 Oct 2023 Linyang Li, Botian Jiang, Pengyu Wang, Ke Ren, Hang Yan, Xipeng Qiu

Abuse of large language models reveals high risks as large language models are being deployed at an astonishing speed.

Language Modelling Large Language Model +1

VCWE: Visual Character-Enhanced Word Embeddings

1 code implementation NAACL 2019 Chi Sun, Xipeng Qiu, Xuanjing Huang

Chinese is a logographic writing system, and the shape of Chinese characters contain rich syntactic and semantic information.

named-entity-recognition Named Entity Recognition +5

Flames: Benchmarking Value Alignment of LLMs in Chinese

1 code implementation12 Nov 2023 Kexin Huang, Xiangyang Liu, Qianyu Guo, Tianxiang Sun, Jiawei Sun, Yaru Wang, Zeyang Zhou, Yixu Wang, Yan Teng, Xipeng Qiu, Yingchun Wang, Dahua Lin

The widespread adoption of large language models (LLMs) across various regions underscores the urgent need to evaluate their alignment with human values.

Benchmarking Fairness

InferAligner: Inference-Time Alignment for Harmlessness through Cross-Model Guidance

1 code implementation20 Jan 2024 Pengyu Wang, Dong Zhang, Linyang Li, Chenkun Tan, Xinghao Wang, Ke Ren, Botian Jiang, Xipeng Qiu

With the rapid development of large language models (LLMs), they are not only used as general-purpose AI assistants but are also customized through further fine-tuning to meet the requirements of different applications.

Keyphrase Generation with Fine-Grained Evaluation-Guided Reinforcement Learning

1 code implementation Findings (EMNLP) 2021 Yichao Luo, Yige Xu, Jiacheng Ye, Xipeng Qiu, Qi Zhang

In response to this problem, we propose a new fine-grained evaluation metric to improve the RL framework, which considers different granularities: token-level $F_1$ score, edit distance, duplication, and prediction quantities.

Keyphrase Generation reinforcement-learning +1

MoT: Memory-of-Thought Enables ChatGPT to Self-Improve

1 code implementation9 May 2023 Xiaonan Li, Xipeng Qiu

Specifically, MoT is divided into two stages: 1. before the test stage, the LLM pre-thinks on the unlabeled dataset and saves the high-confidence thoughts as external memory; 2.

Arithmetic Reasoning Natural Language Inference

Corex: Pushing the Boundaries of Complex Reasoning through Multi-Model Collaboration

1 code implementation30 Sep 2023 Qiushi Sun, Zhangyue Yin, Xiang Li, Zhiyong Wu, Xipeng Qiu, Lingpeng Kong

Large Language Models (LLMs) are evolving at an unprecedented pace and have exhibited considerable capability in the realm of natural language processing (NLP) with world knowledge.

World Knowledge

DenoSent: A Denoising Objective for Self-Supervised Sentence Representation Learning

1 code implementation24 Jan 2024 Xinghao Wang, Junliang He, Pengyu Wang, Yunhua Zhou, Tianxiang Sun, Xipeng Qiu

These methods regularize the representation space by pulling similar sentence representations closer and pushing away the dissimilar ones and have been proven effective in various NLP tasks, e. g., semantic textual similarity (STS) tasks.

Contrastive Learning Denoising +4

DORE: Document Ordered Relation Extraction based on Generative Framework

1 code implementation28 Oct 2022 Qipeng Guo, Yuqing Yang, Hang Yan, Xipeng Qiu, Zheng Zhang

In this paper, we investigate the root cause of the underwhelming performance of the existing generative DocRE models and discover that the culprit is the inadequacy of the training paradigm, instead of the capacities of the models.

Document-level Relation Extraction Relation

RLET: A Reinforcement Learning Based Approach for Explainable QA with Entailment Trees

1 code implementation31 Oct 2022 Tengxiao Liu, Qipeng Guo, Xiangkun Hu, Yue Zhang, Xipeng Qiu, Zheng Zhang

RLET iteratively performs single step reasoning with sentence selection and deduction generation modules, from which the training signal is accumulated across the tree with elaborately designed aligned reward function that is consistent with the evaluation.

reinforcement-learning Reinforcement Learning (RL) +1

Exchange-of-Thought: Enhancing Large Language Model Capabilities through Cross-Model Communication

1 code implementation4 Dec 2023 Zhangyue Yin, Qiushi Sun, Cheng Chang, Qipeng Guo, Junqi Dai, Xuanjing Huang, Xipeng Qiu

Large Language Models (LLMs) have recently made significant strides in complex reasoning tasks through the Chain-of-Thought technique.

Language Modelling Large Language Model

PromptNER: A Prompting Method for Few-shot Named Entity Recognition via k Nearest Neighbor Search

1 code implementation20 May 2023 Mozhi Zhang, Hang Yan, Yaqian Zhou, Xipeng Qiu

We use prompts that contains entity category information to construct label prototypes, which enables our model to fine-tune with only the support set.

few-shot-ner Few-shot NER +4

Are Factuality Checkers Reliable? Adversarial Meta-evaluation of Factuality in Summarization

1 code implementation Findings (EMNLP) 2021 Yiran Chen, PengFei Liu, Xipeng Qiu

In this paper, we present an adversarial meta-evaluation methodology that allows us to (i) diagnose the fine-grained strengths and weaknesses of 6 existing top-performing metrics over 24 diagnostic test datasets, (ii) search for directions for further improvement by data augmentation.

Data Augmentation

What Dense Graph Do You Need for Self-Attention?

1 code implementation27 May 2022 Yuxin Wang, Chu-Tak Lee, Qipeng Guo, Zhangyue Yin, Yunhua Zhou, Xuanjing Huang, Xipeng Qiu

Transformers have made progress in miscellaneous tasks, but suffer from quadratic computational and memory complexities.

Miscellaneous

Late Prompt Tuning: A Late Prompt Could Be Better Than Many Prompts

1 code implementation20 Oct 2022 Xiangyang Liu, Tianxiang Sun, Xuanjing Huang, Xipeng Qiu

Through extensive experimental results across various tasks and PTMs, we show that LPT can achieve competitive performance to full model tuning and other PETuning methods under both full-data and few-shot scenarios while possessing faster training speed and lower memory cost.

A Simple Hash-Based Early Exiting Approach For Language Understanding and Generation

1 code implementation Findings (ACL) 2022 Tianxiang Sun, Xiangyang Liu, Wei Zhu, Zhichao Geng, Lingling Wu, Yilong He, Yuan Ni, Guotong Xie, Xuanjing Huang, Xipeng Qiu

Previous works usually adopt heuristic metrics such as the entropy of internal outputs to measure instance difficulty, which suffers from generalization and threshold-tuning.

Dialogue Meaning Representation for Task-Oriented Dialogue Systems

1 code implementation23 Apr 2022 Xiangkun Hu, Junqi Dai, Hang Yan, Yi Zhang, Qipeng Guo, Xipeng Qiu, Zheng Zhang

We propose Dialogue Meaning Representation (DMR), a pliable and easily extendable representation for task-oriented dialogue.

coreference-resolution Negation +1

Is MultiWOZ a Solved Task? An Interactive TOD Evaluation Framework with User Simulator

1 code implementation26 Oct 2022 Qinyuan Cheng, Linyang Li, Guofeng Quan, Feng Gao, Xiaofeng Mou, Xipeng Qiu

Besides, we introduce a sentence-level and a session-level score to measure the sentence fluency and session coherence in the interactive evaluation.

Sentence

PerturbScore: Connecting Discrete and Continuous Perturbations in NLP

1 code implementation13 Oct 2023 Linyang Li, Ke Ren, Yunfan Shao, Pengyu Wang, Xipeng Qiu

Through experimental results, we find that we can build a connection between discrete and continuous perturbations and use the proposed PerturbScore to learn such correlation, surpassing previous methods used in discrete perturbation measuring.

Rethinking Label Smoothing on Multi-hop Question Answering

2 code implementations19 Dec 2022 Zhangyue Yin, Yuxin Wang, Xiannian Hu, Yiguang Wu, Hang Yan, Xinyu Zhang, Zhao Cao, Xuanjing Huang, Xipeng Qiu

Multi-Hop Question Answering (MHQA) is a significant area in question answering, requiring multiple reasoning components, including document retrieval, supporting sentence prediction, and answer span extraction.

Image Classification Machine Reading Comprehension +6

F-Eval: Asssessing Fundamental Abilities with Refined Evaluation Methods

1 code implementation26 Jan 2024 Yu Sun, Keyu Chen, Shujie Wang, Qipeng Guo, Hang Yan, Xipeng Qiu, Xuanjing Huang, Dahua Lin

However, these evaluation benchmarks are limited to assessing the instruction-following capabilities, overlooking the fundamental abilities that emerge during the pre-training stage.

Instruction Following

Hint-before-Solving Prompting: Guiding LLMs to Effectively Utilize Encoded Knowledge

1 code implementation22 Feb 2024 Jinlan Fu, Shenzhen Huangfu, Hang Yan, See-Kiong Ng, Xipeng Qiu

Large Language Models (LLMs) have recently showcased remarkable generalizability in various domains.

Logical Reasoning

The Open-World Lottery Ticket Hypothesis for OOD Intent Classification

1 code implementation13 Oct 2022 Yunhua Zhou, Pengyu Wang, Peiju Liu, Yuxin Wang, Xipeng Qiu

Most existing methods of Out-of-Domain (OOD) intent classification rely on extensive auxiliary OOD corpora or specific training paradigms.

intent-classification Intent Classification

Mitigating Negative Style Transfer in Hybrid Dialogue System

1 code implementation14 Dec 2022 ShiMin Li, Qinyuan Cheng, Linyang Li, Xipeng Qiu

As the functionality of dialogue systems evolves, hybrid dialogue systems that accomplish user-specific goals and participate in open-topic chitchat with users are attracting growing attention.

Contrastive Learning Style Transfer

Same Representation, Different Attentions: Shareable Sentence Representation Learning from Multiple Tasks

no code implementations22 Apr 2018 Renjie Zheng, Junkun Chen, Xipeng Qiu

More specifically, all tasks share the same sentence representation and each task can select the task-specific information from the shared sentence representation with attention mechanism.

General Classification Multi-Task Learning +4

Incorporating Discriminator in Sentence Generation: a Gibbs Sampling Method

no code implementations25 Feb 2018 Jinyue Su, Jiacheng Xu, Xipeng Qiu, Xuanjing Huang

Generating plausible and fluent sentence with desired properties has long been a challenge.

Sentence

Meta Multi-Task Learning for Sequence Modeling

no code implementations25 Feb 2018 Junkun Chen, Xipeng Qiu, Pengfei Liu, Xuanjing Huang

Specifically, we use a shared meta-network to capture the meta-knowledge of semantic composition and generate the parameters of the task-specific semantic composition models.

Multi-Task Learning Representation Learning +3

A Feature-Enriched Neural Model for Joint Chinese Word Segmentation and Part-of-Speech Tagging

no code implementations16 Nov 2016 Xinchi Chen, Xipeng Qiu, Xuanjing Huang

Recently, neural network models for natural language processing tasks have been increasingly focused on for their ability of alleviating the burden of manual feature engineering.

Chinese Word Segmentation Feature Engineering +1

DAG-based Long Short-Term Memory for Neural Word Segmentation

no code implementations2 Jul 2017 Xinchi Chen, Zhan Shi, Xipeng Qiu, Xuanjing Huang

In this paper, we propose a new neural model to incorporate the word-level information for Chinese word segmentation.

Chinese Word Segmentation Feature Engineering +2

Dynamic Compositional Neural Networks over Tree Structure

no code implementations11 May 2017 Pengfei Liu, Xipeng Qiu, Xuanjing Huang

Tree-structured neural networks have proven to be effective in learning semantic representations by exploiting syntactic information.

Learning Semantic Representations

Adversarial Multi-task Learning for Text Classification

no code implementations ACL 2017 Pengfei Liu, Xipeng Qiu, Xuanjing Huang

Neural network models have shown their promising opportunities for multi-task learning, which focus on learning the shared layers to extract the common and task-invariant features.

General Classification Multi-Task Learning +2

Knowledge Graph Representation with Jointly Structural and Textual Encoding

no code implementations26 Nov 2016 Jiacheng Xu, Kan Chen, Xipeng Qiu, Xuanjing Huang

In this paper, we propose a novel deep architecture to utilize both structural and textual information of entities.

General Classification Knowledge Graph Embedding +2

End-to-End Neural Sentence Ordering Using Pointer Network

no code implementations15 Nov 2016 Jingjing Gong, Xinchi Chen, Xipeng Qiu, Xuanjing Huang

However, it is nontrivial for pair-wise models to incorporate the contextual sentence information.

Sentence Sentence Ordering

Deep Multi-Task Learning with Shared Memory

no code implementations23 Sep 2016 Pengfei Liu, Xipeng Qiu, Xuanjing Huang

Neural network based models have achieved impressive results on various specific tasks.

General Classification Multi-Task Learning +2

Learning Word Embeddings from Intrinsic and Extrinsic Views

no code implementations20 Aug 2016 Jifan Chen, Kan Chen, Xipeng Qiu, Qi Zhang, Xuanjing Huang, Zheng Zhang

To prove the effectiveness of our model, we evaluate it on four tasks, including word similarity, reverse dictionaries, Wiki link prediction, and document classification.

Descriptive Document Classification +4

Neural Sentence Ordering

no code implementations23 Jul 2016 Xinchi Chen, Xipeng Qiu, Xuanjing Huang

Sentence ordering is a general and critical task for natural language generation applications.

Document Summarization Multi-Document Summarization +2

Syntax-based Attention Model for Natural Language Inference

no code implementations22 Jul 2016 PengFei Liu, Xipeng Qiu, Xuanjing Huang

Introducing attentional mechanism in neural network is a powerful concept, and has achieved impressive results in many natural language processing tasks.

Natural Language Inference Sentence

Modelling Interaction of Sentence Pair with coupled-LSTMs

no code implementations EMNLP 2016 Pengfei Liu, Xipeng Qiu, Xuanjing Huang

Recently, there is rising interest in modelling the interactions of two sentences with deep neural networks.

Sentence

Bridging LSTM Architecture and the Neural Dynamics during Reading

no code implementations22 Apr 2016 Peng Qian, Xipeng Qiu, Xuanjing Huang

Recently, the long short-term memory neural network (LSTM) has attracted wide interest due to its success in many tasks.

Gaussian Mixture Embeddings for Multiple Word Prototypes

no code implementations19 Nov 2015 Xinchi Chen, Xipeng Qiu, Jingxiang Jiang, Xuanjing Huang

In this paper, we propose the Gaussian mixture skip-gram (GMSG) model to learn the Gaussian mixture embeddings for words based on skip-gram framework.

Overview of the NLPCC 2015 Shared Task: Chinese Word Segmentation and POS Tagging for Micro-blog Texts

no code implementations28 May 2015 Xipeng Qiu, Peng Qian, Liusong Yin, Shiyu Wu, Xuanjing Huang

In this paper, we give an overview for the shared task at the 4th CCF Conference on Natural Language Processing \& Chinese Computing (NLPCC 2015): Chinese word segmentation and part-of-speech (POS) tagging for micro-blog texts.

Chinese Word Segmentation Part-Of-Speech Tagging +3

A Re-ranking Model for Dependency Parser with Recursive Convolutional Neural Network

no code implementations IJCNLP 2015 Chenxi Zhu, Xipeng Qiu, Xinchi Chen, Xuanjing Huang

In this work, we address the problem to model all the nodes (words or phrases) in a dependency tree with the dense representations.

Dependency Parsing Re-Ranking

Top-Down Tree Structured Text Generation

no code implementations14 Aug 2018 Qipeng Guo, Xipeng Qiu, xiangyang xue, Zheng Zhang

Text generation is a fundamental building block in natural language processing tasks.

Sentence Text Generation

Gaussian Word Embedding with a Wasserstein Distance Loss

no code implementations21 Aug 2018 Chi Sun, Hang Yan, Xipeng Qiu, Xuanjing Huang

Therefore, with the aim of representing words in a highly efficient way, we propose to operate a Gaussian word embedding model with a loss function based on the Wasserstein distance.

Document Classification General Classification +1

Deformable Stacked Structure for Named Entity Recognition

no code implementations24 Sep 2018 Shuyang Cao, Xipeng Qiu, Xuanjing Huang

Neural architecture for named entity recognition has achieved great success in the field of natural language processing.

named-entity-recognition Named Entity Recognition +1

Neural Arithmetic Expression Calculator

no code implementations23 Sep 2018 Kaiyu Chen, Yihan Dong, Xipeng Qiu, Zitian Chen

With curriculum learning, our model can deal with a complex arithmetic expression calculation with the deep hierarchical structure of skill models.

Hierarchical Reinforcement Learning

Multi-task Learning over Graph Structures

no code implementations26 Nov 2018 Pengfei Liu, Jie Fu, Yue Dong, Xipeng Qiu, Jackie Chi Kit Cheung

We present two architectures for multi-task learning with neural sequence models.

General Classification Multi-Task Learning +2

Switch-LSTMs for Multi-Criteria Chinese Word Segmentation

no code implementations19 Dec 2018 Jingjing Gong, Xinchi Chen, Tao Gui, Xipeng Qiu

With these auto-switched LSTMs, our model provides a more flexible solution for multi-criteria CWS, which is also easy to transfer the learned knowledge to new criteria.

Chinese Word Segmentation Segmentation

Idiom-Aware Compositional Distributed Semantics

no code implementations EMNLP 2017 Pengfei Liu, Kaiyu Qian, Xipeng Qiu, Xuanjing Huang

Idioms are peculiar linguistic constructions that impose great challenges for representing the semantics of language, especially in current prevailing end-to-end neural models, which assume that the semantics of a phrase or sentence can be literally composed from its constitutive words.

General Classification Machine Translation +4

DropAttention: A Regularization Method for Fully-Connected Self-Attention Networks

no code implementations25 Jul 2019 Lin Zehui, PengFei Liu, Luyao Huang, Junkun Chen, Xipeng Qiu, Xuanjing Huang

Variants dropout methods have been designed for the fully-connected layer, convolutional layer and recurrent layer in neural networks, and shown to be effective to avoid overfitting.

Exploring Domain Shift in Extractive Text Summarization

no code implementations30 Aug 2019 Danqing Wang, PengFei Liu, Ming Zhong, Jie Fu, Xipeng Qiu, Xuanjing Huang

Although domain shift has been well explored in many NLP applications, it still has received little attention in the domain of extractive text summarization.

Extractive Text Summarization Meta-Learning

A Closer Look at Data Bias in Neural Extractive Summarization Models

no code implementations WS 2019 Ming Zhong, Danqing Wang, PengFei Liu, Xipeng Qiu, Xuanjing Huang

In this paper, we take stock of the current state of summarization datasets and explore how different factors of datasets influence the generalization behaviour of neural extractive summarization models.

Extractive Summarization

Unified Multi-Criteria Chinese Word Segmentation with BERT

no code implementations13 Apr 2020 Zhen Ke, Liang Shi, Erli Meng, Bin Wang, Xipeng Qiu, Xuanjing Huang

Besides, the pre-trained BERT language model has been also introduced into the MCCWS task in a multi-task learning framework.

Chinese Word Segmentation Language Modelling +3

Improving Image Captioning with Better Use of Caption

no code implementations ACL 2020 Zhan Shi, Xu Zhou, Xipeng Qiu, Xiaodan Zhu

Image captioning is a multimodal problem that has drawn extensive attention in both the natural language processing and computer vision community.

Caption Generation Image Captioning +3

Text Information Aggregation with Centrality Attention

no code implementations16 Nov 2020 Jingjing Gong, Hang Yan, Yining Zheng, Xipeng Qiu, Xuanjing Huang

A lot of natural language processing problems need to encode the text sequence as a fix-length vector, which usually involves aggregation process of combining the representations of all the words, such as pooling or self-attention.

Sentence text-classification +1

Generating Adversarial Examples in Chinese Texts Using Sentence-Pieces

no code implementations29 Dec 2020 Linyang Li, Yunfan Shao, Demin Song, Xipeng Qiu, Xuanjing Huang

The substitutions in the generated adversarial examples are not characters or words but \textit{'pieces'}, which are more natural to Chinese readers.

Language Modelling Sentence

Early Exiting with Ensemble Internal Classifiers

no code implementations28 May 2021 Tianxiang Sun, Yunhua Zhou, Xiangyang Liu, Xinyu Zhang, Hao Jiang, Zhao Cao, Xuanjing Huang, Xipeng Qiu

In this paper, we show that a novel objective function for the training of the ensemble internal classifiers can be naturally induced from the perspective of ensemble learning and information theory.

Ensemble Learning

A Survey of Transformers

1 code implementation8 Jun 2021 Tianyang Lin, Yuxin Wang, Xiangyang Liu, Xipeng Qiu

X-formers) have been proposed, however, a systematic and comprehensive literature review on these Transformer variants is still missing.

Backdoor Attacks on Pre-trained Models by Layerwise Weight Poisoning

no code implementations EMNLP 2021 Linyang Li, Demin Song, Xiaonan Li, Jiehang Zeng, Ruotian Ma, Xipeng Qiu

\textbf{P}re-\textbf{T}rained \textbf{M}odel\textbf{s} have been widely applied and recently proved vulnerable under backdoor attacks: the released pre-trained weights can be maliciously poisoned with certain triggers.

text-classification Text Classification

Learning to Teach with Student Feedback

no code implementations10 Sep 2021 Yitao Liu, Tianxiang Sun, Xipeng Qiu, Xuanjing Huang

This one-way interaction leads to the teacher's inability to perceive the characteristics of the student and its training progress.

Knowledge Distillation

Towards More Effective and Economic Sparsely-Activated Model

no code implementations14 Oct 2021 Hao Jiang, Ke Zhan, Jianwei Qu, Yongkang Wu, Zhaoye Fei, Xinyu Zhang, Lei Chen, Zhicheng Dou, Xipeng Qiu, Zikai Guo, Ruofei Lai, Jiawen Wu, Enrui Hu, Yinxia Zhang, Yantao Jia, Fan Yu, Zhao Cao

To increase the number of activated experts without an increase in computational cost, we propose SAM (Switch and Mixture) routing, an efficient hierarchical routing mechanism that activates multiple experts in a same device (GPU).

TURNER: The Uncertainty-based Retrieval Framework for Chinese NER

no code implementations18 Feb 2022 Zhichao Geng, Hang Yan, Zhangyue Yin, Chenxin An, Xipeng Qiu

Chinese NER is a difficult undertaking due to the ambiguity of Chinese characters and the absence of word boundaries.

General Knowledge NER +1

$\mathcal{Y}$-Tuning: An Efficient Tuning Paradigm for Large-Scale Pre-Trained Models via Label Representation Learning

no code implementations20 Feb 2022 Yitao Liu, Chenxin An, Xipeng Qiu

With the success of large-scale pre-trained models (PTMs), how efficiently adapting PTMs to downstream tasks has attracted tremendous attention, especially for PTMs with billions of parameters.

Representation Learning

Text Adversarial Purification as Defense against Adversarial Attacks

no code implementations27 Mar 2022 Linyang Li, Demin Song, Xipeng Qiu

Adversarial purification is a successful defense mechanism against adversarial attacks without requiring knowledge of the form of the incoming attack.

Adversarial Attack Adversarial Defense

A Unified Generative Framework based on Prompt Learning for Various Information Extraction Tasks

no code implementations23 Sep 2022 Zhigang Kan, Linhui Feng, Zhangyue Yin, Linbo Qiao, Xipeng Qiu, Dongsheng Li

In this paper, we propose a novel composable prompt-based generative framework, which could be applied to a wide range of tasks in the field of Information Extraction.

Relation Extraction

Discovering New Intents Using Latent Variables

no code implementations21 Oct 2022 Yunhua Zhou, Peiju Liu, Yuxin Wang, Xipeng Qiu

In this paper, starting from the intuition that discovering intents could be beneficial to the identification of the known intents, we propose a probabilistic framework for discovering intents where intent assignments are treated as latent variables.

Cannot find the paper you are looking for? You can Submit a new open access paper.