Search Results for author: Xuanjing Huang

Found 249 papers, 111 papers with code

LFKQG: A Controlled Generation Framework with Local Fine-tuning for Question Generation over Knowledge Bases

no code implementations COLING 2022 Zichu Fei, Xin Zhou, Tao Gui, Qi Zhang, Xuanjing Huang

Existing KBQG models still face two main challenges: (1) Most models often focus on the most relevant part of the answer entity, while neglecting the rest of the subgraph.

Natural Questions Question Generation +1

CQG: A Simple and Effective Controlled Generation Framework for Multi-hop Question Generation

1 code implementation ACL 2022 Zichu Fei, Qi Zhang, Tao Gui, Di Liang, Sirui Wang, Wei Wu, Xuanjing Huang

CQG employs a simple method to generate the multi-hop questions that contain key entities in multi-hop reasoning chains, which ensure the complexity and quality of the questions.

Question Generation Question-Generation

A Progressive Framework for Role-Aware Rumor Resolution

1 code implementation COLING 2022 Lei Chen, Guanying Li, Zhongyu Wei, Yang Yang, Baohua Zhou, Qi Zhang, Xuanjing Huang

Existing works on rumor resolution have shown great potential in recognizing word appearance and user participation.

PlugAT: A Plug and Play Module to Defend against Textual Adversarial Attack

no code implementations COLING 2022 Rui Zheng, Rong Bao, Qin Liu, Tao Gui, Qi Zhang, Xuanjing Huang, Rui Xie, Wei Wu

To reduce the potential side effects of using defense modules, we further propose a novel forgetting restricted adversarial training, which filters out bad adversarial examples that impair the performance of original ones.

Adversarial Attack Domain Adaptation +2

Multi-task Learning with Gradient Communication

no code implementations ICLR 2019 Pengfei Liu, Xuanjing Huang

In this paper, we describe a general framework to systematically analyze current neural models for multi-task learning, in which we find that existing models expect to disentangle features into different spaces while features learned in practice are still entangled in shared space, leaving potential hazards for other training or unseen tasks.

Inductive Bias Multi-Task Learning

Making Parameter-efficient Tuning More Efficient: A Unified Framework for Classification Tasks

1 code implementation COLING 2022 Xin Zhou, Ruotian Ma, Yicheng Zou, Xuanting Chen, Tao Gui, Qi Zhang, Xuanjing Huang, Rui Xie, Wei Wu

Specifically, we re-formulate both token and sentence classification tasks into a unified language modeling task, and map label spaces of different tasks into the same vocabulary space.

Language Modelling Sentence +2

A Structure-Aware Argument Encoder for Literature Discourse Analysis

1 code implementation COLING 2022 Yinzi Li, Wei Chen, Zhongyu Wei, Yujun Huang, Chujun Wang, Siyuan Wang, Qi Zhang, Xuanjing Huang, Libo Wu

Existing research for argument representation learning mainly treats tokens in the sentence equally and ignores the implied structure information of argumentative context.

Position Representation Learning +1

Towards Adversarially Robust Text Classifiers by Learning to Reweight Clean Examples

no code implementations Findings (ACL) 2022 Jianhan Xu, Cenyuan Zhang, Xiaoqing Zheng, Linyang Li, Cho-Jui Hsieh, Kai-Wei Chang, Xuanjing Huang

Most of the existing defense methods improve the adversarial robustness by making the models adapt to the training set augmented with some adversarial examples.

Adversarial Robustness

Domain Generalization via Causal Adjustment for Cross-Domain Sentiment Analysis

no code implementations22 Feb 2024 Siyin Wang, Jie zhou, Qin Chen, Qi Zhang, Tao Gui, Xuanjing Huang

Domain adaption has been widely adapted for cross-domain sentiment analysis to transfer knowledge from the source domain to the target domain.

Domain Generalization Sentiment Analysis

LLM-DA: Data Augmentation via Large Language Models for Few-Shot Named Entity Recognition

no code implementations22 Feb 2024 Junjie Ye, Nuo Xu, Yikun Wang, Jie zhou, Qi Zhang, Tao Gui, Xuanjing Huang

To overcome the limitations of existing data augmentation methods that compromise semantic integrity and address the uncertainty inherent in LLM-generated text, we leverage the distinctive characteristics of the NER task by augmenting the original data at both the contextual and entity levels.

Data Augmentation few-shot-ner +5

Unveiling Linguistic Regions in Large Language Models

no code implementations22 Feb 2024 Zhihao Zhang, Jun Zhao, Qi Zhang, Tao Gui, Xuanjing Huang

Furthermore, this core region exhibits significant dimensional dependency, perturbations to even a single parameter on specific dimensions leading to a loss of linguistic competence.

SoMeLVLM: A Large Vision Language Model for Social Media Processing

no code implementations20 Feb 2024 Xinnong Zhang, Haoyu Kuang, Xinyi Mou, Hanjia Lyu, Kun Wu, Siming Chen, Jiebo Luo, Xuanjing Huang, Zhongyu Wei

The powerful Large Vision Language Models make it possible to handle a variety of tasks simultaneously, but even with carefully designed prompting methods, the general domain models often fall short in aligning with the unique speaking style and context of social media tasks.

Language Modelling

Advancing Translation Preference Modeling with RLHF: A Step Towards Cost-Effective Solution

no code implementations18 Feb 2024 Nuo Xu, Jun Zhao, Can Zu, Tao Gui, Qi Zhang, Xuanjing Huang

To address this issue, we propose a cost-effective preference learning strategy, optimizing reward models by distinguishing between human and machine translations.

Machine Translation Translation

LongAgent: Scaling Language Models to 128k Context through Multi-Agent Collaboration

no code implementations18 Feb 2024 Jun Zhao, Can Zu, Hao Xu, Yi Lu, wei he, Yiwen Ding, Tao Gui, Qi Zhang, Xuanjing Huang

Large language models (LLMs) have demonstrated impressive performance in understanding language and executing complex reasoning tasks.

Multi-hop Question Answering Question Answering +1

Benchmark Self-Evolving: A Multi-Agent Framework for Dynamic LLM Evaluation

1 code implementation18 Feb 2024 Siyuan Wang, Zhuohan Long, Zhihao Fan, Zhongyu Wei, Xuanjing Huang

Towards a more scalable, robust and fine-grained evaluation, we implement six reframing operations to construct evolving instances testing LLMs against diverse queries, data noise and probing their problem-solving sub-abilities.

Model Selection

LLM can Achieve Self-Regulation via Hyperparameter Aware Generation

no code implementations17 Feb 2024 Siyin Wang, ShiMin Li, Tianxiang Sun, Jinlan Fu, Qinyuan Cheng, Jiasheng Ye, Junjie Ye, Xipeng Qiu, Xuanjing Huang

HAG extends the current paradigm in the text generation process, highlighting the feasibility of endowing the LLMs with self-regulate decoding strategies.

Text Generation

ToolSword: Unveiling Safety Issues of Large Language Models in Tool Learning Across Three Stages

1 code implementation16 Feb 2024 Junjie Ye, Sixian Li, Guanyu Li, Caishuang Huang, Songyang Gao, Yilong Wu, Qi Zhang, Tao Gui, Xuanjing Huang

Tool learning is widely acknowledged as a foundational approach or deploying large language models (LLMs) in real-world scenarios.

LongHeads: Multi-Head Attention is Secretly a Long Context Processor

no code implementations16 Feb 2024 Yi Lu, Xin Zhou, wei he, Jun Zhao, Tao Ji, Tao Gui, Qi Zhang, Xuanjing Huang

To this end, we propose a chunk selection strategy that relies on the inherent correlation between the query and the key representations, efficiently distributing context chunks to different heads.


Training Large Language Models for Reasoning through Reverse Curriculum Reinforcement Learning

1 code implementation8 Feb 2024 Zhiheng Xi, Wenxiang Chen, Boyang Hong, Senjie Jin, Rui Zheng, wei he, Yiwen Ding, Shichun Liu, Xin Guo, Junzhe Wang, Honglin Guo, Wei Shen, Xiaoran Fan, Yuhao Zhou, Shihan Dou, Xiao Wang, Xinbo Zhang, Peng Sun, Tao Gui, Qi Zhang, Xuanjing Huang

In this paper, we propose R$^3$: Learning Reasoning through Reverse Curriculum Reinforcement Learning (RL), a novel method that employs only outcome supervision to achieve the benefits of process supervision for large language models.

GSM8K reinforcement-learning +1

Are Large Language Models Good Prompt Optimizers?

no code implementations3 Feb 2024 Ruotian Ma, Xiaolei Wang, Xin Zhou, Jian Li, Nan Du, Tao Gui, Qi Zhang, Xuanjing Huang

Despite the success, the underlying mechanism of this approach remains unexplored, and the true effectiveness of LLMs as Prompt Optimizers requires further validation.


Efficient and Effective Time-Series Forecasting with Spiking Neural Networks

no code implementations2 Feb 2024 Changze Lv, Yansen Wang, Dongqi Han, Xiaoqing Zheng, Xuanjing Huang, Dongsheng Li

Spiking neural networks (SNNs), inspired by the spiking behavior of biological neurons, provide a unique pathway for capturing the intricacies of temporal data.

Model Selection Time Series +1

CauESC: A Causal Aware Model for Emotional Support Conversation

no code implementations31 Jan 2024 Wei Chen, Hengxu Lin, Qun Zhang, Xiaojin Zhang, Xiang Bai, Xuanjing Huang, Zhongyu Wei

Emotional Support Conversation aims at reducing the seeker's emotional distress through supportive response.

F-Eval: Asssessing Fundamental Abilities with Refined Evaluation Methods

1 code implementation26 Jan 2024 Yu Sun, Keyu Chen, Shujie Wang, Qipeng Guo, Hang Yan, Xipeng Qiu, Xuanjing Huang, Dahua Lin

However, these evaluation benchmarks are limited to assessing the instruction-following capabilities, overlooking the fundamental abilities that emerge during the pre-training stage.

Instruction Following

RoTBench: A Multi-Level Benchmark for Evaluating the Robustness of Large Language Models in Tool Learning

1 code implementation16 Jan 2024 Junjie Ye, Yilong Wu, Songyang Gao, Caishuang Huang, Sixian Li, Guanyu Li, Xiaoran Fan, Qi Zhang, Tao Gui, Xuanjing Huang

To bridge this gap, we introduce RoTBench, a multi-level benchmark for evaluating the robustness of LLMs in tool learning.

Open the Pandora's Box of LLMs: Jailbreaking LLMs through Representation Engineering

no code implementations12 Jan 2024 Tianlong Li, Shihan Dou, Wenhao Liu, Muling Wu, Changze Lv, Xiaoqing Zheng, Xuanjing Huang

To overcome these limitations, we propose a novel jailbreaking approach, named Jailbreaking LLMs through Representation Engineering (JRE).

Prompt Engineering

LLaMA Beyond English: An Empirical Study on Language Capability Transfer

no code implementations2 Jan 2024 Jun Zhao, Zhihao Zhang, Luhui Gao, Qi Zhang, Tao Gui, Xuanjing Huang

In recent times, substantial advancements have been witnessed in large language models (LLMs), exemplified by ChatGPT, showcasing remarkable proficiency across a range of complex tasks.

Informativeness Text Generation

Aligning Large Language Models with Human Preferences through Representation Engineering

no code implementations26 Dec 2023 Wenhao Liu, Xiaohua Wang, Muling Wu, Tianlong Li, Changze Lv, Zixuan Ling, Jianhao Zhu, Cenyuan Zhang, Xiaoqing Zheng, Xuanjing Huang

Aligning large language models (LLMs) with human preferences is crucial for enhancing their utility in terms of helpfulness, truthfulness, safety, harmlessness, and interestingness.

Argue with Me Tersely: Towards Sentence-Level Counter-Argument Generation

1 code implementation21 Dec 2023 Jiayu Lin, Rong Ye, Meng Han, Qi Zhang, Ruofei Lai, Xinyu Zhang, Zhao Cao, Xuanjing Huang, Zhongyu Wei

The results show the competitiveness of our proposed framework and evaluator in counter-argument generation tasks.


K-ESConv: Knowledge Injection for Emotional Support Dialogue Systems via Prompt Learning

no code implementations16 Dec 2023 Wei Chen, Gang Zhao, Xiaojin Zhang, Xiang Bai, Xuanjing Huang, Zhongyu Wei

Automatic psychological counseling requires mass of professional knowledge that can be found in online counseling forums.

Response Generation

A Soft Contrastive Learning-based Prompt Model for Few-shot Sentiment Analysis

no code implementations16 Dec 2023 Jingyi Zhou, Jie zhou, Jiabao Zhao, Siyin Wang, Haijun Shan, Gui Tao, Qi Zhang, Xuanjing Huang

Few-shot text classification has attracted great interest in both academia and industry due to the lack of labeled data in many fields.

Contrastive Learning Few-Shot Text Classification +5

LoRAMoE: Revolutionizing Mixture of Experts for Maintaining World Knowledge in Language Model Alignment

no code implementations15 Dec 2023 Shihan Dou, Enyu Zhou, Yan Liu, Songyang Gao, Jun Zhao, Wei Shen, Yuhao Zhou, Zhiheng Xi, Xiao Wang, Xiaoran Fan, ShiLiang Pu, Jiang Zhu, Rui Zheng, Tao Gui, Qi Zhang, Xuanjing Huang

When the models are required to align with a broader range of downstream tasks, or there is a desire to notably improve the performance on a specific task, a substantial increase in fine-tuning data often emerges as the solution.

Language Modelling Multi-Task Learning +1

Exchange-of-Thought: Enhancing Large Language Model Capabilities through Cross-Model Communication

1 code implementation4 Dec 2023 Zhangyue Yin, Qiushi Sun, Cheng Chang, Qipeng Guo, Junqi Dai, Xuanjing Huang, Xipeng Qiu

Large Language Models (LLMs) have recently made significant strides in complex reasoning tasks through the Chain-of-Thought technique.

Language Modelling Large Language Model

Making Harmful Behaviors Unlearnable for Large Language Models

no code implementations2 Nov 2023 Xin Zhou, Yi Lu, Ruotian Ma, Tao Gui, Qi Zhang, Xuanjing Huang

Specifically, we introduce ``security vectors'', a few new parameters that can be separated from the LLM, to ensure LLM's responses are consistent with the harmful behavior.

Tailoring Personality Traits in Large Language Models via Unsupervisedly-Built Personalized Lexicons

no code implementations25 Oct 2023 Tianlong Li, Shihan Dou, Changze Lv, Wenhao Liu, Jianhan Xu, Muling Wu, Zixuan Ling, Xiaoqing Zheng, Xuanjing Huang

Users can utilize UBPL to adjust the probability vectors of predicted words in the decoding phase of LLMs, thus influencing the personality expression of LLMs.

Unveiling A Core Linguistic Region in Large Language Models

no code implementations23 Oct 2023 Jun Zhao, Zhihao Zhang, Yide Ma, Qi Zhang, Tao Gui, Luhui Gao, Xuanjing Huang

We have discovered a core region in LLMs that corresponds to linguistic competence, accounting for approximately 1% of the total model parameters.

Orthogonal Subspace Learning for Language Model Continual Learning

1 code implementation22 Oct 2023 Xiao Wang, Tianze Chen, Qiming Ge, Han Xia, Rong Bao, Rui Zheng, Qi Zhang, Tao Gui, Xuanjing Huang

In this paper, we propose orthogonal low-rank adaptation (O-LoRA), a simple and efficient approach for continual learning in language models, effectively mitigating catastrophic forgetting while learning new tasks.

Continual Learning Language Modelling

Are Structural Concepts Universal in Transformer Language Models? Towards Interpretable Cross-Lingual Generalization

1 code implementation19 Oct 2023 Ningyu Xu, Qi Zhang, Jingting Ye, Menghan Zhang, Xuanjing Huang

We then propose a meta-learning-based method to learn to align conceptual spaces of different languages, which facilitates zero-shot and few-shot generalization in concept classification and also offers insights into the cross-lingual in-context learning phenomenon.

In-Context Learning Meta-Learning +1

RealBehavior: A Framework for Faithfully Characterizing Foundation Models' Human-like Behavior Mechanisms

no code implementations17 Oct 2023 Enyu Zhou, Rui Zheng, Zhiheng Xi, Songyang Gao, Xiaoran Fan, Zichu Fei, Jingting Ye, Tao Gui, Qi Zhang, Xuanjing Huang

Reports of human-like behaviors in foundation models are growing, with psychological theories providing enduring tools to investigate these behaviors.

SpikeCLIP: A Contrastive Language-Image Pretrained Spiking Neural Network

no code implementations10 Oct 2023 Tianlong Li, Wenhao Liu, Changze Lv, Jianhan Xu, Cenyuan Zhang, Muling Wu, Xiaoqing Zheng, Xuanjing Huang

Spiking neural networks (SNNs) have demonstrated the capability to achieve comparable performance to deep neural networks (DNNs) in both visual and linguistic domains while offering the advantages of improved energy efficiency and adherence to biological plausibility.

Image Classification

Loose lips sink ships: Mitigating Length Bias in Reinforcement Learning from Human Feedback

no code implementations8 Oct 2023 Wei Shen, Rui Zheng, WenYu Zhan, Jun Zhao, Shihan Dou, Tao Gui, Qi Zhang, Xuanjing Huang

Reinforcement learning from human feedback serves as a crucial bridge, aligning large language models with human and societal values.

Language Modelling

DISC-LawLLM: Fine-tuning Large Language Models for Intelligent Legal Services

2 code implementations20 Sep 2023 Shengbin Yue, Wei Chen, Siyuan Wang, Bingxuan Li, Chenchen Shen, Shujun Liu, Yuxuan Zhou, Yao Xiao, Song Yun, Xuanjing Huang, Zhongyu Wei

We propose DISC-LawLLM, an intelligent legal system utilizing large language models (LLMs) to provide a wide range of legal services.

Legal Reasoning Retrieval

DISC-MedLLM: Bridging General Large Language Models and Real-World Medical Consultation

1 code implementation28 Aug 2023 Zhijie Bao, Wei Chen, Shengze Xiao, Kuang Ren, Jiaao Wu, Cheng Zhong, Jiajie Peng, Xuanjing Huang, Zhongyu Wei

We propose DISC-MedLLM, a comprehensive solution that leverages Large Language Models (LLMs) to provide accurate and truthful medical response in end-to-end conversational healthcare services.

Knowledge Graphs

On the Universal Adversarial Perturbations for Efficient Data-free Adversarial Detection

1 code implementation27 Jun 2023 Songyang Gao, Shihan Dou, Qi Zhang, Xuanjing Huang, Jin Ma, Ying Shan

Detecting adversarial samples that are carefully crafted to fool the model is a critical step to socially-secure applications.

text-classification Text Classification

From Hypergraph Energy Functions to Hypergraph Neural Networks

1 code implementation16 Jun 2023 Yuxin Wang, Quan Gan, Xipeng Qiu, Xuanjing Huang, David Wipf

Hypergraphs are a powerful abstraction for representing higher-order interactions between entities of interest.

Bilevel Optimization Node Classification

Open Set Relation Extraction via Unknown-Aware Training

1 code implementation8 Jun 2023 Jun Zhao, Xin Zhao, WenYu Zhan, Qi Zhang, Tao Gui, Zhongyu Wei, Yunwen Chen, Xiang Gao, Xuanjing Huang

Inspired by text adversarial attacks, we adaptively apply small but critical perturbations to original training instances and thus synthesizing negative instances that are more likely to be mistaken by the model as known relations.

Relation Relation Extraction

Do Large Language Models Know What They Don't Know?

1 code implementation29 May 2023 Zhangyue Yin, Qiushi Sun, Qipeng Guo, Jiawen Wu, Xipeng Qiu, Xuanjing Huang

Large language models (LLMs) have a wealth of knowledge that allows them to excel in various Natural Language Processing (NLP) tasks.

In-Context Learning

KNSE: A Knowledge-aware Natural Language Inference Framework for Dialogue Symptom Status Recognition

no code implementations26 May 2023 Wei Chen, Shiqi Wei, Zhongyu Wei, Xuanjing Huang

Symptom diagnosis in medical conversations aims to correctly extract both symptom entities and their status from the doctor-patient dialogue.

Natural Language Inference

Query Structure Modeling for Inductive Logical Reasoning Over Knowledge Graphs

1 code implementation23 May 2023 Siyuan Wang, Zhongyu Wei, Meng Han, Zhihao Fan, Haijun Shan, Qi Zhang, Xuanjing Huang

The results demonstrate the effectiveness of our method on logical reasoning over KGs in both inductive and transductive settings.

Knowledge Graphs Logical Reasoning

Self-Polish: Enhance Reasoning in Large Language Models via Problem Refinement

1 code implementation23 May 2023 Zhiheng Xi, Senjie Jin, Yuhao Zhou, Rui Zheng, Songyang Gao, Tao Gui, Qi Zhang, Xuanjing Huang

For example, with Text-davinci-003, our method boosts the performance of standard few-shot prompting by $8. 0\%$ on GSM8K and $17. 8\%$ on MultiArith; it also improves the performance of CoT by $6. 0\%$ on GSM8K and $6. 0\%$ on MathQA, respectively.


A Confidence-based Partial Label Learning Model for Crowd-Annotated Named Entity Recognition

1 code implementation21 May 2023 Limao Xiong, Jie zhou, Qunxi Zhu, Xiao Wang, Yuanbin Wu, Qi Zhang, Tao Gui, Xuanjing Huang, Jin Ma, Ying Shan

Particularly, we propose a Confidence-based Partial Label Learning (CPLL) method to integrate the prior confidence (given by annotators) and posterior confidences (learned by models) for crowd-annotated NER.

named-entity-recognition Named Entity Recognition +2

Modeling the Q-Diversity in a Min-max Play Game for Robust Optimization

1 code implementation20 May 2023 Ting Wu, Rui Zheng, Tao Gui, Qi Zhang, Xuanjing Huang

Models trained with empirical risk minimization (ERM) are revealed to easily rely on spurious correlations, resulting in poor generalization.

Out-of-Distribution Generalization text-classification +1

CodeIE: Large Code Generation Models are Better Few-Shot Information Extractors

1 code implementation9 May 2023 Peng Li, Tianxiang Sun, Qiong Tang, Hang Yan, Yuanbin Wu, Xuanjing Huang, Xipeng Qiu

A common practice is to recast the task into a text-to-text format such that generative LLMs of natural language (NL-LLMs) like GPT-3 can be prompted to solve it.

Code Generation Few-Shot Learning +4

CausalAPM: Generalizable Literal Disentanglement for NLU Debiasing

no code implementations4 May 2023 Songyang Gao, Shihan Dou, Junjie Shan, Qi Zhang, Xuanjing Huang

Dataset bias, i. e., the over-reliance on dataset-specific literal heuristics, is getting increasing attention for its detrimental effect on the generalization ability of NLU models.

Causal Inference Disentanglement +2

A Comprehensive Capability Analysis of GPT-3 and GPT-3.5 Series Models

no code implementations18 Mar 2023 Junjie Ye, Xuanting Chen, Nuo Xu, Can Zu, Zekai Shao, Shichun Liu, Yuhan Cui, Zeyang Zhou, Chao Gong, Yang shen, Jie zhou, Siming Chen, Tao Gui, Qi Zhang, Xuanjing Huang

GPT series models, such as GPT-3, CodeX, InstructGPT, ChatGPT, and so on, have gained considerable attention due to their exceptional natural language processing capabilities.

Natural Language Understanding

How Robust is GPT-3.5 to Predecessors? A Comprehensive Study on Language Understanding Tasks

no code implementations1 Mar 2023 Xuanting Chen, Junjie Ye, Can Zu, Nuo Xu, Rui Zheng, Minlong Peng, Jie zhou, Tao Gui, Qi Zhang, Xuanjing Huang

The GPT-3. 5 models have demonstrated impressive performance in various Natural Language Processing (NLP) tasks, showcasing their strong understanding and reasoning capabilities.

Natural Language Inference Natural Language Understanding +1

Cross-Linguistic Syntactic Difference in Multilingual BERT: How Good is It and How Does It Affect Transfer?

1 code implementation21 Dec 2022 Ningyu Xu, Tao Gui, Ruotian Ma, Qi Zhang, Jingting Ye, Menghan Zhang, Xuanjing Huang

We demonstrate that the distance between the distributions of different languages is highly consistent with the syntactic difference in terms of linguistic formalisms.

Zero-Shot Cross-Lingual Transfer

Rethinking Label Smoothing on Multi-hop Question Answering

2 code implementations19 Dec 2022 Zhangyue Yin, Yuxin Wang, Xiannian Hu, Yiguang Wu, Hang Yan, Xinyu Zhang, Zhao Cao, Xuanjing Huang, Xipeng Qiu

Multi-Hop Question Answering (MHQA) is a significant area in question answering, requiring multiple reasoning components, including document retrieval, supporting sentence prediction, and answer span extraction.

Image Classification Machine Reading Comprehension +6

Efficient Adversarial Training with Robust Early-Bird Tickets

1 code implementation14 Nov 2022 Zhiheng Xi, Rui Zheng, Tao Gui, Qi Zhang, Xuanjing Huang

Adversarial training is one of the most powerful methods to improve the robustness of pre-trained language models (PLMs).

Robust Lottery Tickets for Pre-trained Language Models

2 code implementations ACL 2022 Rui Zheng, Rong Bao, Yuhao Zhou, Di Liang, Sirui Wang, Wei Wu, Tao Gui, Qi Zhang, Xuanjing Huang

Recent works on Lottery Ticket Hypothesis have shown that pre-trained language models (PLMs) contain smaller matching subnetworks(winning tickets) which are capable of reaching accuracy comparable to the original models.

Adversarial Robustness

Late Prompt Tuning: A Late Prompt Could Be Better Than Many Prompts

1 code implementation20 Oct 2022 Xiangyang Liu, Tianxiang Sun, Xuanjing Huang, Xipeng Qiu

Through extensive experimental results across various tasks and PTMs, we show that LPT can achieve competitive performance to full model tuning and other PETuning methods under both full-data and few-shot scenarios while possessing faster training speed and lower memory cost.

Kernel-Whitening: Overcome Dataset Bias with Isotropic Sentence Embedding

1 code implementation14 Oct 2022 Songyang Gao, Shihan Dou, Qi Zhang, Xuanjing Huang

Dataset bias has attracted increasing attention recently for its detrimental effect on the generalization ability of fine-tuned models.

Sentence Sentence Embedding +2

COLO: A Contrastive Learning based Re-ranking Framework for One-Stage Summarization

1 code implementation COLING 2022 Chenxin An, Ming Zhong, Zhiyong Wu, Qin Zhu, Xuanjing Huang, Xipeng Qiu

Traditional training paradigms for extractive and abstractive summarization systems always only use token-level or sentence-level training objectives.

Abstractive Text Summarization Contrastive Learning +2

Locate Then Ask: Interpretable Stepwise Reasoning for Multi-hop Question Answering

1 code implementation COLING 2022 Siyuan Wang, Zhongyu Wei, Zhihao Fan, Qi Zhang, Xuanjing Huang

In this paper, we propose an interpretable stepwise reasoning framework to incorporate both single-hop supporting sentence identification and single-hop question generation at each intermediate step, and utilize the inference of the current hop for the next until reasoning out the final result.

Multi-hop Question Answering Question Answering +3

Causal Intervention Improves Implicit Sentiment Analysis

no code implementations COLING 2022 Siyin Wang, Jie zhou, Changzhi Sun, Junjie Ye, Tao Gui, Qi Zhang, Xuanjing Huang

In this work, we propose a causal intervention model for Implicit Sentiment Analysis using Instrumental Variable (ISAIV).

Sentence Sentiment Analysis

CoNT: Contrastive Neural Text Generation

2 code implementations29 May 2022 Chenxin An, Jiangtao Feng, Kai Lv, Lingpeng Kong, Xipeng Qiu, Xuanjing Huang

We validate CoNT on five generation tasks with ten benchmarks, including machine translation, summarization, code comment generation, data-to-text generation and commonsense generation.

Code Comment Generation Comment Generation +4

What Dense Graph Do You Need for Self-Attention?

1 code implementation27 May 2022 Yuxin Wang, Chu-Tak Lee, Qipeng Guo, Zhangyue Yin, Yunhua Zhou, Xuanjing Huang, Xipeng Qiu

Transformers have made progress in miscellaneous tasks, but suffer from quadratic computational and memory complexities.


BBTv2: Towards a Gradient-Free Future with Large Language Models

1 code implementation23 May 2022 Tianxiang Sun, Zhengfu He, Hong Qian, Yunhua Zhou, Xuanjing Huang, Xipeng Qiu

By contrast, gradient-free methods only require the forward computation of the PTM to tune the prompt, retaining the benefits of efficient tuning and deployment.

Few-Shot Learning Language Modelling

A Benchmark for Automatic Medical Consultation System: Frameworks, Tasks and Datasets

1 code implementation19 Apr 2022 Wei Chen, Zhiwei Li, Hongyi Fang, Qianyuan Yao, Cheng Zhong, Jianye Hao, Qi Zhang, Xuanjing Huang, Jiajie Peng, Zhongyu Wei

In recent years, interest has arisen in using machine learning to improve the efficiency of automatic medical consultation and enhance patient experience.

Dialogue Act Classification Dialogue Understanding +4

A Simple Hash-Based Early Exiting Approach For Language Understanding and Generation

1 code implementation Findings (ACL) 2022 Tianxiang Sun, Xiangyang Liu, Wei Zhu, Zhichao Geng, Lingling Wu, Yilong He, Yuan Ni, Guotong Xie, Xuanjing Huang, Xipeng Qiu

Previous works usually adopt heuristic metrics such as the entropy of internal outputs to measure instance difficulty, which suffers from generalization and threshold-tuning.

Decorrelate Irrelevant, Purify Relevant: Overcome Textual Spurious Correlations from a Feature Perspective

2 code implementations COLING 2022 Shihan Dou, Rui Zheng, Ting Wu, Songyang Gao, Junjie Shan, Qi Zhang, Yueming Wu, Xuanjing Huang

Most of the existing debiasing methods often identify and weaken these samples with biased features (i. e., superficial surface features that cause such spurious correlations).

Fact Verification Natural Language Inference +1

Black-Box Tuning for Language-Model-as-a-Service

2 code implementations10 Jan 2022 Tianxiang Sun, Yunfan Shao, Hong Qian, Xuanjing Huang, Xipeng Qiu

In such a scenario, which we call Language-Model-as-a-Service (LMaaS), the gradients of PTMs are usually unavailable.

In-Context Learning Language Modelling

Plug-Tagger: A Pluggable Sequence Labeling Framework Using Language Models

no code implementations14 Oct 2021 Xin Zhou, Ruotian Ma, Tao Gui, Yiding Tan, Qi Zhang, Xuanjing Huang

Specifically, for each task, a label word set is first constructed by selecting a high-frequency word for each class respectively, and then, task-specific vectors are inserted into the inputs and optimized to manipulate the model predictions towards the corresponding label words.

Language Modelling Text Generation

Towards Efficient NLP: A Standard Evaluation and A Strong Baseline

1 code implementation NAACL 2022 Xiangyang Liu, Tianxiang Sun, Junliang He, Jiawen Wu, Lingling Wu, Xinyu Zhang, Hao Jiang, Zhao Cao, Xuanjing Huang, Xipeng Qiu

ELUE is dedicated to depict the Pareto Frontier for various language understanding tasks, such that it can tell whether and how much a method achieves Pareto improvement.

KNN-BERT: Fine-Tuning Pre-Trained Models with KNN Classifier

1 code implementation6 Oct 2021 Linyang Li, Demin Song, Ruotian Ma, Xipeng Qiu, Xuanjing Huang

Pre-trained models are widely used in fine-tuning downstream tasks with linear classifiers optimized by the cross-entropy loss, which might face robustness and stability problems.

Contrastive Learning text-classification +1

Template-free Prompt Tuning for Few-shot NER

1 code implementation NAACL 2022 Ruotian Ma, Xin Zhou, Tao Gui, Yiding Tan, Linyang Li, Qi Zhang, Xuanjing Huang

Prompt-based methods have been successfully applied in sentence-level few-shot learning tasks, mostly owing to the sophisticated design of templates and label words.

Few-Shot Learning Few-shot NER +1

Paradigm Shift in Natural Language Processing

1 code implementation26 Sep 2021 Tianxiang Sun, Xiangyang Liu, Xipeng Qiu, Xuanjing Huang

In this paper, we review such phenomenon of paradigm shifts in recent years, highlighting several paradigms that have the potential to solve different NLP tasks.

Chunking NER +3

Constructing Phrase-level Semantic Labels to Form Multi-Grained Supervision for Image-Text Retrieval

no code implementations12 Sep 2021 Zhihao Fan, Zhongyu Wei, Zejun Li, Siyuan Wang, Haijun Shan, Xuanjing Huang, Jianqing Fan

Existing research for image text retrieval mainly relies on sentence-level supervision to distinguish matched and mismatched sentences for a query image.

Representation Learning Retrieval +2

Learning to Teach with Student Feedback

no code implementations10 Sep 2021 Yitao Liu, Tianxiang Sun, Xipeng Qiu, Xuanjing Huang

This one-way interaction leads to the teacher's inability to perceive the characteristics of the student and its training progress.

Knowledge Distillation

Math Word Problem Solving with Explicit Numerical Values

1 code implementation ACL 2021 Qinzhuo Wu, Qi Zhang, Zhongyu Wei, Xuanjing Huang

In recent years, math word problem solving has received considerable attention and achieved promising results, but previous methods rarely take numerical values into consideration.

Math Math Word Problem Solving

Defense against Synonym Substitution-based Adversarial Attacks via Dirichlet Neighborhood Ensemble

1 code implementation ACL 2021 Yi Zhou, Xiaoqing Zheng, Cho-Jui Hsieh, Kai-Wei Chang, Xuanjing Huang

Although deep neural networks have achieved prominent performance on many NLP tasks, they are vulnerable to adversarial examples.


SENT: Sentence-level Distant Relation Extraction via Negative Training

1 code implementation ACL 2021 Ruotian Ma, Tao Gui, Linyang Li, Qi Zhang, Yaqian Zhou, Xuanjing Huang

In this work, we propose the use of negative training (NT), in which a model is trained using complementary labels regarding that ``the instance does not belong to these complementary labels".

Relation Relation Extraction +1

TCIC: Theme Concepts Learning Cross Language and Vision for Image Captioning

no code implementations21 Jun 2021 Zhihao Fan, Zhongyu Wei, Siyuan Wang, Ruize Wang, Zejun Li, Haijun Shan, Xuanjing Huang

Considering that theme concepts can be learned from both images and captions, we propose two settings for their representations learning based on TTN.

Image Captioning Representation Learning

SpanNER: Named Entity Re-/Recognition as Span Prediction

1 code implementation ACL 2021 Jinlan Fu, Xuanjing Huang, PengFei Liu

Recent years have seen the paradigm shift of Named Entity Recognition (NER) systems from sequence labeling to span prediction.

named-entity-recognition Named Entity Recognition +1

Exploration and Exploitation: Two Ways to Improve Chinese Spelling Correction Models

1 code implementation ACL 2021 Chong Li, Cenyuan Zhang, Xiaoqing Zheng, Xuanjing Huang

A sequence-to-sequence learning with neural networks has empirically proven to be an effective framework for Chinese Spelling Correction (CSC), which takes a sentence with some spelling errors as input and outputs the corrected one.

Sentence Spelling Correction +1

Early Exiting with Ensemble Internal Classifiers

no code implementations28 May 2021 Tianxiang Sun, Yunhua Zhou, Xiangyang Liu, Xinyu Zhang, Hao Jiang, Zhao Cao, Xuanjing Huang, Xipeng Qiu

In this paper, we show that a novel objective function for the training of the ensemble internal classifiers can be naturally induced from the perspective of ensemble learning and information theory.

Ensemble Learning

Accelerating BERT Inference for Sequence Labeling via Early-Exit

1 code implementation ACL 2021 Xiaonan Li, Yunfan Shao, Tianxiang Sun, Hang Yan, Xipeng Qiu, Xuanjing Huang

To alleviate this problem, we extend the recent successful early-exit mechanism to accelerate the inference of PTMs for sequence labeling tasks.


Certified Robustness to Text Adversarial Attacks by Randomized [MASK]

1 code implementation8 May 2021 Jiehang Zeng, Xiaoqing Zheng, Jianhan Xu, Linyang Li, Liping Yuan, Xuanjing Huang

Recently, few certified defense methods have been developed to provably guarantee the robustness of a text classifier to adversarial synonym substitutions.

Who Responded to Whom: The Joint Effects of Latent Topics and Discourse in Conversation Structure

no code implementations17 Apr 2021 Lu Ji, Jing Li, Zhongyu Wei, Qi Zhang, Xuanjing Huang

Numerous online conversations are produced on a daily basis, resulting in a pressing need to conversation understanding.

Larger-Context Tagging: When and Why Does It Work?

no code implementations NAACL 2021 Jinlan Fu, Liangjing Feng, Qi Zhang, Xuanjing Huang, PengFei Liu

The development of neural networks and pretraining techniques has spawned many sentence-level tagging systems that achieved superior performance on typical benchmarks.

Attribute Sentence

Enhancing Scientific Papers Summarization with Citation Graph

1 code implementation7 Apr 2021 Chenxin An, Ming Zhong, Yiran Chen, Danqing Wang, Xipeng Qiu, Xuanjing Huang

Previous work for text summarization in scientific domain mainly focused on the content of the input document, but seldom considering its citation network.

Text Summarization

Alleviate Exposure Bias in Sequence Prediction \\ with Recurrent Neural Networks

no code implementations22 Mar 2021 Liping Yuan, Jiangtao Feng, Xiaoqing Zheng, Xuanjing Huang

The key idea is that at each time step, the network takes as input a ``bundle'' of similar words predicted at the previous step instead of a single ground truth.

Generating Adversarial Examples in Chinese Texts Using Sentence-Pieces

no code implementations29 Dec 2020 Linyang Li, Yunfan Shao, Demin Song, Xipeng Qiu, Xuanjing Huang

The substitutions in the generated adversarial examples are not characters or words but \textit{'pieces'}, which are more natural to Chinese readers.

Language Modelling Sentence

Uncertainty-Aware Label Refinement for Sequence Labeling

1 code implementation EMNLP 2020 Tao Gui, Jiacheng Ye, Qi Zhang, Zhengyan Li, Zichu Fei, Yeyun Gong, Xuanjing Huang

Conditional random fields (CRF) for label decoding has become ubiquitous in sequence labeling tasks.

Topic-Oriented Spoken Dialogue Summarization for Customer Service with Saliency-Aware Topic Modeling

1 code implementation14 Dec 2020 Yicheng Zou, Lujun Zhao, Yangyang Kang, Jun Lin, Minlong Peng, Zhuoren Jiang, Changlong Sun, Qi Zhang, Xuanjing Huang, Xiaozhong Liu

In a customer service system, dialogue summarization can boost service efficiency by automatically creating summaries for long spoken dialogues in which customers and agents try to address issues about specific topics.

SenSeNet: Neural Keyphrase Generation with Document Structure

no code implementations12 Dec 2020 Yichao Luo, Zhengyan Li, Bingning Wang, Xiaoyu Xing, Qi Zhang, Xuanjing Huang

Keyphrase Generation (KG) is the task of generating central topics from a given document or literary work, which captures the crucial information necessary to understand the content.

Inductive Bias Keyphrase Generation +1

Modeling Evolution of Message Interaction for Rumor Resolution

no code implementations COLING 2020 Lei Chen, Zhongyu Wei, Jing Li, Baohua Zhou, Qi Zhang, Xuanjing Huang

Previous work for rumor resolution concentrates on exploiting time-series characteristics or modeling topology structure separately.

Time Series Time Series Analysis

Text Information Aggregation with Centrality Attention

no code implementations16 Nov 2020 Jingjing Gong, Hang Yan, Yining Zheng, Xipeng Qiu, Xuanjing Huang

A lot of natural language processing problems need to encode the text sequence as a fix-length vector, which usually involves aggregation process of combining the representations of all the words, such as pooling or self-attention.

Sentence text-classification +1

RethinkCWS: Is Chinese Word Segmentation a Solved Task?

1 code implementation EMNLP 2020 Jinlan Fu, PengFei Liu, Qi Zhang, Xuanjing Huang

The performance of the Chinese Word Segmentation (CWS) systems has gradually reached a plateau with the rapid development of deep neural networks, especially the successful use of large pre-trained models.

Chinese Word Segmentation

Cross-Lingual Dependency Parsing by POS-Guided Word Reordering

no code implementations Findings of the Association for Computational Linguistics 2020 Lu Liu, Yi Zhou, Jianhan Xu, Xiaoqing Zheng, Kai-Wei Chang, Xuanjing Huang

The words in each sentence of a source language corpus are rearranged to meet the word order in a target language under the guidance of a part-of-speech based language model (LM).

Dependency Parsing Language Modelling +2

CDEvalSumm: An Empirical Study of Cross-Dataset Evaluation for Neural Summarization Systems

2 code implementations Findings of the Association for Computational Linguistics 2020 Yiran Chen, PengFei Liu, Ming Zhong, Zi-Yi Dou, Danqing Wang, Xipeng Qiu, Xuanjing Huang

In this paper, we perform an in-depth analysis of characteristics of different datasets and investigate the performance of different summarization models under a cross-dataset setting, in which a summarizer trained on one corpus will be evaluated on a range of out-of-domain corpora.

Text Summarization

CoLAKE: Contextualized Language and Knowledge Embedding

1 code implementation COLING 2020 Tianxiang Sun, Yunfan Shao, Xipeng Qiu, Qipeng Guo, Yaru Hu, Xuanjing Huang, Zheng Zhang

With the emerging branch of incorporating factual knowledge into pre-trained language models such as BERT, most existing models consider shallow, static, and separately pre-trained entity embeddings, which limits the performance gains of these models.

Entity Embeddings Knowledge Graph Completion +1

fastHan: A BERT-based Multi-Task Toolkit for Chinese NLP

1 code implementation ACL 2021 Zhichao Geng, Hang Yan, Xipeng Qiu, Xuanjing Huang

The joint-model is trained and evaluated on 13 corpora of four tasks, yielding near state-of-the-art (SOTA) performance in dependency parsing and NER, achieving SOTA performance in CWS and POS.

Chinese Word Segmentation Dependency Parsing +6

Defense against Adversarial Attacks in NLP via Dirichlet Neighborhood Ensemble

1 code implementation20 Jun 2020 Yi Zhou, Xiaoqing Zheng, Cho-Jui Hsieh, Kai-Wei Chang, Xuanjing Huang

Despite neural networks have achieved prominent performance on many natural language processing (NLP) tasks, they are vulnerable to adversarial examples.


Heterogeneous Graph Neural Networks for Extractive Document Summarization

1 code implementation ACL 2020 Danqing Wang, PengFei Liu, Yining Zheng, Xipeng Qiu, Xuanjing Huang

An intuitive way is to put them in the graph-based neural network, which has a more complex structure for capturing inter-sentence relationships.

Document Summarization Extractive Document Summarization +3

FLAT: Chinese NER Using Flat-Lattice Transformer

1 code implementation ACL 2020 Xiaonan Li, Hang Yan, Xipeng Qiu, Xuanjing Huang

Recently, the character-word lattice structure has been proved to be effective for Chinese named entity recognition (NER) by incorporating the word information.

Chinese Named Entity Recognition named-entity-recognition +3