no code implementations • EMNLP 2020 • Kexiang Wang, Baobao Chang, Zhifang Sui
Multi-document summarization (MDS) aims to produce a high-quality summary of several related documents.
no code implementations • 28 Mar 2024 • Jingyuan Ma, Damai Dai, Lei Sha, Zhifang Sui
Large language models (LLMs) demonstrate substantial capabilities in solving math problems.
1 code implementation • 29 Feb 2024 • Fangwei Zhu, Peiyi Wang, Zhifang Sui
Entity abstract summarization aims to generate a coherent description of a given entity based on a set of relevant Internet documents.
1 code implementation • 26 Feb 2024 • Zhexin Zhang, Yida Lu, Jingyuan Ma, Di Zhang, Rui Li, Pei Ke, Hao Sun, Lei Sha, Zhifang Sui, Hongning Wang, Minlie Huang
The safety of Large Language Models (LLMs) has gained increasing attention in recent years, but a comprehensive approach for detecting safety issues in LLMs' responses in an aligned, customizable, and explainable manner is still lacking.
no code implementations • 25 Feb 2024 • Xiangdi Meng, Damai Dai, Weiyao Luo, Zhe Yang, Shaoxiang Wu, Xiaochen Wang, Peiyi Wang, Qingxiu Dong, Liang Chen, Zhifang Sui
Although LoRA fine-tuning is effective, there is still a performance gap compared to full fine-tuning, since its weight update is limited to low-rank matrices.
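To make the low-rank constraint concrete, below is a minimal PyTorch sketch of a LoRA-style layer: the pretrained weight stays frozen and only two small factor matrices are trained, so the weight update delta_W = B @ A can never exceed the chosen rank. Names and sizes are illustrative, not taken from the paper.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen pretrained linear layer plus a trainable low-rank update."""
    def __init__(self, in_features, out_features, rank=8, alpha=16.0):
        super().__init__()
        self.base = nn.Linear(in_features, out_features)
        for p in self.base.parameters():  # the pretrained weights stay fixed
            p.requires_grad_(False)
        # Trainable factors: delta_W = B @ A has rank at most `rank`
        self.A = nn.Parameter(torch.randn(rank, in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(out_features, rank))  # zero initial update
        self.scaling = alpha / rank

    def forward(self, x):
        return self.base(x) + (x @ self.A.T @ self.B.T) * self.scaling
```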
no code implementations • 20 Feb 2024 • Haoran Li, Qingxiu Dong, Zhengyang Tang, Chaojun Wang, Xingxing Zhang, Haoyang Huang, Shaohan Huang, Xiaolong Huang, Zeqiang Huang, Dongdong Zhang, Yuxian Gu, Xin Cheng, Xun Wang, Si-Qing Chen, Li Dong, Wei Lu, Zhifang Sui, Benyou Wang, Wai Lam, Furu Wei
We introduce Generalized Instruction Tuning (called GLAN), a general and scalable method for instruction tuning of Large Language Models (LLMs).
no code implementations • 17 Feb 2024 • Yixin Yang, Zheng Li, Qingxiu Dong, Heming Xia, Zhifang Sui
Understanding the deep semantics of images is essential in the era dominated by social media.
1 code implementation • 15 Jan 2024 • Heming Xia, Zhe Yang, Qingxiu Dong, Peiyi Wang, Yongqi Li, Tao Ge, Tianyu Liu, Wenjie Li, Zhifang Sui
To mitigate the high inference latency stemming from autoregressive decoding in Large Language Models (LLMs), Speculative Decoding has emerged as a novel decoding paradigm for LLM inference.
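For readers new to the paradigm, here is a schematic draft-then-verify loop under greedy decoding; `draft_step` and `target_step` are assumed callables standing in for the small draft model and the large target model, and the acceptance rule is the simplest possible one rather than any specific paper's.

```python
def speculative_decode(target_step, draft_step, prefix, k=4, max_len=64):
    """Greedy draft-then-verify decoding (schematic).

    draft_step(tokens)  -> one greedy next-token id (cheap draft model);
    target_step(tokens) -> greedy next-token id for EVERY prefix position,
                           computed in a single parallel forward pass.
    """
    tokens = list(prefix)
    while len(tokens) < max_len:
        # 1) The draft model proposes k tokens autoregressively.
        draft = []
        for _ in range(k):
            draft.append(draft_step(tokens + draft))
        # 2) The target model checks prefix + draft in one parallel pass.
        verified = target_step(tokens + draft)
        # 3) Keep the longest draft prefix the target agrees with.
        n_accept = 0
        for i, tok in enumerate(draft):
            if verified[len(tokens) + i - 1] == tok:
                n_accept += 1
            else:
                break
        tokens += draft[:n_accept]
        # The target's own pass always yields at least one extra token.
        tokens.append(verified[len(tokens) - 1])
    return tokens
```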
1 code implementation • 11 Jan 2024 • Damai Dai, Chengqi Deng, Chenggang Zhao, R. X. Xu, Huazuo Gao, Deli Chen, Jiashi Li, Wangding Zeng, Xingkai Yu, Y. Wu, Zhenda Xie, Y. K. Li, Panpan Huang, Fuli Luo, Chong Ruan, Zhifang Sui, Wenfeng Liang
Subsequently, we scale up DeepSeekMoE to 16B parameters and show that it achieves comparable performance with LLaMA2 7B, with only about 40% of computations.
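The sketch below illustrates the general shape of such a layer, with a few always-on shared experts next to many small routed experts; the structure is in the spirit of fine-grained expert segmentation, but every size, name, and the naive routing loop are illustrative assumptions, not DeepSeekMoE's implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SimpleMoELayer(nn.Module):
    """Top-k routed experts plus always-active shared experts (illustrative)."""
    def __init__(self, d_model=256, d_ff=512, n_routed=16, n_shared=2, top_k=4):
        super().__init__()
        def make_expert():
            return nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(),
                                 nn.Linear(d_ff, d_model))
        self.routed = nn.ModuleList([make_expert() for _ in range(n_routed)])
        self.shared = nn.ModuleList([make_expert() for _ in range(n_shared)])
        self.router = nn.Linear(d_model, n_routed)
        self.top_k = top_k

    def forward(self, x):                       # x: (n_tokens, d_model)
        out = sum(e(x) for e in self.shared)    # shared experts see every token
        gates = F.softmax(self.router(x), dim=-1)
        top_w, top_i = gates.topk(self.top_k, dim=-1)
        y = torch.zeros_like(out)
        # Naive dispatch loop; a real kernel would batch this efficiently.
        for slot in range(self.top_k):
            for e_idx, expert in enumerate(self.routed):
                mask = top_i[:, slot] == e_idx
                if mask.any():
                    y[mask] += top_w[mask, slot, None] * expert(x[mask])
        return out + y
```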
no code implementations • 8 Jan 2024 • Fangwei Zhu, Damai Dai, Zhifang Sui
Large language models (LLMs) have exhibited impressive competence in various tasks, but their opaque internal mechanisms hinder their use in mathematical problems.
1 code implementation • 14 Dec 2023 • Peiyi Wang, Lei LI, Zhihong Shao, R. X. Xu, Damai Dai, Yifei Li, Deli Chen, Y. Wu, Zhifang Sui
In this paper, we present an innovative process-oriented math process reward model called Math-Shepherd, which assigns a reward score to each step of a math problem's solution.
Ranked #13 on Arithmetic Reasoning on GSM8K (using extra training data)
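As a rough picture of how such per-step rewards get used, the sketch below reranks candidate solutions by an aggregate of their step scores; `prm` is a hypothetical scorer, and aggregating by the minimum (one bad step sinks a solution) is our assumption rather than the paper's exact recipe.

```python
from dataclasses import dataclass

@dataclass
class Solution:
    steps: list   # the chain-of-thought split into individual steps
    answer: str

def solution_score(step_scores):
    """One bad step breaks a derivation, so score a solution by its worst step."""
    return min(step_scores)

def rerank(solutions, prm):
    """prm(steps) -> list of per-step reward scores in [0, 1] (hypothetical)."""
    return max(solutions, key=lambda s: solution_score(prm(s.steps)))
```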
1 code implementation • 13 Oct 2023 • Bofei Gao, Liang Chen, Peiyi Wang, Zhifang Sui, Baobao Chang
Abstract Meaning Representation (AMR) parsing aims to extract an abstract semantic graph from a given sentence.
1 code implementation • 12 Oct 2023 • Zhe Yang, Damai Dai, Peiyi Wang, Zhifang Sui
To assess the quality of weights in the absence of additional validation data, we design a masked self-prediction (MSP) score that exhibits a strong correlation with the final ICL performance.
1 code implementation • 10 Oct 2023 • YiFan Song, Peiyi Wang, Weimin Xiong, Dawei Zhu, Tianyu Liu, Zhifang Sui, Sujian Li
Continual learning (CL) aims to constantly learn new knowledge over time while avoiding catastrophic forgetting on old tasks.
1 code implementation • 11 Sep 2023 • Qingxiu Dong, Li Dong, Ke Xu, Guangyan Zhou, Yaru Hao, Zhifang Sui, Furu Wei
In this work, we use large language models (LLMs) to augment and accelerate research on the P versus NP problem, one of the most important open problems in theoretical computer science and mathematics.
no code implementations • 5 Sep 2023 • Peiyi Wang, Lei LI, Liang Chen, Feifan Song, Binghuai Lin, Yunbo Cao, Tianyu Liu, Zhifang Sui
To address this problem, we introduce an Alignment Fine-Tuning (AFT) paradigm, which involves three steps: 1) fine-tuning LLMs with COT training data; 2) generating multiple COT responses for each question and categorizing them into positive and negative ones based on whether they achieve the correct answer; 3) calibrating the scores of positive and negative responses given by LLMs with a novel constraint alignment loss.
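Step 3 can be pictured as a ranking-style objective; the sketch below is a minimal pairwise hinge that pushes positive-response scores above negative ones by a margin, offered only as a hedged stand-in for the paper's constraint alignment loss.

```python
import torch
import torch.nn.functional as F

def alignment_ranking_loss(pos_scores, neg_scores, margin=1.0):
    """pos_scores: (P,) scores (e.g., sequence log-likelihoods) of correct
    COT responses; neg_scores: (N,) scores of incorrect ones. Every positive
    should outscore every negative by at least `margin`."""
    diff = neg_scores[None, :] - pos_scores[:, None] + margin  # (P, N)
    return F.relu(diff).mean()
```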
1 code implementation • 29 May 2023 • Peiyi Wang, Lei LI, Liang Chen, Zefan Cai, Dawei Zhu, Binghuai Lin, Yunbo Cao, Qi Liu, Tianyu Liu, Zhifang Sui
In this paper, we uncover a systematic bias in the evaluation paradigm of adopting large language models (LLMs), e.g., GPT-4, as a referee to score and compare the quality of responses generated by candidate models.
1 code implementation • 25 May 2023 • Fangwei Zhu, Jifan Yu, Hailong Jin, Juanzi Li, Lei Hou, Zhifang Sui
We conduct a series of experiments with the widely used bi-encoder and cross-encoder entity linking models; the results show that both types of NIL mentions in training data have a significant influence on the accuracy of NIL prediction.
1 code implementation • 24 May 2023 • Shaoxiang Wu, Damai Dai, Ziwei Qin, Tianyu Liu, Binghuai Lin, Yunbo Cao, Zhifang Sui
However, unlike other image-text multimodal tasks, video has longer multimodal sequences with more redundancy and noise in both visual and audio modalities.
1 code implementation • 24 May 2023 • Heming Xia, Qingxiu Dong, Lei LI, Jingjing Xu, Tianyu Liu, Ziwei Qin, Zhifang Sui
Recently, Large Language Models (LLMs) have been serving as general-purpose interfaces, posing a significant demand for comprehensive visual knowledge.
no code implementations • 24 May 2023 • Shoujie Tong, Heming Xia, Damai Dai, Runxin Xu, Tianyu Liu, Binghuai Lin, Yunbo Cao, Zhifang Sui
Also, Bi-Drop needs only one mini-batch to estimate the sub-net, so it makes more efficient use of the training data.
no code implementations • 12 May 2023 • YiFan Song, Peiyi Wang, Dawei Zhu, Tianyu Liu, Zhifang Sui, Sujian Li
Continual learning (CL) aims to constantly learn new knowledge over time while avoiding catastrophic forgetting on old tasks.
1 code implementation • 8 May 2023 • Heming Xia, Peiyi Wang, Tianyu Liu, Binghuai Lin, Yunbo Cao, Zhifang Sui
In this work, we point out that two typical biases exist after training with this vanilla strategy: classifier bias and representation bias, which cause the knowledge the model previously learned to be overshadowed.
1 code implementation • 31 Dec 2022 • Qingxiu Dong, Damai Dai, Ce Zheng, Zhiyong Wu, Baobao Chang, Xu Sun, Jingjing Xu, Lei LI, Zhifang Sui
With the increasing ability of large language models (LLMs), in-context learning (ICL) has become a new paradigm for natural language processing (NLP), where LLMs make predictions based only on contexts augmented with a few examples.
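Mechanically, ICL amounts to concatenating a handful of demonstrations in front of the test input; a minimal prompt builder, with an arbitrary template of our choosing, looks like this:

```python
def build_icl_prompt(demos, query, template="Input: {x}\nOutput: {y}"):
    """Format k demonstrations plus the test input into one context string."""
    blocks = [template.format(x=x, y=y) for x, y in demos]
    blocks.append(template.format(x=query, y="").rstrip())  # leave answer blank
    return "\n\n".join(blocks)

prompt = build_icl_prompt(
    demos=[("great movie!", "positive"), ("boring plot.", "negative")],
    query="what a waste of time.")
```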
1 code implementation • 20 Dec 2022 • Damai Dai, Yutao Sun, Li Dong, Yaru Hao, Shuming Ma, Zhifang Sui, Furu Wei
We comprehensively compare the behaviors of in-context learning and explicit finetuning on real tasks to provide empirical evidence that supports our understanding.
no code implementations • 19 Dec 2022 • Chengwen Wang, Qingxiu Dong, Xiaochen Wang, Haitao Wang, Zhifang Sui
Taking the Named Entity Recognition (NER) datasets as a case study, we introduce nine statistical metrics for a statistical dataset evaluation framework.
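To give a flavor of what such metrics look like, here are two simple statistics over BIO-tagged data, entity density and label-distribution entropy; these are generic examples of the metric family, not necessarily two of the paper's nine.

```python
import math
from collections import Counter

def entity_density(tags):
    """Fraction of tokens that fall inside an entity span (BIO tags assumed)."""
    return sum(t != "O" for t in tags) / len(tags)

def label_entropy(tags):
    """Shannon entropy of the entity-type distribution."""
    counts = Counter(t.split("-", 1)[1] for t in tags if t != "O")
    total = sum(counts.values())
    return -sum(c / total * math.log2(c / total) for c in counts.values())

print(entity_density(["B-PER", "I-PER", "O", "B-LOC", "O"]))  # 0.6
```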
no code implementations • 14 Dec 2022 • Xin Zheng, Tianyu Liu, Haoran Meng, Xu Wang, Yufan Jiang, Mengliang Rao, Binghuai Lin, Zhifang Sui, Yunbo Cao
Harvesting question-answer (QA) pairs from customer service chat logs in the wild is an efficient way to enrich the knowledge base for customer service chatbots in cold-start or continuous-integration scenarios.
1 code implementation • 20 Oct 2022 • Haoran Meng, Zheng Xin, Tianyu Liu, Zizhen Wang, He Feng, Binghuai Lin, Xuemin Zhao, Yunbo Cao, Zhifang Sui
While interacting with chatbots, users may elicit multiple intents in a single dialogue utterance.
1 code implementation • 10 Oct 2022 • Peiyi Wang, YiFan Song, Tianyu Liu, Binghuai Lin, Yunbo Cao, Sujian Li, Zhifang Sui
In this paper, through empirical studies, we argue that this assumption may not hold: an important reason for catastrophic forgetting is that the learned representations are not robust to the appearance of analogous relations in the subsequent learning process.
1 code implementation • 7 Oct 2022 • Qingxiu Dong, Damai Dai, YiFan Song, Jingjing Xu, Zhifang Sui, Lei LI
However, we find that facts stored in the PLMs are not always correct.
no code implementations • 1 Sep 2022 • Peiyi Wang, YiFan Song, Tianyu Liu, Rundong Gao, Binghuai Lin, Yunbo Cao, Zhifang Sui
2) Balanced Tuning (BT) finetunes the model on the balanced memory data.
no code implementations • 31 Jul 2022 • Damai Dai, Wenbin Jiang, Qingxiu Dong, Yajuan Lyu, Qiaoqiao She, Zhifang Sui
The ability of pretrained Transformers to remember factual knowledge is essential but still limited for existing models.
1 code implementation • 2 May 2022 • Shoujie Tong, Qingxiu Dong, Damai Dai, YiFan Song, Tianyu Liu, Baobao Chang, Zhifang Sui
For each instance in a batch, we let the other instances in the same batch interact with it.
1 code implementation • NAACL 2022 • Runxin Xu, Peiyi Wang, Tianyu Liu, Shuang Zeng, Baobao Chang, Zhifang Sui
In this paper, we focus on extracting event arguments from an entire document, which mainly faces two critical problems: a) the long-distance dependency between trigger and arguments over sentences; b) the distracting context towards an event in the document.
Tasks: Document-level Event Extraction, Event Argument Extraction, +2 more
1 code implementation • 28 Apr 2022 • Zihan Wang, Peiyi Wang, Tianyu Liu, Binghuai Lin, Yunbo Cao, Zhifang Sui, Houfeng Wang
However, in this paradigm, there exists a huge gap between classification tasks with a sophisticated label hierarchy and the masked language model (MLM) pretraining tasks of PLMs, and thus the potential of PLMs cannot be fully tapped.
2 code implementations • Findings (NAACL) 2022 • Liang Chen, Peiyi Wang, Runxin Xu, Tianyu Liu, Zhifang Sui, Baobao Chang
As Abstract Meaning Representation (AMR) implicitly involves compound semantic annotations, we hypothesize auxiliary tasks which are semantically or formally related can better enhance AMR parsing.
Ranked #7 on AMR Parsing on LDC2020T02 (using extra training data)
1 code implementation • ACL 2022 • Damai Dai, Li Dong, Shuming Ma, Bo Zheng, Zhifang Sui, Baobao Chang, Furu Wei
We point out that existing learning-to-route MoE methods suffer from the routing fluctuation issue, i.e., the target expert of the same input may change along with training, but only one expert will be activated for the input during inference.
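The fix suggested by this observation is a two-stage recipe: first learn a routing strategy, then hold it fixed. A minimal sketch of the freezing step is below; `moe_layer.router` is an assumed attribute name, not the paper's code.

```python
def freeze_routing(moe_layer):
    """Stage 2: stop updating the router so each input's target expert no
    longer fluctuates for the remainder of training."""
    for p in moe_layer.router.parameters():
        p.requires_grad_(False)
    moe_layer.router.eval()  # also disable any routing-side stochasticity
```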
no code implementations • 15 Apr 2022 • Damai Dai, Wenbin Jiang, Jiyuan Zhang, Weihua Peng, Yajuan Lyu, Zhifang Sui, Baobao Chang, Yong Zhu
In this paper, in order to alleviate the parameter competition problem, we propose a Mixture-of-Experts (MoE) based question answering method called MoEBQA that decouples the computation for different types of questions by sparse routing.
2 code implementations • 30 Mar 2022 • Heming Xia, Tao Ge, Peiyi Wang, Si-Qing Chen, Furu Wei, Zhifang Sui
We propose Speculative Decoding (SpecDec) to formally study, for the first time, the idea of exploiting speculative execution to accelerate autoregressive (AR) decoding.
no code implementations • 26 Mar 2022 • Sha Yuan, Hanyu Zhao, Shuai Zhao, Jiahong Leng, Yangxiao Liang, Xiaozhi Wang, Jifan Yu, Xin Lv, Zhou Shao, Jiaao He, Yankai Lin, Xu Han, Zhenghao Liu, Ning Ding, Yongming Rao, Yizhao Gao, Liang Zhang, Ming Ding, Cong Fang, Yisen Wang, Mingsheng Long, Jing Zhang, Yinpeng Dong, Tianyu Pang, Peng Cui, Lingxiao Huang, Zheng Liang, HuaWei Shen, HUI ZHANG, Quanshi Zhang, Qingxiu Dong, Zhixing Tan, Mingxuan Wang, Shuo Wang, Long Zhou, Haoran Li, Junwei Bao, Yingwei Pan, Weinan Zhang, Zhou Yu, Rui Yan, Chence Shi, Minghao Xu, Zuobai Zhang, Guoqiang Wang, Xiang Pan, Mengjie Li, Xiaoyu Chu, Zijun Yao, Fangwei Zhu, Shulin Cao, Weicheng Xue, Zixuan Ma, Zhengyan Zhang, Shengding Hu, Yujia Qin, Chaojun Xiao, Zheni Zeng, Ganqu Cui, Weize Chen, Weilin Zhao, Yuan YAO, Peng Li, Wenzhao Zheng, Wenliang Zhao, Ziyi Wang, Borui Zhang, Nanyi Fei, Anwen Hu, Zenan Ling, Haoyang Li, Boxi Cao, Xianpei Han, Weidong Zhan, Baobao Chang, Hao Sun, Jiawen Deng, Chujie Zheng, Juanzi Li, Lei Hou, Xigang Cao, Jidong Zhai, Zhiyuan Liu, Maosong Sun, Jiwen Lu, Zhiwu Lu, Qin Jin, Ruihua Song, Ji-Rong Wen, Zhouchen Lin, LiWei Wang, Hang Su, Jun Zhu, Zhifang Sui, Jiajun Zhang, Yang Liu, Xiaodong He, Minlie Huang, Jian Tang, Jie Tang
With the rapid development of deep learning, training Big Models (BMs) for multiple downstream tasks becomes a popular paradigm.
no code implementations • 27 Dec 2021 • Yuan YAO, Qingxiu Dong, Jian Guan, Boxi Cao, Zhengyan Zhang, Chaojun Xiao, Xiaozhi Wang, Fanchao Qi, Junwei Bao, Jinran Nie, Zheni Zeng, Yuxian Gu, Kun Zhou, Xuancheng Huang, Wenhao Li, Shuhuai Ren, Jinliang Lu, Chengqiang Xu, Huadong Wang, Guoyang Zeng, Zile Zhou, Jiajun Zhang, Juanzi Li, Minlie Huang, Rui Yan, Xiaodong He, Xiaojun Wan, Xin Zhao, Xu sun, Yang Liu, Zhiyuan Liu, Xianpei Han, Erhong Yang, Zhifang Sui, Maosong Sun
We argue that for general-purpose language intelligence evaluation, the benchmark itself needs to be comprehensive and systematic.
1 code implementation • ACL 2022 • Peiyi Wang, Liang Chen, Tianyu Liu, Damai Dai, Yunbo Cao, Baobao Chang, Zhifang Sui
Abstract Meaning Representation (AMR) parsing aims to translate sentences to semantic representation with a hierarchical structure, and is recently empowered by pretrained sequence-to-sequence models.
1 code implementation • NAACL 2022 • Peiyi Wang, Runxin Xu, Tianyu Liu, Qingyu Zhou, Yunbo Cao, Baobao Chang, Zhifang Sui
Few-Shot Sequence Labeling (FSSL) is a canonical paradigm for tagging models, e.g., named entity recognition and slot filling, to generalize to an emerging, resource-scarce domain.
Ranked #6 on Few-shot NER on Few-NERD (INTER)
1 code implementation • 29 Aug 2021 • Peiyi Wang, Runxin Xu, Tianyu Liu, Damai Dai, Baobao Chang, Zhifang Sui
However, we find they suffer from trigger biases that signify the statistical homogeneity between some trigger words and target event types, which we summarize as trigger overlapping and trigger separability.
no code implementations • 21 Jun 2021 • Peiyi Wang, Tianyu Liu, Damai Dai, Runxin Xu, Baobao Chang, Zhifang Sui
The table encoder extracts sentiment at the token-pair level, so that the compositional features between targets and opinions can be easily captured.
2 code implementations • ACL 2022 • Ningyu Zhang, Mosha Chen, Zhen Bi, Xiaozhuan Liang, Lei LI, Xin Shang, Kangping Yin, Chuanqi Tan, Jian Xu, Fei Huang, Luo Si, Yuan Ni, Guotong Xie, Zhifang Sui, Baobao Chang, Hui Zong, Zheng Yuan, Linfeng Li, Jun Yan, Hongying Zan, Kunli Zhang, Buzhou Tang, Qingcai Chen
Artificial Intelligence (AI), along with the recent progress in biomedical language understanding, is gradually changing medical practice.
Ranked #1 on Semantic Similarity on CHIP-STS
no code implementations • NAACL 2021 • Hua Zheng, Damai Dai, Lei LI, Tianyu Liu, Zhifang Sui, Baobao Chang, Yang Liu
In this paper, we tackle the task of Definition Generation (DG) in Chinese, which aims at automatically generating a definition for a word.
no code implementations • 20 Apr 2021 • Qingxiu Dong, Zhifang Sui, Weidong Zhan, Baobao Chang
Starting from the concept, composition, development, and meaning of natural language evaluation, this article classifies and summarizes the tasks and characteristics of mainstream natural language evaluation, and then discusses the problems in natural language processing evaluation and their causes.
2 code implementations • ACL 2022 • Tianyu Liu, Yizhe Zhang, Chris Brockett, Yi Mao, Zhifang Sui, Weizhu Chen, Bill Dolan
Large pretrained generative models like GPT-3 often suffer from hallucinating non-existent or incorrect content, which undermines their potential merits in real applications.
3 code implementations • ACL 2022 • Damai Dai, Li Dong, Yaru Hao, Zhifang Sui, Baobao Chang, Furu Wei
In this paper, we present preliminary studies on how factual knowledge is stored in pretrained Transformers by introducing the concept of knowledge neurons.
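A hedged sketch of the underlying attribution idea: scale one FFN hidden activation from zero to its full value and integrate the gradient of the correct answer's log-probability along the way; `act_to_logprob` is a hypothetical hook, and the paper's exact procedure may differ.

```python
import torch

def knowledge_attribution(act_to_logprob, base_act, steps=20):
    """Integrated-gradients-style attribution for one FFN activation vector.

    act_to_logprob(act) runs the model with the FFN activation replaced by
    `act` and returns the log-probability of the correct answer; neurons
    with high attribution are candidate 'knowledge neurons'."""
    total = torch.zeros_like(base_act)
    for alpha in torch.linspace(0.0, 1.0, steps):
        act = (alpha * base_act).detach().requires_grad_(True)
        total += torch.autograd.grad(act_to_logprob(act), act)[0]
    # Riemann approximation of the path integral from 0 to base_act
    return base_act * total / steps
```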
no code implementations • 26 Mar 2021 • Damai Dai, Hua Zheng, Zhifang Sui, Baobao Chang
Conventional Machine Reading Comprehension (MRC) has been well addressed by pattern matching, but commonsense reasoning remains a gap between humans and machines.
1 code implementation • 17 Feb 2021 • Tianyu Liu, Xin Zheng, Baobao Chang, Zhifang Sui
In open-domain table-to-text generation, we notice that unfaithful generation usually contains hallucinated content that cannot be aligned to any input table record.
1 code implementation • 4 Dec 2020 • Damai Dai, Jing Ren, Shuang Zeng, Baobao Chang, Zhifang Sui
In classification, we combine the entity representations from both levels into more comprehensive representations for relation extraction.
Ranked #34 on Relation Extraction on DocRED
no code implementations • COLING 2020 • Kexiang Wang, Tianyu Liu, Baobao Chang, Zhifang Sui
The widespread adoption of reference-based automatic evaluation metrics such as ROUGE has promoted the development of document summarization.
1 code implementation • EMNLP 2020 • Xiaoan Ding, Tianyu Liu, Baobao Chang, Zhifang Sui, Kevin Gimpel
We explore training objectives for discriminative fine-tuning of our generative classifiers, showing improvements over log loss fine-tuning from prior work.
1 code implementation • CONLL 2020 • Tianyu Liu, Xin Zheng, Xiaoan Ding, Baobao Chang, Zhifang Sui
Prior work on natural language inference (NLI) debiasing mainly targets one or a few known biases, while not necessarily making the models more robust.
1 code implementation • ACL (RepL4NLP) 2021 • Damai Dai, Hua Zheng, Fuli Luo, Pengcheng Yang, Baobao Chang, Zhifang Sui
Conventional Knowledge Graph Completion (KGC) assumes that all test entities appear during training.
no code implementations • LREC 2020 • Tianyu Liu, Xin Zheng, Baobao Chang, Zhifang Sui
Many recent studies have shown that for models trained on datasets for natural language inference (NLI), it is possible to make correct predictions by merely looking at the hypothesis while completely ignoring the premise.
no code implementations • 3 Mar 2020 • Qiaolin Xia, Haoyang Huang, Nan Duan, Dongdong Zhang, Lei Ji, Zhifang Sui, Edward Cui, Taroon Bharti, Xin Liu, Ming Zhou
While many BERT-based cross-modal pre-trained models produce excellent results on downstream understanding tasks like image-text retrieval and VQA, they cannot be applied to generation tasks directly.
no code implementations • 2 Mar 2020 • Qiaolin Xia, Xiujun Li, Chunyuan Li, Yonatan Bisk, Zhifang Sui, Jianfeng Gao, Yejin Choi, Noah A. Smith
Learning to navigate in a visual environment following natural language instructions is a challenging task because natural language instructions are highly variable, ambiguous, and under-specified.
1 code implementation • IJCNLP 2019 • Fuli Luo, Shunyao Li, Pengcheng Yang, Lei LI, Baobao Chang, Zhifang Sui, Xu Sun
It consists of a generator to produce pun sentences, and a discriminator to distinguish between the generated pun sentences and the real sentences with specific word senses.
1 code implementation • ACL 2019 • Fuli Luo, Peng Li, Pengcheng Yang, Jie Zhou, Yutong Tan, Baobao Chang, Zhifang Sui, Xu Sun
In this paper, we focus on the task of fine-grained text sentiment transfer (FGST).
no code implementations • ACL 2019 • Tianyu Liu, Fuli Luo, Pengcheng Yang, Wei Wu, Baobao Chang, Zhifang Sui
To relieve these problems, we first propose a force attention (FA) method to encourage the generator to pay more attention to the uncovered attributes, avoiding potentially missing key attributes.
no code implementations • ACL 2019 • Fuli Luo, Damai Dai, Pengcheng Yang, Tianyu Liu, Baobao Chang, Zhifang Sui, Xu Sun
Therefore, we propose a generic and novel framework which consists of a sentiment analyzer and a sentimental generator, respectively addressing the two challenges.
2 code implementations • 24 May 2019 • Fuli Luo, Peng Li, Jie Zhou, Pengcheng Yang, Baobao Chang, Zhifang Sui, Xu Sun
Therefore, in this paper, we propose a dual reinforcement learning framework to directly transfer the style of the text via a one-step mapping model, without any separation of content and style.
Ranked #1 on Unsupervised Text Style Transfer on GYAFC
no code implementations • EMNLP 2018 • Tao Ge, Qing Dou, Heng Ji, Lei Cui, Baobao Chang, Zhifang Sui, Furu Wei, Ming Zhou
This paper proposes to study fine-grained coordinated cross-lingual text stream alignment through a novel information network decipherment paradigm.
no code implementations • EMNLP 2018 • Fuli Luo, Tianyu Liu, Zexue He, Qiaolin Xia, Zhifang Sui, Baobao Chang
The goal of Word Sense Disambiguation (WSD) is to identify the correct meaning of a word in a particular context.
1 code implementation • ACL 2018 • Fuli Luo, Tianyu Liu, Qiaolin Xia, Baobao Chang, Zhifang Sui
GAS models the semantic relationship between the context and the gloss in an improved memory network framework, which breaks the barriers of the previous supervised methods and knowledge-based methods.
Ranked #3 on Word Sense Disambiguation on SemEval 2015 Task 13
3 code implementations • 27 Nov 2017 • Tianyu Liu, Kexiang Wang, Lei Sha, Baobao Chang, Zhifang Sui
In the decoding phase, a dual attention mechanism, which combines word-level attention and field-level attention, is proposed to model the semantic relevance between the generated description and the table.
Ranked #1 on Table-to-Text Generation on WikiBio
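The combination of the two attention levels can be sketched in a few lines: a record receives weight only if both its word content and its field name look relevant, which we approximate below by multiplying and renormalizing the two distributions (shapes and the exact fusion rule are our assumptions).

```python
import torch
import torch.nn.functional as F

def dual_attention(dec_h, word_keys, field_keys, values):
    """dec_h: (d,) decoder state; word_keys/field_keys/values: (n, d)."""
    word_scores = F.softmax(word_keys @ dec_h, dim=0)    # word-level attention
    field_scores = F.softmax(field_keys @ dec_h, dim=0)  # field-level attention
    joint = word_scores * field_scores                   # both must agree
    joint = joint / (joint.sum() + 1e-9)
    return joint @ values                                # context vector (d,)
```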
no code implementations • EMNLP 2017 • Tianyu Liu, Kexiang Wang, Baobao Chang, Zhifang Sui
Distant-supervised relation extraction inevitably suffers from wrong-labeling problems because it heuristically labels relational facts using knowledge bases.
no code implementations • EMNLP 2017 • Kexiang Wang, Tianyu Liu, Zhifang Sui, Baobao Chang
Multi-document summarization provides users with a short text that summarizes the information in a set of related documents.
1 code implementation • 1 Sep 2017 • Lei Sha, Lili Mou, Tianyu Liu, Pascal Poupart, Sujian Li, Baobao Chang, Zhifang Sui
Generating texts from structured data (e.g., a table) is important for various natural language processing tasks such as question answering and dialog systems.
no code implementations • ACL 2017 • Qiaolin Xia, Lei Sha, Baobao Chang, Zhifang Sui
But the training data of a single corpus is often limited.
no code implementations • 22 Feb 2017 • Qiaolin Xia, Baobao Chang, Zhifang Sui
Previous studies on Chinese semantic role labeling (SRL) have concentrated on single semantically annotated corpus.
no code implementations • COLING 2016 • Lei Sha, Baobao Chang, Zhifang Sui, Sujian Li
After reading the premise again, the model can reach a better understanding of the premise, which in turn improves its understanding of the hypothesis.
Ranked #42 on Natural Language Inference on SNLI
no code implementations • COLING 2016 • Tingsong Jiang, Tianyu Liu, Tao Ge, Lei Sha, Baobao Chang, Sujian Li, Zhifang Sui
In this paper, we present a novel time-aware knowledge graph completion model that is able to predict links in a KG using both the existing facts and the temporal information of the facts.
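One simple way to inject a fact's timestamp into a translation-based scorer is to add a learned embedding of the time to the relation; the sketch below applies this to a TransE-style score as an illustration only, not as the paper's actual model.

```python
import numpy as np

def time_aware_score(h, r, t, tau):
    """Score a (head, relation, tail) fact at time embedding `tau`:
    higher (less negative) means more plausible."""
    return -np.linalg.norm(h + r + tau - t)

d = 16
h, r, t, tau = (np.random.randn(d) for _ in range(4))
print(time_aware_score(h, r, t, tau))
```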
no code implementations • COLING 2016 • Tao Ge, Lei Cui, Baobao Chang, Zhifang Sui, Ming Zhou
Retrospective event detection is an important task for discovering previously unidentified events in a text stream.
no code implementations • 27 Sep 2016 • Tao Ge, Qing Dou, Xiaoman Pan, Heng Ji, Lei Cui, Baobao Chang, Zhifang Sui, Ming Zhou
We introduce a novel Burst Information Network (BINet) representation that can display the most important information and illustrate the connections among bursty entities, events and keywords in the corpus.
no code implementations • 9 Mar 2016 • Yang Liu, Sujian Li, Xiaodong Zhang, Zhifang Sui
Without discourse connectives, classifying implicit discourse relations is a challenging task and a bottleneck for building a practical discourse parser.
no code implementations • NAACL 2016 • Lei Sha, Sujian Li, Baobao Chang, Zhifang Sui
Automatic event schema induction (AESI) aims to extract meta-events from raw text; in other words, to find out what types (templates) of events may exist in the raw text and what roles (slots) may exist in each event type.