Search Results for author: Yuxian Meng

Found 40 papers, 21 papers with code

ChineseBERT: Chinese Pretraining Enhanced by Glyph and Pinyin Information

3 code implementations ACL 2021 Zijun Sun, Xiaoya Li, Xiaofei Sun, Yuxian Meng, Xiang Ao, Qing He, Fei Wu, Jiwei Li

Recent pretraining models in Chinese neglect two important aspects specific to the Chinese language: glyph and pinyin, which carry significant syntactic and semantic information for language understanding.

Language Modelling Machine Reading Comprehension +5

A Unified MRC Framework for Named Entity Recognition

8 code implementations ACL 2020 Xiaoya Li, Jingrong Feng, Yuxian Meng, Qinghong Han, Fei Wu, Jiwei Li

Instead of treating the task of NER as a sequence labeling problem, we propose to formulate it as a machine reading comprehension (MRC) task.

Ranked #2 on Nested Mention Recognition on ACE 2004 (using extra training data)

Chinese Named Entity Recognition Entity Extraction using GAN +4
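
The MRC formulation above can be made concrete with a small sketch: each entity type is paired with a natural-language query and answer spans are extracted per query. The query wordings and the `span_extractor` callable below are illustrative assumptions, not the paper's exact prompts or model.

```python
# Minimal sketch: recasting NER as machine reading comprehension (MRC).
# Query phrasings and the span-extraction model are illustrative assumptions.

# One natural-language query per entity type (hypothetical phrasings).
TYPE_QUERIES = {
    "PER": "Find person names in the text.",
    "ORG": "Find organizations, companies, and institutions in the text.",
    "LOC": "Find locations such as cities, countries, and regions in the text.",
}

def mrc_ner(context: str, span_extractor) -> list[tuple[str, int, int]]:
    """Run one MRC pass per entity type and collect (type, start, end) spans.

    `span_extractor(query, context)` is assumed to return a list of
    (start, end) character offsets answering the query, e.g. an extractive
    QA model built on a pretrained encoder.
    """
    entities = []
    for ent_type, query in TYPE_QUERIES.items():
        for start, end in span_extractor(query, context):
            entities.append((ent_type, start, end))
    return entities
```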

Glyce: Glyph-vectors for Chinese Character Representations

2 code implementations NeurIPS 2019 Yuxian Meng, Wei Wu, Fei Wang, Xiaoya Li, Ping Nie, Fan Yin, Muyu Li, Qinghong Han, Xiaofei Sun, Jiwei Li

However, due to the lack of rich pictographic evidence in glyphs and the weak generalization ability of standard computer vision models on character data, an effective way to utilize the glyph information remains to be found.

Chinese Dependency Parsing Chinese Named Entity Recognition +21

Dice Loss for Data-imbalanced NLP Tasks

2 code implementations ACL 2020 Xiaoya Li, Xiaofei Sun, Yuxian Meng, Junjun Liang, Fei Wu, Jiwei Li

Many NLP tasks such as tagging and machine reading comprehension are faced with the severe data imbalance issue: negative examples significantly outnumber positive examples, and the huge number of background examples (or easy-negative examples) overwhelms the training.

Ranked #1 on Chinese Named Entity Recognition on OntoNotes 4 (using extra training data)

Chinese Named Entity Recognition Machine Reading Comprehension +5
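
As a rough illustration of why a dice-style objective helps with the imbalance described above, here is a minimal soft dice loss for binary token classification in PyTorch; the smoothing constant and this plain soft-dice form are generic assumptions, not necessarily the self-adjusting variant proposed in the paper.

```python
import torch

def soft_dice_loss(logits: torch.Tensor, targets: torch.Tensor,
                   smooth: float = 1.0) -> torch.Tensor:
    """Generic soft dice loss for binary classification.

    logits:  (N,) raw scores; targets: (N,) 0/1 labels.
    Dice measures overlap between predictions and positives, so the many
    easy negatives contribute far less than they do to cross-entropy.
    """
    probs = torch.sigmoid(logits)
    targets = targets.float()
    intersection = (probs * targets).sum()
    dice = (2.0 * intersection + smooth) / (probs.sum() + targets.sum() + smooth)
    return 1.0 - dice
```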

BertGCN: Transductive Text Classification by Combining GCN and BERT

1 code implementation 12 May 2021 Yuxiao Lin, Yuxian Meng, Xiaofei Sun, Qinghong Han, Kun Kuang, Jiwei Li, Fei Wu

In this work, we propose BertGCN, a model that combines large scale pretraining and transductive learning for text classification.

text-classification Text Classification +1
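
One simple way the two branches in BertGCN can be combined is a linear interpolation of their predicted distributions; the sketch below assumes log-probability inputs and an illustrative mixing weight, and is not claimed to match the paper's exact formulation.

```python
import torch

def bertgcn_predict(bert_logprobs: torch.Tensor,
                    gcn_logprobs: torch.Tensor,
                    lam: float = 0.7) -> torch.Tensor:
    """Interpolate the two classifiers' predicted distributions.

    bert_logprobs / gcn_logprobs: (num_docs, num_classes) log-probabilities
    from the BERT classifier and from the GCN over the document-word graph.
    `lam` weights the graph branch; 0.7 is an illustrative default.
    """
    probs = lam * gcn_logprobs.exp() + (1.0 - lam) * bert_logprobs.exp()
    return probs.log()
```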

Dependency Parsing as MRC-based Span-Span Prediction

2 code implementations ACL 2022 Leilei Gan, Yuxian Meng, Kun Kuang, Xiaofei Sun, Chun Fan, Fei Wu, Jiwei Li

The proposed method has the following merits: (1) it addresses the fundamental problem that edges in a dependency tree should be constructed between subtrees; (2) the MRC framework allows the method to retrieve missing spans in the span proposal stage, which leads to higher recall for eligible spans.

Dependency Parsing Machine Reading Comprehension

OpenViDial: A Large-Scale, Open-Domain Dialogue Dataset with Visual Contexts

1 code implementation 30 Dec 2020 Yuxian Meng, Shuhe Wang, Qinghong Han, Xiaofei Sun, Fei Wu, Rui Yan, Jiwei Li

Based on this dataset, we propose a family of encoder-decoder models leveraging both textual and visual contexts, from coarse-grained image features extracted from CNNs to fine-grained object features extracted from Faster R-CNNs.

Dialogue Generation

Modeling Text-visual Mutual Dependency for Multi-modal Dialog Generation

1 code implementation 30 May 2021 Shuhe Wang, Yuxian Meng, Xiaofei Sun, Fei Wu, Rongbin Ouyang, Rui Yan, Tianwei Zhang, Jiwei Li

Specifically, we propose to model the mutual dependency between textual and visual features: the model not only learns the probability of generating the next dialog utterance given the preceding dialog utterances and visual contexts, but also the probability of predicting the visual features in which a dialog utterance takes place, making the generated dialog utterance specific to the visual context.

OpenViDial 2.0: A Larger-Scale, Open-Domain Dialogue Generation Dataset with Visual Contexts

1 code implementation 27 Sep 2021 Shuhe Wang, Yuxian Meng, Xiaoya Li, Xiaofei Sun, Rongbin Ouyang, Jiwei Li

In order to better simulate the real human conversation process, models need to generate dialogue utterances based on not only preceding textual contexts but also visual contexts.

Multi-modal Dialogue Generation

Self-Explaining Structures Improve NLP Models

1 code implementation 3 Dec 2020 Zijun Sun, Chun Fan, Qinghong Han, Xiaofei Sun, Yuxian Meng, Fei Wu, Jiwei Li

The proposed model comes with the following merits: (1) span weights make the model self-explainable and do not require an additional probing model for interpretation; (2) the proposed model is general and can be adapted to any existing deep learning structures in NLP; (3) the weight associated with each text span provides direct importance scores for higher-level text units such as phrases and sentences.

Natural Language Inference Paraphrase Identification +1

$k$NN-NER: Named Entity Recognition with Nearest Neighbor Search

1 code implementation 31 Mar 2022 Shuhe Wang, Xiaoya Li, Yuxian Meng, Tianwei Zhang, Rongbin Ouyang, Jiwei Li, Guoyin Wang

Inspired by recent advances in retrieval-augmented methods in NLP (Khandelwal et al., 2019; Khandelwal et al., 2020; Meng et al., 2021), in this paper, we introduce a $k$ nearest neighbor NER ($k$NN-NER) framework, which augments the distribution of entity labels by assigning $k$ nearest neighbors retrieved from the training set.

Few-Shot Learning named-entity-recognition +3
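
A minimal sketch of the kNN augmentation idea: the model's label distribution for a token is interpolated with a distribution built from the labels of its retrieved neighbors. The distance-based softmax weighting and the interpolation coefficient are assumptions for illustration.

```python
import numpy as np

def knn_ner_distribution(model_probs: np.ndarray,
                         neighbor_labels: np.ndarray,
                         neighbor_dists: np.ndarray,
                         num_labels: int,
                         lam: float = 0.5,
                         temperature: float = 1.0) -> np.ndarray:
    """Interpolate a token's model label distribution with a kNN distribution.

    neighbor_labels: (k,) integer label ids of the k nearest training tokens.
    neighbor_dists:  (k,) distances of those neighbors in representation space.
    The softmax-over-negative-distance weighting and lam=0.5 are assumptions.
    """
    weights = np.exp(-neighbor_dists / temperature)
    weights /= weights.sum()
    knn_probs = np.zeros(num_labels)
    for label, w in zip(neighbor_labels, weights):
        knn_probs[label] += w
    return lam * knn_probs + (1.0 - lam) * model_probs
```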

Neural Semi-supervised Learning for Text Classification Under Large-Scale Pretraining

1 code implementation 17 Nov 2020 Zijun Sun, Chun Fan, Xiaofei Sun, Yuxian Meng, Fei Wu, Jiwei Li

The goal of semi-supervised learning is to use the unlabeled, in-domain dataset U to improve models trained on the labeled dataset D. In the context of large-scale language-model (LM) pretraining, how to make the best use of U is poorly understood: is semi-supervised learning still beneficial in the presence of large-scale pretraining?

General Classification Language Modelling +3

GNN-LM: Language Modeling based on Global Contexts via GNN

1 code implementation ICLR 2022 Yuxian Meng, Shi Zong, Xiaoya Li, Xiaofei Sun, Tianwei Zhang, Fei Wu, Jiwei Li

Inspired by the notion that "to copy is easier than to memorize", in this work we introduce GNN-LM, which extends the vanilla neural language model (LM) by allowing it to reference similar contexts in the entire training corpus.

Language Modelling

Fast Nearest Neighbor Machine Translation

1 code implementation Findings (ACL) 2022 Yuxian Meng, Xiaoya Li, Xiayu Zheng, Fei Wu, Xiaofei Sun, Tianwei Zhang, Jiwei Li

Fast $k$NN-MT constructs a significantly smaller datastore for the nearest neighbor search: for each word in a source sentence, Fast $k$NN-MT first selects its nearest token-level neighbors, which is limited to tokens that are the same as the query token.

Machine Translation NMT +2
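
The datastore restriction described above (search only among entries whose source token matches the query token) can be sketched as follows; the flat per-token dictionary and plain L2 search are simplifying assumptions.

```python
import numpy as np
from collections import defaultdict

def build_token_datastore(src_tokens, src_vectors):
    """Group datastore entries by their source token, as the abstract describes:
    neighbor search for a query token is restricted to identical tokens."""
    store = defaultdict(list)
    for tok, vec in zip(src_tokens, src_vectors):
        store[tok].append(vec)
    return {tok: np.stack(vecs) for tok, vecs in store.items()}

def nearest_same_token(query_tok, query_vec, store, k=4):
    """Return indices of the k nearest entries sharing the query token
    (a sketch; the real system also indexes target-side contexts)."""
    if query_tok not in store:
        return []
    cand = store[query_tok]                        # (n, d)
    dists = np.linalg.norm(cand - query_vec, axis=1)
    return np.argsort(dists)[:k].tolist()
```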

An MRC Framework for Semantic Role Labeling

1 code implementation COLING 2022 Nan Wang, Jiwei Li, Yuxian Meng, Xiaofei Sun, Han Qiu, Ziyao Wang, Guoyin Wang, Jun He

We formalize predicate disambiguation as multiple-choice machine reading comprehension, where the descriptions of candidate senses of a given predicate are used as options to select the correct sense.

Computational Efficiency Machine Reading Comprehension +3

Defending Against Backdoor Attacks in Natural Language Generation

1 code implementation 3 Jun 2021 Xiaofei Sun, Xiaoya Li, Yuxian Meng, Xiang Ao, Lingjuan Lyu, Jiwei Li, Tianwei Zhang

The frustratingly fragile nature of neural network models makes current natural language generation (NLG) systems prone to backdoor attacks that cause them to generate malicious sequences that could be sexist or offensive.

Backdoor Attack Dialogue Generation +2

Triggerless Backdoor Attack for NLP Tasks with Clean Labels

2 code implementations NAACL 2022 Leilei Gan, Jiwei Li, Tianwei Zhang, Xiaoya Li, Yuxian Meng, Fei Wu, Yi Yang, Shangwei Guo, Chun Fan

To deal with this issue, in this paper, we propose a new strategy to perform textual backdoor attacks which do not require an external trigger, and the poisoned samples are correctly labeled.

Backdoor Attack Sentence

GNN-SL: Sequence Labeling Based on Nearest Examples via GNN

1 code implementation 5 Dec 2022 Shuhe Wang, Yuxian Meng, Rongbin Ouyang, Jiwei Li, Tianwei Zhang, Lingjuan Lyu, Guoyin Wang

To better handle long-tail cases in the sequence labeling (SL) task, in this work, we introduce graph neural networks sequence labeling (GNN-SL), which augments the vanilla SL model output with similar tagging examples retrieved from the whole training set.

Chinese Word Segmentation named-entity-recognition +4

$k$Folden: $k$-Fold Ensemble for Out-Of-Distribution Detection

1 code implementation 29 Aug 2021 Xiaoya Li, Jiwei Li, Xiaofei Sun, Chun Fan, Tianwei Zhang, Fei Wu, Yuxian Meng, Jun Zhang

For a task with $k$ training labels, $k$Folden induces $k$ sub-models, each of which is trained on a subset with $k-1$ categories with the left category masked unknown to the sub-model.

Attribute domain classification +4
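
The leave-one-label-out construction described above can be sketched in a few lines; how the $k$ sub-models' outputs are combined at test time is omitted here.

```python
def kfolden_splits(examples, labels, label_set):
    """Build the k leave-one-label-out training subsets described in the abstract.

    Returns a list of (held_out_label, subset) pairs; each subset contains only
    examples whose label differs from the held-out one, so that label stays
    unseen ("unknown") to that sub-model.
    """
    splits = []
    for held_out in label_set:
        subset = [(x, y) for x, y in zip(examples, labels) if y != held_out]
        splits.append((held_out, subset))
    return splits
```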

Self Question-answering: Aspect-based Sentiment Analysis by Role Flipped Machine Reading Comprehension

1 code implementation Findings (EMNLP) 2021 Guoxin Yu, Jiwei Li, Ling Luo, Yuxian Meng, Xiang Ao, Qing He

In this paper, we investigate the unified ABSA task from the perspective of Machine Reading Comprehension (MRC) by observing that the aspect and the opinion terms can serve as the query and answer in MRC interchangeably.

Aspect-Based Sentiment Analysis Aspect-Based Sentiment Analysis (ABSA) +3

Is Word Segmentation Necessary for Deep Learning of Chinese Representations?

no code implementations ACL 2019 Xiaoya Li, Yuxian Meng, Xiaofei Sun, Qinghong Han, Arianna Yuan, Jiwei Li

Based on these observations, we conduct comprehensive experiments to study why word-based models underperform char-based models in these deep learning-based NLP tasks.

Chinese Word Segmentation Language Modelling +6

DSReg: Using Distant Supervision as a Regularizer

no code implementations ICLR 2020 Yuxian Meng, Muyu Li, Xiaoya Li, Wei Wu, Jiwei Li

In this paper, we aim at tackling a general issue in NLP tasks where some of the negative examples are highly similar to the positive examples, i.e., hard-negative examples.

Multi-Task Learning Reading Comprehension +2

Large-scale Pretraining for Neural Machine Translation with Tens of Billions of Sentence Pairs

no code implementations 26 Sep 2019 Yuxian Meng, Xiangyuan Ren, Zijun Sun, Xiaoya Li, Arianna Yuan, Fei Wu, Jiwei Li

In this paper, we investigate the problem of training neural machine translation (NMT) systems with a dataset of more than 40 billion bilingual sentence pairs, which is larger than the largest dataset to date by orders of magnitude.

Machine Translation NMT +2

LAVA NAT: A Non-Autoregressive Translation Model with Look-Around Decoding and Vocabulary Attention

no code implementations 8 Feb 2020 Xiaoya Li, Yuxian Meng, Arianna Yuan, Fei Wu, Jiwei Li

Non-autoregressive translation (NAT) models generate multiple tokens in one forward pass and are highly efficient at the inference stage compared with autoregressive translation (AT) methods.

Translation

Non-Autoregressive Neural Dialogue Generation

no code implementations 11 Feb 2020 Qinghong Han, Yuxian Meng, Fei Wu, Jiwei Li

Unfortunately, under the framework of the sequence-to-sequence model, direct decoding from $\log p(y|x) + \log p(x|y)$ is infeasible since the second part (i.e., $p(x|y)$) requires the completion of target generation before it can be computed, and the search space for $y$ is enormous.

Dialogue Generation Open-Domain Dialog +1
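
To make the infeasibility argument above concrete: $p(x|y)$ can only be scored once a complete $y$ exists, so a classic seq2seq-era workaround (shown below purely as an illustrative assumption, not the paper's non-autoregressive method) is to rerank a small set of finished candidates by $\log p(y|x) + \log p(x|y)$.

```python
def mmi_rerank(x, candidates, logp_y_given_x, logp_x_given_y):
    """Rerank completed candidate responses y by log p(y|x) + log p(x|y).

    This illustrates why direct decoding with the p(x|y) term is infeasible:
    the backward score can only be computed after y is fully generated, so a
    standard workaround scores a small set of finished candidates instead.
    Both scoring functions are assumed to be provided (e.g. two seq2seq models).
    """
    scored = [(logp_y_given_x(x, y) + logp_x_given_y(x, y), y) for y in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[0][1]
```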

Improving Robustness and Generality of NLP Models Using Disentangled Representations

no code implementations 21 Sep 2020 Jiawei Wu, Xiaoya Li, Xiang Ao, Yuxian Meng, Fei Wu, Jiwei Li

We show that models trained with the proposed criteria provide better robustness and domain adaptation ability in a wide range of supervised learning tasks.

Domain Adaptation Representation Learning

Summarize, Outline, and Elaborate: Long-Text Generation via Hierarchical Supervision from Extractive Summaries

no code implementations COLING 2022 Xiaofei Sun, Zijun Sun, Yuxian Meng, Jiwei Li, Chun Fan

The difficulty of generating coherent long texts lies in the fact that existing models overwhelmingly focus on predicting local words, and cannot make high level plans on what to generate or capture the high-level discourse dependencies between chunks of texts.

Text Generation

Pair the Dots: Jointly Examining Training History and Test Stimuli for Model Interpretability

no code implementations 14 Oct 2020 Yuxian Meng, Chun Fan, Zijun Sun, Eduard Hovy, Fei Wu, Jiwei Li

Any prediction from a model is made by a combination of learning history and test stimuli.

Sentence Similarity Based on Contexts

no code implementations 17 May 2021 Xiaofei Sun, Yuxian Meng, Xiang Ao, Fei Wu, Tianwei Zhang, Jiwei Li, Chun Fan

The proposed framework is based on the core idea that the meaning of a sentence should be defined by its contexts, and that sentence similarity can be measured by comparing the probabilities of generating two sentences given the same context.

Language Modelling Semantic Similarity +3
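
A rough sketch of the context-based comparison described above, assuming a callable that returns $\log p(\text{sentence}\,|\,\text{context})$ from a pretrained LM; the averaging of log-probability gaps is an illustrative scoring choice, not necessarily the paper's exact measure.

```python
def context_similarity(s1, s2, contexts, lm_logprob):
    """Compare two sentences by how similarly a language model would generate
    them from the same contexts (a sketch of the idea in the abstract).

    lm_logprob(context, sentence) is an assumed callable returning
    log p(sentence | context), e.g. from a pretrained LM. A smaller average
    gap between the two generation log-probabilities is read as higher
    similarity, so the negated mean gap serves as the score.
    """
    gaps = [abs(lm_logprob(c, s1) - lm_logprob(c, s2)) for c in contexts]
    return -sum(gaps) / len(gaps)
```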

Parameter Estimation for the SEIR Model Using Recurrent Nets

no code implementations 30 May 2021 Chun Fan, Yuxian Meng, Xiaofei Sun, Fei Wu, Tianwei Zhang, Jiwei Li

Next, based on this recurrent net, which is able to generalize SEIR simulations, we can transform the objective into a differentiable one with respect to $\Theta_\text{SEIR}$ and straightforwardly obtain its optimal value.
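
For reference, the discrete SEIR dynamics that such a recurrent net would be trained to reproduce can be written in a few lines; the unit step size and normalized populations are assumptions.

```python
def simulate_seir(beta, sigma, gamma, s0, e0, i0, r0, steps):
    """Discrete-time SEIR dynamics (the standard textbook update).

    beta:  transmission rate, sigma: incubation rate (E -> I),
    gamma: recovery rate (I -> R). Populations are fractions summing to 1.
    """
    s, e, i, r = s0, e0, i0, r0
    trajectory = [(s, e, i, r)]
    for _ in range(steps):
        new_exposed = beta * s * i
        new_infectious = sigma * e
        new_recovered = gamma * i
        s -= new_exposed
        e += new_exposed - new_infectious
        i += new_infectious - new_recovered
        r += new_recovered
        trajectory.append((s, e, i, r))
    return trajectory
```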

Layer-wise Model Pruning based on Mutual Information

no code implementations EMNLP 2021 Chun Fan, Jiwei Li, Xiang Ao, Fei Wu, Yuxian Meng, Xiaofei Sun

The proposed pruning strategy offers merits over weight-based pruning techniques: (1) it avoids irregular memory access since representations and matrices can be squeezed into their smaller but dense counterparts, leading to greater speedup; (2) in a manner of top-down pruning, the proposed method operates from a more global perspective based on training signals in the top layer, and prunes each layer by propagating the effect of global signals through layers, leading to better performances at the same sparsity level.

Paraphrase Generation as Unsupervised Machine Translation

no code implementations COLING 2022 Xiaofei Sun, Yufei Tian, Yuxian Meng, Nanyun Peng, Fei Wu, Jiwei Li, Chun Fan

Then, based on the paraphrase pairs produced by these UMT models, a unified surrogate model can be trained to serve as the final sequence-to-sequence model to generate paraphrases, which can be directly used at test time in the unsupervised setup, or be finetuned on labeled datasets in the supervised setup.

Paraphrase Generation Sentence +3

BadPre: Task-agnostic Backdoor Attacks to Pre-trained NLP Foundation Models

no code implementations ICLR 2022 Kangjie Chen, Yuxian Meng, Xiaofei Sun, Shangwei Guo, Tianwei Zhang, Jiwei Li, Chun Fan

The key feature of our attack is that the adversary does not need prior information about the downstream tasks when implanting the backdoor to the pre-trained model.

Backdoor Attack Transfer Learning

Interpreting Deep Learning Models in Natural Language Processing: A Review

no code implementations 20 Oct 2021 Xiaofei Sun, Diyi Yang, Xiaoya Li, Tianwei Zhang, Yuxian Meng, Han Qiu, Guoyin Wang, Eduard Hovy, Jiwei Li

Neural network models have achieved state-of-the-art performances in a wide range of natural language processing (NLP) tasks.

Faster Nearest Neighbor Machine Translation

no code implementations 15 Dec 2021 Shuhe Wang, Jiwei Li, Yuxian Meng, Rongbin Ouyang, Guoyin Wang, Xiaoya Li, Tianwei Zhang, Shi Zong

The core idea of Faster $k$NN-MT is to use a hierarchical clustering strategy to approximate the distance between the query and a data point in the datastore, which is decomposed into two parts: the distance between the query and the center of the cluster that the data point belongs to, and the distance between the data point and the cluster center.

Machine Translation Translation
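
The two-part distance approximation described above can be sketched directly; the clustering method producing the centroids (e.g. k-means) is an assumption here.

```python
import numpy as np

def approx_query_distances(query, points, assignments, centers):
    """Approximate ||query - point|| as described in the abstract:
    distance(query, the point's cluster center) + distance(point, its center).

    points: (n, d) datastore vectors, assignments: (n,) cluster ids,
    centers: (c, d) cluster centroids.
    """
    q_to_center = np.linalg.norm(centers - query, axis=1)               # (c,)
    p_to_center = np.linalg.norm(points - centers[assignments], axis=1)  # (n,)
    return q_to_center[assignments] + p_to_center
```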
