Search Results for author: Jiwei Li

Found 119 papers, 45 papers with code

Self Question-answering: Aspect-based Sentiment Analysis by Role Flipped Machine Reading Comprehension

1 code implementation Findings (EMNLP) 2021 Guoxin Yu, Jiwei Li, Ling Luo, Yuxian Meng, Xiang Ao, Qing He

In this paper, we investigate the unified ABSA task from the perspective of Machine Reading Comprehension (MRC) by observing that the aspect and the opinion terms can serve as the query and answer in MRC interchangeably.

Aspect-Based Sentiment Analysis (ABSA) Machine Reading Comprehension +2

Instruction Tuning for Large Language Models: A Survey

1 code implementation 21 Aug 2023 Shengyu Zhang, Linfeng Dong, Xiaoya Li, Sen Zhang, Xiaofei Sun, Shuhe Wang, Jiwei Li, Runyi Hu, Tianwei Zhang, Fei Wu, Guoyin Wang

In this work, we make a systematic review of the literature, including the general methodology of instruction tuning (IT), the construction of IT datasets, the training of IT models, and applications to different modalities, domains and applications, along with an analysis of aspects that influence the outcome of IT (e.g., generation of instruction outputs, size of the instruction dataset, etc.).

Pushing the Limits of ChatGPT on NLP Tasks

no code implementations 16 Jun 2023 Xiaofei Sun, Linfeng Dong, Xiaoya Li, Zhen Wan, Shuhe Wang, Tianwei Zhang, Jiwei Li, Fei Cheng, Lingjuan Lyu, Fei Wu, Guoyin Wang

In this work, we propose a collection of general modules to address these issues, in an attempt to push the limits of ChatGPT on NLP tasks.

Dependency Parsing Event Extraction +8

TaDSE: Template-aware Dialogue Sentence Embeddings

no code implementations 23 May 2023 Minsik Oh, Jiwei Li, Guoyin Wang

We further introduce a novel analytic instrument of Semantic Compression method, for which we discover a correlation with uniformity and alignment.

Contrastive Learning intent-classification +5

Text Classification via Large Language Models

1 code implementation 15 May 2023 Xiaofei Sun, Xiaoya Li, Jiwei Li, Fei Wu, Shangwei Guo, Tianwei Zhang, Guoyin Wang

This is due to (1) the lack of reasoning ability in addressing complex linguistic phenomena (e.g., intensification, contrast, irony, etc.); (2) the limited number of tokens allowed in in-context learning.

Domain Adaptation SST-2 +2

GPT-RE: In-context Learning for Relation Extraction using Large Language Models

no code implementations 3 May 2023 Zhen Wan, Fei Cheng, Zhuoyuan Mao, Qianying Liu, Haiyue Song, Jiwei Li, Sadao Kurohashi

In spite of the potential for ground-breaking achievements offered by large language models (LLMs) (e.g., GPT-3), they still lag significantly behind fully-supervised baselines (e.g., fine-tuned BERT) in relation extraction (RE).

Relation Extraction Retrieval

GPT-NER: Named Entity Recognition via Large Language Models

2 code implementations 20 Apr 2023 Shuhe Wang, Xiaofei Sun, Xiaoya Li, Rongbin Ouyang, Fei Wu, Tianwei Zhang, Jiwei Li, Guoyin Wang

GPT-NER bridges the gap by transforming the sequence labeling task to a generation task that can be easily adapted by LLMs, e.g., the task of finding location entities in the input text "Columbus is a city" is transformed to generate the text sequence "@@Columbus## is a city", where special tokens @@## mark the entity to extract.

named-entity-recognition Named Entity Recognition +3
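
The marking scheme quoted in the GPT-NER abstract above can be illustrated with a short Python sketch. It only shows how text in the "@@entity## ..." form might be decoded back into entity strings; the function names and the regular expression are illustrative assumptions, not the authors' code.

```python
import re
from typing import List

# Assumed pattern for the "@@entity##" marking quoted in the abstract.
# The non-greedy group keeps each match inside a single marker pair.
ENTITY_PATTERN = re.compile(r"@@(.+?)##")


def extract_marked_entities(generated: str) -> List[str]:
    """Return the strings wrapped in @@ ... ## markers, in order."""
    return ENTITY_PATTERN.findall(generated)


def strip_markers(generated: str) -> str:
    """Recover the plain text by dropping the markers but keeping the entity."""
    return ENTITY_PATTERN.sub(r"\1", generated)


# Example sentence taken from the abstract.
marked = "@@Columbus## is a city"
entities = extract_marked_entities(marked)
plain = strip_markers(marked)
```

Here `entities` holds the extracted location mention and `plain` is the original input sentence, which gives a simple consistency check between the generated text and the input.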

Backdoor Attacks with Input-unique Triggers in NLP

no code implementations 25 Mar 2023 Xukun Zhou, Jiwei Li, Tianwei Zhang, Lingjuan Lyu, Muqiao Yang, Jun He

Backdoor attack aims at inducing neural models to make incorrect predictions for poison data while keeping predictions on the clean dataset unchanged, which creates a considerable threat to current natural language processing (NLP) systems.

Backdoor Attack Language Modelling

Open World Classification with Adaptive Negative Samples

no code implementations 9 Mar 2023 Ke Bai, Guoyin Wang, Jiwei Li, Sunghyun Park, Sungjin Lee, Puyang Xu, Ricardo Henao, Lawrence Carin

Open world classification is a task in natural language processing with key practical relevance and impact.


PK-ICR: Persona-Knowledge Interactive Context Retrieval for Grounded Dialogue

no code implementations 13 Feb 2023 Minsik Oh, Joosung Lee, Jiwei Li, Guoyin Wang

Identifying relevant Persona or Knowledge for conversational systems is a critical component of grounded dialogue response generation.

Response Generation Retrieval

GNN-SL: Sequence Labeling Based on Nearest Examples via GNN

1 code implementation 5 Dec 2022 Shuhe Wang, Yuxian Meng, Rongbin Ouyang, Jiwei Li, Tianwei Zhang, Lingjuan Lyu, Guoyin Wang

To better handle long-tail cases in the sequence labeling (SL) task, in this work, we introduce graph neural networks sequence labeling (GNN-SL), which augments the vanilla SL model output with similar tagging examples retrieved from the whole training set.

Chinese Word Segmentation named-entity-recognition +4

Rescue Implicit and Long-tail Cases: Nearest Neighbor Relation Extraction

1 code implementation 21 Oct 2022 Zhen Wan, Qianying Liu, Zhuoyuan Mao, Fei Cheng, Sadao Kurohashi, Jiwei Li

Relation extraction (RE) has achieved remarkable progress with the help of pre-trained language models.

Relation Extraction

Ranking-Enhanced Unsupervised Sentence Representation Learning

1 code implementation 9 Sep 2022 Yeon Seonwoo, Guoyin Wang, Changmin Seo, Sajal Choudhary, Jiwei Li, Xiang Li, Puyang Xu, Sunghyun Park, Alice Oh

In this work, we show that the semantic meaning of a sentence is also determined by nearest-neighbor sentences that are similar to the input sentence.

Contrastive Learning Data Augmentation +4

ShiftNAS: Towards Automatic Generation of Advanced Multiplication-Less Neural Networks

no code implementations 7 Apr 2022 Xiaoxuan Lou, Guowen Xu, Kangjie Chen, Guanlin Li, Jiwei Li, Tianwei Zhang

Multiplication-less neural networks significantly reduce the time and energy cost on the hardware platform, as the compute-intensive multiplications are replaced with lightweight bit-shift operations.

Neural Architecture Search

$k$NN-NER: Named Entity Recognition with Nearest Neighbor Search

1 code implementation 31 Mar 2022 Shuhe Wang, Xiaoya Li, Yuxian Meng, Tianwei Zhang, Rongbin Ouyang, Jiwei Li, Guoyin Wang

Inspired by recent advances in retrieval augmented methods in NLP (Khandelwal et al., 2019, 2020; Meng et al., 2021), in this paper, we introduce a $k$ nearest neighbor NER ($k$NN-NER) framework, which augments the distribution of entity labels by assigning $k$ nearest neighbors retrieved from the training set.

Few-Shot Learning named-entity-recognition +3

Physical Backdoor Attacks to Lane Detection Systems in Autonomous Driving

no code implementations 2 Mar 2022 Xingshuo Han, Guowen Xu, Yuan Zhou, Xuehuan Yang, Jiwei Li, Tianwei Zhang

However, DNN models are vulnerable to different types of adversarial attacks, which pose significant risks to the security and safety of the vehicles and passengers.

Autonomous Driving Backdoor Attack +1

Faster Nearest Neighbor Machine Translation

no code implementations 15 Dec 2021 Shuhe Wang, Jiwei Li, Yuxian Meng, Rongbin Ouyang, Guoyin Wang, Xiaoya Li, Tianwei Zhang, Shi Zong

The core idea of Faster $k$NN-MT is to use a hierarchical clustering strategy to approximate the distance between the query and a data point in the datastore, which is decomposed into two parts: the distance between the query and the center of the cluster that the data point belongs to, and the distance between the data point and the cluster center.

Machine Translation Translation
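
The two-part distance described in the Faster $k$NN-MT abstract above can be sketched as follows. The abstract only says that the query-to-point distance is decomposed into a query-to-center part and a point-to-center part; adding the two parts (which, for Euclidean distance, gives a triangle-inequality upper bound) is an assumption made here for illustration, and all names and vectors are hypothetical.

```python
import math
from typing import Sequence


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Plain Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def approx_distance(query: Sequence[float],
                    center: Sequence[float],
                    point_to_center: float) -> float:
    """Assumed combination: d(query, center) + d(point, center).

    The second term depends only on the datastore, so it can be
    computed once and reused for every query.
    """
    return euclidean(query, center) + point_to_center


# Hypothetical 2-D vectors, chosen so the distances are easy to check by hand.
query = (0.0, 0.0)
center = (3.0, 4.0)
point = (3.0, 5.0)

point_to_center = euclidean(point, center)  # precomputable per data point
approx = approx_distance(query, center, point_to_center)
exact = euclidean(query, point)
```

With one query-to-center distance per cluster, every point in that cluster can be scored without touching its full vector, which is where the speedup in such a scheme would come from.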

A General Framework for Defending Against Backdoor Attacks via Influence Graph

no code implementations 29 Nov 2021 Xiaofei Sun, Jiwei Li, Xiaoya Li, Ziyao Wang, Tianwei Zhang, Han Qiu, Fei Wu, Chun Fan

In this work, we propose a new and general framework to defend against backdoor attacks, inspired by the fact that attack triggers usually follow a specific type of attacking pattern, and therefore, poisoned training examples have greater impacts on each other during training.

Triggerless Backdoor Attack for NLP Tasks with Clean Labels

1 code implementation NAACL 2022 Leilei Gan, Jiwei Li, Tianwei Zhang, Xiaoya Li, Yuxian Meng, Fei Wu, Yi Yang, Shangwei Guo, Chun Fan

To deal with this issue, in this paper, we propose a new strategy to perform textual backdoor attacks which do not require an external trigger, and the poisoned samples are correctly labeled.

Backdoor Attack

Interpreting Deep Learning Models in Natural Language Processing: A Review

no code implementations 20 Oct 2021 Xiaofei Sun, Diyi Yang, Xiaoya Li, Tianwei Zhang, Yuxian Meng, Han Qiu, Guoyin Wang, Eduard Hovy, Jiwei Li

Neural network models have achieved state-of-the-art performances in a wide range of natural language processing (NLP) tasks.

GNN-LM: Language Modeling based on Global Contexts via GNN

1 code implementation ICLR 2022 Yuxian Meng, Shi Zong, Xiaoya Li, Xiaofei Sun, Tianwei Zhang, Fei Wu, Jiwei Li

Inspired by the notion that "to copy is easier than to memorize", in this work, we introduce GNN-LM, which extends the vanilla neural language model (LM) by allowing it to reference similar contexts in the entire training corpus.

Language Modelling

Fingerprinting Multi-exit Deep Neural Network Models via Inference Time

no code implementations 7 Oct 2021 Tian Dong, Han Qiu, Tianwei Zhang, Jiwei Li, Hewu Li, Jialiang Lu

Specifically, we design an effective method to generate a set of fingerprint samples to craft the inference process with a unique and robust inference time cost as the evidence for model ownership.

BadPre: Task-agnostic Backdoor Attacks to Pre-trained NLP Foundation Models

no code implementations ICLR 2022 Kangjie Chen, Yuxian Meng, Xiaofei Sun, Shangwei Guo, Tianwei Zhang, Jiwei Li, Chun Fan

The key feature of our attack is that the adversary does not need prior information about the downstream tasks when implanting the backdoor to the pre-trained model.

Backdoor Attack Transfer Learning

A Novel Watermarking Framework for Ownership Verification of DNN Architectures

no code implementations 29 Sep 2021 Xiaoxuan Lou, Shangwei Guo, Tianwei Zhang, Jiwei Li, Yinqian Zhang, Yang Liu

We present a novel watermarking scheme to achieve the intellectual property (IP) protection and ownership verification of DNN architectures.

Model extraction Neural Architecture Search

NASPY: Automated Extraction of Automated Machine Learning Models

no code implementations ICLR 2022 Xiaoxuan Lou, Shangwei Guo, Jiwei Li, Yaoxin Wu, Tianwei Zhang

We present NASPY, an end-to-end adversarial framework to extract the network architecture of deep learning models from Neural Architecture Search (NAS).

BIG-bench Machine Learning Model extraction +1

Towards Robust Point Cloud Models with Context-Consistency Network and Adaptive Augmentation

no code implementations 29 Sep 2021 Guanlin Li, Guowen Xu, Han Qiu, Ruan He, Jiwei Li, Tianwei Zhang

Extensive evaluations indicate the integration of the two techniques provides much more robustness than existing defense solutions for 3D models.

Data Augmentation

OpenViDial 2.0: A Larger-Scale, Open-Domain Dialogue Generation Dataset with Visual Contexts

1 code implementation 27 Sep 2021 Shuhe Wang, Yuxian Meng, Xiaoya Li, Xiaofei Sun, Rongbin Ouyang, Jiwei Li

In order to better simulate the real human conversation process, models need to generate dialogue utterances based on not only preceding textual contexts but also visual contexts.

Multi-modal Dialogue Generation

An MRC Framework for Semantic Role Labeling

1 code implementation COLING 2022 Nan Wang, Jiwei Li, Yuxian Meng, Xiaofei Sun, Han Qiu, Ziyao Wang, Guoyin Wang, Jun He

We formalize predicate disambiguation as multiple-choice machine reading comprehension, where the descriptions of candidate senses of a given predicate are used as options to select the correct sense.

Machine Reading Comprehension Multiple-choice +1

Paraphrase Generation as Unsupervised Machine Translation

no code implementations COLING 2022 Xiaofei Sun, Yufei Tian, Yuxian Meng, Nanyun Peng, Fei Wu, Jiwei Li, Chun Fan

Then based on the paraphrase pairs produced by these UMT models, a unified surrogate model can be trained to serve as the final Seq2Seq model to generate paraphrases, which can be directly used for testing in the unsupervised setup, or be finetuned on labeled datasets in the supervised setup.

Paraphrase Generation STS +2

$k$Folden: $k$-Fold Ensemble for Out-Of-Distribution Detection

1 code implementation 29 Aug 2021 Xiaoya Li, Jiwei Li, Xiaofei Sun, Chun Fan, Tianwei Zhang, Fei Wu, Yuxian Meng, Jun Zhang

For a task with $k$ training labels, $k$Folden induces $k$ sub-models, each of which is trained on a subset with $k-1$ categories with the left category masked unknown to the sub-model.

Out-of-Distribution Detection Out of Distribution (OOD) Detection +2
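
The data split described in the $k$Folden abstract above (one sub-model per label, each trained with that label's category held out) can be sketched in Python. This covers only the construction of the $k$ training subsets; the function name and the toy examples are hypothetical, and how the sub-models are trained or combined is not shown.

```python
from typing import Dict, List, Tuple

Example = Tuple[str, str]  # (text, label)


def leave_one_label_out(examples: List[Example]) -> Dict[str, List[Example]]:
    """Build one training subset per label, each omitting that label's examples.

    For k distinct labels this returns k subsets, each covering k-1 categories.
    """
    labels = sorted({label for _, label in examples})
    return {
        held_out: [(text, label) for text, label in examples if label != held_out]
        for held_out in labels
    }


# Hypothetical toy dataset with k = 3 labels.
examples = [
    ("great movie", "positive"),
    ("terrible plot", "negative"),
    ("it was fine", "neutral"),
    ("loved it", "positive"),
]
subsets = leave_one_label_out(examples)
```

Each key of `subsets` names the category that the corresponding sub-model never sees during training.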

Layer-wise Model Pruning based on Mutual Information

no code implementations EMNLP 2021 Chun Fan, Jiwei Li, Xiang Ao, Fei Wu, Yuxian Meng, Xiaofei Sun

The proposed pruning strategy offers merits over weight-based pruning techniques: (1) it avoids irregular memory access since representations and matrices can be squeezed into their smaller but dense counterparts, leading to greater speedup; (2) in a manner of top-down pruning, the proposed method operates from a more global perspective based on training signals in the top layer, and prunes each layer by propagating the effect of global signals through layers, leading to better performances at the same sparsity level.

ChineseBERT: Chinese Pretraining Enhanced by Glyph and Pinyin Information

3 code implementations ACL 2021 Zijun Sun, Xiaoya Li, Xiaofei Sun, Yuxian Meng, Xiang Ao, Qing He, Fei Wu, Jiwei Li

Recent pretraining models in Chinese neglect two important aspects specific to the Chinese language: glyph and pinyin, which carry significant syntax and semantic information for language understanding.

Language Modelling Machine Reading Comprehension +4

Fingerprinting Generative Adversarial Networks

no code implementations 19 Jun 2021 Guanlin Li, Guowen Xu, Han Qiu, Shangwei Guo, Run Wang, Jiwei Li, Tianwei Zhang, Rongxing Lu

In this paper, we present the first fingerprinting scheme for the Intellectual Property (IP) protection of GANs.

Defending Against Backdoor Attacks in Natural Language Generation

1 code implementation 3 Jun 2021 Xiaofei Sun, Xiaoya Li, Yuxian Meng, Xiang Ao, Lingjuan Lyu, Jiwei Li, Tianwei Zhang

The frustratingly fragile nature of neural network models makes current natural language generation (NLG) systems prone to backdoor attacks and to generating malicious sequences that could be sexist or offensive.

Backdoor Attack Dialogue Generation +2

Parameter Estimation for the SEIR Model Using Recurrent Nets

no code implementations 30 May 2021 Chun Fan, Yuxian Meng, Xiaofei Sun, Fei Wu, Tianwei Zhang, Jiwei Li

Next, based on this recurrent net that is able to generalize SEIR simulations, we are able to transform the objective to a differentiable one with respect to $\Theta_\text{SEIR}$, and straightforwardly obtain its optimal value.

Modeling Text-visual Mutual Dependency for Multi-modal Dialog Generation

1 code implementation 30 May 2021 Shuhe Wang, Yuxian Meng, Xiaofei Sun, Fei Wu, Rongbin Ouyang, Rui Yan, Tianwei Zhang, Jiwei Li

Specifically, we propose to model the mutual dependency between text-visual features, where the model not only needs to learn the probability of generating the next dialog utterance given preceding dialog utterances and visual contexts, but also the probability of predicting the visual features in which a dialog utterance takes place, leading the generated dialog utterance specific to the visual context.

Fast Nearest Neighbor Machine Translation

1 code implementation Findings (ACL) 2022 Yuxian Meng, Xiaoya Li, Xiayu Zheng, Fei Wu, Xiaofei Sun, Tianwei Zhang, Jiwei Li

Fast $k$NN-MT constructs a significantly smaller datastore for the nearest neighbor search: for each word in a source sentence, Fast $k$NN-MT first selects its nearest token-level neighbors, which is limited to tokens that are the same as the query token.

Machine Translation NMT +1

Sentence Similarity Based on Contexts

no code implementations 17 May 2021 Xiaofei Sun, Yuxian Meng, Xiang Ao, Fei Wu, Tianwei Zhang, Jiwei Li, Chun Fan

The proposed framework is based on the core idea that the meaning of a sentence should be defined by its contexts, and that sentence similarity can be measured by comparing the probabilities of generating two sentences given the same context.

Language Modelling Semantic Similarity +2

Dependency Parsing as MRC-based Span-Span Prediction

2 code implementations ACL 2022 Leilei Gan, Yuxian Meng, Kun Kuang, Xiaofei Sun, Chun Fan, Fei Wu, Jiwei Li

The proposed method has the following merits: (1) it addresses the fundamental problem that edges in a dependency tree should be constructed between subtrees; (2) the MRC framework allows the method to retrieve missing spans in the span proposal stage, which leads to higher recall for eligible spans.

Dependency Parsing Machine Reading Comprehension

BertGCN: Transductive Text Classification by Combining GCN and BERT

1 code implementation 12 May 2021 Yuxiao Lin, Yuxian Meng, Xiaofei Sun, Qinghong Han, Kun Kuang, Jiwei Li, Fei Wu

In this work, we propose BertGCN, a model that combines large scale pretraining and transductive learning for text classification.

text-classification Text Classification +1

Complex Spectral Mapping With Attention Based Convolution Recurrent Neural Network for Speech Enhancement

no code implementations 12 Apr 2021 Liming Zhou, Yongyu Gao, Ziluo Wang, Jiwei Li, Wenbin Zhang

Speech enhancement has benefited from the success of deep learning in terms of intelligibility and perceptual quality.

Speech Enhancement

OpenViDial: A Large-Scale, Open-Domain Dialogue Dataset with Visual Contexts

1 code implementation 30 Dec 2020 Yuxian Meng, Shuhe Wang, Qinghong Han, Xiaofei Sun, Fei Wu, Rui Yan, Jiwei Li

Based on this dataset, we propose a family of encoder-decoder models leveraging both textual and visual contexts, from coarse-grained image features extracted from CNNs to fine-grained object features extracted from Faster R-CNNs.

Dialogue Generation

Self-Explaining Structures Improve NLP Models

1 code implementation 3 Dec 2020 Zijun Sun, Chun Fan, Qinghong Han, Xiaofei Sun, Yuxian Meng, Fei Wu, Jiwei Li

The proposed model comes with the following merits: (1) span weights make the model self-explainable and do not require an additional probing model for interpretation; (2) the proposed model is general and can be adapted to any existing deep learning structures in NLP; (3) the weight associated with each text span provides direct importance scores for higher-level text units such as phrases and sentences.

Natural Language Inference Paraphrase Identification +1

Neural Semi-supervised Learning for Text Classification Under Large-Scale Pretraining

1 code implementation 17 Nov 2020 Zijun Sun, Chun Fan, Xiaofei Sun, Yuxian Meng, Fei Wu, Jiwei Li

The goal of semi-supervised learning is to utilize the unlabeled, in-domain dataset U to improve models trained on the labeled dataset D. Under the context of large-scale language-model (LM) pretraining, how we can make the best use of U is poorly understood: is semi-supervised learning still beneficial with the presence of large-scale pretraining?

General Classification Language Modelling +3

Summarize, Outline, and Elaborate: Long-Text Generation via Hierarchical Supervision from Extractive Summaries

no code implementations COLING 2022 Xiaofei Sun, Zijun Sun, Yuxian Meng, Jiwei Li, Chun Fan

The difficulty of generating coherent long texts lies in the fact that existing models overwhelmingly focus on predicting local words, and cannot make high level plans on what to generate or capture the high-level discourse dependencies between chunks of texts.

Text Generation

Pair the Dots: Jointly Examining Training History and Test Stimuli for Model Interpretability

no code implementations 14 Oct 2020 Yuxian Meng, Chun Fan, Zijun Sun, Eduard Hovy, Fei Wu, Jiwei Li

Any prediction from a model is made by a combination of learning history and test stimuli.

Improving Robustness and Generality of NLP Models Using Disentangled Representations

no code implementations 21 Sep 2020 Jiawei Wu, Xiaoya Li, Xiang Ao, Yuxian Meng, Fei Wu, Jiwei Li

We show that models trained with the proposed criteria provide better robustness and domain adaptation ability in a wide range of supervised learning tasks.

Domain Adaptation Representation Learning

Receptive Multi-granularity Representation for Person Re-Identification

no code implementations 31 Aug 2020 Guanshuo Wang, Yufeng Yuan, Jiwei Li, Shiming Ge, Xi Zhou

Current stripe-based feature learning approaches have delivered impressive accuracy, but do not make a proper trade-off between diversity, locality, and robustness, which easily suffers from part semantic inconsistency for the conflict between rigid partition and misalignment.

Person Re-Identification

CorefQA: Coreference Resolution as Query-based Span Prediction

1 code implementation ACL 2020 Wei Wu, Fei Wang, Arianna Yuan, Fei Wu, Jiwei Li

In this paper, we present CorefQA, an accurate and extensible approach for the coreference resolution task.

Ranked #2 on Coreference Resolution on CoNLL 2012 (using extra training data)

coreference-resolution Data Augmentation +1

Analyzing COVID-19 on Online Social Media: Trends, Sentiments and Emotions

no code implementations 29 May 2020 Xiaoya Li, Mingxin Zhou, Jiawei Wu, Arianna Yuan, Fei Wu, Jiwei Li

At the time of writing, the ongoing pandemic of coronavirus disease (COVID-19) has caused severe impacts on society, economy and people's daily lives.

Non-Autoregressive Neural Dialogue Generation

no code implementations 11 Feb 2020 Qinghong Han, Yuxian Meng, Fei Wu, Jiwei Li

Unfortunately, under the framework of the Seq2Seq model, direct decoding from $\log p(y|x) + \log p(x|y)$ is infeasible since the second part (i.e., $p(x|y)$) requires the completion of target generation before it can be computed, and the search space for $y$ is enormous.

Dialogue Generation Open-Domain Dialog +1
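
The score $\log p(y|x) + \log p(x|y)$ mentioned in the abstract above needs a finished target $y$ before its second term can be evaluated. For a fixed set of already completed candidates it is straightforward to compute, as in this minimal Python sketch. The candidate responses and their log-probabilities are made-up numbers for illustration, not model outputs, and the field names are assumptions.

```python
from typing import Dict, List, Union

Candidate = Dict[str, Union[str, float]]


def bidirectional_score(candidate: Candidate) -> float:
    """Sum of forward log p(y|x) and backward log p(x|y) for one finished y."""
    return candidate["log_p_y_given_x"] + candidate["log_p_x_given_y"]


def pick_best(candidates: List[Candidate]) -> Candidate:
    """Choose the completed candidate with the highest combined score."""
    return max(candidates, key=bidirectional_score)


# Hypothetical completed responses with invented log-probabilities.
candidates = [
    # Generic reply: likely under p(y|x) but says little about the source x.
    {"text": "i don't know", "log_p_y_given_x": -1.0, "log_p_x_given_y": -9.0},
    # Specific reply: less likely forward, but the source is easy to recover.
    {"text": "the station is two blocks north",
     "log_p_y_given_x": -3.0, "log_p_x_given_y": -2.0},
]
best = pick_best(candidates)
```

In this toy setup the combined score prefers the more specific reply, even though the generic one has the higher forward probability.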

LAVA NAT: A Non-Autoregressive Translation Model with Look-Around Decoding and Vocabulary Attention

no code implementations 8 Feb 2020 Xiaoya Li, Yuxian Meng, Arianna Yuan, Fei Wu, Jiwei Li

Non-autoregressive translation (NAT) models generate multiple tokens in one forward pass and are highly efficient at the inference stage compared with autoregressive translation (AT) methods.


Description Based Text Classification with Reinforcement Learning

no code implementations ICML 2020 Duo Chai, Wei Wu, Qinghong Han, Fei Wu, Jiwei Li

We observe significant performance boosts over strong baselines on a wide range of text classification tasks including single-label classification, multi-label classification and multi-aspect sentiment analysis.

General Classification Multi-Label Classification +6

Teaching Machines to Converse

1 code implementation 31 Jan 2020 Jiwei Li

The ability of a machine to communicate with humans has long been associated with the general success of AI.

Dialogue Generation Question Answering

Dice Loss for Data-imbalanced NLP Tasks

2 code implementations ACL 2020 Xiaoya Li, Xiaofei Sun, Yuxian Meng, Junjun Liang, Fei Wu, Jiwei Li

Many NLP tasks such as tagging and machine reading comprehension are faced with the severe data imbalance issue: negative examples significantly outnumber positive examples, and the huge number of background examples (or easy-negative examples) overwhelms the training.

Ranked #1 on Chinese Named Entity Recognition on OntoNotes 4 (using extra training data)

Chinese Named Entity Recognition Machine Reading Comprehension +5

Coreference Resolution as Query-based Span Prediction

1 code implementation 5 Nov 2019 Wei Wu, Fei Wang, Arianna Yuan, Fei Wu, Jiwei Li

In this paper, we present an accurate and extensible approach for the coreference resolution task.

coreference-resolution Data Augmentation +1

A Unified MRC Framework for Named Entity Recognition

8 code implementations ACL 2020 Xiaoya Li, Jingrong Feng, Yuxian Meng, Qinghong Han, Fei Wu, Jiwei Li

Instead of treating the task of NER as a sequence labeling problem, we propose to formulate it as a machine reading comprehension (MRC) task.

Ranked #2 on Nested Mention Recognition on ACE 2004 (using extra training data)

Chinese Named Entity Recognition Entity Extraction using GAN +4

Large-scale Pretraining for Neural Machine Translation with Tens of Billions of Sentence Pairs

no code implementations 26 Sep 2019 Yuxian Meng, Xiangyuan Ren, Zijun Sun, Xiaoya Li, Arianna Yuan, Fei Wu, Jiwei Li

In this paper, we investigate the problem of training neural machine translation (NMT) systems with a dataset of more than 40 billion bilingual sentence pairs, which is larger than the largest dataset to date by orders of magnitude.

Machine Translation NMT +1

Relation-Aware Pyramid Network (RapNet) for temporal action proposal

no code implementations 9 Aug 2019 Jialin Gao, Zhixiang Shi, Jiani Li, Yufeng Yuan, Jiwei Li, Xi Zhou

In this technical report, we describe our solution to temporal action proposal (task 1) in ActivityNet Challenge 2019.

Deep Adversarial Learning for NLP

no code implementations NAACL 2019 William Yang Wang, Sameer Singh, Jiwei Li

Adversarial learning is a game-theoretic learning paradigm, which has achieved huge successes in the field of Computer Vision recently.

DSReg: Using Distant Supervision as a Regularizer

no code implementations ICLR 2020 Yuxian Meng, Muyu Li, Xiaoya Li, Wei Wu, Jiwei Li

In this paper, we aim at tackling a general issue in NLP tasks where some of the negative examples are highly similar to the positive examples, i.e., hard-negative examples.

Multi-Task Learning Reading Comprehension +2

Is Word Segmentation Necessary for Deep Learning of Chinese Representations?

no code implementations ACL 2019 Xiaoya Li, Yuxian Meng, Xiaofei Sun, Qinghong Han, Arianna Yuan, Jiwei Li

Based on these observations, we conduct comprehensive experiments to study why word-based models underperform char-based models in these deep learning-based NLP tasks.

Chinese Word Segmentation Language Modelling +4

DenseBody: Directly Regressing Dense 3D Human Pose and Shape From a Single Color Image

1 code implementation 25 Mar 2019 Pengfei Yao, Zheng Fang, Fan Wu, Yao Feng, Jiwei Li

Recovering 3D human body shape and pose from 2D images is a challenging task due to high complexity and flexibility of human body, and relatively less 3D labeled data.

Glyce: Glyph-vectors for Chinese Character Representations

2 code implementations NeurIPS 2019 Yuxian Meng, Wei Wu, Fei Wang, Xiaoya Li, Ping Nie, Fan Yin, Muyu Li, Qinghong Han, Xiaofei Sun, Jiwei Li

However, due to the lack of rich pictographic evidence in glyphs and the weak generalization ability of standard computer vision models on character data, an effective way to utilize the glyph information remains to be found.

Chinese Dependency Parsing Chinese Named Entity Recognition +20

Pixel-Anchor: A Fast Oriented Scene Text Detector with Combined Networks

no code implementations 19 Nov 2018 Yuan Li, Yuanjie Yu, Zefeng Li, Yangkun Lin, Meifang Xu, Jiwei Li, Xi Zhou

Recently, semantic segmentation and general object detection frameworks have been widely adopted by scene text detecting tasks.

object-detection Object Detection +1

Cascaded CNN-resBiLSTM-CTC: An End-to-End Acoustic Model For Speech Recognition

no code implementations 29 Oct 2018 Xinpei Zhou, Jiwei Li, Xi Zhou

Automatic speech recognition (ASR) tasks are resolved by end-to-end deep learning models, which benefits us by less preparation of raw data, and easier transformation between languages.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

A novel pyramidal-FSMN architecture with lattice-free MMI for speech recognition

no code implementations 26 Oct 2018 Xuerui Yang, Jiwei Li, Xi Zhou

Deep Feedforward Sequential Memory Network (DFSMN) has shown superior performance on speech recognition tasks.

Sound Audio and Speech Processing

Deep Reinforcement Learning for NLP

no code implementations ACL 2018 William Yang Wang, Jiwei Li, Xiaodong He

Many Natural Language Processing (NLP) tasks (including generation, language grounding, reasoning, information extraction, coreference resolution, and dialog) can be formulated as deep reinforcement learning (DRL) problems.

Atari Games coreference-resolution +7

Learning Discriminative Features with Multiple Granularities for Person Re-Identification

15 code implementations 4 Apr 2018 Guanshuo Wang, Yufeng Yuan, Xiong Chen, Jiwei Li, Xi Zhou

Instead of learning on semantic regions, we uniformly partition the images into several stripes, and vary the number of parts in different local branches to obtain local feature representations with multiple granularities.

Ranked #3 on Person Re-Identification on SYSU-30k (using extra training data)

Person Re-Identification Re-Ranking
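
The uniform stripe partition described in the abstract above can be sketched in Python on a feature map stored as a list of rows. The part counts per branch (2 and 3) and the 12-row map are assumptions chosen for illustration; the abstract only states that the number of parts varies across local branches.

```python
from typing import List, Sequence


def split_into_stripes(rows: Sequence, num_parts: int) -> List[Sequence]:
    """Split rows (top to bottom) into num_parts equal horizontal stripes."""
    if num_parts <= 0 or len(rows) % num_parts != 0:
        raise ValueError("row count must be a positive multiple of num_parts")
    size = len(rows) // num_parts
    return [rows[i * size:(i + 1) * size] for i in range(num_parts)]


# Hypothetical feature map: 12 rows, each a small feature vector.
feature_map = [[float(r)] * 4 for r in range(12)]

# Assumed local branches with different granularities.
branch_parts = (2, 3)
stripes_per_branch = {
    parts: split_into_stripes(feature_map, parts) for parts in branch_parts
}
```

Each branch sees the same rows, grouped at a different granularity, so a coarse branch keeps more context per stripe while a fine branch isolates smaller regions.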

Neural Net Models of Open-domain Discourse Coherence

no code implementations EMNLP 2017 Jiwei Li, Dan Jurafsky

In this paper, we describe domain-independent neural models of discourse coherence that are capable of measuring multiple aspects of coherence in existing sentences and can maintain coherence while generating new sentences.

Abstractive Text Summarization Question Answering +2

Data Distillation for Controlling Specificity in Dialogue Generation

no code implementations 22 Feb 2017 Jiwei Li, Will Monroe, Dan Jurafsky

We show that from such a set of subsystems, one can use reinforcement learning to build a system that tailors its output to different input contexts at test time.

Dialogue Generation reinforcement-learning +2

Learning to Decode for Future Success

no code implementations 23 Jan 2017 Jiwei Li, Will Monroe, Dan Jurafsky

We introduce a simple, general strategy to manipulate the behavior of a neural decoder that enables it to generate outputs that have specific properties of interest (e.g., sequences of a pre-specified length).

Abstractive Text Summarization Decision Making +2

Adversarial Learning for Neural Dialogue Generation

8 code implementations EMNLP 2017 Jiwei Li, Will Monroe, Tianlin Shi, Sébastien Jean, Alan Ritter, Dan Jurafsky

In this paper, drawing intuition from the Turing test, we propose using adversarial training for open-domain dialogue generation: the system is trained to produce sequences that are indistinguishable from human-generated dialogue utterances.

Dialogue Evaluation Dialogue Generation +1

Understanding Neural Networks through Representation Erasure

no code implementations 24 Dec 2016 Jiwei Li, Will Monroe, Dan Jurafsky

While neural networks have been successfully applied to many natural language processing tasks, they come at the cost of interpretability.

Sentiment Analysis

Learning through Dialogue Interactions by Asking Questions

2 code implementations 15 Dec 2016 Jiwei Li, Alexander H. Miller, Sumit Chopra, Marc'Aurelio Ranzato, Jason Weston

A good dialogue agent should have the ability to interact with users by both responding to questions and by asking questions, and importantly to learn from both types of interaction.

reinforcement-learning Reinforcement Learning (RL)

Dialogue Learning With Human-In-The-Loop

2 code implementations 29 Nov 2016 Jiwei Li, Alexander H. Miller, Sumit Chopra, Marc'Aurelio Ranzato, Jason Weston

An important aspect of developing conversational agents is to give a bot the ability to improve through communicating with humans and to learn from the mistakes that it makes.

Question Answering reinforcement-learning +1

A Simple, Fast Diverse Decoding Algorithm for Neural Generation

1 code implementation 25 Nov 2016 Jiwei Li, Will Monroe, Dan Jurafsky

We further propose a variation that is capable of automatically adjusting its diversity decoding rates for different inputs using reinforcement learning (RL).

Abstractive Text Summarization Machine Translation +3

Deep Reinforcement Learning for Dialogue Generation

8 code implementations EMNLP 2016 Jiwei Li, Will Monroe, Alan Ritter, Michel Galley, Jianfeng Gao, Dan Jurafsky

Recent neural models of dialogue generation offer great promise for generating responses for conversational agents, but tend to be shortsighted, predicting utterances one at a time while ignoring their influence on future outcomes.

Dialogue Generation Policy Gradient Methods +2

Neural Net Models for Open-Domain Discourse Coherence

1 code implementation 5 Jun 2016 Jiwei Li, Dan Jurafsky

In this paper, we describe domain-independent neural models of discourse coherence that are capable of measuring multiple aspects of coherence in existing sentences and can maintain coherence while generating new sentences.

Text Generation

A Persona-Based Neural Conversation Model

1 code implementation ACL 2016 Jiwei Li, Michel Galley, Chris Brockett, Georgios P. Spithourakis, Jianfeng Gao, Bill Dolan

We present persona-based models for handling the issue of speaker consistency in neural response generation.

Response Generation

Mutual Information and Diverse Decoding Improve Neural Machine Translation

1 code implementation 4 Jan 2016 Jiwei Li, Dan Jurafsky

We introduce an alternative objective function for neural MT that maximizes the mutual information between the source and target sentences, modeling the bi-directional dependency of sources and targets.

Machine Translation Re-Ranking +1

Learning multi-faceted representations of individuals from heterogeneous evidence using neural networks

no code implementations 18 Oct 2015 Jiwei Li, Alan Ritter, Dan Jurafsky

Inferring latent attributes of people online is an important social computing task, but requires integrating the many heterogeneous sources of information available on the web.

Community Detection Link Prediction

A Diversity-Promoting Objective Function for Neural Conversation Models

15 code implementations NAACL 2016 Jiwei Li, Michel Galley, Chris Brockett, Jianfeng Gao, Bill Dolan

Sequence-to-sequence neural network models for generation of conversational responses tend to generate safe, commonplace responses (e.g., "I don't know") regardless of the input.

Conversational Response Generation Response Generation

Reflections on Sentiment/Opinion Analysis

no code implementations 6 Jul 2015 Jiwei Li, Eduard Hovy

In this paper, we describe possible directions for deeper understanding, helping bridge the gap between psychology/cognitive science and computational approaches in the sentiment/opinion analysis literature.

A Hierarchical Neural Autoencoder for Paragraphs and Documents

6 code implementations IJCNLP 2015 Jiwei Li, Minh-Thang Luong, Dan Jurafsky

Natural language generation of coherent long texts like paragraphs or longer documents is a challenging problem for recurrent network models.

Text Generation

Do Multi-Sense Embeddings Improve Natural Language Understanding?

no code implementations EMNLP 2015 Jiwei Li, Dan Jurafsky

Learning a distinct representation for each sense of an ambiguous word could lead to more powerful and fine-grained models of vector-space representations.

named-entity-recognition Named Entity Recognition +5

Visualizing and Understanding Neural Models in NLP

1 code implementation NAACL 2016 Jiwei Li, Xinlei Chen, Eduard Hovy, Dan Jurafsky

While neural networks have been successfully applied to many NLP tasks, the resulting vector-based models are very difficult to interpret.

The NLP Engine: A Universal Turing Machine for NLP

no code implementations 28 Feb 2015 Jiwei Li, Eduard Hovy

It is commonly accepted that machine translation is a more complex task than part-of-speech tagging.

Machine Translation Part-Of-Speech Tagging +1

When Are Tree Structures Necessary for Deep Learning of Representations?

no code implementations EMNLP 2015 Jiwei Li, Minh-Thang Luong, Dan Jurafsky, Eduard Hovy

Recursive neural models, which use syntactic parse trees to recursively generate representations bottom-up, are a popular architecture.

Discourse Parsing Relation Extraction +2

Feature Weight Tuning for Recursive Neural Networks

no code implementations 11 Dec 2014 Jiwei Li

This paper addresses how a recursive neural network model can automatically leave out useless information and emphasize important evidence, in other words, how it can perform "weight tuning" for higher-level representation acquisition.

Inferring User Preferences by Probabilistic Logical Reasoning over Social Networks

no code implementations 11 Nov 2014 Jiwei Li, Alan Ritter, Dan Jurafsky

by building a probabilistic model that reasons over user attributes (the user's location or gender) and the social network (the user's friends and spouse), via inferences like homophily (I am more likely to like sushi if spouse or friends like sushi, I am more likely to like the Knicks if I live in New York).

Logical Reasoning Relation Extraction

What a Nasty day: Exploring Mood-Weather Relationship from Twitter

no code implementations 30 Oct 2014 Jiwei Li, Xun Wang, Eduard Hovy

While it has long been believed in psychology that weather somehow influences human mood, debate has gone on for decades about how the two are correlated.

Early Stage Influenza Detection from Twitter

no code implementations 27 Sep 2013 Jiwei Li, Claire Cardie

In this paper, we investigate the real-time flu detection problem on Twitter data by proposing Flu Markov Network (Flu-MN): a spatio-temporal unsupervised Bayesian algorithm based on a 4-phase Markov Network that tries to identify a flu outbreak at the earliest stage.

A Novel Feature-based Bayesian Model for Query Focused Multi-document Summarization

no code implementations TACL 2013 Jiwei Li, Sujian Li

Both supervised learning methods and LDA-based topic models have been successfully applied in the field of query-focused multi-document summarization.

Document Summarization Multi-Document Summarization +1
