Search Results for author: Shuming Shi

Found 94 papers, 24 papers with code

Learning from Sibling Mentions with Scalable Graph Inference in Fine-Grained Entity Typing

no code implementations ACL 2022 Yi Chen, Jiayang Cheng, Haiyun Jiang, Lemao Liu, Haisong Zhang, Shuming Shi, Ruifeng Xu

In this paper, we first empirically find that existing models struggle to handle hard mentions due to their insufficient contexts, which consequently limits their overall typing performance.

Entity Typing

On the Relationship between Neural Machine Translation and Word Alignment

no code implementations Xintong Li, Lemao Liu, Guanlin Li, Max Meng, Shuming Shi

We find that although NMT models struggle to capture word alignment for CFT words, these words do not significantly sacrifice translation quality, which explains why NMT is more successful at translation yet worse at word alignment compared to statistical machine translation.

Machine Translation Translation +1

Redistributing Low-Frequency Words: Making the Most of Monolingual Data in Non-Autoregressive Translation

no code implementations ACL 2022 Liang Ding, Longyue Wang, Shuming Shi, DaCheng Tao, Zhaopeng Tu

In this work, we provide an appealing alternative for NAT: monolingual KD, which trains the NAT student on external monolingual data with an AT teacher trained on the original bilingual data.

Knowledge Distillation Translation +1

Tencent AI Lab Machine Translation Systems for the WMT21 Biomedical Translation Task

no code implementations WMT (EMNLP) 2021 Xing Wang, Zhaopeng Tu, Shuming Shi

This paper describes the Tencent AI Lab submission of the WMT2021 shared task on biomedical translation in eight language directions: English-German, English-French, English-Spanish and English-Russian.

Machine Translation Translation

Tencent Translation System for the WMT21 News Translation Task

no code implementations WMT (EMNLP) 2021 Longyue Wang, Mu Li, Fangxu Liu, Shuming Shi, Zhaopeng Tu, Xing Wang, Shuangzhi Wu, Jiali Zeng, Wen Zhang

Based on our success in the last WMT, we continuously employed advanced techniques such as large batch training, data selection and data filtering.

Data Augmentation Translation

Tencent AI Lab Machine Translation Systems for the WMT20 Biomedical Translation Task

1 code implementation WMT (EMNLP) 2020 Xing Wang, Zhaopeng Tu, Longyue Wang, Shuming Shi

This paper describes the Tencent AI Lab submission of the WMT2020 shared task on biomedical translation in four language directions: German->English, English->German, Chinese->English and English->Chinese.

Machine Translation Translation

Fine-grained Entity Typing without Knowledge Base

1 code implementation EMNLP 2021 Jing Qian, Yibin Liu, Lemao Liu, Yangming Li, Haiyun Jiang, Haisong Zhang, Shuming Shi

Existing work on Fine-grained Entity Typing (FET) typically trains automatic models on the datasets obtained by using Knowledge Bases (KB) as distant supervision.

Entity Typing Named Entity Recognition +1

An Empirical Study on Multiple Information Sources for Zero-Shot Fine-Grained Entity Typing

no code implementations EMNLP 2021 Yi Chen, Haiyun Jiang, Lemao Liu, Shuming Shi, Chuang Fan, Min Yang, Ruifeng Xu

Auxiliary information from multiple sources has been demonstrated to be effective in zero-shot fine-grained entity typing (ZFET).

Entity Typing

One Model, Multiple Modalities: A Sparsely Activated Approach for Text, Sound, Image, Video and Code

no code implementations 12 May 2022 Yong Dai, Duyu Tang, Liangxin Liu, Minghuan Tan, Cong Zhou, Jingquan Wang, Zhangyin Feng, Fan Zhang, Xueyu Hu, Shuming Shi

Moreover, our model supports self-supervised pretraining with the same sparsely activated way, resulting in better initialized parameters for different modalities.

Image Retrieval Text-to-Image Retrieval

Pretraining Chinese BERT for Detecting Word Insertion and Deletion Errors

no code implementations 26 Apr 2022 Cong Zhou, Yong Dai, Duyu Tang, Enbo Zhao, Zhangyin Feng, Li Kuang, Shuming Shi

We achieve this by introducing a special token [null], the prediction of which stands for the non-existence of a word.

Language Modelling Masked Language Modeling

SkillNet-NLG: General-Purpose Natural Language Generation with a Sparsely Activated Approach

no code implementations 26 Apr 2022 Junwei Liao, Duyu Tang, Fan Zhang, Shuming Shi

We present SkillNet-NLG, a sparsely activated approach that handles many natural language generation tasks with one model.

Multi-Task Learning Text Generation

A Model-Agnostic Data Manipulation Method for Persona-based Dialogue Generation

1 code implementation ACL 2022 Yu Cao, Wei Bi, Meng Fang, Shuming Shi, DaCheng Tao

To alleviate the above data issues, we propose a data manipulation method, which is model-agnostic to be packed with any persona-based dialogue generation model to improve its performance.

Dialogue Generation

Understanding and Improving Sequence-to-Sequence Pretraining for Neural Machine Translation

no code implementations ACL 2022 Wenxuan Wang, Wenxiang Jiao, Yongchang Hao, Xing Wang, Shuming Shi, Zhaopeng Tu, Michael Lyu

In this paper, we present a substantial step in better understanding the SOTA sequence-to-sequence (Seq2Seq) pretraining for neural machine translation (NMT).

Machine Translation Translation

Bridging the Data Gap between Training and Inference for Unsupervised Neural Machine Translation

1 code implementation ACL 2022 Zhiwei He, Xing Wang, Rui Wang, Shuming Shi, Zhaopeng Tu

By carefully designing experiments, we identify two representative characteristics of the data gap in source: (1) style gap (i.e., translated vs. natural text style) that leads to poor generalization capability; (2) content gap that induces the model to produce hallucination content biased towards the target language.

Machine Translation Translation

MarkBERT: Marking Word Boundaries Improves Chinese BERT

no code implementations 12 Mar 2022 Linyang Li, Yong Dai, Duyu Tang, Zhangyin Feng, Cong Zhou, Xipeng Qiu, Zenglin Xu, Shuming Shi

MarkBERT pushes the state-of-the-art of Chinese named entity recognition from 95.4% to 96.5% on the MSRA dataset and from 82.8% to 84.2% on the OntoNotes dataset, respectively.

Chinese Named Entity Recognition POS +4

Efficient Sub-structured Knowledge Distillation

1 code implementation 9 Mar 2022 Wenye Lin, Yangming Li, Lemao Liu, Shuming Shi, Hai-Tao Zheng

Specifically, we transfer the knowledge from a teacher model to its student model by locally matching their predictions on all sub-structures, instead of the whole output space.

Knowledge Distillation Structured Prediction

Revisiting the Evaluation Metrics of Paraphrase Generation

no code implementations 17 Feb 2022 Lingfeng Shen, Haiyun Jiang, Lemao Liu, Shuming Shi

(2) reference-free metrics outperform reference-based metrics, indicating that the standard references are unnecessary to evaluate the paraphrase's quality.

Machine Translation Paraphrase Generation

Rethink the Evaluation for Attack Strength of Backdoor Attacks in Natural Language Processing

no code implementations 9 Jan 2022 Lingfeng Shen, Haiyun Jiang, Lemao Liu, Shuming Shi

It has been shown that natural language processing (NLP) models are vulnerable to a kind of security threat called the Backdoor Attack, which utilizes a 'backdoor trigger' paradigm to mislead the models.

Backdoor Attack Text Classification

On the Complementarity between Pre-Training and Back-Translation for Neural Machine Translation

1 code implementation Findings (EMNLP) 2021 Xuebo Liu, Longyue Wang, Derek F. Wong, Liang Ding, Lidia S. Chao, Shuming Shi, Zhaopeng Tu

Pre-training (PT) and back-translation (BT) are two simple and powerful methods to utilize monolingual data for improving the model performance of neural machine translation (NMT).

Machine Translation Translation

On the Language Coverage Bias for Neural Machine Translation

no code implementations Findings (ACL) 2021 Shuo Wang, Zhaopeng Tu, Zhixing Tan, Shuming Shi, Maosong Sun, Yang Liu

Language coverage bias, which indicates the content-dependent differences between sentence pairs originating from the source and target languages, is important for neural machine translation (NMT) because the target-original training data is not well exploited in current practice.

Data Augmentation Machine Translation +1

Tail-to-Tail Non-Autoregressive Sequence Prediction for Chinese Grammatical Error Correction

1 code implementation ACL 2021 Piji Li, Shuming Shi

We investigate the problem of Chinese Grammatical Error Correction (CGEC) and present a new framework named Tail-to-Tail (TtT) non-autoregressive sequence prediction to address the deep issues hidden in CGEC.

Grammatical Error Correction

Self-Training Sampling with Monolingual Data Uncertainty for Neural Machine Translation

1 code implementation ACL 2021 Wenxiang Jiao, Xing Wang, Zhaopeng Tu, Shuming Shi, Michael R. Lyu, Irwin King

In this work, we propose to improve the sampling procedure by selecting the most informative monolingual sentences to complement the parallel data.

Machine Translation Translation

GWLAN: General Word-Level AutocompletioN for Computer-Aided Translation

no code implementations ACL 2021 Huayang Li, Lemao Liu, Guoping Huang, Shuming Shi

In this paper, we propose the task of general word-level autocompletion (GWLAN) from a real-world CAT scenario, and construct the first public benchmark to facilitate research in this topic.

Translation

REAM♯: An Enhancement Approach to Reference-based Evaluation Metrics for Open-domain Dialog Generation

no code implementations 30 May 2021 Jun Gao, Wei Bi, Ruifeng Xu, Shuming Shi

We first clarify an assumption on reference-based metrics that, if more high-quality references are added into the reference set, the reliability of the metric will increase.

Open-Domain Dialog

TexSmart: A Text Understanding System for Fine-Grained NER and Enhanced Semantic Analysis

no code implementations 31 Dec 2020 Haisong Zhang, Lemao Liu, Haiyun Jiang, Yangming Li, Enbo Zhao, Kun Xu, Linfeng Song, Suncong Zheng, Botong Zhou, Jianchen Zhu, Xiao Feng, Tao Chen, Tao Yang, Dong Yu, Feng Zhang, Zhanhui Kang, Shuming Shi

This technical report introduces TexSmart, a text understanding system that supports fine-grained named entity recognition (NER) and enhanced semantic analysis functionalities.

Named Entity Recognition NER

Dialogue Response Selection with Hierarchical Curriculum Learning

1 code implementation ACL 2021 Yixuan Su, Deng Cai, Qingyu Zhou, Zibo Lin, Simon Baker, Yunbo Cao, Shuming Shi, Nigel Collier, Yan Wang

As for IC, it progressively strengthens the model's ability in identifying the mismatching information between the dialogue context and a response candidate.

Conversational Response Selection

Predicting Events in MOBA Games: Prediction, Attribution, and Evaluation

no code implementations 17 Dec 2020 Zelong Yang, Yan Wang, Piji Li, Shaobin Lin, Shuming Shi, Shao-Lun Huang, Wei Bi

The multiplayer online battle arena (MOBA) games have become increasingly popular in recent years.

Empirical Analysis of Unlabeled Entity Problem in Named Entity Recognition

1 code implementation ICLR 2021 Yangming Li, Lemao Liu, Shuming Shi

Experiments on synthetic datasets and real-world datasets show that our model is robust to unlabeled entity problem and surpasses prior baselines.

Named Entity Recognition NER

On the Sub-Layer Functionalities of Transformer Decoder

no code implementations Findings of the Association for Computational Linguistics 2020 Yilin Yang, Longyue Wang, Shuming Shi, Prasad Tadepalli, Stefan Lee, Zhaopeng Tu

There have been significant efforts to interpret the encoder of Transformer-based encoder-decoder architectures for neural machine translation (NMT); meanwhile, the decoder remains largely unexamined despite its critical role.

Machine Translation Translation

On the Branching Bias of Syntax Extracted from Pre-trained Language Models

no code implementations Findings of the Association for Computational Linguistics 2020 Huayang Li, Lemao Liu, Guoping Huang, Shuming Shi

Many efforts have been devoted to extracting constituency trees from pre-trained language models, often proceeding in two stages: feature definition and parsing.

Interpretable Real-Time Win Prediction for Honor of Kings, a Popular Mobile MOBA Esport

no code implementations 14 Aug 2020 Zelong Yang, Zhufeng Pan, Yan Wang, Deng Cai, Xiaojiang Liu, Shuming Shi, Shao-Lun Huang

With the rapid prevalence and explosive development of MOBA esports (Multiplayer Online Battle Arena electronic sports), much research effort has been devoted to automatically predicting game results (win predictions).

Evaluating Explanation Methods for Neural Machine Translation

no code implementations ACL 2020 Jierui Li, Lemao Liu, Huayang Li, Guanlin Li, Guoping Huang, Shuming Shi

Recently many efforts have been devoted to interpreting the black-box NMT models, but little progress has been made on metrics to evaluate explanation methods.

Machine Translation Translation +1

On the Inference Calibration of Neural Machine Translation

1 code implementation ACL 2020 Shuo Wang, Zhaopeng Tu, Shuming Shi, Yang Liu

Confidence calibration, which aims to make model predictions equal to the true correctness measures, is important for neural machine translation (NMT) because it is able to offer useful indicators of translation errors in the generated output.

Machine Translation Translation

Assessing the Bilingual Knowledge Learned by Neural Machine Translation Models

no code implementations 28 Apr 2020 Shilin He, Xing Wang, Shuming Shi, Michael R. Lyu, Zhaopeng Tu

In this paper, we bridge the gap by assessing the bilingual knowledge learned by NMT models with phrase table -- an interpretable table of bilingual lexicons.

Machine Translation Translation

The World is Not Binary: Learning to Rank with Grayscale Data for Dialogue Response Selection

no code implementations EMNLP 2020 Zibo Lin, Deng Cai, Yan Wang, Xiaojiang Liu, Hai-Tao Zheng, Shuming Shi

Although response selection is naturally a learning-to-rank problem, most prior works take a point-wise view and train binary classifiers for this task: each response candidate is labeled either relevant (one) or irrelevant (zero).

Conversational Response Selection Learning-To-Rank +1

Understanding Learning Dynamics for Neural Machine Translation

no code implementations 5 Apr 2020 Conghui Zhu, Guanlin Li, Lemao Liu, Tiejun Zhao, Shuming Shi

Despite the great success of NMT, there still remains a severe challenge: it is hard to interpret the internal dynamics during its training process.

Machine Translation Translation

CASE: Context-Aware Semantic Expansion

no code implementations 31 Dec 2019 Jialong Han, Aixin Sun, Haisong Zhang, Chenliang Li, Shuming Shi

In this study, we demonstrate that annotations for this task can be harvested at scale from existing corpora, in a fully automatic manner.

Word Sense Disambiguation

A Discrete CVAE for Response Generation on Short-Text Conversation

no code implementations IJCNLP 2019 Jun Gao, Wei Bi, Xiaojiang Liu, Junhui Li, Guodong Zhou, Shuming Shi

In this paper, we introduce a discrete latent variable with an explicit semantic meaning to improve the CVAE on short-text conversation.

Response Generation Short-Text Conversation +1

Neuron Interaction Based Representation Composition for Neural Machine Translation

no code implementations 22 Nov 2019 Jian Li, Xing Wang, Baosong Yang, Shuming Shi, Michael R. Lyu, Zhaopeng Tu

Starting from this intuition, we propose a novel approach to compose representations learned by different components in neural machine translation (e.g., multi-layer networks or multi-head attention), based on modeling strong interactions among neurons in the representation vectors.

Machine Translation Translation

Go From the General to the Particular: Multi-Domain Translation with Domain Transformation Networks

2 code implementations 22 Nov 2019 Yong Wang, Long-Yue Wang, Shuming Shi, Victor O. K. Li, Zhaopeng Tu

The key challenge of multi-domain translation lies in simultaneously encoding both the general knowledge shared across domains and the particular knowledge distinctive to each domain in a unified model.

Knowledge Distillation Machine Translation +1

Retrieval-guided Dialogue Response Generation via a Matching-to-Generation Framework

no code implementations IJCNLP 2019 Deng Cai, Yan Wang, Wei Bi, Zhaopeng Tu, Xiaojiang Liu, Shuming Shi

End-to-end sequence generation is a popular technique for developing open-domain dialogue systems, though such systems suffer from the safe response problem.

Response Generation

Semi-supervised Text Style Transfer: Cross Projection in Latent Space

no code implementations IJCNLP 2019 Mingyue Shang, Piji Li, Zhenxin Fu, Lidong Bing, Dongyan Zhao, Shuming Shi, Rui Yan

Text style transfer task requires the model to transfer a sentence of one style to another style while retaining its original content meaning, which is a challenging problem that has long suffered from the shortage of parallel data.

Style Transfer Text Style Transfer

Multi-Granularity Self-Attention for Neural Machine Translation

no code implementations IJCNLP 2019 Jie Hao, Xing Wang, Shuming Shi, Jinfeng Zhang, Zhaopeng Tu

Current state-of-the-art neural machine translation (NMT) uses a deep multi-head self-attention network with no explicit phrase information.

Machine Translation Translation

Towards Better Modeling Hierarchical Structure for Self-Attention with Ordered Neurons

no code implementations IJCNLP 2019 Jie Hao, Xing Wang, Shuming Shi, Jinfeng Zhang, Zhaopeng Tu

Recent studies have shown that a hybrid of self-attention networks (SANs) and recurrent neural networks (RNNs) outperforms both individual architectures, while not much is known about why the hybrid models work.

Machine Translation Translation

Self-Attention with Structural Position Representations

no code implementations IJCNLP 2019 Xing Wang, Zhaopeng Tu, Long-Yue Wang, Shuming Shi

Although self-attention networks (SANs) have advanced the state-of-the-art on various NLP tasks, one criticism of SANs concerns their ability to encode the positions of input words (Shaw et al., 2018).

Translation

Towards Understanding Neural Machine Translation with Word Importance

no code implementations IJCNLP 2019 Shilin He, Zhaopeng Tu, Xing Wang, Long-Yue Wang, Michael R. Lyu, Shuming Shi

Although neural machine translation (NMT) has advanced the state-of-the-art on various language pairs, the interpretability of NMT remains unsatisfactory.

Machine Translation Translation

Fine-Grained Sentence Functions for Short-Text Conversation

no code implementations ACL 2019 Wei Bi, Jun Gao, Xiaojiang Liu, Shuming Shi

Classification models are trained on this dataset to (i) recognize the sentence function of new data in a large corpus of short-text conversations; (ii) estimate a proper sentence function of the response given a test query.

Information Retrieval Short-Text Conversation

On the Word Alignment from Neural Machine Translation

no code implementations ACL 2019 Xintong Li, Guanlin Li, Lemao Liu, Max Meng, Shuming Shi

Prior research suggests that neural machine translation (NMT) captures word alignment through its attention mechanism; however, this paper finds that attention may almost fail to capture word alignment for some NMT models.

Machine Translation Translation +1

Topic-Aware Neural Keyphrase Generation for Social Media Language

2 code implementations ACL 2019 Yue Wang, Jing Li, Hou Pong Chan, Irwin King, Michael R. Lyu, Shuming Shi

Further discussions show that our model learns meaningful topics, which interprets its superiority in social media keyphrase generation.

Keyphrase Generation

Microblog Hashtag Generation via Encoding Conversation Contexts

1 code implementation NAACL 2019 Yue Wang, Jing Li, Irwin King, Michael R. Lyu, Shuming Shi

Automatic hashtag annotation plays an important role in content understanding for microblog posts.

Topic Models

Dynamic Layer Aggregation for Neural Machine Translation with Routing-by-Agreement

no code implementations15 Feb 2019 Zi-Yi Dou, Zhaopeng Tu, Xing Wang, Long-Yue Wang, Shuming Shi, Tong Zhang

With the promising progress of deep neural networks, layer aggregation has been used to fuse information across layers in various fields, such as computer vision and machine translation.

Machine Translation Translation

Neural Machine Translation with Adequacy-Oriented Learning

no code implementations 21 Nov 2018 Xiang Kong, Zhaopeng Tu, Shuming Shi, Eduard Hovy, Tong Zhang

Although Neural Machine Translation (NMT) models have advanced state-of-the-art performance in machine translation, they face problems such as inadequate translation.

Machine Translation Translation

Generating Multiple Diverse Responses for Short-Text Conversation

no code implementations 14 Nov 2018 Jun Gao, Wei Bi, Xiaojiang Liu, Junhui Li, Shuming Shi

In this paper, we propose a novel response generation model, which considers a set of responses jointly and generates multiple diverse responses simultaneously.

Informativeness reinforcement-learning +2

Exploiting Deep Representations for Neural Machine Translation

no code implementations EMNLP 2018 Zi-Yi Dou, Zhaopeng Tu, Xing Wang, Shuming Shi, Tong Zhang

Advanced neural machine translation (NMT) models generally implement encoder and decoder as multiple layers, which allows systems to model complex functions and capture complicated linguistic structures.

Machine Translation Translation

Towards Less Generic Responses in Neural Conversation Models: A Statistical Re-weighting Method

1 code implementation EMNLP 2018 Yahui Liu, Wei Bi, Jun Gao, Xiaojiang Liu, Jian Yao, Shuming Shi

We observe that in conversation tasks, each query could have multiple responses, which forms a 1-to-n or m-to-n relationship when viewed over the whole corpus.

Dialogue Generation Machine Translation +1

Language Style Transfer from Sentences with Arbitrary Unknown Styles

no code implementations 13 Aug 2018 Yanpeng Zhao, Wei Bi, Deng Cai, Xiaojiang Liu, Kewei Tu, Shuming Shi

Then, by recombining the content with the target style, we decode a sentence aligned in the target domain.

Sentence ReWriting Style Transfer

Directional Skip-Gram: Explicitly Distinguishing Left and Right Context for Word Embeddings

no code implementations NAACL 2018 Yan Song, Shuming Shi, Jing Li, Haisong Zhang

In this paper, we present directional skip-gram (DSG), a simple but effective enhancement of the skip-gram model by explicitly distinguishing left and right context in word prediction.

Learning Word Embeddings Part-Of-Speech Tagging +1

Target Foresight Based Attention for Neural Machine Translation

no code implementations NAACL 2018 Xintong Li, Lemao Liu, Zhaopeng Tu, Shuming Shi, Max Meng

In neural machine translation, an attention model is used to identify the aligned source words for a target word (target foresight word) in order to select translation context, but it does not make use of any information about this target foresight word at all.

Language Modelling Machine Translation +1

A Manually Annotated Chinese Corpus for Non-task-oriented Dialogue Systems

no code implementations 15 May 2018 Jing Li, Yan Song, Haisong Zhang, Shuming Shi

This paper presents a large-scale corpus for non-task-oriented dialogue response selection, which contains over 27K distinct prompts and more than 82K responses collected from social media.

Informativeness Task-Oriented Dialogue Systems

hyperdoc2vec: Distributed Representations of Hypertext Documents

1 code implementation ACL 2018 Jialong Han, Yan Song, Wayne Xin Zhao, Shuming Shi, Haisong Zhang

Hypertext documents, such as web pages and academic papers, are of great importance in delivering information in our daily life.

Citation Recommendation Document Embedding +1

Generative Stock Question Answering

no code implementations 21 Apr 2018 Zhaopeng Tu, Yong Jiang, Xiaojiang Liu, Lei Shu, Shuming Shi

We study the problem of stock related question answering (StockQA): automatically generating answers to stock related questions, just like professional stock analysts providing action recommendations to stocks upon users' requests.

Question Answering

Translating Pro-Drop Languages with Reconstruction Models

1 code implementation 10 Jan 2018 Long-Yue Wang, Zhaopeng Tu, Shuming Shi, Tong Zhang, Yvette Graham, Qun Liu

Next, the annotated source sentence is reconstructed from hidden representations in the NMT model.

Machine Translation Translation

Learning to Remember Translation History with a Continuous Cache

1 code implementation TACL 2018 Zhaopeng Tu, Yang Liu, Shuming Shi, Tong Zhang

Existing neural machine translation (NMT) models generally translate sentences in isolation, missing the opportunity to take advantage of document-level information.

Machine Translation Translation

Learning Fine-Grained Expressions to Solve Math Word Problems

no code implementations EMNLP 2017 Danqing Huang, Shuming Shi, Chin-Yew Lin, Jian Yin

This method learns the mappings between math concept phrases in math word problems and their math expressions from training data.

Math Word Problem Solving
