no code implementations • WMT (EMNLP) 2020 • Shuangzhi Wu, Xing Wang, Longyue Wang, Fangxu Liu, Jun Xie, Zhaopeng Tu, Shuming Shi, Mu Li
This paper describes Tencent Neural Machine Translation systems for the WMT 2020 news translation tasks.
no code implementations • Xintong Li, Lemao Liu, Guanlin Li, Max Meng, Shuming Shi
We find that although NMT models struggle to capture word alignment for CFT words, these words do not significantly sacrifice translation quality, which provides an explanation for why NMT is more successful at translation yet worse at word alignment compared to statistical machine translation.
1 code implementation • ACL 2022 • Liang Ding, Longyue Wang, Shuming Shi, DaCheng Tao, Zhaopeng Tu
In this work, we provide an appealing alternative for NAT – monolingual KD, which trains the NAT student on external monolingual data with an AT teacher trained on the original bilingual data.
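For readers unfamiliar with the setup, here is a minimal pipeline sketch of monolingual KD (assumed workflow, not the authors' code; `at_teacher_translate` is a hypothetical stand-in for the trained AT teacher):

```python
# Pipeline sketch (assumed workflow): the AT teacher, trained on the original
# bilingual data, translates external monolingual sentences, and the resulting
# synthetic pairs are used to train the NAT student.
def at_teacher_translate(src_sentence: str) -> str:
    raise NotImplementedError("an autoregressive teacher model goes here")

def build_monolingual_kd_data(monolingual_corpus):
    # Each monolingual source sentence is paired with the teacher's translation.
    return [(src, at_teacher_translate(src)) for src in monolingual_corpus]

# The NAT student would then be trained on build_monolingual_kd_data(corpus).
```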
no code implementations • ACL 2022 • Yi Chen, Jiayang Cheng, Haiyun Jiang, Lemao Liu, Haisong Zhang, Shuming Shi, Ruifeng Xu
In this paper, we first empirically find that existing models struggle to handle hard mentions due to their insufficient contexts, which consequently limits their overall typing performance.
no code implementations • ACL 2022 • Yanling Xiao, Lemao Liu, Guoping Huang, Qu Cui, ShuJian Huang, Shuming Shi, Jiajun Chen
In this work, we propose a novel BiTIIMT system, Bilingual Text-Infilling for Interactive Neural Machine Translation.
no code implementations • WMT (EMNLP) 2021 • Xing Wang, Zhaopeng Tu, Shuming Shi
This paper describes the Tencent AI Lab submission of the WMT2021 shared task on biomedical translation in eight language directions: English-German, English-French, English-Spanish and English-Russian (each pair in both directions).
no code implementations • WMT (EMNLP) 2021 • Longyue Wang, Mu Li, Fangxu Liu, Shuming Shi, Zhaopeng Tu, Xing Wang, Shuangzhi Wu, Jiali Zeng, Wen Zhang
Based on our success in the last WMT, we continuously employed advanced techniques such as large batch training, data selection and data filtering.
1 code implementation • WMT (EMNLP) 2020 • Xing Wang, Zhaopeng Tu, Longyue Wang, Shuming Shi
This paper describes the Tencent AI Lab submission of the WMT2020 shared task on biomedical translation in four language directions: German->English, English->German, Chinese->English and English->Chinese.
no code implementations • WMT (EMNLP) 2020 • Longyue Wang, Zhaopeng Tu, Xing Wang, Li Ding, Liang Ding, Shuming Shi
This paper describes the Tencent AI Lab’s submission of the WMT 2020 shared task on chat translation in English-German.
1 code implementation • EMNLP 2021 • Jing Qian, Yibin Liu, Lemao Liu, Yangming Li, Haiyun Jiang, Haisong Zhang, Shuming Shi
Existing work on Fine-grained Entity Typing (FET) typically trains automatic models on the datasets obtained by using Knowledge Bases (KB) as distant supervision.
no code implementations • EMNLP 2021 • Yi Chen, Haiyun Jiang, Lemao Liu, Shuming Shi, Chuang Fan, Min Yang, Ruifeng Xu
Auxiliary information from multiple sources has been demonstrated to be effective in zero-shot fine-grained entity typing (ZFET).
no code implementations • 6 Mar 2025 • Yi Shen, Jian Zhang, Jieyun Huang, Shuming Shi, Wenjing Zhang, Jiangze Yan, Ning Wang, Kai Wang, Shiguo Lian
Recent advancements in slow-thinking reasoning models have shown exceptional performance in complex reasoning tasks.
no code implementations • 15 Oct 2024 • Tsz Ting Chung, Leyang Cui, Lemao Liu, Xinting Huang, Shuming Shi, Dit-yan Yeung
Large Language Models (LLMs) have demonstrated impressive capabilities in a wide range of natural language processing tasks when leveraging in-context learning.
1 code implementation • 29 Jul 2024 • Cheng Yang, Guoping Huang, Mo Yu, Zhirui Zhang, Siheng Li, Mingming Yang, Shuming Shi, Yujiu Yang, Lemao Liu
Existing work addresses this task through a classification model based on a neural network that maps the hidden vector of the input context into its corresponding label (i.e., the candidate target word is treated as a label).
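As a rough illustration of the classification formulation described above (a sketch with assumed shapes and names, not the released code):

```python
# Hypothetical sketch: WLAC framed as classification, where a context encoder's
# hidden vector is projected onto the target vocabulary and the candidate word
# is predicted as a label.
import torch
import torch.nn as nn

class WlacClassifier(nn.Module):
    def __init__(self, hidden_size: int, vocab_size: int):
        super().__init__()
        self.proj = nn.Linear(hidden_size, vocab_size)  # label = candidate target word

    def forward(self, context_hidden: torch.Tensor) -> torch.Tensor:
        # context_hidden: [batch, hidden_size], e.g. pooled encoder states of the
        # source sentence plus the typed human prefix.
        return self.proj(context_hidden)  # logits over candidate target words

model = WlacClassifier(hidden_size=512, vocab_size=32000)
predicted_word_ids = model(torch.randn(2, 512)).argmax(dim=-1)
```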
no code implementations • 25 Jun 2024 • Sen Yang, Leyang Cui, Deng Cai, Xinting Huang, Shuming Shi, Wai Lam
Iterative preference learning, though yielding superior performances, requires online annotated preference labels.
no code implementations • 24 Jun 2024 • Deng Cai, Huayang Li, Tingchen Fu, Siheng Li, Weiwen Xu, Shuaiyi Li, Bowen Cao, Zhisong Zhang, Xinting Huang, Leyang Cui, Yan Wang, Lemao Liu, Taro Watanabe, Shuming Shi
Despite the general capabilities of pre-trained large language models (LLMs), they still need further adaptation to better serve practical applications.
1 code implementation • 22 May 2024 • Tingchen Fu, Deng Cai, Lemao Liu, Shuming Shi, Rui Yan
However, the performance of LLMs on standard knowledge and reasoning benchmarks tends to suffer from deterioration at the latter stage of the SFT process, echoing the phenomenon of alignment tax.
1 code implementation • 21 May 2024 • Yafu Li, Zhilin Wang, Leyang Cui, Wei Bi, Shuming Shi, Yue Zhang
To this end, we propose a novel detection framework, paraphrased text span detection (PTD), aiming to identify paraphrased text spans within a text.
no code implementations • 24 Apr 2024 • Jincheng Dai, Zhuowei Huang, Haiyun Jiang, Chen Chen, Deng Cai, Wei Bi, Shuming Shi
Our validation shows that CORM reduces the inference memory usage of KV cache by up to 70% with negligible performance degradation across six tasks in LongBench.
1 code implementation • 27 Feb 2024 • Bowen Cao, Deng Cai, Leyang Cui, Xuxin Cheng, Wei Bi, Yuexian Zou, Shuming Shi
To address this, we propose to initialize the training oracles using linguistic heuristics and, more importantly, bootstrap the oracles through iterative self-reinforcement.
1 code implementation • 23 Jan 2024 • Zhiwei He, Xing Wang, Wenxiang Jiao, Zhuosheng Zhang, Rui Wang, Shuming Shi, Zhaopeng Tu
In this work, we investigate the potential of employing the QE model as the reward model to predict human preferences for feedback training.
1 code implementation • 23 Jan 2024 • Fanghua Ye, Mingming Yang, Jianhui Pang, Longyue Wang, Derek F. Wong, Emine Yilmaz, Shuming Shi, Zhaopeng Tu
The proliferation of open-source Large Language Models (LLMs) from various institutions has highlighted the urgent need for comprehensive evaluation methods.
1 code implementation • 19 Jan 2024 • Fanqi Wan, Xinting Huang, Leyang Cui, Xiaojun Quan, Wei Bi, Shuming Shi
While large language models (LLMs) have demonstrated exceptional performance across various tasks following human alignment, they may still generate responses that sound plausible but contradict factual knowledge, a phenomenon known as hallucination.
3 code implementations • 19 Jan 2024 • Fanqi Wan, Xinting Huang, Deng Cai, Xiaojun Quan, Wei Bi, Shuming Shi
In this paper, we introduce the notion of knowledge fusion for LLMs, aimed at combining the capabilities of existing LLMs and transferring them into a single LLM.
1 code implementation • 16 Jan 2024 • Jianhui Pang, Fanghua Ye, Longyue Wang, Dian Yu, Derek F. Wong, Shuming Shi, Zhaopeng Tu
This study revisits these challenges, offering insights into their ongoing relevance in the context of advanced Large Language Models (LLMs): domain mismatch, amount of parallel data, rare word prediction, translation of long sentences, attention model as word alignment, and sub-optimal beam search.
1 code implementation • 16 Jan 2024 • Shuming Shi, Enbo Zhao, Deng Cai, Leyang Cui, Xinting Huang, Huayang Li
We present Inferflow, an efficient and highly configurable inference engine for large language models (LLMs).
2 code implementations • 25 Dec 2023 • Yue Zhang, Leyang Cui, Wei Bi, Shuming Shi
Experimental results on both discrimination-based and generation-based hallucination evaluation benchmarks, such as TruthfulQA and FActScore, demonstrate that our proposed ICD methods can effectively enhance the factuality of LLMs across various model sizes and families.
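A minimal sketch of the contrastive-decoding idea this entry refers to, under the assumption that a factually weaker "induced" model is contrasted against the base model (the formula and names are illustrative, not the authors' exact method):

```python
# Illustrative sketch only: contrast the base model's next-token logits against
# a factually weaker "induced" model so that tokens the weak model over-prefers
# are down-weighted.
import torch

def contrastive_next_token_logits(base_logits: torch.Tensor,
                                   induced_logits: torch.Tensor,
                                   alpha: float = 1.0) -> torch.Tensor:
    # base_logits, induced_logits: [vocab] next-token logits from the two models
    return (1.0 + alpha) * base_logits - alpha * induced_logits

base, induced = torch.randn(32000), torch.randn(32000)
next_token_id = contrastive_next_token_logits(base, induced).argmax().item()
```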
1 code implementation • 22 Dec 2023 • Weiwen Xu, Deng Cai, Zhisong Zhang, Wai Lam, Shuming Shi
CUT (LLaMA2-chat-13b) can also align LLMs in an iterative fashion using up-to-date model-specific judgments, improving performance from 81.09 to 91.68 points on AlpacaEval.
no code implementations • 22 Dec 2023 • Zhangyin Feng, Runyi Hu, Liangxin Liu, Fan Zhang, Duyu Tang, Yong Dai, Xiaocheng Feng, Jiwei Li, Bing Qin, Shuming Shi
Compared with autoregressive baselines that need to run one thousand times, our model runs only 16 times to generate images of competitive quality with an order of magnitude lower inference latency.
1 code implementation • 16 Dec 2023 • Qihang Ai, Jianwu Zhou, Haiyun Jiang, Lemao Liu, Shuming Shi
Graph data is ubiquitous in the physical world, and it has always been a challenge to efficiently model graph structures using a unified paradigm for the understanding and reasoning on various graphs.
no code implementations • 25 Nov 2023 • Zhanyu Wang, Longyue Wang, Zhen Zhao, Minghao Wu, Chenyang Lyu, Huayang Li, Deng Cai, Luping Zhou, Shuming Shi, Zhaopeng Tu
While the recent advances in Multimodal Large Language Models (MLLMs) constitute a significant leap forward in the field, these models are predominantly confined to the realm of input-side multimodal comprehension, lacking the capacity for multimodal content generation.
1 code implementation • 15 Nov 2023 • Chang Gao, Haiyun Jiang, Deng Cai, Shuming Shi, Wai Lam
Most existing prompting methods suffer from the issues of generalizability and consistency, as they often rely on instance-specific solutions that may not be applicable to other instances and lack task-level consistency across the selected few-shot examples.
no code implementations • 6 Nov 2023 • Longyue Wang, Zhaopeng Tu, Yan Gu, Siyou Liu, Dian Yu, Qingsong Ma, Chenyang Lyu, Liting Zhou, Chao-Hong Liu, Yufeng Ma, WeiYu Chen, Yvette Graham, Bonnie Webber, Philipp Koehn, Andy Way, Yulin Yuan, Shuming Shi
To foster progress in this domain, we hold a new shared task at WMT 2023, the first edition of the Discourse-Level Literary Translation.
1 code implementation • 31 Oct 2023 • Tian Liang, Zhiwei He, Jen-tse Huang, Wenxuan Wang, Wenxiang Jiao, Rui Wang, Yujiu Yang, Zhaopeng Tu, Shuming Shi, Xing Wang
Ideally, an advanced agent should possess the ability to accurately describe a given word using an aggressive description while concurrently maximizing confusion in the conservative description, enhancing its participation in the game.
1 code implementation • 23 Oct 2023 • Xingyu Chen, Lemao Liu, Guoping Huang, Zhirui Zhang, Mingming Yang, Shuming Shi, Rui Wang
Word-Level Auto-Completion (WLAC) plays a crucial role in Computer-Assisted Translation.
1 code implementation • NAACL 2022 • Jiahao Xu, Yubin Ruan, Wei Bi, Guoping Huang, Shuming Shi, Lihui Chen, Lemao Liu
Back translation (BT) is one of the most significant technologies in NMT research fields.
1 code implementation • 17 Oct 2023 • Xu Huang, Zhirui Zhang, Ruize Gao, Yichao Du, Lemao Liu, Guoping Huang, Shuming Shi, Jiajun Chen, ShuJian Huang
We present IMTLab, an open-source end-to-end interactive machine translation (IMT) system platform that enables researchers to quickly build IMT systems with state-of-the-art models, perform an end-to-end evaluation, and diagnose the weakness of systems.
3 code implementations • 13 Oct 2023 • Fanqi Wan, Xinting Huang, Tao Yang, Xiaojun Quan, Wei Bi, Shuming Shi
Instruction-tuning can be substantially optimized through enhanced diversity, resulting in models capable of handling a broader spectrum of tasks.
1 code implementation • 11 Oct 2023 • Yue Zhang, Leyang Cui, Enbo Zhao, Wei Bi, Shuming Shi
In this paper, we introduce RobustGEC, a benchmark designed to evaluate the context robustness of GEC systems.
no code implementations • 17 Sep 2023 • Yi Chen, Haiyun Jiang, Wei Bi, Rui Wang, Longyue Wang, Shuming Shi, Ruifeng Xu
This work presents a new task of Text Expansion (TE), which aims to insert fine-grained modifiers into proper locations of the plain text to concretize or vivify human writings.
1 code implementation • 14 Sep 2023 • Huayang Li, Siheng Li, Deng Cai, Longyue Wang, Lemao Liu, Taro Watanabe, Yujiu Yang, Shuming Shi
We release our dataset, model, and demo to foster future research in the area of multimodal instruction following.
Ranked #228 on Visual Question Answering on MM-Vet
1 code implementation • 11 Sep 2023 • Yongrui Chen, Haiyun Jiang, Xinting Huang, Shuming Shi, Guilin Qi
In particular, compared to the best-performing baseline, the LLM trained using our generated dataset exhibits a 10% relative improvement in performance on AlpacaEval, despite utilizing only 1/5 of its training data.
1 code implementation • 3 Sep 2023 • Yue Zhang, Yafu Li, Leyang Cui, Deng Cai, Lemao Liu, Tingchen Fu, Xinting Huang, Enbo Zhao, Yu Zhang, Yulong Chen, Longyue Wang, Anh Tuan Luu, Wei Bi, Freda Shi, Shuming Shi
While large language models (LLMs) have demonstrated remarkable capabilities across a range of downstream tasks, a significant concern revolves around their propensity to exhibit hallucinations: LLMs occasionally generate content that diverges from the user input, contradicts previously generated context, or misaligns with established world knowledge.
1 code implementation • 12 Aug 2023 • Youliang Yuan, Wenxiang Jiao, Wenxuan Wang, Jen-tse Huang, Pinjia He, Shuming Shi, Zhaopeng Tu
We propose a novel framework CipherChat to systematically examine the generalizability of safety alignment to non-natural languages -- ciphers.
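As a toy illustration of what a cipher-encoded prompt looks like (CipherChat itself covers several ciphers; this Caesar-shift helper is purely illustrative):

```python
# Toy illustration only: encoding a prompt with a simple Caesar cipher, the kind
# of non-natural-language input used to probe the generalizability of safety alignment.
def caesar_encode(text: str, shift: int = 3) -> str:
    out = []
    for ch in text:
        if ch.isalpha():
            base = ord("A") if ch.isupper() else ord("a")
            out.append(chr((ord(ch) - base + shift) % 26 + base))
        else:
            out.append(ch)
    return "".join(out)

print(caesar_encode("please translate this sentence"))
```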
1 code implementation • 16 Jul 2023 • Longyue Wang, Zefeng Du, Donghuai Liu, Deng Cai, Dian Yu, Haiyun Jiang, Yan Wang, Leyang Cui, Shuming Shi, Zhaopeng Tu
Modeling discourse -- the linguistic phenomena that go beyond individual sentences -- is a fundamental yet challenging aspect of natural language processing (NLP).
no code implementations • 6 Jul 2023 • Bingshuai Liu, Longyue Wang, Chenyang Lyu, Yong Zhang, Jinsong Su, Shuming Shi, Zhaopeng Tu
Accordingly, we propose a novel multi-modal metric that considers object-text alignment to filter the fine-tuning data in the target culture, which is used to fine-tune a T2I model to improve cross-cultural generation.
no code implementations • 28 Jun 2023 • Zhangyin Feng, Yong Dai, Fan Zhang, Duyu Tang, Xiaocheng Feng, Shuangzhi Wu, Bing Qin, Yunbo Cao, Shuming Shi
Traditional multitask learning methods can exploit common knowledge only task-wise or language-wise, thereby losing either cross-language or cross-task knowledge.
1 code implementation • 20 Jun 2023 • Yafu Li, Leyang Cui, Jianhao Yan, Yongjing Yin, Wei Bi, Shuming Shi, Yue Zhang
Most existing text generation models follow the sequence-to-sequence paradigm.
1 code implementation • 15 Jun 2023 • Chenyang Lyu, Minghao Wu, Longyue Wang, Xinting Huang, Bingshuai Liu, Zefeng Du, Shuming Shi, Zhaopeng Tu
Although instruction-tuned large language models (LLMs) have exhibited remarkable capabilities across various NLP tasks, their effectiveness on other data modalities beyond text has not been fully studied.
no code implementations • 12 Jun 2023 • Hongkun Hao, Guoping Huang, Lemao Liu, Zhirui Zhang, Shuming Shi, Rui Wang
The finding demonstrates that TM-augmented NMT is good at fitting data (i.e., lower bias) but is more sensitive to fluctuations in the training data (i.e., higher variance), which provides an explanation for a recently reported contradictory phenomenon on the same translation task: TM-augmented NMT substantially advances vanilla NMT under the high-resource scenario whereas it fails under the low-resource scenario.
no code implementations • 4 Jun 2023 • Lingfeng Shen, Haiyun Jiang, Lemao Liu, Shuming Shi
Sentence embedding is one of the most fundamental tasks in Natural Language Processing and plays an important role in various tasks.
1 code implementation • 30 May 2023 • Tian Liang, Zhiwei He, Wenxiang Jiao, Xing Wang, Yan Wang, Rui Wang, Yujiu Yang, Shuming Shi, Zhaopeng Tu
To address the DoT problem, we propose a Multi-Agent Debate (MAD) framework, in which multiple agents express their arguments in the state of "tit for tat" and a judge manages the debate process to obtain a final solution.
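A minimal sketch of such a debate loop, assuming a generic `call_llm` helper (hypothetical stub, not the released framework):

```python
# Hedged sketch of a multi-agent debate loop in the spirit of MAD.
# `call_llm` is a hypothetical helper standing in for any chat-completion client.
def call_llm(system_prompt: str, conversation: list) -> str:
    raise NotImplementedError("plug in your own LLM client here")

def multi_agent_debate(question: str, rounds: int = 3) -> str:
    transcript = [f"Question: {question}"]
    for _ in range(rounds):
        # Two debaters argue "tit for tat", each seeing the full transcript so far.
        for role in ("affirmative debater", "negative debater"):
            argument = call_llm(f"You are the {role}. Argue your side.", transcript)
            transcript.append(f"{role}: {argument}")
    # A judge reads the whole debate and extracts a final solution.
    return call_llm("You are the judge. Give the final answer.", transcript)
```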
no code implementations • 26 May 2023 • Zhangyin Feng, Yuchen Ren, Xinmiao Yu, Xiaocheng Feng, Duyu Tang, Shuming Shi, Bing Qin
Diffusion models developed on top of powerful text-to-image generation models like Stable Diffusion achieve remarkable success in visual story generation.
1 code implementation • 25 May 2023 • Yuejiao Fei, Leyang Cui, Sen Yang, Wai Lam, Zhenzhong Lan, Shuming Shi
Grammatical error correction systems improve written communication by detecting and correcting language mistakes.
1 code implementation • 22 May 2023 • Haoran Yang, Deng Cai, Huayang Li, Wei Bi, Wai Lam, Shuming Shi
We introduce a frustratingly simple, super efficient and surprisingly effective decoding method, which we call Frustratingly Simple Decoding (FSD), for neural text generation.
2 code implementations • 22 May 2023 • Yafu Li, Qintong Li, Leyang Cui, Wei Bi, Zhilin Wang, Longyue Wang, Linyi Yang, Shuming Shi, Yue Zhang
In practical scenarios, however, the detector faces texts from various domains or LLMs without knowing their sources.
no code implementations • 17 May 2023 • Longyue Wang, Siyou Liu, Mingzhou Xu, Linfeng Song, Shuming Shi, Zhaopeng Tu
Zero pronouns (ZPs) are frequently omitted in pro-drop languages (e.g., Chinese, Hungarian, and Hindi), but should be recalled in non-pro-drop languages (e.g., English).
no code implementations • 13 May 2023 • Lingfeng Shen, Haiyun Jiang, Lemao Liu, Shuming Shi
Generating proper embedding of sentences through an unsupervised way is beneficial to semantic matching and retrieval problems in real-world scenarios.
2 code implementations • 6 May 2023 • Zhiwei He, Tian Liang, Wenxiang Jiao, Zhuosheng Zhang, Yujiu Yang, Rui Wang, Zhaopeng Tu, Shuming Shi, Xing Wang
Compared to typical machine translation that focuses solely on source-to-target mapping, LLM-based translation can potentially mimic the human translation process which might take preparatory steps to ensure high-quality translation.
1 code implementation • 5 Apr 2023 • Longyue Wang, Chenyang Lyu, Tianbo Ji, Zhirui Zhang, Dian Yu, Shuming Shi, Zhaopeng Tu
Large language models (LLMs) such as ChatGPT can produce coherent, cohesive, relevant, and fluent answers for various natural language processing (NLP) tasks.
1 code implementation • 5 Apr 2023 • Wenxiang Jiao, Jen-tse Huang, Wenxuan Wang, Zhiwei He, Tian Liang, Xing Wang, Shuming Shi, Zhaopeng Tu
Therefore, we propose ParroT, a framework to enhance and regulate the translation abilities during chat based on open-source LLMs (e.g., LLaMA), human-written translation and feedback data.
1 code implementation • 3 Apr 2023 • Yi Chen, Rui Wang, Haiyun Jiang, Shuming Shi, Ruifeng Xu
Evaluating the quality of generated text is a challenging task in NLP, due to the inherent complexity and diversity of text.
1 code implementation • 23 Mar 2023 • Mingyang Song, Haiyun Jiang, Shuming Shi, Songfang Yao, Shilong Lu, Yi Feng, Huafeng Liu, Liping Jing
Based on our findings, we conclude that ChatGPT has great potential for keyphrase generation.
1 code implementation • 20 Jan 2023 • Wenxiang Jiao, Wenxuan Wang, Jen-tse Huang, Xing Wang, Shuming Shi, Zhaopeng Tu
By evaluating on a number of benchmark test sets, we find that ChatGPT performs competitively with commercial translation products (e.g., Google Translate) on high-resource European languages but lags behind significantly on low-resource or distant languages.
1 code implementation • 2 Dec 2022 • Hongzhan Lin, Pengyao Yi, Jing Ma, Haiyun Jiang, Ziyang Luo, Shuming Shi, Ruifang Liu
The spread of rumors along with breaking events seriously hinders the truth in the era of social media.
no code implementations • 22 Oct 2022 • Xueliang Zhao, Lemao Liu, Tingchen Fu, Shuming Shi, Dongyan Zhao, Rui Yan
With the availability of massive general-domain dialogue data, pre-trained dialogue generation appears to be super appealing to transfer knowledge from the general domain to downstream applications.
1 code implementation • 18 Oct 2022 • Wenxiang Jiao, Zhaopeng Tu, Jiarui Li, Wenxuan Wang, Jen-tse Huang, Shuming Shi
This paper describes Tencent's multilingual machine translation systems for the WMT22 shared task on Large-Scale Machine Translation Evaluation for African Languages.
1 code implementation • 17 Oct 2022 • Zhiwei He, Xing Wang, Zhaopeng Tu, Shuming Shi, Rui Wang
Finally, our unconstrained system achieves BLEU scores of 17.0 and 30.4 for English to/from Livonian.
no code implementations • 3 Aug 2022 • Shuming Shi, Enbo Zhao, Duyu Tang, Yan Wang, Piji Li, Wei Bi, Haiyun Jiang, Guoping Huang, Leyang Cui, Xinting Huang, Cong Zhou, Yong Dai, Dongyang Ma
In Effidit, we significantly expand the capacities of a writing assistant by providing functions in five categories: text completion, error checking, text polishing, keywords to sentences (K2S), and cloud input methods (cloud IME).
no code implementations • 12 May 2022 • Yong Dai, Duyu Tang, Liangxin Liu, Minghuan Tan, Cong Zhou, Jingquan Wang, Zhangyin Feng, Fan Zhang, Xueyu Hu, Shuming Shi
Moreover, our model supports self-supervised pretraining with the same sparsely activated way, resulting in better initialized parameters for different modalities.
no code implementations • 26 Apr 2022 • Cong Zhou, Yong Dai, Duyu Tang, Enbo Zhao, Zhangyin Feng, Li Kuang, Shuming Shi
We achieve this by introducing a special token [null], the prediction of which stands for the non-existence of a word.
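A tiny illustration of the idea, assuming a toy vocabulary (not the paper's implementation):

```python
# Minimal illustration (assumed setup): extend the label vocabulary with a
# special [null] token so the model can predict "no word here" at a position.
vocab = ["[null]", "the", "cat", "sat"]
word2id = {w: i for i, w in enumerate(vocab)}
NULL_ID = word2id["[null]"]

def make_labels(target_words_per_position):
    # A position with no word receives the [null] label instead of a real word.
    return [word2id[w] if w is not None else NULL_ID
            for w in target_words_per_position]

print(make_labels(["the", None, "cat"]))  # -> [1, 0, 2]
```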
no code implementations • 26 Apr 2022 • Junwei Liao, Duyu Tang, Fan Zhang, Shuming Shi
We present SkillNet-NLG, a sparsely activated approach that handles many natural language generation tasks with one model.
1 code implementation • ACL 2022 • Yu Cao, Wei Bi, Meng Fang, Shuming Shi, DaCheng Tao
To alleviate the above data issues, we propose a data manipulation method that is model-agnostic and can be paired with any persona-based dialogue generation model to improve its performance.
no code implementations • Findings (ACL) 2022 • Jiannan Xiang, Huayang Li, Yahui Liu, Lemao Liu, Guoping Huang, Defu Lian, Shuming Shi
Current practices in metric evaluation focus on one single dataset, e.g., the Newstest dataset in each year's WMT Metrics Shared Task.
no code implementations • ACL 2022 • Wenxuan Wang, Wenxiang Jiao, Yongchang Hao, Xing Wang, Shuming Shi, Zhaopeng Tu, Michael Lyu
In this paper, we present a substantial step in better understanding the SOTA sequence-to-sequence (Seq2Seq) pretraining for neural machine translation (NMT).
1 code implementation • ACL 2022 • Zhiwei He, Xing Wang, Rui Wang, Shuming Shi, Zhaopeng Tu
By carefully designing experiments, we identify two representative characteristics of the data gap in source: (1) style gap (i.e., translated vs. natural text style) that leads to poor generalization capability; (2) content gap that induces the model to produce hallucination content biased towards the target language.
1 code implementation • 12 Mar 2022 • Linyang Li, Yong Dai, Duyu Tang, Xipeng Qiu, Zenglin Xu, Shuming Shi
We present a Chinese BERT model dubbed MarkBERT that uses word information in this work.
1 code implementation • 9 Mar 2022 • Wenye Lin, Yangming Li, Lemao Liu, Shuming Shi, Hai-Tao Zheng
Specifically, we transfer the knowledge from a teacher model to its student model by locally matching their predictions on all sub-structures, instead of the whole output space.
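A hedged sketch of this local matching, assuming per-position label distributions as the sub-structures (the paper's actual sub-structures and loss may differ):

```python
# Sketch of structure-level knowledge distillation by locally matching teacher
# and student distributions on sub-structures (here: per-position labels),
# rather than over the exponentially large space of whole output sequences.
import torch
import torch.nn.functional as F

def substructure_kd_loss(student_logits: torch.Tensor,
                         teacher_logits: torch.Tensor,
                         temperature: float = 1.0) -> torch.Tensor:
    # logits: [batch, seq_len, num_labels]; one distribution per sub-structure.
    s = F.log_softmax(student_logits / temperature, dim=-1)
    t = F.softmax(teacher_logits / temperature, dim=-1)
    return F.kl_div(s, t, reduction="batchmean") * temperature ** 2

loss = substructure_kd_loss(torch.randn(4, 10, 7), torch.randn(4, 10, 7))
```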
no code implementations • 7 Mar 2022 • Fan Zhang, Duyu Tang, Yong Dai, Cong Zhou, Shuangzhi Wu, Shuming Shi
The key feature of our approach is that it is sparsely activated guided by predefined skills.
1 code implementation • ACL 2022 • Minghuan Tan, Yong Dai, Duyu Tang, Zhangyin Feng, Guoping Huang, Jing Jiang, Jiwei Li, Shuming Shi
We find that a frozen GPT achieves state-of-the-art performance on perfect pinyin.
no code implementations • 24 Feb 2022 • Zhangyin Feng, Duyu Tang, Cong Zhou, Junwei Liao, Shuangzhi Wu, Xiaocheng Feng, Bing Qin, Yunbo Cao, Shuming Shi
(2) how to predict a word via cloze test without knowing the number of wordpieces in advance?
1 code implementation • 17 Feb 2022 • Lingfeng Shen, Lemao Liu, Haiyun Jiang, Shuming Shi
In this paper we revisit automatic metrics for paraphrase evaluation and obtain two findings that disobey conventional wisdom: (1) Reference-free metrics achieve better performance than their reference-based counterparts.
no code implementations • 9 Jan 2022 • Lingfeng Shen, Haiyun Jiang, Lemao Liu, Shuming Shi
It has been shown that natural language processing (NLP) models are vulnerable to a kind of security threat called the Backdoor Attack, which utilizes a 'backdoor trigger' paradigm to mislead the models.
1 code implementation • Findings (EMNLP) 2021 • Xuebo Liu, Longyue Wang, Derek F. Wong, Liang Ding, Lidia S. Chao, Shuming Shi, Zhaopeng Tu
Pre-training (PT) and back-translation (BT) are two simple and powerful methods to utilize monolingual data for improving the model performance of neural machine translation (NMT).
no code implementations • ACL 2022 • Yangming Li, Lemao Liu, Shuming Shi
Negative sampling is highly effective in handling missing annotations for named entity recognition (NER).
no code implementations • ACL 2021 • Lemao Liu, Haisong Zhang, Haiyun Jiang, Yangming Li, Enbo Zhao, Kun Xu, Linfeng Song, Suncong Zheng, Botong Zhou, Dick Zhu, Xiao Feng, Tao Chen, Tao Yang, Dong Yu, Feng Zhang, Zhanhui Kang, Shuming Shi
This paper introduces TexSmart, a text understanding system that supports fine-grained named entity recognition (NER) and enhanced semantic analysis functionalities.
1 code implementation • Findings (ACL) 2021 • Xuebo Liu, Longyue Wang, Derek F. Wong, Liang Ding, Lidia S. Chao, Shuming Shi, Zhaopeng Tu
In response to this problem, we propose a simple and effective method named copying penalty to control the copying behaviors in decoding.
no code implementations • Findings (ACL) 2021 • Shuo Wang, Zhaopeng Tu, Zhixing Tan, Shuming Shi, Maosong Sun, Yang Liu
Language coverage bias, which indicates the content-dependent differences between sentence pairs originating from the source and target languages, is important for neural machine translation (NMT) because the target-original training data is not well exploited in current practice.
1 code implementation • ACL 2021 • Piji Li, Shuming Shi
We investigate the problem of Chinese Grammatical Error Correction (CGEC) and present a new framework named Tail-to-Tail (TtT) non-autoregressive sequence prediction to address the deep issues hidden in CGEC.
1 code implementation • ACL 2021 • Wenxiang Jiao, Xing Wang, Zhaopeng Tu, Shuming Shi, Michael R. Lyu, Irwin King
In this work, we propose to improve the sampling procedure by selecting the most informative monolingual sentences to complement the parallel data.
no code implementations • ACL 2021 • Huayang Li, Lemao Liu, Guoping Huang, Shuming Shi
In this paper, we propose the task of general word-level autocompletion (GWLAN) from a real-world CAT scenario, and construct the first public benchmark to facilitate research in this topic.
no code implementations • 30 May 2021 • Jun Gao, Wei Bi, Ruifeng Xu, Shuming Shi
We first clarify an assumption on reference-based metrics that, if more high-quality references are added into the reference set, the reliability of the metric will increase.
no code implementations • 27 May 2021 • Guoping Huang, Lemao Liu, Xing Wang, Longyue Wang, Huayang Li, Zhaopeng Tu, Chengyan Huang, Shuming Shi
Automatic machine translation is highly efficient at producing translations, yet their quality is not guaranteed.
no code implementations • 31 Dec 2020 • Haisong Zhang, Lemao Liu, Haiyun Jiang, Yangming Li, Enbo Zhao, Kun Xu, Linfeng Song, Suncong Zheng, Botong Zhou, Jianchen Zhu, Xiao Feng, Tao Chen, Tao Yang, Dong Yu, Feng Zhang, Zhanhui Kang, Shuming Shi
This technique report introduces TexSmart, a text understanding system that supports fine-grained named entity recognition (NER) and enhanced semantic analysis functionalities.
1 code implementation • ACL 2021 • Yixuan Su, Deng Cai, Qingyu Zhou, Zibo Lin, Simon Baker, Yunbo Cao, Shuming Shi, Nigel Collier, Yan Wang
As for IC, it progressively strengthens the model's ability in identifying the mismatching information between the dialogue context and a response candidate.
Ranked #3 on Conversational Response Selection on RRS
no code implementations • 17 Dec 2020 • Zelong Yang, Yan Wang, Piji Li, Shaobin Lin, Shuming Shi, Shao-Lun Huang, Wei Bi
The multiplayer online battle arena (MOBA) games have become increasingly popular in recent years.
1 code implementation • ICLR 2021 • Yangming Li, Lemao Liu, Shuming Shi
Experiments on synthetic datasets and real-world datasets show that our model is robust to unlabeled entity problem and surpasses prior baselines.
no code implementations • Findings (EMNLP) 2021 • Yangming Li, Lemao Liu, Shuming Shi
In this work, we present Lexical Unit Analysis (LUA), a framework for general sequence segmentation tasks.
1 code implementation • EMNLP 2020 • Changlong Yu, Jialong Han, Peifeng Wang, Yangqiu Song, Hongming Zhang, Wilfred Ng, Shuming Shi
We also demonstrate that distributional methods are ideal to make up for pattern-based ones in such cases.
no code implementations • Findings of the Association for Computational Linguistics 2020 • Huayang Li, Lemao Liu, Guoping Huang, Shuming Shi
Many efforts have been devoted to extracting constituency trees from pre-trained language models, often proceeding in two stages: feature definition and parsing.
no code implementations • Findings of the Association for Computational Linguistics 2020 • Yilin Yang, Longyue Wang, Shuming Shi, Prasad Tadepalli, Stefan Lee, Zhaopeng Tu
There have been significant efforts to interpret the encoder of Transformer-based encoder-decoder architectures for neural machine translation (NMT); meanwhile, the decoder remains largely unexamined despite its critical role.
no code implementations • 14 Aug 2020 • Zelong Yang, Zhufeng Pan, Yan Wang, Deng Cai, Xiaojiang Liu, Shuming Shi, Shao-Lun Huang
With the rapid prevalence and explosive development of MOBA esports (Multiplayer Online Battle Arena electronic sports), much research effort has been devoted to automatically predicting game results (win predictions).
no code implementations • ACL 2020 • Jierui Li, Lemao Liu, Huayang Li, Guanlin Li, Guoping Huang, Shuming Shi
Recently many efforts have been devoted to interpreting the black-box NMT models, but little progress has been made on metrics to evaluate explanation methods.
1 code implementation • ACL 2020 • Shuo Wang, Zhaopeng Tu, Shuming Shi, Yang Liu
Confidence calibration, which aims to make model predictions equal to the true correctness measures, is important for neural machine translation (NMT) because it is able to offer useful indicators of translation errors in the generated output.
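Calibration is commonly quantified with expected calibration error (ECE); a generic sketch, not tied to the paper's experiments:

```python
# Illustrative sketch: expected calibration error (ECE), a standard measure of
# how far predicted confidence is from the true correctness rate.
import numpy as np

def expected_calibration_error(confidences, correct, n_bins: int = 10) -> float:
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(bins[:-1], bins[1:]):
        mask = (confidences > lo) & (confidences <= hi)
        if mask.any():
            gap = abs(confidences[mask].mean() - correct[mask].mean())
            ece += mask.mean() * gap  # weight by the fraction of samples in the bin
    return ece

print(expected_calibration_error([0.9, 0.8, 0.6, 0.95], [1, 1, 0, 1]))
```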
no code implementations • 28 Apr 2020 • Shilin He, Xing Wang, Shuming Shi, Michael R. Lyu, Zhaopeng Tu
In this paper, we bridge the gap by assessing the bilingual knowledge learned by NMT models with phrase table -- an interpretable table of bilingual lexicons.
2 code implementations • ACL 2020 • Piji Li, Haisong Zhang, Xiaojiang Liu, Shuming Shi
(3) Although they are restricted to some formats, the sentence integrity must be guaranteed.
no code implementations • EMNLP 2020 • Zibo Lin, Deng Cai, Yan Wang, Xiaojiang Liu, Hai-Tao Zheng, Shuming Shi
Although response selection is naturally a learning-to-rank problem, most prior works take a point-wise view and train binary classifiers for this task: each response candidate is labeled either relevant (one) or irrelevant (zero); a loss-level sketch of the two views appears below.
Ranked #11 on Conversational Response Selection on E-commerce
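The sketch below contrasts a point-wise binary cross-entropy objective with a pair-wise ranking objective (illustrative losses only; the paper's actual formulation may differ):

```python
# Point-wise view: each candidate is judged in isolation as relevant/irrelevant.
# Pair-wise (ranking) view: a relevant response should outscore an irrelevant one.
import torch
import torch.nn.functional as F

def pointwise_loss(scores: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
    return F.binary_cross_entropy_with_logits(scores, labels)

def pairwise_ranking_loss(pos_scores: torch.Tensor,
                          neg_scores: torch.Tensor,
                          margin: float = 1.0) -> torch.Tensor:
    return F.relu(margin - (pos_scores - neg_scores)).mean()

scores = torch.randn(8)
labels = torch.randint(0, 2, (8,)).float()
print(pointwise_loss(scores, labels), pairwise_ranking_loss(scores[:4], scores[4:]))
```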
no code implementations • 5 Apr 2020 • Conghui Zhu, Guanlin Li, Lemao Liu, Tiejun Zhao, Shuming Shi
Despite the great success of NMT, there still remains a severe challenge: it is hard to interpret the internal dynamics during its training process.
no code implementations • 5 Apr 2020 • Guanlin Li, Lemao Liu, Conghui Zhu, Tiejun Zhao, Shuming Shi
Generalization to unseen instances is our eternal pursuit for all data-driven models.
no code implementations • 31 Dec 2019 • Jialong Han, Aixin Sun, Haisong Zhang, Chenliang Li, Shuming Shi
In this study, we demonstrate that annotations for this task can be harvested at scale from existing corpora, in a fully automatic manner.
no code implementations • 22 Nov 2019 • Jian Li, Xing Wang, Baosong Yang, Shuming Shi, Michael R. Lyu, Zhaopeng Tu
Starting from this intuition, we propose a novel approach to compose representations learned by different components in neural machine translation (e.g., multi-layer networks or multi-head attention), based on modeling strong interactions among neurons in the representation vectors.
2 code implementations • 22 Nov 2019 • Yong Wang, Long-Yue Wang, Shuming Shi, Victor O. K. Li, Zhaopeng Tu
The key challenge of multi-domain translation lies in simultaneously encoding both the general knowledge shared across domains and the particular knowledge distinctive to each domain in a unified model.
no code implementations • IJCNLP 2019 • Jun Gao, Wei Bi, Xiaojiang Liu, Junhui Li, Guodong Zhou, Shuming Shi
In this paper, we introduce a discrete latent variable with an explicit semantic meaning to improve the CVAE on short-text conversation.
no code implementations • IJCNLP 2019 • Deng Cai, Yan Wang, Wei Bi, Zhaopeng Tu, Xiaojiang Liu, Shuming Shi
End-to-end sequence generation is a popular technique for developing open domain dialogue systems, though they suffer from the safe response problem.
no code implementations • IJCNLP 2019 • Mingyue Shang, Piji Li, Zhenxin Fu, Lidong Bing, Dongyan Zhao, Shuming Shi, Rui Yan
Text style transfer task requires the model to transfer a sentence of one style to another style while retaining its original content meaning, which is a challenging problem that has long suffered from the shortage of parallel data.
no code implementations • IJCNLP 2019 • Jie Hao, Xing Wang, Shuming Shi, Jinfeng Zhang, Zhaopeng Tu
Current state-of-the-art neural machine translation (NMT) uses a deep multi-head self-attention network with no explicit phrase information.
no code implementations • IJCNLP 2019 • Jie Hao, Xing Wang, Shuming Shi, Jinfeng Zhang, Zhaopeng Tu
Recent studies have shown that a hybrid of self-attention networks (SANs) and recurrent neural networks (RNNs) outperforms both individual architectures, while not much is known about why the hybrid models work.
no code implementations • IJCNLP 2019 • Shilin He, Zhaopeng Tu, Xing Wang, Long-Yue Wang, Michael R. Lyu, Shuming Shi
Although neural machine translation (NMT) has advanced the state-of-the-art on various language pairs, the interpretability of NMT remains unsatisfactory.
no code implementations • IJCNLP 2019 • Long-Yue Wang, Zhaopeng Tu, Xing Wang, Shuming Shi
In this paper, we propose a unified and discourse-aware ZP translation approach for neural MT models.
no code implementations • IJCNLP 2019 • Xing Wang, Zhaopeng Tu, Long-Yue Wang, Shuming Shi
Although self-attention networks (SANs) have advanced the state-of-the-art on various NLP tasks, one criticism of SANs concerns their ability to encode the positions of input words (Shaw et al., 2018).
no code implementations • ACL 2019 • Wei Bi, Jun Gao, Xiaojiang Liu, Shuming Shi
Classification models are trained on this dataset to (i) recognize the sentence function of new data in a large corpus of short-text conversations; (ii) estimate a proper sentence function of the response given a test query.
no code implementations • ACL 2019 • Xintong Li, Guanlin Li, Lemao Liu, Max Meng, Shuming Shi
Prior research suggests that neural machine translation (NMT) captures word alignment through its attention mechanism; however, this paper finds that attention may almost entirely fail to capture word alignment for some NMT models.
2 code implementations • ACL 2019 • Yue Wang, Jing Li, Hou Pong Chan, Irwin King, Michael R. Lyu, Shuming Shi
Further discussions show that our model learns meaningful topics, which interprets its superiority in social media keyphrase generation.
no code implementations • ACL 2019 • Xing Wang, Zhaopeng Tu, Long-Yue Wang, Shuming Shi
In this work, we present novel approaches to exploit sentential context for neural machine translation (NMT).
no code implementations • NAACL 2019 • Guanlin Li, Lemao Liu, Xintong Li, Conghui Zhu, Tiejun Zhao, Shuming Shi
Multilayer architectures are currently the gold standard for large-scale neural machine translation.
1 code implementation • NAACL 2019 • Yue Wang, Jing Li, Irwin King, Michael R. Lyu, Shuming Shi
Automatic hashtag annotation plays an important role in content understanding for microblog posts.
no code implementations • 15 Feb 2019 • Zi-Yi Dou, Zhaopeng Tu, Xing Wang, Long-Yue Wang, Shuming Shi, Tong Zhang
With the promising progress of deep neural networks, layer aggregation has been used to fuse information across layers in various fields, such as computer vision and machine translation.
no code implementations • 21 Nov 2018 • Xiang Kong, Zhaopeng Tu, Shuming Shi, Eduard Hovy, Tong Zhang
Although Neural Machine Translation (NMT) models have advanced state-of-the-art performance in machine translation, they face problems such as inadequate translation.
Ranked #34 on Machine Translation on WMT2014 English-German
no code implementations • 14 Nov 2018 • Jun Gao, Wei Bi, Xiaojiang Liu, Junhui Li, Shuming Shi
In this paper, we propose a novel response generation model, which considers a set of responses jointly and generates multiple diverse responses simultaneously.
no code implementations • EMNLP 2018 • Zi-Yi Dou, Zhaopeng Tu, Xing Wang, Shuming Shi, Tong Zhang
Advanced neural machine translation (NMT) models generally implement encoder and decoder as multiple layers, which allows systems to model complex functions and capture complicated linguistic structures.
no code implementations • EMNLP 2018 • Juntao Li, Yan Song, Haisong Zhang, Dongmin Chen, Shuming Shi, Dongyan Zhao, Rui Yan
It is a challenging task to automatically compose poems with not only fluent expressions but also aesthetic wording.
1 code implementation • EMNLP 2018 • Yi Liao, Lidong Bing, Piji Li, Shuming Shi, Wai Lam, Tong Zhang
For example, an input sequence could be a word sequence, such as a review sentence or advertisement text.
1 code implementation • EMNLP 2018 • Yahui Liu, Wei Bi, Jun Gao, Xiaojiang Liu, Jian Yao, Shuming Shi
We observe that in the conversation tasks, each query could have multiple responses, which forms a 1-to-n or m-to-n relationship in the view of the total corpus.
1 code implementation • NAACL 2019 • Deng Cai, Yan Wang, Victoria Bi, Zhaopeng Tu, Xiaojiang Liu, Wai Lam, Shuming Shi
Such models rely on insufficient information for generating a specific response since a certain query could be answered in multiple ways.
no code implementations • 13 Aug 2018 • Yanpeng Zhao, Wei Bi, Deng Cai, Xiaojiang Liu, Kewei Tu, Shuming Shi
Then, by recombining the content with the target style, we decode a sentence aligned in the target domain.
no code implementations • NAACL 2018 • Yan Song, Shuming Shi, Jing Li, Haisong Zhang
In this paper, we present directional skip-gram (DSG), a simple but effective enhancement of the skip-gram model by explicitly distinguishing left and right context in word prediction.
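A rough data-level illustration of distinguishing left and right context, assuming a toy tokenized sentence (the DSG model itself adds direction-aware parameters, which are not shown here):

```python
# Sketch: like skip-gram, but each (target, context) training pair also records
# whether the context word lies to the LEFT or RIGHT of the target word.
def make_training_pairs(tokens, window: int = 2):
    pairs = []
    for i, target in enumerate(tokens):
        for j in range(max(0, i - window), min(len(tokens), i + window + 1)):
            if j != i:
                direction = "LEFT" if j < i else "RIGHT"
                pairs.append((target, tokens[j], direction))
    return pairs

print(make_training_pairs(["the", "cat", "sat"], window=1))
```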
no code implementations • NAACL 2018 • Xintong Li, Lemao Liu, Zhaopeng Tu, Shuming Shi, Max Meng
In neural machine translation, an attention model is used to identify the aligned source words for a target word (target foresight word) in order to select translation context, but it does not make use of any information about this target foresight word at all.
no code implementations • 15 May 2018 • Jing Li, Yan Song, Haisong Zhang, Shuming Shi
This paper presents a large-scale corpus for non-task-oriented dialogue response selection, which contains over 27K distinct prompts and more than 82K responses collected from social media.
1 code implementation • ACL 2018 • Jialong Han, Yan Song, Wayne Xin Zhao, Shuming Shi, Haisong Zhang
Hypertext documents, such as web pages and academic papers, are of great importance in delivering information in our daily life.
no code implementations • ACL 2018 • Lianhui Qin, Lemao Liu, Victoria Bi, Yan Wang, Xiaojiang Liu, Zhiting Hu, Hai Zhao, Shuming Shi
Comments of online articles provide extended views and improve user engagement.
no code implementations • 21 Apr 2018 • Zhaopeng Tu, Yong Jiang, Xiaojiang Liu, Lei Shu, Shuming Shi
We study the problem of stock-related question answering (StockQA): automatically generating answers to stock-related questions, just like professional stock analysts providing action recommendations on stocks upon users' requests.
1 code implementation • 10 Jan 2018 • Long-Yue Wang, Zhaopeng Tu, Shuming Shi, Tong Zhang, Yvette Graham, Qun Liu
Next, the annotated source sentence is reconstructed from hidden representations in the NMT model.
1 code implementation • TACL 2018 • Zhaopeng Tu, Yang Liu, Shuming Shi, Tong Zhang
Existing neural machine translation (NMT) models generally translate sentences in isolation, missing the opportunity to take advantage of document-level information.
no code implementations • EMNLP 2017 • Yan Wang, Xiaojiang Liu, Shuming Shi
This paper presents a deep neural solver to automatically solve math word problems.
Ranked #4 on Math Word Problem Solving on ALG514
no code implementations • EMNLP 2017 • Danqing Huang, Shuming Shi, Chin-Yew Lin, Jian Yin
This method learns the mappings between math concept phrases in math word problems and their math expressions from training data.