no code implementations • CCL 2020 • Shuailong Liang, Derek F. Wong, Yue Zhang
Based on 500,000 tweets posted on Twitter between January 22, 2020 and April 30, 2020 from different countries and regions, we study topics and public opinions related to COVID-19. We find both commonalities and differences in the general concerns and views of Twitter users across countries, as well as differing sentiment toward different topics. Most tweets carry strong emotion, with expressions of love and support being especially common. Overall, sentiment grows more positive over time.
no code implementations • WMT (EMNLP) 2021 • Yu Wan, Dayiheng Liu, Baosong Yang, Tianchi Bi, Haibo Zhang, Boxing Chen, Weihua Luo, Derek F. Wong, Lidia S. Chao
After investigating recent advances in trainable metrics, we identify several aspects of vital importance for obtaining a well-performing metric model: 1) jointly leveraging the advantages of a source-included model and a reference-only model, 2) continually pre-training the model on massive synthetic data pairs, and 3) fine-tuning the model with a data denoising strategy.
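As a minimal sketch of aspect 1), the snippet below interpolates a source-included score with a reference-only score. Both scorer functions and the weight `alpha` are hypothetical stand-ins for trained metric models, not the authors' code.

```python
# Hedged sketch: ensembling a source-included and a reference-only metric.
# Both scorers below are toy placeholders for trained regression models.

def score_with_source(source: str, hypothesis: str) -> float:
    """Hypothetical source-included scorer (toy lexical-overlap proxy)."""
    overlap = len(set(source.split()) & set(hypothesis.split()))
    return overlap / max(len(hypothesis.split()), 1)

def score_with_reference(reference: str, hypothesis: str) -> float:
    """Hypothetical reference-only scorer (toy lexical-overlap proxy)."""
    overlap = len(set(reference.split()) & set(hypothesis.split()))
    return overlap / max(len(hypothesis.split()), 1)

def unified_score(source: str, reference: str, hypothesis: str,
                  alpha: float = 0.5) -> float:
    # Jointly leverage both views by interpolating their scores.
    return (alpha * score_with_source(source, hypothesis)
            + (1 - alpha) * score_with_reference(reference, hypothesis))

print(unified_score("der Hund schläft", "the dog sleeps", "the dog is sleeping"))
```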
no code implementations • 11 Jul 2024 • ZiHao Zhou, Shudong Liu, Maizhen Ning, Wei Liu, Jindong Wang, Derek F. Wong, Xiaowei Huang, Qiufeng Wang, Kaizhu Huang
Exceptional mathematical reasoning ability is one of the key features that demonstrate the power of large language models (LLMs).
no code implementations • 17 Jun 2024 • Zhipeng Qian, Pei Zhang, Baosong Yang, Kai Fan, Yiwei Ma, Derek F. Wong, Xiaoshuai Sun, Rongrong Ji
This paper introduces AnyTrans, an all-encompassing framework for the Translate AnyText in the Image (TATI) task, which includes multilingual text translation and text fusion within images.
1 code implementation • 11 Jun 2024 • Renhao Li, Minghuan Tan, Derek F. Wong, Min Yang
The responses within instruction fine-tuning (IFT) data could be further enhanced by leveraging the capabilities of LLMs themselves.
no code implementations • 5 Jun 2024 • Shanshan Wang, Derek F. Wong, Jingming Yao, Lidia S. Chao
Machine translation (MT) has historically faced significant challenges when applied to literary works, particularly in the domain of poetry translation.
no code implementations • 2 Jun 2024 • Kaixin Lan, Tao Fang, Derek F. Wong, Yabo Xu, Lidia S. Chao, Cecilia G. Zhao
Pre-trained Language Models (PLMs) have shown impressive results in various Natural Language Generation (NLG) tasks, such as powering chatbots and generating stories.
1 code implementation • 22 May 2024 • Yiming Wang, Pei Zhang, Baosong Yang, Derek F. Wong, Zhuosheng Zhang, Rui Wang
Real-world data often deviates from the independent and identically distributed (i.i.d.) assumption.
no code implementations • 7 May 2024 • Junchao Wu, Runzhe Zhan, Derek F. Wong, Shu Yang, Xuebo Liu, Lidia S. Chao, Min Zhang
The efficacy of a detector of large language model (LLM)-generated text depends substantially on the availability of sizable training data.
no code implementations • 5 May 2024 • Guanhua Chen, Yutong Yao, Derek F. Wong, Lidia S. Chao
Multi-intent natural language understanding (NLU) presents a formidable challenge due to the model confusion arising from multiple intents within a single utterance.
Ranked #2 on Slot Filling on MixSNIPS
1 code implementation • 29 Apr 2024 • Xinyu Ma, Xuebo Liu, Derek F. Wong, Jun Rao, Bei Li, Liang Ding, Lidia S. Chao, DaCheng Tao, Min Zhang
Experimental results show that MMT models trained on our dataset exhibit a greater ability to exploit visual information than those trained on other MMT datasets.
no code implementations • 25 Apr 2024 • Runzhe Zhan, Xinyi Yang, Derek F. Wong, Lidia S. Chao, Yue Zhang
While supervised fine-tuning (SFT) has been a straightforward approach for tailoring the output of a foundation large language model (LLM) to specific preferences, concerns have been raised about the depth of this alignment, with some critiques suggesting it is merely "superficial".
no code implementations • 18 Mar 2024 • Haoyun Xu, Runzhe Zhan, Derek F. Wong, Lidia S. Chao
Large Language Models (LLMs) are composed of neurons that exhibit various behaviors and roles, which become increasingly diversified as models scale.
1 code implementation • 26 Feb 2024 • Liangxin Liu, Xuebo Liu, Derek F. Wong, Dongfang Li, Ziyi Wang, Baotian Hu, Min Zhang
In this work, we propose a novel approach, termed SelectIT, that capitalizes on the foundational capabilities of the LLM itself.
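A hedged sketch of the stated idea: the LLM's own quality judgment selects instruction-tuning samples. Here `rate_sample` is a hypothetical placeholder for an actual self-rating call, and the rank-and-keep loop is an illustrative reading, not SelectIT's exact algorithm.

```python
# Hedged sketch of self-rated data selection: keep only the instruction-
# response pairs the model itself rates highest. `rate_sample` is a toy
# placeholder; a real system would prompt the LLM for a quality judgment.

def rate_sample(instruction: str, response: str) -> float:
    """Hypothetical self-rating (toy proxy: longer responses score higher)."""
    return min(len(response.split()) / 50.0, 1.0)

def select_it(samples: list[dict], keep_ratio: float = 0.2) -> list[dict]:
    scored = [(rate_sample(s["instruction"], s["response"]), s) for s in samples]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    cutoff = max(1, int(len(scored) * keep_ratio))
    return [s for _, s in scored[:cutoff]]
```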
1 code implementation • 23 Jan 2024 • Fanghua Ye, Mingming Yang, Jianhui Pang, Longyue Wang, Derek F. Wong, Emine Yilmaz, Shuming Shi, Zhaopeng Tu
The proliferation of open-source Large Language Models (LLMs) from various institutions has highlighted the urgent need for comprehensive evaluation methods.
1 code implementation • 16 Jan 2024 • Jianhui Pang, Fanghua Ye, Longyue Wang, Dian Yu, Derek F. Wong, Shuming Shi, Zhaopeng Tu
This study revisits these challenges, offering insights into their ongoing relevance in the context of advanced Large Language Models (LLMs): domain mismatch, amount of parallel data, rare word prediction, translation of long sentences, attention model as word alignment, and sub-optimal beam search.
1 code implementation • 23 Oct 2023 • Junchao Wu, Shu Yang, Runzhe Zhan, Yulin Yuan, Derek F. Wong, Lidia S. Chao
In this survey, we collate recent research breakthroughs in this area and underscore the pressing need to bolster detector research.
1 code implementation • 13 Oct 2023 • Xinyi Yang, Runzhe Zhan, Derek F. Wong, Junchao Wu, Lidia S. Chao
The large language model (LLM) has garnered significant attention due to its in-context learning mechanisms and emergent capabilities.
no code implementations • 31 May 2023 • Zhihong Huang, Longyue Wang, Siyou Liu, Derek F. Wong
To bridge this gap, we introduce a probing task to interpret the ability of PLMs to capture discourse relation knowledge.
1 code implementation • 3 May 2023 • Chi Seng Cheang, Hou Pong Chan, Derek F. Wong, Xuebo Liu, Zhaocong Li, Yanming Sun, Shudong Liu, Lidia S. Chao
Moreover, the knowledge memorized by PLMs may quickly become outdated, which affects the generalization performance of PLMs on future data.
no code implementations • 2 May 2023 • Chenyang Lyu, Zefeng Du, Jitao Xu, Yitao Duan, Minghao Wu, Teresa Lynn, Alham Fikri Aji, Derek F. Wong, Siyou Liu, Longyue Wang
We conclude by emphasizing the critical role of LLMs in guiding the future evolution of MT and offer a roadmap for future exploration in this field.
no code implementations • 4 Apr 2023 • Tao Fang, Shu Yang, Kaixin Lan, Derek F. Wong, Jinpeng Hu, Lidia S. Chao, Yue Zhang
To showcase its capabilities in GEC, we design zero-shot chain-of-thought (CoT) and few-shot CoT settings using in-context learning for ChatGPT.
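The templates below are hedged reconstructions of what the zero-shot and few-shot CoT settings could look like for GEC; the paper's exact prompts may differ.

```python
# Illustrative prompts only: assumed reconstructions of zero-shot and
# few-shot chain-of-thought GEC prompting, not the paper's exact templates.

ZERO_SHOT_COT = (
    "Correct the grammatical errors in the following sentence. "
    "Let's think step by step, then output the corrected sentence.\n"
    "Sentence: {sentence}"
)

FEW_SHOT_COT = (
    "Sentence: She go to school yesterday.\n"
    "Reasoning: 'go' disagrees with the past-time adverb 'yesterday'; "
    "change it to 'went'.\n"
    "Correction: She went to school yesterday.\n\n"
    "Sentence: {sentence}\nReasoning:"
)

print(ZERO_SHOT_COT.format(sentence="He have two cat."))
```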
no code implementations • 17 Feb 2023 • Keqin Bao, Yu Wan, Dayiheng Liu, Baosong Yang, Wenqiang Lei, Xiangnan He, Derek F. Wong, Jun Xie
In this paper, we propose the Fine-Grained Translation Error Detection (FG-TED) task, which aims to identify both the position and the type of translation errors in given source-hypothesis sentence pairs.
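To make the task contract concrete, here is a hypothetical input/output schema for FG-TED: each detected error carries a token-span position and an error type. The dataclass and the placeholder detector are assumptions for illustration only.

```python
# Hedged sketch of an FG-TED interface: error spans plus error types.
# This schema is illustrative, not the authors' actual data format.

from dataclasses import dataclass

@dataclass
class TranslationError:
    start: int        # first token index of the erroneous span (inclusive)
    end: int          # last token index of the erroneous span (inclusive)
    error_type: str   # e.g. "mistranslation", "omission"

def detect_errors(source: str, hypothesis: str) -> list[TranslationError]:
    """Placeholder detector; a real FG-TED model predicts spans and types."""
    return [TranslationError(start=2, end=3, error_type="mistranslation")]
```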
1 code implementation • 20 Dec 2022 • Qingyu Lu, Liang Ding, Liping Xie, Kanjian Zhang, Derek F. Wong, DaCheng Tao
To this end, we augment BARTScore by incorporating human-like error-analysis strategies, yielding BARTScore++, whose final score combines evaluations of both major and minor errors.
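A minimal sketch of that composition, under the assumption that major (meaning-changing) errors dominate and minor (fluency) errors contribute less; the weights are an illustrative reading of the description, not the paper's formula.

```python
# Hedged sketch: combine separate major- and minor-error evaluations into
# one score. The component scores and weights are assumptions.

def bartscore_pp(major_error_score: float, minor_error_score: float,
                 w_major: float = 0.7, w_minor: float = 0.3) -> float:
    # Major errors weigh more heavily than minor ones in the final score.
    return w_major * major_error_score + w_minor * minor_error_score

print(bartscore_pp(major_error_score=-1.2, minor_error_score=-0.4))
```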
1 code implementation • 8 Dec 2022 • Zhaocong Li, Xuebo Liu, Derek F. Wong, Lidia S. Chao, Min Zhang
In this paper, we propose a novel transfer learning method for NMT, namely ConsistTL, which can continuously transfer knowledge from the parent model during the training of the child model.
Low-Resource Neural Machine Translation +4
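A hedged sketch of the ConsistTL idea described above: alongside the child model's own loss, a consistency term pulls its prediction toward the parent model's prediction for a semantically equivalent input. The interface and toy distributions are assumptions, not the paper's implementation.

```python
# Hedged sketch of continuous transfer: the child's training loss is
# augmented with a consistency term toward the parent model's prediction.
# All inputs below are placeholders, not ConsistTL's actual objective.

import math

def consist_tl_loss(child_loss: float, child_probs: list[float],
                    parent_probs: list[float], lam: float = 1.0) -> float:
    # Cross-entropy of the parent's distribution against the child's,
    # used as a simplified consistency regularizer.
    consistency = -sum(p * math.log(max(q, 1e-9))
                       for p, q in zip(parent_probs, child_probs))
    return child_loss + lam * consistency

print(consist_tl_loss(2.0, [0.7, 0.3], [0.6, 0.4]))
```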
1 code implementation • 18 Oct 2022 • Yu Wan, Keqin Bao, Dayiheng Liu, Baosong Yang, Derek F. Wong, Lidia S. Chao, Wenqiang Lei, Jun Xie
In this report, we present our submission to the WMT 2022 Metrics Shared Task.
1 code implementation • 18 Oct 2022 • Keqin Bao, Yu Wan, Dayiheng Liu, Baosong Yang, Wenqiang Lei, Xiangnan He, Derek F. Wong, Jun Xie
In this paper, we present our submission to the sentence-level MQM benchmark at Quality Estimation Shared Task, named UniTE (Unified Translation Evaluation).
no code implementations • 28 Apr 2022 • Yu Wan, Dayiheng Liu, Baosong Yang, Tianchi Bi, Haibo Zhang, Boxing Chen, Weihua Luo, Derek F. Wong, Lidia S. Chao
After investigating recent advances in trainable metrics, we identify several aspects of vital importance for obtaining a well-performing metric model: 1) jointly leveraging the advantages of a source-included model and a reference-only model, 2) continually pre-training the model on massive synthetic data pairs, and 3) fine-tuning the model with a data denoising strategy.
1 code implementation • Findings (ACL) 2022 • Yu Wan, Baosong Yang, Dayiheng Liu, Rong Xiao, Derek F. Wong, Haibo Zhang, Boxing Chen, Lidia S. Chao
The attention mechanism has become the dominant module in natural language processing models.
2 code implementations • ACL 2022 • Yu Wan, Dayiheng Liu, Baosong Yang, Haibo Zhang, Boxing Chen, Derek F. Wong, Lidia S. Chao
Translation quality evaluation plays a crucial role in machine translation.
1 code implementation • 7 Nov 2021 • Runzhe Zhan, Xuebo Liu, Derek F. Wong, Lidia S. Chao
We release 70 small and discriminative test sets for machine translation (MT) evaluation called variance-aware test sets (VAT), covering 35 translation directions from WMT16 to WMT20 competitions.
1 code implementation • Findings (EMNLP) 2021 • Xuebo Liu, Longyue Wang, Derek F. Wong, Liang Ding, Lidia S. Chao, Shuming Shi, Zhaopeng Tu
Pre-training (PT) and back-translation (BT) are two simple and powerful methods to utilize monolingual data for improving the model performance of neural machine translation (NMT).
1 code implementation • ACL 2021 • Runzhe Zhan, Xuebo Liu, Derek F. Wong, Lidia S. Chao
The high-quality translation results produced by machine translation (MT) systems still pose a huge challenge for automatic evaluation.
1 code implementation • Findings (ACL) 2021 • Xuebo Liu, Longyue Wang, Derek F. Wong, Liang Ding, Lidia S. Chao, Shuming Shi, Zhaopeng Tu
In response to this problem, we propose a simple and effective method named copying penalty to control the copying behaviors in decoding.
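A minimal sketch of a copying penalty, under the assumption that it rescales a hypothesis's log-probability by the fraction of tokens copied verbatim from the source; the exact formulation in the paper may differ.

```python
# Hedged sketch: penalize hypotheses in proportion to the fraction of
# tokens copied verbatim from the source. The interface is an assumption.

def apply_copying_penalty(log_prob: float, hypothesis_tokens: list[str],
                          source_tokens: list[str], beta: float = 0.6) -> float:
    source_set = set(source_tokens)
    copied = sum(1 for t in hypothesis_tokens if t in source_set)
    copy_ratio = copied / max(len(hypothesis_tokens), 1)
    # Higher copy ratio means a larger penalty, ranking the hypothesis lower.
    return log_prob - beta * copy_ratio

print(apply_copying_penalty(-3.2, "the Hund sleeps".split(),
                            "der Hund schläft".split()))
```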
no code implementations • Findings (ACL) 2021 • Liang Ding, Longyue Wang, Xuebo Liu, Derek F. Wong, DaCheng Tao, Zhaopeng Tu
Non-autoregressive translation (NAT) significantly accelerates the inference process via predicting the entire target sequence.
1 code implementation • ACL 2021 • Liang Ding, Longyue Wang, Xuebo Liu, Derek F. Wong, DaCheng Tao, Zhaopeng Tu
Results demonstrate that the proposed approach can significantly and universally improve translation quality by reducing translation errors on low-frequency words.
1 code implementation • 3 Mar 2021 • Runzhe Zhan, Xuebo Liu, Derek F. Wong, Lidia S. Chao
Meta-learning has been widely validated as beneficial for low-resource neural machine translation (NMT).
Domain Adaptation • Low-Resource Neural Machine Translation +4
1 code implementation • ICLR 2021 • Xuebo Liu, Longyue Wang, Derek F. Wong, Liang Ding, Lidia S. Chao, Zhaopeng Tu
Encoder layer fusion (EncoderFusion) is a technique to fuse all the encoder layers (instead of the uppermost layer) for sequence-to-sequence (Seq2Seq) models, which has proven effective on various NLP tasks.
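A minimal numpy sketch of layer fusion: all encoder layers are combined with softmax-normalized weights instead of passing only the top layer onward. Shapes and the weighting scheme are illustrative, not the paper's exact design.

```python
# Hedged sketch of encoder layer fusion: a weighted sum over all encoder
# layer outputs, with softmax weights standing in for learned parameters.

import numpy as np

def fuse_encoder_layers(layer_outputs: list[np.ndarray],
                        layer_logits: np.ndarray) -> np.ndarray:
    """layer_outputs: list of (seq_len, d_model) arrays, one per layer."""
    weights = np.exp(layer_logits) / np.exp(layer_logits).sum()  # softmax
    stacked = np.stack(layer_outputs)              # (n_layers, seq_len, d_model)
    return np.tensordot(weights, stacked, axes=1)  # (seq_len, d_model)

layers = [np.random.randn(5, 8) for _ in range(6)]
fused = fuse_encoder_layers(layers, np.zeros(6))   # uniform weights initially
print(fused.shape)
```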
no code implementations • ICLR 2021 • Liang Ding, Longyue Wang, Xuebo Liu, Derek F. Wong, DaCheng Tao, Zhaopeng Tu
To this end, we introduce an extra Kullback-Leibler divergence term derived by comparing the lexical choices of the NAT model with those embedded in the raw data.
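A small sketch of such a KL term, with toy distributions over two candidate target words; the prior from raw data and the NAT prediction here are invented for illustration.

```python
# Hedged sketch: KL divergence between a lexical-choice prior estimated
# from the raw (non-distilled) data and the NAT model's prediction.

import math

def kl_divergence(p: dict, q: dict, eps: float = 1e-9) -> float:
    """KL(p || q) over a shared vocabulary of candidate tokens."""
    return sum(pv * math.log(pv / max(q.get(tok, 0.0), eps))
               for tok, pv in p.items() if pv > 0)

raw_data_prior = {"hund": 0.6, "dog": 0.4}   # lexical choices in raw data
nat_model_pred = {"hund": 0.1, "dog": 0.9}   # NAT model's distribution
extra_loss = kl_divergence(raw_data_prior, nat_model_pred)
print(round(extra_loss, 4))
```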
1 code implementation • EMNLP 2020 • Yu Wan, Baosong Yang, Derek F. Wong, Yikai Zhou, Lidia S. Chao, Haibo Zhang, Boxing Chen
Recent studies have proven that the training of neural machine translation (NMT) can be facilitated by mimicking the learning process of humans.
1 code implementation • 14 Jul 2020 • Xuancheng Huang, Jiacheng Zhang, Zhixing Tan, Derek F. Wong, Huanbo Luan, Jingfang Xu, Maosong Sun, Yang Liu
System combination is an important technique for combining the hypotheses of different machine translation systems to improve translation performance.
no code implementations • ACL 2020 • Yikai Zhou, Baosong Yang, Derek F. Wong, Yu Wan, Lidia S. Chao
We propose uncertainty-aware curriculum learning, which is motivated by the intuition that: 1) the higher the uncertainty in a translation pair, the more complex and rarer the information it contains; and 2) the end of the decline in model uncertainty indicates the completeness of current training stage.
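Both intuitions can be sketched in a few lines, with an invented rarity-based proxy standing in for the paper's actual uncertainty estimates.

```python
# Hedged sketch of the two intuitions: rank pairs by a data-uncertainty
# proxy, and end a curriculum stage once model uncertainty stops declining.

def data_uncertainty(pair: tuple, word_freq: dict) -> float:
    # Intuition 1): rarer words imply higher uncertainty, so the pair is
    # scheduled later in training. Rarity here is a toy proxy.
    src = pair[0].split()
    if not src:
        return 0.0
    return sum(1.0 / max(word_freq.get(w, 1), 1) for w in src) / len(src)

def stage_complete(uncertainty_history: list, window: int = 3,
                   tol: float = 1e-3) -> bool:
    # Intuition 2): the end of the decline in model uncertainty marks the
    # completion of the current training stage.
    if len(uncertainty_history) <= window:
        return False
    recent = uncertainty_history[-window:]
    return max(recent) - min(recent) < tol
```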
1 code implementation • ACL 2020 • Xuebo Liu, Houtim Lai, Derek F. Wong, Lidia S. Chao
We use the norm (aka length or magnitude) of a word embedding as a measure of 1) the difficulty of the sentence, 2) the competence of the model, and 3) the weight of the sentence.
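A minimal numpy sketch of measure 1), with random placeholder embeddings: sentence difficulty as the mean embedding norm of its words; competence and weight would be derived analogously.

```python
# Hedged sketch: embedding norm as a difficulty measure for curriculum
# learning. Embeddings are random placeholders, not trained vectors.

import numpy as np

def sentence_difficulty(sentence: str, embeddings: dict) -> float:
    # Average embedding norm of the sentence's in-vocabulary words.
    norms = [np.linalg.norm(embeddings[w]) for w in sentence.split()
             if w in embeddings]
    return float(np.mean(norms)) if norms else 0.0

vocab = {w: np.random.randn(16) for w in "the cat sat on a rare mat".split()}
print(sentence_difficulty("the cat sat", vocab))
```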
2 code implementations • 11 Dec 2019 • Yu Wan, Baosong Yang, Derek F. Wong, Lidia S. Chao, Haihua Du, Ben C. H. Ao
As a special machine translation task, dialect translation has two main characteristics: 1) a lack of parallel training corpora; and 2) grammatical similarity between the two sides of the translation.
1 code implementation • ACL 2019 • Mingzhou Xu, Derek F. Wong, Baosong Yang, Yue Zhang, Lidia S. Chao
Self-attention networks have received increasing research attention.
no code implementations • ACL 2019 • Xuebo Liu, Derek F. Wong, Yang Liu, Lidia S. Chao, Tong Xiao, Jingbo Zhu
For similar source and target words, their embeddings tend to share a subset of features, and the words cooperatively learn these common representation units.
2 code implementations • ACL 2019 • Qiang Wang, Bei Li, Tong Xiao, Jingbo Zhu, Changliang Li, Derek F. Wong, Lidia S. Chao
The Transformer is the state-of-the-art model in recent machine translation evaluations.
1 code implementation • ACL 2019 • Baosong Yang, Long-Yue Wang, Derek F. Wong, Lidia S. Chao, Zhaopeng Tu
Self-attention networks (SANs) have attracted considerable interest due to their high parallelization and strong performance on a variety of NLP tasks, e.g., machine translation.
no code implementations • 31 Oct 2018 • Baosong Yang, Long-Yue Wang, Derek F. Wong, Lidia S. Chao, Zhaopeng Tu
Self-attention network (SAN) has recently attracted increasing interest due to its fully parallelized computation and flexibility in modeling dependencies.
no code implementations • EMNLP 2018 • Baosong Yang, Zhaopeng Tu, Derek F. Wong, Fandong Meng, Lidia S. Chao, Tong Zhang
Self-attention networks have proven to be of profound value for their strength in capturing global dependencies.
Ranked #29 on Machine Translation on WMT2014 English-German
no code implementations • EMNLP 2017 • Baosong Yang, Derek F. Wong, Tong Xiao, Lidia S. Chao, Jingbo Zhu
This paper proposes a hierarchical attentional neural translation model which focuses on enhancing source-side hierarchical representations by covering both local and global semantic information using a bidirectional tree-based encoder.
no code implementations • LREC 2014 • Liang Tian, Derek F. Wong, Lidia S. Chao, Paulo Quaresma, Francisco Oliveira, Yi Lu, Shuo Li, Yiming Wang, Long-Yue Wang
This paper describes the acquisition of a large-scale, high-quality parallel corpus for English and Chinese.