Search Results for author: Lidia S. Chao

Found 58 papers, 20 papers with code

RoBLEURT Submission for WMT2021 Metrics Task

no code implementations • WMT (EMNLP) 2021 • Yu Wan, Dayiheng Liu, Baosong Yang, Tianchi Bi, Haibo Zhang, Boxing Chen, Weihua Luo, Derek F. Wong, Lidia S. Chao

After investigating the recent advances of trainable metrics, we conclude several aspects of vital importance to obtain a well-performed metric model by: 1) jointly leveraging the advantages of source-included model and reference-only model, 2) continuously pre-training the model with massive synthetic data pairs, and 3) fine-tuning the model with data denoising strategy.

Denoising

Paper
Add Code

A Two-Stage Prediction-Aware Contrastive Learning Framework for Multi-Intent NLU

no code implementations • 5 May 2024 • Guanhua Chen, Yutong Yao, Derek F. Wong, Lidia S. Chao

Multi-intent natural language understanding (NLU) presents a formidable challenge due to the model confusion arising from multiple intents within a single utterance.

Paper
Add Code

3AM: An Ambiguity-Aware Multi-Modal Machine Translation Dataset

1 code implementation • 29 Apr 2024 • Xinyu Ma, Xuebo Liu, Derek F. Wong, Jun Rao, Bei Li, Liang Ding, Lidia S. Chao, DaCheng Tao, Min Zhang

Experimental results show that MMT models trained on our dataset exhibit a greater ability to exploit visual information than those trained on other MMT datasets.

Multimodal Machine Translation Sentence +2

Paper
Code

Prefix Text as a Yarn: Eliciting Non-English Alignment in Foundation Language Model

no code implementations • 25 Apr 2024 • Runzhe Zhan, Xinyi Yang, Derek F. Wong, Lidia S. Chao, Yue Zhang

While supervised fine-tuning (SFT) has been a straightforward approach for tailoring the output of foundation large language model (LLM) to specific preferences, concerns have been raised about the depth of this alignment, with some critiques suggesting it is merely "superficial".

Language Modelling Large Language Model +2

Paper
Add Code

Let's Focus on Neuron: Neuron-Level Supervised Fine-tuning for Large Language Model

no code implementations • 18 Mar 2024 • Haoyun Xu, Runzhe Zhan, Derek F. Wong, Lidia S. Chao

Large Language Models (LLMs) are composed of neurons that exhibit various behaviors and roles, which become increasingly diversified as models scale.

Language Modelling Large Language Model

Paper
Add Code

A Survey on LLM-Generated Text Detection: Necessity, Methods, and Future Directions

1 code implementation • 23 Oct 2023 • Junchao Wu, Shu Yang, Runzhe Zhan, Yulin Yuan, Derek F. Wong, Lidia S. Chao

In this survey, we collate recent research breakthroughs in this area and underscore the pressing need to bolster detector research.

LLM-generated Text Detection Text Detection

122

Paper
Code

Human-in-the-loop Machine Translation with Large Language Model

1 code implementation • 13 Oct 2023 • Xinyi Yang, Runzhe Zhan, Derek F. Wong, Junchao Wu, Lidia S. Chao

The large language model (LLM) has garnered significant attention due to its in-context learning mechanisms and emergent capabilities.

In-Context Learning Language Modelling +5

Paper
Code

Can LMs Generalize to Future Data? An Empirical Analysis on Text Summarization

1 code implementation • 3 May 2023 • Chi Seng Cheang, Hou Pong Chan, Derek F. Wong, Xuebo Liu, Zhaocong Li, Yanming Sun, Shudong Liu, Lidia S. Chao

Moreover, the knowledge memorized by PLMs may quickly become outdated, which affects the generalization performance of PLMs on future data.

Abstractive Text Summarization

Paper
Code

Is ChatGPT a Highly Fluent Grammatical Error Correction System? A Comprehensive Evaluation

no code implementations • 4 Apr 2023 • Tao Fang, Shu Yang, Kaixin Lan, Derek F. Wong, Jinpeng Hu, Lidia S. Chao, Yue Zhang

To showcase its capabilities in GEC, we design zero-shot chain-of-thought (CoT) and few-shot CoT settings using in-context learning for ChatGPT.

Grammatical Error Correction In-Context Learning +2

Paper
Add Code

ConsistTL: Modeling Consistency in Transfer Learning for Low-Resource Neural Machine Translation

1 code implementation • 8 Dec 2022 • Zhaocong Li, Xuebo Liu, Derek F. Wong, Lidia S. Chao, Min Zhang

In this paper, we propose a novel transfer learning method for NMT, namely ConsistTL, which can continuously transfer knowledge from the parent model during the training of the child model.

Low-Resource Neural Machine Translation NMT +2

Paper
Code

Alibaba-Translate China's Submission for WMT 2022 Metrics Shared Task

1 code implementation • 18 Oct 2022 • Yu Wan, Keqin Bao, Dayiheng Liu, Baosong Yang, Derek F. Wong, Lidia S. Chao, Wenqiang Lei, Jun Xie

In this report, we present our submission to the WMT 2022 Metrics Shared Task.

Language Modelling Translation

Paper
Code

UniTE: Unified Translation Evaluation

2 code implementations • ACL 2022 • Yu Wan, Dayiheng Liu, Baosong Yang, Haibo Zhang, Boxing Chen, Derek F. Wong, Lidia S. Chao

Translation quality evaluation plays a crucial role in machine translation.

Machine Translation Multi-Task Learning +1

Paper
Code

Attention Mechanism with Energy-Friendly Operations

1 code implementation • Findings (ACL) 2022 • Yu Wan, Baosong Yang, Dayiheng Liu, Rong Xiao, Derek F. Wong, Haibo Zhang, Boxing Chen, Lidia S. Chao

Attention mechanism has become the dominant module in natural language processing models.

Machine Translation Translation

Paper
Code

RoBLEURT Submission for the WMT2021 Metrics Task

no code implementations • 28 Apr 2022 • Yu Wan, Dayiheng Liu, Baosong Yang, Tianchi Bi, Haibo Zhang, Boxing Chen, Weihua Luo, Derek F. Wong, Lidia S. Chao

Denoising

Paper
Add Code

Variance-Aware Machine Translation Test Sets

1 code implementation • 7 Nov 2021 • Runzhe Zhan, Xuebo Liu, Derek F. Wong, Lidia S. Chao

We release 70 small and discriminative test sets for machine translation (MT) evaluation called variance-aware test sets (VAT), covering 35 translation directions from WMT16 to WMT20 competitions.

Machine Translation Translation

Paper
Code

On the Complementarity between Pre-Training and Back-Translation for Neural Machine Translation

1 code implementation • Findings (EMNLP) 2021 • Xuebo Liu, Longyue Wang, Derek F. Wong, Liang Ding, Lidia S. Chao, Shuming Shi, Zhaopeng Tu

Pre-training (PT) and back-translation (BT) are two simple and powerful methods to utilize monolingual data for improving the model performance of neural machine translation (NMT).

Decoder Machine Translation +3

Paper
Code

Difficulty-Aware Machine Translation Evaluation

1 code implementation • ACL 2021 • Runzhe Zhan, Xuebo Liu, Derek F. Wong, Lidia S. Chao

The high-quality translation results produced by machine translation (MT) systems still pose a huge challenge for automatic evaluation.

Machine Translation Sentence +1

Paper
Code

On the Copying Behaviors of Pre-Training for Neural Machine Translation

1 code implementation • Findings (ACL) 2021 • Xuebo Liu, Longyue Wang, Derek F. Wong, Liang Ding, Lidia S. Chao, Shuming Shi, Zhaopeng Tu

In response to this problem, we propose a simple and effective method named copying penalty to control the copying behaviors in decoding.

Machine Translation NMT +1

Paper
Code

Meta-Curriculum Learning for Domain Adaptation in Neural Machine Translation

1 code implementation • 3 Mar 2021 • Runzhe Zhan, Xuebo Liu, Derek F. Wong, Lidia S. Chao

Meta-learning has been sufficiently validated to be beneficial for low-resource neural machine translation (NMT).

Domain Adaptation Low-Resource Neural Machine Translation +3

Paper
Code

Understanding and Improving Encoder Layer Fusion in Sequence-to-Sequence Learning

1 code implementation • ICLR 2021 • Xuebo Liu, Longyue Wang, Derek F. Wong, Liang Ding, Lidia S. Chao, Zhaopeng Tu

Encoder layer fusion (EncoderFusion) is a technique to fuse all the encoder layers (instead of the uppermost layer) for sequence-to-sequence (Seq2Seq) models, which has proven effective on various NLP tasks.

Decoder Grammatical Error Correction +4

Paper
Code

Document Graph for Neural Machine Translation

no code implementations • EMNLP 2021 • Mingzhou Xu, Liangyou Li, Derek. F. Wong, Qun Liu, Lidia S. Chao

Previous works have shown that contextual information can improve the performance of neural machine translation (NMT).

Machine Translation NMT +1

Paper
Add Code

Self-Paced Learning for Neural Machine Translation

1 code implementation • EMNLP 2020 • Yu Wan, Baosong Yang, Derek F. Wong, Yikai Zhou, Lidia S. Chao, Haibo Zhang, Boxing Chen

Recent studies have proven that the training of neural machine translation (NMT) can be facilitated by mimicking the learning process of humans.

Machine Translation NMT +2

Paper
Code

Uncertainty-Aware Curriculum Learning for Neural Machine Translation

no code implementations • ACL 2020 • Yikai Zhou, Baosong Yang, Derek F. Wong, Yu Wan, Lidia S. Chao

We propose uncertainty-aware curriculum learning, which is motivated by the intuition that: 1) the higher the uncertainty in a translation pair, the more complex and rarer the information it contains; and 2) the end of the decline in model uncertainty indicates the completeness of current training stage.

Machine Translation NMT +1

Paper
Add Code

Norm-Based Curriculum Learning for Neural Machine Translation

1 code implementation • ACL 2020 • Xuebo Liu, Houtim Lai, Derek F. Wong, Lidia S. Chao

We use the norm (aka length or module) of a word embedding as a measure of 1) the difficulty of the sentence, 2) the competence of the model, and 3) the weight of the sentence.

Machine Translation NMT +2

Paper
Code

Unsupervised Neural Dialect Translation with Commonality and Diversity Modeling

2 code implementations • 11 Dec 2019 • Yu Wan, Baosong Yang, Derek F. Wong, Lidia S. Chao, Haihua Du, Ben C. H. Ao

As a special machine translation task, dialect translation has two main characteristics: 1) lack of parallel training corpus; and 2) possessing similar grammar between two sides of the translation.

Machine Translation Translation

Paper
Code

Leveraging Local and Global Patterns for Self-Attention Networks

1 code implementation • ACL 2019 • Mingzhou Xu, Derek F. Wong, Baosong Yang, Yue Zhang, Lidia S. Chao

Self-attention networks have received increasing research attention.

Machine Translation Sentence +1

Paper
Code

Shared-Private Bilingual Word Embeddings for Neural Machine Translation

no code implementations • ACL 2019 • Xuebo Liu, Derek F. Wong, Yang Liu, Lidia S. Chao, Tong Xiao, Jingbo Zhu

For similar source and target words, their embeddings tend to share a part of the features and they cooperatively learn these common representation units.

Machine Translation NMT +3

Paper
Add Code

Learning Deep Transformer Models for Machine Translation

2 code implementations • ACL 2019 • Qiang Wang, Bei Li, Tong Xiao, Jingbo Zhu, Changliang Li, Derek F. Wong, Lidia S. Chao

Transformer is the state-of-the-art model in recent machine translation evaluations.

Machine Translation Translation

113

Paper
Code

Assessing the Ability of Self-Attention Networks to Learn Word Order

1 code implementation • ACL 2019 • Baosong Yang, Long-Yue Wang, Derek F. Wong, Lidia S. Chao, Zhaopeng Tu

Self-attention networks (SAN) have attracted a lot of interests due to their high parallelization and strong performance on a variety of NLP tasks, e. g. machine translation.

Machine Translation Position +1

Paper
Code

Convolutional Self-Attention Networks

no code implementations • NAACL 2019 • Baosong Yang, Long-Yue Wang, Derek Wong, Lidia S. Chao, Zhaopeng Tu

Self-attention networks (SANs) have drawn increasing interest due to their high parallelization in computation and flexibility in modeling dependencies.

Machine Translation Translation

Paper
Add Code

Context-Aware Self-Attention Networks

no code implementations • 15 Feb 2019 • Baosong Yang, Jian Li, Derek Wong, Lidia S. Chao, Xing Wang, Zhaopeng Tu

Self-attention model have shown its flexibility in parallel computation and the effectiveness on modeling both long- and short-term dependencies.

Translation

Paper
Add Code

Convolutional Self-Attention Network

no code implementations • 31 Oct 2018 • Baosong Yang, Long-Yue Wang, Derek F. Wong, Lidia S. Chao, Zhaopeng Tu

Self-attention network (SAN) has recently attracted increasing interest due to its fully parallelized computation and flexibility in modeling dependencies.

Translation

Paper
Add Code

Modeling Localness for Self-Attention Networks

no code implementations • EMNLP 2018 • Baosong Yang, Zhaopeng Tu, Derek F. Wong, Fandong Meng, Lidia S. Chao, Tong Zhang

Self-attention networks have proven to be of profound value for its strength of capturing global dependencies.

Ranked #29 on Machine Translation on WMT2014 English-German

Machine Translation Translation

Paper
Add Code

Towards Bidirectional Hierarchical Representations for Attention-Based Neural Machine Translation

no code implementations • EMNLP 2017 • Baosong Yang, Derek F. Wong, Tong Xiao, Lidia S. Chao, Jingbo Zhu

This paper proposes a hierarchical attentional neural translation model which focuses on enhancing source-side hierarchical representations by covering both local and global semantic information using a bidirectional tree-based encoder.

Machine Translation Translation