Search Results for author: Derek F. Wong

Found 66 papers, 26 papers with code

Exploring COVID-19-related Twitter Topic Dynamics across Countries

no code implementations • CCL 2020 • Shuailong Liang, Derek F. Wong, Yue Zhang

Based on 500,000 tweets posted in different countries and regions on Twitter between 22 January 2020 and 30 April 2020, we study COVID-19-related topics and people's opinions. We find both commonalities and differences in the concerns and views of Twitter users across countries, and sentiment also varies across topics. Most tweets carry strong emotions, with expressions of love and support being especially common. Overall, sentiment became increasingly positive over time.

RoBLEURT Submission for WMT2021 Metrics Task

no code implementations • WMT (EMNLP) 2021 • Yu Wan, Dayiheng Liu, Baosong Yang, Tianchi Bi, Haibo Zhang, Boxing Chen, Weihua Luo, Derek F. Wong, Lidia S. Chao

After investigating recent advances in trainable metrics, we identify several aspects as vital to obtaining a well-performing metric model: 1) jointly leveraging the advantages of a source-included model and a reference-only model, 2) continually pre-training the model on massive synthetic data pairs, and 3) fine-tuning the model with a data denoising strategy.

Denoising

Let's Focus on Neuron: Neuron-Level Supervised Fine-tuning for Large Language Model

no code implementations • 18 Mar 2024 • Haoyun Xu, Runzhe Zhan, Derek F. Wong, Lidia S. Chao

Large Language Models (LLMs) are composed of neurons that exhibit various behaviors and roles, which become increasingly diversified as models scale.
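As a rough illustration of what tuning at the neuron level can look like (not necessarily the paper's exact procedure; the toy layer and the selected indices below are assumptions), gradients of all but a few chosen neuron rows can be masked so that only those neurons are updated:

```python
import torch
import torch.nn as nn

# Toy layer: each output row of the weight matrix is treated as one "neuron".
layer = nn.Linear(16, 8)

# Hypothetical indices of neurons selected for tuning.
selected = torch.tensor([1, 4, 6])
mask = torch.zeros(layer.out_features, 1)
mask[selected] = 1.0

# Zero the gradients of every non-selected neuron so only the chosen rows update.
layer.weight.register_hook(lambda grad: grad * mask)
layer.bias.register_hook(lambda grad: grad * mask.squeeze(1))

optimizer = torch.optim.SGD(layer.parameters(), lr=0.1)
loss = layer(torch.randn(4, 16)).pow(2).mean()
loss.backward()
optimizer.step()  # only rows 1, 4 and 6 of the weight (and bias) change
```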

SelectIT: Selective Instruction Tuning for Large Language Models via Uncertainty-Aware Self-Reflection

1 code implementation • 26 Feb 2024 • Liangxin Liu, Xuebo Liu, Derek F. Wong, Dongfang Li, Ziyi Wang, Baotian Hu, Min Zhang

In this work, we propose a novel approach, termed SelectIT, that capitalizes on the foundational capabilities of the LLM itself.

Anchor-based Large Language Models

no code implementations • 12 Feb 2024 • Jianhui Pang, Fanghua Ye, Derek F. Wong, Longyue Wang

Large language models (LLMs) predominantly employ decoder-only transformer architectures, necessitating the retention of keys/values information for historical tokens to provide contextual information and avoid redundant computation.

Computational Efficiency • Question Answering
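For context on the keys/values retention the abstract mentions, here is a minimal sketch of a decoder-side key/value cache for a toy single-head attention; it illustrates the linearly growing memory the paper targets, not the anchor-based compression itself (the sizes and projection matrices are toy assumptions):

```python
import torch

torch.manual_seed(0)
d = 8                                   # toy head dimension
Wq, Wk, Wv = (torch.randn(d, d) for _ in range(3))

k_cache, v_cache = [], []               # keys/values kept for all historical tokens

def decode_step(x):
    """Attend the new token x (shape [d]) over every cached historical token."""
    q = x @ Wq
    k_cache.append(x @ Wk)
    v_cache.append(x @ Wv)
    K, V = torch.stack(k_cache), torch.stack(v_cache)   # grow with sequence length
    attn = torch.softmax(q @ K.T / d ** 0.5, dim=-1)
    return attn @ V

for _ in range(5):
    out = decode_step(torch.randn(d))
print(len(k_cache), out.shape)          # 5 cached key/value pairs after 5 steps
```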

Benchmarking LLMs via Uncertainty Quantification

1 code implementation • 23 Jan 2024 • Fanghua Ye, Mingming Yang, Jianhui Pang, Longyue Wang, Derek F. Wong, Emine Yilmaz, Shuming Shi, Zhaopeng Tu

The proliferation of open-source Large Language Models (LLMs) from various institutions has highlighted the urgent need for comprehensive evaluation methods.

Benchmarking • Uncertainty Quantification
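One generic way to quantify uncertainty on a multiple-choice benchmark item is the entropy of the model's distribution over answer options; the sketch below only illustrates that idea, not the paper's specific procedure, and the probabilities are hypothetical:

```python
import math

def predictive_entropy(option_probs):
    """Shannon entropy (nats) of a model's distribution over answer options."""
    return -sum(p * math.log(p) for p in option_probs if p > 0)

# Hypothetical option probabilities for two models on the same four-option item.
confident = [0.90, 0.05, 0.03, 0.02]
uncertain = [0.30, 0.28, 0.22, 0.20]
print(predictive_entropy(confident))   # ~0.43: the model is certain
print(predictive_entropy(uncertain))   # ~1.37: near-uniform; accuracy alone hides this
```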

Salute the Classic: Revisiting Challenges of Machine Translation in the Age of Large Language Models

1 code implementation • 16 Jan 2024 • Jianhui Pang, Fanghua Ye, Longyue Wang, Dian Yu, Derek F. Wong, Shuming Shi, Zhaopeng Tu

This study revisits these challenges, offering insights into their ongoing relevance in the context of advanced Large Language Models (LLMs): domain mismatch, amount of parallel data, rare word prediction, translation of long sentences, attention model as word alignment, and sub-optimal beam search.

Machine Translation • NMT • +2

A Survey on LLM-generated Text Detection: Necessity, Methods, and Future Directions

1 code implementation • 23 Oct 2023 • Junchao Wu, Shu Yang, Runzhe Zhan, Yulin Yuan, Derek F. Wong, Lidia S. Chao

In this survey, we collate recent research breakthroughs in this area and underscore the pressing need to bolster detector research.

LLM-generated Text Detection • Text Detection

Human-in-the-loop Machine Translation with Large Language Model

1 code implementation • 13 Oct 2023 • Xinyi Yang, Runzhe Zhan, Derek F. Wong, Junchao Wu, Lidia S. Chao

The large language model (LLM) has garnered significant attention due to its in-context learning mechanisms and emergent capabilities.

In-Context Learning • Language Modelling • +5

How Does Pretraining Improve Discourse-Aware Translation?

no code implementations • 31 May 2023 • Zhihong Huang, Longyue Wang, Siyou Liu, Derek F. Wong

To bridge this gap, we introduce a probing task to interpret the ability of PLMs to capture discourse relation knowledge.

Machine Translation • NMT • +1

Can LMs Generalize to Future Data? An Empirical Analysis on Text Summarization

1 code implementation • 3 May 2023 • Chi Seng Cheang, Hou Pong Chan, Derek F. Wong, Xuebo Liu, Zhaocong Li, Yanming Sun, Shudong Liu, Lidia S. Chao

Moreover, the knowledge memorized by PLMs may quickly become outdated, which affects the generalization performance of PLMs on future data.

Abstractive Text Summarization

A Paradigm Shift: The Future of Machine Translation Lies with Large Language Models

no code implementations • 2 May 2023 • Chenyang Lyu, Zefeng Du, Jitao Xu, Yitao Duan, Minghao Wu, Teresa Lynn, Alham Fikri Aji, Derek F. Wong, Longyue Wang

We conclude by emphasizing the critical role of LLMs in guiding the future evolution of MT and offer a roadmap for future exploration in the sector.

Document Translation • Machine Translation • +2

Is ChatGPT a Highly Fluent Grammatical Error Correction System? A Comprehensive Evaluation

no code implementations • 4 Apr 2023 • Tao Fang, Shu Yang, Kaixin Lan, Derek F. Wong, Jinpeng Hu, Lidia S. Chao, Yue Zhang

To showcase its capabilities in GEC, we design zero-shot chain-of-thought (CoT) and few-shot CoT settings using in-context learning for ChatGPT.

Grammatical Error Correction • In-Context Learning • +2
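A hedged sketch of what such prompt settings could look like; the wording and the in-context example are illustrative placeholders, not the paper's actual templates:

```python
# Illustrative prompt builders; wording and the in-context example are hypothetical.
def zero_shot_cot_prompt(sentence):
    return (
        "Correct the grammatical errors in the following sentence.\n"
        f"Sentence: {sentence}\n"
        "Let's think step by step."
    )

def few_shot_cot_prompt(sentence):
    demo = (
        "Sentence: She go to school yesterday.\n"
        "Reasoning: 'go' must agree with the past-time adverb 'yesterday'.\n"
        "Correction: She went to school yesterday.\n\n"
    )
    return demo + f"Sentence: {sentence}\nReasoning:"

print(zero_shot_cot_prompt("He have three cat."))
```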

Towards Fine-Grained Information: Identifying the Type and Location of Translation Errors

no code implementations • 17 Feb 2023 • Keqin Bao, Yu Wan, Dayiheng Liu, Baosong Yang, Wenqiang Lei, Xiangnan He, Derek F. Wong, Jun Xie

In this paper, we propose the Fine-Grained Translation Error Detection (FG-TED) task, which aims to identify both the position and the type of translation errors in given source-hypothesis sentence pairs.

Position • Sentence • +1

Toward Human-Like Evaluation for Natural Language Generation with Error Analysis

1 code implementation • 20 Dec 2022 • Qingyu Lu, Liang Ding, Liping Xie, Kanjian Zhang, Derek F. Wong, DaCheng Tao

To this end, we augment BARTScore by incorporating human-like error analysis strategies, yielding BARTScore++, where the final score consists of evaluations of both major and minor errors.

Language Modelling • Machine Translation • +2
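A minimal sketch of the kind of combination described above, assuming a simple additive penalty; the weights and the functional form are illustrative, not the paper's formula:

```python
def combined_score(base_score, major_errors, minor_errors, w_major=1.0, w_minor=0.2):
    """Penalise a base metric score by weighted counts of major and minor errors.
    The additive form and the weights are assumptions made for illustration."""
    return base_score - w_major * major_errors - w_minor * minor_errors

# Hypothetical BARTScore-style base score (a log-likelihood, hence negative).
print(combined_score(base_score=-2.1, major_errors=1, minor_errors=3))  # -3.7
```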

ConsistTL: Modeling Consistency in Transfer Learning for Low-Resource Neural Machine Translation

1 code implementation • 8 Dec 2022 • Zhaocong Li, Xuebo Liu, Derek F. Wong, Lidia S. Chao, Min Zhang

In this paper, we propose a novel transfer learning method for NMT, namely ConsistTL, which can continuously transfer knowledge from the parent model during the training of the child model.

Low-Resource Neural Machine Translation • NMT • +2

Alibaba-Translate China's Submission for WMT 2022 Quality Estimation Shared Task

1 code implementation • 18 Oct 2022 • Keqin Bao, Yu Wan, Dayiheng Liu, Baosong Yang, Wenqiang Lei, Xiangnan He, Derek F. Wong, Jun Xie

In this paper, we present our submission to the sentence-level MQM benchmark at Quality Estimation Shared Task, named UniTE (Unified Translation Evaluation).

Sentence • XLM-R

RoBLEURT Submission for the WMT2021 Metrics Task

no code implementations • 28 Apr 2022 • Yu Wan, Dayiheng Liu, Baosong Yang, Tianchi Bi, Haibo Zhang, Boxing Chen, Weihua Luo, Derek F. Wong, Lidia S. Chao

After investigating recent advances in trainable metrics, we identify several aspects as vital to obtaining a well-performing metric model: 1) jointly leveraging the advantages of a source-included model and a reference-only model, 2) continually pre-training the model on massive synthetic data pairs, and 3) fine-tuning the model with a data denoising strategy.

Denoising

Variance-Aware Machine Translation Test Sets

1 code implementation • 7 Nov 2021 • Runzhe Zhan, Xuebo Liu, Derek F. Wong, Lidia S. Chao

We release 70 small and discriminative test sets for machine translation (MT) evaluation called variance-aware test sets (VAT), covering 35 translation directions from WMT16 to WMT20 competitions.

Machine Translation • Translation

On the Complementarity between Pre-Training and Back-Translation for Neural Machine Translation

1 code implementation • Findings (EMNLP) 2021 • Xuebo Liu, Longyue Wang, Derek F. Wong, Liang Ding, Lidia S. Chao, Shuming Shi, Zhaopeng Tu

Pre-training (PT) and back-translation (BT) are two simple and powerful methods to utilize monolingual data for improving the model performance of neural machine translation (NMT).

Machine Translation • NMT • +2

Difficulty-Aware Machine Translation Evaluation

1 code implementation • ACL 2021 • Runzhe Zhan, Xuebo Liu, Derek F. Wong, Lidia S. Chao

The high-quality translation results produced by machine translation (MT) systems still pose a huge challenge for automatic evaluation.

Machine Translation • Sentence • +1

Rejuvenating Low-Frequency Words: Making the Most of Parallel Data in Non-Autoregressive Translation

1 code implementation • ACL 2021 • Liang Ding, Longyue Wang, Xuebo Liu, Derek F. Wong, DaCheng Tao, Zhaopeng Tu

Results demonstrate that the proposed approach can significantly and universally improve translation quality by reducing translation errors on low-frequency words.

Knowledge Distillation • Translation

Understanding and Improving Encoder Layer Fusion in Sequence-to-Sequence Learning

1 code implementation • ICLR 2021 • Xuebo Liu, Longyue Wang, Derek F. Wong, Liang Ding, Lidia S. Chao, Zhaopeng Tu

Encoder layer fusion (EncoderFusion) is a technique to fuse all the encoder layers (instead of the uppermost layer) for sequence-to-sequence (Seq2Seq) models, which has proven effective on various NLP tasks.

Grammatical Error Correction • Machine Translation • +3
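A common way to fuse all encoder layers is a learned softmax-weighted sum of their outputs; the sketch below shows that variant under toy sizes, for illustration only, and is not claimed to be the exact fusion analysed in the paper:

```python
import torch
import torch.nn as nn

class LayerFusion(nn.Module):
    """Softmax-weighted sum of all encoder layer outputs (one common fusion variant)."""
    def __init__(self, num_layers):
        super().__init__()
        self.weights = nn.Parameter(torch.zeros(num_layers))

    def forward(self, layer_outputs):                 # [num_layers, seq_len, dim]
        w = torch.softmax(self.weights, dim=0)
        return (w[:, None, None] * layer_outputs).sum(dim=0)

layers = torch.randn(6, 10, 512)                      # 6 toy encoder layers
fused = LayerFusion(num_layers=6)(layers)
print(fused.shape)                                    # torch.Size([10, 512])
```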

Understanding and Improving Lexical Choice in Non-Autoregressive Translation

no code implementations • ICLR 2021 • Liang Ding, Longyue Wang, Xuebo Liu, Derek F. Wong, DaCheng Tao, Zhaopeng Tu

To this end, we introduce an extra Kullback-Leibler divergence term derived by comparing the lexical choice of the NAT model with that embedded in the raw data.

Knowledge Distillation • Translation
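The extra term has the standard KL form; a minimal sketch, assuming toy per-token distributions over a four-word vocabulary:

```python
import torch
import torch.nn.functional as F

# Toy per-token distributions over a four-word vocabulary.
raw_data_dist = torch.tensor([0.70, 0.20, 0.05, 0.05])   # lexical choice in the raw data
nat_logits = torch.tensor([0.5, 1.2, -0.3, 0.1])          # NAT model's predictions

# KL(raw data || model): the general form of the extra penalty the abstract describes.
kl = F.kl_div(F.log_softmax(nat_logits, dim=-1), raw_data_dist, reduction="sum")
print(kl.item())
```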

Self-Paced Learning for Neural Machine Translation

1 code implementation • EMNLP 2020 • Yu Wan, Baosong Yang, Derek F. Wong, Yikai Zhou, Lidia S. Chao, Haibo Zhang, Boxing Chen

Recent studies have proven that the training of neural machine translation (NMT) can be facilitated by mimicking the learning process of humans.

Machine Translation • NMT • +2

Modeling Voting for System Combination in Machine Translation

1 code implementation • 14 Jul 2020 • Xuancheng Huang, Jiacheng Zhang, Zhixing Tan, Derek F. Wong, Huanbo Luan, Jingfang Xu, Maosong Sun, Yang Liu

System combination is an important technique for combining the hypotheses of different machine translation systems to improve translation performance.

Machine Translation • Translation

Uncertainty-Aware Curriculum Learning for Neural Machine Translation

no code implementations • ACL 2020 • Yikai Zhou, Baosong Yang, Derek F. Wong, Yu Wan, Lidia S. Chao

We propose uncertainty-aware curriculum learning, which is motivated by the intuition that: 1) the higher the uncertainty in a translation pair, the more complex and rarer the information it contains; and 2) the end of the decline in model uncertainty indicates the completeness of the current training stage.

Machine Translation • NMT • +1
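As a toy illustration of the curriculum intuition, pairs can be ordered from low to high uncertainty; the uncertainty scores below are hand-assigned placeholders, whereas the paper estimates data and model uncertainty rather than fixing them by hand:

```python
# Toy curriculum ordering: train on low-uncertainty pairs first.
pairs = [
    ("ein kleines Haus", "a small house", 0.4),
    ("Rechtsschutzversicherung", "legal expenses insurance", 2.9),
    ("guten Morgen", "good morning", 0.2),
]

for src, tgt, u in sorted(pairs, key=lambda p: p[2]):   # certain -> uncertain
    print(f"{u:.1f}  {src} -> {tgt}")
```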

Norm-Based Curriculum Learning for Neural Machine Translation

1 code implementation • ACL 2020 • Xuebo Liu, Houtim Lai, Derek F. Wong, Lidia S. Chao

We use the norm (a.k.a. length or modulus) of a word embedding as a measure of 1) the difficulty of the sentence, 2) the competence of the model, and 3) the weight of the sentence.

Machine Translation • NMT • +2
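A minimal sketch of using embedding norms as a sentence-difficulty score; the embedding table below is random, so it only illustrates the computation, not the empirical link between norm and word frequency that the approach relies on:

```python
import numpy as np

rng = np.random.default_rng(0)
vocab = {"the": 0, "cat": 1, "sat": 2, "ontology": 3}
emb = rng.normal(size=(len(vocab), 32))     # random toy table, not trained embeddings

def sentence_difficulty(tokens):
    """Mean word-embedding norm of a sentence, used as its difficulty score."""
    return float(np.mean([np.linalg.norm(emb[vocab[t]]) for t in tokens]))

print(sentence_difficulty(["the", "cat", "sat"]))
print(sentence_difficulty(["the", "ontology"]))
```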

Unsupervised Neural Dialect Translation with Commonality and Diversity Modeling

2 code implementations • 11 Dec 2019 • Yu Wan, Baosong Yang, Derek F. Wong, Lidia S. Chao, Haihua Du, Ben C. H. Ao

As a special machine translation task, dialect translation has two main characteristics: 1) lack of parallel training corpus; and 2) possessing similar grammar between two sides of the translation.

Machine Translation • Translation

Shared-Private Bilingual Word Embeddings for Neural Machine Translation

no code implementations • ACL 2019 • Xuebo Liu, Derek F. Wong, Yang Liu, Lidia S. Chao, Tong Xiao, Jingbo Zhu

For similar source and target words, their embeddings tend to share a part of the features and they cooperatively learn these common representation units.

Machine Translation • NMT • +3
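A hedged sketch of one way to realise shared-private embeddings, splitting each vector into a sub-vector tied across a source-target word pair and a language-specific sub-vector; the dimension split and the pairing scheme are assumptions, not necessarily the paper's design:

```python
import torch
import torch.nn as nn

class SharedPrivateEmbedding(nn.Module):
    """Each source-target word pair gets one shared sub-vector plus a
    language-specific private sub-vector; the split sizes are assumptions."""
    def __init__(self, num_pairs, shared_dim=256, private_dim=256):
        super().__init__()
        self.shared = nn.Embedding(num_pairs, shared_dim)      # tied across languages
        self.private_src = nn.Embedding(num_pairs, private_dim)
        self.private_tgt = nn.Embedding(num_pairs, private_dim)

    def forward(self, pair_ids, side):
        private = self.private_src if side == "src" else self.private_tgt
        return torch.cat([self.shared(pair_ids), private(pair_ids)], dim=-1)

emb = SharedPrivateEmbedding(num_pairs=1000)
ids = torch.tensor([3, 17])
print(emb(ids, "src").shape, emb(ids, "tgt").shape)            # both [2, 512]
```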

Assessing the Ability of Self-Attention Networks to Learn Word Order

1 code implementation • ACL 2019 • Baosong Yang, Long-Yue Wang, Derek F. Wong, Lidia S. Chao, Zhaopeng Tu

Self-attention networks (SANs) have attracted a lot of interest due to their high parallelizability and strong performance on a variety of NLP tasks, e.g., machine translation.

Machine Translation • Position • +1

Convolutional Self-Attention Network

no code implementations • 31 Oct 2018 • Baosong Yang, Long-Yue Wang, Derek F. Wong, Lidia S. Chao, Zhaopeng Tu

Self-attention network (SAN) has recently attracted increasing interest due to its fully parallelized computation and flexibility in modeling dependencies.

Translation

Towards Bidirectional Hierarchical Representations for Attention-Based Neural Machine Translation

no code implementations • EMNLP 2017 • Baosong Yang, Derek F. Wong, Tong Xiao, Lidia S. Chao, Jingbo Zhu

This paper proposes a hierarchical attentional neural translation model which focuses on enhancing source-side hierarchical representations by covering both local and global semantic information using a bidirectional tree-based encoder.

Machine Translation • Translation
