Search Results for author: Baosong Yang

Found 46 papers, 18 papers with code

Sampling-based Alignment and Hierarchical Sub-sentential Alignment in Chinese--Japanese Translation of Patents

no code implementations • WS 2015 • Wei Yang, Zhongwen Zhao, Baosong Yang, Yves Lepage

Paper
Add Code

Towards Bidirectional Hierarchical Representations for Attention-Based Neural Machine Translation

no code implementations • EMNLP 2017 • Baosong Yang, Derek F. Wong, Tong Xiao, Lidia S. Chao, Jingbo Zhu

This paper proposes a hierarchical attentional neural translation model which focuses on enhancing source-side hierarchical representations by covering both local and global semantic information using a bidirectional tree-based encoder.

Machine Translation Translation

Paper
Add Code

Multi-Head Attention with Disagreement Regularization

no code implementations • EMNLP 2018 • Jian Li, Zhaopeng Tu, Baosong Yang, Michael R. Lyu, Tong Zhang

Multi-head attention is appealing for the ability to jointly attend to information from different representation subspaces at different positions.

Translation

Paper
Add Code

Modeling Localness for Self-Attention Networks

no code implementations • EMNLP 2018 • Baosong Yang, Zhaopeng Tu, Derek F. Wong, Fandong Meng, Lidia S. Chao, Tong Zhang

Self-attention networks have proven to be of profound value for its strength of capturing global dependencies.

Ranked #29 on Machine Translation on WMT2014 English-German

Machine Translation Translation

Paper
Add Code

Convolutional Self-Attention Network

no code implementations • 31 Oct 2018 • Baosong Yang, Long-Yue Wang, Derek F. Wong, Lidia S. Chao, Zhaopeng Tu

Self-attention network (SAN) has recently attracted increasing interest due to its fully parallelized computation and flexibility in modeling dependencies.

Translation

Paper
Add Code

Context-Aware Self-Attention Networks

no code implementations • 15 Feb 2019 • Baosong Yang, Jian Li, Derek Wong, Lidia S. Chao, Xing Wang, Zhaopeng Tu

Self-attention model have shown its flexibility in parallel computation and the effectiveness on modeling both long- and short-term dependencies.

Translation

Paper
Add Code

Modeling Recurrence for Transformer

no code implementations • NAACL 2019 • Jie Hao, Xing Wang, Baosong Yang, Long-Yue Wang, Jinfeng Zhang, Zhaopeng Tu

In addition to the standard recurrent neural network, we introduce a novel attentive recurrent network to leverage the strengths of both attention and recurrent networks.

Machine Translation Translation

Paper
Add Code

Convolutional Self-Attention Networks

no code implementations • NAACL 2019 • Baosong Yang, Long-Yue Wang, Derek Wong, Lidia S. Chao, Zhaopeng Tu

Self-attention networks (SANs) have drawn increasing interest due to their high parallelization in computation and flexibility in modeling dependencies.

Machine Translation Translation

Paper
Add Code

Information Aggregation for Multi-Head Attention with Routing-by-Agreement

no code implementations • NAACL 2019 • Jian Li, Baosong Yang, Zi-Yi Dou, Xing Wang, Michael R. Lyu, Zhaopeng Tu

Multi-head attention is appealing for its ability to jointly extract different types of information from multiple representation subspaces.

Machine Translation Translation

Paper
Add Code

Assessing the Ability of Self-Attention Networks to Learn Word Order

1 code implementation • ACL 2019 • Baosong Yang, Long-Yue Wang, Derek F. Wong, Lidia S. Chao, Zhaopeng Tu

Self-attention networks (SAN) have attracted a lot of interests due to their high parallelization and strong performance on a variety of NLP tasks, e. g. machine translation.

Machine Translation Position +1

Paper
Code

Leveraging Local and Global Patterns for Self-Attention Networks

1 code implementation • ACL 2019 • Mingzhou Xu, Derek F. Wong, Baosong Yang, Yue Zhang, Lidia S. Chao

Self-attention networks have received increasing research attention.

Machine Translation Sentence +1

Paper
Code

Neuron Interaction Based Representation Composition for Neural Machine Translation

no code implementations • 22 Nov 2019 • Jian Li, Xing Wang, Baosong Yang, Shuming Shi, Michael R. Lyu, Zhaopeng Tu

Starting from this intuition, we propose a novel approach to compose representations learned by different components in neural machine translation (e. g., multi-layer networks or multi-head attention), based on modeling strong interactions among neurons in the representation vectors.

Machine Translation Translation

Paper
Add Code

Unsupervised Neural Dialect Translation with Commonality and Diversity Modeling

2 code implementations • 11 Dec 2019 • Yu Wan, Baosong Yang, Derek F. Wong, Lidia S. Chao, Haihua Du, Ben C. H. Ao

As a special machine translation task, dialect translation has two main characteristics: 1) lack of parallel training corpus; and 2) possessing similar grammar between two sides of the translation.

Machine Translation Translation

Paper
Code

Uncertainty-Aware Curriculum Learning for Neural Machine Translation

no code implementations • ACL 2020 • Yikai Zhou, Baosong Yang, Derek F. Wong, Yu Wan, Lidia S. Chao

We propose uncertainty-aware curriculum learning, which is motivated by the intuition that: 1) the higher the uncertainty in a translation pair, the more complex and rarer the information it contains; and 2) the end of the decline in model uncertainty indicates the completeness of current training stage.

Machine Translation NMT +1

Paper
Add Code

Self-Paced Learning for Neural Machine Translation

1 code implementation • EMNLP 2020 • Yu Wan, Baosong Yang, Derek F. Wong, Yikai Zhou, Lidia S. Chao, Haibo Zhang, Boxing Chen

Recent studies have proven that the training of neural machine translation (NMT) can be facilitated by mimicking the learning process of humans.

Machine Translation NMT +2

Paper
Code

Constraint Translation Candidates: A Bridge between Neural Query Translation and Cross-lingual Information Retrieval

no code implementations • 26 Oct 2020 • Tianchi Bi, Liang Yao, Baosong Yang, Haibo Zhang, Weihua Luo, Boxing Chen

Query translation (QT) is a key component in cross-lingual information retrieval system (CLIR).

Cross-Lingual Information Retrieval Machine Translation +3

Paper
Add Code

Exploiting Neural Query Translation into Cross Lingual Information Retrieval

no code implementations • 26 Oct 2020 • Liang Yao, Baosong Yang, Haibo Zhang, Weihua Luo, Boxing Chen

As a crucial role in cross-language information retrieval (CLIR), query translation has three main challenges: 1) the adequacy of translation; 2) the lack of in-domain parallel training data; and 3) the requisite of low latency.

Cross-Lingual Information Retrieval Data Augmentation +5

Paper
Add Code

Domain Transfer based Data Augmentation for Neural Query Translation

no code implementations • COLING 2020 • Liang Yao, Baosong Yang, Haibo Zhang, Boxing Chen, Weihua Luo

Query translation (QT) serves as a critical factor in successful cross-lingual information retrieval (CLIR).

Cross-Lingual Information Retrieval Data Augmentation +3

Paper
Add Code

Multi-Hop Transformer for Document-Level Machine Translation

no code implementations • NAACL 2021 • Long Zhang, Tong Zhang, Haibo Zhang, Baosong Yang, Wei Ye, Shikun Zhang

Document-level neural machine translation (NMT) has proven to be of profound value for its effectiveness on capturing contextual information.

Document Level Machine Translation Document Translation +4

Paper
Add Code

Towards User-Driven Neural Machine Translation

1 code implementation • ACL 2021 • Huan Lin, Liang Yao, Baosong Yang, Dayiheng Liu, Haibo Zhang, Weihua Luo, Degen Huang, Jinsong Su

Furthermore, we contribute the first Chinese-English parallel corpus annotated with user behavior called UDT-Corpus.

Contrastive Learning Machine Translation +3

Paper
Code

Bridging Subword Gaps in Pretrain-Finetune Paradigm for Natural Language Generation

no code implementations • ACL 2021 • Xin Liu, Baosong Yang, Dayiheng Liu, Haibo Zhang, Weihua Luo, Min Zhang, Haiying Zhang, Jinsong Su

A well-known limitation in pretrain-finetune paradigm lies in its inflexibility caused by the one-size-fits-all vocabulary.

Text Generation

Paper
Add Code

Leveraging Advantages of Interactive and Non-Interactive Models for Vector-Based Cross-Lingual Information Retrieval

no code implementations • 3 Nov 2021 • Linlong Xu, Baosong Yang, Xiaoyu Lv, Tianchi Bi, Dayiheng Liu, Haibo Zhang

Interactive and non-interactive model are the two de-facto standard frameworks in vector-based cross-lingual information retrieval (V-CLIR), which embed queries and documents in synchronous and asynchronous fashions, respectively.

Computational Efficiency Cross-Lingual Information Retrieval +4

Paper
Add Code

KGR^4: Retrieval, Retrospect, Refine and Rethink for Commonsense Generation

1 code implementation • 15 Dec 2021 • Xin Liu, Dayiheng Liu, Baosong Yang, Haibo Zhang, Junwei Ding, Wenqing Yao, Weihua Luo, Haiying Zhang, Jinsong Su

Generative commonsense reasoning requires machines to generate sentences describing an everyday scenario given several concepts, which has attracted much attention recently.

Retrieval Sentence

Paper
Code

Frequency-Aware Contrastive Learning for Neural Machine Translation

no code implementations • 29 Dec 2021 • Tong Zhang, Wei Ye, Baosong Yang, Long Zhang, Xingzhang Ren, Dayiheng Liu, Jinan Sun, Shikun Zhang, Haibo Zhang, Wen Zhao

Inspired by the observation that low-frequency words form a more compact embedding space, we tackle this challenge from a representation learning perspective.

Contrastive Learning Machine Translation +3

Paper
Add Code

RMBR: A Regularized Minimum Bayes Risk Reranking Framework for Machine Translation

no code implementations • 1 Mar 2022 • Yidan Zhang, Yu Wan, Dayiheng Liu, Baosong Yang, Zhenan He

Recently, Minimum Bayes Risk (MBR) decoding has been proposed to improve the quality for NMT, which seeks for a consensus translation that is closest on average to other candidates from the n-best list.

Machine Translation NMT +1

Paper
Add Code

RoBLEURT Submission for the WMT2021 Metrics Task

no code implementations • 28 Apr 2022 • Yu Wan, Dayiheng Liu, Baosong Yang, Tianchi Bi, Haibo Zhang, Boxing Chen, Weihua Luo, Derek F. Wong, Lidia S. Chao

After investigating the recent advances of trainable metrics, we conclude several aspects of vital importance to obtain a well-performed metric model by: 1) jointly leveraging the advantages of source-included model and reference-only model, 2) continuously pre-training the model with massive synthetic data pairs, and 3) fine-tuning the model with data denoising strategy.

Denoising

Paper
Add Code

UniTE: Unified Translation Evaluation

2 code implementations • ACL 2022 • Yu Wan, Dayiheng Liu, Baosong Yang, Haibo Zhang, Boxing Chen, Derek F. Wong, Lidia S. Chao

Translation quality evaluation plays a crucial role in machine translation.

Machine Translation Multi-Task Learning +1

Paper
Code

Attention Mechanism with Energy-Friendly Operations

1 code implementation • Findings (ACL) 2022 • Yu Wan, Baosong Yang, Dayiheng Liu, Rong Xiao, Derek F. Wong, Haibo Zhang, Boxing Chen, Lidia S. Chao

Attention mechanism has become the dominant module in natural language processing models.

Machine Translation Translation

Paper
Code

Tailor: A Prompt-Based Approach to Attribute-Based Controlled Text Generation

no code implementations • 28 Apr 2022 • Kexin Yang, Dayiheng Liu, Wenqiang Lei, Baosong Yang, Mingfeng Xue, Boxing Chen, Jun Xie

We experimentally find that these prompts can be simply concatenated as a whole to multi-attribute CTG without any re-training, yet raises problems of fluency decrease and position sensitivity.

Attribute Position +1

Paper
Add Code

Dangling-Aware Entity Alignment with Mixed High-Order Proximities

no code implementations • Findings (NAACL) 2022 • Juncheng Liu, Zequn Sun, Bryan Hooi, Yiwei Wang, Dayiheng Liu, Baosong Yang, Xiaokui Xiao, Muhao Chen

We study dangling-aware entity alignment in knowledge graphs (KGs), which is an underexplored but important problem.

Entity Alignment Knowledge Graphs +1

Paper
Add Code

Should We Rely on Entity Mentions for Relation Extraction? Debiasing Relation Extraction with Counterfactual Analysis

1 code implementation • NAACL 2022 • Yiwei Wang, Muhao Chen, Wenxuan Zhou, Yujun Cai, Yuxuan Liang, Dayiheng Liu, Baosong Yang, Juncheng Liu, Bryan Hooi

In this paper, we propose the CORE (Counterfactual Analysis based Relation Extraction) debiasing method that guides the RE models to focus on the main effects of textual context without losing the entity information.

counterfactual Relation +2

Paper
Code

Draft, Command, and Edit: Controllable Text Editing in E-Commerce

no code implementations • 11 Aug 2022 • Kexin Yang, Dayiheng Liu, Wenqiang Lei, Baosong Yang, Qian Qu, Jiancheng Lv

To address this challenge, we explore a new draft-command-edit manner in description generation, leading to the proposed new task-controllable text editing in E-commerce.

Attribute Data Augmentation

Paper
Add Code

Alibaba-Translate China's Submission for WMT 2022 Metrics Shared Task

1 code implementation • 18 Oct 2022 • Yu Wan, Keqin Bao, Dayiheng Liu, Baosong Yang, Derek F. Wong, Lidia S. Chao, Wenqiang Lei, Jun Xie

In this report, we present our submission to the WMT 2022 Metrics Shared Task.

Language Modelling Translation

Paper
Code

Alibaba-Translate China's Submission for WMT 2022 Quality Estimation Shared Task

1 code implementation • 18 Oct 2022 • Keqin Bao, Yu Wan, Dayiheng Liu, Baosong Yang, Wenqiang Lei, Xiangnan He, Derek F. Wong, Jun Xie

In this paper, we present our submission to the sentence-level MQM benchmark at Quality Estimation Shared Task, named UniTE (Unified Translation Evaluation).

Sentence XLM-R

Paper
Code

WR-ONE2SET: Towards Well-Calibrated Keyphrase Generation

1 code implementation • 13 Nov 2022 • Binbin Xie, Xiangpeng Wei, Baosong Yang, Huan Lin, Jun Xie, Xiaoli Wang, Min Zhang, Jinsong Su

Keyphrase generation aims to automatically generate short phrases summarizing an input document.

Keyphrase Generation

Paper
Code

Competency-Aware Neural Machine Translation: Can Machine Translation Know its Own Translation Quality?

1 code implementation • 25 Nov 2022 • Pei Zhang, Baosong Yang, Haoran Wei, Dayiheng Liu, Kai Fan, Luo Si, Jun Xie

The lack of competency awareness makes NMT untrustworthy.

Machine Translation NMT +2

Paper
Code

Towards Fine-Grained Information: Identifying the Type and Location of Translation Errors

no code implementations • 17 Feb 2023 • Keqin Bao, Yu Wan, Dayiheng Liu, Baosong Yang, Wenqiang Lei, Xiangnan He, Derek F. Wong, Jun Xie

In this paper, we propose Fine-Grained Translation Error Detection (FG-TED) task, aiming at identifying both the position and the type of translation errors on given source-hypothesis sentence pairs.

Position Sentence +1

Paper
Add Code

From Statistical Methods to Deep Learning, Automatic Keyphrase Prediction: A Survey

no code implementations • 4 May 2023 • Binbin Xie, Jia Song, Liangying Shao, Suhang Wu, Xiangpeng Wei, Baosong Yang, Huan Lin, Jun Xie, Jinsong Su

In this paper, we comprehensively summarize representative studies from the perspectives of dominant models, datasets and evaluation metrics.

Paper
Add Code

Bridging the Domain Gaps in Context Representations for k-Nearest Neighbor Neural Machine Translation

1 code implementation • 26 May 2023 • Zhiwei Cao, Baosong Yang, Huan Lin, Suhang Wu, Xiangpeng Wei, Dayiheng Liu, Jun Xie, Min Zhang, Jinsong Su

$k$-Nearest neighbor machine translation ($k$NN-MT) has attracted increasing attention due to its ability to non-parametrically adapt to new translation domains.

Domain Adaptation Machine Translation +3

Paper
Code

Meta-Reasoning: Semantics-Symbol Deconstruction for Large Language Models

1 code implementation • 30 Jun 2023 • Yiming Wang, Zhuosheng Zhang, Pei Zhang, Baosong Yang, Rui Wang

Neural-symbolic methods have demonstrated efficiency in enhancing the reasoning abilities of large language models (LLMs).

Domain Generalization In-Context Learning +1

Paper
Code

PolyLM: An Open Source Polyglot Large Language Model

1 code implementation • 12 Jul 2023 • Xiangpeng Wei, Haoran Wei, Huan Lin, TianHao Li, Pei Zhang, Xingzhang Ren, Mei Li, Yu Wan, Zhiwei Cao, Binbin Xie, Tianxiang Hu, Shangjie Li, Binyuan Hui, Bowen Yu, Dayiheng Liu, Baosong Yang, Fei Huang, Jun Xie

Large language models (LLMs) demonstrate remarkable ability to comprehend, reason, and generate following nature language instructions.

Language Modelling Large Language Model +1

6,055

Paper
Code

Challenges of Neural Machine Translation for Short Texts

no code implementations • CL (ACL) 2022 • Yu Wan, Baosong Yang, Derek Fai Wong, Lidia Sam Chao, Liang Yao, Haibo Zhang, Boxing Chen

After empirically investigating the rationale behind this, we summarize two challenges in NMT for STs associated with translation error types above, respectively: (1) the imbalanced length distribution in training set intensifies model inference calibration over STs, leading to more over-translation cases on STs; and (2) the lack of contextual information forces NMT to have higher data uncertainty on short sentences, and thus NMT model is troubled by considerable mistranslation errors.

Machine Translation NMT +2

Paper
Add Code

GCPG: A General Framework for Controllable Paraphrase Generation

no code implementations • Findings (ACL) 2022 • Kexin Yang, Dayiheng Liu, Wenqiang Lei, Baosong Yang, Haibo Zhang, Xue Zhao, Wenqing Yao, Boxing Chen

Under GCPG, we reconstruct commonly adopted lexical condition (i. e., Keywords) and syntactical conditions (i. e., Part-Of-Speech sequence, Constituent Tree, Masked Template and Sentential Exemplar) and study the combination of the two types.

Paraphrase Generation Sentence