Search Results for author: Boxing Chen

Found 71 papers, 23 papers with code

Bilingual Methods for Adaptive Training Data Selection for Machine Translation

no code implementations AMTA 2016 Boxing Chen, Roland Kuhn, George Foster, Colin Cherry, Fei Huang

In this paper, we propose a new data selection method which uses semi-supervised convolutional neural networks based on bitokens (Bi-SSCNNs) for training machine translation systems from a large bilingual corpus.

Machine Translation NMT +2

Mutual-Learning Improves End-to-End Speech Translation

no code implementations EMNLP 2021 Jiawei Zhao, Wei Luo, Boxing Chen, Andrew Gilman

In this paper, we propose an alternative: a trainable mutual-learning scenario, in which the MT and ST models are trained collaboratively and treated as peers, rather than as teacher and student.

Knowledge Distillation Machine Translation +1
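
As a rough illustration of the mutual-learning objective described in this entry, here is a minimal PyTorch-style sketch in which each peer fits the reference labels while being pulled toward the other peer's output distribution; the function name, the kl_weight value, and the tensor shapes are illustrative assumptions, not the paper's implementation.

```python
import torch
import torch.nn.functional as F

def mutual_learning_losses(logits_st, logits_mt, targets, kl_weight=0.5):
    """Return each peer's loss: cross-entropy on the references plus a KL
    term toward the other peer's (detached) distribution.
    logits_*: [batch, vocab]; targets: [batch] class indices."""
    ce_st = F.cross_entropy(logits_st, targets)
    ce_mt = F.cross_entropy(logits_mt, targets)
    # Each KL term treats the peer's softmax output as a soft target.
    kl_st = F.kl_div(F.log_softmax(logits_st, dim=-1),
                     F.softmax(logits_mt.detach(), dim=-1),
                     reduction="batchmean")
    kl_mt = F.kl_div(F.log_softmax(logits_mt, dim=-1),
                     F.softmax(logits_st.detach(), dim=-1),
                     reduction="batchmean")
    return ce_st + kl_weight * kl_st, ce_mt + kl_weight * kl_mt
```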

GCPG: A General Framework for Controllable Paraphrase Generation

no code implementations Findings (ACL) 2022 Kexin Yang, Dayiheng Liu, Wenqiang Lei, Baosong Yang, Haibo Zhang, Xue Zhao, Wenqing Yao, Boxing Chen

Under GCPG, we reconstruct the commonly adopted lexical condition (i.e., Keywords) and syntactical conditions (i.e., Part-Of-Speech sequence, Constituent Tree, Masked Template and Sentential Exemplar), and study combinations of the two types.

Paraphrase Generation Sentence

Challenges of Neural Machine Translation for Short Texts

no code implementations CL (ACL) 2022 Yu Wan, Baosong Yang, Derek Fai Wong, Lidia Sam Chao, Liang Yao, Haibo Zhang, Boxing Chen

After empirically investigating the rationale behind this, we summarize two challenges in NMT for STs, each associated with one of the translation error types above: (1) the imbalanced length distribution in the training set intensifies model inference calibration over STs, leading to more over-translation cases on STs; and (2) the lack of contextual information forces NMT to have higher data uncertainty on short sentences, leaving the NMT model troubled by considerable mistranslation errors.

Machine Translation NMT +2

Alibaba Speech Translation Systems for IWSLT 2018

no code implementations IWSLT (EMNLP) 2018 Nguyen Bach, Hongjie Chen, Kai Fan, Cheung-Chi Leung, Bo Li, Chongjia Ni, Rong Tong, Pei Zhang, Boxing Chen, Bin Ma, Fei Huang

This work describes the En→De Alibaba speech translation system developed for the evaluation campaign of the International Workshop on Spoken Language Translation (IWSLT) 2018.

Sentence Translation

RoBLEURT Submission for WMT2021 Metrics Task

no code implementations WMT (EMNLP) 2021 Yu Wan, Dayiheng Liu, Baosong Yang, Tianchi Bi, Haibo Zhang, Boxing Chen, Weihua Luo, Derek F. Wong, Lidia S. Chao

After investigating recent advances in trainable metrics, we identify several aspects as vital to obtaining a well-performing metric model: 1) jointly leveraging the advantages of a source-included model and a reference-only model, 2) continually pre-training the model on massive synthetic data pairs, and 3) fine-tuning the model with a data denoising strategy.

Denoising

On the importance of Data Scale in Pretraining Arabic Language Models

1 code implementation 15 Jan 2024 Abbas Ghaddar, Philippe Langlais, Mehdi Rezagholizadeh, Boxing Chen

Pretraining monolingual language models has been proven vital for performance on Arabic Natural Language Processing (NLP) tasks.

Language Modelling

NoMIRACL: Knowing When You Don't Know for Robust Multilingual Retrieval-Augmented Generation

1 code implementation 18 Dec 2023 Nandan Thakur, Luiz Bonifacio, Xinyu Zhang, Odunayo Ogundepo, Ehsan Kamalloo, David Alfonso-Hermelo, Xiaoguang Li, Qun Liu, Boxing Chen, Mehdi Rezagholizadeh, Jimmy Lin

We measure LLM robustness using two metrics: (i) hallucination rate, measuring the model's tendency to hallucinate an answer when no answer is present in the passages of the non-relevant subset, and (ii) error rate, measuring the model's failure to recognize relevant passages in the relevant subset.

Hallucination Language Modelling +2
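
As a hedged sketch of the two robustness metrics in the entry above, the following Python snippet computes them from lists of model answers, assuming abstentions are expressed as a literal "I don't know" string; the label convention and function names are assumptions, not NoMIRACL's scoring code.

```python
def hallucination_rate(non_relevant_answers):
    """Share of non-relevant-subset queries where the model answered anyway
    instead of abstaining, i.e., it hallucinated an answer."""
    hallucinated = sum(1 for a in non_relevant_answers
                       if a.strip().lower() != "i don't know")
    return hallucinated / max(len(non_relevant_answers), 1)

def error_rate(relevant_answers):
    """Share of relevant-subset queries where the model abstained even
    though a relevant passage was present."""
    errors = sum(1 for a in relevant_answers
                 if a.strip().lower() == "i don't know")
    return errors / max(len(relevant_answers), 1)
```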

Mitigating Outlier Activations in Low-Precision Fine-Tuning of Language Models

no code implementations 14 Dec 2023 Alireza Ghaffari, Justin Yu, Mahsa Ghazvini Nejad, Masoud Asgharian, Boxing Chen, Vahid Partovi Nia

The benefit of representing outlier values as integers is that it enables operator tiling, which avoids full 16-bit integer matrix multiplication and thereby addresses this problem effectively.

Sorted LLaMA: Unlocking the Potential of Intermediate Layers of Large Language Models for Dynamic Inference

no code implementations 16 Sep 2023 Parsa Kavehzadeh, Mojtaba Valipour, Marzieh Tahaei, Ali Ghodsi, Boxing Chen, Mehdi Rezagholizadeh

We extend SortedNet to generative NLP tasks, making large language models dynamic without any pre-training, simply by replacing Standard Fine-Tuning (SFT) with Sorted Fine-Tuning (SoFT).

Instruction Following Question Answering +1
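
A minimal sketch of the sorted fine-tuning idea in the entry above, assuming the language-modeling loss is also computed from intermediate layers through a shared output head so that shallower sub-models remain usable on their own; the exit layers and head sharing are illustrative assumptions, not the authors' training code.

```python
import torch.nn.functional as F

def sorted_ft_loss(hidden_states, lm_head, targets, exit_layers=(8, 16, 24, 32)):
    """hidden_states: per-layer activations, each [batch, seq, dim];
    lm_head: shared projection to vocabulary logits; targets: [batch, seq]."""
    total = 0.0
    for layer in exit_layers:
        logits = lm_head(hidden_states[layer - 1])  # shared head at each exit
        total = total + F.cross_entropy(logits.flatten(0, 1), targets.flatten())
    return total / len(exit_layers)  # average over sub-model losses
```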

SortedNet, a Place for Every Network and Every Network in its Place: Towards a Generalized Solution for Training Many-in-One Neural Networks

no code implementations 1 Sep 2023 Mojtaba Valipour, Mehdi Rezagholizadeh, Hossein Rajabzadeh, Parsa Kavehzadeh, Marzieh Tahaei, Boxing Chen, Ali Ghodsi

Deep neural networks (DNNs) must cater to a variety of users with different performance needs and budgets, leading to the costly practice of training, storing, and maintaining numerous specific models.

Image Classification Model Selection

One Adapter for All Programming Languages? Adapter Tuning for Code Search and Summarization

1 code implementation 28 Mar 2023 Deze Wang, Boxing Chen, Shanshan Li, Wei Luo, Shaoliang Peng, Wei Dong, Xiangke Liao

To alleviate the potential catastrophic forgetting issue in multilingual models, we fix all pre-trained model parameters, insert a parameter-efficient adapter structure, and fine-tune only the adapter.

Code Search Code Summarization
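
A minimal PyTorch sketch of the adapter-tuning recipe in the entry above: freeze the pre-trained weights and train only a small bottleneck module. The dimensions and the name-based freezing rule are illustrative assumptions, not the paper's code.

```python
import torch
import torch.nn as nn

class Adapter(nn.Module):
    """Bottleneck adapter: down-project, nonlinearity, up-project, residual."""
    def __init__(self, hidden_dim=768, bottleneck_dim=64):
        super().__init__()
        self.down = nn.Linear(hidden_dim, bottleneck_dim)
        self.up = nn.Linear(bottleneck_dim, hidden_dim)

    def forward(self, x):
        return x + self.up(torch.relu(self.down(x)))

def freeze_except_adapters(model):
    """Fix all pre-trained parameters; leave only adapter weights trainable."""
    for name, param in model.named_parameters():
        param.requires_grad = "adapter" in name
```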

Translate the Beauty in Songs: Jointly Learning to Align Melody and Translate Lyrics

no code implementations 28 Mar 2023 Chengxi Li, Kai Fan, Jiajun Bu, Boxing Chen, Zhongqiang Huang, Zhi Yu

Song translation requires both translating the lyrics and aligning the music notes so that the resulting verse can be sung to the accompanying melody; this challenging problem has attracted interest in different aspects of the translation process.

Translation

Mathematical Challenges in Deep Learning

no code implementations 24 Mar 2023 Vahid Partovi Nia, Guojun Zhang, Ivan Kobyzev, Michael R. Metel, Xinlin Li, Ke Sun, Sobhan Hemati, Masoud Asgharian, Linglong Kong, Wulong Liu, Boxing Chen

Deep models have dominated the artificial intelligence (AI) industry since the ImageNet challenge in 2012.

Adapting Offline Speech Translation Models for Streaming with Future-Aware Distillation and Inference

1 code implementation 14 Mar 2023 Biao Fu, Minpeng Liao, Kai Fan, Zhongqiang Huang, Boxing Chen, Yidong Chen, Xiaodong Shi

A popular approach to streaming speech translation is to employ a single offline model with a wait-k policy to support different latency requirements, which is simpler than training multiple online models with different latency constraints.

FAD Translation
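
A minimal sketch of a wait-k schedule as described in the entry above: the decoder emits its t-th target token only once t + k - 1 source units have arrived. The model.next_token interface is a hypothetical stand-in, not the paper's system.

```python
def wait_k_decode(model, source_stream, k=3, max_len=100):
    source, target = [], []
    for chunk in source_stream:  # source arrives incrementally
        source.append(chunk)
        # write a target token whenever the source is at least k units ahead
        while len(target) < max_len and len(source) >= len(target) + k:
            token = model.next_token(source, target)  # hypothetical API
            if token == "<eos>":
                return target
            target.append(token)
    while len(target) < max_len:  # source exhausted: finish greedily
        token = model.next_token(source, target)
        if token == "<eos>":
            break
        target.append(token)
    return target
```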

RobustDistiller: Compressing Universal Speech Representations for Enhanced Environment Robustness

no code implementations 18 Feb 2023 Heitor R. Guimarães, Arthur Pimentel, Anderson R. Avila, Mehdi Rezagholizadeh, Boxing Chen, Tiago H. Falk

The proposed layer-wise distillation recipe is evaluated on top of three well-established universal representations, as well as with three downstream tasks.

Knowledge Distillation Multi-Task Learning

Discrete Cross-Modal Alignment Enables Zero-Shot Speech Translation

1 code implementation 18 Oct 2022 Chen Wang, Yuchen Liu, Boxing Chen, Jiajun Zhang, Wei Luo, Zhongqiang Huang, Chengqing Zong

Existing zero-shot methods fail to align the two modalities of speech and text into a shared semantic space, resulting in much worse performance compared to the supervised ST methods.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +4

Non-Parametric Domain Adaptation for End-to-End Speech Translation

1 code implementation 23 May 2022 Yichao Du, Weizhi Wang, Zhirui Zhang, Boxing Chen, Tong Xu, Jun Xie, Enhong Chen

End-to-End Speech Translation (E2E-ST) has received increasing attention due to its potential for less error propagation, lower latency, and fewer parameters.

Domain Adaptation Translation

Tailor: A Prompt-Based Approach to Attribute-Based Controlled Text Generation

no code implementations 28 Apr 2022 Kexin Yang, Dayiheng Liu, Wenqiang Lei, Baosong Yang, Mingfeng Xue, Boxing Chen, Jun Xie

We experimentally find that these prompts can simply be concatenated as a whole for multi-attribute CTG without any re-training, yet this raises problems of decreased fluency and position sensitivity.

Attribute Position +1

Efficient Cluster-Based k-Nearest-Neighbor Machine Translation

2 code implementations ACL 2022 Dexin Wang, Kai Fan, Boxing Chen, Deyi Xiong

k-Nearest-Neighbor Machine Translation (kNN-MT) has been recently proposed as a non-parametric solution for domain adaptation in neural machine translation (NMT).

Contrastive Learning Domain Adaptation +4
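
A minimal NumPy sketch of the kNN-MT interpolation that the entry above builds on: retrieve the nearest datastore entries by decoder hidden state and mix the induced distribution with the NMT softmax. The temperature, k, and interpolation weight are illustrative assumptions, and this shows the vanilla scheme, not the paper's cluster-based speedup itself.

```python
import numpy as np

def knn_mt_probs(model_probs, hidden, keys, values, vocab_size,
                 k=8, temp=10.0, lam=0.5):
    """keys: [N, d] stored decoder states; values: [N] target token ids."""
    dists = np.sum((keys - hidden) ** 2, axis=1)  # squared L2 distances
    top = np.argsort(dists)[:k]
    weights = np.exp(-dists[top] / temp)
    weights /= weights.sum()
    knn_probs = np.zeros(vocab_size)
    for w, idx in zip(weights, top):
        knn_probs[values[idx]] += w  # aggregate neighbor mass per token
    return lam * knn_probs + (1.0 - lam) * model_probs
```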

DePA: Improving Non-autoregressive Machine Translation with Dependency-Aware Decoder

1 code implementation 30 Mar 2022 Jiaao Zhan, Qian Chen, Boxing Chen, Wen Wang, Yu Bai, Yang Gao

We propose a novel and general Dependency-Aware Decoder (DePA) to enhance target dependency modeling in the decoder of fully NAT models from two perspectives: decoder self-attention and decoder input.

Machine Translation Translation

QEMind: Alibaba's Submission to the WMT21 Quality Estimation Shared Task

no code implementations 30 Dec 2021 Jiayi Wang, Ke Wang, Boxing Chen, Yu Zhao, Weihua Luo, Yuqi Zhang

Quality Estimation, as a crucial step of quality control for machine translation, has been explored for years.

Machine Translation Sentence +1

Regularizing End-to-End Speech Translation with Triangular Decomposition Agreement

1 code implementation 21 Dec 2021 Yichao Du, Zhirui Zhang, Weizhi Wang, Boxing Chen, Jun Xie, Tong Xu

In this paper, we attempt to model the joint probability of transcription and translation based on the speech input to directly leverage such triplet data.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +4

Unifying Cross-lingual Summarization and Machine Translation with Compression Rate

1 code implementation 15 Oct 2021 Yu Bai, Heyan Huang, Kai Fan, Yang Gao, Yiming Zhu, Jiaao Zhan, Zewen Chi, Boxing Chen

By introducing the compression rate, i.e., the information ratio between the source and the target text, we regard the MT task as a special CLS task with a compression rate of 100%.

Data Augmentation Machine Translation +1
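
A minimal sketch of the compression-rate notion in the entry above, assuming it is computed as a token-length ratio (the paper may define the information ratio differently).

```python
def compression_rate(source_tokens, target_tokens):
    """Ratio of target length to source length: plain MT approximates
    1.0 (100%), while summarizing 12 tokens into 3 gives 0.25."""
    return len(target_tokens) / max(len(source_tokens), 1)
```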

Non-Parametric Unsupervised Domain Adaptation for Neural Machine Translation

1 code implementation Findings (EMNLP) 2021 Xin Zheng, Zhirui Zhang, ShuJian Huang, Boxing Chen, Jun Xie, Weihua Luo, Jiajun Chen

Recently, $k$NN-MT has shown the promising capability of directly incorporating the pre-trained neural machine translation (NMT) model with domain-specific token-level $k$-nearest-neighbor ($k$NN) retrieval to achieve domain adaptation without retraining.

Machine Translation NMT +3

Rethinking Zero-shot Neural Machine Translation: From a Perspective of Latent Variables

1 code implementation Findings (EMNLP) 2021 Weizhi Wang, Zhirui Zhang, Yichao Du, Boxing Chen, Jun Xie, Weihua Luo

However, due to the maximum likelihood training objective, it usually suffers from capturing spurious correlations between the output language and language-invariant semantics, leading to poor transfer performance on zero-shot translation.

Denoising Machine Translation +2

Task-Oriented Dialogue System as Natural Language Generation

1 code implementation 31 Aug 2021 Weizhi Wang, Zhirui Zhang, Junliang Guo, Yinpei Dai, Boxing Chen, Weihua Luo

In this paper, we propose to formulate the task-oriented dialogue system as a purely natural language generation task, so as to fully leverage large-scale pre-trained models like GPT-2 and simplify the complicated delexicalization preprocessing.

Text Generation Transfer Learning

Context-Interactive Pre-Training for Document Machine Translation

no code implementations NAACL 2021 Pengcheng Yang, Pei Zhang, Boxing Chen, Jun Xie, Weihua Luo

Document machine translation aims to translate the source sentence into the target language in the presence of additional contextual information.

Machine Translation Sentence +1

Continual Learning for Neural Machine Translation

no code implementations NAACL 2021 Yue Cao, Hao-Ran Wei, Boxing Chen, Xiaojun Wan

In practical applications, NMT models are usually trained on a general domain corpus and then fine-tuned by continuing training on the in-domain corpus.

Continual Learning Knowledge Distillation +3

G-Transformer for Document-level Machine Translation

1 code implementation ACL 2021 Guangsheng Bao, Yue Zhang, Zhiyang Teng, Boxing Chen, Weihua Luo

However, studies show that when the translation unit is further enlarged to a whole document, supervised training of the Transformer can fail.

Document Level Machine Translation Inductive Bias +3

Adaptive Nearest Neighbor Machine Translation

3 code implementations ACL 2021 Xin Zheng, Zhirui Zhang, Junliang Guo, ShuJian Huang, Boxing Chen, Weihua Luo, Jiajun Chen

On four benchmark machine translation datasets, we demonstrate that the proposed method is able to effectively filter out the noises in retrieval results and significantly outperforms the vanilla kNN-MT model.

Machine Translation NMT +2

Towards Variable-Length Textual Adversarial Attacks

no code implementations 16 Apr 2021 Junliang Guo, Zhirui Zhang, Linlin Zhang, Linli Xu, Boxing Chen, Enhong Chen, Weihua Luo

In this way, our approach is able to more comprehensively find adversarial examples around the decision boundary and effectively conduct adversarial attacks.

Machine Translation Translation

Exploiting Neural Query Translation into Cross Lingual Information Retrieval

no code implementations 26 Oct 2020 Liang Yao, Baosong Yang, Haibo Zhang, Weihua Luo, Boxing Chen

Playing a crucial role in cross-lingual information retrieval (CLIR), query translation faces three main challenges: 1) the adequacy of translation; 2) the lack of in-domain parallel training data; and 3) the requirement of low latency.

Cross-Lingual Information Retrieval Data Augmentation +5

Incorporating BERT into Parallel Sequence Decoding with Adapters

1 code implementation NeurIPS 2020 Junliang Guo, Zhirui Zhang, Linli Xu, Hao-Ran Wei, Boxing Chen, Enhong Chen

Our framework is based on Mask-Predict, a parallel sequence decoding algorithm that suits the bidirectional and conditionally independent nature of BERT, and it can easily be adapted to traditional autoregressive decoding.

Machine Translation Natural Language Understanding +2
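
A minimal sketch of Mask-Predict style iterative decoding as referenced in the entry above: predict all target positions in parallel, then re-mask the least confident predictions and repeat. The model.predict interface is a hypothetical stand-in, not the paper's implementation.

```python
def mask_predict(model, source, target_len, iterations=4, mask_id=0):
    tokens = [mask_id] * target_len  # start from a fully masked target
    scores = [0.0] * target_len
    for it in range(iterations):
        preds, confidences = model.predict(source, tokens)  # hypothetical API
        for i in range(target_len):
            if tokens[i] == mask_id:  # fill only the masked slots
                tokens[i], scores[i] = preds[i], confidences[i]
        n_mask = int(target_len * (1 - (it + 1) / iterations))
        if n_mask == 0:
            break
        worst = sorted(range(target_len), key=lambda i: scores[i])[:n_mask]
        for i in worst:
            tokens[i] = mask_id  # re-mask the lowest-confidence positions
    return tokens
```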

Self-Paced Learning for Neural Machine Translation

1 code implementation EMNLP 2020 Yu Wan, Baosong Yang, Derek F. Wong, Yikai Zhou, Lidia S. Chao, Haibo Zhang, Boxing Chen

Recent studies have proven that the training of neural machine translation (NMT) can be facilitated by mimicking the learning process of humans.

Machine Translation NMT +2

Iterative Domain-Repaired Back-Translation

no code implementations EMNLP 2020 Hao-Ran Wei, Zhirui Zhang, Boxing Chen, Weihua Luo

In this paper, we focus on the domain-specific translation with low resources, where in-domain parallel corpora are scarce or nonexistent.

Domain Adaptation NMT +1

Long-Short Term Masking Transformer: A Simple but Effective Baseline for Document-level Neural Machine Translation

no code implementations EMNLP 2020 Pei Zhang, Boxing Chen, Niyu Ge, Kai Fan

In this paper, we extensively study the pros and cons of the standard Transformer in document-level translation, and find that its auto-regressive property simultaneously brings the advantage of consistency and the disadvantage of error accumulation.

Machine Translation NMT +1

Bilingual Dictionary Based Neural Machine Translation without Using Parallel Sentences

1 code implementation ACL 2020 Xiangyu Duan, Baijun Ji, Hao Jia, Min Tan, Min Zhang, Boxing Chen, Weihua Luo, Yue Zhang

In this paper, we propose a new machine translation (MT) task that relies on no parallel sentences but can refer to a ground-truth bilingual dictionary.

Machine Translation Translation +1

Visual Agreement Regularized Training for Multi-Modal Machine Translation

no code implementations 27 Dec 2019 Pengcheng Yang, Boxing Chen, Pei Zhang, Xu Sun

Further analysis demonstrates that the proposed regularized training can effectively improve the agreement of attention on the image, leading to better use of visual information.

Machine Translation Sentence +1

Cross-lingual Pre-training Based Transfer for Zero-shot Neural Machine Translation

no code implementations 3 Dec 2019 Baijun Ji, Zhirui Zhang, Xiangyu Duan, Min Zhang, Boxing Chen, Weihua Luo

However, existing transfer methods involving a common target language are far from success in the extreme scenario of zero-shot translation, due to the language space mismatch problem between transferor (the parent model) and transferee (the child model) on the source side.

Machine Translation NMT +2

Neural Zero-Inflated Quality Estimation Model For Automatic Speech Recognition System

no code implementations 3 Oct 2019 Kai Fan, Jiayi Wang, Bo Li, Shiliang Zhang, Boxing Chen, Niyu Ge, Zhijie Yan

The performance of automatic speech recognition (ASR) systems is usually evaluated by the word error rate (WER) metric when manually transcribed data are provided; such transcriptions are, however, expensive to obtain in real-world scenarios.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +4
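
For reference, a minimal sketch of the WER metric mentioned in the entry above: word-level Levenshtein distance divided by the reference length.

```python
def wer(reference, hypothesis):
    ref, hyp = reference.split(), hypothesis.split()
    # dynamic-programming edit distance over words
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = d[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            d[i][j] = min(sub, d[i - 1][j] + 1, d[i][j - 1] + 1)
    return d[len(ref)][len(hyp)] / max(len(ref), 1)
```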

Zero-Shot Cross-Lingual Abstractive Sentence Summarization through Teaching Generation and Attention

1 code implementation ACL 2019 Xiangyu Duan, Mingming Yin, Min Zhang, Boxing Chen, Weihua Luo

But there is no cross-lingual parallel corpus, in which the source sentence language differs from the summary language, with which to directly train a cross-lingual ASSUM system.

Sentence Sentence Summarization +1

Lattice Transformer for Speech Translation

no code implementations ACL 2019 Pei Zhang, Boxing Chen, Niyu Ge, Kai Fan

Recent advances in sequence modeling have highlighted the strengths of the transformer architecture, especially in achieving state-of-the-art machine translation results.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +3

Alibaba Submission for WMT18 Quality Estimation Task

no code implementations WS 2018 Jiayi Wang, Kai Fan, Bo Li, Fengming Zhou, Boxing Chen, Yangbin Shi, Luo Si

The goal of WMT 2018 Shared Task on Translation Quality Estimation is to investigate automatic methods for estimating the quality of machine translation results without reference translations.

Automatic Post-Editing Language Modelling +2

Alibaba Submission to the WMT18 Parallel Corpus Filtering Task

no code implementations WS 2018 Jun Lu, Xiaoyu Lv, Yangbin Shi, Boxing Chen

This paper describes the Alibaba Machine Translation Group submissions to the WMT 2018 Shared Task on Parallel Corpus Filtering.

Machine Translation Sentence +2

"Bilingual Expert" Can Find Translation Errors

1 code implementation 25 Jul 2018 Kai Fan, Jiayi Wang, Bo Li, Fengming Zhou, Boxing Chen, Luo Si

Recent advances in statistical machine translation via the adoption of neural sequence-to-sequence models empower end-to-end systems to achieve state-of-the-art results on many WMT benchmarks.

Language Modelling Machine Translation +1

Cost Weighting for Neural Machine Translation Domain Adaptation

no code implementations WS 2017 Boxing Chen, Colin Cherry, George Foster, Samuel Larkin

We compare cost weighting to two traditional domain adaptation techniques developed for statistical machine translation: data selection and sub-corpus weighting.

Domain Adaptation Machine Translation +1
