Search Results for author: Kai Fan

Found 35 papers, 5 papers with code

Alibaba’s Submission for the WMT 2020 APE Shared Task: Improving Automatic Post-Editing with Pre-trained Conditional Cross-Lingual BERT

no code implementations WMT (EMNLP) 2020 Jiayi Wang, Ke Wang, Kai Fan, Yuqi Zhang, Jun Lu, Xin Ge, Yangbin Shi, Yu Zhao

We also apply an imitation learning strategy to augment a reasonable amount of pseudo APE training data, potentially preventing the model from overfitting on the limited real training data and boosting performance on held-out data.

Automatic Post-Editing Benchmarking +4

Alibaba Speech Translation Systems for IWSLT 2018

no code implementations IWSLT (EMNLP) 2018 Nguyen Bach, Hongjie Chen, Kai Fan, Cheung-Chi Leung, Bo Li, Chongjia Ni, Rong Tong, Pei Zhang, Boxing Chen, Bin Ma, Fei Huang

This work describes the En→De Alibaba speech translation system developed for the evaluation campaign of the International Workshop on Spoken Language Translation (IWSLT) 2018.


Structural Supervision for Word Alignment and Machine Translation

no code implementations Findings (ACL) 2022 Lei LI, Kai Fan, Hongjia Li, Chun Yuan

Syntactic structure has long been argued to be potentially useful for enforcing accurate word alignment and improving generalization performance of machine translation.

Machine Translation Multi-Task Learning +2

Neural Machine Translation with Dynamic Graph Convolutional Decoder

no code implementations28 May 2023 Lei LI, Kai Fan, Lingyu Yang, Hongjia Li, Chun Yuan

Existing wisdom demonstrates the significance of syntactic knowledge for the improvement of neural machine translation models.

Machine Translation Translation

Translate the Beauty in Songs: Jointly Learning to Align Melody and Translate Lyrics

no code implementations28 Mar 2023 Chengxi Li, Kai Fan, Jiajun Bu, Boxing Chen, Zhongqiang Huang, Zhi Yu

Song translation requires both translation of lyrics and alignment of music notes so that the resulting verse can be sung to the accompanying melody; this is a challenging problem that has attracted interest in different aspects of the translation process.


Adapting Offline Speech Translation Models for Streaming with Future-Aware Distillation and Inference

no code implementations14 Mar 2023 Biao Fu, Kai Fan, Minpeng Liao, Zhongqiang Huang, Boxing Chen, Yidong Chen, Xiaodong Shi

A popular approach to streaming speech translation is to employ a single offline model with a wait-$k$ policy to support different latency requirements, which is simpler than training multiple online models with different latency constraints.

FAD Translation
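As an illustrative sketch (not the paper's implementation), the wait-$k$ policy mentioned above can be expressed as a fixed read/write schedule: read $k$ source tokens first, then alternate between emitting one target token and reading one more source token. The function below is hypothetical and only demonstrates the schedule.

```python
def wait_k_schedule(k, src_len, tgt_len):
    """Illustrative wait-k policy: read k source tokens up front, then
    alternate between writing one target token and reading one source
    token until the full hypothesis is produced."""
    actions = []
    read, written = 0, 0
    while written < tgt_len:
        # The decoder may look at most (k + written) source tokens.
        if read < min(k + written, src_len):
            actions.append("READ")
            read += 1
        else:
            actions.append("WRITE")
            written += 1
    return actions
```

For example, with `k=2`, 4 source tokens and 3 target tokens, the schedule interleaves reads and writes after the initial 2-token wait.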

A scalable pipeline for COVID-19: the case study of Germany, Czechia and Poland

no code implementations27 Aug 2022 Wildan Abdussalam, Adam Mertel, Kai Fan, Lennart Schüler, Weronika Schlechte-Wełnicz, Justin M. Calabrese

Here we report the design of a scalable pipeline, named where2test, that serves as a data synchronization layer supporting inter-country top-down spatiotemporal observations and forecasting models of COVID-19 for Germany, Czechia and Poland.

Decision Making

Efficient Cluster-Based k-Nearest-Neighbor Machine Translation

2 code implementations ACL 2022 Dexin Wang, Kai Fan, Boxing Chen, Deyi Xiong

k-Nearest-Neighbor Machine Translation (kNN-MT) has been recently proposed as a non-parametric solution for domain adaptation in neural machine translation (NMT).

Contrastive Learning Domain Adaptation +4
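A minimal sketch of the general kNN-MT idea the abstract refers to (the interpolation scheme and parameter names below are illustrative assumptions, not this paper's clustering method): retrieved-neighbor distances are converted into a distribution over the vocabulary and interpolated with the NMT model's next-token distribution.

```python
import numpy as np

def knn_mt_interpolate(nmt_probs, distances, neighbor_ids, vocab_size,
                       temperature=10.0, lam=0.5):
    """Sketch of kNN-MT next-token prediction: softmax over negative
    retrieval distances forms a non-parametric distribution, which is
    linearly interpolated with the parametric NMT distribution."""
    weights = np.exp(-np.asarray(distances, dtype=float) / temperature)
    knn_probs = np.zeros(vocab_size)
    for w, tok in zip(weights, neighbor_ids):
        knn_probs[tok] += w          # aggregate neighbors sharing a token
    knn_probs /= knn_probs.sum()
    return lam * knn_probs + (1 - lam) * np.asarray(nmt_probs, dtype=float)
```

The interpolation weight `lam` trades off domain-specific retrieved evidence against the general-domain NMT model.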

StrokeNet: Stroke Assisted and Hierarchical Graph Reasoning Networks

no code implementations23 Nov 2021 Lei LI, Kai Fan, Chun Yuan

Scene text detection is still a challenging task, as there may be extremely small or low-resolution strokes, and close or arbitrary-shaped texts.

Node Classification Relational Reasoning +2

Unifying Cross-lingual Summarization and Machine Translation with Compression Rate

1 code implementation15 Oct 2021 Yu Bai, Heyan Huang, Kai Fan, Yang Gao, Yiming Zhu, Jiaao Zhan, Zewen Chi, Boxing Chen

By introducing the compression rate, the information ratio between the source and the target text, we regard the MT task as a special CLS task with a compression rate of 100%.

Data Augmentation Machine Translation +1
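As a rough illustration of the compression-rate idea above (the paper defines it as an information ratio; a token-length ratio is used here only as a simple proxy): translation keeps roughly all of the source content, while summarization keeps a fraction of it.

```python
def compression_rate(src_tokens, tgt_tokens):
    """Illustrative proxy for the compression rate: the target-to-source
    length ratio, in percent. MT corresponds to a rate near 100%;
    cross-lingual summarization corresponds to a much lower rate."""
    return 100.0 * len(tgt_tokens) / len(src_tokens)
```

For instance, a 5-token source summarized into a 2-token target has a compression rate of 40%.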

Long-Short Term Masking Transformer: A Simple but Effective Baseline for Document-level Neural Machine Translation

no code implementations EMNLP 2020 Pei Zhang, Boxing Chen, Niyu Ge, Kai Fan

In this paper, we extensively study the pros and cons of the standard transformer in document-level translation, and find that the auto-regressive property can simultaneously bring both the advantage of consistency and the disadvantage of error accumulation.

Machine Translation NMT +1

A Practical Framework for Relation Extraction with Noisy Labels Based on Doubly Transitional Loss

no code implementations28 Apr 2020 Shanchan Wu, Kai Fan

One transition is basically parameterized by a non-linear transformation between hidden layers that implicitly represents the conversion between the true and noisy labels, and it can be readily optimized together with other model parameters.

Relation Extraction

Neural Zero-Inflated Quality Estimation Model For Automatic Speech Recognition System

no code implementations3 Oct 2019 Kai Fan, Jiayi Wang, Bo Li, Shiliang Zhang, Boxing Chen, Niyu Ge, Zhijie Yan

The performance of automatic speech recognition (ASR) systems is usually evaluated by the word error rate (WER) metric when manually transcribed data are provided, which are, however, expensive to obtain in real scenarios.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +4
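For reference, the WER metric mentioned in the abstract is the word-level Levenshtein distance between hypothesis and reference, normalized by reference length; a minimal sketch (this is the standard definition, not this paper's estimation model):

```python
def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / reference length,
    computed via dynamic-programming edit distance over words."""
    r, h = reference.split(), hypothesis.split()
    d = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
    for i in range(len(r) + 1):
        d[i][0] = i                       # all deletions
    for j in range(len(h) + 1):
        d[0][j] = j                       # all insertions
    for i in range(1, len(r) + 1):
        for j in range(1, len(h) + 1):
            cost = 0 if r[i - 1] == h[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + cost) # substitution/match
    return d[len(r)][len(h)] / len(r)
```

The paper's quality-estimation model predicts this quantity without access to the reference transcript.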

Lattice Transformer for Speech Translation

no code implementations ACL 2019 Pei Zhang, Boxing Chen, Niyu Ge, Kai Fan

Recent advances in sequence modeling have highlighted the strengths of the transformer architecture, especially in achieving state-of-the-art machine translation results.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +3

Unsupervised Multi-modal Neural Machine Translation

no code implementations CVPR 2019 Yuanhang Su, Kai Fan, Nguyen Bach, C. -C. Jay Kuo, Fei Huang

Unsupervised neural machine translation (UNMT) has recently achieved remarkable results with only large monolingual corpora in each language.

Machine Translation Translation

Improving Distantly Supervised Relation Extraction with Neural Noise Converter and Conditional Optimal Selector

no code implementations14 Nov 2018 Shanchan Wu, Kai Fan, Qiong Zhang

Distantly supervised relation extraction has been successfully applied to large corpora with thousands of relations.

Relation Extraction

Alibaba Submission for WMT18 Quality Estimation Task

no code implementations WS 2018 Jiayi Wang, Kai Fan, Bo Li, Fengming Zhou, Boxing Chen, Yangbin Shi, Luo Si

The goal of WMT 2018 Shared Task on Translation Quality Estimation is to investigate automatic methods for estimating the quality of machine translation results without reference translations.

Automatic Post-Editing Language Modelling +1

"Bilingual Expert" Can Find Translation Errors

1 code implementation25 Jul 2018 Kai Fan, Jiayi Wang, Bo Li, Fengming Zhou, Boxing Chen, Luo Si

Recent advances in statistical machine translation via the adoption of neural sequence-to-sequence models empower the end-to-end system to achieve state-of-the-art results on many WMT benchmarks.

Language Modelling Machine Translation +1

An inner-loop free solution to inverse problems using deep neural networks

no code implementations NeurIPS 2017 Qi Wei, Kai Fan, Lawrence Carin, Katherine A. Heller

For matrix inversion in the second sub-problem, we learn a convolutional neural network to approximate the matrix inversion, i.e., the inverse mapping is learned by feeding the input through the learned forward network.


Unifying the Stochastic Spectral Descent for Restricted Boltzmann Machines with Bernoulli or Gaussian Inputs

no code implementations28 Mar 2017 Kai Fan

Stochastic gradient descent based algorithms are typically used as the general optimization tools for most deep learning models.

Boosting Variational Inference

no code implementations17 Nov 2016 Fangjian Guo, Xiangyu Wang, Kai Fan, Tamara Broderick, David B. Dunson

Variational inference (VI) provides fast approximations of a Bayesian posterior in part because it formulates posterior approximation as an optimization problem: to find the closest distribution to the exact posterior over some family of distributions.

Variational Inference
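The optimization view in the abstract can be made concrete with the simplest possible case (an illustrative sketch, not the paper's boosting construction): when both the variational family and the posterior are 1-D Gaussians, the KL divergence that VI minimizes has a closed form.

```python
import math

def kl_gaussians(mu_q, sigma_q, mu_p, sigma_p):
    """Closed-form KL(q || p) between 1-D Gaussians q = N(mu_q, sigma_q^2)
    and p = N(mu_p, sigma_p^2) -- the objective VI minimizes over the
    variational family (here: all 1-D Gaussians)."""
    return (math.log(sigma_p / sigma_q)
            + (sigma_q ** 2 + (mu_q - mu_p) ** 2) / (2 * sigma_p ** 2)
            - 0.5)
```

The divergence is zero exactly when q matches p, and grows as the approximation moves away; boosting VI enriches the family beyond a single such component.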

Towards Unifying Hamiltonian Monte Carlo and Slice Sampling

no code implementations NeurIPS 2016 Yizhe Zhang, Xiangyu Wang, Changyou Chen, Ricardo Henao, Kai Fan, Lawrence Carin

We unify slice sampling and Hamiltonian Monte Carlo (HMC) sampling, demonstrating their connection via the Hamilton-Jacobi equation from Hamiltonian mechanics.

High-Order Stochastic Gradient Thermostats for Bayesian Learning of Deep Models

no code implementations23 Dec 2015 Chunyuan Li, Changyou Chen, Kai Fan, Lawrence Carin

Stochastic gradient MCMC algorithms (SG-MCMC) are a family of diffusion-based sampling methods for large-scale Bayesian learning.

Vocal Bursts Intensity Prediction

Fast Second Order Stochastic Backpropagation for Variational Inference

no code implementations NeurIPS 2015 Kai Fan, Ziteng Wang, Jeff Beck, James Kwok, Katherine A. Heller

We propose a second-order (Hessian or Hessian-free) based optimization method for variational inference inspired by Gaussian backpropagation, and argue that quasi-Newton optimization can be developed as well.

regression Variational Inference
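For context, the Gaussian backpropagation the abstract refers to rests on two classical identities (Bonnet's and Price's theorems) for expectations under $z \sim \mathcal{N}(\mu, C)$; these are standard results sketched here, not a claim about the paper's exact derivation. The Hessian in the second identity is what a second-order method can exploit.

```latex
\nabla_{\mu_i} \, \mathbb{E}_{\mathcal{N}(\mu, C)}\!\left[ f(z) \right]
  = \mathbb{E}_{\mathcal{N}(\mu, C)}\!\left[ \nabla_{z_i} f(z) \right],
\qquad
\nabla_{C_{ij}} \, \mathbb{E}_{\mathcal{N}(\mu, C)}\!\left[ f(z) \right]
  = \tfrac{1}{2} \, \mathbb{E}_{\mathcal{N}(\mu, C)}\!\left[ \nabla^2_{z_i z_j} f(z) \right].
```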

$k$-means: Fighting against Degeneracy in Sequential Monte Carlo with an Application to Tracking

no code implementations13 Nov 2015 Kai Fan, Katherine Heller

Specifically, we propose a Stochastic SMC algorithm which initializes the set of $k$ means, with the initial centers chosen from the collapsed particles.


Fast Second-Order Stochastic Backpropagation for Variational Inference

no code implementations9 Sep 2015 Kai Fan, Ziteng Wang, Jeff Beck, James Kwok, Katherine Heller

We propose a second-order (Hessian or Hessian-free) based optimization method for variational inference inspired by Gaussian backpropagation, and argue that quasi-Newton optimization can be developed as well.

regression Variational Inference

Efficient Algorithm for Privately Releasing Smooth Queries

no code implementations NeurIPS 2013 Ziteng Wang, Kai Fan, Jia-Qi Zhang, Li-Wei Wang

Outputting the summary runs in time $O(n^{1+\frac{d}{2d+K}})$, and the evaluation algorithm for answering a query runs in time $\tilde O (n^{\frac{d+2+\frac{2d}{K}}{2d+K}} )$.
