Search Results for author: Kai Fan

Found 38 papers, 8 papers with code

Structural Supervision for Word Alignment and Machine Translation

no code implementations • Findings (ACL) 2022 • Lei LI, Kai Fan, Hongjia Li, Chun Yuan

Syntactic structure has long been argued to be potentially useful for enforcing accurate word alignment and improving generalization performance of machine translation.

Machine Translation Multi-Task Learning +2

Alibaba Speech Translation Systems for IWSLT 2018

no code implementations • IWSLT (EMNLP) 2018 • Nguyen Bach, Hongjie Chen, Kai Fan, Cheung-Chi Leung, Bo Li, Chongjia Ni, Rong Tong, Pei Zhang, Boxing Chen, Bin Ma, Fei Huang

This work describes the En→De Alibaba speech translation system developed for the evaluation campaign of the International Workshop on Spoken Language Translation (IWSLT) 2018.

Sentence Translation

Manifold Adversarial Augmentation for Neural Machine Translation

no code implementations • Findings (ACL) 2021 • Guandan Chen, Kai Fan, Kaibo Zhang, Boxing Chen, Zhongqiang Huang

Machine Translation Translation

Alibaba’s Submission for the WMT 2020 APE Shared Task: Improving Automatic Post-Editing with Pre-trained Conditional Cross-Lingual BERT

no code implementations • WMT (EMNLP) 2020 • Jiayi Wang, Ke Wang, Kai Fan, Yuqi Zhang, Jun Lu, Xin Ge, Yangbin Shi, Yu Zhao

We also apply an imitation learning strategy to augment a reasonable amount of pseudo APE training data, potentially preventing the model from overfitting on the limited real training data and boosting the performance on held-out data.

Automatic Post-Editing Benchmarking +4

Probing Multi-modal Machine Translation with Pre-trained Language Model

no code implementations • Findings (ACL) 2021 • Kong Yawei, Kai Fan

Language Modelling Machine Translation +1

MARIO Eval: Evaluate Your Math LLM with your Math LLM--A mathematical dataset evaluation toolkit

1 code implementation • 22 Apr 2024 • Boning Zhang, Chengxi Li, Kai Fan

Large language models (LLMs) have been explored in a variety of reasoning tasks, including solving mathematical problems.

MARIO: MAth Reasoning with code Interpreter Output -- A Reproducible Pipeline

1 code implementation • 16 Jan 2024 • Minpeng Liao, Wei Luo, Chengxi Li, Jing Wu, Kai Fan

Large language models (LLMs) have seen considerable advancements in natural language understanding tasks, yet there remains a gap to bridge before attaining true artificial general intelligence, especially concerning shortcomings in mathematical reasoning capabilities.

GSM8K Math +2

Adaptive Policy with Wait-$k$ Model for Simultaneous Translation

no code implementations • 23 Oct 2023 • Libo Zhao, Kai Fan, Wei Luo, Jing Wu, Shushu Wang, Ziqian Zeng, Zhongqiang Huang

Simultaneous machine translation (SiMT) requires a robust read/write policy in conjunction with a high-quality translation model.

Machine Translation Translation

Neural Machine Translation with Dynamic Graph Convolutional Decoder

no code implementations • 28 May 2023 • Lei LI, Kai Fan, Lingyu Yang, Hongjia Li, Chun Yuan

Existing wisdom demonstrates the significance of syntactic knowledge for the improvement of neural machine translation models.

Machine Translation Translation

Translate the Beauty in Songs: Jointly Learning to Align Melody and Translate Lyrics

no code implementations • 28 Mar 2023 • Chengxi Li, Kai Fan, Jiajun Bu, Boxing Chen, Zhongqiang Huang, Zhi Yu

Song translation requires both translation of lyrics and alignment of music notes so that the resulting verse can be sung to the accompanying melody, which is a challenging problem that has attracted some interest in different aspects of the translation process.

Translation

Adapting Offline Speech Translation Models for Streaming with Future-Aware Distillation and Inference

1 code implementation • 14 Mar 2023 • Biao Fu, Minpeng Liao, Kai Fan, Zhongqiang Huang, Boxing Chen, Yidong Chen, Xiaodong Shi

A popular approach to streaming speech translation is to employ a single offline model with a wait-k policy to support different latency requirements, which is simpler than training multiple online models with different latency constraints.

FAD Translation

Competency-Aware Neural Machine Translation: Can Machine Translation Know its Own Translation Quality?

1 code implementation • 25 Nov 2022 • Pei Zhang, Baosong Yang, Haoran Wei, Dayiheng Liu, Kai Fan, Luo Si, Jun Xie

The lack of competency awareness makes NMT untrustworthy.

Machine Translation NMT +2

A scalable pipeline for COVID-19: the case study of Germany, Czechia and Poland

no code implementations • 27 Aug 2022 • Wildan Abdussalam, Adam Mertel, Kai Fan, Lennart Schüler, Weronika Schlechte-Wełnicz, Justin M. Calabrese

Here we report the design of a scalable pipeline, named where2test, which synchronizes data to support inter-country top-down spatiotemporal observations and forecasting models of COVID-19 for Germany, Czechia and Poland.

Decision Making

Efficient Cluster-Based k-Nearest-Neighbor Machine Translation

2 code implementations • ACL 2022 • Dexin Wang, Kai Fan, Boxing Chen, Deyi Xiong

k-Nearest-Neighbor Machine Translation (kNN-MT) has been recently proposed as a non-parametric solution for domain adaptation in neural machine translation (NMT).

Contrastive Learning Domain Adaptation +4

StrokeNet: Stroke Assisted and Hierarchical Graph Reasoning Networks

no code implementations • 23 Nov 2021 • Lei LI, Kai Fan, Chun Yuan

Scene text detection is still a challenging task, as there may be extremely small or low-resolution strokes, and close or arbitrary-shaped texts.

Node Classification Relational Reasoning +2

Unifying Cross-lingual Summarization and Machine Translation with Compression Rate

1 code implementation • 15 Oct 2021 • Yu Bai, Heyan Huang, Kai Fan, Yang Gao, Yiming Zhu, Jiaao Zhan, Zewen Chi, Boxing Chen

By introducing the compression rate, i.e., the information ratio between the source and the target text, we regard the MT task as a special CLS task with a compression rate of 100%.

Data Augmentation Machine Translation +1

Long-Short Term Masking Transformer: A Simple but Effective Baseline for Document-level Neural Machine Translation

no code implementations • EMNLP 2020 • Pei Zhang, Boxing Chen, Niyu Ge, Kai Fan

In this paper, we extensively study the pros and cons of the standard transformer in document-level translation, and find that the auto-regressive property brings both the advantage of consistency and the disadvantage of error accumulation.

Machine Translation NMT +1

Computer Assisted Translation with Neural Quality Estimation and Automatic Post-Editing

no code implementations • Findings of the Association for Computational Linguistics 2020 • Jiayi Wang, Ke Wang, Niyu Ge, Yangbing Shi, Yu Zhao, Kai Fan

With the advent of neural machine translation, there has been a marked shift towards leveraging and consuming the machine translation results.

Automatic Post-Editing Translation

A Practical Framework for Relation Extraction with Noisy Labels Based on Doubly Transitional Loss

no code implementations • 28 Apr 2020 • Shanchan Wu, Kai Fan

One transition is basically parameterized by a non-linear transformation between hidden layers that implicitly represents the conversion between the true and noisy labels, and it can be readily optimized together with other model parameters.

Relation Relation Extraction

Neural Zero-Inflated Quality Estimation Model For Automatic Speech Recognition System

no code implementations • 3 Oct 2019 • Kai Fan, Jiayi Wang, Bo Li, Shiliang Zhang, Boxing Chen, Niyu Ge, Zhijie Yan

The performance of automatic speech recognition (ASR) systems is usually evaluated by the word error rate (WER) metric when manually transcribed data are provided; such data, however, are expensive to obtain in real scenarios.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +4

Lattice Transformer for Speech Translation

no code implementations • ACL 2019 • Pei Zhang, Boxing Chen, Niyu Ge, Kai Fan

Recent advances in sequence modeling have highlighted the strengths of the transformer architecture, especially in achieving state-of-the-art machine translation results.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +3

Unsupervised Multi-modal Neural Machine Translation

no code implementations • CVPR 2019 • Yuanhang Su, Kai Fan, Nguyen Bach, C. -C. Jay Kuo, Fei Huang

Unsupervised neural machine translation (UNMT) has recently achieved remarkable results with only large monolingual corpora in each language.

Machine Translation Translation

Improving Distantly Supervised Relation Extraction with Neural Noise Converter and Conditional Optimal Selector

no code implementations • 14 Nov 2018 • Shanchan Wu, Kai Fan, Qiong Zhang

Distantly supervised relation extraction has been successfully applied to large corpora with thousands of relations.

Relation Relation Extraction

Alibaba Submission for WMT18 Quality Estimation Task

no code implementations • WS 2018 • Jiayi Wang, Kai Fan, Bo Li, Fengming Zhou, Boxing Chen, Yangbin Shi, Luo Si

The goal of WMT 2018 Shared Task on Translation Quality Estimation is to investigate automatic methods for estimating the quality of machine translation results without reference translations.

Automatic Post-Editing Language Modelling +2

"Bilingual Expert" Can Find Translation Errors

1 code implementation • 25 Jul 2018 • Kai Fan, Jiayi Wang, Bo Li, Fengming Zhou, Boxing Chen, Luo Si

Recent advances in statistical machine translation via the adoption of neural sequence-to-sequence models empower the end-to-end system to achieve state-of-the-art results on many WMT benchmarks.

Language Modelling Machine Translation +1

InverseNet: Solving Inverse Problems with Splitting Networks

no code implementations • 1 Dec 2017 • Kai Fan, Qi Wei, Wenlin Wang, Amit Chakraborty, Katherine Heller

We propose a new method that uses deep learning techniques to solve inverse problems.

Colorization Deblurring +2

Zero-Shot Learning via Class-Conditioned Deep Generative Models

no code implementations • 15 Nov 2017 • Wenlin Wang, Yunchen Pu, Vinay Kumar Verma, Kai Fan, Yizhe Zhang, Changyou Chen, Piyush Rai, Lawrence Carin

We present a deep generative model for learning to predict classes not seen at training time.

Few-Shot Learning Zero-Shot Learning

An inner-loop free solution to inverse problems using deep neural networks

no code implementations • NeurIPS 2017 • Qi Wei, Kai Fan, Lawrence Carin, Katherine A. Heller

For matrix inversion in the second sub-problem, we learn a convolutional neural network to approximate the matrix inversion, i.e., the inverse mapping is learned by feeding the input through the learned forward network.

Denoising

Adversarial Feature Matching for Text Generation

1 code implementation • ICML 2017 • Yizhe Zhang, Zhe Gan, Kai Fan, Zhi Chen, Ricardo Henao, Dinghan Shen, Lawrence Carin

We propose a framework for generating realistic text via adversarial training.

Generative Adversarial Network Text Generation

Unifying the Stochastic Spectral Descent for Restricted Boltzmann Machines with Bernoulli or Gaussian Inputs

no code implementations • 28 Mar 2017 • Kai Fan

Stochastic gradient descent based algorithms are typically used as the general optimization tools for most deep learning models.

Boosting Variational Inference

no code implementations • 17 Nov 2016 • Fangjian Guo, Xiangyu Wang, Kai Fan, Tamara Broderick, David B. Dunson

Variational inference (VI) provides fast approximations of a Bayesian posterior in part because it formulates posterior approximation as an optimization problem: to find the closest distribution to the exact posterior over some family of distributions.

Variational Inference

Towards Unifying Hamiltonian Monte Carlo and Slice Sampling

no code implementations • NeurIPS 2016 • Yizhe Zhang, Xiangyu Wang, Changyou Chen, Ricardo Henao, Kai Fan, Lawrence Carin

We unify slice sampling and Hamiltonian Monte Carlo (HMC) sampling, demonstrating their connection via the Hamilton-Jacobi equation from Hamiltonian mechanics.

High-Order Stochastic Gradient Thermostats for Bayesian Learning of Deep Models

no code implementations • 23 Dec 2015 • Chunyuan Li, Changyou Chen, Kai Fan, Lawrence Carin

Stochastic gradient MCMC algorithms (SG-MCMC) are a family of diffusion-based sampling methods for large-scale Bayesian learning.

Vocal Bursts Intensity Prediction

Fast Second Order Stochastic Backpropagation for Variational Inference

no code implementations • NeurIPS 2015 • Kai Fan, Ziteng Wang, Jeff Beck, James Kwok, Katherine A. Heller

We propose a second-order (Hessian or Hessian-free) based optimization method for variational inference inspired by Gaussian backpropagation, and argue that quasi-Newton optimization can be developed as well.

regression Variational Inference

$k$-means: Fighting against Degeneracy in Sequential Monte Carlo with an Application to Tracking

no code implementations • 13 Nov 2015 • Kai Fan, Katherine Heller

Specifically, we propose a Stochastic SMC algorithm which initializes the set of $k$ means, providing the initial centers chosen from the collapsed particles.

Clustering

Fast Second-Order Stochastic Backpropagation for Variational Inference

no code implementations • 9 Sep 2015 • Kai Fan, Ziteng Wang, Jeff Beck, James Kwok, Katherine Heller

regression Variational Inference

Efficient Algorithm for Privately Releasing Smooth Queries

no code implementations • NeurIPS 2013 • Ziteng Wang, Kai Fan, Jia-Qi Zhang, Li-Wei Wang

Outputting the summary runs in time $O(n^{1+\frac{d}{2d+K}})$, and the evaluation algorithm for answering a query runs in time $\tilde O (n^{\frac{d+2+\frac{2d}{K}}{2d+K}} )$.

A Novel Burst-based Text Representation Model for Scalable Event Detection

no code implementations • ACL 2012 • Xin Zhao, Rishan Chen, Kai Fan, Hongfei Yan, Xiaoming Li

Event Detection
