no code implementations • IWSLT (EMNLP) 2018 • Nguyen Bach, Hongjie Chen, Kai Fan, Cheung-Chi Leung, Bo Li, Chongjia Ni, Rong Tong, Pei Zhang, Boxing Chen, Bin Ma, Fei Huang
This work describes the En→De Alibaba speech translation system developed for the evaluation campaign of the International Workshop on Spoken Language Translation (IWSLT) 2018.
no code implementations • WMT (EMNLP) 2020 • Jiayi Wang, Ke Wang, Kai Fan, Yuqi Zhang, Jun Lu, Xin Ge, Yangbin Shi, Yu Zhao
We also apply an imitation learning strategy to augment a reasonable amount of pseudo APE training data, potentially preventing the model from overfitting on the limited real training data and boosting performance on held-out data.
no code implementations • Findings (ACL) 2022 • Lei LI, Kai Fan, Hongjia Li, Chun Yuan
Syntactic structure has long been argued to be potentially useful for enforcing accurate word alignment and improving generalization performance of machine translation.
no code implementations • 17 Jun 2024 • Zhipeng Qian, Pei Zhang, Baosong Yang, Kai Fan, Yiwei Ma, Derek F. Wong, Xiaoshuai Sun, Rongrong Ji
This paper introduces AnyTrans, an all-encompassing framework for the task of Translate AnyText in the Image (TATI), which includes multilingual text translation and text fusion within images.
1 code implementation • 16 Jun 2024 • Guoxin Chen, Minpeng Liao, Chengxi Li, Kai Fan
Furthermore, from the perspective of learning-to-rank, we train an explicit value model to replicate the behavior of the implicit reward model, complementing standard preference optimization.
no code implementations • 29 May 2024 • Wenjie Li, Kai Fan, Jingyuan Zhang, Hui Li, Wei Yang Bryan Lim, Qiang Yang
Federated Learning (FL) is a promising privacy-preserving machine learning paradigm that allows data owners to collaboratively train models while keeping their data localized.
1 code implementation • 6 May 2024 • Guoxin Chen, Minpeng Liao, Chengxi Li, Kai Fan
Although recent advancements in large language models (LLMs) have significantly improved their performance on various tasks, they still face challenges with complex and symbolic multi-step reasoning, particularly in mathematical reasoning.
2 code implementations • 22 Apr 2024 • Boning Zhang, Chengxi Li, Kai Fan
Large language models (LLMs) have been explored in a variety of reasoning tasks, including solving mathematical problems.
2 code implementations • 16 Jan 2024 • Minpeng Liao, Wei Luo, Chengxi Li, Jing Wu, Kai Fan
Large language models (LLMs) have seen considerable advancements in natural language understanding tasks, yet there remains a gap to bridge before attaining true artificial general intelligence, especially concerning shortcomings in mathematical reasoning capabilities.
no code implementations • 23 Oct 2023 • Libo Zhao, Kai Fan, Wei Luo, Jing Wu, Shushu Wang, Ziqian Zeng, Zhongqiang Huang
Simultaneous machine translation (SiMT) requires a robust read/write policy in conjunction with a high-quality translation model.
no code implementations • 28 May 2023 • Lei LI, Kai Fan, Lingyu Yang, Hongjia Li, Chun Yuan
Existing wisdom demonstrates the significance of syntactic knowledge for the improvement of neural machine translation models.
no code implementations • 28 Mar 2023 • Chengxi Li, Kai Fan, Jiajun Bu, Boxing Chen, Zhongqiang Huang, Zhi Yu
Song translation requires both translation of lyrics and alignment of music notes so that the resulting verse can be sung to the accompanying melody; this challenging problem has attracted some interest in different aspects of the translation process.
1 code implementation • 14 Mar 2023 • Biao Fu, Minpeng Liao, Kai Fan, Zhongqiang Huang, Boxing Chen, Yidong Chen, Xiaodong Shi
A popular approach to streaming speech translation is to employ a single offline model with a wait-k policy to support different latency requirements, which is simpler than training multiple online models with different latency constraints.
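The wait-k policy mentioned above has a simple read/write schedule: read k source tokens before emitting the first target token, then alternate one write per read until the source is exhausted. The sketch below illustrates that schedule only; the function name `wait_k_schedule` is illustrative and not from the paper, which concerns applying a single offline model under such policies.

```python
def wait_k_schedule(src_len, tgt_len, k):
    """Generate the READ/WRITE action sequence of a wait-k policy.

    The decoder first READs k source tokens, then alternates
    WRITE/READ; once the source is exhausted it only WRITEs.
    """
    actions = []
    read, written = 0, 0
    while written < tgt_len:
        if read < min(k + written, src_len):
            actions.append("READ")
            read += 1
        else:
            actions.append("WRITE")
            written += 1
    return actions

# With 5 source tokens, 5 target tokens, and k=2, the decoder reads
# two tokens up front and then interleaves writes with reads.
schedule = wait_k_schedule(5, 5, 2)
```

A smaller k lowers latency but gives the translation model less source context per emitted token, which is exactly the latency/quality trade-off the wait-k family exposes.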
1 code implementation • 25 Nov 2022 • Pei Zhang, Baosong Yang, Haoran Wei, Dayiheng Liu, Kai Fan, Luo Si, Jun Xie
The lack of competency awareness makes NMT untrustworthy.
no code implementations • 27 Aug 2022 • Wildan Abdussalam, Adam Mertel, Kai Fan, Lennart Schüler, Weronika Schlechte-Wełnicz, Justin M. Calabrese
Here we report the design of a scalable pipeline, named where2test, which serves as a data-synchronization layer supporting inter-country, top-down spatiotemporal observation and forecasting models of COVID-19 for Germany, Czechia, and Poland.
2 code implementations • ACL 2022 • Dexin Wang, Kai Fan, Boxing Chen, Deyi Xiong
k-Nearest-Neighbor Machine Translation (kNN-MT) has been recently proposed as a non-parametric solution for domain adaptation in neural machine translation (NMT).
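The core kNN-MT mechanism retrieves the k nearest cached decoder states from a datastore and interpolates their target-token distribution with the base model's distribution. The sketch below shows that standard interpolation in NumPy; the function name and parameter values are illustrative, and the paper's specific contribution on top of vanilla kNN-MT is not reproduced here.

```python
import numpy as np

def knn_mt_prob(model_probs, hidden, datastore_keys, datastore_vals,
                k=2, temperature=10.0, lam=0.5):
    """Interpolate an NMT distribution with a kNN retrieval distribution.

    model_probs: base model distribution over the vocabulary.
    hidden: decoder hidden state at the current step.
    datastore_keys/vals: cached hidden states and their target tokens.
    """
    # Squared L2 distance from the query state to every datastore key.
    dists = np.linalg.norm(datastore_keys - hidden, axis=1) ** 2
    nn = np.argsort(dists)[:k]
    # Softmax over negative distances of the k nearest neighbors.
    weights = np.exp(-dists[nn] / temperature)
    weights /= weights.sum()
    # Scatter neighbor weights onto their target tokens.
    knn_probs = np.zeros_like(model_probs)
    for w, tok in zip(weights, datastore_vals[nn]):
        knn_probs[tok] += w
    return lam * knn_probs + (1 - lam) * model_probs
```

Because the datastore is built from in-domain parallel data, this retrieval step adapts the model to a new domain without any parameter updates, which is what makes the approach non-parametric.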
no code implementations • 23 Nov 2021 • Lei LI, Kai Fan, Chun Yuan
Scene text detection is still a challenging task, as strokes may be extremely small or low-resolution, and texts may be closely spaced or arbitrarily shaped.
1 code implementation • 15 Oct 2021 • Yu Bai, Heyan Huang, Kai Fan, Yang Gao, Yiming Zhu, Jiaao Zhan, Zewen Chi, Boxing Chen
By introducing the compression rate, the information ratio between the source and the target text, we regard the MT task as a special CLS task with a compression rate of 100%.
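As a rough illustration of the compression-rate idea, the ratio can be approximated at the token level: plain MT keeps the target about as long as the source (rate near 100%), while cross-lingual summarization targets are much shorter. This is a simplified stand-in, and the paper's exact definition of the information ratio may differ.

```python
def compression_rate(source_tokens, target_tokens):
    """Token-level approximation of the compression rate: the ratio of
    target length to source length.  MT sits near 1.0 (100%); a
    summarization pair sits well below it."""
    return len(target_tokens) / len(source_tokens)

mt_rate = compression_rate("the cat sat on the mat".split(),
                           "die Katze sass auf der Matte".split())
sum_rate = compression_rate("the cat sat on the mat today".split(),
                            "cat sat".split())
```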
no code implementations • Findings of the Association for Computational Linguistics 2020 • Jiayi Wang, Ke Wang, Niyu Ge, Yangbing Shi, Yu Zhao, Kai Fan
With the advent of neural machine translation, there has been a marked shift towards leveraging and consuming the machine translation results.
no code implementations • EMNLP 2020 • Pei Zhang, Boxing Chen, Niyu Ge, Kai Fan
In this paper, we extensively investigate the pros and cons of the standard transformer in document-level translation, and find that the auto-regressive property simultaneously brings both the advantage of consistency and the disadvantage of error accumulation.
no code implementations • 28 Apr 2020 • Shanchan Wu, Kai Fan
The transition is parameterized by a non-linear transformation between hidden layers that implicitly represents the conversion between the true and noisy labels, and it can be readily optimized together with the other model parameters.
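A common way to make such a label-noise transition explicit is a row-stochastic matrix T with T[i, j] = P(noisy = j | true = i), applied on top of the classifier's clean-label distribution. The sketch below shows that standard formulation in NumPy as context; it is not the paper's exact (implicit, non-linear) parameterization, and the function names are illustrative.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def noisy_label_probs(clean_logits, transition_logits):
    """Map the model's clean-label distribution through a learned
    noise-transition matrix T, where T[i, j] = P(noisy=j | true=i).

    Both the classifier and T are trained against the noisy labels;
    at test time the clean distribution p_clean is used directly.
    """
    p_clean = softmax(clean_logits)           # (batch, C)
    T = softmax(transition_logits, axis=1)    # (C, C), rows sum to 1
    return p_clean @ T                        # (batch, C) noisy dist
```

Training the cross-entropy loss on the output of this layer lets gradients flow into both the classifier and the transition parameters, which is what "optimized together with other model parameters" means in practice.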
no code implementations • 3 Oct 2019 • Kai Fan, Jiayi Wang, Bo Li, Shiliang Zhang, Boxing Chen, Niyu Ge, Zhijie Yan
The performance of automatic speech recognition (ASR) systems is usually evaluated by the word error rate (WER) metric when manually transcribed data are provided; such transcriptions are, however, expensive to obtain in real-world scenarios.
Automatic Speech Recognition (ASR) +4
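For context, WER is the word-level Levenshtein distance between reference and hypothesis, normalized by the reference length. The sketch below is the standard dynamic-programming computation, not anything specific to the paper (whose point is estimating WER without references).

```python
def word_error_rate(reference, hypothesis):
    """WER = (substitutions + insertions + deletions) / len(reference),
    computed via Levenshtein distance over whitespace-split words."""
    ref, hyp = reference.split(), hypothesis.split()
    # d[i][j]: edit distance between ref[:i] and hyp[:j].
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,          # deletion
                          d[i][j - 1] + 1,          # insertion
                          d[i - 1][j - 1] + cost)   # substitution
    return d[-1][-1] / len(ref)

# One substitution out of three reference words.
wer = word_error_rate("the cat sat", "the cat sit")
```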
no code implementations • ACL 2019 • Pei Zhang, Boxing Chen, Niyu Ge, Kai Fan
Recent advances in sequence modeling have highlighted the strengths of the transformer architecture, especially in achieving state-of-the-art machine translation results.
Automatic Speech Recognition (ASR) +3
no code implementations • CVPR 2019 • Yuanhang Su, Kai Fan, Nguyen Bach, C. -C. Jay Kuo, Fei Huang
Unsupervised neural machine translation (UNMT) has recently achieved remarkable results with only large monolingual corpora in each language.
no code implementations • 14 Nov 2018 • Shanchan Wu, Kai Fan, Qiong Zhang
Distant supervised relation extraction has been successfully applied to large corpora with thousands of relations.
no code implementations • WS 2018 • Jiayi Wang, Kai Fan, Bo Li, Fengming Zhou, Boxing Chen, Yangbin Shi, Luo Si
The goal of WMT 2018 Shared Task on Translation Quality Estimation is to investigate automatic methods for estimating the quality of machine translation results without reference translations.
1 code implementation • 25 Jul 2018 • Kai Fan, Jiayi Wang, Bo Li, Fengming Zhou, Boxing Chen, Luo Si
Recent advances in statistical machine translation via the adoption of neural sequence-to-sequence models empower the end-to-end system to achieve state-of-the-art in many WMT benchmarks.
no code implementations • 1 Dec 2017 • Kai Fan, Qi Wei, Wenlin Wang, Amit Chakraborty, Katherine Heller
We propose a new method that uses deep learning techniques to solve inverse problems.
no code implementations • 15 Nov 2017 • Wenlin Wang, Yunchen Pu, Vinay Kumar Verma, Kai Fan, Yizhe Zhang, Changyou Chen, Piyush Rai, Lawrence Carin
We present a deep generative model for learning to predict classes not seen at training time.
no code implementations • NeurIPS 2017 • Qi Wei, Kai Fan, Lawrence Carin, Katherine A. Heller
For matrix inversion in the second sub-problem, we learn a convolutional neural network to approximate the matrix inversion, i.e., the inverse mapping is learned by feeding the input through the learned forward network.
1 code implementation • ICML 2017 • Yizhe Zhang, Zhe Gan, Kai Fan, Zhi Chen, Ricardo Henao, Dinghan Shen, Lawrence Carin
We propose a framework for generating realistic text via adversarial training.
no code implementations • 28 Mar 2017 • Kai Fan
Stochastic gradient descent based algorithms are typically used as the general optimization tools for most deep learning models.
no code implementations • 17 Nov 2016 • Fangjian Guo, Xiangyu Wang, Kai Fan, Tamara Broderick, David B. Dunson
Variational inference (VI) provides fast approximations of a Bayesian posterior in part because it formulates posterior approximation as an optimization problem: to find the closest distribution to the exact posterior over some family of distributions.
no code implementations • NeurIPS 2016 • Yizhe Zhang, Xiangyu Wang, Changyou Chen, Ricardo Henao, Kai Fan, Lawrence Carin
We unify slice sampling and Hamiltonian Monte Carlo (HMC) sampling, demonstrating their connection via the Hamilton-Jacobi equation from Hamiltonian mechanics.
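For background on the HMC side of that connection, the sketch below is the standard leapfrog integrator that simulates Hamiltonian dynamics with potential energy U(q) = -log p(q). It is shown only as context; the paper's unified slice/HMC sampler is not reproduced here, and the function name is illustrative.

```python
def leapfrog(q, p, grad_u, step, n_steps):
    """Standard HMC leapfrog trajectory for scalar position q and
    momentum p, with grad_u the gradient of the potential energy.
    Leapfrog is symplectic, so the Hamiltonian H = U(q) + p**2 / 2
    is approximately conserved along the trajectory."""
    p = p - 0.5 * step * grad_u(q)        # initial half-step momentum
    for _ in range(n_steps - 1):
        q = q + step * p                  # full-step position
        p = p - step * grad_u(q)          # full-step momentum
    q = q + step * p
    p = p - 0.5 * step * grad_u(q)        # final half-step momentum
    return q, p

# Standard Gaussian target: U(q) = q**2 / 2, so grad U(q) = q.
q1, p1 = leapfrog(1.0, 0.0, lambda q: q, 0.01, 100)
```

Near-exact energy conservation is what gives HMC its high Metropolis acceptance rates; slice sampling exploits a different invariance, and relating the two is the paper's subject.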
no code implementations • 23 Dec 2015 • Chunyuan Li, Changyou Chen, Kai Fan, Lawrence Carin
Stochastic gradient MCMC algorithms (SG-MCMC) are a family of diffusion-based sampling methods for large-scale Bayesian learning.
no code implementations • NeurIPS 2015 • Kai Fan, Ziteng Wang, Jeff Beck, James Kwok, Katherine A. Heller
We propose a second-order (Hessian or Hessian-free) based optimization method for variational inference inspired by Gaussian backpropagation, and argue that quasi-Newton optimization can be developed as well.
no code implementations • 13 Nov 2015 • Kai Fan, Katherine Heller
Specifically, we propose a Stochastic SMC algorithm that initializes the set of $k$ means by choosing the initial centers from the collapsed particles.
no code implementations • 9 Sep 2015 • Kai Fan, Ziteng Wang, Jeff Beck, James Kwok, Katherine Heller
We propose a second-order (Hessian or Hessian-free) based optimization method for variational inference inspired by Gaussian backpropagation, and argue that quasi-Newton optimization can be developed as well.
no code implementations • NeurIPS 2013 • Ziteng Wang, Kai Fan, Jia-Qi Zhang, Li-Wei Wang
Outputting the summary runs in time $O(n^{1+\frac{d}{2d+K}})$, and the evaluation algorithm for answering a query runs in time $\tilde O (n^{\frac{d+2+\frac{2d}{K}}{2d+K}} )$.