no code implementations • 29 Jul 2023 • Yiren Wang, Peter C B Phillips, Liangjun Su
Starting from a preliminary nuclear-norm-regularized estimation followed by row- and column-wise linear regressions, we estimate the break point using the idea of binary segmentation, and we simultaneously recover the latent group structures, together with the number of groups before and after the break, via a sequential testing K-means algorithm.
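A minimal sketch of the break-point step described above, assuming the standard singular-value soft-thresholding form of nuclear-norm regularization; the scan below is one binary-segmentation step for a single break, not the paper's full estimator, and `lam` and `trim` are illustrative tuning choices:

```python
import numpy as np

def nuclear_norm_fit(Y, lam):
    # Soft-threshold singular values: argmin_L 0.5*||Y - L||_F^2 + lam*||L||_*
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    return U @ np.diag(np.maximum(s - lam, 0.0)) @ Vt

def estimate_break(Y, lam, trim=5):
    # One binary-segmentation step: pick the split date whose two
    # low-rank fits leave the smallest total squared residual.
    N, T = Y.shape
    best_t, best_ssr = None, np.inf
    for t in range(trim, T - trim):
        ssr = (np.linalg.norm(Y[:, :t] - nuclear_norm_fit(Y[:, :t], lam)) ** 2
               + np.linalg.norm(Y[:, t:] - nuclear_norm_fit(Y[:, t:], lam)) ** 2)
        if ssr < best_ssr:
            best_t, best_ssr = t, ssr
    return best_t
```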
no code implementations • 24 May 2023 • Vikas Raunak, Amr Sharaf, Yiren Wang, Hany Hassan Awadallah, Arul Menezes
In this work, we formalize the task of direct translation post-editing with Large Language Models (LLMs) and explore the use of GPT-4 to automatically post-edit NMT outputs across several language pairs.
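A hedged sketch of what LLM-based post-editing can look like in practice, assuming the official OpenAI Python SDK; the prompt wording and the `post_edit` function are illustrative, not the paper's protocol:

```python
from openai import OpenAI  # assumes the official OpenAI Python SDK (v1+)

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def post_edit(source: str, draft: str, src_lang: str, tgt_lang: str) -> str:
    # Ask GPT-4 to post-edit an NMT draft given the source sentence.
    prompt = (
        f"Improve this {src_lang}->{tgt_lang} machine translation.\n"
        f"Source: {source}\n"
        f"Draft translation: {draft}\n"
        "Return only the corrected translation."
    )
    resp = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
    )
    return resp.choices[0].message.content.strip()
```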
no code implementations • 20 Oct 2022 • Yiren Wang, Liangjun Su, Yichong Zhang
In this paper, we propose a class of low-rank panel quantile regression models which allow for unobserved slope heterogeneity over both individuals and time.
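For orientation, quantile regression minimizes the check (pinball) loss, and a low-rank slope matrix can be written as an outer product of individual and time factors; the sketch below shows both ingredients under a deliberately simplified model with a scalar regressor (the factorization, loss form, and variable names are assumptions, not the paper's specification):

```python
import numpy as np

def check_loss(u, tau):
    # Check function rho_tau(u) = u * (tau - 1{u < 0}).
    return u * (tau - (u < 0).astype(float))

def objective(Y, X, lam, f, tau):
    # Hypothetical low-rank slope structure: theta[i, t] = lam[i] @ f[t],
    # so the slope varies over both individuals and time via rank-r factors.
    theta = lam @ f.T        # (N, T) slope matrix of rank r
    U = Y - X * theta        # simplified model: y_it = x_it * theta_it + e_it
    return check_loss(U, tau).mean()
```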
no code implementations • 5 Jul 2022 • Yiren Wang, Fatima Tuz-Zahra, Rong Zablocki, Chongzhi Di, Marta M. Jankowska, John Bellettiere, Jordan A. Carlson, Andrea Z. LaCroix, Sheri J. Hartman, Dori E. Rosenberg, Jingjing Zou, Loki Natarajan
Cohort studies are increasingly using accelerometers for physical activity and sedentary behavior estimation.
no code implementations • EMNLP 2020 • Yiren Wang, ChengXiang Zhai, Hany Hassan Awadalla
In this work, we propose a multi-task learning (MTL) framework that jointly trains the model with the translation task on bitext data and two denoising tasks on the monolingual data.
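A hedged sketch of the joint objective, assuming a simple weighted sum of the three losses; the token-dropout noising and the weight `w` are illustrative choices, not the paper's configuration:

```python
import random
import torch

def drop_tokens(tokens, p=0.1):
    # Illustrative noising for the denoising tasks: randomly drop tokens.
    kept = [t for t in tokens if random.random() > p]
    return kept or tokens[:1]  # never return an empty sequence

def mtl_loss(loss_mt: torch.Tensor,
             loss_denoise_src: torch.Tensor,
             loss_denoise_tgt: torch.Tensor,
             w: float = 0.5) -> torch.Tensor:
    # Translation loss on bitext plus two denoising losses on monolingual data.
    return loss_mt + w * (loss_denoise_src + loss_denoise_tgt)
```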
1 code implementation • NeurIPS 2019 • Yiren Wang, Yingce Xia, Fei Tian, Fei Gao, Tao Qin, ChengXiang Zhai, Tie-Yan Liu
Neural machine translation models usually adopt the encoder-decoder framework and generate translations from left to right (or right to left) without fully utilizing the target-side global information.
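To make the limitation concrete, standard greedy decoding conditions each step only on the prefix generated so far; `model` below is a hypothetical encoder-decoder exposing `encode`/`decode`, so this is a sketch of the baseline behavior the paper improves on, not the paper's method:

```python
import torch

@torch.no_grad()
def greedy_decode(model, src_ids, bos_id, eos_id, max_len=64):
    # Left-to-right decoding: step t sees tokens 0..t-1, never future tokens.
    memory = model.encode(src_ids)          # hypothetical encoder call
    ys = [bos_id]
    for _ in range(max_len):
        logits = model.decode(torch.tensor([ys]), memory)  # hypothetical
        next_id = int(logits[0, -1].argmax())
        ys.append(next_id)
        if next_id == eos_id:
            break
    return ys
```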
no code implementations • 22 Nov 2019 • Yiren Wang, Hongzhao Huang, Zhe Liu, Yutong Pang, Yongqiang Wang, ChengXiang Zhai, Fuchun Peng
Although n-gram language models (LMs) have been outperformed by state-of-the-art neural LMs, they are still widely used in speech recognition due to their high inference efficiency.
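The efficiency point is easy to see in code: n-gram inference is a couple of hash-table lookups. Below is a minimal count-based bigram model with add-one smoothing, an illustration of why n-gram LMs stay cheap at decode time, not the paper's model:

```python
from collections import Counter

class BigramLM:
    def __init__(self, sentences):
        self.unigrams, self.bigrams = Counter(), Counter()
        for s in sentences:
            toks = ["<s>"] + s.split() + ["</s>"]
            self.unigrams.update(toks)
            self.bigrams.update(zip(toks, toks[1:]))
        self.vocab_size = len(self.unigrams)

    def prob(self, prev, word):
        # Add-one smoothed conditional probability: two dictionary lookups.
        return (self.bigrams[(prev, word)] + 1) / (self.unigrams[prev] + self.vocab_size)
```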
no code implementations • WS 2019 • Yingce Xia, Xu Tan, Fei Tian, Fei Gao, Weicong Chen, Yang Fan, Linyuan Gong, Yichong Leng, Renqian Luo, Yiren Wang, Lijun Wu, Jinhua Zhu, Tao Qin, Tie-Yan Liu
We, Microsoft Research Asia, made submissions to 11 language directions in the WMT19 news translation tasks.
no code implementations • IJCNLP 2019 • Lijun Wu, Yiren Wang, Yingce Xia, Tao Qin, Jian-Huang Lai, Tie-Yan Liu
In this work, we study how to use both the source-side and target-side monolingual data for NMT, and propose an effective strategy leveraging both of them.
Ranked #1 on Machine Translation on WMT2016 English-German (SacreBLEU metric, using extra training data)
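The generic recipe for using both kinds of monolingual data is back-translation on the target side and forward translation (self-training) on the source side; the sketch below shows that recipe, with `src2tgt` and `tgt2src` standing for hypothetical trained translation callables (the paper's actual strategy for combining the synthetic data may differ):

```python
def augment(bitext, mono_src, mono_tgt, src2tgt, tgt2src):
    # Target-side monolingual text: back-translate to get synthetic sources.
    back_translated = [(tgt2src(y), y) for y in mono_tgt]
    # Source-side monolingual text: forward-translate to get synthetic targets.
    forward_translated = [(x, src2tgt(x)) for x in mono_src]
    return bitext + back_translated + forward_translated
```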
1 code implementation • ACL 2019 • Lijun Wu, Yiren Wang, Yingce Xia, Fei Tian, Fei Gao, Tao Qin, Jian-Huang Lai, Tie-Yan Liu
While very deep neural networks have shown effectiveness for computer vision and text classification applications, how to increase the network depth of neural machine translation (NMT) models for better translation quality remains a challenging problem.
Ranked #11 on Machine Translation on WMT2014 English-French
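One generic way to deepen an NMT model without destabilizing training is to grow it: keep (and optionally freeze) a trained shallow stack and append freshly initialized layers on top. The sketch below illustrates that idea with PyTorch Transformer layers; all hyperparameters are assumptions, and this is not claimed to be the paper's exact scheme:

```python
import torch.nn as nn

def grow_encoder(trained_layers, extra, d_model=512, nhead=8):
    # Freeze the pretrained bottom stack so only the new layers train at first.
    for layer in trained_layers:
        for p in layer.parameters():
            p.requires_grad = False
    new_layers = [nn.TransformerEncoderLayer(d_model, nhead) for _ in range(extra)]
    return nn.ModuleList(list(trained_layers) + new_layers)
```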
no code implementations • ICLR 2019 • Yiren Wang, Yingce Xia, Tianyu He, Fei Tian, Tao Qin, ChengXiang Zhai, Tie-Yan Liu
Dual learning has attracted much attention in the machine learning, computer vision, and natural language processing communities.
Ranked #1 on Machine Translation on WMT2016 English-German
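The core dual-learning signal is round-trip agreement: translate forward, translate back, and reward reconstruction of the original input. In the sketch below, `f`, `g`, and `score` are hypothetical callables (primal model, dual model, and a sentence-level similarity such as BLEU); the paper's formulation is richer than this minimal loop:

```python
def dual_signal(x, f, g, score):
    y_hat = f(x)             # primal model: source -> target
    x_rec = g(y_hat)         # dual model: target -> source
    return score(x, x_rec)   # round-trip reconstruction reward
```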
no code implementations • 22 Feb 2019 • Yiren Wang, Fei Tian, Di He, Tao Qin, ChengXiang Zhai, Tie-Yan Liu
However, the high efficiency of non-autoregressive translation (NAT) has come at the cost of not capturing the sequential dependency on the target side, which causes NAT to suffer from two kinds of translation errors: 1) repeated translations (due to indistinguishable adjacent decoder hidden states), and 2) incomplete translations (due to incomplete transfer of source-side information via the decoder hidden states).
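Since repeated translations stem from near-duplicate adjacent decoder states, one natural auxiliary regularizer penalizes high similarity between neighboring hidden states; the cosine-based penalty below is a sketch of that idea, not necessarily the exact form used in the paper:

```python
import torch
import torch.nn.functional as F

def adjacent_state_penalty(H: torch.Tensor) -> torch.Tensor:
    # H: decoder hidden states of shape (batch, length, dim).
    # Penalize high cosine similarity between neighboring positions,
    # which is what drives NAT to emit repeated tokens.
    sim = F.cosine_similarity(H[:, :-1, :], H[:, 1:, :], dim=-1)
    return sim.clamp(min=0).mean()
```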