1 code implementation • 28 Feb 2024 • Vilém Zouhar, Shuoyang Ding, Anna Currey, Tatyana Badeka, Jenyuan Wang, Brian Thompson
We introduce a new, extensive Multidimensional Quality Metrics (MQM)-annotated dataset covering 11 language pairs in the biomedical domain.
1 code implementation • AMTA 2022 • Weiting Tan, Shuoyang Ding, Huda Khayrallah, Philipp Koehn
Neural Machine Translation (NMT) models are known to suffer from noisy inputs.
no code implementations • WMT (EMNLP) 2021 • Shuoyang Ding, Marcin Junczys-Dowmunt, Matt Post, Christian Federmann, Philipp Koehn
This paper presents the JHU-Microsoft joint submission for WMT 2021 quality estimation shared task.
1 code implementation • EMNLP 2021 • Shuoyang Ding, Marcin Junczys-Dowmunt, Matt Post, Philipp Koehn
We propose a novel scheme to use the Levenshtein Transformer to perform the task of word-level quality estimation.
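As context for word-level quality estimation: the standard way to derive OK/BAD word labels is from an edit-distance alignment between the machine translation and its post-edit. The sketch below illustrates that labeling with `difflib` on a made-up sentence pair; it is a toy illustration of the task setup, not the paper's Levenshtein Transformer method, and the example words are invented.

```python
import difflib

def word_tags(mt, pe):
    # Tag each MT word OK if it survives unchanged into the post-edit,
    # BAD otherwise, using a longest-common-subsequence style alignment.
    sm = difflib.SequenceMatcher(a=mt, b=pe)
    tags = ["BAD"] * len(mt)
    for block in sm.get_matching_blocks():
        for i in range(block.a, block.a + block.size):
            tags[i] = "OK"
    return tags

# hypothetical MT output and human post-edit
mt = ["the", "cat", "sit", "on", "mat"]
pe = ["the", "cat", "sat", "on", "the", "mat"]
tags = word_tags(mt, pe)  # only "sit" is marked BAD
```

The paper's contribution is predicting such tags directly with a Levenshtein Transformer rather than computing them from a reference post-edit.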
1 code implementation • NAACL 2021 • Shuoyang Ding, Philipp Koehn
Saliency methods are widely used to interpret neural network predictions, but different variants of saliency methods often disagree even on the interpretations of the same prediction made by the same model.
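The disagreement between saliency variants can be seen even on a linear model, where the "plain gradient" and "gradient × input" attributions can rank features differently. A minimal sketch (toy weights and inputs, not from the paper):

```python
import numpy as np

# toy linear model f(x) = w . x ; its input gradient is exactly w
w = np.array([0.5, -2.0, 1.0])
x = np.array([4.0, 0.1, 1.0])

grad = w              # "plain gradient" saliency
grad_x_input = w * x  # "gradient x input" saliency

# the two variants pick different most-important features
top_grad = int(np.argmax(np.abs(grad)))         # driven by weight magnitude
top_gxi = int(np.argmax(np.abs(grad_x_input)))  # also weighted by the input
```

Here plain gradient ranks feature 1 highest (largest weight), while gradient × input ranks feature 0 highest (large input value), so the two interpretations of the same prediction disagree.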
1 code implementation • 18 Sep 2019 • Yiming Wang, Tongfei Chen, Hainan Xu, Shuoyang Ding, Hang Lv, Yiwen Shao, Nanyun Peng, Lei Xie, Shinji Watanabe, Sanjeev Khudanpur
We present Espresso, an open-source, modular, extensible end-to-end neural automatic speech recognition (ASR) toolkit based on the deep learning library PyTorch and the popular neural machine translation toolkit fairseq.
Ranked #1 on Speech Recognition on Hub5'00 CallHome
Automatic Speech Recognition (ASR) +5

1 code implementation • WS 2019 • Shuoyang Ding, Hainan Xu, Philipp Koehn
Despite their original goal to jointly learn to align and translate, Neural Machine Translation (NMT) models, especially Transformer, are often perceived as not learning interpretable word alignments.
no code implementations • WS 2019 • Shuoyang Ding, Adithya Renduchintala, Kevin Duh
Most neural machine translation systems are built upon subword units extracted by methods such as Byte-Pair Encoding (BPE) or wordpiece.
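For readers unfamiliar with BPE: it builds subword units by repeatedly merging the most frequent adjacent symbol pair in the training vocabulary. A minimal sketch on a toy two-word corpus (an illustration of the algorithm, not this paper's code):

```python
from collections import Counter

def most_frequent_pair(words):
    # words: dict mapping a tuple of symbols to its corpus frequency
    pairs = Counter()
    for syms, freq in words.items():
        for a, b in zip(syms, syms[1:]):
            pairs[(a, b)] += freq
    return max(pairs, key=pairs.get)

def merge_pair(words, pair):
    # replace every adjacent occurrence of `pair` with the concatenated symbol
    merged = {}
    for syms, freq in words.items():
        out, i = [], 0
        while i < len(syms):
            if i < len(syms) - 1 and (syms[i], syms[i + 1]) == pair:
                out.append(syms[i] + syms[i + 1])
                i += 2
            else:
                out.append(syms[i])
                i += 1
        merged[tuple(out)] = freq
    return merged

# toy corpus: "low" x5, "lower" x2, with an end-of-word marker
vocab = {tuple("low") + ("</w>",): 5, tuple("lower") + ("</w>",): 2}
for _ in range(3):
    vocab = merge_pair(vocab, most_frequent_pair(vocab))
```

After three merges, "low" has collapsed into the single unit `low</w>`, while "lower" is segmented as `low e r </w>`, showing how frequent words become whole tokens and rarer ones stay decomposed.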
1 code implementation • WS 2019 • Shuoyang Ding, Philipp Koehn
Stack Long Short-Term Memory (StackLSTM) is useful for applications such as parsing and string-to-tree neural machine translation, but it is notoriously difficult to parallelize for GPU training because its computations depend on discrete operations.
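The parallelization difficulty comes from the fact that each step's stack state depends on the previous step's discrete push/pop decision, so the computation cannot be naively batched across time. A toy shift-reduce sketch (invented action names, not the paper's model) makes the sequential dependency concrete:

```python
def run(actions, inputs):
    # Each iteration's stack contents depend on the discrete action chosen
    # at the previous step, which forces strictly sequential execution.
    stack = []
    it = iter(inputs)
    for a in actions:
        if a == "SHIFT":
            stack.append(next(it))      # push the next input token
        elif a == "REDUCE":
            right, left = stack.pop(), stack.pop()
            stack.append((left, right))  # pop two, push their combination
    return stack

out = run(["SHIFT", "SHIFT", "REDUCE", "SHIFT", "REDUCE"], ["a", "b", "c"])
```

The paper's contribution is a scheme that makes such stack updates amenable to GPU-parallel training despite this data-dependent control flow.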
no code implementations • 10 Nov 2018 • Hainan Xu, Shuoyang Ding, Shinji Watanabe
Most end-to-end speech recognition systems model text directly as a sequence of characters or sub-words.
no code implementations • 5 Jun 2018 • Shuoyang Ding, Kevin Duh
Using pre-trained word embeddings as the input layer is a common practice in many natural language processing (NLP) tasks, but it is largely neglected for neural machine translation (NMT).
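The practice in question amounts to copying pre-trained vectors into the rows of the model's embedding matrix before training. A minimal NumPy sketch (the vocabulary and vectors are invented; real systems would load e.g. GloVe or word2vec files):

```python
import numpy as np

# hypothetical pre-trained vectors, normally loaded from an embeddings file
pretrained = {"the": np.array([0.1, 0.2]), "cat": np.array([0.3, 0.4])}
vocab = ["<unk>", "the", "cat", "sat"]
dim = 2

# randomly initialize the embedding table, then overwrite known rows
rng = np.random.default_rng(0)
emb = rng.normal(scale=0.1, size=(len(vocab), dim))
for i, w in enumerate(vocab):
    if w in pretrained:
        emb[i] = pretrained[w]  # copy the pre-trained vector into the input layer

ids = [vocab.index(w) for w in ["the", "cat"]]
inputs = emb[ids]  # the model's input representations for these tokens
```

Words without pre-trained vectors (here `<unk>` and "sat") keep their random initialization and are learned from scratch.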
no code implementations • 27 Mar 2018 • Adithya Renduchintala, Shuoyang Ding, Matthew Wiesner, Shinji Watanabe
We present a new end-to-end architecture for automatic speech recognition (ASR) that can be trained using symbolic input in addition to the traditional acoustic input.
Automatic Speech Recognition (ASR) +3