16 Feb 2023 • Zhuoyuan Mao, Tetsuji Nakagawa
Large-scale language-agnostic sentence embedding models such as LaBSE (Feng et al., 2022) obtain state-of-the-art performance for parallel sentence alignment.
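The core idea behind embedding-based parallel sentence alignment can be sketched in a few lines: embed source and target sentences into a shared space, then pair sentences by cosine similarity. The toy vectors below stand in for real LaBSE embeddings (this is an illustrative sketch, not the paper's method, and the margin-based scoring used in practice is omitted):

```python
import numpy as np

def align_sentences(src_embs: np.ndarray, tgt_embs: np.ndarray):
    """Pair each source sentence with the target sentence whose embedding
    has the highest cosine similarity (a margin-free, greedy variant)."""
    # Normalize rows so that a plain dot product equals cosine similarity.
    src = src_embs / np.linalg.norm(src_embs, axis=1, keepdims=True)
    tgt = tgt_embs / np.linalg.norm(tgt_embs, axis=1, keepdims=True)
    sim = src @ tgt.T                       # pairwise cosine similarities
    return sim.argmax(axis=1), sim.max(axis=1)

# Toy 3-dimensional "embeddings" standing in for real LaBSE vectors.
src = np.array([[1.0, 0.1, 0.0], [0.0, 1.0, 0.2]])
tgt = np.array([[0.0, 0.9, 0.1], [1.0, 0.0, 0.1]])
pairs, scores = align_sentences(src, tgt)
print(pairs)  # → [1 0]: source 0 aligns to target 1, source 1 to target 0
```

In a real pipeline the two sentence lists come from different languages, and a similarity threshold (or margin criterion) filters out unaligned sentences.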
WS 2018 • Wei Wang, Taro Watanabe, Macduff Hughes, Tetsuji Nakagawa, Ciprian Chelba
Measuring the domain relevance of data and identifying or selecting well-fit domain data for machine translation (MT) is a well-studied topic, but denoising has received far less attention.
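A common way to denoise a web-crawled corpus with a small trusted corpus is contrastive scoring: rank each sentence pair by the difference between its log-probability under a model trained on trusted data and under a model trained on the noisy data, then keep the top fraction. The sketch below illustrates that ranking step only, with hand-picked toy scores (function name and scores are hypothetical, not from the paper):

```python
def select_clean(scored_pairs, keep_frac=0.5):
    """scored_pairs: list of (pair, trusted_logprob, noisy_logprob).
    Rank by the contrastive score trusted - noisy (higher = more likely
    clean) and keep the top fraction of the corpus."""
    ranked = sorted(scored_pairs, key=lambda x: x[1] - x[2], reverse=True)
    n_keep = max(1, int(len(ranked) * keep_frac))
    return [pair for pair, _, _ in ranked[:n_keep]]

corpus = [
    # A clean pair: the trusted-data model assigns it a high log-probability.
    (("guten Tag", "good day"), -2.0, -5.0),
    # A noisy pair (misaligned spam): only the noisy-data model likes it.
    (("guten Tag", "free ringtones"), -9.0, -3.0),
]
print(select_clean(corpus, keep_frac=0.5))  # keeps only the clean pair
```

In practice the two log-probabilities come from translation or language models, and online selection re-scores the pool as training progresses.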
COLING 2016 • Yusuke Oda, Taku Kudo, Tetsuji Nakagawa, Taro Watanabe
In this paper, we propose a new decoding method for phrase-based statistical machine translation which directly uses multiple preordering candidates as a graph structure.
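The graph idea can be illustrated with a toy lattice: merge several preordering candidates of the same source sentence into one DAG and search it for the highest-scoring path. The sketch below does only that search step (a full phrase-based decoder would also expand translation options and apply a language model along each edge; the lattice, words, and scores here are invented for illustration):

```python
import heapq

def best_path(lattice, start, goal):
    """lattice: {node: [(next_node, word, score), ...]} with scores <= 0.
    Best-first search for the highest-scoring path through a preordering
    graph. Scores are negated so Python's min-heap pops the best path."""
    heap = [(0.0, start, [])]
    settled = {}
    while heap:
        neg, node, words = heapq.heappop(heap)
        if node == goal:
            return -neg, words
        if node in settled and settled[node] <= neg:
            continue  # already reached this node with a better score
        settled[node] = neg
        for nxt, word, score in lattice.get(node, []):
            heapq.heappush(heap, (neg - score, nxt, words + [word]))
    return None

# Two preordering candidates of "er hat gelesen" merged into one graph:
# keep the verb in place, or move it before the auxiliary.
lattice = {
    0: [(1, "er", 0.0)],
    1: [(2, "hat", -1.0), (3, "gelesen", -0.5)],
    2: [(4, "gelesen", 0.0)],
    3: [(4, "hat", 0.0)],
}
print(best_path(lattice, 0, 4))  # → (-0.5, ['er', 'gelesen', 'hat'])
```

Decoding over the merged graph lets the translation model arbitrate between preordering candidates instead of committing to a single 1-best reordering up front.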