no code implementations • WMT (EMNLP) 2021 • Farhad Akhbardeh, Arkady Arkhangorodsky, Magdalena Biesialska, Ondřej Bojar, Rajen Chatterjee, Vishrav Chaudhary, Marta R. Costa-Jussa, Cristina España-Bonet, Angela Fan, Christian Federmann, Markus Freitag, Yvette Graham, Roman Grundkiewicz, Barry Haddow, Leonie Harter, Kenneth Heafield, Christopher Homan, Matthias Huck, Kwabena Amponsah-Kaakyire, Jungo Kasai, Daniel Khashabi, Kevin Knight, Tom Kocmi, Philipp Koehn, Nicholas Lourie, Christof Monz, Makoto Morishita, Masaaki Nagata, Ajay Nagesh, Toshiaki Nakazawa, Matteo Negri, Santanu Pal, Allahsera Auguste Tapo, Marco Turchi, Valentin Vydrin, Marcos Zampieri
This paper presents the results of the news translation task, the multilingual low-resource translation for Indo-European languages, the triangular translation task, and the automatic post-editing task organised as part of the Conference on Machine Translation (WMT) 2021. In the news task, participants were asked to build machine translation systems for any of 10 language pairs, to be evaluated on test sets consisting mainly of news stories.
1 code implementation • ECCV 2020 • Soichiro Fujita, Tsutomu Hirao, Hidetaka Kamigaito, Manabu Okumura, Masaaki Nagata
This paper proposes a new evaluation framework, Story Oriented Dense video cAptioning evaluation framework (SODA), for measuring the performance of video story description systems.
no code implementations • IWSLT (EMNLP) 2018 • Yuto Takebayashi, Chu Chenhui, Yuki Arase, Masaaki Nagata
To improve the translation adequacy in neural machine translation (NMT), we propose a rewarding model with target word prediction using bilingual dictionaries inspired by the success of decoder constraints in statistical machine translation.
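The gist of such a reward can be sketched as a coverage score over dictionary translations. A minimal illustration, assuming a dictionary that maps each source word to a set of acceptable target words (the names and the scoring rule below are mine, not the paper's exact model):

```python
from collections import Counter

def dictionary_reward(source_tokens, hypothesis_tokens, bilingual_dict):
    """Reward a hypothesis for covering dictionary translations of source words.

    bilingual_dict maps a source word to a set of acceptable target words.
    Returns the fraction of translatable source words whose dictionary
    translation appears in the hypothesis (a crude adequacy signal).
    """
    hyp = Counter(hypothesis_tokens)
    covered = translatable = 0
    for w in source_tokens:
        targets = bilingual_dict.get(w)
        if not targets:
            continue
        translatable += 1
        if any(hyp[t] > 0 for t in targets):
            covered += 1
    return covered / translatable if translatable else 0.0

# Example: the hypothesis preserves the dictionary translation of "inu".
d = {"inu": {"dog"}, "neko": {"cat"}}
print(dictionary_reward(["inu", "ga", "hashiru"], ["the", "dog", "runs"], d))  # 1.0
```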
1 code implementation • 29 Jul 2024 • Tom Kocmi, Eleftherios Avramidis, Rachel Bawden, Ondrej Bojar, Anton Dvorkovich, Christian Federmann, Mark Fishel, Markus Freitag, Thamme Gowda, Roman Grundkiewicz, Barry Haddow, Marzena Karpinska, Philipp Koehn, Benjamin Marie, Kenton Murray, Masaaki Nagata, Martin Popel, Maja Popovic, Mariya Shmatova, Steinþór Steingrímsson, Vilém Zouhar
This is the preliminary ranking of WMT24 General MT systems based on automatic metrics.
no code implementations • 3 Jul 2024 • Minato Kondo, Takehito Utsuro, Masaaki Nagata
The results demonstrate that when utilizing parallel data in continual pre-training, it is essential to alternate between source and target sentences.
no code implementations • 15 May 2024 • Masaaki Nagata, Makoto Morishita, Katsuki Chousa, Norihito Yasuda
Using crowdsourcing, we collected more than 10,000 URL pairs (parallel top page pairs) of bilingual websites that contain parallel documents and created a Japanese-Chinese parallel corpus of 4.6M sentence pairs from these websites.
no code implementations • 15 May 2024 • Qiyu Wu, Masaaki Nagata, Zhongtao Miao, Yoshimasa Tsuruoka
In this work, we mitigate the problem in an LLM-based MT model by guiding it to better word alignment.
2 code implementations • 9 Jun 2023 • Qiyu Wu, Masaaki Nagata, Yoshimasa Tsuruoka
Most existing word alignment methods rely on manual alignment datasets or parallel corpora, which limits their usefulness.
no code implementations • 28 Oct 2022 • Makoto Morishita, Jun Suzuki, Masaaki Nagata
With the collected parallel data, we can quickly adapt a machine translation model to the target domain.
1 code implementation • 15 Oct 2022 • Naoki Kobayashi, Tsutomu Hirao, Hidetaka Kamigaito, Manabu Okumura, Masaaki Nagata
To promote and further develop RST-style discourse parsing models, we need a strong baseline that can be regarded as a reference for reporting reliable experimental results.
Ranked #1 on Discourse Parsing on Instructional-DT (Instr-DT)
no code implementations • 23 Sep 2022 • Yizhen Wei, Takehito Utsuro, Masaaki Nagata
Based on extended word alignment, we further propose a novel task called refined word-level QE that outputs refined tags and word-level correspondences.
no code implementations • LREC 2022 • Makoto Morishita, Katsuki Chousa, Jun Suzuki, Masaaki Nagata
Most current machine translation models are mainly trained with parallel corpora, and their translation accuracy largely depends on the quality and quantity of the corpora.
no code implementations • ACL 2021 • Sei Iwata, Taro Watanabe, Masaaki Nagata
In the experiments, our model surpassed the sequence labeling baseline.
no code implementations • NAACL 2021 • Naoki Kobayashi, Tsutomu Hirao, Hidetaka Kamigaito, Manabu Okumura, Masaaki Nagata
We then pre-train a neural RST parser with the obtained silver data and fine-tune it on the RST-DT.
Ranked #1 on Discourse Parsing on RST-DT (RST-Parseval (Relation) metric, using extra training data)
1 code implementation • EACL 2021 • Makoto Morishita, Jun Suzuki, Tomoharu Iwata, Masaaki Nagata
It is crucial to provide an inter-sentence context in Neural Machine Translation (NMT) models for higher-quality translation.
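One common way to provide such context (a simplification for illustration, not necessarily this paper's exact setup) is to concatenate the previous source sentence to the current one with a separator token:

```python
SEP = "<sep>"  # hypothetical separator token added to the vocabulary

def add_context(prev_src: str, cur_src: str) -> str:
    """Concatenate the previous source sentence as context (2-to-1 setup)."""
    return f"{prev_src} {SEP} {cur_src}" if prev_src else cur_src

print(add_context("He bought a bike.", "It was red."))
# He bought a bike. <sep> It was red.
```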
1 code implementation • COLING 2020 • Katsuki Chousa, Masaaki Nagata, Masaaki Nishino
In particular, our method improved the F1 score by +53.9 points for extracting non-parallel sentences.
no code implementations • EMNLP 2020 • Loïc Barrault, Magdalena Biesialska, Ondřej Bojar, Marta R. Costa-jussà, Christian Federmann, Yvette Graham, Roman Grundkiewicz, Barry Haddow, Matthias Huck, Eric Joanis, Tom Kocmi, Philipp Koehn, Chi-kiu Lo, Nikola Ljubešić, Christof Monz, Makoto Morishita, Masaaki Nagata, Toshiaki Nakazawa, Santanu Pal, Matt Post, Marcos Zampieri
In the news task, participants were asked to build machine translation systems for any of 11 language pairs, to be evaluated on test sets consisting mainly of news stories.
no code implementations • Findings of the Association for Computational Linguistics 2020 • Kosuke Yamada, Tsutomu Hirao, Ryohei Sasano, Koichi Takeda, Masaaki Nagata
Dividing biomedical abstracts into several segments with rhetorical roles is essential for supporting researchers' information access in the biomedical domain.
no code implementations • WS 2020 • Hongyi Cui, Yizhen Wei, Shohei Iida, Takehito Utsuro, Masaaki Nagata
In this paper, we introduce University of Tsukuba's submission to the IWSLT20 Open Domain Translation Task.
no code implementations • LREC 2020 • Masaaki Nagata, Makoto Morishita
We improved translation accuracy using context-aware neural machine translation, and the improvement mainly comes from better translation of zero pronouns.
no code implementations • EMNLP 2020 • Masaaki Nagata, Katsuki Chousa, Masaaki Nishino
For example, we achieved an F1 score of 86.7 for the Chinese-English data, which is 13.3 points higher than the previous state-of-the-art supervised methods.
no code implementations • 29 Apr 2020 • Katsuki Chousa, Masaaki Nagata, Masaaki Nishino
We also conduct a sentence alignment experiment using En-Ja newspaper articles and find that the proposed method using multilingual BERT achieves significantly better accuracy than a baseline method using a bilingual dictionary and dynamic programming.
1 code implementation • 3 Apr 2020 • Naoki Kobayashi, Tsutomu Hirao, Hidetaka Kamigaito, Manabu Okumura, Masaaki Nagata
To obtain better discourse dependency trees, we need to improve the accuracy of RST trees at the upper parts of the structures.
Ranked #3 on Discourse Parsing on RST-DT (RST-Parseval (Span) metric)
no code implementations • LREC 2020 • Makoto Morishita, Jun Suzuki, Masaaki Nagata
We constructed a parallel corpus for English-Japanese, for which the amount of publicly available parallel corpora is still limited.
no code implementations • WS 2019 • Makoto Morishita, Jun Suzuki, Masaaki Nagata
In this paper, we describe our systems that were submitted to the translation shared tasks at WAT 2019.
no code implementations • WS 2019 • Hongyi Cui, Shohei Iida, Po-Hsuan Hung, Takehito Utsuro, Masaaki Nagata
Recently, the Transformer has become the state-of-the-art architecture in the field of neural machine translation (NMT).
no code implementations • WS 2019 • Takumi Ohtani, Hidetaka Kamigaito, Masaaki Nagata, Manabu Okumura
We present neural machine translation models for translating a sentence in a text by using a graph-based encoder which can consider coreference relations provided within the text explicitly.
no code implementations • IJCNLP 2019 • Naoki Kobayashi, Tsutomu Hirao, Kengo Nakamura, Hidetaka Kamigaito, Manabu Okumura, Masaaki Nagata
The first one builds the optimal tree in terms of a dissimilarity score function that is defined for splitting a text span into smaller ones.
no code implementations • IJCNLP 2019 • Masaaki Nishino, Sho Takase, Tsutomu Hirao, Masaaki Nagata
An anagram is a sentence or a phrase that is made by permuting the characters of an input sentence or phrase.
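The defining property is that the two strings share the same character multiset, which is easy to verify; the paper's actual contribution, generating anagrams, is much harder than checking them. A minimal check:

```python
from collections import Counter

def is_anagram(a: str, b: str) -> bool:
    """True if b uses exactly the characters of a (ignoring case and spaces)."""
    norm = lambda s: Counter(s.lower().replace(" ", ""))
    return norm(a) == norm(b)

print(is_anagram("the eyes", "they see"))  # True
```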
no code implementations • WS 2019 • Soichiro Murakami, Makoto Morishita, Tsutomu Hirao, Masaaki Nagata
This paper describes NTT's submission to the WMT19 robustness task.
no code implementations • ACL 2019 • Go Yasui, Yoshimasa Tsuruoka, Masaaki Nagata
Traditional model training for sentence generation employs cross-entropy loss as the loss function.
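For reference, the token-level cross-entropy training the abstract contrasts against looks like the following PyTorch sketch (toy tensors only):

```python
import torch
import torch.nn as nn

vocab_size, pad_id = 1000, 0
loss_fn = nn.CrossEntropyLoss(ignore_index=pad_id)  # skip padding positions

# Toy decoder output: batch of 2 sentences, 5 time steps, vocab-sized logits.
logits = torch.randn(2, 5, vocab_size)
targets = torch.randint(1, vocab_size, (2, 5))
targets[1, 3:] = pad_id  # second sentence is shorter

# CrossEntropyLoss expects (N, C), so flatten the batch and time dimensions.
loss = loss_fn(logits.reshape(-1, vocab_size), targets.reshape(-1))
print(loss.item())
```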
no code implementations • ACL 2019 • Shohei Iida, Ryuichiro Kimura, Hongyi Cui, Po-Hsuan Hung, Takehito Utsuro, Masaaki Nagata
The first-hop attention is scaled dot-product attention, the same attention mechanism used in the original Transformer.
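That first hop is the standard Transformer attention, Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V; a minimal sketch (the paper's second hop, not shown, reweights on top of this):

```python
import torch
import torch.nn.functional as F

def scaled_dot_product_attention(q, k, v):
    """Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V, as in the Transformer."""
    d_k = q.size(-1)
    scores = q @ k.transpose(-2, -1) / d_k ** 0.5
    return F.softmax(scores, dim=-1) @ v

q = torch.randn(1, 4, 8)  # (batch, query positions, d_k)
k = torch.randn(1, 6, 8)  # (batch, key positions, d_k)
v = torch.randn(1, 6, 8)
print(scaled_dot_product_attention(q, k, v).shape)  # torch.Size([1, 4, 8])
```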
no code implementations • 13 Jun 2019 • Sho Takase, Jun Suzuki, Masaaki Nagata
This paper proposes a novel Recurrent Neural Network (RNN) language model that takes advantage of character information.
no code implementations • ACL 2019 • Kosuke Nishida, Kyosuke Nishida, Masaaki Nagata, Atsushi Otsuka, Itsumi Saito, Hisako Asano, Junji Tomita
It enables QFE to consider the dependency among the evidence sentences and cover important information in the question sentence.
Ranked #61 on Question Answering on HotpotQA
no code implementations • WS 2018 • Shun Kiyono, Sho Takase, Jun Suzuki, Naoaki Okazaki, Kentaro Inui, Masaaki Nagata
Developing a method for understanding the inner workings of black-box neural methods is an important research endeavor.
no code implementations • WS 2018 • Makoto Morishita, Jun Suzuki, Masaaki Nagata
This paper describes NTT's neural machine translation systems submitted to the WMT 2018 English-German and German-English news translation tasks.
no code implementations • EMNLP 2018 • Tsutomu Hirao, Hidetaka Kamigaito, Masaaki Nagata
This paper tackles automation of the pyramid method, a reliable manual evaluation framework.
1 code implementation • EMNLP 2018 • Sho Takase, Jun Suzuki, Masaaki Nagata
This paper proposes a state-of-the-art recurrent neural network (RNN) language model that combines probability distributions computed not only from a final RNN layer but also from middle layers.
Ranked #8 on Language Modelling on Penn Treebank (Word Level)
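The core idea can be sketched as a mixture of per-layer word distributions. A minimal sketch under my own simplifications (a shared output projection and fixed mixture weights; the paper's model is more elaborate):

```python
import torch
import torch.nn.functional as F

vocab, hidden, layers = 1000, 64, 3
W = torch.randn(vocab, hidden)            # shared output projection (a simplification)
mix = F.softmax(torch.randn(layers), 0)   # per-layer mixture weights (learned in practice)

# Hidden states from each RNN layer at one time step.
hs = [torch.randn(hidden) for _ in range(layers)]

# Combine the per-layer word distributions, not the logits.
probs = sum(w * F.softmax(W @ h, dim=-1) for w, h in zip(mix, hs))
print(probs.sum())  # ~1.0: still a valid probability distribution
```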
no code implementations • COLING 2018 • Makoto Morishita, Jun Suzuki, Masaaki Nagata
We hypothesize that in the NMT model, the appropriate subword units for the following three modules (layers) can differ: (1) the encoder embedding layer, (2) the decoder embedding layer, and (3) the decoder output layer.
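A skeleton of what differing subword units per module means in practice (vocabulary sizes below are illustrative; each module would be tied to a separately trained subword model, e.g. BPE with different numbers of merges):

```python
import torch.nn as nn

class Seq2SeqWithSeparateVocabs(nn.Module):
    """Skeleton only: each module gets its own subword vocabulary size."""

    def __init__(self, enc_vocab, dec_in_vocab, dec_out_vocab, d_model=512):
        super().__init__()
        self.enc_embed = nn.Embedding(enc_vocab, d_model)      # (1) encoder embedding
        self.dec_embed = nn.Embedding(dec_in_vocab, d_model)   # (2) decoder embedding
        self.dec_out = nn.Linear(d_model, dec_out_vocab)       # (3) decoder output

model = Seq2SeqWithSeparateVocabs(enc_vocab=16000, dec_in_vocab=32000, dec_out_vocab=8000)
print(model)
```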
1 code implementation • ACL 2018 • Jun Suzuki, Sho Takase, Hidetaka Kamigaito, Makoto Morishita, Masaaki Nagata
This paper investigates the construction of a strong baseline based on general purpose sequence-to-sequence models for constituency parsing.
Ranked #18 on Constituency Parsing on Penn Treebank
no code implementations • NAACL 2018 • Takahiro Ishihara, Katsuhiko Hayashi, Hitoshi Manabe, Masashi Shimbo, Masaaki Nagata
Although neural tensor networks (NTNs) have been successful in many NLP tasks, they require a large number of parameters to be estimated, which often leads to overfitting and a long training time.
no code implementations • NAACL 2018 • Shinsaku Sakaue, Tsutomu Hirao, Masaaki Nishino, Masaaki Nagata
This approach is known to have three advantages: its applicability to many useful submodular objective functions, the efficiency of the greedy algorithm, and the provable performance guarantee.
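For context, the greedy algorithm referred to is the standard one for monotone submodular maximization, which achieves a (1 - 1/e) approximation under a cardinality constraint. A minimal sketch with unit costs and a toy coverage objective:

```python
def greedy_submodular(ground_set, f, k):
    """Pick k items greedily by marginal gain of a monotone submodular f."""
    selected = set()
    for _ in range(k):
        best = max((x for x in ground_set if x not in selected),
                   key=lambda x: f(selected | {x}) - f(selected),
                   default=None)
        if best is None:
            break
        selected.add(best)
    return selected

# Coverage objective: how many distinct words the chosen sentences cover.
sents = {0: {"a", "b"}, 1: {"b", "c"}, 2: {"c"}}
f = lambda S: len(set().union(*(sents[i] for i in S))) if S else 0
print(greedy_submodular(sents.keys(), f, 2))  # {0, 1}
```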
no code implementations • NAACL 2018 • Hidetaka Kamigaito, Katsuhiko Hayashi, Tsutomu Hirao, Masaaki Nagata
To solve this problem, we propose a higher-order syntactic attention network (HiSAN) that can handle higher-order dependency features as an attention distribution on LSTM hidden states.
Ranked #3 on Sentence Compression on Google Dataset
no code implementations • NAACL 2018 • Ukyo Honda, Tsutomu Hirao, Masaaki Nagata
We propose a simple but highly effective automatic evaluation measure of summarization, pruned Basic Elements (pBE).
no code implementations • 22 Dec 2017 • Shun Kiyono, Sho Takase, Jun Suzuki, Naoaki Okazaki, Kentaro Inui, Masaaki Nagata
The encoder-decoder model is widely used in natural language generation tasks.
no code implementations • WS 2017 • Shunsuke Takeno, Masaaki Nagata, Kazuhide Yamamoto
We propose prefix constraints, a novel method to enforce constraints on target sentences in neural machine translation.
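As I read the abstract, prefix constraints steer the decoder by fixing the first tokens of the output. A hedged sketch of greedy decoding with a forced prefix (the `step` interface is hypothetical, standing in for one decoder step of an NMT model):

```python
def greedy_decode_with_prefix(step, bos_id, eos_id, prefix_ids, max_len=50):
    """Greedy decoding that forces the output to begin with prefix_ids.

    `step(generated_ids)` returns a list of vocabulary scores for the
    next token; it is a stand-in for one decoder step (hypothetical).
    """
    out = [bos_id]
    for t in range(max_len):
        if t < len(prefix_ids):               # constrained region: force the token
            nxt = prefix_ids[t]
        else:                                 # free region: pick the best token
            scores = step(out)
            nxt = max(range(len(scores)), key=scores.__getitem__)
        out.append(nxt)
        if nxt == eos_id and t >= len(prefix_ids):
            break
    return out
```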
no code implementations • WS 2017 • Makoto Morishita, Jun Suzuki, Masaaki Nagata
This year, we participated in four translation subtasks at WAT 2017.
no code implementations • IJCNLP 2017 • Hidetaka Kamigaito, Katsuhiko Hayashi, Tsutomu Hirao, Hiroya Takamura, Manabu Okumura, Masaaki Nagata
The sequence-to-sequence (Seq2Seq) model has been successfully applied to machine translation (MT).
1 code implementation • IJCNLP 2017 • Sho Takase, Jun Suzuki, Masaaki Nagata
This paper proposes a reinforcing method that refines the output layers of existing Recurrent Neural Network (RNN) language models.
no code implementations • WS 2017 • Takaaki Tanaka, Katsuhiko Hayashi, Masaaki Nagata
We introduce the following hierarchical word structures to dependency parsing in Japanese: morphological units (a short unit word, SUW) and syntactic units (a long unit word, LUW).
no code implementations • ACL 2017 • Tsutomu Hirao, Masaaki Nishino, Masaaki Nagata
This paper derives an Integer Linear Programming (ILP) formulation to obtain an oracle summary of the compressive summarization paradigm in terms of ROUGE.
no code implementations • EACL 2017 • Katsuhiko Hayashi, Masaaki Nagata
This paper presents an efficient and optimal parsing algorithm for probabilistic context-free grammars (PCFGs).
no code implementations • EACL 2017 • Tsutomu Hirao, Masaaki Nishino, Jun Suzuki, Masaaki Nagata
To analyze the limitations and future directions of the extractive summarization paradigm, this paper proposes an Integer Linear Programming (ILP) formulation to obtain extractive oracle summaries in terms of ROUGE-N. We also propose an algorithm that enumerates all of the oracle summaries for a set of reference summaries, which lets us compute F-measures that evaluate how many sentences in a system summary are also extracted in an oracle summary.
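The flavor of such an ILP can be sketched with PuLP: binary variables pick sentences, auxiliary binaries mark which reference n-grams get covered, and the objective maximizes covered reference n-gram mass under a length budget. A simplified sketch (the variable names and exact constraints are mine, not the paper's):

```python
import pulp

def oracle_summary(sent_ngrams, sent_lens, ref_counts, budget):
    """Select sentences maximizing covered reference n-gram mass under a budget."""
    prob = pulp.LpProblem("oracle", pulp.LpMaximize)
    x = {i: pulp.LpVariable(f"x{i}", cat="Binary") for i in sent_ngrams}  # pick sentence i
    z = {g: pulp.LpVariable(f"z{j}", cat="Binary")                        # n-gram g covered
         for j, g in enumerate(ref_counts)}
    prob += pulp.lpSum(ref_counts[g] * z[g] for g in ref_counts)          # objective
    prob += pulp.lpSum(sent_lens[i] * x[i] for i in x) <= budget          # length budget
    for g in ref_counts:                                                  # g is covered only if
        prob += z[g] <= pulp.lpSum(x[i] for i in x if g in sent_ngrams[i])  # some sentence has it
    prob.solve(pulp.PULP_CBC_CMD(msg=False))
    return [i for i in x if x[i].value() == 1]

sents = {0: {("the", "cat")}, 1: {("a", "dog")}, 2: {("the", "cat"), ("a", "dog")}}
refs = {("the", "cat"): 1, ("a", "dog"): 1}
print(oracle_summary(sents, {0: 3, 1: 3, 2: 6}, refs, budget=6))  # e.g. [2] or [0, 1]
```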
no code implementations • EACL 2017 • Jun Suzuki, Masaaki Nagata
This paper tackles the reduction of redundant repeating generation that is often observed in RNN-based encoder-decoder models.
Ranked #4 on Text Summarization on DUC 2004 Task 1
no code implementations • 12 Dec 2016 • Xun Wang, Katsuhito Sudoh, Masaaki Nagata, Tomohide Shibata, Daisuke Kawahara, Sadao Kurohashi
This paper introduces a novel neural network model for question answering, the entity-based memory network.
no code implementations • WS 2016 • Katsuhito Sudoh, Masaaki Nagata
This paper presents our Chinese-to-Japanese patent machine translation system for WAT 2016 (Group ID: ntt) that uses syntactic pre-ordering over Chinese dependency structures.
no code implementations • COLING 2016 • Xun Wang, Masaaki Nishino, Tsutomu Hirao, Katsuhito Sudoh, Masaaki Nagata
Existing methods focus on the extraction of key information, but often neglect coherence.
no code implementations • WS 2016 • Shunsuke Takeno, Masaaki Nagata, Kazuhide Yamamoto
We propose a method for integrating Japanese empty category detection into the preordering process of Japanese-to-English statistical machine translation.
no code implementations • WS 2016 • Katsuhiko Hayashi, Tsutomu Hirao, Masaaki Nagata
Ranked #5 on Discourse Parsing on RST-DT (RST-Parseval (Full) metric)