Search Results for author: Masaaki Nagata

Found 91 papers, 10 papers with code

Findings of the 2021 Conference on Machine Translation (WMT21)

no code implementations WMT (EMNLP) 2021 Farhad Akhbardeh, Arkady Arkhangorodsky, Magdalena Biesialska, Ondřej Bojar, Rajen Chatterjee, Vishrav Chaudhary, Marta R. Costa-Jussa, Cristina España-Bonet, Angela Fan, Christian Federmann, Markus Freitag, Yvette Graham, Roman Grundkiewicz, Barry Haddow, Leonie Harter, Kenneth Heafield, Christopher Homan, Matthias Huck, Kwabena Amponsah-Kaakyire, Jungo Kasai, Daniel Khashabi, Kevin Knight, Tom Kocmi, Philipp Koehn, Nicholas Lourie, Christof Monz, Makoto Morishita, Masaaki Nagata, Ajay Nagesh, Toshiaki Nakazawa, Matteo Negri, Santanu Pal, Allahsera Auguste Tapo, Marco Turchi, Valentin Vydrin, Marcos Zampieri

This paper presents the results of the news translation task, the multilingual low-resource translation for Indo-European languages, the triangular translation task, and the automatic post-editing task organised as part of the Conference on Machine Translation (WMT) 2021. In the news task, participants were asked to build machine translation systems for any of 10 language pairs, to be evaluated on test sets consisting mainly of news stories.

Machine Translation Translation

SODA: Story Oriented Dense Video Captioning Evaluation Framework

1 code implementation ECCV 2020 Soichiro Fujita, Tsutomu Hirao, Hidetaka Kamigaito, Manabu Okumura, Masaaki Nagata

This paper proposes a new evaluation framework, Story Oriented Dense video cAptioning evaluation framework (SODA), for measuring the performance of video story description systems.

Dense Video Captioning

Word Rewarding for Adequate Neural Machine Translation

no code implementations IWSLT (EMNLP) 2018 Yuto Takebayashi, Chu Chenhui, Yuki Arase, Masaaki Nagata

To improve the translation adequacy in neural machine translation (NMT), we propose a rewarding model with target word prediction using bilingual dictionaries inspired by the success of decoder constraints in statistical machine translation.

Decoder Machine Translation +2

Enhancing Translation Accuracy of Large Language Models through Continual Pre-Training on Parallel Data

no code implementations 3 Jul 2024 Minato Kondo, Takehito Utsuro, Masaaki Nagata

The results demonstrate that when utilizing parallel data in continual pre-training, it is essential to alternate between source and target sentences.

Decoder Translation

A Japanese-Chinese Parallel Corpus Using Crowdsourcing for Web Mining

no code implementations 15 May 2024 Masaaki Nagata, Makoto Morishita, Katsuki Chousa, Norihito Yasuda

Using crowdsourcing, we collected more than 10,000 URL pairs (parallel top page pairs) of bilingual websites that contain parallel documents and created a Japanese-Chinese parallel corpus of 4.6M sentence pairs from these websites.

Sentence Translation +1

Word Alignment as Preference for Machine Translation

no code implementations 15 May 2024 Qiyu Wu, Masaaki Nagata, Zhongtao Miao, Yoshimasa Tsuruoka

In this work, we mitigate the problem in an LLM-based MT model by guiding it to better word alignment.

Hallucination Language Modelling +4

WSPAlign: Word Alignment Pre-training via Large-Scale Weakly Supervised Span Prediction

2 code implementations 9 Jun 2023 Qiyu Wu, Masaaki Nagata, Yoshimasa Tsuruoka

Most existing word alignment methods rely on manual alignment datasets or parallel corpora, which limits their usefulness.

Word Alignment

Domain Adaptation of Machine Translation with Crowdworkers

no code implementations 28 Oct 2022 Makoto Morishita, Jun Suzuki, Masaaki Nagata

With the collected parallel data, we can quickly adapt a machine translation model to the target domain.

Domain Adaptation Machine Translation +1

A Simple and Strong Baseline for End-to-End Neural RST-style Discourse Parsing

1 code implementation 15 Oct 2022 Naoki Kobayashi, Tsutomu Hirao, Hidetaka Kamigaito, Manabu Okumura, Masaaki Nagata

To promote and further develop RST-style discourse parsing models, we need a strong baseline that can be regarded as a reference for reporting reliable experimental results.

Discourse Parsing

Extending Word-Level Quality Estimation for Post-Editing Assistance

no code implementations 23 Sep 2022 Yizhen Wei, Takehito Utsuro, Masaaki Nagata

Based on extended word alignment, we further propose a novel task called refined word-level QE that outputs refined tags and word-level correspondences.

Word Alignment XLM-R

JParaCrawl v3.0: A Large-scale English-Japanese Parallel Corpus

no code implementations LREC 2022 Makoto Morishita, Katsuki Chousa, Jun Suzuki, Masaaki Nagata

Most current machine translation models are mainly trained with parallel corpora, and their translation accuracy largely depends on the quality and quantity of the corpora.

Machine Translation Sentence +1

Improving Neural RST Parsing Model with Silver Agreement Subtrees

no code implementations NAACL 2021 Naoki Kobayashi, Tsutomu Hirao, Hidetaka Kamigaito, Manabu Okumura, Masaaki Nagata

We then pre-train a neural RST parser with the obtained silver data and fine-tune it on the RST-DT.

Ranked #1 on Discourse Parsing on RST-DT (RST-Parseval (Relation) metric, using extra training data)

Discourse Parsing Relation

Context-aware Neural Machine Translation with Mini-batch Embedding

1 code implementation EACL 2021 Makoto Morishita, Jun Suzuki, Tomoharu Iwata, Masaaki Nagata

It is crucial to provide an inter-sentence context in Neural Machine Translation (NMT) models for higher-quality translation.

Machine Translation NMT +2

A Test Set for Discourse Translation from Japanese to English

no code implementations LREC 2020 Masaaki Nagata, Makoto Morishita

We improved the translation accuracy using context-aware neural machine translation, and the improvement mainly reflects the betterment of the translation of zero pronouns.

Machine Translation Sentence +1

A Supervised Word Alignment Method based on Cross-Language Span Prediction using Multilingual BERT

no code implementations EMNLP 2020 Masaaki Nagata, Katsuki Chousa, Masaaki Nishino

For example, we achieved an F1 score of 86.7 for the Chinese-English data, which is 13.3 points higher than the previous state-of-the-art supervised methods.

Question Answering Sentence +1

Bilingual Text Extraction as Reading Comprehension

no code implementations 29 Apr 2020 Katsuki Chousa, Masaaki Nagata, Masaaki Nishino

We also conduct a sentence alignment experiment using En-Ja newspaper articles and find that the proposed method using multilingual BERT achieves significantly better accuracy than a baseline method using a bilingual dictionary and dynamic programming.

Reading Comprehension Sentence +1

Top-Down RST Parsing Utilizing Granularity Levels in Documents

1 code implementation 3 Apr 2020 Naoki Kobayashi, Tsutomu Hirao, Hidetaka Kamigaito, Manabu Okumura, Masaaki Nagata

To obtain better discourse dependency trees, we need to improve the accuracy of RST trees at the upper parts of the structures.

Ranked #3 on Discourse Parsing on RST-DT (RST-Parseval (Span) metric)

Discourse Parsing Relation

JParaCrawl: A Large Scale Web-Based English-Japanese Parallel Corpus

no code implementations LREC 2020 Makoto Morishita, Jun Suzuki, Masaaki Nagata

We constructed a parallel corpus for English-Japanese, for which the amount of publicly available parallel corpora is still limited.

Machine Translation Sentence +1

NTT Neural Machine Translation Systems at WAT 2019

no code implementations WS 2019 Makoto Morishita, Jun Suzuki, Masaaki Nagata

In this paper, we describe our systems that were submitted to the translation shared tasks at WAT 2019.

Machine Translation Translation

Context-aware Neural Machine Translation with Coreference Information

no code implementations WS 2019 Takumi Ohtani, Hidetaka Kamigaito, Masaaki Nagata, Manabu Okumura

We present neural machine translation models for translating a sentence in a text by using a graph-based encoder which can consider coreference relations provided within the text explicitly.

Machine Translation Sentence +1

Split or Merge: Which is Better for Unsupervised RST Parsing?

no code implementations IJCNLP 2019 Naoki Kobayashi, Tsutomu Hirao, Kengo Nakamura, Hidetaka Kamigaito, Manabu Okumura, Masaaki Nagata

The first one builds the optimal tree in terms of a dissimilarity score function that is defined for splitting a text span into smaller ones.

Character n-gram Embeddings to Improve RNN Language Models

no code implementations 13 Jun 2019 Sho Takase, Jun Suzuki, Masaaki Nagata

This paper proposes a novel Recurrent Neural Network (RNN) language model that takes advantage of character information.

Headline Generation Language Modelling +3

NTT's Neural Machine Translation Systems for WMT 2018

no code implementations WS 2018 Makoto Morishita, Jun Suzuki, Masaaki Nagata

This paper describes NTT's neural machine translation systems submitted to the WMT 2018 English-German and German-English news translation tasks.

Machine Translation Re-Ranking +1

Direct Output Connection for a High-Rank Language Model

1 code implementation EMNLP 2018 Sho Takase, Jun Suzuki, Masaaki Nagata

This paper proposes a state-of-the-art recurrent neural network (RNN) language model that combines probability distributions computed not only from a final RNN layer but also from middle layers.

Constituency Parsing Headline Generation +4

Improving Neural Machine Translation by Incorporating Hierarchical Subword Features

no code implementations COLING 2018 Makoto Morishita, Jun Suzuki, Masaaki Nagata

We hypothesize that in the NMT model, the appropriate subword units for the following three modules (layers) can differ: (1) the encoder embedding layer, (2) the decoder embedding layer, and (3) the decoder output layer.

Decoder Machine Translation +2

Neural Tensor Networks with Diagonal Slice Matrices

no code implementations NAACL 2018 Takahiro Ishihara, Katsuhiko Hayashi, Hitoshi Manabe, Masashi Shimbo, Masaaki Nagata

Although neural tensor networks (NTNs) have been successful in many NLP tasks, they require a large number of parameters to be estimated, which often leads to overfitting and a long training time.

Knowledge Graph Completion Logical Reasoning +2

Provable Fast Greedy Compressive Summarization with Any Monotone Submodular Function

no code implementations NAACL 2018 Shinsaku Sakaue, Tsutomu Hirao, Masaaki Nishino, Masaaki Nagata

This approach is known to have three advantages: its applicability to many useful submodular objective functions, the efficiency of the greedy algorithm, and the provable performance guarantee.

Document Summarization Extractive Summarization +1

Higher-Order Syntactic Attention Network for Longer Sentence Compression

no code implementations NAACL 2018 Hidetaka Kamigaito, Katsuhiko Hayashi, Tsutomu Hirao, Masaaki Nagata

To solve this problem, we propose a higher-order syntactic attention network (HiSAN) that can handle higher-order dependency features as an attention distribution on LSTM hidden states.

Informativeness Machine Translation +2

Pruning Basic Elements for Better Automatic Evaluation of Summaries

no code implementations NAACL 2018 Ukyo Honda, Tsutomu Hirao, Masaaki Nagata

We propose a simple but highly effective automatic evaluation measure of summarization, pruned Basic Elements (pBE).

Word Embeddings Word Similarity

Input-to-Output Gate to Improve RNN Language Models

1 code implementation IJCNLP 2017 Sho Takase, Jun Suzuki, Masaaki Nagata

This paper proposes a reinforcing method that refines the output layers of existing Recurrent Neural Network (RNN) language models.

Hierarchical Word Structure-based Parsing: A Feasibility Study on UD-style Dependency Parsing in Japanese

no code implementations WS 2017 Takaaki Tanaka, Katsuhiko Hayashi, Masaaki Nagata

We introduce the following hierarchical word structures to dependency parsing in Japanese: morphological units (a short unit word, SUW) and syntactic units (a long unit word, LUW).

Chunking Dependency Parsing +2

Oracle Summaries of Compressive Summarization

no code implementations ACL 2017 Tsutomu Hirao, Masaaki Nishino, Masaaki Nagata

This paper derives an Integer Linear Programming (ILP) formulation to obtain an oracle summary of the compressive summarization paradigm in terms of ROUGE.

Sentence Compression

K-best Iterative Viterbi Parsing

no code implementations EACL 2017 Katsuhiko Hayashi, Masaaki Nagata

This paper presents an efficient and optimal parsing algorithm for probabilistic context-free grammars (PCFGs).

Enumeration of Extractive Oracle Summaries

no code implementations EACL 2017 Tsutomu Hirao, Masaaki Nishino, Jun Suzuki, Masaaki Nagata

To analyze the limitations and the future directions of the extractive summarization paradigm, this paper proposes an Integer Linear Programming (ILP) formulation to obtain extractive oracle summaries in terms of ROUGE-N. We also propose an algorithm that enumerates all of the oracle summaries for a set of reference summaries to exploit F-measures that evaluate which system summaries contain how many sentences that are extracted as an oracle summary.

Document Understanding Extractive Summarization

Chinese-to-Japanese Patent Machine Translation based on Syntactic Pre-ordering for WAT 2016

no code implementations WS 2016 Katsuhito Sudoh, Masaaki Nagata

This paper presents our Chinese-to-Japanese patent machine translation system for WAT 2016 (Group ID: ntt) that uses syntactic pre-ordering over Chinese dependency structures.

Chinese Word Segmentation Dependency Parsing +5

Integrating empty category detection into preordering Machine Translation

no code implementations WS 2016 Shunsuke Takeno, Masaaki Nagata, Kazuhide Yamamoto

We propose a method for integrating Japanese empty category detection into the preordering process of Japanese-to-English statistical machine translation.

Machine Translation Sentence +2
