1 code implementation • Findings (ACL) 2022 • Tatsuya Hiraoka, Sho Takase, Kei Uchiumi, Atsushi Keyaki, Naoaki Okazaki
We present two simple modifications of word-level perturbation: Word Replacement considering Length (WR-L) and Compositional Word Replacement (CWR). In conventional word replacement, a word in the input is replaced with a word sampled from the entire vocabulary, regardless of the length and context of the target word. WR-L accounts for the length of the target word by sampling the replacement's length from a Poisson distribution. CWR obtains compositional candidates by restricting the sampling source to related words that appear during subword regularization. Experimental results showed that the combination of WR-L and CWR improved performance on text classification and machine translation.
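A minimal sketch of the WR-L idea, assuming a plain Python vocabulary grouped by word length; the Poisson rate (the target word's length) matches the description above, but the grouping and the fallback are illustrative choices, not the authors' exact implementation:

```python
import math
import random
from collections import defaultdict

def wr_l_replace(word, vocab, rng=random):
    """Replace `word` with a word whose length is drawn from a Poisson
    distribution whose rate is len(word) (sketch of WR-L)."""
    by_length = defaultdict(list)
    for w in vocab:
        by_length[len(w)].append(w)
    # Sample a target length from Poisson(lambda = len(word)) via
    # Knuth's algorithm (adequate for short words).
    threshold, k, p = math.exp(-len(word)), 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            break
        k += 1
    # Fall back to the closest available length if none match exactly.
    candidates = by_length.get(k) or by_length[min(by_length, key=lambda n: abs(n - k))]
    return rng.choice(candidates)

vocab = ["a", "an", "cat", "word", "model", "replacement"]
print(wr_l_replace("word", vocab))
```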
no code implementations • 28 Dec 2023 • Sho Takase, Shun Kiyono, Sosuke Kobayashi, Jun Suzuki
Loss spikes often occur during pre-training of large language models.
no code implementations • 29 May 2023 • Mengsay Loem, Masahiro Kaneko, Sho Takase, Naoaki Okazaki
Large-scale pre-trained language models such as GPT-3 have shown remarkable performance across various natural language processing tasks.
no code implementations • 26 Aug 2022 • Ayana Niwa, Sho Takase, Naoaki Okazaki
In addition, the proposed method outperforms an NAR baseline on the WMT'14 En-De dataset.
no code implementations • 27 Jul 2022 • Mengsay Loem, Sho Takase, Masahiro Kaneko, Naoaki Okazaki
The impressive performance of the Transformer has been attributed to self-attention, in which dependencies across the entire input sequence are considered at every position.
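The self-attention the entry refers to is standard scaled dot-product attention; a compact NumPy rendering of that formula (shapes and variable names are ours):

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention: every position attends to
    every other position, so all pairwise dependencies are modeled."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                 # (seq_len, seq_len) pairwise scores
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over key positions
    return weights @ V

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 8))                         # 5 tokens, model dim 8
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
print(self_attention(X, Wq, Wk, Wv).shape)          # (5, 8)
```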
1 code implementation • 1 Jun 2022 • Sho Takase, Shun Kiyono, Sosuke Kobayashi, Jun Suzuki
Recent Transformers tend to be Pre-LN because, in Post-LN with deep Transformers (e.g., those with ten or more layers), training is often unstable, resulting in useless models.
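A minimal PyTorch sketch of the two layer-normalization placements the entry contrasts; `ffn` stands in for any sublayer (attention or feed-forward), and this shows the generic Pre-/Post-LN blocks rather than the paper's proposed method:

```python
import torch
import torch.nn as nn

def post_ln_block(x, sublayer, norm):
    # Post-LN: normalize *after* the residual addition (original Transformer).
    return norm(x + sublayer(x))

def pre_ln_block(x, sublayer, norm):
    # Pre-LN: normalize the sublayer input; the residual path stays
    # identity, which keeps deep stacks easier to train.
    return x + sublayer(norm(x))

d = 16
norm = nn.LayerNorm(d)
ffn = nn.Sequential(nn.Linear(d, 4 * d), nn.ReLU(), nn.Linear(4 * d, d))
x = torch.randn(3, d)
print(post_ln_block(x, ffn, norm).shape, pre_ln_block(x, ffn, norm).shape)
```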
no code implementations • Findings (ACL) 2022 • Sho Takase, Tatsuya Hiraoka, Naoaki Okazaki
Subword regularization uses multiple subword segmentations during training to improve the robustness of neural machine translation models.
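In practice, subword regularization is often realized with SentencePiece's sampling mode; a sketch assuming a unigram model has already been trained to a hypothetical `spm.model` file:

```python
import sentencepiece as spm

sp = spm.SentencePieceProcessor(model_file="spm.model")  # assumes a trained model

sentence = "subword regularization samples multiple segmentations"
for _ in range(3):
    # enable_sampling=True draws a segmentation from the unigram LM;
    # alpha sharpens or flattens the distribution, and nbest_size=-1
    # samples from all segmentation hypotheses.
    pieces = sp.encode(sentence, out_type=str, enable_sampling=True,
                       alpha=0.1, nbest_size=-1)
    print(pieces)
```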
1 code implementation • ACL 2022 • Masahiro Kaneko, Sho Takase, Ayana Niwa, Naoaki Okazaki
In this study, we introduce an Example-Based GEC (EB-GEC) that presents examples to language learners as a basis for a correction result.
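One plausible way to picture the example lookup is nearest-neighbor retrieval over learned representations; the stores and distance below are our assumptions for illustration, not necessarily EB-GEC's exact mechanism:

```python
import numpy as np

def retrieve_examples(query_vec, example_vecs, examples, k=3):
    # Return the k stored corrections whose vectors lie closest to the query.
    dists = np.linalg.norm(example_vecs - query_vec, axis=1)
    return [examples[i] for i in np.argsort(dists)[:k]]

rng = np.random.default_rng(0)
example_vecs = rng.normal(size=(100, 32))        # hypothetical encoded examples
examples = [f"correction #{i}" for i in range(100)]
print(retrieve_examples(rng.normal(size=32), example_vecs, examples))
```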
no code implementations • NAACL (ACL) 2022 • Mengsay Loem, Sho Takase, Masahiro Kaneko, Naoaki Okazaki
Through experiments, we show that ExtraPhrase improves performance on abstractive summarization tasks by more than 0.50 ROUGE points compared to the setting without data augmentation.
2 code implementations • Findings (ACL) 2021 • Tatsuya Hiraoka, Sho Takase, Kei Uchiumi, Atsushi Keyaki, Naoaki Okazaki
Since traditional tokenizers are isolated from the downstream task and model, they cannot adapt their tokenization to either, even though recent studies imply that an appropriate tokenization improves performance.
2 code implementations • 13 Apr 2021 • Sho Takase, Shun Kiyono
We propose a parameter sharing method for Transformers (Vaswani et al., 2017).
Ranked #1 on Machine Translation on WMT2014 English-German
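A sketch of one sharing strategy consistent with this idea, cycling a small set of unique layers through a deeper stack; the assignment rule and sizes are illustrative, and the paper compares several such strategies:

```python
import torch.nn as nn

def build_shared_stack(num_applications, num_unique, d_model=16, nhead=4):
    """Stack `num_applications` Transformer layers while training only
    `num_unique` parameter sets, reusing them cyclically (one possible
    assignment; illustrative, not the paper's exact recipe)."""
    unique_layers = nn.ModuleList(
        nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        for _ in range(num_unique)
    )
    # Cycle-style assignment: position i reuses parameter set i % num_unique.
    return [unique_layers[i % num_unique] for i in range(num_applications)]

stack = build_shared_stack(num_applications=6, num_unique=2)
print([id(layer) for layer in stack])  # only 2 distinct parameter sets appear
```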
1 code implementation • NAACL 2021 • Sho Takase, Shun Kiyono
We often use perturbations to regularize neural models.
Ranked #1 on Text Summarization on DUC 2004 Task 1 (using extra training data)
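Two widely used perturbations in this setting are word dropout and additive Gaussian noise on embeddings; a minimal sketch with illustrative hyperparameters:

```python
import torch

def word_dropout(token_ids, unk_id, p=0.1):
    # Randomly replace input tokens with <unk> during training only.
    mask = torch.rand(token_ids.shape) < p
    return torch.where(mask, torch.full_like(token_ids, unk_id), token_ids)

def embedding_noise(embeddings, std=0.1):
    # Add Gaussian noise to embeddings as a regularizer.
    return embeddings + std * torch.randn_like(embeddings)

ids = torch.tensor([[5, 9, 2, 7]])
print(word_dropout(ids, unk_id=3))
```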
1 code implementation • Findings of the Association for Computational Linguistics 2020 • Tatsuya Hiraoka, Sho Takase, Kei Uchiumi, Atsushi Keyaki, Naoaki Okazaki
In traditional NLP, we tokenize a given sentence as a preprocessing step, so the tokenization is unrelated to the target downstream task.
no code implementations • LREC 2022 • Sho Takase, Naoaki Okazaki
Experimental results indicate that Transum improves performance over the strong Transformer baseline on Chinese-English, Arabic-English, and English-Japanese translation datasets.
1 code implementation • ACL 2020 • Kazuki Matsumaru, Sho Takase, Naoaki Okazaki
We build a binary classifier that predicts the entailment relation between an article and its headline, and use it to filter untruthful instances out of the supervision data.
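The filtering step can be pictured as a loop over (article, headline) pairs that keeps only those the classifier judges entailed; `entails` below is a toy stand-in for the trained binary classifier:

```python
def filter_supervision(pairs, entails, threshold=0.5):
    """Keep only (article, headline) pairs judged entailed by a binary
    classifier (`entails` is a hypothetical stand-in returning a score)."""
    return [(a, h) for a, h in pairs if entails(a, h) >= threshold]

def entails(article, headline):
    # Toy stand-in: fraction of headline words appearing in the article.
    words = set(article.lower().split())
    hits = sum(w in words for w in headline.lower().split())
    return hits / max(len(headline.split()), 1)

pairs = [("the cabinet approved the budget", "the budget approved"),
         ("the cabinet approved the budget", "stocks plunge on news")]
print(filter_supervision(pairs, entails))  # keeps only the first pair
```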
no code implementations • LREC 2020 • Sho Shimazu, Sho Takase, Toshiaki Nakazawa, Naoaki Okazaki
Therefore, we present a hand-crafted dataset for evaluating whether translation models can resolve zero-pronoun problems in Japanese-to-English translation.
1 code implementation • NeurIPS 2020 • Sho Takase, Sosuke Kobayashi
The proposed method, ALONE (all word embeddings from one), constructs the embedding of a word by modifying the shared embedding with a filter vector, which is word-specific but non-trainable.
Ranked #3 on Text Summarization on DUC 2004 Task 1
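A sketch of the ALONE construction as described above: one shared trainable vector, a word-specific non-trainable filter, and a small feed-forward net; drawing one independent random filter per word is a simplification of the paper's memory-efficient filter scheme:

```python
import torch
import torch.nn as nn

class ALONESketch(nn.Module):
    """One shared trainable embedding, modified per word by a fixed random
    filter vector and refined by a feed-forward net (illustrative)."""
    def __init__(self, vocab_size, dim):
        super().__init__()
        self.shared = nn.Parameter(torch.randn(dim))  # trainable, shared by all words
        filters = torch.randn(vocab_size, dim)        # word-specific, non-trainable
        self.register_buffer("filters", filters)
        self.ffn = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))

    def forward(self, word_ids):
        return self.ffn(self.shared * self.filters[word_ids])

emb = ALONESketch(vocab_size=100, dim=16)
print(emb(torch.tensor([1, 2, 3])).shape)  # (3, 16)
```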
no code implementations • IJCNLP 2019 • Masaaki Nishino, Sho Takase, Tsutomu Hirao, Masaaki Nagata
An anagram is a sentence or phrase made by permuting the characters of an input sentence or phrase.
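Concretely, two strings are anagrams exactly when they share the same multiset of characters; a one-function check (ignoring case and spaces, an illustrative normalization):

```python
from collections import Counter

def is_anagram(a, b):
    # Compare character multisets, ignoring case and whitespace.
    canon = lambda s: Counter(s.lower().replace(" ", ""))
    return canon(a) == canon(b)

print(is_anagram("dormitory", "dirty room"))  # True
```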
no code implementations • WS 2019 • Yuichi Sasazawa, Sho Takase, Naoaki Okazaki
One of the key requirements of QG is to generate a question that leads to a given target answer.
no code implementations • 13 Jun 2019 • Sho Takase, Jun Suzuki, Masaaki Nagata
This paper proposes a novel Recurrent Neural Network (RNN) language model that takes advantage of character information.
1 code implementation • NAACL 2019 • Sho Takase, Naoaki Okazaki
Neural encoder-decoder models have been successful in natural language generation tasks.
Ranked #2 on Text Summarization on DUC 2004 Task 1
no code implementations • WS 2018 • Shun Kiyono, Sho Takase, Jun Suzuki, Naoaki Okazaki, Kentaro Inui, Masaaki Nagata
Developing a method for understanding the inner workings of black-box neural methods is an important research endeavor.
1 code implementation • EMNLP 2018 • Sho Takase, Jun Suzuki, Masaaki Nagata
This paper proposes a state-of-the-art recurrent neural network (RNN) language model that combines probability distributions computed not only from the final RNN layer but also from middle layers.
Ranked #8 on Language Modelling on Penn Treebank (Word Level)
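A sketch of the core idea, mixing word distributions computed from several layers' hidden states rather than from the top layer alone; the single shared projection and uniform initial mixture weights are our simplifications:

```python
import torch
import torch.nn.functional as F

def multilayer_lm_distribution(hidden_states, out_proj, mix_logits):
    """Mix per-layer word distributions (sketch; the paper's exact
    weighting and projections may differ)."""
    weights = F.softmax(mix_logits, dim=0)                  # one weight per layer
    dists = [F.softmax(out_proj(h), dim=-1) for h in hidden_states]
    return sum(w * d for w, d in zip(weights, dists))

vocab, dim, layers = 50, 16, 3
out_proj = torch.nn.Linear(dim, vocab)
hs = [torch.randn(1, dim) for _ in range(layers)]           # per-layer hidden states
mix = torch.zeros(layers)                                   # learnable in practice
print(multilayer_lm_distribution(hs, out_proj, mix).sum())  # ~1.0
```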
1 code implementation • ACL 2018 • Jun Suzuki, Sho Takase, Hidetaka Kamigaito, Makoto Morishita, Masaaki Nagata
This paper investigates the construction of a strong baseline based on general-purpose sequence-to-sequence models for constituency parsing.
Ranked #16 on Constituency Parsing on Penn Treebank
no code implementations • 22 Dec 2017 • Shun Kiyono, Sho Takase, Jun Suzuki, Naoaki Okazaki, Kentaro Inui, Masaaki Nagata
The encoder-decoder model is widely used in natural language generation tasks.
1 code implementation • IJCNLP 2017 • Sho Takase, Jun Suzuki, Masaaki Nagata
This paper proposes a reinforcing method that refines the output layers of existing Recurrent Neural Network (RNN) language models.
1 code implementation • ACL 2016 • Sho Takase, Naoaki Okazaki, Kentaro Inui
Learning distributed representations for relation instances is a central technique in downstream NLP applications.