Search Results for author: Naoki Yoshinaga

Found 36 papers, 6 papers with code

Fine-grained Typing of Emerging Entities in Microblogs

no code implementations Findings (EMNLP) 2021 Satoshi Akasaki, Naoki Yoshinaga, Masashi Toyoda

Experiments on the Twitter datasets confirm the effectiveness of our typing model and the context selector.

Entity Typing

Building Large-Scale Japanese Pronunciation-Annotated Corpora for Reading Heteronymous Logograms

no code implementations LREC 2022 Fumikazu Sato, Naoki Yoshinaga, Masaru Kitsuregawa

In this study, to improve the accuracy of pronunciation prediction, we construct two large-scale Japanese corpora that annotate kanji characters with their pronunciations.

Sentence

Exploratory Model Analysis Using Data-Driven Neuron Representations

no code implementations EMNLP (BlackboxNLP) 2021 Daisuke Oba, Naoki Yoshinaga, Masashi Toyoda

Probing classifiers have been extensively used to inspect whether a model component captures specific linguistic phenomena.

Speculative Sampling in Variational Autoencoders for Dialogue Response Generation

1 code implementation Findings (EMNLP) 2021 Shoetsu Sato, Naoki Yoshinaga, Masashi Toyoda, Masaru Kitsuregawa

Our method chooses the most probable one from redundantly sampled latent variables, tying the chosen variable to the given response.

Response Generation
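
A minimal sketch of the selection step described in the abstract above, assuming a trained variational model with a Gaussian prior; `response_loglik`, which stands in for the decoder's log-likelihood of the gold response, is a hypothetical callable, not the authors' code:

```python
import numpy as np

def speculative_sample(prior_mu, prior_sigma, response_loglik, k=10, seed=0):
    """Redundantly sample k latent variables from the Gaussian prior and
    keep the one under which the decoder assigns the gold response the
    highest log-likelihood."""
    rng = np.random.default_rng(seed)
    # Draw k candidates z ~ N(prior_mu, prior_sigma^2).
    zs = prior_mu + prior_sigma * rng.standard_normal((k, prior_mu.shape[-1]))
    # Score each candidate by log p(response | z) under the decoder.
    scores = [response_loglik(z) for z in zs]
    # Tie the most probable latent variable to this response.
    return zs[int(np.argmax(scores))]
```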

Tracing the Roots of Facts in Multilingual Language Models: Independent, Shared, and Transferred Knowledge

1 code implementation 8 Mar 2024 Xin Zhao, Naoki Yoshinaga, Daisuke Oba

Acquiring factual knowledge for language models (LMs) in low-resource languages poses a serious challenge, so one has to resort to cross-lingual transfer in multilingual LMs (ML-LMs).

Cross-Lingual Transfer Knowledge Probing +1

Rethinking Response Evaluation from Interlocutor's Eye for Open-Domain Dialogue Systems

no code implementations 4 Jan 2024 Yuma Tsuta, Naoki Yoshinaga, Shoetsu Sato, Masashi Toyoda

Open-domain dialogue systems have started to engage in continuous conversations with humans.

Summarization-based Data Augmentation for Document Classification

1 code implementation 1 Dec 2023 Yueguan Wang, Naoki Yoshinaga

Despite the prevalence of pretrained language models in natural language understanding tasks, understanding lengthy text such as documents remains challenging due to the data sparseness problem.

Classification Data Augmentation +2

A Unified Generative Approach to Product Attribute-Value Identification

no code implementations 9 Jun 2023 Keiji Shinzato, Naoki Yoshinaga, Yandi Xia, Wei-Te Chen

We finetune a pre-trained generative model, T5, to decode a set of attribute-value pairs as a target sequence from the given product text.

Attribute
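
As a rough illustration of this setup (not the authors' code), the attribute-value pairs can be linearized into a single target string and T5 fine-tuned with the standard Hugging Face API; the "attribute: value; ..." format below is an assumed linearization:

```python
from transformers import AutoTokenizer, T5ForConditionalGeneration

tokenizer = AutoTokenizer.from_pretrained("t5-base")
model = T5ForConditionalGeneration.from_pretrained("t5-base")

product_text = "Nike Air Zoom running shoes, black, mesh upper"
target = "brand: Nike; color: black; material: mesh"  # assumed format

# Fine-tuning step: T5 computes cross-entropy loss against the labels.
inputs = tokenizer(product_text, return_tensors="pt")
labels = tokenizer(target, return_tensors="pt").input_ids
loss = model(**inputs, labels=labels).loss
loss.backward()  # plug into any optimizer loop

# Inference: decode the attribute-value pairs as a target sequence.
pred_ids = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(pred_ids[0], skip_special_tokens=True))
```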

Back to Patterns: Efficient Japanese Morphological Analysis with Feature-Sequence Trie

no code implementations 30 May 2023 Naoki Yoshinaga

Accurate neural models are much less efficient than non-neural models and are useless for processing billions of social media posts or handling user queries in real time with a limited budget.

Morphological Analysis

Esports Data-to-commentary Generation on Large-scale Data-to-text Dataset

no code implementations 21 Dec 2022 Zihan Wang, Naoki Yoshinaga

Therefore, in this study, we introduce the task of generating game commentaries from structured data records to address this problem.

Self-Adaptive Named Entity Recognition by Retrieving Unstructured Knowledge

no code implementations 14 Oct 2022 Kosuke Nishida, Naoki Yoshinaga, Kyosuke Nishida

Although named entity recognition (NER) helps us to extract domain-specific entities from text (e.g., artists in the music domain), it is costly to create a large amount of training data or a structured knowledge base to perform accurate NER in the target domain.

Named Entity Recognition +1

Early Discovery of Disappearing Entities in Microblogs

no code implementations 13 Oct 2022 Satoshi Akasaki, Naoki Yoshinaga, Masashi Toyoda

The major challenge is detecting uncertain contexts of disappearing entities from noisy microblog posts.

Time Series Time Series Analysis +1

Simple and Effective Knowledge-Driven Query Expansion for QA-Based Product Attribute Extraction

no code implementations ACL 2022 Keiji Shinzato, Naoki Yoshinaga, Yandi Xia, Wei-Te Chen

A key challenge in attribute value extraction (AVE) from e-commerce sites is how to handle a large number of attributes for diverse products.

Attribute Attribute Extraction +2

Vocabulary Adaptation for Domain Adaptation in Neural Machine Translation

1 code implementation Findings of the Association for Computational Linguistics 2020 Shoetsu Sato, Jin Sakuma, Naoki Yoshinaga, Masashi Toyoda, Masaru Kitsuregawa

Prior to fine-tuning, our method replaces the embedding layers of the NMT model by projecting general word embeddings induced from monolingual data in a target domain onto a source-domain embedding space.

Domain Adaptation Machine Translation +3
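
One plausible way to realize the projection described above is to fit a linear map on words shared by the two vocabularies and apply it to every target-domain vector; this least-squares variant is a sketch, not necessarily the paper's exact procedure:

```python
import numpy as np

def project_embeddings(tgt_emb, src_emb, shared_words, tgt_vocab, src_vocab):
    """Fit a linear map W from the target-domain embedding space onto the
    source-domain space using words shared by both vocabularies, then
    project all target-domain vectors.  tgt_emb/src_emb: (V, d) arrays;
    *_vocab: dicts mapping words to row indices."""
    X = np.stack([tgt_emb[tgt_vocab[w]] for w in shared_words])
    Y = np.stack([src_emb[src_vocab[w]] for w in shared_words])
    W, *_ = np.linalg.lstsq(X, Y, rcond=None)  # solves min ||XW - Y||
    return tgt_emb @ W  # rows for the replaced NMT embedding layer
```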

Robust Backed-off Estimation of Out-of-Vocabulary Embeddings

no code implementations Findings of the Association for Computational Linguistics 2020 Nobukazu Fukuda, Naoki Yoshinaga, Masaru Kitsuregawa

In this study, inspired by the processes for creating words from known words, we propose a robust method of estimating OOV word embeddings by referring to the pre-trained embeddings of known words whose surfaces are similar to the target OOV words.

Word Embeddings Word Similarity
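
A toy sketch of the backed-off idea: look up known words whose surface forms resemble the OOV word and combine their pre-trained vectors. The similarity measure (difflib's character-overlap ratio) and the similarity-weighted average are illustrative assumptions, not the paper's exact model:

```python
import difflib
import numpy as np

def estimate_oov(oov_word, vocab_vecs, n_refs=5):
    """Estimate an embedding for oov_word from known words with similar
    surfaces.  vocab_vecs: dict mapping known words to vectors."""
    sims = {w: difflib.SequenceMatcher(None, oov_word, w).ratio()
            for w in vocab_vecs}
    refs = sorted(sims, key=sims.get, reverse=True)[:n_refs]
    weights = np.array([sims[w] for w in refs])
    vecs = np.stack([vocab_vecs[w] for w in refs])
    return (weights[:, None] * vecs).sum(axis=0) / weights.sum()
```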

Context-aware Decoder for Neural Machine Translation using a Target-side Document-Level Language Model

no code implementations NAACL 2021 Amane Sugiyama, Naoki Yoshinaga

Although many context-aware neural machine translation models have been proposed to incorporate contexts in translation, most of those models are trained end-to-end on parallel documents aligned at the sentence level.

Language Modelling Machine Translation +2

uBLEU: Uncertainty-Aware Automatic Evaluation Method for Open-Domain Dialogue Systems

no code implementations ACL 2020 Yuma Tsuta, Naoki Yoshinaga, Masashi Toyoda

Experimental results on massive Twitter data confirmed that υBLEU is comparable to ΔBLEU in terms of its correlation with human judgment and that the state-of-the-art automatic evaluation method, RUBER, is improved by integrating υBLEU.

Vocabulary Adaptation for Distant Domain Adaptation in Neural Machine Translation

no code implementations 30 Apr 2020 Shoetsu Sato, Jin Sakuma, Naoki Yoshinaga, Masashi Toyoda, Masaru Kitsuregawa

Prior to fine-tuning, our method replaces the embedding layers of the NMT model by projecting general word embeddings induced from monolingual data in a target domain onto a source-domain embedding space.

Domain Adaptation Machine Translation +3

On the Relation between Position Information and Sentence Length in Neural Machine Translation

no code implementations CoNLL 2019 Masato Neishi, Naoki Yoshinaga

Although some approaches such as the attention mechanism have partially remedied the problem, we found that the current standard NMT model, Transformer, has difficulty in translating long sentences compared to the former standard, the Recurrent Neural Network (RNN)-based model.

Machine Translation NMT +4

Multilingual Model Using Cross-Task Embedding Projection

no code implementations CoNLL 2019 Jin Sakuma, Naoki Yoshinaga

We present a method for applying a neural network trained on one (resource-rich) language for a given task to other (resource-poor) languages.

Cross-Lingual Word Embeddings Sentiment Analysis +2

Early Discovery of Emerging Entities in Microblogs

no code implementations 8 Jul 2019 Satoshi Akasaki, Naoki Yoshinaga, Masashi Toyoda

Keeping up to date on emerging entities that appear every day is indispensable for various applications, such as social-trend analysis and marketing research.

Marketing

Learning to Describe Unknown Phrases with Local and Global Contexts

no code implementations NAACL 2019 Shonosuke Ishiwatari, Hiroaki Hayashi, Naoki Yoshinaga, Graham Neubig, Shoetsu Sato, Masashi Toyoda, Masaru Kitsuregawa

When reading a text, it is common to become stuck on unfamiliar words and phrases, such as polysemous words with novel senses, rarely used idioms, internet slang, or emerging entities.

Modeling Personal Biases in Language Use by Inducing Personalized Word Embeddings

no code implementations NAACL 2019 Daisuke Oba, Naoki Yoshinaga, Shoetsu Sato, Satoshi Akasaki, Masashi Toyoda

In this study, we propose a method of modeling such personal biases in word meanings (hereafter, semantic variations) with personalized word embeddings obtained by solving a task on subjective text while regarding words used by different individuals as different words.

Multi-class Classification Multi-Task Learning +2
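
In its simplest form, "regarding words used by different individuals as different words" can be realized by namespacing tokens per author before training any standard embedding model; this prefixing trick is an illustrative simplification of the paper's setup:

```python
def personalize(tokens, user_id):
    """Make the same surface word a distinct vocabulary entry per author
    by prefixing each token with the author's id; a downstream embedding
    trainer then learns user-specific vectors."""
    return [f"{user_id}::{tok}" for tok in tokens]

# personalize(["this", "movie", "is", "sick"], "u42")
# -> ["u42::this", "u42::movie", "u42::is", "u42::sick"]
```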

Learning to Describe Phrases with Local and Global Contexts

1 code implementation 1 Nov 2018 Shonosuke Ishiwatari, Hiroaki Hayashi, Naoki Yoshinaga, Graham Neubig, Shoetsu Sato, Masashi Toyoda, Masaru Kitsuregawa

When reading a text, it is common to become stuck on unfamiliar words and phrases, such as polysemous words with novel senses, rarely used idioms, internet slang, or emerging entities.

Reading Comprehension
