Search Results for author: Sadao Kurohashi

Found 152 papers, 28 papers with code

Meta Ensemble for Japanese-Chinese Neural Machine Translation: Kyoto-U+ECNU Participation to WAT 2020

no code implementations AACL (WAT) 2020 Zhuoyuan Mao, Yibin Shen, Chenhui Chu, Sadao Kurohashi, Cheqing Jin

This paper describes the Japanese-Chinese Neural Machine Translation (NMT) system submitted by the joint team of Kyoto University and East China Normal University (Kyoto-U+ECNU) to WAT 2020 (Nakazawa et al., 2020).

Denoising Machine Translation +2

Improving Event Duration Question Answering by Leveraging Existing Temporal Information Extraction Data

1 code implementation LREC 2022 Felix Virgo, Fei Cheng, Sadao Kurohashi

However, the amount of training data for tasks like duration question answering, i.e., McTACO, is very limited, suggesting a need for external duration information to improve this task.

Question Answering Temporal Information Extraction

JaMIE: A Pipeline Japanese Medical Information Extraction System with Novel Relation Annotation

no code implementations LREC 2022 Fei Cheng, Shuntaro Yada, Ribeka Tanaka, Eiji Aramaki, Sadao Kurohashi

In this paper, we first propose a novel relation annotation schema for investigating the medical and temporal relations between medical entities in Japanese medical reports.

Relation Extraction

Explicit Use of Topicality in Dialogue Response Generation

no code implementations NAACL (ACL) 2022 Takumi Yoshikoshi, Hayato Atarashi, Takashi Kodama, Sadao Kurohashi

In this study, we propose a dialogue system that responds appropriately following the topic by selecting the entity with the highest “topicality.” In topicality estimation, the model is trained through self-supervised learning that regards entities that appear in both context and response as the topic entities.

Response Generation Self-Supervised Learning
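The self-supervised labeling idea described in this abstract (entities that appear in both the context and the response are treated as topic entities) can be sketched as follows. This is an illustrative sketch, not the authors' code; entity extraction (e.g., an NER step) is assumed to happen upstream, and all names here are hypothetical.

```python
def label_topic_entities(context_entities, response_entities):
    """Self-supervised labeling: entities appearing in both the dialogue
    context and the response are treated as topic entities. Entity
    extraction itself is assumed to be done upstream."""
    context_set = set(context_entities)
    # Preserve response order; keep only entities also seen in the context.
    return [e for e in response_entities if e in context_set]

# Toy dialogue: "Kyoto" appears in both context and response,
# so it would be labeled as the topic entity for training.
context = ["Kyoto", "autumn", "temples"]
response = ["Kyoto", "Kinkaku-ji"]
print(label_topic_entities(context, response))  # ['Kyoto']
```

Such labels come for free from dialogue corpora, which is what makes the training self-supervised.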

A Method for Building a Commonsense Inference Dataset based on Basic Events

no code implementations EMNLP 2020 Kazumasa Omura, Daisuke Kawahara, Sadao Kurohashi

We present a scalable, low-bias, and low-cost method for building a commonsense inference dataset that combines automatic extraction from a corpus and crowdsourcing.

Multiple-choice Transfer Learning

Improving Bridging Reference Resolution using Continuous Essentiality from Crowdsourcing

1 code implementation COLING (CRAC) 2022 Nobuhiro Ueda, Sadao Kurohashi

Bridging reference resolution is the task of finding nouns that complement essential information of another noun.

Flexible Visual Grounding

1 code implementation ACL 2022 Yongmin Kim, Chenhui Chu, Sadao Kurohashi

Existing visual grounding datasets are artificially made, where every query regarding an entity can be grounded to a corresponding image region, i.e., is answerable.

Visual Grounding

Kyoto University MT System Description for IWSLT 2017

no code implementations IWSLT 2017 Raj Dabre, Fabien Cromieres, Sadao Kurohashi

We describe here our Machine Translation (MT) model and the results we obtained for the IWSLT 2017 Multilingual Shared Task.

Machine Translation NMT +1

Japanese Zero Anaphora Resolution Can Benefit from Parallel Texts Through Neural Transfer Learning

no code implementations Findings (EMNLP) 2021 Masato Umakoshi, Yugo Murawaki, Sadao Kurohashi

Parallel texts of Japanese and a non-pro-drop language have the potential of improving the performance of Japanese zero anaphora resolution (ZAR) because pronouns dropped in the former are usually mentioned explicitly in the latter.

Cross-Lingual Transfer Language Modelling +3

Improving Commonsense Contingent Reasoning by Pseudo-data and Its Application to the Related Tasks

no code implementations COLING 2022 Kazumasa Omura, Sadao Kurohashi

Contingent reasoning is one of the essential abilities in natural language understanding, and many language resources annotated with contingent relations have been constructed.

Natural Language Understanding Transfer Learning

MultiTool-CoT: GPT-3 Can Use Multiple External Tools with Chain of Thought Prompting

1 code implementation 26 May 2023 Tatsuro Inaba, Hirokazu Kiyomaru, Fei Cheng, Sadao Kurohashi

Large language models (LLMs) have achieved impressive performance on various reasoning tasks.

Variable-length Neural Interlingua Representations for Zero-shot Neural Machine Translation

no code implementations 17 May 2023 Zhuoyuan Mao, Haiyue Song, Raj Dabre, Chenhui Chu, Sadao Kurohashi

The language-independency of encoded representations within multilingual neural machine translation (MNMT) models is crucial for their generalization ability on zero-shot translation.

Machine Translation Translation

Towards Speech Dialogue Translation Mediating Speakers of Different Languages

1 code implementation 16 May 2023 Shuichiro Shimizu, Chenhui Chu, Sheng Li, Sadao Kurohashi

We present a new task, speech dialogue translation mediating speakers of different languages.


SuperDialseg: A Large-scale Dataset for Supervised Dialogue Segmentation

1 code implementation 15 May 2023 Junfeng Jiang, Chengzhang Dong, Akiko Aizawa, Sadao Kurohashi

In this paper, we provide a feasible definition of dialogue segmentation points with the help of document-grounded dialogues and release a large-scale supervised dataset called SuperDialseg, containing 9K dialogues based on two prevalent document-grounded dialogue corpora, and also inherit their useful dialogue-related annotations.

Comprehensive Solution Program Centric Pretraining for Table-and-Text Hybrid Numerical Reasoning

no code implementations 12 May 2023 Qianying Liu, Dongsheng Yang, Wenjie Zhong, Fei Cheng, Sadao Kurohashi

Numerical reasoning over table-and-text hybrid passages, such as financial reports, poses significant challenges and has numerous potential applications.

GPT-RE: In-context Learning for Relation Extraction using Large Language Models

no code implementations 3 May 2023 Zhen Wan, Fei Cheng, Zhuoyuan Mao, Qianying Liu, Haiyue Song, Jiwei Li, Sadao Kurohashi

In spite of the potential for ground-breaking achievements offered by large language models (LLMs) (e.g., GPT-3), they still lag significantly behind fully-supervised baselines (e.g., fine-tuned BERT) in relation extraction (RE).

Relation Extraction Retrieval

Textual Enhanced Contrastive Learning for Solving Math Word Problems

1 code implementation 29 Nov 2022 Yibin Shen, Qianying Liu, Zhuoyuan Mao, Fei Cheng, Sadao Kurohashi

Solving math word problems is the task that analyses the relation of quantities and requires an accurate understanding of contextual natural language information.

Contrastive Learning

Rescue Implicit and Long-tail Cases: Nearest Neighbor Relation Extraction

1 code implementation 21 Oct 2022 Zhen Wan, Qianying Liu, Zhuoyuan Mao, Fei Cheng, Sadao Kurohashi, Jiwei Li

Relation extraction (RE) has achieved remarkable progress with the help of pre-trained language models.

Relation Extraction

ComSearch: Equation Searching with Combinatorial Strategy for Solving Math Word Problems with Weak Supervision

no code implementations 13 Oct 2022 Qianying Liu, Wenyu Guan, Jianhao Shen, Fei Cheng, Sadao Kurohashi

To address this problem, we propose a novel search algorithm with a combinatorial strategy, ComSearch, which can compress the search space by excluding mathematically equivalent equations.
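The core idea of excluding mathematically equivalent candidate equations can be sketched with a simple numeric equivalence check: two expressions that evaluate identically on several variable assignments are treated as one. This is a stand-in illustration under that assumption; the paper's own combinatorial compression strategy is not reproduced here, and the expression strings are hypothetical.

```python
def equivalent(expr_a, expr_b, assignments):
    """Treat two candidate expressions as mathematically equivalent if
    they evaluate identically on every sampled variable assignment.
    A numeric stand-in for symbolic equivalence checking."""
    for env in assignments:
        if abs(eval(expr_a, {}, env) - eval(expr_b, {}, env)) > 1e-9:
            return False
    return True

def compress(candidates, assignments):
    """Keep one representative per equivalence class, shrinking the
    search space over candidate equations."""
    kept = []
    for cand in candidates:
        if not any(equivalent(cand, k, assignments) for k in kept):
            kept.append(cand)
    return kept

samples = [{"a": 2.0, "b": 3.0}, {"a": 5.0, "b": 7.0}]
cands = ["a*b + a", "a*(b + 1)", "a + b"]  # first two are equivalent
print(compress(cands, samples))  # ['a*b + a', 'a + b']
```

In a weak-supervision setting this matters because many syntactically distinct equations yield the same answer, so deduplicating them keeps the search tractable.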

Seeking Diverse Reasoning Logic: Controlled Equation Expression Generation for Solving Math Word Problems

1 code implementation 21 Sep 2022 Yibin Shen, Qianying Liu, Zhuoyuan Mao, Zhen Wan, Fei Cheng, Sadao Kurohashi

To solve Math Word Problems, human students leverage diverse reasoning logic that reaches different possible equation solutions.

EMS: Efficient and Effective Massively Multilingual Sentence Representation Learning

1 code implementation 31 May 2022 Zhuoyuan Mao, Chenhui Chu, Sadao Kurohashi

Massively multilingual sentence representation models, e.g., LASER, SBERT-distill, and LaBSE, help significantly improve cross-lingual downstream tasks.

Contrastive Learning Genre classification +3

Relation Extraction with Weighted Contrastive Pre-training on Distant Supervision

no code implementations 18 May 2022 Zhen Wan, Fei Cheng, Qianying Liu, Zhuoyuan Mao, Haiyue Song, Sadao Kurohashi

Contrastive pre-training on distant supervision has shown remarkable effectiveness in improving supervised relation extraction tasks.

Contrastive Learning Relation Extraction
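The notion of weighting a contrastive pre-training objective can be sketched with a weighted InfoNCE-style loss, where an instance weight (e.g., reflecting how reliable a distantly supervised label is) scales each example's contribution. The function name, the weighting scheme, and the numbers below are illustrative assumptions, not the paper's formulation.

```python
import math

def weighted_info_nce(sim_pos, sims_neg, weight, temperature=0.1):
    """Weighted InfoNCE sketch: the negative log-softmax of the positive
    pair's similarity, scaled by an instance weight so noisy
    distant-supervision examples contribute less to training."""
    logits = [sim_pos / temperature] + [s / temperature for s in sims_neg]
    m = max(logits)  # subtract the max for numerical stability
    log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
    nll = log_denom - logits[0]  # -log softmax of the positive pair
    return weight * nll

# A noisy distantly supervised pair (weight 0.3) contributes less loss
# than a clean one (weight 1.0) with the same similarities.
clean = weighted_info_nce(0.9, [0.2, 0.1], weight=1.0)
noisy = weighted_info_nce(0.9, [0.2, 0.1], weight=0.3)
print(clean > noisy)  # True
```

The design choice is that distant supervision is inherently noisy, so uniformly trusting every pair would let mislabeled instances dominate the contrastive signal.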

When do Contrastive Word Alignments Improve Many-to-many Neural Machine Translation?

no code implementations Findings (NAACL) 2022 Zhuoyuan Mao, Chenhui Chu, Raj Dabre, Haiyue Song, Zhen Wan, Sadao Kurohashi

Meanwhile, the contrastive objective can implicitly utilize automatically learned word alignment, which has not been explored in many-to-many NMT.

Machine Translation NMT +3

VISA: An Ambiguous Subtitles Dataset for Visual Scene-Aware Machine Translation

1 code implementation LREC 2022 Yihang Li, Shuichiro Shimizu, Weiqi Gu, Chenhui Chu, Sadao Kurohashi

Existing multimodal machine translation (MMT) datasets consist of images and video captions or general subtitles, which rarely contain linguistic ambiguity, making visual information not so effective to generate appropriate translations.

Multimodal Machine Translation Translation

Linguistically-driven Multi-task Pre-training for Low-resource Neural Machine Translation

1 code implementation 20 Jan 2022 Zhuoyuan Mao, Chenhui Chu, Sadao Kurohashi

In the present study, we propose novel sequence-to-sequence pre-training objectives for low-resource neural machine translation (NMT): Japanese-specific sequence to sequence (JASS) for language pairs involving Japanese as the source or target language, and English-specific sequence to sequence (ENSS) for language pairs involving English.

Low-Resource Neural Machine Translation NMT +1

Cross-lingual Adaption Model-Agnostic Meta-Learning for Natural Language Understanding

no code implementations 10 Nov 2021 Qianying Liu, Fei Cheng, Sadao Kurohashi

Meta learning with auxiliary languages has demonstrated promising improvements for cross-lingual natural language processing.

Cross-Lingual Transfer Meta-Learning +3

JaMIE: A Pipeline Japanese Medical Information Extraction System

1 code implementation 8 Nov 2021 Fei Cheng, Shuntaro Yada, Ribeka Tanaka, Eiji Aramaki, Sadao Kurohashi

We present an open-access natural language processing toolkit for Japanese medical information extraction.

Video-guided Machine Translation with Spatial Hierarchical Attention Network

no code implementations ACL 2021 Weiqi Gu, Haiyue Song, Chenhui Chu, Sadao Kurohashi

Video-guided machine translation, one type of multimodal machine translation, aims to engage video content as auxiliary information to address the word sense ambiguity problem in machine translation.

Action Detection Machine Translation +2

Contextualized and Generalized Sentence Representations by Contrastive Self-Supervised Learning: A Case Study on Discourse Relation Analysis

no code implementations NAACL 2021 Hirokazu Kiyomaru, Sadao Kurohashi

The model is trained to maximize the similarity between the representation of the target sentence with its context and that of the masked target sentence with the same context.

Self-Supervised Learning

Lightweight Cross-Lingual Sentence Representation Learning

1 code implementation ACL 2021 Zhuoyuan Mao, Prakhar Gupta, Pei Wang, Chenhui Chu, Martin Jaggi, Sadao Kurohashi

Large-scale models for learning fixed-dimensional cross-lingual sentence representations like LASER (Artetxe and Schwenk, 2019b) lead to significant improvement in performance on downstream tasks.

Contrastive Learning Document Classification +3

Frustratingly Easy Edit-based Linguistic Steganography with a Masked Language Model

1 code implementation NAACL 2021 Honai Ueoka, Yugo Murawaki, Sadao Kurohashi

With advances in neural language models, the focus of linguistic steganography has shifted from edit-based approaches to generation-based ones.

Language Modelling Linguistic steganography

Extractive Summarization Considering Discourse and Coreference Relations based on Heterogeneous Graph

no code implementations EACL 2021 Yin Jou Huang, Sadao Kurohashi

In this paper, we propose a heterogeneous graph based model for extractive summarization that incorporates both discourse and coreference relations.

Extractive Summarization

Modeling and Utilizing User's Internal State in Movie Recommendation Dialogue

no code implementations 5 Dec 2020 Takashi Kodama, Ribeka Tanaka, Sadao Kurohashi

In this paper, we model the UIS in dialogues, taking movie recommendation dialogues as examples, and construct a dialogue system that changes its response based on the UIS.

Movie Recommendation

Native-like Expression Identification by Contrasting Native and Proficient Second Language Speakers

no code implementations COLING 2020 Oleksandr Harust, Yugo Murawaki, Sadao Kurohashi

We propose a novel task of native-like expression identification by contrasting texts written by native speakers and those by proficient second language speakers.

BERT-based Cohesion Analysis of Japanese Texts

1 code implementation COLING 2020 Nobuhiro Ueda, Daisuke Kawahara, Sadao Kurohashi

The meaning of natural language text is supported by cohesion among various kinds of entities, including coreference relations, predicate-argument structures, and bridging anaphora relations.


Minimize Exposure Bias of Seq2Seq Models in Joint Entity and Relation Extraction

1 code implementation Findings (ACL) 2020 Ranran Haoran Zhang, Qianying Liu, Aysa Xuemo Fan, Heng Ji, Daojian Zeng, Fei Cheng, Daisuke Kawahara, Sadao Kurohashi

We propose a novel Sequence-to-Unordered-Multi-Tree (Seq2UMTree) model to minimize the effects of exposure bias by limiting the decoding length to three within a triplet and removing the order among triplets.

Joint Entity and Relation Extraction
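The idea of removing the order among triplets, as this abstract describes, amounts to treating the gold triplets as an unordered set so no decoding order is penalized. The sketch below illustrates set-based triplet matching under that assumption; the actual Seq2UMTree decoder is not reproduced here, and the example triplets are hypothetical.

```python
def triplet_set_match(predicted, gold):
    """Compare predicted and gold (subject, relation, object) triplets
    as unordered sets, so the order in which triplets are produced
    does not matter. Each triplet has a fixed length of three."""
    pred_set, gold_set = set(predicted), set(gold)
    correct = len(pred_set & gold_set)
    precision = correct / len(pred_set) if pred_set else 0.0
    recall = correct / len(gold_set) if gold_set else 0.0
    return precision, recall

gold = [("Kyoto", "located_in", "Japan"), ("Kurohashi", "works_at", "Kyoto University")]
pred = [("Kurohashi", "works_at", "Kyoto University"), ("Kyoto", "located_in", "Japan")]
print(triplet_set_match(pred, gold))  # (1.0, 1.0): order does not matter
```

This is the property that mitigates exposure bias: a seq2seq model forced to emit triplets in one canonical order is penalized for correct-but-reordered outputs, while a set-based target is not.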

Building a Japanese Typo Dataset from Wikipedia's Revision History

no code implementations ACL 2020 Yu Tanaka, Yugo Murawaki, Daisuke Kawahara, Sadao Kurohashi

User generated texts contain many typos for which correction is necessary for NLP systems to work.

JASS: Japanese-specific Sequence to Sequence Pre-training for Neural Machine Translation

1 code implementation LREC 2020 Zhuoyuan Mao, Fabien Cromieres, Raj Dabre, Haiyue Song, Sadao Kurohashi

Monolingual pre-training approaches such as MASS (MAsked Sequence to Sequence) are extremely effective in boosting NMT quality for languages with small parallel corpora.

Machine Translation NMT +2

Acquiring Social Knowledge about Personality and Driving-related Behavior

no code implementations LREC 2020 Ritsuko Iwai, Daisuke Kawahara, Takatsune Kumada, Sadao Kurohashi

Using them, we automatically extracted collocations between personality descriptors and driving-related behavior from a driving behavior and subjectivity corpus (1,803,328 sentences after filtering) and obtained 5,334 unique collocations.

Development of a Japanese Personality Dictionary based on Psychological Methods

no code implementations LREC 2020 Ritsuko Iwai, Daisuke Kawahara, Takatsune Kumada, Sadao Kurohashi

In this study, we collect personality words, using word embeddings, and construct a personality dictionary with weights for Big Five traits.

Word Embeddings

Adapting BERT to Implicit Discourse Relation Classification with a Focus on Discourse Connectives

no code implementations LREC 2020 Yudai Kishimoto, Yugo Murawaki, Sadao Kurohashi

BERT, a neural network-based language model pre-trained on large corpora, is a breakthrough in natural language processing, significantly outperforming previous state-of-the-art models in numerous tasks.

General Classification Implicit Discourse Relation Classification +2

Pre-training via Leveraging Assisting Languages and Data Selection for Neural Machine Translation

no code implementations 23 Jan 2020 Haiyue Song, Raj Dabre, Zhuoyuan Mao, Fei Cheng, Sadao Kurohashi, Eiichiro Sumita

To this end, we propose to exploit monolingual corpora of other languages to complement the scarcity of monolingual corpora for the LOI.

Machine Translation NMT +1

Coursera Corpus Mining and Multistage Fine-Tuning for Improving Lectures Translation

1 code implementation LREC 2020 Haiyue Song, Raj Dabre, Atsushi Fujita, Sadao Kurohashi

To address this, we examine a language independent framework for parallel corpus mining which is a quick and effective way to mine a parallel corpus from publicly available lectures at Coursera.

Benchmarking Domain Adaptation +3

Emotion helps Sentiment: A Multi-task Model for Sentiment and Emotion Analysis

no code implementations 28 Nov 2019 Abhishek Kumar, Asif Ekbal, Daisuke Kawahara, Sadao Kurohashi

Our network also boosts the performance of emotion analysis by 5 F-score points on Stance Sentiment Emotion Corpus.

Emotion Recognition Sentiment Analysis

Automatically Neutralizing Subjective Bias in Text

1 code implementation 21 Nov 2019 Reid Pryzant, Richard Diehl Martinez, Nathan Dass, Sadao Kurohashi, Dan Jurafsky, Diyi Yang

To address this issue, we introduce a novel testbed for natural language generation: automatically bringing inappropriately subjective text into a neutral point of view ("neutralizing" biased text).

Text Generation

Machine Comprehension Improves Domain-Specific Japanese Predicate-Argument Structure Analysis

no code implementations WS 2019 Norio Takahashi, Tomohide Shibata, Daisuke Kawahara, Sadao Kurohashi

To improve the accuracy of predicate-argument structure (PAS) analysis, large-scale training data and knowledge for PAS analysis are indispensable.

Reading Comprehension

Overview of the 6th Workshop on Asian Translation

no code implementations WS 2019 Toshiaki Nakazawa, Nobushige Doi, Shohei Higashiyama, Chenchen Ding, Raj Dabre, Hideya Mino, Isao Goto, Win Pa Pa, Anoop Kunchukuttan, Yusuke Oda, Shantipriya Parida, Ondřej Bojar, Sadao Kurohashi

This paper presents the results of the shared tasks from the 6th workshop on Asian translation (WAT2019) including Ja↔En, Ja↔Zh scientific paper translation subtasks, Ja↔En, Ja↔Ko, Ja↔En patent translation subtasks, Hi↔En, My↔En, Km↔En, Ta↔En mixed domain subtasks and Ru↔Ja news commentary translation task.


Minimally Supervised Learning of Affective Events Using Discourse Relations

no code implementations IJCNLP 2019 Jun Saito, Yugo Murawaki, Sadao Kurohashi

Recognizing affective events that trigger positive or negative sentiment has a wide range of natural language processing applications but remains a challenging problem mainly because the polarity of an event is not necessarily predictable from its constituent words.

Juman++: A Morphological Analysis Toolkit for Scriptio Continua

1 code implementation EMNLP 2018 Arseny Tolmachev, Daisuke Kawahara, Sadao Kurohashi

We present a three-part toolkit for developing morphological analyzers for languages without natural word boundaries.

Art Analysis Language Modelling +2

A Multi-task Ensemble Framework for Emotion, Sentiment and Intensity Prediction

no code implementations 3 Aug 2018 Md. Shad Akhtar, Deepanway Ghosal, Asif Ekbal, Pushpak Bhattacharyya, Sadao Kurohashi

In this paper, through a multi-task ensemble framework we address three problems of emotion and sentiment analysis, i.e., "emotion classification & intensity", "valence, arousal & dominance for emotion", and "valence & arousal" for sentiment.

Emotion Classification General Classification +1

Entity-Centric Joint Modeling of Japanese Coreference Resolution and Predicate Argument Structure Analysis

no code implementations ACL 2018 Tomohide Shibata, Sadao Kurohashi

Our experimental results demonstrate the proposed method can improve the performance of the inter-sentential zero anaphora resolution drastically, which is a notoriously difficult task in predicate argument structure analysis.

coreference-resolution Reading Comprehension

Neural Adversarial Training for Semi-supervised Japanese Predicate-argument Structure Analysis

no code implementations ACL 2018 Shuhei Kurita, Daisuke Kawahara, Sadao Kurohashi

Japanese predicate-argument structure (PAS) analysis involves zero anaphora resolution, which is notoriously difficult.

MMCR4NLP: Multilingual Multiway Corpora Repository for Natural Language Processing

1 code implementation 3 Oct 2017 Raj Dabre, Sadao Kurohashi

Multilinguality is gradually becoming ubiquitous in the sense that more and more researchers have successfully shown that using additional languages helps improve the results in many Natural Language Processing tasks.

Machine Translation Multilingual NLP +2

Enabling Multi-Source Neural Machine Translation By Concatenating Source Sentences In Multiple Languages

no code implementations MTSummit 2017 Raj Dabre, Fabien Cromieres, Sadao Kurohashi

In this paper, we explore a simple solution to "Multi-Source Neural Machine Translation" (MSNMT) which only relies on preprocessing a N-way multilingual corpus without modifying the Neural Machine Translation (NMT) architecture or training procedure.

Machine Translation NMT +1

An Empirical Comparison of Simple Domain Adaptation Methods for Neural Machine Translation

no code implementations 12 Jan 2017 Chenhui Chu, Raj Dabre, Sadao Kurohashi

In this paper, we propose a novel domain adaptation method named "mixed fine tuning" for neural machine translation (NMT).

Domain Adaptation Machine Translation +2

Consistent Word Segmentation, Part-of-Speech Tagging and Dependency Labelling Annotation for Chinese Language

no code implementations COLING 2016 Mo Shen, Wingmui Li, HyunJeong Choe, Chenhui Chu, Daisuke Kawahara, Sadao Kurohashi

In this paper, we propose a new annotation approach to Chinese word segmentation, part-of-speech (POS) tagging and dependency labelling that aims to overcome the two major issues in traditional morphology-based annotation: Inconsistency and data sparsity.

Chinese Word Segmentation Machine Translation +4

Supervised Syntax-based Alignment between English Sentences and Abstract Meaning Representation Graphs

no code implementations 7 Jun 2016 Chenhui Chu, Sadao Kurohashi

As alignment links are not given between English sentences and Abstract Meaning Representation (AMR) graphs in the AMR annotation, automatic alignment becomes indispensable for training an AMR parser.

AMR Parsing

Parallel Sentence Extraction from Comparable Corpora with Neural Network Features

no code implementations LREC 2016 Chenhui Chu, Raj Dabre, Sadao Kurohashi

Parallel corpora are crucial for machine translation (MT), however they are quite scarce for most language pairs and domains.

Machine Translation Translation

ASPEC: Asian Scientific Paper Excerpt Corpus

no code implementations LREC 2016 Toshiaki Nakazawa, Manabu Yaguchi, Kiyotaka Uchimoto, Masao Utiyama, Eiichiro Sumita, Sadao Kurohashi, Hitoshi Isahara

In this paper, we describe the details of the ASPEC (Asian Scientific Paper Excerpt Corpus), which is the first large-size parallel corpus of scientific paper domain.

Machine Translation Translation

Bilingual Dictionary Construction with Transliteration Filtering

no code implementations LREC 2014 John Richardson, Toshiaki Nakazawa, Sadao Kurohashi

In this paper we present a bilingual transliteration lexicon of 170K Japanese-English technical terms in the scientific domain.

Translation Transliteration

A Large Scale Database of Strongly-related Events in Japanese

no code implementations LREC 2014 Tomohide Shibata, Shotaro Kohama, Sadao Kurohashi

This paper presents a large scale database of strongly-related events in Japanese, which has been acquired with our proposed method (Shibata and Kurohashi, 2011).

Common Sense Reasoning coreference-resolution +1

Constructing a Chinese-Japanese Parallel Corpus from Wikipedia

no code implementations LREC 2014 Chenhui Chu, Toshiaki Nakazawa, Sadao Kurohashi

Using the system, we construct a Chinese-Japanese parallel corpus with more than 126k highly accurate parallel sentences from Wikipedia.

Machine Translation Translation
