Search Results for author: Atsuki Yamaguchi

Found 14 papers, 10 papers with code

Vocabulary Expansion of Chat Models with Unlabeled Target Language Data

1 code implementation • 16 Dec 2024 • Atsuki Yamaguchi, Terufumi Morishita, Aline Villavicencio, Nikolaos Aletras

In this paper, we investigate, for the first time, the impact of using unlabeled target language data for vocabulary expansion (VE) on chat models.
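As a rough illustration of the vocabulary expansion step itself (a generic Hugging Face sketch, not the paper's specific adaptation pipeline; the model name and token list below are placeholders):

```python
# Generic vocabulary-expansion (VE) sketch using Hugging Face transformers.
# The model name and new tokens are illustrative placeholders, not the paper's setup.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "meta-llama/Llama-2-7b-chat-hf"  # placeholder chat model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Target-language tokens would normally be learned from unlabeled target-language
# text (e.g., by training an auxiliary tokenizer); hard-coded here for brevity.
new_tokens = ["▁ありがとう", "▁こんにちは"]
num_added = tokenizer.add_tokens(new_tokens)

# Grow the embedding matrix so the new tokens get rows, which are then
# trained further on the unlabeled target-language data.
model.resize_token_embeddings(len(tokenizer))
print(f"added {num_added} tokens, vocab size is now {len(tokenizer)}")
```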

Enhancing Reasoning Capabilities of LLMs via Principled Synthetic Logic Corpus

2 code implementations • 19 Nov 2024 • Terufumi Morishita, Gaku Morio, Atsuki Yamaguchi, Yasuhiro Sogawa

Large language models (LLMs) are capable of solving a wide range of tasks, yet they have struggled with reasoning.

Formal Logic, Logical Reasoning +1

An Empirical Study on Cross-lingual Vocabulary Adaptation for Efficient Language Model Inference

1 code implementation • 16 Feb 2024 • Atsuki Yamaguchi, Aline Villavicencio, Nikolaos Aletras

We also show that adapting LLMs that have been pre-trained on more balanced multilingual data results in downstream performance comparable to that of the original models.

Language Modeling, Language Modelling +1

appjsonify: An Academic Paper PDF-to-JSON Conversion Toolkit

1 code implementation • 2 Oct 2023 • Atsuki Yamaguchi, Terufumi Morishita

We present appjsonify, a Python-based PDF-to-JSON conversion toolkit for academic papers.

Document Layout Analysis

Learning Deductive Reasoning from Synthetic Corpus based on Formal Logic

3 code implementations • 11 Aug 2023 • Terufumi Morishita, Gaku Morio, Atsuki Yamaguchi, Yasuhiro Sogawa

We rethink this and adopt a well-grounded set of deduction rules based on formal logic theory, which can derive any other deduction rules when combined in a multistep way.

Formal Logic, Logical Reasoning
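As a loose illustration of what chaining basic deduction rules in multiple steps looks like (a toy forward-chaining example, not the paper's corpus-generation procedure):

```python
# Toy multistep deduction by repeatedly applying modus ponens.
# Illustrative only; not the synthetic-corpus generator from the paper.
facts = {"A"}
rules = [("A", "B"), ("B", "C"), ("C", "D")]  # each pair reads "premise -> conclusion"

changed = True
while changed:
    changed = False
    for premise, conclusion in rules:
        if premise in facts and conclusion not in facts:
            facts.add(conclusion)  # one modus ponens step
            changed = True

print(sorted(facts))  # ['A', 'B', 'C', 'D']: 'D' only follows via a multistep chain
```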

How do different tokenizers perform on downstream tasks in scriptio continua languages?: A case study in Japanese

1 code implementation • 16 Jun 2023 • Takuro Fujii, Koki Shibata, Atsuki Yamaguchi, Terufumi Morishita, Yasuhiro Sogawa

This paper investigates the effect of tokenizers on the downstream performance of pretrained language models (PLMs) in scriptio continua languages where no explicit spaces exist between words, using Japanese as a case study.
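To make the setting concrete, here is a minimal illustration of why tokenization is non-trivial without whitespace (the Japanese sentence and the word-level split are hand-written examples, not output of the tokenizers compared in the paper):

```python
# Segmenting scriptio continua text: no whitespace marks word boundaries.
sentence = "私は東京大学に行った"  # "I went to the University of Tokyo"

# Character-level tokenization needs no segmenter but yields long sequences.
char_tokens = list(sentence)

# A word/morpheme-level split depends on the chosen segmenter; this one is
# written out by hand for illustration.
word_tokens = ["私", "は", "東京大学", "に", "行っ", "た"]

print(char_tokens)  # ['私', 'は', '東', '京', '大', '学', 'に', '行', 'っ', 'た']
print(word_tokens)
```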

How does the task complexity of masked pretraining objectives affect downstream performance?

1 code implementation • 18 May 2023 • Atsuki Yamaguchi, Hiroaki Ozaki, Terufumi Morishita, Gaku Morio, Yasuhiro Sogawa

Masked language modeling (MLM) is a widely used self-supervised pretraining objective, where a model must predict the original tokens that have been replaced with a mask, given the surrounding context.

Language Modeling, Language Modelling +1
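For reference, a minimal sketch of the standard MLM input corruption described above (the conventional ~15% masking; the paper studies variants of this objective rather than this exact recipe):

```python
import random

MASK = "[MASK]"
MASK_PROB = 0.15  # conventional masking rate; a simplification of BERT's 80/10/10 scheme

def mask_tokens(tokens, rng=random.Random(0)):
    """Replace ~15% of tokens with [MASK]; the model must recover the originals."""
    inputs, labels = [], []
    for tok in tokens:
        if rng.random() < MASK_PROB:
            inputs.append(MASK)
            labels.append(tok)    # supervision target at the masked position
        else:
            inputs.append(tok)
            labels.append(None)   # position ignored by the loss
    return inputs, labels

print(mask_tokens("the cat sat on the mat".split()))
```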

Hitachi at SemEval-2023 Task 3: Exploring Cross-lingual Multi-task Strategies for Genre and Framing Detection in Online News

no code implementations • 3 Mar 2023 • Yuta Koreeda, Ken-ichi Yokote, Hiroaki Ozaki, Atsuki Yamaguchi, Masaya Tsunokake, Yasuhiro Sogawa

Based on the multilingual, multi-task nature of the task and the low-resource setting, we investigated different cross-lingual and multi-task strategies for training the pretrained language models.

Frustratingly Simple Pretraining Alternatives to Masked Language Modeling

1 code implementation • EMNLP 2021 • Atsuki Yamaguchi, George Chrysostomou, Katerina Margatina, Nikolaos Aletras

Masked language modeling (MLM), a self-supervised pretraining objective, is widely used in natural language processing for learning text representations.

Language Modeling, Language Modelling +2

Dialogue Act-based Breakdown Detection in Negotiation Dialogues

1 code implementation • EACL 2021 • Atsuki Yamaguchi, Kosui Iwasa, Katsuhide Fujita

Thanks to the success of goal-oriented negotiation dialogue systems, studies of negotiation dialogue have gained momentum in terms of both human-human negotiation support and dialogue systems.
