Search Results for author: Tatsuya Hiraoka

Found 18 papers, 8 papers with code

Word-level Perturbation Considering Word Length and Compositional Subwords

1 code implementation · Findings (ACL) 2022 · Tatsuya Hiraoka, Sho Takase, Kei Uchiumi, Atsushi Keyaki, Naoaki Okazaki

We present two simple modifications for word-level perturbation: Word Replacement considering Length (WR-L) and Compositional Word Replacement (CWR). In conventional word replacement, a word in the input is replaced with a word sampled from the entire vocabulary, regardless of the length and context of the target word. WR-L considers the length of the target word by sampling words from a Poisson distribution. CWR considers compositional candidates by restricting the sampling source to related words that appear during subword regularization. Experimental results showed that the combination of WR-L and CWR improved the performance of text classification and machine translation.

Tasks: Machine Translation, Text Classification, +2
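
A minimal sketch of length-aware word replacement in the spirit of WR-L described above, assuming a toy vocabulary indexed by word length; the vocabulary, perturbation probability, and helper names are illustrative, not the paper's implementation.

```python
import random
from collections import defaultdict

import numpy as np

# Toy vocabulary; in practice this would be the training vocabulary.
VOCAB = ["the", "a", "cat", "dog", "house", "garden", "beautiful", "run", "quickly"]

# Index vocabulary entries by length so a replacement can be drawn
# whose length is close to that of the target word.
BY_LENGTH = defaultdict(list)
for w in VOCAB:
    BY_LENGTH[len(w)].append(w)

def replace_considering_length(target: str, rng: np.random.Generator) -> str:
    """Sample a replacement whose length is drawn from a Poisson
    distribution centred on the target word's length (WR-L-style sketch)."""
    length = int(rng.poisson(lam=len(target)))
    # Fall back to the nearest populated length bucket if needed.
    candidates = BY_LENGTH.get(length) or min(
        BY_LENGTH.values(), key=lambda ws: abs(len(ws[0]) - length)
    )
    return random.choice(candidates)

rng = np.random.default_rng(0)
sentence = ["the", "cat", "sat", "in", "the", "garden"]
# Perturb each word with probability 0.1 (illustrative rate).
perturbed = [
    replace_considering_length(w, rng) if rng.random() < 0.1 else w
    for w in sentence
]
print(perturbed)
```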

RECALL: Library-Like Behavior In Language Models is Enhanced by Self-Referencing Causal Cycles

1 code implementation · 23 Jan 2025 · Munachiso Nwadike, Zangir Iklassov, Toluwani Aremu, Tatsuya Hiraoka, Velibor Bojkovic, Benjamin Heinzerling, Hilal Alqaubeh, Martin Takáč, Kentaro Inui

We introduce the concept of the self-referencing causal cycle (abbreviated RECALL), a mechanism that enables large language models (LLMs) to bypass the limitations of unidirectional causality, which underlies a phenomenon known as the reversal curse.

The Geometry of Numerical Reasoning: Language Models Compare Numeric Properties in Linear Subspaces

no code implementations · 17 Oct 2024 · Ahmed Oumar El-Shangiti, Tatsuya Hiraoka, Hilal AlQuabeh, Benjamin Heinzerling, Kentaro Inui

This paper investigates whether large language models (LLMs) utilize numerical attributes encoded in a low-dimensional subspace of the embedding space when answering logical comparison questions (e.g., Was Cristiano born before Messi?).
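
The claim that a numeric property lives in a low-dimensional linear subspace can be checked with a linear probe from hidden states to the attribute value. The sketch below uses synthetic stand-in embeddings rather than real LLM activations; the dimensionalities, variable names, and noise level are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for entity representations from an LLM (e.g., last-token
# hidden states for "Cristiano Ronaldo", "Lionel Messi", ...).
n_entities, hidden_dim = 200, 64
H = rng.normal(size=(n_entities, hidden_dim))

# Pretend birth years are encoded along one latent direction plus noise.
direction = rng.normal(size=hidden_dim)
birth_year = H @ direction + rng.normal(scale=0.1, size=n_entities)

# Fit a linear probe: if the attribute is linearly decodable, the probe
# recovers it on held-out entities, supporting the linear-subspace view.
train, test = slice(0, 150), slice(150, None)
w, *_ = np.linalg.lstsq(H[train], birth_year[train], rcond=None)
pred = H[test] @ w

corr = np.corrcoef(pred, birth_year[test])[0, 1]
print(f"probe correlation on held-out entities: {corr:.3f}")

# A comparison question "was A born before B?" then reduces to
# comparing the probed values for the two entities.
print("A before B?", pred[0] < pred[1])
```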

Repetition Neurons: How Do Language Models Produce Repetitions?

no code implementations · 17 Oct 2024 · Tatsuya Hiraoka, Kentaro Inui

This paper introduces repetition neurons, regarded as skill neurons responsible for the repetition problem in text generation tasks.

Tasks: In-Context Learning, Text Generation
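
One simple way to look for repetition-related neurons, in the spirit of the abstract above, is to compare each neuron's mean activation on repeating versus non-repeating continuations. The sketch below operates on pre-collected activation matrices (here random placeholders with one injected signal); gathering real activations from an LLM's feed-forward layers is out of scope and the sizes are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_neurons = 4096  # e.g., the FFN width of one transformer layer

# Placeholder activations: rows are token positions, columns are neurons.
# act_repeat would be gathered while the model produces a repetition,
# act_normal while it produces ordinary text.
act_repeat = rng.normal(size=(500, n_neurons))
act_normal = rng.normal(size=(500, n_neurons))
# Inject a synthetic "repetition neuron" so the ranking finds something.
act_repeat[:, 123] += 2.0

# Rank neurons by how much more active they are during repetition.
gap = act_repeat.mean(axis=0) - act_normal.mean(axis=0)
top = np.argsort(gap)[::-1][:5]
print("candidate repetition neurons:", top)
print("activation gaps:", gap[top].round(2))
```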

LLM-jp: A Cross-organizational Project for the Research and Development of Fully Open Japanese LLMs

no code implementations · 4 Jul 2024 · LLM-jp, Akiko Aizawa, Eiji Aramaki, Bowen Chen, Fei Cheng, Hiroyuki Deguchi, Rintaro Enomoto, Kazuki Fujii, Kensuke Fukumoto, Takuya Fukushima, Namgi Han, Yuto Harada, Chikara Hashimoto, Tatsuya Hiraoka, Shohei Hisada, Sosuke Hosokawa, Lu Jie, Keisuke Kamata, Teruhito Kanazawa, Hiroki Kanezashi, Hiroshi Kataoka, Satoru Katsumata, Daisuke Kawahara, Seiya Kawano, Atsushi Keyaki, Keisuke Kiryu, Hirokazu Kiyomaru, Takashi Kodama, Takahiro Kubo, Yohei Kuga, Ryoma Kumon, Shuhei Kurita, Sadao Kurohashi, Conglong Li, Taiki Maekawa, Hiroshi Matsuda, Yusuke Miyao, Kentaro Mizuki, Sakae Mizuki, Yugo Murawaki, Akim Mousterou, Ryo Nakamura, Taishi Nakamura, Kouta Nakayama, Tomoka Nakazato, Takuro Niitsuma, Jiro Nishitoba, Yusuke Oda, Hayato Ogawa, Takumi Okamoto, Naoaki Okazaki, Yohei Oseki, Shintaro Ozaki, Koki Ryu, Rafal Rzepka, Keisuke Sakaguchi, Shota Sasaki, Satoshi Sekine, Kohei Suda, Saku Sugawara, Issa Sugiura, Hiroaki Sugiyama, Hisami Suzuki, Jun Suzuki, Toyotaro Suzumura, Kensuke Tachibana, Yu Takagi, Kyosuke Takami, Koichi Takeda, Masashi Takeshita, Masahiro Tanaka, Kenjiro Taura, Arseny Tolmachev, Nobuhiro Ueda, Zhen Wan, Shuntaro Yada, Sakiko Yahata, Yuya Yamamoto, Yusuke Yamauchi, Hitomi Yanaka, Rio Yokota, Koichiro Yoshino

This paper introduces LLM-jp, a cross-organizational project for the research and development of Japanese large language models (LLMs).

An Analysis of BPE Vocabulary Trimming in Neural Machine Translation

no code implementations · 30 Mar 2024 · Marco Cognetta, Tatsuya Hiraoka, Naoaki Okazaki, Rico Sennrich, Yuval Pinter

We explore threshold vocabulary trimming in Byte-Pair Encoding subword tokenization, a postprocessing step that replaces rare subwords with their component subwords.

Tasks: Machine Translation, Translation
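
A minimal sketch of threshold vocabulary trimming as described above: subwords whose corpus frequency falls below a threshold are removed from the vocabulary, and their occurrences are re-expressed with their two BPE component subwords. The merge table, corpus, and threshold here are toy assumptions, not the paper's data.

```python
from collections import Counter

# Toy BPE merge table: merged subword -> its two component subwords.
MERGES = {"lower": ("low", "er"), "low": ("lo", "w"), "est": ("es", "t")}

# Toy tokenized corpus (already BPE-segmented).
corpus = [["lower", "est"], ["low", "er"], ["lower"], ["es", "t", "lo", "w"]]

freq = Counter(tok for sent in corpus for tok in sent)
THRESHOLD = 2  # subwords seen fewer than this many times get trimmed

def trim(token: str) -> list[str]:
    """Recursively replace a rare subword with its component subwords."""
    if freq[token] >= THRESHOLD or token not in MERGES:
        return [token]
    left, right = MERGES[token]
    return trim(left) + trim(right)

trimmed_corpus = [[piece for tok in sent for piece in trim(tok)] for sent in corpus]
print(trimmed_corpus)
```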

Constructing Multilingual Visual-Text Datasets Revealing Visual Multilingual Ability of Vision Language Models

no code implementations · 29 Mar 2024 · Jesse Atuhurra, Iqra Ali, Tatsuya Hiraoka, Hidetaka Kamigaito, Tomoya Iwakura, Taro Watanabe

Our contribution is four-fold: 1) we introduced nine vision-and-language (VL) tasks (including object recognition, image-text matching, and more) and constructed multilingual visual-text datasets in four languages (English, Japanese, Swahili, and Urdu) by filling templates containing questions and prompting GPT-4V to generate the answers and the rationales; 2) we introduced a new VL task named unrelatedness; 3) we introduced rationales to enable human understanding of the VLM reasoning process; and 4) we employed human evaluation to measure the suitability of the proposed datasets for VL tasks.

Tasks: Image-Text Matching, Object Recognition, +1
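
A sketch of the template-driven prompt construction described above, with hypothetical templates and a placeholder instead of an actual GPT-4V call; the template wording, task keys, and field names are illustrative, not the paper's.

```python
# Hypothetical question templates per VL task; the real templates, tasks,
# and languages follow the paper, not this sketch.
TEMPLATES = {
    "object_recognition": "What objects are visible in this image? Answer in {language}.",
    "image_text_matching": "Does the caption '{caption}' match this image? Answer in {language}.",
}

def build_prompt(task: str, language: str, **fields: str) -> str:
    """Fill a task template to obtain the question sent to the VLM."""
    return TEMPLATES[task].format(language=language, **fields)

prompt = build_prompt("image_text_matching", language="Swahili",
                      caption="a cat on a sofa")
print(prompt)
# The filled prompt plus the image would then be sent to GPT-4V, which is
# asked to return both an answer and a rationale (API call omitted here).
```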

Knowledge of Pretrained Language Models on Surface Information of Tokens

no code implementations · 15 Feb 2024 · Tatsuya Hiraoka, Naoaki Okazaki

Do pretrained language models have knowledge regarding the surface information of tokens?

Tasks: Decoder

Downstream Task-Oriented Neural Tokenizer Optimization with Vocabulary Restriction as Post Processing

no code implementations · 21 Apr 2023 · Tatsuya Hiraoka, Tomoya Iwakura

This paper proposes, as an example, a BiLSTM-based tokenizer with vocabulary restriction, which can capture wider contextual information during tokenization than the non-neural tokenization methods used in existing work.

Tasks: Text Classification
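
A minimal PyTorch sketch of the kind of BiLSTM boundary scorer such a tokenizer could build on: character embeddings are run through a bidirectional LSTM and each position receives a token-boundary probability. The vocabulary-restriction post-processing is only noted in a comment, and all hyperparameters are assumptions, not the paper's configuration.

```python
import torch
import torch.nn as nn

class BiLSTMBoundaryScorer(nn.Module):
    """Scores, for each character, the probability that a token ends there."""

    def __init__(self, n_chars: int, emb_dim: int = 32, hidden: int = 64):
        super().__init__()
        self.emb = nn.Embedding(n_chars, emb_dim)
        self.lstm = nn.LSTM(emb_dim, hidden, bidirectional=True, batch_first=True)
        self.out = nn.Linear(2 * hidden, 1)

    def forward(self, char_ids: torch.Tensor) -> torch.Tensor:
        h, _ = self.lstm(self.emb(char_ids))           # (batch, seq, 2*hidden)
        return torch.sigmoid(self.out(h)).squeeze(-1)  # boundary probabilities

# Toy usage: a 10-character "sentence" over a 100-symbol character inventory.
model = BiLSTMBoundaryScorer(n_chars=100)
chars = torch.randint(0, 100, (1, 10))
probs = model(chars)

# Greedy segmentation; a vocabulary-restriction step would additionally
# reject or re-split any resulting token that falls outside the allowed vocabulary.
boundaries = (probs[0] > 0.5).nonzero(as_tuple=True)[0].tolist()
print("boundary positions:", boundaries)
```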

MaxMatch-Dropout: Subword Regularization for WordPiece

1 code implementation · COLING 2022 · Tatsuya Hiraoka

We present a subword regularization method for WordPiece, which uses a maximum matching algorithm for tokenization.

Tasks: Machine Translation, Text Classification, +1
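
A minimal sketch of maximum-matching (WordPiece-style) tokenization with random dropout of matched subwords, which is the core idea described above; the toy vocabulary, dropout rate, and "##" continuation convention are illustrative rather than the paper's released implementation.

```python
import random

# Toy WordPiece-style vocabulary ("##" marks word-internal pieces).
VOCAB = {"un", "u", "##n", "##aff", "##a", "##f", "##able", "##b", "##l", "##e"}

def maxmatch_dropout(word: str, p_drop: float = 0.3, seed=None) -> list[str]:
    """Longest-match tokenization where each match is randomly rejected
    with probability p_drop, forcing a shorter (regularized) segmentation."""
    rng = random.Random(seed)
    pieces, start = [], 0
    while start < len(word):
        end = len(word)
        while end > start:
            piece = word[start:end] if start == 0 else "##" + word[start:end]
            # Accept the longest match unless dropout rejects it; single
            # characters are never dropped, so tokenization always terminates.
            if piece in VOCAB and (end - start == 1 or rng.random() >= p_drop):
                pieces.append(piece)
                break
            end -= 1
        else:
            return ["[UNK]"]  # no vocabulary piece matched at this position
        start = end
    return pieces

print(maxmatch_dropout("unaffable", p_drop=0.0, seed=0))  # deterministic max match
print(maxmatch_dropout("unaffable", p_drop=0.5, seed=1))  # a sampled segmentation
```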

Joint Optimization of Tokenization and Downstream Model

2 code implementations · Findings (ACL) 2021 · Tatsuya Hiraoka, Sho Takase, Kei Uchiumi, Atsushi Keyaki, Naoaki Okazaki

Since traditional tokenizers are isolated from the downstream task and model, they cannot produce a tokenization appropriate to that task and model, although recent studies imply that appropriate tokenization improves performance.

Tasks: Machine Translation, model, +3

Stochastic Tokenization with a Language Model for Neural Text Classification

no code implementations · ACL 2019 · Tatsuya Hiraoka, Hiroyuki Shindo, Yuji Matsumoto

To make the model robust against infrequent tokens, we sampled a segmentation for each sentence stochastically during training, which resulted in improved performance on text classification.

Tasks: General Classification, Language Modeling, +6
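
One standard way to sample a segmentation stochastically, in the spirit of the description above, is forward filtering followed by backward sampling over a unigram word model; the tiny vocabulary and probabilities below are illustrative, not the paper's language model.

```python
import random

# Toy unigram probabilities over candidate tokens.
UNIGRAM = {"un": 0.1, "believ": 0.05, "able": 0.1, "unbeliev": 0.02,
           "u": 0.01, "n": 0.01, "believable": 0.03}

def sample_segmentation(text: str, rng: random.Random) -> list[str]:
    """Sample a segmentation with probability proportional to the product
    of unigram token probabilities (forward-filter, backward-sample)."""
    n = len(text)
    alpha = [0.0] * (n + 1)   # total probability mass of segmenting text[:i]
    alpha[0] = 1.0
    for i in range(1, n + 1):
        for j in range(i):
            piece = text[j:i]
            if piece in UNIGRAM:
                alpha[i] += alpha[j] * UNIGRAM[piece]
    if alpha[n] == 0.0:
        raise ValueError("text cannot be segmented with this vocabulary")
    # Backward sampling: pick each final token proportionally to its contribution.
    pieces, i = [], n
    while i > 0:
        starts, weights = [], []
        for j in range(i):
            piece = text[j:i]
            if piece in UNIGRAM:
                starts.append(j)
                weights.append(alpha[j] * UNIGRAM[piece])
        j = rng.choices(starts, weights=weights)[0]
        pieces.append(text[j:i])
        i = j
    return list(reversed(pieces))

rng = random.Random(0)
for _ in range(3):
    print(sample_segmentation("unbelievable", rng))
```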
