no code implementations • EMNLP 2021 • Shun Kiyono, Sosuke Kobayashi, Jun Suzuki, Kentaro Inui
Position representation is crucial for building position-aware representations in Transformers.
no code implementations • 28 Dec 2023 • Sho Takase, Shun Kiyono, Sosuke Kobayashi, Jun Suzuki
Loss spikes often occur during pre-training of large language models.
1 code implementation • 1 Jun 2022 • Sho Takase, Shun Kiyono, Sosuke Kobayashi, Jun Suzuki
Recent Transformers tend to use Pre-LN because, in Post-LN with deep Transformers (e.g., those with ten or more layers), training is often unstable, resulting in useless models.
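The Pre-LN / Post-LN distinction refers to where layer normalization sits relative to each residual sublayer. A minimal sketch of the two orderings (the sublayer is a stand-in for attention or a feed-forward block; names are illustrative, not from the paper):

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    # normalize the last dimension to zero mean, unit variance
    mu = x.mean(-1, keepdims=True)
    var = x.var(-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def sublayer(x):
    # stand-in for self-attention or the FFN: any shape-preserving function
    return np.tanh(x)

def post_ln_block(x):
    # Post-LN (original Transformer): residual add first, then normalize
    return layer_norm(x + sublayer(x))

def pre_ln_block(x):
    # Pre-LN: normalize the sublayer input; the residual path is left
    # untouched, which tends to keep gradients well-scaled in deep stacks
    return x + sublayer(layer_norm(x))

x = np.random.default_rng(0).standard_normal((4, 16))
print(post_ln_block(x).shape)  # (4, 16)
print(pre_ln_block(x).shape)   # (4, 16)
```

The paper's point is precisely that these two placements behave very differently as depth grows, with Post-LN being the less stable of the two.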
1 code implementation • 31 May 2022 • Sosuke Kobayashi, Eiichi Matsumoto, Vincent Sitzmann
Emerging neural radiance fields (NeRF) are a promising scene representation for computer graphics, enabling high-quality 3D reconstruction and novel view synthesis from image observations.
no code implementations • BigScience (ACL) 2022 • Sosuke Kobayashi, Shun Kiyono, Jun Suzuki, Kentaro Inui
Ensembling is a popular method used to improve performance as a last resort.
no code implementations • 28 Sep 2021 • Hiroki Ouchi, Jun Suzuki, Sosuke Kobayashi, Sho Yokoi, Tatsuki Kuribayashi, Masashi Yoshikawa, Kentaro Inui
Interpretable rationales for model predictions are crucial in practical applications.
1 code implementation • 13 Sep 2021 • Shun Kiyono, Sosuke Kobayashi, Jun Suzuki, Kentaro Inui
Position representation is crucial for building position-aware representations in Transformers.
no code implementations • EMNLP (sustainlp) 2020 • Sosuke Kobayashi, Sho Yokoi, Jun Suzuki, Kentaro Inui
Understanding the influence of a training instance on a neural network model leads to improving interpretability.
1 code implementation • ACL 2020 • Hiroki Ouchi, Jun Suzuki, Sosuke Kobayashi, Sho Yokoi, Tatsuki Kuribayashi, Ryuto Konno, Kentaro Inui
Interpretable rationales for model predictions play a critical role in practical applications.
1 code implementation • NeurIPS 2020 • Sho Takase, Sosuke Kobayashi
The proposed method, ALONE (all word embeddings from one), constructs the embedding of a word by modifying the shared embedding with a filter vector, which is word-specific but non-trainable.
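A rough sketch of the ALONE idea as described above: every word shares one trainable embedding, which is modified element-wise by a word-specific, non-trainable filter vector and then passed through a trainable transform. The hash-seeded filter construction here is only illustrative (the paper derives filters from codewords):

```python
import hashlib
import numpy as np

rng = np.random.default_rng(0)
D = 8
shared = rng.standard_normal(D)    # the single shared (trainable) embedding
W = rng.standard_normal((D, D))    # trainable transform (FFN stand-in)

def filter_vector(word, d=D):
    # word-specific but NON-trainable: derived deterministically from the
    # word via a hash-seeded random vector (illustrative construction only)
    seed = int(hashlib.md5(word.encode()).hexdigest(), 16) % (2**32)
    return np.random.default_rng(seed).standard_normal(d)

def alone_embedding(word):
    # filter the shared embedding element-wise, then transform (ReLU)
    return np.maximum(W @ (shared * filter_vector(word)), 0.0)

print(alone_embedding("cat").shape)  # (8,)
```

The memory saving comes from storing one shared vector plus cheap deterministic filters instead of a full |V| x D embedding matrix.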
Ranked #3 on Text Summarization on DUC 2004 Task 1
no code implementations • ICLR Workshop LLD 2019 • Takuya Shimada, Shoichiro Yamaguchi, Kohei Hayashi, Sosuke Kobayashi
Data augmentation by mixing samples, such as Mixup, has been widely used, typically for classification tasks.
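For reference, sample mixing in the Mixup style forms a convex combination of two training examples and their labels, with the mixing weight drawn from a Beta distribution:

```python
import numpy as np

def mixup(x1, y1, x2, y2, alpha=0.2, rng=None):
    # Mixup: convex combination of two examples and their one-hot labels;
    # the mixing weight lambda is drawn from Beta(alpha, alpha)
    rng = rng or np.random.default_rng()
    lam = rng.beta(alpha, alpha)
    return lam * x1 + (1 - lam) * x2, lam * y1 + (1 - lam) * y2

x1, y1 = np.array([1.0, 0.0]), np.array([1.0, 0.0])
x2, y2 = np.array([0.0, 1.0]), np.array([0.0, 1.0])
xm, ym = mixup(x1, y1, x2, y2, rng=np.random.default_rng(0))
print(ym.sum())  # mixed labels still sum to 1.0
```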
2 code implementations • 22 Nov 2018 • Masaki Saito, Shunta Saito, Masanori Koyama, Sosuke Kobayashi
Training a Generative Adversarial Network (GAN) on a video dataset is challenging because of the sheer size of the dataset and the complexity of each observation.
1 code implementation • 28 Oct 2018 • Riku Arakawa, Sosuke Kobayashi, Yuya Unno, Yuta Tsuboi, Shin-ichi Maeda
A remedy for this is to train an agent with real-time feedback from a human observer who immediately gives rewards for some actions.
no code implementations • EMNLP 2018 • Sho Yokoi, Sosuke Kobayashi, Kenji Fukumizu, Jun Suzuki, Kentaro Inui
Just as PMI is derived from mutual information, we derive this new measure from the Hilbert–Schmidt independence criterion (HSIC); thus, we call the new measure the pointwise HSIC (PHSIC).
2 code implementations • NAACL 2018 • Sosuke Kobayashi
We stochastically replace words with other words that are predicted by a bi-directional language model at the word positions.
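The replacement scheme described above can be sketched as follows; `predict_words` is a hypothetical stand-in for the bidirectional language model, returning candidate words and their probabilities at a position given both-side context:

```python
import numpy as np

def contextual_augment(tokens, predict_words, p=0.15, rng=None):
    # Stochastically replace each position with a word sampled from a
    # bidirectional LM's predictive distribution at that position.
    # `predict_words(tokens, i)` is an assumed interface: it returns
    # (candidate_words, probabilities) for position i.
    rng = rng or np.random.default_rng()
    out = list(tokens)
    for i in range(len(tokens)):
        if rng.random() < p:
            words, probs = predict_words(tokens, i)
            out[i] = rng.choice(words, p=probs)
    return out

# toy "LM" that proposes fixed alternatives regardless of context
def toy_lm(tokens, i):
    return ["good", "great", "nice"], [0.5, 0.3, 0.2]

print(contextual_augment(["the", "movie", "is", "good"], toy_lm, p=0.5,
                         rng=np.random.default_rng(1)))
```

In the paper the language model is additionally conditioned on the example's label, so that sampled replacements stay label-compatible; the toy model above ignores that detail.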
no code implementations • ACL 2018 • Reina Akama, Kento Watanabe, Sho Yokoi, Sosuke Kobayashi, Kentaro Inui
This paper presents the first study aimed at capturing stylistic similarity between words in an unsupervised manner.
no code implementations • IJCNLP 2017 • Reina Akama, Kazuaki Inada, Naoya Inoue, Sosuke Kobayashi, Kentaro Inui
We propose a novel, data-driven, and stylistically consistent dialog response generation system.
1 code implementation • 17 Oct 2017 • Jun Hatori, Yuta Kikuchi, Sosuke Kobayashi, Kuniyuki Takahashi, Yuta Tsuboi, Yuya Unno, Wilson Ko, Jethro Tan
In this paper, we propose the first comprehensive system that can handle unconstrained spoken language and is able to effectively resolve ambiguity in spoken instructions.
1 code implementation • IJCNLP 2017 • Sosuke Kobayashi, Naoaki Okazaki, Kentaro Inui
This study addresses the problem of identifying the meaning of unknown words or entities in a discourse with respect to the word embedding approaches used in neural language models.
no code implementations • WS 2017 • Melissa Roemmele, Sosuke Kobayashi, Naoya Inoue, Andrew Gordon
In this paper we present a system that performs this task using a supervised binary classifier on top of a recurrent neural network to predict the probability that a given story ending is correct.