no code implementations • 1 Apr 2018 • Jiacen Zhang, Nakamasa Inoue, Koichi Shinoda
I-vector based text-independent speaker verification (SV) systems often have poor performance with short utterances, as the biased phonetic distribution in a short utterance makes the extracted i-vector unreliable.
no code implementations • 30 May 2018 • Thao Minh Le, Nakamasa Inoue, Koichi Shinoda
This paper presents a new framework for human action recognition from a 3D skeleton sequence.
Ranked #99 on Skeleton Based Action Recognition on NTU RGB+D
no code implementations • 19 Jul 2018 • Nakamasa Inoue, Koichi Shinoda
Few-shot adaptation provides robust parameter estimation with few training examples, by optimizing the parameters of zero-shot learning and supervised many-shot learning simultaneously.
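The listing gives no formula for how the zero-shot and many-shot objectives are combined; as a hedged illustration of the general idea only, a MAP-style interpolation between a prior (zero-shot) estimate and the few-shot sample mean might look like the following. The function name and the `kappa` weight are assumptions for illustration, not the paper's method.

```python
import numpy as np

def map_adapt(prior_mean, samples, kappa):
    """Interpolate between a zero-shot prior mean and the few-shot
    sample mean; kappa controls how much the prior is trusted.
    (Illustrative sketch, not the paper's exact objective.)"""
    n = len(samples)
    sample_mean = np.mean(samples, axis=0)
    return (n * sample_mean + kappa * prior_mean) / (n + kappa)

prior = np.array([0.0, 0.0])
few = np.array([[2.0, 2.0], [4.0, 0.0]])  # only two training examples
adapted = map_adapt(prior, few, kappa=2.0)
# halfway between the prior [0, 0] and the sample mean [3, 1]
```

With more samples the estimate moves toward the data; with none it falls back to the prior, which is the robustness property the abstract describes.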
no code implementations • 12 Nov 2018 • Raden Mu'az Mun'im, Nakamasa Inoue, Koichi Shinoda
We investigate the feasibility of sequence-level knowledge distillation of Sequence-to-Sequence (Seq2Seq) models for Large Vocabulary Continuous Speech Recognition (LVCSR).
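In sequence-level knowledge distillation, the student is trained on the teacher's own decoded hypotheses (e.g. beam-search outputs) instead of the reference transcripts. A toy sketch of building such a distilled training set; the `teacher_decode` stand-in is an assumption, not the paper's model:

```python
def distill_dataset(inputs, teacher_decode):
    """Sequence-level KD: pair each input with the teacher's decoded
    hypothesis rather than the ground-truth transcript, then train
    the student on these pairs with ordinary cross-entropy."""
    return [(x, teacher_decode(x)) for x in inputs]

# toy "teacher": maps each feature id to a token sequence
teacher = lambda x: ["<s>"] + [f"tok{v}" for v in x] + ["</s>"]
pairs = distill_dataset([[1, 2], [3]], teacher)
```

The appeal for LVCSR is that the student only ever fits sequences the teacher can actually produce, which simplifies the learning target.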
no code implementations • 29 Feb 2020 • Takehiko Ohkawa, Naoto Inoue, Hirokatsu Kataoka, Nakamasa Inoue
Herein, we propose Augmented Cyclic Consistency Regularization (ACCR), a novel regularization method for unpaired I2I translation.
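ACCR's exact formulation is not given in this snippet; the sketch below shows only the plain cycle-consistency term that such regularization builds on, where an image translated A→B→A should reconstruct itself. The translator lambdas are toy stand-ins for trained generators:

```python
import numpy as np

def cycle_loss(x, g_ab, g_ba):
    """Plain cycle-consistency: translate A -> B -> A and penalize the
    reconstruction error. (ACCR additionally regularizes with data
    augmentation; that part is omitted here.)"""
    return float(np.mean(np.abs(g_ba(g_ab(x)) - x)))

# toy mutually inverse "translators": scale by 2, then by 0.5
x = np.array([1.0, -2.0, 3.0])
loss = cycle_loss(x, lambda v: 2 * v, lambda v: 0.5 * v)
# perfect inversion gives zero loss
```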
no code implementations • 16 Apr 2020 • Mariana Rodrigues Makiuchi, Tifani Warnita, Nakamasa Inoue, Koichi Shinoda, Michitaka Yoshimura, Momoko Kitazawa, Kei Funaki, Yoko Eguchi, Taishiro Kishimoto
We propose a non-invasive and cost-effective method to automatically detect dementia using only speech audio data.
no code implementations • 19 Jan 2021 • Nakamasa Inoue, Eisuke Yamagata, Hirokatsu Kataoka
Our main idea is to initialize the network parameters by solving an artificial noise classification problem, where the aim is to classify Perlin noise samples into their noise categories.
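A rough sketch of the idea of building a labeled pre-training set from procedural noise, using bilinearly interpolated value noise as a simplified stand-in for true Perlin gradient noise; treating the grid resolutions as the "noise categories" is an assumption for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

def value_noise(res, size=16):
    """Smooth noise image from bilinear upsampling of a random
    (res+1) x (res+1) grid. (Simplification: the paper uses true
    Perlin gradient noise, not value noise.)"""
    grid = rng.standard_normal((res + 1, res + 1))
    t = np.linspace(0, res, size, endpoint=False)
    i, f = t.astype(int), t % 1
    a = grid[np.ix_(i, i)]
    b = grid[np.ix_(i, i + 1)]
    c = grid[np.ix_(i + 1, i)]
    d = grid[np.ix_(i + 1, i + 1)]
    top = a * (1 - f) + b * f                         # interpolate along x
    bot = c * (1 - f) + d * f
    return top * (1 - f[:, None]) + bot * f[:, None]  # interpolate along y

resolutions = [1, 2, 4, 8]   # each resolution plays the role of one category
X = np.stack([value_noise(r) for r in resolutions for _ in range(4)])
y = np.repeat(np.arange(len(resolutions)), 4)
# a network pre-trained to predict y from X supplies the initialization,
# with no natural images or human labels involved
```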
2 code implementations • 21 Jan 2021 • Hirokatsu Kataoka, Kazushige Okayasu, Asato Matsumoto, Eisuke Yamagata, Ryosuke Yamada, Nakamasa Inoue, Akio Nakamura, Yutaka Satoh
Is it possible to use convolutional neural networks pre-trained without any natural images to assist natural image understanding?
1 code implementation • 24 Mar 2021 • Kodai Nakashima, Hirokatsu Kataoka, Asato Matsumoto, Kenji Iwata, Nakamasa Inoue
Moreover, although the ViT pre-trained without natural images produces visualizations that differ somewhat from those of an ImageNet pre-trained ViT, it can still interpret natural image datasets to a large extent.
no code implementations • CVPR 2022 • Hirokatsu Kataoka, Ryo Hayamizu, Ryosuke Yamada, Kodai Nakashima, Sora Takashima, Xinyu Zhang, Edgar Josafat Martinez-Noriega, Nakamasa Inoue, Rio Yokota
In the present work, we show that the performance of formula-driven supervised learning (FDSL) can match or even exceed that of ImageNet-21k without using real images, human supervision, or self-supervision during the pre-training of Vision Transformers (ViTs).
1 code implementation • 5 Jul 2022 • Ikuro Sato, Ryota Yamada, Masayuki Tanaka, Nakamasa Inoue, Rei Kawakami
We developed a training algorithm called PoF: Post-Training of Feature Extractor, which updates the feature extractor part of an already-trained deep model to search for a flatter minimum.
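"Flatness" can be made concrete as the expected loss increase under random parameter perturbations. The toy numpy sketch below shows such a sharpness proxy; the perturbation `radius` and the quadratic toy losses are assumptions for illustration, not PoF's actual objective:

```python
import numpy as np

rng = np.random.default_rng(0)

def sharpness(loss, w, radius=0.1, n=200):
    """Estimate sharpness as the mean loss increase under random
    parameter perturbations. (Simplified proxy; PoF itself updates
    only the feature-extractor parameters of a trained model.)"""
    base = loss(w)
    deltas = rng.standard_normal((n, w.size)) * radius
    return float(np.mean([loss(w + d) - base for d in deltas]))

# two minima with the same loss value: one sharp bowl, one flat bowl
w0 = np.zeros(2)
s_sharp = sharpness(lambda w: 10.0 * np.sum(w ** 2), w0)
s_flat = sharpness(lambda w: 0.1 * np.sum(w ** 2), w0)
# the flat minimum scores lower sharpness, which is what a
# post-training flatness search would drive down
```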
1 code implementation • 19 Dec 2022 • Tatsukichi Shibuya, Nakamasa Inoue, Rei Kawakami, Ikuro Sato
Learning the feedforward and feedback networks is sufficient for TP methods to train, but are these layer-wise autoencoders a necessary condition for TP to work?
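The question refers to target propagation's feedback networks being trained as layer-wise autoencoders, i.e. each feedback mapping g is fit to invert its feedforward layer f on that layer's activations. A minimal numpy sketch of that reconstruction training with toy linear layers (not the paper's setup):

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((3, 3))           # fixed feedforward layer f(h) = W h
V = rng.standard_normal((3, 3)) * 0.1     # feedback layer g(y) = V y, learned

H = rng.standard_normal((64, 3))          # batch of layer activations
Y = H @ W.T                               # forward pass through f
init_loss = float(np.mean((Y @ V.T - H) ** 2))

# fit g as the layer-wise autoencoder: minimize ||g(f(h)) - h||^2
for _ in range(200):
    err = Y @ V.T - H                     # reconstruction error g(f(h)) - h
    V -= 0.05 * err.T @ Y / len(H)        # gradient step on the MSE
loss = float(np.mean((Y @ V.T - H) ** 2))
# loss drops well below init_loss: g has learned to invert f
```

Whether this inverse-learning step is *necessary*, rather than merely sufficient, is exactly what the paper interrogates.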
no code implementations • CVPR 2023 • Sora Takashima, Ryo Hayamizu, Nakamasa Inoue, Hirokatsu Kataoka, Rio Yokota
Unlike JFT-300M, which is a static dataset, the quality of synthetic datasets will continue to improve, and the current work is a testament to this possibility.
1 code implementation • ICCV 2023 • Ryo Nakamura, Hirokatsu Kataoka, Sora Takashima, Edgar Josafat Martinez Noriega, Rio Yokota, Nakamasa Inoue
Prior work on FDSL has shown that pre-training vision transformers on such synthetic datasets can yield competitive accuracy on a wide range of downstream tasks.
1 code implementation • ICCV 2023 • Risa Shinoda, Ryo Hayamizu, Kodai Nakashima, Nakamasa Inoue, Rio Yokota, Hirokatsu Kataoka
SegRCDB has a high potential to contribute to semantic segmentation pre-training and investigation by enabling the creation of large datasets without manual annotation.
1 code implementation • NeurIPS 2023 • Taiki Miyanishi, Fumiya Kitamori, Shuhei Kurita, Jungdae Lee, Motoaki Kawanabe, Nakamasa Inoue
To tackle this problem, we introduce the CityRefer dataset for city-level visual grounding.