no code implementations • 22 Jan 2025 • Takumi Fukuzawa, Kensho Hara, Hirokatsu Kataoka, Toru Tamaki
Experiments with background masking show that models rely on background bias, as their performance on Kinetics400 decreases when the background is masked.
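The masking protocol can be sketched as follows. This is a minimal illustration rather than the paper's actual pipeline; the binary foreground mask and the constant fill value are assumptions for the sketch:

```python
import numpy as np

def mask_background(frame: np.ndarray, fg_mask: np.ndarray, fill: float = 0.0) -> np.ndarray:
    """Keep foreground (actor) pixels and replace the background with a constant fill.

    frame:   (H, W, 3) float array
    fg_mask: (H, W) boolean array, True where the actor is
    """
    out = np.full_like(frame, fill)
    out[fg_mask] = frame[fg_mask]
    return out

# Toy example: a 4x4 "frame" of ones with the actor in the top-left 2x2 corner.
frame = np.ones((4, 4, 3))
fg = np.zeros((4, 4), dtype=bool)
fg[:2, :2] = True
masked = mask_background(frame, fg)
```

Comparing a model's accuracy on original versus masked clips then quantifies how much of its performance comes from background cues.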
no code implementations • 16 Jan 2025 • Kohei Torimi, Ryosuke Yamada, Daichi Otsuka, Kensho Hara, Yuki M. Asano, Hirokatsu Kataoka, Yoshimitsu Aoki
Zero-shot recognition models require extensive training data for generalization.
no code implementations • 19 Nov 2024 • Shuntaro Okada, Kenji Doi, Ryota Yoshihashi, Hirokatsu Kataoka, Tomohiro Tanaka
To obtain this noise schedule, we measure the rate of change in the probability distribution of the forward process and use it to determine the schedule before training diffusion models.
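A minimal sketch of the idea: accumulate a measured rate of change along a fine-grained schedule, then place the training timesteps so that each step covers an equal share of that change. The proxy measure (per-step change in the signal rate `alpha_bar`) and the cosine grid are illustrative assumptions, not the paper's actual metric:

```python
import numpy as np

def uniformize_schedule(alpha_bar: np.ndarray, n_steps: int) -> np.ndarray:
    """Re-time a forward-process schedule so the per-step change is uniform.

    alpha_bar: monotonically decreasing signal rates on a fine grid.
    Returns n_steps indices into the fine grid.
    """
    # Rate of change of the forward process, here proxied by |d alpha_bar|.
    change = np.abs(np.diff(alpha_bar))
    cum = np.concatenate([[0.0], np.cumsum(change)])
    cum /= cum[-1]
    # Pick grid points at equal increments of accumulated change.
    targets = np.linspace(0.0, 1.0, n_steps)
    return np.searchsorted(cum, targets)

fine = np.cos(np.linspace(0, np.pi / 2, 1000)) ** 2  # a cosine-style schedule
idx = uniformize_schedule(fine, 10)
```

The resulting indices cluster where the distribution changes fastest, which is the intuition behind measuring the forward process before training.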
no code implementations • 20 Sep 2024 • Ryosuke Yamada, Kensho Hara, Hirokatsu Kataoka, Koshi Makihara, Nakamasa Inoue, Rio Yokota, Yutaka Satoh
Throughout the history of computer vision, research has explored integrating images (visual) and point clouds (geometric), yet many advances in image and 3D object recognition have processed these modalities separately.
1 code implementation • 1 Sep 2024 • Go Ohtani, Ryu Tadokoro, Ryosuke Yamada, Yuki M. Asano, Iro Laina, Christian Rupprecht, Nakamasa Inoue, Rio Yokota, Hirokatsu Kataoka, Yoshimitsu Aoki
In this work, we investigate the understudied effect of the training data used for image super-resolution (SR).
1 code implementation • 1 Aug 2024 • Ryo Nakamura, Ryu Tadokoro, Ryosuke Yamada, Yuki M. Asano, Iro Laina, Christian Rupprecht, Nakamasa Inoue, Rio Yokota, Hirokatsu Kataoka
To this end, we search for a minimal, purely synthetic pre-training dataset that allows us to achieve performance similar to the 1 million images of ImageNet-1k.
no code implementations • CVPR 2024 • Peifei Zhu, Tsubasa Takahashi, Hirokatsu Kataoka
Diffusion Models (DMs) have shown remarkable capabilities in various image-generation tasks.
1 code implementation • 8 Jan 2024 • Ryu Tadokoro, Ryosuke Yamada, Kodai Nakashima, Ryo Nakamura, Hirokatsu Kataoka
From experimental results, we conclude that effective pre-training can be achieved by looking at primitive geometric objects only.
1 code implementation • 17 Dec 2023 • Shota Nishiyama, Takuma Saito, Ryo Nakamura, Go Ohtani, Hirokatsu Kataoka, Kensho Hara
Our proposed dataset aims to improve the performance of traffic accident recognition by annotating ten types of environmental information as teacher labels in addition to the presence or absence of traffic accidents.
no code implementations • 7 Nov 2023 • Yamato Okamoto, Osada Genki, Iu Yahiro, Rintaro Hasegawa, Peifei Zhu, Hirokatsu Kataoka
In recent years, document processing has flourished and brought numerous benefits.
no code implementations • 23 Oct 2023 • Shuhei Yokoo, Peifei Zhu, Yuchi Ishikawa, Mikihiro Tanaka, Masayoshi Kondo, Hirokatsu Kataoka
Our solution adopts the large multimodal models CLIP and BLIP-2 to filter and modify web-crawled data, and utilizes external datasets along with a bag of tricks to improve data quality.
no code implementations • 3 Oct 2023 • Yamato Okamoto, Haruto Toyonaga, Yoshihisa Ijiri, Hirokatsu Kataoka
Digital archiving is becoming widespread owing to its effectiveness in protecting valuable books and providing knowledge to many people electronically.
1 code implementation • ICCV 2023 • Risa Shinoda, Ryo Hayamizu, Kodai Nakashima, Nakamasa Inoue, Rio Yokota, Hirokatsu Kataoka
SegRCDB has a high potential to contribute to semantic segmentation pre-training and investigation by enabling the creation of large datasets without manual annotation.
no code implementations • 26 Sep 2023 • Guoqing Hao, Satoshi Iizuka, Kensho Hara, Edgar Simo-Serra, Hirokatsu Kataoka, Kazuhiro Fukui
We present a novel framework for rectifying occlusions and distortions in degraded texture samples from natural images.
no code implementations • 4 Sep 2023 • Ryota Yoshihashi, Yuya Otsuka, Kenji Doi, Tomohiro Tanaka, Hirokatsu Kataoka
The advance of generative models for images has inspired various training techniques for image recognition utilizing synthetic images.
1 code implementation • ICCV 2023 • Ryo Nakamura, Hirokatsu Kataoka, Sora Takashima, Edgar Josafat Martinez Noriega, Rio Yokota, Nakamasa Inoue
Prior work on FDSL has shown that pre-training vision transformers on such synthetic datasets can yield competitive accuracy on a wide range of downstream tasks.
no code implementations • Smart Agricultural Technology 2023 • Risa Shinoda, Ko Motoki, Kensho Hara, Hirokatsu Kataoka, Ryohei Nakano, Tetsuya Nakazaki, Ryozo Noguchi
The RoseBlooming dataset is an innovative dataset of labeled images of cut flowers at the growing stage.
1 code implementation • CVPR Workshop 2023 • Ryu Tadokoro, Ryosuke Yamada, Hirokatsu Kataoka
Inspired by this approach, we propose the Auto-generated Volumetric Shapes Database (AVS-DB) for data-scarce 3D medical image segmentation tasks.
no code implementations • 6 Mar 2023 • Gido Kato, Yoshihiro Fukuhara, Mariko Isogawa, Hideki Tsunashima, Hirokatsu Kataoka, Shigeo Morishima
To protect privacy and prevent malicious use of deepfakes, current studies propose methods that interfere with the generation process, such as detection and destruction approaches.
no code implementations • CVPR 2023 • Sora Takashima, Ryo Hayamizu, Nakamasa Inoue, Hirokatsu Kataoka, Rio Yokota
Unlike JFT-300M, which is a static dataset, the quality of synthetic datasets will continue to improve, and the current work is a testament to this possibility.
no code implementations • ICCV 2023 • Peifei Zhu, Genki Osada, Hirokatsu Kataoka, Tsubasa Takahashi
We observe that existing spatial attacks cause large degradation in image quality, and find that the loss of high-frequency detail components is likely the major cause.
no code implementations • CVPR 2023 • Yue Qiu, Yanjun Sun, Fumiya Matsuzawa, Kenji Iwata, Hirokatsu Kataoka
This paper proposes a new visual reasoning formulation that aims at discovering changes between image pairs and their temporal orders.
1 code implementation • 29 Jul 2022 • Itsuki Ueda, Yoshihiro Fukuhara, Hirokatsu Kataoka, Hiroaki Aizawa, Hidehiko Shishido, Itaru Kitahara
However, it is difficult to achieve high localization performance using only density-field-based methods such as Neural Radiance Fields (NeRF), since they do not provide density gradients in most empty regions.
no code implementations • CVPR 2022 • Hirokatsu Kataoka, Ryo Hayamizu, Ryosuke Yamada, Kodai Nakashima, Sora Takashima, Xinyu Zhang, Edgar Josafat Martinez-Noriega, Nakamasa Inoue, Rio Yokota
In the present work, we show that the performance of formula-driven supervised learning (FDSL) can match or even exceed that of ImageNet-21k without the use of real images, human supervision, or self-supervision during the pre-training of Vision Transformers (ViTs).
no code implementations • 17 Mar 2022 • Shintaro Yamamoto, Hirokatsu Kataoka, Ryota Suzuki, Seitaro Shinagawa, Shigeo Morishima
To alleviate this problem, we organized a group of non-native English speakers to write summaries of papers presented at a computer vision conference, sharing knowledge of the papers the group had read.
1 code implementation • CVPR 2022 • Ryosuke Yamada, Hirokatsu Kataoka, Naoya Chiba, Yukiyasu Domae, Tetsuya OGATA
Moreover, the PC-FractalDB pre-trained model is especially effective in training with limited data.
Ranked #18 on 3D Object Detection on SUN-RGBD val (using extra training data)
2 code implementations • ICCV 2021 • Yue Qiu, Shintaro Yamamoto, Kodai Nakashima, Ryota Suzuki, Kenji Iwata, Hirokatsu Kataoka, Yutaka Satoh
Change captioning tasks aim to detect changes in image pairs observed before and after a scene change and generate a natural language description of the changes.
1 code implementation • 24 Mar 2021 • Kodai Nakashima, Hirokatsu Kataoka, Asato Matsumoto, Kenji Iwata, Nakamasa Inoue
Moreover, although a ViT pre-trained without natural images produces visualizations that differ somewhat from those of an ImageNet pre-trained ViT, it can interpret natural image datasets to a large extent.
2 code implementations • 21 Jan 2021 • Hirokatsu Kataoka, Kazushige Okayasu, Asato Matsumoto, Eisuke Yamagata, Ryosuke Yamada, Nakamasa Inoue, Akio Nakamura, Yutaka Satoh
Is it possible to use convolutional neural networks pre-trained without any natural images to assist natural image understanding?
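Formula-driven pre-training of this kind generates labeled images from mathematical formulas alone; one well-known instance renders fractal categories from iterated function systems (IFS). Below is a minimal chaos-game sketch using the Sierpinski-triangle maps as an illustrative category; the actual dataset's parameter sampling and rendering are not reproduced here:

```python
import numpy as np

def render_fractal(params: np.ndarray, n_points: int = 10000, size: int = 64, seed: int = 0) -> np.ndarray:
    """Render a binary fractal image from IFS affine maps via the chaos game.

    params: (k, 6) array, each row [a, b, c, d, e, f] defining
            x' = a*x + b*y + e,  y' = c*x + d*y + f.
    """
    rng = np.random.default_rng(seed)
    pts = np.empty((n_points, 2))
    p = np.zeros(2)
    for i in range(n_points):
        a, b, c, d, e, f = params[rng.integers(len(params))]
        p = np.array([a * p[0] + b * p[1] + e, c * p[0] + d * p[1] + f])
        pts[i] = p
    # Normalize point coordinates to the image grid and rasterize.
    lo, hi = pts.min(0), pts.max(0)
    ij = ((pts - lo) / (hi - lo + 1e-9) * (size - 1)).astype(int)
    img = np.zeros((size, size), dtype=np.uint8)
    img[ij[:, 1], ij[:, 0]] = 1
    return img

# One "category" = one set of IFS parameters (here, the Sierpinski triangle maps).
sierpinski = np.array([
    [0.5, 0, 0, 0.5, 0.0, 0.0],
    [0.5, 0, 0, 0.5, 0.5, 0.0],
    [0.5, 0, 0, 0.5, 0.25, 0.5],
])
img = render_fractal(sierpinski)
```

Sampling many parameter sets yields arbitrarily many classes with free, exact labels, which is what makes pre-training without any natural images possible.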
no code implementations • 19 Jan 2021 • Nakamasa Inoue, Eisuke Yamagata, Hirokatsu Kataoka
Our main idea is to initialize the network parameters by solving an artificial noise classification problem, where the aim is to classify Perlin noise samples into their noise categories.
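The noise-category idea can be illustrated with a simplified generator. True Perlin noise interpolates random gradients; the bilinear value noise below is an assumed stand-in that still yields frequency-defined categories for a toy classification setup:

```python
import numpy as np

def value_noise(size: int, freq: int, seed: int = 0) -> np.ndarray:
    """Generate smooth value noise: random values on a coarse grid, bilinearly upsampled.

    freq controls the coarse-grid resolution and plays the role of the noise category.
    """
    rng = np.random.default_rng(seed)
    coarse = rng.random((freq + 1, freq + 1))
    # Bilinear interpolation onto a size x size grid.
    xs = np.linspace(0, freq, size)
    i = np.minimum(xs.astype(int), freq - 1)
    t = xs - i
    rows = coarse[i] * (1 - t)[:, None] + coarse[i + 1] * t[:, None]
    img = rows[:, i] * (1 - t)[None, :] + rows[:, i + 1] * t[None, :]
    return img

# Each frequency defines one pseudo-class; pre-training classifies noise into its category.
images = [value_noise(32, f) for f in (2, 4, 8)]
labels = [0, 1, 2]
```

A network initialized by classifying such samples into their generating category never needs a natural image or a human label.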
1 code implementation • 14 Jul 2020 • Yuchi Ishikawa, Seito Kasai, Yoshimitsu Aoki, Hirokatsu Kataoka
Our model architecture consists of a long-term feature extractor and two branches: the Action Segmentation Branch (ASB) and the Boundary Regression Branch (BRB).
Ranked #11 on Action Segmentation on GTEA
no code implementations • 19 May 2020 • Seito Kasai, Yuchi Ishikawa, Masaki Hayashi, Yoshimitsu Aoki, Kensho Hara, Hirokatsu Kataoka
In this paper, we present a framework that jointly retrieves and spatiotemporally highlights actions in videos by enhancing current deep cross-modal retrieval methods.
11 code implementations • 10 Apr 2020 • Hirokatsu Kataoka, Tenga Wakamiya, Kensho Hara, Yutaka Satoh
Therefore, in the present paper, we conduct an exploratory study in order to improve spatiotemporal 3D CNNs as follows: (i) Recently proposed large-scale video datasets help improve spatiotemporal 3D CNNs in terms of video classification accuracy.
1 code implementation • 27 Mar 2020 • Munetaka Minoguchi, Ken Okayama, Yutaka Satoh, Hirokatsu Kataoka
To construct an algorithm that can provide robust person detection, we present a dataset with over 8 million images that was produced in a weakly supervised manner.
no code implementations • 29 Feb 2020 • Takehiko Ohkawa, Naoto Inoue, Hirokatsu Kataoka, Nakamasa Inoue
Herein, we propose Augmented Cyclic Consistency Regularization (ACCR), a novel regularization method for unpaired I2I translation.
no code implementations • 25 Sep 2019 • Masahiro Kato, Yoshihiro Fukuhara, Hirokatsu Kataoka, Shigeo Morishima
Our main idea is to apply a framework of learning with rejection and adversarial examples to assist in the decision making for such suspicious samples.
1 code implementation • 19 May 2019 • Takahiro Itazuri, Yoshihiro Fukuhara, Hirokatsu Kataoka, Shigeo Morishima
In this paper, we address the open question: "What do adversarially robust models look at?"
no code implementations • 16 Nov 2018 • Shintaro Yamamoto, Yoshihiro Fukuhara, Ryota Suzuki, Shigeo Morishima, Hirokatsu Kataoka
Due to the recent boom in artificial intelligence (AI) research, including computer vision (CV), it has become impossible for researchers in these fields to keep up with the exponentially increasing number of manuscripts.
no code implementations • 22 Sep 2018 • Ryota Natsume, Kazuki Inoue, Yoshihiro Fukuhara, Shintaro Yamamoto, Shigeo Morishima, Hirokatsu Kataoka
Face recognition research is one of the most active topics in computer vision (CV), and deep neural networks (DNN) are now filling the gap between human-level and computer-driven performance levels in face verification algorithms.
no code implementations • 30 May 2018 • Kota Yoshida, Munetaka Minoguchi, Kenichiro Wani, Akio Nakamura, Hirokatsu Kataoka
In the present paper, in order to consider this question from an academic standpoint, we use a computer to generate image captions that draw a "laugh".
no code implementations • CVPR 2018 • Tomoyuki Suzuki, Hirokatsu Kataoka, Yoshimitsu Aoki, Yutaka Satoh
In this paper, we propose a novel approach for traffic accident anticipation through (i) Adaptive Loss for Early Anticipation (AdaLEA) and (ii) a large-scale self-annotated incident database for anticipation.
no code implementations • 7 Apr 2018 • Hirokatsu Kataoka, Teppei Suzuki, Shoko Oikawa, Yasuhiro Matsui, Yutaka Satoh
Because of their recent introduction, self-driving cars and vehicles equipped with advanced driver assistance systems (ADAS) have had little opportunity to learn the dangerous traffic scenarios (including near-miss incidents) that provide normal drivers with strong motivation to drive safely.
26 code implementations • CVPR 2018 • Kensho Hara, Hirokatsu Kataoka, Yutaka Satoh
The purpose of this study is to determine whether current video datasets have sufficient data for training very deep convolutional neural networks (CNNs) with spatio-temporal three-dimensional (3D) kernels.
Ranked #51 on Action Recognition on UCF101
1 code implementation • 25 Aug 2017 • Kensho Hara, Hirokatsu Kataoka, Yutaka Satoh
The 3D ResNets trained on the Kinetics did not suffer from overfitting despite the large number of parameters of the model, and achieved better performance than relatively shallow networks, such as C3D.
2 code implementations • 20 Jul 2017 • Hirokatsu Kataoka, Soma Shirakabe, Yun He, Shunya Ueta, Teppei Suzuki, Kaori Abe, Asako Kanezaki, Shin'ichiro Morita, Toshiyuki Yabe, Yoshihiro Kanehara, Hiroya Yatsuyanagi, Shinya Maruyama, Ryosuke Takasawa, Masataka Fuchida, Yudai Miyashita, Kazushige Okayasu, Yuta Matsuzaki
The paper gives futuristic challenges discussed in the cvpaper.challenge.
no code implementations • 10 May 2017 • Hirokatsu Kataoka, Kaori Abe, Akio Nakamura, Yutaka Satoh
The paper presents a novel concept for collaborative descriptors between deeply learned and hand-crafted features.
no code implementations • 7 Apr 2017 • Yuta Matsuzaki, Kazushige Okayasu, Takaaki Imanari, Naomichi Kobayashi, Yoshihiro Kanehara, Ryousuke Takasawa, Akio Nakamura, Hirokatsu Kataoka
In this paper, we aim to estimate the winner of a worldwide film festival from the exhibited movie poster.
3 code implementations • 23 Mar 2017 • Kaori Abe, Teppei Suzuki, Shunya Ueta, Akio Nakamura, Yutaka Satoh, Hirokatsu Kataoka
The paper presents a novel concept that analyzes and visualizes worldwide fashion trends.
no code implementations • 30 Aug 2016 • Hirokatsu Kataoka, Yun He, Soma Shirakabe, Yutaka Satoh
Temporal differentiation is an extremely important cue for motion representation.
no code implementations • 29 Aug 2016 • Hirokatsu Kataoka, Kensho Hara, Yutaka Satoh
The objective of this paper is to evaluate "human action recognition without human".
no code implementations • 26 May 2016 • Hirokatsu Kataoka, Yudai Miyashita, Tomoaki Yamabe, Soma Shirakabe, Shin'ichi Sato, Hironori Hoshino, Ryo Kato, Kaori Abe, Takaaki Imanari, Naomichi Kobayashi, Shinichiro Morita, Akio Nakamura
The "cvpaper.challenge" is a group composed of members from AIST, Tokyo Denki Univ.
no code implementations • 1 May 2016 • Hirokatsu Kataoka, Masaki Hayashi, Kenji Iwata, Yutaka Satoh, Yoshimitsu Aoki, Slobodan Ilic
Latent Dirichlet allocation (LDA) is used to develop approximations of human motion primitives; these are mid-level representations, and they adaptively integrate dominant vectors when classifying human activities.
no code implementations • 26 Apr 2016 • Teppei Suzuki, Soma Shirakabe, Yudai Miyashita, Akio Nakamura, Yutaka Satoh, Hirokatsu Kataoka
From the detected change areas alone, however, a human cannot understand how the two images differ.
no code implementations • 25 Sep 2015 • Hirokatsu Kataoka, Kenji Iwata, Yutaka Satoh
In this paper, we evaluate convolutional neural network (CNN) features using the AlexNet architecture and very deep convolutional network (VGGNet) architecture.