Search Results for author: Kazuya Takeda

Found 29 papers, 12 papers with code

DRUformer: Enhancing the driving scene Important object detection with driving relationship self-understanding

no code implementations · 11 Nov 2023 · Yingjie Niu, Ming Ding, Keisuke Fujii, Kento Ohtani, Alexander Carballo, Kazuya Takeda

The DRUformer is a transformer-based multi-modal important object detection model that takes into account the relationships between all the participants in the driving scenario.

Object Detection

Runner re-identification from single-view running video in the open-world setting

1 code implementation · 18 Oct 2023 · Tomohiro Suzuki, Kazushi Tsutsui, Kazuya Takeda, Keisuke Fujii

However, most current studies on player re-identification in multi- or single-view sports videos focus on the closed-world setting with labeled image datasets; player re-identification in the open-world setting for automatic video analysis remains underdeveloped.

Audio Difference Learning for Audio Captioning

no code implementations · 15 Sep 2023 · Tatsuya Komatsu, Yusuke Fujita, Kazuya Takeda, Tomoki Toda

Furthermore, a unique technique is proposed that mixes the input audio with additional audio and uses the additional audio as a reference.
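The abstract only describes the mixing step at a high level; a minimal sketch of what such a mixing helper might look like is below. The function name, the SNR-style scaling, and all signal parameters are assumptions for illustration, not the paper's actual scheme.

```python
import numpy as np

def mix_with_reference(audio, reference, snr_db=0.0):
    """Mix the input audio with an additional 'reference' clip, scaling the
    reference to a requested signal-to-reference power ratio in dB.
    Hypothetical helper; the paper's actual mixing scheme may differ."""
    p_audio = np.mean(audio ** 2)
    p_ref = np.mean(reference ** 2) + 1e-12   # avoid divide-by-zero
    scale = np.sqrt(p_audio / (p_ref * 10 ** (snr_db / 10.0)))
    return audio + scale * reference

rng = np.random.default_rng(0)
x = rng.standard_normal(16000)       # 1 s of "input" audio at 16 kHz
r = rng.standard_normal(16000)       # additional reference audio
mixed = mix_with_reference(x, r)     # captioning would then describe the difference
```

At `snr_db=0.0` the reference is scaled to match the input's power before summing, so both signals contribute equally to the mixture.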

Audio Captioning

R-Cut: Enhancing Explainability in Vision Transformers with Relationship Weighted Out and Cut

no code implementations · 18 Jul 2023 · Yingjie Niu, Ming Ding, Maoning Ge, Robin Karlsson, Yuxiao Zhang, Kazuya Takeda

Our method aims to improve trust in classification results and empower users to gain a deeper understanding of the model for downstream tasks by providing visualizations of class-specific maps.

Image Classification

Action valuation of on- and off-ball soccer players based on multi-agent deep reinforcement learning

no code implementations · 29 May 2023 · Hiroshi Nakahara, Kazushi Tsutsui, Kazuya Takeda, Keisuke Fujii

In this paper, we propose a method of valuing possible actions for on- and off-ball soccer players in a single holistic framework based on multi-agent deep reinforcement learning.


Estimation of control area in badminton doubles with pose information from top and back view drone videos

1 code implementation · 7 May 2023 · Ning Ding, Kazuya Takeda, Wenhui Jin, Yingjiu Bei, Keisuke Fujii

In this work, we present the first annotated drone dataset from top and back views in badminton doubles and propose a framework to estimate the control area probability map, which can be used to evaluate teamwork performance.

Visual Tracking

Predictive World Models from Real-World Partial Observations

1 code implementation · 12 Jan 2023 · Robin Karlsson, Alexander Carballo, Keisuke Fujii, Kento Ohtani, Kazuya Takeda

By extending HVAEs to cases where complete ground truth states do not exist, we facilitate continual learning of spatial prediction as a step towards realizing explainable and comprehensive predictive world models for real-world mobile robotics applications.

Continual Learning · Open-Ended Question Answering · +1

Estimating counterfactual treatment outcomes over time in complex multiagent scenarios

no code implementations · 4 Jun 2022 · Keisuke Fujii, Koh Takeuchi, Atsushi Kuribayashi, Naoya Takeishi, Yoshinobu Kawahara, Kazuya Takeda

Evaluating intervention in a multiagent system, e.g., when humans should intervene in autonomous driving systems, or when a player should pass to a teammate for a good shot, is challenging in various engineering and scientific fields.

Autonomous Driving · Counterfactual

Evaluation of creating scoring opportunities for teammates in soccer via trajectory prediction

1 code implementation · 4 Jun 2022 · Masakiyo Teranishi, Kazushi Tsutsui, Kazuya Takeda, Keisuke Fujii

However, it has remained difficult to evaluate an attacking player without receiving the ball, and to reveal how movement contributes to the creation of scoring opportunities for teammates.

Trajectory Prediction

ViCE: Improving Dense Representation Learning by Superpixelization and Contrasting Cluster Assignment

1 code implementation · 24 Nov 2021 · Robin Karlsson, Tomoki Hayashi, Keisuke Fujii, Alexander Carballo, Kento Ohtani, Kazuya Takeda

Recent self-supervised models have demonstrated equal or better performance than supervised methods, opening for AI systems to learn visual representations from practically unlimited data.

Contrastive Learning · Domain Generalization · +4

Learning a Model for Inferring a Spatial Road Lane Network Graph using Self-Supervision

1 code implementation · 5 Jul 2021 · Robin Karlsson, David Robert Wong, Simon Thompson, Kazuya Takeda

A formal road lane network model is presented, with proof that any structured road scene can be represented by a directed acyclic graph of depth at most three while retaining the notion of intersection regions, and that this is the most compressed such representation.

Autonomous Vehicles · Self-Supervised Learning

Anomalous Sound Detection Using a Binary Classification Model and Class Centroids

no code implementations · 11 Jun 2021 · Ibuki Kuroyanagi, Tomoki Hayashi, Kazuya Takeda, Tomoki Toda

Our results showed that multi-task learning combining binary classification with metric learning, which considers the distance from each class centroid in the feature space, is effective, and that performance can be significantly improved by using even a small amount of anomalous data during training.

Binary Classification · Classification · +2

Characterization of Multiple 3D LiDARs for Localization and Mapping using Normal Distributions Transform

no code implementations · 3 Apr 2020 · Alexander Carballo, Abraham Monrroy, David Wong, Patiphon Narksri, Jacob Lambert, Yuki Kitsukawa, Eijiro Takeuchi, Shinpei Kato, Kazuya Takeda

In this work, we present a detailed comparison of ten different 3D LiDAR sensors, covering a range of manufacturers, models, and laser configurations, for the tasks of mapping and vehicle localization, using as a common reference the Normal Distributions Transform (NDT) algorithm implemented in the open-source self-driving platform Autoware.
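For readers unfamiliar with NDT: the map is divided into grid cells, each summarized by a Gaussian fit to its points, and a scan is scored by how well its points match those Gaussians. The 2D sketch below illustrates that idea only; Autoware's NDT is a full 3D C++ implementation with gradient-based pose optimization, and every parameter here (cell size, regularization, point counts) is an illustrative assumption.

```python
import numpy as np

def build_ndt_map(points, cell=1.0):
    """Fit a Gaussian (mean, inverse covariance) to the points in each grid
    cell. Simplified 2D sketch of the NDT map-building step."""
    buckets = {}
    for p in points:
        k = tuple(np.floor(p / cell).astype(int))
        buckets.setdefault(k, []).append(p)
    ndt = {}
    for k, pts in buckets.items():
        pts = np.asarray(pts)
        if len(pts) < 5:                       # need enough points for a stable fit
            continue
        cov = np.cov(pts.T) + 1e-3 * np.eye(pts.shape[1])  # regularized covariance
        ndt[k] = (pts.mean(axis=0), np.linalg.inv(cov))
    return ndt

def ndt_score(ndt, points, cell=1.0):
    """Sum of per-point Gaussian match scores; higher means better alignment."""
    score = 0.0
    for p in points:
        k = tuple(np.floor(p / cell).astype(int))
        if k in ndt:                           # points in empty cells contribute nothing
            mu, icov = ndt[k]
            d = p - mu
            score += np.exp(-0.5 * d @ icov @ d)
    return score

rng = np.random.default_rng(0)
cloud = rng.uniform(0, 4, size=(1000, 2))      # toy 2D "scan"
ndt = build_ndt_map(cloud)
aligned = ndt_score(ndt, cloud)                # the map's own scan scores high
shifted = ndt_score(ndt, cloud + 0.5)          # a mis-registered scan scores lower
```

Localization then amounts to searching for the scan pose that maximizes this score against the prebuilt map.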


LIBRE: The Multiple 3D LiDAR Dataset

no code implementations · 13 Mar 2020 · Alexander Carballo, Jacob Lambert, Abraham Monrroy-Cano, David Robert Wong, Patiphon Narksri, Yuki Kitsukawa, Eijiro Takeuchi, Shinpei Kato, Kazuya Takeda

In this work, we present LIBRE: LiDAR Benchmarking and Reference, a first-of-its-kind dataset featuring 10 different LiDAR sensors, covering a range of manufacturers, models, and laser configurations.


ESPnet-TTS: Unified, Reproducible, and Integratable Open Source End-to-End Text-to-Speech Toolkit

3 code implementations · 24 Oct 2019 · Tomoki Hayashi, Ryuichi Yamamoto, Katsuki Inoue, Takenori Yoshimura, Shinji Watanabe, Tomoki Toda, Kazuya Takeda, Yu Zhang, Xu Tan

Furthermore, the unified design enables the integration of ASR functions with TTS, e.g., ASR-based objective evaluation and semi-supervised learning with both ASR and TTS models.

Automatic Speech Recognition (ASR) · +1

A Survey of Autonomous Driving: Common Practices and Emerging Technologies

no code implementations · 12 Jun 2019 · Ekim Yurtsever, Jacob Lambert, Alexander Carballo, Kazuya Takeda

Automated driving systems (ADSs) promise a safe, comfortable and efficient driving experience.

Robotics · Systems and Control

Back-Translation-Style Data Augmentation for End-to-End ASR

no code implementations · 28 Jul 2018 · Tomoki Hayashi, Shinji Watanabe, Yu Zhang, Tomoki Toda, Takaaki Hori, Ramon Astudillo, Kazuya Takeda

In this paper, we propose a novel data augmentation method for attention-based end-to-end automatic speech recognition (E2E-ASR), utilizing a large amount of text that is not paired with speech signals.
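By analogy with back-translation in machine translation, the idea is to synthesize pseudo acoustic-side features from unpaired text and treat the result as extra training pairs. The data-flow sketch below stands in for that pipeline: the random linear embedding plays the role of a trained text-to-feature network, and the vocabulary size, feature dimension, and token sequences are all invented for illustration.

```python
import numpy as np

# Hypothetical text-to-feature "model": a random token embedding standing in
# for the trained network that maps text to acoustic-side features.
rng = np.random.default_rng(0)
VOCAB, FEAT = 32, 16
tte_embed = rng.standard_normal((VOCAB, FEAT))

def synthesize_features(token_ids):
    """Generate a pseudo feature sequence (one frame per token) from text."""
    return tte_embed[np.asarray(token_ids)]

# Unpaired text corpus (token-id sequences) -> augmented "paired" data that
# can be added to the E2E-ASR training set.
text_only = [[3, 5, 7], [1, 2, 4, 8]]
augmented = [(synthesize_features(t), t) for t in text_only]
```

In a real system the synthesized features would be mixed with genuine paired speech/text data during E2E-ASR training.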

Automatic Speech Recognition (ASR) · +5

Multi-Head Decoder for End-to-End Speech Recognition

no code implementations · 22 Apr 2018 · Tomoki Hayashi, Shinji Watanabe, Tomoki Toda, Kazuya Takeda

This paper presents a new network architecture called multi-head decoder for end-to-end speech recognition as an extension of a multi-head attention model.
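Conventional multi-head attention concatenates the per-head context vectors and feeds them to a single decoder network; a multi-head decoder instead gives each attention head its own output network and combines the head-wise results. The numpy sketch below illustrates only that structural difference: the weights are random stand-ins for learned parameters, and the mean combination rule is an assumption, not necessarily the paper's.

```python
import numpy as np

rng = np.random.default_rng(0)
T, D, HEADS = 8, 16, 4               # encoder length, model dim, attention heads

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

enc = rng.standard_normal((T, D))    # encoder output sequence
query = rng.standard_normal(D)       # current decoder state

# Random stand-ins for learned per-head query and output projections.
W_q = rng.standard_normal((HEADS, D, D)) / np.sqrt(D)
W_o = rng.standard_normal((HEADS, D, D)) / np.sqrt(D)

head_outputs = []
for h in range(HEADS):
    scores = enc @ (W_q[h] @ query) / np.sqrt(D)    # per-head attention scores
    context = softmax(scores) @ enc                 # per-head context vector
    head_outputs.append(np.tanh(W_o[h] @ context))  # per-head decoder output

combined = np.mean(head_outputs, axis=0)            # combine head-wise outputs
```

The contrast with the standard design is the final two lines: instead of `concat(contexts) @ W_decoder`, each head produces its own output, which are then merged.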

Decoder · Speech Recognition · +1

Causal analysis of task completion errors in spoken music retrieval interactions

no code implementations · LREC 2012 · Sunao Hara, Norihide Kitaoka, Kazuya Takeda

In this paper, we analyze the causes of task completion errors in spoken dialog systems, using a decision tree with N-gram features of the dialog to detect task-incomplete dialogs.
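The "decision tree with N-gram features of the dialog" setup can be sketched with off-the-shelf tools. Below is a minimal scikit-learn version; the dialog-act transcripts, labels, and tree depth are invented for illustration, and the paper's actual features and tree construction may differ.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.tree import DecisionTreeClassifier

# Toy dialog-act transcripts and completion labels (1 = task-incomplete);
# acts, transcripts, and labels are invented for illustration only.
dialogs = [
    "ask_song confirm play_song bye",
    "ask_song reject ask_song reject hangup",
    "ask_artist confirm play_song bye",
    "ask_song noinput noinput hangup",
]
labels = [0, 1, 0, 1]

detector = make_pipeline(
    CountVectorizer(ngram_range=(1, 2)),        # unigram + bigram dialog features
    DecisionTreeClassifier(max_depth=3, random_state=0),
)
detector.fit(dialogs, labels)
pred = detector.predict(["ask_song noinput noinput hangup"])
```

Because the tree's splits are on interpretable N-gram counts, inspecting the fitted tree reveals which dialog patterns are associated with task-incomplete outcomes, which is the causal-analysis angle of the paper.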

General Classification · Retrieval · +1
