Search Results for author: Kazuhiro Nakadai

Found 9 papers, 1 paper with code

From Blurry to Brilliant Detection: YOLOv5-Based Aerial Object Detection with Super Resolution

no code implementations • 26 Jan 2024 • Ragib Amin Nihal, Benjamin Yen, Katsutoshi Itoyama, Kazuhiro Nakadai

The demand for accurate object detection in aerial imagery has surged with the widespread use of drones and satellite technology.

Object object-detection +2

Is the Ideal Ratio Mask Really the Best? -- Exploring the Best Extraction Performance and Optimal Mask of Mask-based Beamformers

no code implementations • 21 Sep 2023 • Atsuo Hiroe, Katsutoshi Itoyama, Kazuhiro Nakadai

Via the experiments with the CHiME-3 dataset, we verify that the four BFs have the same peak performance as the upper bound provided by the ideal MWF BF, whereas the optimal mask depends on the adopted BF and differs from the IRM.

Retraining-free Customized ASR for Enharmonic Words Based on a Named-Entity-Aware Model and Phoneme Similarity Estimation

no code implementations • 29 May 2023 • Yui Sudo, Kazuya Hata, Kazuhiro Nakadai

End-to-end automatic speech recognition (E2E-ASR) has the potential to improve performance, but a specific issue that needs to be addressed is the difficulty it has in handling enharmonic words: named entities (NEs) with the same pronunciation and part of speech that are spelled differently.

Automatic Speech Recognition speech-recognition +1

Metric-based multimodal meta-learning for human movement identification via footstep recognition

no code implementations • 15 Nov 2021 • Muhammad Shakeel, Katsutoshi Itoyama, Kenji Nishida, Kazuhiro Nakadai

We describe a novel metric-based learning approach that introduces a multimodal framework and uses deep audio and geophone encoders in siamese configuration to design an adaptable and lightweight supervised model.

Activity Recognition Contrastive Learning +1

Detecting earthquakes: a novel deep learning-based approach for effective disaster response

no code implementations • 1 Apr 2021 • Shakeel Muhammad, Katsutoshi Itoyama, Kenji Nishida, Kazuhiro Nakadai

In the present study, we present an intelligent earthquake signal detector that provides added assistance to automate traditional disaster responses.

Disaster Response Specificity

CNN-based MultiChannel End-to-End Speech Recognition for everyday home environments

no code implementations • 7 Nov 2018 • Nelson Yalta, Shinji Watanabe, Takaaki Hori, Kazuhiro Nakadai, Tetsuya OGATA

By employing a convolutional neural network (CNN)-based multichannel end-to-end speech recognition system, this study attempts to overcome the difficulties present in everyday environments.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

Weakly Supervised Deep Recurrent Neural Networks for Basic Dance Step Generation

3 code implementations • 3 Jul 2018 • Nelson Yalta, Shinji Watanabe, Kazuhiro Nakadai, Tetsuya OGATA

However, applying DNNs to generate dance for a piece of music is nevertheless challenging, because 1) DNNs need to generate large sequences while mapping the music input, 2) the DNN needs to constrain the motion beat to the music, and 3) DNNs require a considerable amount of hand-crafted data.

Motion Estimation
