Search Results for author: Kazuhiro Nakadai

Found 9 papers, 1 paper with code

From Blurry to Brilliant Detection: YOLOv5-Based Aerial Object Detection with Super Resolution

no code implementations • 26 Jan 2024 • Ragib Amin Nihal, Benjamin Yen, Katsutoshi Itoyama, Kazuhiro Nakadai

The demand for accurate object detection in aerial imagery has surged with the widespread use of drones and satellite technology.

Object object-detection +2

Is the Ideal Ratio Mask Really the Best? -- Exploring the Best Extraction Performance and Optimal Mask of Mask-based Beamformers

no code implementations • 21 Sep 2023 • Atsuo Hiroe, Katsutoshi Itoyama, Kazuhiro Nakadai

Via the experiments with the CHiME-3 dataset, we verify that the four BFs have the same peak performance as the upper bound provided by the ideal MWF BF, whereas the optimal mask depends on the adopted BF and differs from the IRM.

Retraining-free Customized ASR for Enharmonic Words Based on a Named-Entity-Aware Model and Phoneme Similarity Estimation

no code implementations • 29 May 2023 • Yui Sudo, Kazuya Hata, Kazuhiro Nakadai

End-to-end automatic speech recognition (E2E-ASR) has the potential to improve performance, but a specific issue that needs to be addressed is the difficulty it has in handling enharmonic words: named entities (NEs) with the same pronunciation and part of speech that are spelled differently.

Automatic Speech Recognition speech-recognition +1

Metric-based multimodal meta-learning for human movement identification via footstep recognition

no code implementations • 15 Nov 2021 • Muhammad Shakeel, Katsutoshi Itoyama, Kenji Nishida, Kazuhiro Nakadai

We describe a novel metric-based learning approach that introduces a multimodal framework and uses deep audio and geophone encoders in siamese configuration to design an adaptable and lightweight supervised model.

Activity Recognition Contrastive Learning +1

Detecting earthquakes: a novel deep learning-based approach for effective disaster response

no code implementations • 1 Apr 2021 • Shakeel Muhammad, Katsutoshi Itoyama, Kenji Nishida, Kazuhiro Nakadai

In the present study, we present an intelligent earthquake signal detector that provides added assistance to automate traditional disaster responses.

Disaster Response Specificity

CNN-based MultiChannel End-to-End Speech Recognition for everyday home environments

no code implementations • 7 Nov 2018 • Nelson Yalta, Shinji Watanabe, Takaaki Hori, Kazuhiro Nakadai, Tetsuya Ogata

By employing a convolutional neural network (CNN)-based multichannel end-to-end speech recognition system, this study attempts to overcome the difficulties present in everyday environments.

Automatic Speech Recognition (ASR) +2

Weakly Supervised Deep Recurrent Neural Networks for Basic Dance Step Generation

3 code implementations • 3 Jul 2018 • Nelson Yalta, Shinji Watanabe, Kazuhiro Nakadai, Tetsuya Ogata

However, applying DNNs to generate dance for a piece of music is nevertheless challenging, because 1) DNNs need to generate large sequences while mapping the music input, 2) DNNs need to constrain the motion beat to the music, and 3) DNNs require a considerable amount of hand-crafted data.

Motion Estimation

Deep JSLC: A Multimodal Corpus Collection for Data-driven Generation of Japanese Sign Language Expressions

no code implementations • LREC 2018 • Heike Brock, Kazuhiro Nakadai

Data Augmentation

Construction of Japanese Audio-Visual Emotion Database and Its Application in Emotion Recognition

no code implementations • LREC 2016 • Nurul Lubis, Randy Gomez, Sakriani Sakti, Keisuke Nakamura, Koichiro Yoshino, Satoshi Nakamura, Kazuhiro Nakadai

Emotional aspects play a vital role in making human communication a rich and dynamic experience.

Emotion Recognition
