Search Results for author: Katsutoshi Itoyama

Found 8 papers, 0 papers with code

From Blurry to Brilliant Detection: YOLOv5-Based Aerial Object Detection with Super Resolution

no code implementations • 26 Jan 2024 • Ragib Amin Nihal, Benjamin Yen, Katsutoshi Itoyama, Kazuhiro Nakadai

The demand for accurate object detection in aerial imagery has surged with the widespread use of drones and satellite technology.

Object Detection +2

Is the Ideal Ratio Mask Really the Best? -- Exploring the Best Extraction Performance and Optimal Mask of Mask-based Beamformers

no code implementations • 21 Sep 2023 • Atsuo Hiroe, Katsutoshi Itoyama, Kazuhiro Nakadai

Via the experiments with the CHiME-3 dataset, we verify that the four BFs have the same peak performance as the upper bound provided by the ideal MWF BF, whereas the optimal mask depends on the adopted BF and differs from the IRM.

Metric-based multimodal meta-learning for human movement identification via footstep recognition

no code implementations • 15 Nov 2021 • Muhammad Shakeel, Katsutoshi Itoyama, Kenji Nishida, Kazuhiro Nakadai

We describe a novel metric-based learning approach that introduces a multimodal framework and uses deep audio and geophone encoders in siamese configuration to design an adaptable and lightweight supervised model.

Activity Recognition Contrastive Learning +1

Detecting earthquakes: a novel deep learning-based approach for effective disaster response

no code implementations • 1 Apr 2021 • Shakeel Muhammad, Katsutoshi Itoyama, Kenji Nishida, Kazuhiro Nakadai

In the present study, we present an intelligent earthquake signal detector that provides added assistance to automate traditional disaster responses.

Disaster Response Specificity

Unsupervised Speech Enhancement Based on Multichannel NMF-Informed Beamforming for Noise-Robust Automatic Speech Recognition

no code implementations • 22 Mar 2019 • Kazuki Shimada, Yoshiaki Bando, Masato Mimura, Katsutoshi Itoyama, Kazuyoshi Yoshii, Tatsuya Kawahara

To solve this problem, we take an unsupervised approach that decomposes each TF bin into the sum of speech and noise by using multichannel nonnegative matrix factorization (MNMF).

Automatic Speech Recognition (ASR) +2

Statistical Speech Enhancement Based on Probabilistic Integration of Variational Autoencoder and Non-Negative Matrix Factorization

no code implementations • 31 Oct 2017 • Yoshiaki Bando, Masato Mimura, Katsutoshi Itoyama, Kazuyoshi Yoshii, Tatsuya Kawahara

This paper presents a statistical method of single-channel speech enhancement that uses a variational autoencoder (VAE) as a prior distribution on clean speech.

Speech Enhancement

Generative Statistical Models with Self-Emergent Grammar of Chord Sequences

no code implementations • 7 Aug 2017 • Hiroaki Tsushima, Eita Nakamura, Katsutoshi Itoyama, Kazuyoshi Yoshii

Generative statistical models of chord sequences play crucial roles in music processing.

Parallel Speech Corpora of Japanese Dialects

no code implementations • LREC 2016 • Koichiro Yoshino, Naoki Hirayama, Shinsuke Mori, Fumihiko Takahashi, Katsutoshi Itoyama, Hiroshi G. Okuno
