Personalized speech enhancement has been an active research field for the suppression of speech-like interferers such as competing speakers or TV dialogue.
Recently, speech enhancement technologies that are based on deep learning have received considerable research attention.
As verified on indoor and outdoor 3D LiDAR datasets, our proposed method achieves robust global registration compared with other global registration methods, even for distant point cloud pairs.
Given these metric poses and monocular sequences, we propose a self-supervised learning method for pre-trained supervised monocular depth networks to enable metrically scaled depth estimation.
Furthermore, the proposed enhancement system was compared with a baseline system that uses speaker embeddings and interchannel phase differences.
We present a novel approach for estimating depth from a monocular camera as it moves through complex and crowded indoor environments, e.g., a department store or a metro station.
In this paper, we introduce five new indoor datasets for visual localization in challenging real-world environments.
We present a novel algorithm for self-supervised monocular depth completion.