no code implementations • 28 Mar 2024 • Minje Kim, In-soo Kim, Junil Choi
The limited capacity of fronthaul links in a cell-free massive multiple-input multiple-output (MIMO) system can cause quantization errors at a central processing unit (CPU) during data transmission, complicating the centralized rate optimization problem.
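As a rough illustration of the fronthaul bottleneck (a generic toy quantizer, not the paper's actual quantization model), a b-bit uniform quantizer shows how limited fronthaul capacity injects quantization error into the signals the CPU receives:

```python
import numpy as np

def uniform_quantize(x, num_bits):
    """Uniformly quantize x to 2**num_bits levels over its dynamic range.

    A toy stand-in for the fronthaul quantizer; the paper's model may differ.
    """
    levels = 2 ** num_bits
    x_min, x_max = x.min(), x.max()
    step = (x_max - x_min) / (levels - 1)
    return np.round((x - x_min) / step) * step + x_min

rng = np.random.default_rng(0)
signal = rng.standard_normal(10_000)      # stand-in for signals forwarded to the CPU
for b in (2, 4, 8):                       # fewer bits -> coarser fronthaul
    err = signal - uniform_quantize(signal, b)
    print(f"{b}-bit fronthaul, quantization-error power: {err.var():.5f}")
```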
1 code implementation • 13 Mar 2024 • Minje Kim, Tae-Kyun Kim
Creating personalized hand avatars is important for offering users a realistic experience on AR/VR platforms.
no code implementations • 4 Mar 2024 • Jaehoon Jang, Inha Lee, Minje Kim, Kyungdon Joo
The indoor scenes we live in are often visually homogeneous or textureless, yet they inherently have structural forms that provide enough structural priors for 3D scene reconstruction.
no code implementations • 26 Feb 2024 • Miyeon Lee, Sucheol Kim, Minje Kim, Dong-Hyun Jung, Junil Choi
Our analyses can be used to design reliable satellite cluster networks by effectively estimating the impact of system parameters on the coverage performance.
no code implementations • 7 Jan 2024 • Darius Petermann, Minje Kim
In this work, we explore the task of hierarchical distance-based speech separation defined on a hyperbolic manifold.
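For readers unfamiliar with hyperbolic embeddings, the geodesic distance on the Poincaré ball, one standard model of a hyperbolic manifold (the paper's exact formulation may differ), is only a few lines:

```python
import numpy as np

def poincare_distance(u, v, eps=1e-9):
    """Geodesic distance between two points inside the unit Poincare ball."""
    sq_diff = np.sum((u - v) ** 2)
    denom = (1.0 - np.sum(u ** 2)) * (1.0 - np.sum(v ** 2))
    return np.arccosh(1.0 + 2.0 * sq_diff / (denom + eps))

# Points near the boundary are exponentially "far" from each other,
# which is what makes the space natural for hierarchical structure.
u = np.array([0.1, 0.0])
v = np.array([0.9, 0.0])
print(poincare_distance(u, v))   # much larger than the Euclidean distance of 0.8
```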
no code implementations • 14 Nov 2023 • Haici Yang, Inseon Jang, Minje Kim
In low-bitrate speech coding, end-to-end speech coding networks aim to learn compact yet expressive features and a powerful decoder in a single network.
no code implementations • 14 Mar 2023 • Darius Petermann, Inseon Jang, Minje Kim
Spectral sub-bands do not all carry the same perceptual relevance.
no code implementations • 14 Nov 2022 • Anastasia Kuznetsova, Aswin Sivaraman, Minje Kim
In the proposed method, we show that the quality of the NSS system's synthetic data matters: if the synthetic data are good enough, the augmented dataset can be used to train a PSE system that outperforms the speaker-agnostic baseline.
no code implementations • 4 Nov 2022 • Haici Yang, Wootaek Lim, Minje Kim
Low and ultra-low-bitrate neural speech coding achieves unprecedented coding gain by generating speech signals from compact speech features.
no code implementations • 22 Mar 2022 • Haici Yang, Sanna Wager, Spencer Russell, Mike Luo, Minje Kim, Wontak Kim
In the stereo-to-multichannel upmixing problem for music, one of the main tasks is to set the directionality of the instrument sources in the multichannel rendering results.
no code implementations • 15 Feb 2022 • Darius Petermann, Minje Kim
With the recent advancements of data-driven approaches using deep neural networks, music source separation has been formulated as an instrument-specific supervised problem.
no code implementations • 17 Nov 2021 • Sunwoo Kim, Minje Kim
In this paper, we present a blockwise optimization method for masking-based networks (BLOOM-Net) for training scalable speech enhancement networks.
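A minimal sketch of the blockwise idea (hypothetical module shapes and losses, not the BLOOM-Net code): train one enhancement block, freeze it, then stack and train the next, so the deployed network can be scaled by stopping after any trained block.

```python
import torch, torch.nn as nn

def make_block(dim=128):
    # Hypothetical masking block; the real BLOOM-Net blocks differ.
    return nn.Sequential(nn.Linear(dim, dim), nn.ReLU(),
                         nn.Linear(dim, dim), nn.Sigmoid())

blocks, dim = [], 128
noisy = torch.randn(32, dim)          # stand-in features of noisy speech
clean = torch.randn(32, dim)          # stand-in clean targets

for stage in range(3):                # add and train one block at a time
    new_block = make_block(dim)
    opt = torch.optim.Adam(new_block.parameters(), lr=1e-3)
    for _ in range(100):
        x = noisy
        with torch.no_grad():         # earlier blocks stay frozen
            for b in blocks:
                x = x * b(x)
        mask = new_block(x)
        loss = nn.functional.mse_loss(x * mask, clean)
        opt.zero_grad(); loss.backward(); opt.step()
    blocks.append(new_block)
```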
no code implementations • 28 Jul 2021 • Haici Yang, Shivani Firodiya, Nicholas J. Bryan, Minje Kim
In this work, we learn to remix music directly by re-purposing Conv-TasNet, a well-known source separation model, into two neural remixing architectures.
no code implementations • 22 Jul 2021 • Darius Petermann, SeungKwon Beack, Minje Kim
The assumption is that, in a mirrored autoencoder topology, a decoder layer reconstructs the intermediate feature representation of its corresponding encoder layer.
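That mirrored-topology assumption can be sketched as an auxiliary loss (hypothetical layer sizes, not the paper's implementation): each decoder layer's output is pushed toward the feature map produced by its mirror-image encoder layer.

```python
import torch, torch.nn as nn, torch.nn.functional as F

enc1, enc2 = nn.Conv1d(1, 16, 9, padding=4), nn.Conv1d(16, 32, 9, padding=4)
dec2, dec1 = nn.Conv1d(32, 16, 9, padding=4), nn.Conv1d(16, 1, 9, padding=4)

x  = torch.randn(8, 1, 512)      # a batch of waveform frames
h1 = torch.relu(enc1(x))         # encoder layer 1 features
h2 = torch.relu(enc2(h1))        # encoder layer 2 features (the "code")
g1 = torch.relu(dec2(h2))        # decoder mirror of enc2: should match h1
y  = dec1(g1)                    # decoder mirror of enc1: should match x

# Output reconstruction plus the mirrored intermediate-feature loss.
loss = F.mse_loss(y, x) + F.mse_loss(g1, h1)
```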
no code implementations • 8 May 2021 • Aswin Sivaraman, Minje Kim
To this end, we propose using an ensemble model wherein each specialist module denoises noisy utterances from a distinct partition of training set speakers.
no code implementations • 8 May 2021 • Sunwoo Kim, Minje Kim
In addition, since the compact personalized models can outperform larger general-purpose models, we claim that the proposed method performs model compression with no loss of denoising performance.
no code implementations • 5 Apr 2021 • Aswin Sivaraman, Minje Kim
To this end, we pose personalization as either a zero-shot task, in which no additional clean speech of the target speaker is used for training, or a few-shot learning task, in which the goal is to minimize the duration of the clean speech used for transfer learning.
no code implementations • 5 Apr 2021 • Aswin Sivaraman, Sunwoo Kim, Minje Kim
Training personalized speech enhancement models is innately a no-shot learning problem due to privacy constraints and limited access to noise-free speech from the target user.
no code implementations • 27 Mar 2021 • Kai Zhen, Jongmo Sung, Mi Suk Lee, SeungKwon Beack, Minje Kim
We formulate the speech coding problem as an autoencoding task, where a convolutional neural network (CNN) performs encoding and decoding as a neural waveform codec (NWC) during its feedforward routine.
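As a toy picture of that encode-quantize-decode routine (hypothetical layer sizes; the paper's NWC uses its own architecture and trainable quantization), a 1-D convolutional autoencoder with a straight-through rounding step in the bottleneck looks like this:

```python
import torch, torch.nn as nn

class ToyNWC(nn.Module):
    """Toy neural waveform codec: conv encoder, quantized code, deconv decoder."""
    def __init__(self):
        super().__init__()
        self.enc = nn.Conv1d(1, 32, 15, stride=4, padding=7)   # downsample waveform
        self.dec = nn.ConvTranspose1d(32, 1, 15, stride=4,
                                      padding=7, output_padding=3)

    def forward(self, x):
        z = torch.tanh(self.enc(x))                       # compact code in [-1, 1]
        # Round to 31 levels; straight-through estimator keeps gradients flowing.
        zq = z + (torch.round(z * 15) / 15 - z).detach()
        return self.dec(zq)

codec = ToyNWC()
x = torch.randn(4, 1, 1024)     # raw waveform frames
x_hat = codec(x)                # decoded waveform, same shape as x
```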
no code implementations • 31 Dec 2020 • Kai Zhen, Mi Suk Lee, Jongmo Sung, SeungKwon Beack, Minje Kim
Conventional audio coding technologies commonly leverage human perception of sound, or psychoacoustics, to reduce the bitrate while preserving the perceptual quality of the decoded audio signals.
1 code implementation • 6 Nov 2020 • Aswin Sivaraman, Minje Kim
This work explores how self-supervised learning can be universally used to discover speaker-specific features towards enabling personalized speech enhancement models.
1 code implementation • 16 May 2020 • Aswin Sivaraman, Minje Kim
In this paper, we investigate a deep learning approach for speech denoising through an efficient ensemble of specialist neural networks.
1 code implementation • 14 Feb 2020 • Sunwoo Kim, Haici Yang, Minje Kim
Speech enhancement tasks have seen significant improvements with the advance of deep learning technology, but at the cost of increased computational complexity.
1 code implementation • 12 Feb 2020 • Sanna Wager, George Tzanetakis, Cheng-i Wang, Minje Kim
We train our neural network model using a dataset of 4,702 amateur karaoke performances selected for good intonation.
no code implementations • 26 Aug 2019 • Sunwoo Kim, Minje Kim
We propose an iteration-free source separation algorithm based on Winner-Take-All (WTA) hash codes, which is a faster, yet accurate alternative to a complex machine learning model for single-channel source separation in a resource-constrained environment.
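A minimal Winner-Take-All hashing sketch (generic WTA, not the paper's exact configuration): each hash code records which of K randomly chosen dimensions is largest, so similarity search reduces to cheap code matching instead of full model inference.

```python
import numpy as np

def wta_hash(X, num_codes=64, K=4, seed=0):
    """Return (n_samples, num_codes) integer WTA codes for the rows of X."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    # For each code, sample K dimensions; the code is the argmax among them.
    idx = np.stack([rng.permutation(d)[:K] for _ in range(num_codes)])
    return np.argmax(X[:, idx], axis=2)        # (n_samples, num_codes)

A = np.random.randn(100, 257)                  # e.g., magnitude spectra
codes = wta_hash(A)
# Matching a query against the database is just counting agreeing codes.
query = wta_hash(A[:1])
matches = (codes == query).sum(axis=1)         # higher -> more similar to query
```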
no code implementations • 23 Aug 2019 • Sunwoo Kim, Mrinmoy Maity, Minje Kim
Our experiments show that the proposed BGRU method produces source separation results better than those of a real-valued fully connected network, with a mean Signal-to-Distortion Ratio (SDR) of 11-12 dB.
no code implementations • 18 Aug 2019 • Kai Zhen, Mi Suk Lee, Minje Kim
In speech enhancement, an end-to-end deep neural network converts a noisy speech signal to clean speech directly in the time domain, without time-frequency transformation or mask estimation.
no code implementations • 18 Jun 2019 • Kai Zhen, Jongmo Sung, Mi Suk Lee, Seung-Kwon Beack, Minje Kim
Speech codecs learn compact representations of speech signals to facilitate data transmission.
no code implementations • 3 May 2019 • Vibhatha Abeykoon, Geoffrey Fox, Minje Kim
In this research, we identify the bottlenecks in model synchronization in parallel stochastic gradient descent (PSGD)-based SVM algorithm with respect to the training model synchronization frequency (MSF).
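The synchronization-frequency knob can be sketched in a generic parallel SGD loop (a simplified single-process simulation, not the paper's SVM implementation): workers take local steps and average their models every MSF iterations, trading communication cost against model drift.

```python
import numpy as np

def parallel_sgd(grads_fn, w0, num_workers=4, steps=1000, msf=10, lr=0.01):
    """Simulate PSGD: local steps per worker, model averaging every `msf` steps."""
    workers = [w0.copy() for _ in range(num_workers)]
    for t in range(steps):
        for k in range(num_workers):
            workers[k] -= lr * grads_fn(workers[k], k)   # local gradient step
        if (t + 1) % msf == 0:                           # model synchronization
            avg = np.mean(workers, axis=0)
            workers = [avg.copy() for _ in range(num_workers)]
    return np.mean(workers, axis=0)

# Toy per-worker quadratic objective; a large `msf` saves communication
# but lets the worker models drift apart between synchronizations.
grads = lambda w, k: 2 * (w - k)
print(parallel_sgd(grads, np.zeros(1)))
```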
no code implementations • ICLR 2020 • Qian Lou, Feng Guo, Lantao Liu, Minje Kim, Lei Jiang
Recent network quantization techniques quantize each weight kernel in a convolutional layer independently for higher inference accuracy, since the weight kernels in a layer exhibit different variances and hence have different amounts of redundancy.
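A minimal sketch of kernel-wise, as opposed to layer-wise, quantization (generic symmetric quantization; the paper's scheme may differ): each weight kernel gets its own scale, so low-variance kernels are not crushed by one shared step size.

```python
import numpy as np

def quantize_per_kernel(W, num_bits=4):
    """Symmetrically quantize each output-channel kernel with its own scale.

    W: (out_channels, in_channels, kH, kW) convolutional weights.
    """
    qmax = 2 ** (num_bits - 1) - 1
    Wq = np.empty_like(W)
    for c in range(W.shape[0]):
        scale = max(np.abs(W[c]).max() / qmax, 1e-12)  # per-kernel step size
        Wq[c] = np.round(W[c] / scale) * scale         # quantize, then dequantize
    return Wq

# Kernels with very different variances, as the abstract describes.
W = np.random.randn(64, 32, 3, 3) * np.linspace(0.1, 2, 64)[:, None, None, None]
err = np.abs(W - quantize_per_kernel(W)).mean()  # smaller than with one shared scale
```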
no code implementations • 3 Feb 2019 • Sanna Wager, George Tzanetakis, Cheng-i Wang, Lijiang Guo, Aswin Sivaraman, Minje Kim
This approach differs from commercially used automatic pitch correction systems, where notes in the vocal tracks are shifted to be centered around notes in a user-defined score or mapped to the closest pitch among the twelve equal-tempered scale degrees.
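For contrast, the commercial-style baseline the authors mention, snapping each pitch to the nearest of the twelve equal-tempered scale degrees, is only a few lines (a generic sketch, not the paper's model):

```python
import numpy as np

def snap_to_equal_temperament(f0_hz, a4=440.0):
    """Map each F0 estimate to the nearest equal-tempered semitone."""
    midi = 69 + 12 * np.log2(f0_hz / a4)              # continuous MIDI pitch
    return a4 * 2 ** ((np.round(midi) - 69) / 12)     # nearest semitone, back to Hz

print(snap_to_equal_temperament(np.array([452.0, 430.0])))  # both snap to A4 = 440 Hz
```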
6 code implementations • 19 Dec 2017 • Michael Garrett Bechtel, Elise McEllhiney, Minje Kim, Heechul Yun
We present DeepPicar, a low-cost deep neural network based autonomous car platform.
no code implementations • 29 May 2017 • Minje Kim
Therefore, the AE can gauge the quality of the module-specific denoised result from its AE reconstruction error; e.g., a low error means that the module output is similar to clean speech.
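That gating rule can be sketched as follows (toy stand-in functions; the paper's modules are speech-specific networks): denoise with every specialist, score each output with an autoencoder trained on clean speech, and keep the output the AE reconstructs best.

```python
import numpy as np

def select_by_ae_error(noisy, specialists, ae):
    """Pick the specialist output the clean-speech AE reconstructs best."""
    candidates = [m(noisy) for m in specialists]          # module-specific denoising
    errors = [np.mean((ae(c) - c) ** 2) for c in candidates]
    return candidates[int(np.argmin(errors))]             # low AE error ~ clean-like

# Toy stand-ins: each "specialist" and the "AE" is just a function here.
specialists = [lambda x: 0.9 * x, lambda x: np.tanh(x)]
ae = lambda x: 0.99 * x          # pretend reconstruction of clean-like inputs
out = select_by_ae_error(np.random.randn(256), specialists, ae)
```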
no code implementations • 22 Jan 2016 • Minje Kim, Paris Smaragdis
Based on the assumption that there exists a neural network that efficiently represents a set of Boolean functions between all binary inputs and outputs, we propose a process for developing and deploying neural networks whose weight parameters, bias terms, inputs, and intermediate hidden-layer outputs are all binary-valued and require only basic bit logic for the feedforward pass.
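A toy version of such a binary feedforward pass (generic XNOR-popcount arithmetic; the paper's construction is more elaborate): with inputs and weights in {-1, +1} stored as bits, the dot product reduces to an XNOR followed by a popcount.

```python
import numpy as np

def xnor_popcount_layer(x_bits, W_bits):
    """Binary dense layer: bits encode {-1,+1}; dot product via XNOR + popcount.

    x_bits: (d,) uint8 in {0,1};  W_bits: (units, d) uint8 in {0,1}.
    """
    d = x_bits.shape[0]
    # XNOR counts positions where input and weight bits agree.
    agree = np.logical_not(np.logical_xor(x_bits, W_bits)).sum(axis=1)
    preact = 2 * agree - d                   # equals the sum of {-1,+1} products
    return (preact >= 0).astype(np.uint8)    # hard-sign activation, still binary

x = np.random.randint(0, 2, 128).astype(np.uint8)
W = np.random.randint(0, 2, (64, 128)).astype(np.uint8)
h = xnor_popcount_layer(x, W)                # binary hidden activations
```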
2 code implementations • 13 Feb 2015 • Po-Sen Huang, Minje Kim, Mark Hasegawa-Johnson, Paris Smaragdis
In this paper, we explore joint optimization of masking functions and deep recurrent neural networks for monaural source separation tasks, including monaural speech separation, monaural singing voice separation, and speech denoising.
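The masking function they jointly optimize can be sketched as a final network layer (a simplified PyTorch stand-in for their recurrent architecture): the network predicts per-source magnitudes, and a soft ratio mask applied to the mixture spectrogram guarantees the source estimates sum back to the mixture.

```python
import torch, torch.nn as nn

class MaskingSeparator(nn.Module):
    """RNN separator with a joint soft time-frequency masking layer."""
    def __init__(self, n_freq=513, hidden=256, n_src=2):
        super().__init__()
        self.rnn = nn.GRU(n_freq, hidden, batch_first=True)
        self.out = nn.Linear(hidden, n_freq * n_src)
        self.n_src, self.n_freq = n_src, n_freq

    def forward(self, mix_mag):                          # (batch, time, freq)
        h, _ = self.rnn(mix_mag)
        y = torch.relu(self.out(h)).view(*mix_mag.shape[:2], self.n_src, self.n_freq)
        mask = y / (y.sum(dim=2, keepdim=True) + 1e-8)   # soft ratio mask per source
        return mask * mix_mag.unsqueeze(2)               # masked source estimates

model = MaskingSeparator()
est = model(torch.rand(4, 100, 513))    # (batch, time, n_src, freq)
```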
1 code implementation • ICASSP 2014 • Po-Sen Huang, Minje Kim, Mark Hasegawa-Johnson, Paris Smaragdis
In this paper, we study deep learning for monaural speech separation.