Search Results for author: Minje Kim

Found 36 papers, 8 papers with code

Meta-Heuristic Fronthaul Bit Allocation for Cell-free Massive MIMO Systems

no code implementations • 28 Mar 2024 • Minje Kim, In-soo Kim, Junil Choi

The limited capacity of fronthaul links in a cell-free massive multiple-input multiple-output (MIMO) system can cause quantization errors at a central processing unit (CPU) during data transmission, complicating the centralized rate optimization problem.

Fairness, Quantization

BiTT: Bi-directional Texture Reconstruction of Interacting Two Hands from a Single Image

1 code implementation • 13 Mar 2024 • Minje Kim, Tae-Kyun Kim

Creating personalized hand avatars is important for offering users a realistic experience on AR/VR platforms.

AiSDF: Structure-aware Neural Signed Distance Fields in Indoor Scenes

no code implementations • 4 Mar 2024 • Jaehoon Jang, Inha Lee, Minje Kim, Kyungdon Joo

The indoor scenes we live in are often visually homogeneous or textureless, yet they inherently have structural forms that provide enough structural priors for 3D scene reconstruction.

3D Scene Reconstruction

Analyzing Downlink Coverage in Clustered Low Earth Orbit Satellite Constellations: A Stochastic Geometry Approach

no code implementations • 26 Feb 2024 • Miyeon Lee, Sucheol Kim, Minje Kim, Dong-Hyun Jung, Junil Choi

Our analyses can be used to design reliable satellite cluster networks by effectively estimating the impact of system parameters on the coverage performance.

Point Processes

Hyperbolic Distance-Based Speech Separation

no code implementations • 7 Jan 2024 • Darius Petermann, Minje Kim

In this work, we explore the task of hierarchical distance-based speech separation defined on a hyperbolic manifold.

Speech Separation
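
For intuition about the hyperbolic setting, below is a minimal NumPy sketch of the geodesic distance in the Poincaré ball, a standard model of the hyperbolic manifold; `poincare_distance` is an illustrative helper, and the paper's actual embedding network and separation objective are not reproduced here.

```python
import numpy as np

def poincare_distance(u, v, eps=1e-9):
    """Geodesic distance between two points inside the unit (Poincare) ball."""
    sq = np.sum((u - v) ** 2)
    denom = (1 - np.sum(u ** 2)) * (1 - np.sum(v ** 2))
    return np.arccosh(1 + 2 * sq / (denom + eps))

# Points near the ball's boundary grow exponentially far apart, which is
# why hyperbolic space suits hierarchical (tree-like) source structure.
poincare_distance(np.array([0.0, 0.0]), np.array([0.9, 0.0]))  # ~2.94
```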

Generative De-Quantization for Neural Speech Codec via Latent Diffusion

no code implementations • 14 Nov 2023 • Haici Yang, Inseon Jang, Minje Kim

In low-bitrate speech coding, end-to-end speech coding networks aim to learn compact yet expressive features and a powerful decoder in a single network.

Quantization, Representation Learning

The Potential of Neural Speech Synthesis-based Data Augmentation for Personalized Speech Enhancement

no code implementations • 14 Nov 2022 • Anastasia Kuznetsova, Aswin Sivaraman, Minje Kim

In the proposed method, we show that the quality of the NSS system's synthetic data matters; if it is good enough, the augmented dataset can be used to train a PSE system that outperforms the speaker-agnostic baseline.

Data Augmentation, Speech Enhancement, +1

Neural Feature Predictor and Discriminative Residual Coding for Low-Bitrate Speech Coding

no code implementations • 4 Nov 2022 • Haici Yang, Wootaek Lim, Minje Kim

Low and ultra-low-bitrate neural speech coding achieves unprecedented coding gain by generating speech signals from compact speech features.

Upmixing via style transfer: a variational autoencoder for disentangling spatial images and musical content

no code implementations • 22 Mar 2022 • Haici Yang, Sanna Wager, Spencer Russell, Mike Luo, Minje Kim, Wontak Kim

In the stereo-to-multichannel upmixing problem for music, one of the main tasks is to set the directionality of the instrument sources in the multichannel rendering results.

Style Transfer

SpaIn-Net: Spatially-Informed Stereophonic Music Source Separation

no code implementations • 15 Feb 2022 • Darius Petermann, Minje Kim

With the recent advancements of data-driven approaches using deep neural networks, music source separation has been formulated as an instrument-specific supervised problem.

Disentanglement, Music Source Separation

BLOOM-Net: Blockwise Optimization for Masking Networks Toward Scalable and Efficient Speech Enhancement

no code implementations • 17 Nov 2021 • Sunwoo Kim, Minje Kim

In this paper, we present a blockwise optimization method for masking-based networks (BLOOM-Net) for training scalable speech enhancement networks.

Speech Enhancement
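
As a rough illustration of the blockwise idea, here is a minimal PyTorch sketch in which each block owns a mask head and is trained while all preceding blocks stay frozen, so the stack can later be truncated for cheaper inference. The `MaskBlock` module, layer sizes, and loss are illustrative assumptions, not BLOOM-Net's actual architecture.

```python
import torch
import torch.nn as nn

class MaskBlock(nn.Module):
    """One refinement block with its own mask head (sizes are illustrative)."""
    def __init__(self, dim=257, hidden=512):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(dim, hidden), nn.ReLU(),
                                  nn.Linear(hidden, dim))
        self.mask_head = nn.Sequential(nn.Linear(dim, dim), nn.Sigmoid())

    def forward(self, feat):
        feat = feat + self.body(feat)          # residual feature refinement
        return feat, self.mask_head(feat)

def train_blockwise(noisy_mag, clean_mag, blocks, steps=100):
    """Train one block at a time; earlier blocks stay frozen."""
    for b, block in enumerate(blocks):
        opt = torch.optim.Adam(block.parameters(), lr=1e-3)
        for _ in range(steps):
            with torch.no_grad():              # frozen preceding blocks
                feat = noisy_mag
                for prev in blocks[:b]:
                    feat, _ = prev(feat)
            feat, mask = block(feat)
            loss = ((mask * noisy_mag - clean_mag) ** 2).mean()
            opt.zero_grad(); loss.backward(); opt.step()

blocks = nn.ModuleList([MaskBlock() for _ in range(3)])
train_blockwise(torch.rand(8, 257), torch.rand(8, 257), blocks)  # toy frames
```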

Don't Separate, Learn to Remix: End-to-End Neural Remixing with Joint Optimization

no code implementations • 28 Jul 2021 • Haici Yang, Shivani Firodiya, Nicholas J. Bryan, Minje Kim

In this work, we learn to remix music directly by re-purposing Conv-TasNet, a well-known source separation model, into two neural remixing architectures.

Data Augmentation
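
To make the remixing formulation concrete, here is a toy PyTorch sketch: a separator backbone predicts per-source waveforms, which are scaled by user-specified gains and summed back into a remix. The `NeuralRemixer` class and its one-layer 'separator' are hypothetical stand-ins; the paper repurposes Conv-TasNet and trains the pipeline end-to-end.

```python
import torch
import torch.nn as nn

class NeuralRemixer(nn.Module):
    """Toy remixer: separate, apply per-source gains, and sum."""
    def __init__(self, n_sources=4):
        super().__init__()
        # one-layer stand-in for a real separator such as Conv-TasNet
        self.separator = nn.Conv1d(1, n_sources, kernel_size=9, padding=4)

    def forward(self, mixture, gains):
        # mixture: (batch, 1, time); gains: (batch, n_sources)
        sources = self.separator(mixture)           # (batch, n_sources, time)
        return (gains.unsqueeze(-1) * sources).sum(dim=1, keepdim=True)

remixer = NeuralRemixer()
x = torch.randn(2, 1, 16000)                        # 1 s of audio at 16 kHz
g = torch.tensor([[1.0, 0.5, 2.0, 1.0]] * 2)        # boost source 3, cut source 2
y = remixer(x, g)                                   # remixed waveform
```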

HARP-Net: Hyper-Autoencoded Reconstruction Propagation for Scalable Neural Audio Coding

no code implementations • 22 Jul 2021 • Darius Petermann, SeungKwon Beack, Minje Kim

The assumption is that, in a mirrored autoencoder topology, a decoder layer reconstructs the intermediate feature representation of its corresponding encoder layer.

Quantization
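
A minimal PyTorch sketch of that mirrored-autoencoder assumption: each decoder intermediate is pulled toward its corresponding encoder intermediate by an auxiliary loss. Layer sizes and the 0.1 loss weight are illustrative; HARP-Net additionally compresses the encoder-to-decoder skip connections with their own autoencoders, which is omitted here.

```python
import torch
import torch.nn as nn

enc = nn.ModuleList([nn.Conv1d(1, 16, 9, padding=4),
                     nn.Conv1d(16, 32, 9, padding=4)])
dec = nn.ModuleList([nn.Conv1d(32, 16, 9, padding=4),
                     nn.Conv1d(16, 1, 9, padding=4)])

def forward_with_mirror_loss(x):
    feats, h = [], x
    for layer in enc:
        h = torch.tanh(layer(h))
        feats.append(h)                          # keep encoder intermediates
    aux = 0.0
    for i, layer in enumerate(dec):
        h = layer(h)
        if i < len(dec) - 1:
            h = torch.tanh(h)
            mirror = feats[len(enc) - 2 - i]     # mirrored encoder feature
            aux = aux + ((h - mirror) ** 2).mean()
    recon = ((h - x) ** 2).mean()                # final waveform reconstruction
    return recon + 0.1 * aux                     # joint objective

loss = forward_with_mirror_loss(torch.randn(1, 1, 512))
loss.backward()
```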

Zero-Shot Personalized Speech Enhancement through Speaker-Informed Model Selection

no code implementations • 8 May 2021 • Aswin Sivaraman, Minje Kim

To this end, we propose using an ensemble model wherein each specialist module denoises noisy utterances from a distinct partition of training set speakers.

Clustering, Denoising, +4

Test-Time Adaptation Toward Personalized Speech Enhancement: Zero-Shot Learning with Knowledge Distillation

no code implementations • 8 May 2021 • Sunwoo Kim, Minje Kim

In addition, since the compact personalized models can outperform larger general-purpose models, we claim that the proposed method performs model compression with no loss of denoising performance.

Denoising, Knowledge Distillation, +5

Efficient Personalized Speech Enhancement through Self-Supervised Learning

no code implementations • 5 Apr 2021 • Aswin Sivaraman, Minje Kim

To this end, we pose personalization as either a zero-shot task, in which no additional clean speech of the target speaker is used for training, or a few-shot learning task, in which the goal is to minimize the duration of the clean speech used for transfer learning.

Few-Shot Learning, Model Compression, +3

Personalized Speech Enhancement through Self-Supervised Data Augmentation and Purification

no code implementations • 5 Apr 2021 • Aswin Sivaraman, Sunwoo Kim, Minje Kim

Training personalized speech enhancement models is innately a no-shot learning problem due to privacy constraints and limited access to noise-free speech from the target user.

Data Augmentation, Denoising, +3

Scalable and Efficient Neural Speech Coding: A Hybrid Design

no code implementations • 27 Mar 2021 • Kai Zhen, Jongmo Sung, Mi Suk Lee, Seungkwon Beack, Minje Kim

We formulate the speech coding problem as an autoencoding task, where a convolutional neural network (CNN) performs encoding and decoding as a neural waveform codec (NWC) during its feedforward routine.

Quantization
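
A toy PyTorch sketch of this autoencoding formulation: a Conv1d encoder-decoder over raw waveform frames with a uniform scalar quantizer (trained via the straight-through estimator) in the bottleneck. `TinyWaveformCodec`, its layer sizes, and the quantizer are illustrative assumptions, not the paper's actual NWC.

```python
import torch
import torch.nn as nn

class TinyWaveformCodec(nn.Module):
    def __init__(self, levels=32):
        super().__init__()
        self.levels = levels
        self.enc = nn.Sequential(
            nn.Conv1d(1, 32, 9, stride=2, padding=4), nn.Tanh(),
            nn.Conv1d(32, 4, 9, stride=2, padding=4), nn.Tanh())
        self.dec = nn.Sequential(
            nn.ConvTranspose1d(4, 32, 9, stride=2, padding=4,
                               output_padding=1), nn.Tanh(),
            nn.ConvTranspose1d(32, 1, 9, stride=2, padding=4,
                               output_padding=1))

    def forward(self, x):
        z = self.enc(x)                                   # compact code in (-1, 1)
        zq = torch.round(z * self.levels) / self.levels   # uniform quantization
        z = z + (zq - z).detach()                         # straight-through gradient
        return self.dec(z)

codec = TinyWaveformCodec()
x = torch.randn(1, 1, 512)      # one frame of raw audio
x_hat = codec(x)                # decoded waveform, same shape as x
```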

Psychoacoustic Calibration of Loss Functions for Efficient End-to-End Neural Audio Coding

no code implementations • 31 Dec 2020 • Kai Zhen, Mi Suk Lee, Jongmo Sung, SeungKwon Beack, Minje Kim

Conventional audio coding technologies commonly leverage human perception of sound, or psychoacoustics, to reduce the bitrate while preserving the perceptual quality of the decoded audio signals.

Self-Supervised Learning from Contrastive Mixtures for Personalized Speech Enhancement

1 code implementation • 6 Nov 2020 • Aswin Sivaraman, Minje Kim

This work explores how self-supervised learning can be universally used to discover speaker-specific features towards enabling personalized speech enhancement models.

Contrastive Learning, Few-Shot Learning, +3

Sparse Mixture of Local Experts for Efficient Speech Enhancement

1 code implementation • 16 May 2020 • Aswin Sivaraman, Minje Kim

In this paper, we investigate a deep learning approach for speech denoising through an efficient ensemble of specialist neural networks.

Speech Denoising, Speech Enhancement

Boosted Locality Sensitive Hashing: Discriminative Binary Codes for Source Separation

1 code implementation • 14 Feb 2020 • Sunwoo Kim, Haici Yang, Minje Kim

Speech enhancement tasks have seen significant improvements with the advance of deep learning technology, but at the cost of increased computational complexity.

Binary Classification, Denoising, +2

Deep Autotuner: a Pitch Correcting Network for Singing Performances

1 code implementation • 12 Feb 2020 • Sanna Wager, George Tzanetakis, Cheng-i Wang, Minje Kim

We train our neural network model using a dataset of 4,702 amateur karaoke performances selected for good intonation.

Nearest Neighbor Search-Based Bitwise Source Separation Using Discriminant Winner-Take-All Hashing

no code implementations • 26 Aug 2019 • Sunwoo Kim, Minje Kim

We propose an iteration-free source separation algorithm based on Winner-Take-All (WTA) hash codes, which is a faster, yet accurate alternative to a complex machine learning model for single-channel source separation in a resource-constrained environment.

Denoising
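
For intuition, here is a NumPy sketch of vanilla Winner-Take-All hashing and hash-based nearest-neighbor matching; the paper learns a discriminative variant of the hash function, which this plain version omits.

```python
import numpy as np

def wta_hash(x, perms, K):
    """WTA code: for each permutation, the argmax position among the
    first K permuted elements; robust to monotonic scaling of x."""
    return np.array([np.argmax(x[p[:K]]) for p in perms])

rng = np.random.default_rng(0)
dim, n_codes, K = 513, 64, 4                    # e.g. magnitude-spectrum frames
perms = [rng.permutation(dim) for _ in range(n_codes)]

database = rng.random((1000, dim))              # stand-in clean-speech frames
query = rng.random(dim)                         # stand-in noisy frame

db_codes = np.array([wta_hash(f, perms, K) for f in database])
q_code = wta_hash(query, perms, K)
similarity = (db_codes == q_code).sum(axis=1)   # count of matching code entries
nearest = np.argmax(similarity)                 # best-matching database frame
```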

Incremental Binarization On Recurrent Neural Networks For Single-Channel Source Separation

no code implementations • 23 Aug 2019 • Sunwoo Kim, Mrinmoy Maity, Minje Kim

Our experiments show that the proposed BGRU method produces source separation results better than those of a real-valued fully connected network, with a mean Signal-to-Distortion Ratio (SDR) of 11-12 dB.

Binarization, Quantization

A Dual-Staged Context Aggregation Method Towards Efficient End-To-End Speech Enhancement

no code implementations • 18 Aug 2019 • Kai Zhen, Mi Suk Lee, Minje Kim

In speech enhancement, an end-to-end deep neural network converts a noisy speech signal to clean speech directly in the time domain, without time-frequency transformation or mask estimation.

Speech Enhancement

Cascaded Cross-Module Residual Learning towards Lightweight End-to-End Speech Coding

no code implementations • 18 Jun 2019 • Kai Zhen, Jongmo Sung, Mi Suk Lee, Seung-Kwon Beack, Minje Kim

Speech codecs learn compact representations of speech signals to facilitate data transmission.

Performance Optimization on Model Synchronization in Parallel Stochastic Gradient Descent Based SVM

no code implementations • 3 May 2019 • Vibhatha Abeykoon, Geoffrey Fox, Minje Kim

In this research, we identify the bottlenecks in model synchronization in parallel stochastic gradient descent (PSGD)-based SVM algorithm with respect to the training model synchronization frequency (MSF).

Model Optimization

AutoQ: Automated Kernel-Wise Neural Network Quantization

no code implementations • ICLR 2020 • Qian Lou, Feng Guo, Lantao Liu, Minje Kim, Lei Jiang

Recent network quantization techniques quantize each weight kernel in a convolutional layer independently for higher inference accuracy, since the weight kernels in a layer exhibit different variances and hence have different amounts of redundancy.

AutoML, Quantization
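
A minimal sketch of the kernel-wise part of the idea: each output-channel kernel of a conv weight gets its own quantization scale and its own bitwidth. The `quantize_kernelwise` helper and the hand-picked bitwidths are illustrative; AutoQ's contribution is searching those bitwidths automatically, which is not shown here.

```python
import torch

def quantize_kernelwise(weight, bits):
    """Symmetric uniform quantization per output-channel kernel.
    weight: (out_ch, in_ch, kH, kW); bits: (out_ch,) integer bitwidths."""
    q = torch.empty_like(weight)
    for k in range(weight.shape[0]):
        n_levels = 2 ** (int(bits[k]) - 1) - 1
        scale = weight[k].abs().max() / n_levels + 1e-12
        q[k] = torch.clamp(torch.round(weight[k] / scale),
                           -n_levels, n_levels) * scale
    return q

w = torch.randn(8, 3, 3, 3)                      # one conv layer's weights
bits = torch.tensor([8, 4, 4, 8, 2, 8, 4, 8])    # per-kernel bitwidths (by hand)
w_q = quantize_kernelwise(w, bits)
```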

Deep Autotuner: A Data-Driven Approach to Natural-Sounding Pitch Correction for Singing Voice in Karaoke Performances

no code implementations • 3 Feb 2019 • Sanna Wager, George Tzanetakis, Cheng-i Wang, Lijiang Guo, Aswin Sivaraman, Minje Kim

This approach differs from commercially used automatic pitch correction systems, where notes in the vocal tracks are shifted to be centered around notes in a user-defined score or mapped to the closest pitch among the twelve equal-tempered scale degrees.
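
The "closest pitch among the twelve equal-tempered scale degrees" baseline can be written down directly; a small NumPy sketch, assuming A4 = 440 Hz tuning:

```python
import numpy as np

def snap_to_equal_temperament(f_hz, a4=440.0):
    """Map a frequency to the nearest equal-tempered semitone."""
    midi = np.round(69 + 12 * np.log2(f_hz / a4))   # nearest MIDI note number
    return a4 * 2 ** ((midi - 69) / 12)             # back to Hz

snap_to_equal_temperament(451.0)   # -> 440.0 (snaps a sharp A4 back to pitch)
```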

DeepPicar: A Low-cost Deep Neural Network-based Autonomous Car

6 code implementations • 19 Dec 2017 • Michael Garrett Bechtel, Elise McEllhiney, Minje Kim, Heechul Yun

We present DeepPicar, a low-cost deep neural network-based autonomous car platform.

Other Computer Science; Distributed, Parallel, and Cluster Computing; Performance

Collaborative Deep Learning for Speech Enhancement: A Run-Time Model Selection Method Using Autoencoders

no code implementations • 29 May 2017 • Minje Kim

Therefore, the AE can gauge the quality of each module-specific denoised result from its AE reconstruction error; e.g., a low error means that the module output is similar to clean speech.

Model Selection, Speech Enhancement
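
A NumPy sketch of that selection rule: run every specialist module on the noisy input, score each output by its AE reconstruction error, and keep the lowest-error candidate. The modules and the autoencoder below are toy stand-in callables, not trained models.

```python
import numpy as np

def select_module_output(noisy, modules, autoencoder):
    """Pick the specialist output whose AE reconstruction error is lowest,
    i.e., the one that looks most like the clean speech the AE models."""
    candidates = [m(noisy) for m in modules]
    errors = [np.mean((autoencoder(c) - c) ** 2) for c in candidates]
    return candidates[int(np.argmin(errors))]

# toy stand-ins: two "specialists" and a trivial "autoencoder"
modules = [lambda x: 0.9 * x, lambda x: np.clip(x, -1.0, 1.0)]
autoencoder = lambda x: 0.95 * x
denoised = select_module_output(np.random.randn(16000), modules, autoencoder)
```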

Bitwise Neural Networks

no code implementations • 22 Jan 2016 • Minje Kim, Paris Smaragdis

Based on the assumption that there exists a neural network that efficiently represents a set of Boolean functions between all binary inputs and outputs, we propose a process for developing and deploying neural networks whose weight parameters, bias terms, inputs, and intermediate hidden-layer output signals are all binary-valued and require only basic bit logic for the feedforward pass.
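
With all signals constrained to ±1, an inner product reduces to XNOR followed by popcount; a small NumPy sketch (boolean arrays stand in for packed bit vectors, and `binary_dot` is an illustrative helper):

```python
import numpy as np

def binary_dot(x_bits, w_bits):
    """Dot product of two +/-1 vectors stored as booleans (1 -> +1, 0 -> -1):
    XNOR counts agreements, and dot = 2 * agreements - n."""
    agreements = np.count_nonzero(x_bits == w_bits)   # XNOR + popcount
    return 2 * agreements - x_bits.size

rng = np.random.default_rng(0)
x = rng.integers(0, 2, 256).astype(bool)              # binarized input
W = rng.integers(0, 2, (64, 256)).astype(bool)        # binary weight matrix
# one feedforward layer: sign thresholding keeps the signal binary
h = np.array([binary_dot(x, w) for w in W]) >= 0
```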

Joint Optimization of Masks and Deep Recurrent Neural Networks for Monaural Source Separation

2 code implementations • 13 Feb 2015 • Po-Sen Huang, Minje Kim, Mark Hasegawa-Johnson, Paris Smaragdis

In this paper, we explore joint optimization of masking functions and deep recurrent neural networks for monaural source separation tasks, including monaural speech separation, monaural singing voice separation, and speech denoising.

Denoising, Speech Denoising, +1
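
A minimal NumPy sketch of the soft time-frequency masking step that the paper optimizes jointly with the deep RNN; the source estimates here are random placeholders rather than network outputs.

```python
import numpy as np

mix_mag = np.abs(np.random.randn(257, 100))      # |STFT| of the mixture
y1, y2 = np.abs(np.random.randn(2, 257, 100))    # two source estimates
eps = 1e-8
m1 = y1 / (y1 + y2 + eps)                        # soft mask; m2 = 1 - m1
s1_hat = m1 * mix_mag                            # masked estimate of source 1
s2_hat = (1 - m1) * mix_mag                      # masked estimate of source 2
```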
