Search Results for author: Seong-Whan Lee

Found 94 papers, 19 papers with code

Parameter-free Geometric Document Layout Analysis

no code implementations IEEE Transactions on Pattern Analysis and Machine Intelligence 2001 Seong-Whan Lee, Senior Member, IEEE, and Dae-Seok Ryu

Based on the proposed periodicity measure, multiscale analysis, and confirmation procedure, we could develop a robust method for geometric document layout analysis independent of character font sizes, text line spacing, and document layout structures.

Attribute Document Layout Analysis +1

Relative Attributing Propagation: Interpreting the Comparative Contributions of Individual Units in Deep Neural Networks

1 code implementation 1 Apr 2019 Woo-Jeoung Nam, Shir Gur, Jaesik Choi, Lior Wolf, Seong-Whan Lee

As Deep Neural Networks (DNNs) have demonstrated superhuman performance in a variety of fields, there is an increasing interest in understanding the complex internal mechanisms of DNNs.

Interpreting Undesirable Pixels for Image Classification on Black-Box Models

no code implementations 27 Sep 2019 Sin-Han Kang, Hong-Gyu Jung, Seong-Whan Lee

To tackle this issue, in this paper, we propose an explanation method that visualizes undesirable regions to classify an image as a target class.

Classification General Classification +1

Network of Evolvable Neural Units: Evolving to Learn at a Synaptic Level

no code implementations 16 Dec 2019 Paul Bertens, Seong-Whan Lee

Here a model is proposed that bridges Neuroscience, Machine Learning and Evolutionary Algorithms to evolve individual soma and synaptic compartment models of neurons in a scalable manner.

Evolutionary Algorithms

Mel-spectrogram augmentation for sequence to sequence voice conversion

2 code implementations 6 Jan 2020 Yeongtae Hwang, Hyemin Cho, Hongsun Yang, Dong-Ok Won, Insoo Oh, Seong-Whan Lee

In addition, we proposed new policies (i.e., frequency warping, loudness and time length control) for more data variations.

Voice Conversion

Towards Brain-Computer Interfaces for Drone Swarm Control

no code implementations 3 Feb 2020 Ji-Hoon Jeong, Dae-Hyeok Lee, Hyung-Ju Ahn, Seong-Whan Lee

Hence, we could confirm the feasibility of the drone swarm control system based on EEG signals for performing high-level tasks.

Brain Computer Interface EEG +1

A Two-Stream Symmetric Network with Bidirectional Ensemble for Aerial Image Matching

2 code implementations 4 Feb 2020 Jae-Hyun Park, Woo-Jeoung Nam, Seong-Whan Lee

As a result, the training process of the deep network is regularized and the network becomes robust for the variance of aerial images.

Three-Stream Fusion Network for First-Person Interaction Recognition

no code implementations 19 Feb 2020 Ye-Ji Kim, Dong-Gyu Lee, Seong-Whan Lee

First-person interaction recognition is a challenging task because of unstable video conditions resulting from the camera wearer's movement.

Activity Recognition Human Interaction Recognition

A Novel Online Action Detection Framework from Untrimmed Video Streams

no code implementations 17 Mar 2020 Da-Hye Yoon, Nam-Gyu Cho, Seong-Whan Lee

Online temporal action localization from an untrimmed video stream is a challenging problem in computer vision.

Online Action Detection Temporal Action Localization

Few-Shot Learning with Geometric Constraints

no code implementations 20 Mar 2020 Hong-Gyu Jung, Seong-Whan Lee

We assume a network trained for base categories with a large number of training examples, and we aim to add novel categories to it that have only a few, e.g., one or five, training examples.

Few-Shot Learning

Prediction of Memory Retrieval Performance Using Ear-EEG Signals

no code implementations 4 May 2020 Jenifer Kalafatovich, Minji Lee, Seong-Whan Lee

These results showed that it is possible to predict performance of a memory task using ear-EEG signals and it could be used for predicting memory retrieval in a practical brain-computer interface.

Brain Computer Interface EEG +1

End-to-End Automatic Sleep Stage Classification Using Spectral-Temporal Sleep Features

no code implementations 4 May 2020 Hyeong-Jin Kim, Minji Lee, Seong-Whan Lee

For five sleep stage classification, the classification performance was 85.6% and 91.1% using the raw input data and the proposed input, respectively.

Automatic Sleep Stage Classification Classification +2

Assessment of Unconsciousness for Memory Consolidation Using EEG Signals

no code implementations 15 May 2020 Gi-Hwan Shin, Minji Lee, Seong-Whan Lee

Seven participants performed two memory tasks (word-pairs and visuo-spatial) before and after the nap to assess the memory consolidation during unconsciousness.

EEG

Decoding of Intuitive Visual Motion Imagery Using Convolutional Neural Network under 3D-BCI Training Environment

no code implementations 15 May 2020 Byoung-Hee Kwon, Ji-Hoon Jeong, Jeong-Hyun Cho, Seong-Whan Lee

As a result, the averaged classification performance of the proposed architecture for 4 classes from 16 channels was 67.50% across all subjects.

Brain Computer Interface General Classification

Reconstructing ERP Signals Using Generative Adversarial Networks for Mobile Brain-Machine Interface

no code implementations 18 May 2020 Young-Eun Lee, Minji Lee, Seong-Whan Lee

As a result, the reconstructed signals had important components such as N200 and P300 similar to ERP during standing.

EEG ERP

Few-Shot Object Detection via Knowledge Transfer

no code implementations 28 Aug 2020 Geonuk Kim, Hong-Gyu Jung, Seong-Whan Lee

If there are only a few training data and annotations, the object detectors easily overfit and fail to generalize.

Few-Shot Object Detection Object +2

Decoding Visual Recognition of Objects from EEG Signals based on Attention-Driven Convolutional Neural Network

no code implementations 28 Aug 2020 Jenifer Kalafatovich, Minji Lee, Seong-Whan Lee

Our findings showed that EEG signals can be differentiated, both when subjects are presented with visual stimuli of different semantic categories and at an exemplar level, with high classification accuracy; this demonstrates their viability for application in a real-world BMI.

EEG General Classification

Classification of Imagined Speech Using Siamese Neural Network

no code implementations 28 Aug 2020 Dong-Yeon Lee, Minji Lee, Seong-Whan Lee

The proposed framework would help to increase the classification performance of imagined speech for a small amount of data and implement an intuitive communication system.

Classification EEG +1

Online Multi-Object Tracking and Segmentation with GMPHD Filter and Mask-based Affinity Fusion

1 code implementation 31 Aug 2020 Young-min Song, Young-chul Yoon, Kwangjin Yoon, Moongu Jeon, Seong-Whan Lee, Witold Pedrycz

One affinity, for position and motion, is computed by using the GMPHD filter, and the other affinity, for appearance, is computed by using the responses from a single object tracker such as a kernelized correlation filter.

Instance Segmentation Multi-Object Tracking +2

Rotation Invariant Aerial Image Retrieval with Group Convolutional Metric Learning

no code implementations 19 Oct 2020 Hyunseung Chung, Woo-Jeoung Nam, Seong-Whan Lee

In this work, we introduce a novel method for retrieving aerial images by merging group convolution with attention mechanism and metric learning, resulting in robustness to rotational variations.

Image Retrieval Metric Learning +1

Interpreting Deep Neural Networks with Relative Sectional Propagation by Analyzing Comparative Gradients and Hostile Activations

no code implementations 7 Dec 2020 Woo-Jeoung Nam, Jaesik Choi, Seong-Whan Lee

As a result, it is possible to assign the bi-polar relevance scores of the target (positive) and hostile (negative) attributions while maintaining each attribution aligned with the importance.

Uncertainty-Aware Human Mesh Recovery From Video by Learning Part-Based 3D Dynamics

no code implementations ICCV 2021 Gun-Hee Lee, Seong-Whan Lee

Despite the recent success of 3D human reconstruction methods, recovering the accurate and smooth 3D human motion from video is still challenging.

3D Human Reconstruction 3D Reconstruction +2

Human Interaction Recognition Framework based on Interacting Body Part Attention

no code implementations 22 Jan 2021 Dong-Gyu Lee, Seong-Whan Lee

In this paper, we propose a novel framework that simultaneously considers both implicit and explicit representations of human interactions by fusing information of local image where the interaction actively occurred, primitive motion with the posture of individual subject's body parts, and the co-occurrence of overall appearance change.

Activity Recognition In Videos Human Activity Recognition +1

Visual Question Answering based on Local-Scene-Aware Referring Expression Generation

no code implementations 22 Jan 2021 Jung-Jun Kim, Dong-Gyu Lee, Jialin Wu, Hong-Gyu Jung, Seong-Whan Lee

We quantitatively and qualitatively evaluated the proposed method on the VQA v2 dataset and compared it with state-of-the-art methods in terms of answer prediction.

Question Answering Referring Expression +2

Weakly Supervised Thoracic Disease Localization via Disease Masks

no code implementations 25 Jan 2021 Hyun-Woo Kim, Hong-Gyu Jung, Seong-Whan Lee

To enable a deep learning-based system to be used in the medical domain as a computer-aided diagnosis system, it is essential to not only classify diseases but also present the locations of the diseases.

FBCNet: A Multi-view Convolutional Neural Network for Brain-Computer Interface

1 code implementation 17 Mar 2021 Ravikiran Mane, Effie Chew, Karen Chua, Kai Keng Ang, Neethu Robinson, A. P. Vinod, Seong-Whan Lee, Cuntai Guan

With this design, we compare FBCNet with state-of-the-art (SOTA) BCI algorithm on four MI datasets: The BCI competition IV dataset 2a (BCIC-IV-2a), the OpenBMI dataset, and two large datasets from chronic stroke patients.

Binary Classification Classification +3

ACNet: Mask-Aware Attention with Dynamic Context Enhancement for Robust Acne Detection

no code implementations 31 May 2021 Kyungseo Min, Gun-Hee Lee, Seong-Whan Lee

To address these problems, we propose an acne detection network which consists of three components, specifically: Composite Feature Refinement, Dynamic Context Enhancement, and Mask-Aware Multi-Attention.

Fre-GAN: Adversarial Frequency-consistent Audio Synthesis

2 code implementations 4 Jun 2021 Ji-Hoon Kim, Sang-Hoon Lee, Ji-Hyun Lee, Seong-Whan Lee

Although recent works on neural vocoder have improved the quality of synthesized audio, there still exists a gap between generated and ground-truth audio in frequency space.

Audio Synthesis

Reinforce-Aligner: Reinforcement Alignment Search for Robust End-to-End Text-to-Speech

no code implementations 5 Jun 2021 Hyunseung Chung, Sang-Hoon Lee, Seong-Whan Lee

Experimental results also show the superiority of our proposed model compared to other state-of-the-art TTS models with internal and external aligners.

Subject-Independent Brain-Computer Interface for Decoding High-Level Visual Imagery Tasks

no code implementations 8 Jun 2021 Dae-Hyeok Lee, Dong-Kyun Han, Sung-Jin Kim, Ji-Hoon Jeong, Seong-Whan Lee

Communication between humans and a drone using electroencephalogram (EEG) signals is one of the most challenging issues in the BCI domain.

Brain Computer Interface EEG

Towards Natural Brain-Machine Interaction using Endogenous Potentials based on Deep Neural Networks

no code implementations 25 Jun 2021 Hyung-Ju Ahn, Dae-Hyeok Lee, Ji-Hoon Jeong, Seong-Whan Lee

Moreover, our proposed TINN showed the highest accuracy of 0.93 compared to the previous methods for classifying three different types of mental imagery tasks (MI, VI, and SI).

EEG Motor Imagery

Detection of Abnormal Behavior with Self-Supervised Gaze Estimation

no code implementations 14 Jul 2021 Suneung Kim, Seong-Whan Lee

In this paper, we present a single video conferencing solution using gaze estimation in preparation for these problems.

Anomaly Detection Gaze Estimation

DAL: Feature Learning from Overt Speech to Decode Imagined Speech-based EEG Signals with Convolutional Autoencoder

no code implementations 15 Jul 2021 Dae-Hyeok Lee, Sung-Jin Kim, Seong-Whan Lee

In addition, when comparing the performance between w/o and w/ EEG features of overt speech, there was a performance improvement of 7.42% when including EEG features of overt speech.

Brain Computer Interface EEG

Motor Imagery Classification based on CNN-GRU Network with Spatio-Temporal Feature Representation

no code implementations 15 Jul 2021 Ji-Seon Bang, Seong-Whan Lee

In the classification model, CNN is responsible for spatial feature extraction and GRU is responsible for temporal feature extraction.

Classification EEG +1

Joint Dermatological Lesion Classification and Confidence Modeling with Uncertainty Estimation

no code implementations 19 Jul 2021 Gun-Hee Lee, Han-Bin Ko, Seong-Whan Lee

Deep learning has played a major role in the interpretation of dermoscopic images for detecting skin defects and abnormalities.

Classification Lesion Classification

Precise Aerial Image Matching based on Deep Homography Estimation

no code implementations 19 Jul 2021 Myeong-Seok Oh, Yong-Ju Lee, Seong-Whan Lee

In this paper, we propose a deep homography alignment network to precisely match two aerial images by progressively estimating the various transformation parameters.

Homography Estimation Image Registration

Improving Interpretability of Deep Neural Networks in Medical Diagnosis by Investigating the Individual Units

no code implementations 19 Jul 2021 Woo-Jeoung Nam, Seong-Whan Lee

As an intuitive assessment metric for explanations, we report the Intersection over Union between the visual explanation and the bounding box of lesions.

Medical Diagnosis

GC-TTS: Few-shot Speaker Adaptation with Geometric Constraints

no code implementations 16 Aug 2021 Ji-Hoon Kim, Sang-Hoon Lee, Ji-Hyun Lee, Hong-Gyu Jung, Seong-Whan Lee

While numerous attempts have been made to the few-shot speaker adaptation system, there is still a gap in terms of speaker similarity to the target speaker depending on the amount of data.

VoiceMixer: Adversarial Voice Style Mixup

no code implementations NeurIPS 2021 Sang-Hoon Lee, Ji-Hoon Kim, Hyunseung Chung, Seong-Whan Lee

This insufficiency leads to the converted speech containing source speech style or losing source speech content.

Disentanglement Voice Conversion

Interpretable Convolutional Neural Networks for Subject-Independent Motor Imagery Classification

no code implementations 14 Dec 2021 Ji-Seon Bang, Seong-Whan Lee

Furthermore, we classified EEG in a subject-independent manner to learn robust and generalized EEG features by avoiding subject dependency.

EEG Motor Imagery

Style-Guided Domain Adaptation for Face Presentation Attack Detection

no code implementations 28 Mar 2022 Young-Eun Kim, Woo-Jeoung Nam, Kyungseo Min, Seong-Whan Lee

Domain adaptation (DA) or domain generalization (DG) for face presentation attack detection (PAD) has attracted attention recently with its robustness against unseen attack scenarios.

Domain Generalization Face Presentation Attack Detection +1

Prototype-based Domain Generalization Framework for Subject-Independent Brain-Computer Interfaces

no code implementations 15 Apr 2022 Serkan Musellim, Dong-Kyun Han, Ji-Hoon Jeong, Seong-Whan Lee

For this purpose, in this paper, we proposed a framework that employs the open-set recognition technique as an auxiliary task to learn subject-specific style features from the source dataset while helping the shared feature extractor with mapping the features of the unseen target dataset as a new unseen domain.

Brain Computer Interface Domain Generalization +2

Decoding Neural Correlation of Language-Specific Imagined Speech using EEG Signals

no code implementations 15 Apr 2022 Keon-Woo Lee, Dae-Hyeok Lee, Sung-Jin Kim, Seong-Whan Lee

In this paper, we investigated the neural signals for two groups of native speakers with two tasks with different languages, English and Chinese.

Brain Computer Interface EEG

Few-Shot Object Detection with Proposal Balance Refinement

no code implementations 22 Apr 2022 Sueyeon Kim, Woo-Jeoung Nam, Seong-Whan Lee

Few-shot object detection has gained significant attention in recent years as it has the potential to greatly reduce the reliance on large amounts of manually annotated bounding boxes.

Few-Shot Learning Few-Shot Object Detection +3

Gradient Hedging for Intensively Exploring Salient Interpretation beyond Neuron Activation

no code implementations 23 May 2022 Woo-Jeoung Nam, Seong-Whan Lee

Hedging is a strategy for reducing the potential risks in various types of investments by adopting an opposite position in a related asset.

Neural Architecture Adaptation for Object Detection by Searching Channel Dimensions and Mapping Pre-trained Parameters

no code implementations 17 Jun 2022 Harim Jung, Myeong-Seok Oh, Cheoljong Yang, Seong-Whan Lee

Most object detection frameworks use backbone architectures originally designed for image classification, conventionally with pre-trained parameters on ImageNet.

Classification Image Classification +4

Factorization Approach for Sparse Spatio-Temporal Brain-Computer Interface

no code implementations 17 Jun 2022 Byeong-Hoo Lee, Jeong-Hyun Cho, Byoung-Hee Kwon, Seong-Whan Lee

From the results, we demonstrated that factorizing the EEG signal allows the model to extract rich and decisive features under sparse condition.

EEG Motor Imagery

Multi-Contextual Predictions with Vision Transformer for Video Anomaly Detection

no code implementations 17 Jun 2022 Joo-Yeon Lee, Woo-Jeoung Nam, Seong-Whan Lee

Video Anomaly Detection (VAD) has been traditionally tackled in two main methodologies: the reconstruction-based approach and the prediction-based one.

Anomaly Detection Video Anomaly Detection

OTPose: Occlusion-Aware Transformer for Pose Estimation in Sparsely-Labeled Videos

1 code implementation 20 Jul 2022 Kyung-Min Jin, Gun-Hee Lee, Seong-Whan Lee

We achieve state-of-the-art pose estimation results for PoseTrack2017 and PoseTrack2018 datasets and demonstrate the robustness of our approach to occlusion and motion blur in sparsely annotated video data.

Pose Estimation

HTNet: Anchor-free Temporal Action Localization with Hierarchical Transformers

no code implementations 20 Jul 2022 Tae-Kyung Kang, Gun-Hee Lee, Seong-Whan Lee

Temporal action localization (TAL) is a task of identifying a set of actions in a video, which involves localizing the start and end frames and classifying each action instance.

Temporal Action Localization

Spatial Reasoning for Few-Shot Object Detection

no code implementations 2 Nov 2022 Geonuk Kim, Hong-Gyu Jung, Seong-Whan Lee

Although modern object detectors rely heavily on a significant amount of training data, humans can easily detect novel objects using a few training examples.

Data Augmentation Few-Shot Object Detection +2

Kinematic-aware Hierarchical Attention Network for Human Pose Estimation in Videos

1 code implementation 29 Nov 2022 Kyung-Min Jin, Byoung-Sung Lim, Gun-Hee Lee, Tae-Kyung Kang, Seong-Whan Lee

Previous video-based human pose estimation methods have shown promising results by leveraging aggregated features of consecutive frames.

2D Pose Estimation 3D Human Pose Estimation +1

Towards Voice Reconstruction from EEG during Imagined Speech

1 code implementation 2 Jan 2023 Young-Eun Lee, Seo-Hyun Lee, Sang-Ho Kim, Seong-Whan Lee

Translating imagined speech from human brain activity into voice is a challenging and absorbing research issue that can provide new means of human communication via brain signals.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +3

DDDM-VC: Decoupled Denoising Diffusion Models with Disentangled Representation and Prior Mixup for Verified Robust Voice Conversion

1 code implementation 25 May 2023 Ha-Yeong Choi, Sang-Hoon Lee, Seong-Whan Lee

To address the above problem, this paper presents decoupled denoising diffusion models (DDDMs) with disentangled representations, which can control the style for each attribute in generative models.

Denoising Style Transfer +1

HiddenSinger: High-Quality Singing Voice Synthesis via Neural Audio Codec and Latent Diffusion Models

no code implementations 12 Jun 2023 Ji-Sang Hwang, Sang-Hoon Lee, Seong-Whan Lee

To alleviate the challenges posed by model complexity in singing voice synthesis, we propose HiddenSinger, a high-quality singing voice synthesis system using a neural audio codec and latent diffusion models.

Denoising Singing Voice Synthesis +1

Diff-E: Diffusion-based Learning for Decoding Imagined Speech EEG

1 code implementation 26 Jul 2023 Soowon Kim, Young-Eun Lee, Seo-Hyun Lee, Seong-Whan Lee

Decoding EEG signals for imagined speech is a challenging task due to the high-dimensional nature of the data and low signal-to-noise ratio.

Denoising EEG +1

HierVST: Hierarchical Adaptive Zero-shot Voice Style Transfer

no code implementations 30 Jul 2023 Sang-Hoon Lee, Ha-Yeong Choi, Hyung-Seok Oh, Seong-Whan Lee

With a hierarchical adaptive structure, the model can adapt to a novel voice style and convert speech progressively.

Style Transfer Variational Inference

DiffProsody: Diffusion-based Latent Prosody Generation for Expressive Speech Synthesis with Prosody Conditional Adversarial Training

1 code implementation 31 Jul 2023 Hyung-Seok Oh, Sang-Hoon Lee, Seong-Whan Lee

Expressive text-to-speech systems have undergone significant advancements owing to prosody modeling, but conventional methods can still be improved.

Denoising Expressive Speech Synthesis

Local-Global Temporal Fusion Network with an Attention Mechanism for Multiple and Multiclass Arrhythmia Classification

no code implementations 3 Aug 2023 Yun Kwan Kim, Minji Lee, Kunwook Jo, Hee Seok Song, Seong-Whan Lee

To check the generalization ability of the proposed method, an AFDB-trained model was tested on the MITDB, and superior performance was attained compared with that of a state-of-the-art model.

Arrhythmia Detection Temporal Information Extraction

DeepHealthNet: Adolescent Obesity Prediction System Based on a Deep Learning Framework

no code implementations 28 Aug 2023 Ji-Hoon Jeong, In-Gyu Lee, Sung-Kyung Kim, Tae-Eui Kam, Seong-Whan Lee, Euijong Lee

Childhood and adolescent obesity rates are a global concern because obesity is associated with chronic diseases and long-term health risks.

Data Augmentation

NeuroInspect: Interpretable Neuron-based Debugging Framework through Class-conditional Visualizations

1 code implementation 11 Oct 2023 Yeong-Joon Ju, Ji-Hoon Park, Seong-Whan Lee

We validate the effectiveness of our framework by addressing false correlations and improving inferences for classes with the worst performance in real-world settings.

counterfactual Decision Making +1

Sample Dominance Aware Framework via Non-Parametric Estimation for Spontaneous Brain-Computer Interface

no code implementations 13 Nov 2023 Byeong-Hoo Lee, Byoung-Hee Kwon, Seong-Whan Lee

In this study, we introduce the concept of sample dominance as a measure of EEG signal inconsistency and propose a method to modulate its effect on network training.

Brain Computer Interface EEG

Brain-Driven Representation Learning Based on Diffusion Model

no code implementations 14 Nov 2023 Soowon Kim, Seo-Hyun Lee, Young-Eun Lee, Ji-Won Lee, Ji-Ha Park, Seong-Whan Lee

Interpreting EEG signals linked to spoken language presents a complex challenge, given the data's intricate temporal and spatial attributes, as well as the various noise factors.

Denoising EEG +1

Multi-Signal Reconstruction Using Masked Autoencoder From EEG During Polysomnography

no code implementations 14 Nov 2023 Young-Seok Kweon, Gi-Hwan Shin, Heon-Gyu Kwak, Ha-Na Jo, Seong-Whan Lee

Polysomnography (PSG) is an indispensable diagnostic tool in sleep medicine, essential for identifying various sleep disorders.

EEG

Enhanced Generative Adversarial Networks for Unseen Word Generation from EEG Signals

no code implementations 14 Nov 2023 Young-Eun Lee, Seo-Hyun Lee, Soowon Kim, Jung-Sun Lee, Deok-Seon Kim, Seong-Whan Lee

Recent advances in brain-computer interface (BCI) technology, particularly based on generative adversarial networks (GAN), have shown great promise for improving decoding performance for BCI.

Brain Computer Interface Data Augmentation +3

Neurophysiological Response Based on Auditory Sense for Brain Modulation Using Monaural Beat

no code implementations 15 Nov 2023 Ha-Na Jo, Young-Seok Kweon, Gi-Hwan Shin, Heon-Gyu Kwak, Seong-Whan Lee

For analysis, we calculated the power spectral density (PSD) of EEG for each session and compared them in frequency, time, and five brain regions.

EEG

Impact of Nap on Performance in Different Working Memory Tasks Using EEG

no code implementations 15 Nov 2023 Gi-Hwan Shin, Young-Seok Kweon, Heon-Gyu Kwak, Ha-Na Jo, Seong-Whan Lee

Electroencephalography (EEG) has been widely used to study the relationship between naps and working memory, yet the effects of naps on distinct working memory tasks remain unclear.

EEG

Sparse Multitask Learning for Efficient Neural Representation of Motor Imagery and Execution

no code implementations 10 Dec 2023 Hye-Bin Shin, Kang Yin, Seong-Whan Lee

In the quest for efficient neural network models for neural data interpretation and user intent classification in brain-computer interfaces (BCIs), learning meaningful sparse representations of the underlying neural subspaces is crucial.

Efficient Neural Network intent-classification +2

Neural Speech Embeddings for Speech Synthesis Based on Deep Generative Networks

no code implementations 10 Dec 2023 Seo-Hyun Lee, Young-Eun Lee, Soowon Kim, Byung-Kwan Ko, Jun-Young Kim, Seong-Whan Lee

Brain-to-speech technology represents a fusion of interdisciplinary applications encompassing fields of artificial intelligence, brain-computer interfaces, and speech synthesis.

Representation Learning Speech Synthesis

Towards Better Visualizing the Decision Basis of Networks via Unfold and Conquer Attribution Guidance

no code implementations 21 Dec 2023 Jung-Ho Hong, Woo-Jeoung Nam, Kyu-Sung Jeon, Seong-Whan Lee

Revealing the transparency of Deep Neural Networks (DNNs) has been widely studied to describe the decision mechanisms of network inner structures.

DurFlex-EVC: Duration-Flexible Emotional Voice Conversion with Parallel Generation

1 code implementation 16 Jan 2024 Hyung-Seok Oh, Sang-Hoon Lee, Deok-Hyeon Cho, Seong-Whan Lee

Emotional voice conversion (EVC) seeks to modify the emotional tone of a speaker's voice while preserving the original linguistic content and the speaker's unique vocal characteristics.

Disentanglement Self-Supervised Learning +1

TranSentence: Speech-to-speech Translation via Language-agnostic Sentence-level Speech Encoding without Language-parallel Data

no code implementations 17 Jan 2024 Seung-bin Kim, Sang-Hoon Lee, Seong-Whan Lee

With this method, despite training exclusively on the target language's monolingual data, we can generate target language speech in the inference stage using language-agnostic speech embedding from the source language speech.

Sentence Speech-to-Speech Translation +1

AM-SORT: Adaptable Motion Predictor with Historical Trajectory Embedding for Multi-Object Tracking

no code implementations 25 Jan 2024 Vitaliy Kim, Gunho Jung, Seong-Whan Lee

AM-SORT is a novel extension of the SORT-series trackers that supersedes the Kalman Filter with the transformer architecture as a motion predictor.

Multi-Object Tracking

TIFu: Tri-directional Implicit Function for High-Fidelity 3D Character Reconstruction

no code implementations 25 Jan 2024 Byoungsung Lim, Seong-Whan Lee

Recent advances in implicit function-based approaches have shown promising results in 3D human reconstruction from a single RGB image.

3D Human Reconstruction 3D Reconstruction +1

Edge Conditional Node Update Graph Neural Network for Multi-variate Time Series Anomaly Detection

no code implementations 25 Jan 2024 Hayoung Jo, Seong-Whan Lee

Moreover, the graph attention mechanism, commonly used to infer unknown graph structures, could constrain the diversity of source node representations.

Anomaly Detection Graph Attention +2

Appearance Debiased Gaze Estimation via Stochastic Subject-Wise Adversarial Learning

no code implementations 25 Jan 2024 Suneung Kim, Woo-Jeoung Nam, Seong-Whan Lee

In this paper, we address these challenges and propose a novel framework: Stochastic subject-wise Adversarial gaZE learning (SAZE), which trains a network to generalize the appearance of subjects.

Gaze Estimation

Explaining generative diffusion models via visual analysis for interpretable decision-making process

no code implementations 16 Feb 2024 Ji-Hoon Park, Yeong-Joon Ju, Seong-Whan Lee

To address this issue, we propose the three research questions to interpret the diffusion process from the perspective of the visual concepts generated by the model and the region where the model attends in each time step.

Decision Making Denoising

TE-TAD: Towards Full End-to-End Temporal Action Detection via Time-Aligned Coordinate Expression

no code implementations 3 Apr 2024 Ho-Joong Kim, Jung-Ho Hong, Heejo Kong, Seong-Whan Lee

In this paper, we investigate that the normalized coordinate expression is a key factor as reliance on hand-crafted components in query-based detectors for temporal action detection (TAD).

Action Detection object-detection +1
