Search Results for author: Sangyoun Lee

Found 68 papers, 30 papers with code

GenCLIP: Generalizing CLIP Prompts for Zero-shot Anomaly Detection

no code implementations 21 Apr 2025 Donghyeong Kim, Chaewon Park, Suhwan Cho, Hyeonjeong Lim, Minseok Kang, Jungho Lee, Sangyoun Lee

Zero-shot anomaly detection (ZSAD) aims to identify anomalies in unseen categories by leveraging CLIP's zero-shot capabilities to match text prompts with visual features.

Anomaly Detection Specificity +1

CoMoGaussian: Continuous Motion-Aware Gaussian Splatting from Motion-Blurred Images

1 code implementation 7 Mar 2025 Jungho Lee, Donghyeong Kim, Dogyoon Lee, Suhwan Cho, Minhyeok Lee, Wonjoon Lee, Taeoh Kim, Dongyoon Wee, Sangyoun Lee

3D Gaussian Splatting (3DGS) has gained significant attention for its high-quality novel view rendering, motivating research to address real-world challenges.

3DGS 3D Scene Reconstruction

CoCoGaussian: Leveraging Circle of Confusion for Gaussian Splatting from Defocused Images

no code implementations 20 Dec 2024 Jungho Lee, Suhwan Cho, Taeoh Kim, Ho-Deok Jang, Minhyeok Lee, Geonho Cha, Dongyoon Wee, Dogyoon Lee, Sangyoun Lee

While conventional methods depend on sharp images for accurate scene reconstruction, real-world scenarios are often affected by defocus blur due to finite depth of field, making it essential to account for realistic 3D scene representation.

3DGS

Elevating Flow-Guided Video Inpainting with Reference Generation

1 code implementation 12 Dec 2024 Suhwan Cho, Seoung Wug Oh, Sangyoun Lee, Joon-Young Lee

Powered by a strong generative model, our method not only significantly enhances frame-level quality for object removal but also synthesizes new content in the missing areas based on user-provided text prompts.

2k Video Inpainting

Effective SAM Combination for Open-Vocabulary Semantic Segmentation

no code implementations 22 Nov 2024 Minhyeok Lee, Suhwan Cho, Jungho Lee, Sunghun Yang, Heeseung Choi, Ig-Jae Kim, Sangyoun Lee

Open-vocabulary semantic segmentation aims to assign pixel-level labels to images across an unlimited range of classes.

Decoder Language Modeling +4

Transforming Static Images Using Generative Models for Video Salient Object Detection

1 code implementation 21 Nov 2024 Suhwan Cho, Minhyeok Lee, Jungho Lee, Sangyoun Lee

This ability allows the model to generate plausible optical flows, preserving semantic integrity while reflecting the independent motion of scene elements.

Ranked #1 on Video Salient Object Detection on DAVSOD-easy35 (using extra training data)

object-detection Salient Object Detection +2

Multi-Scale Feature Prediction with Auxiliary-Info for Neural Image Compression

no code implementations 19 Sep 2024 Chajin Shin, Sangjin Lee, Sangyoun Lee

Finally, we introduce the Auxiliary info-guided Parameter Estimation (APE) module, which predicts the approximation of the latent vector and estimates the probability distribution of these residuals.

Image Compression parameter estimation +1

Video Diffusion Models are Strong Video Inpainter

no code implementations 21 Aug 2024 Minhyeok Lee, Suhwan Cho, Chajin Shin, Jungho Lee, Sunghun Yang, Sangyoun Lee

However, it has limitations such as the inaccuracy of optical flow prediction and the propagation of noise over time.

Optical Flow Estimation Video Inpainting

WWW: Where, Which and Whatever Enhancing Interpretability in Multimodal Deepfake Detection

1 code implementation 6 Aug 2024 Juho Jung, Sangyoun Lee, Jooeon Kang, Yunjin Na

All current benchmarks for multimodal deepfake detection manipulate entire frames using various generation techniques, resulting in oversaturated detection accuracies exceeding 94% at the video-level classification.

DeepFake Detection Face Swapping

Enhancing Temporal Action Localization: Advanced S6 Modeling with Recurrent Mechanism

1 code implementation 18 Jul 2024 Sangyoun Lee, Juho Jung, Changdae Oh, Sunghee Yun

Temporal Action Localization (TAL) is a critical task in video analysis, identifying precise start and end times of actions.

Temporal Action Localization

Improving Unsupervised Video Object Segmentation via Fake Flow Generation

1 code implementation 16 Jul 2024 Suhwan Cho, Minhyeok Lee, Jungho Lee, Donghyeong Kim, Seunghoon Lee, Sungmin Woo, Sangyoun Lee

Unsupervised video object segmentation (VOS), also known as video salient object detection, aims to detect the most prominent object in a video at the pixel level.

Object object-detection +6

ProDepth: Boosting Self-Supervised Multi-Frame Monocular Depth with Probabilistic Fusion

1 code implementation 12 Jul 2024 Sungmin Woo, Wonjoon Lee, Woo Jin Kim, Dogyoon Lee, Sangyoun Lee

Self-supervised multi-frame monocular depth estimation relies on the geometric consistency between successive frames under the assumption of a static scene.

Decoder Depth Prediction +2

Sparse-DeRF: Deblurred Neural Radiance Fields from Sparse View

no code implementations 9 Jul 2024 Dogyoon Lee, Donghyeong Kim, Jungho Lee, Minhyeok Lee, Seunghoon Lee, Sangyoun Lee

Recent studies construct deblurred neural radiance fields (DeRF) using dozens of blurry images, which are not practical scenarios if only a limited number of blurry images are available.

Deblurring Image Deblurring +1

CRiM-GS: Continuous Rigid Motion-Aware Gaussian Splatting from Motion-Blurred Images

no code implementations 4 Jul 2024 Jungho Lee, Donghyeong Kim, Dogyoon Lee, Suhwan Cho, Minhyeok Lee, Sangyoun Lee

3D Gaussian Splatting (3DGS) has gained significant attention for its high-quality novel view rendering, motivating research to address real-world challenges.

3DGS 3D Scene Reconstruction

Separate and Reconstruct: Asymmetric Encoder-Decoder for Speech Separation

1 code implementation 10 Jun 2024 Ui-Hyeop Shin, Sangyoun Lee, Taehan Kim, Hyung-Min Park

To achieve this, an asymmetric strategy is presented in which the encoder and decoder are partitioned to perform distinct processing in separation tasks.

Chunking Speech Separation

SMURF: Continuous Dynamics for Motion-Deblurring Radiance Fields

1 code implementation 12 Mar 2024 Jungho Lee, Dogyoon Lee, Minhyeok Lee, Donghyung Kim, Sangyoun Lee

Neural radiance fields (NeRF) have attracted considerable attention for their exceptional ability in synthesizing novel views with high fidelity.

Deblurring NeRF

FIMP: Future Interaction Modeling for Multi-Agent Motion Prediction

no code implementations 29 Jan 2024 Sungmin Woo, Minjung Kim, Donghyeong Kim, Sungjun Jang, Sangyoun Lee

Multi-agent motion prediction is a crucial concern in autonomous driving, yet it remains a challenge owing to the ambiguous intentions of dynamic agents and their intricate interactions.

Decoder Motion Forecasting +2

Synchronizing Vision and Language: Bidirectional Token-Masking AutoEncoder for Referring Image Segmentation

no code implementations 29 Nov 2023 Minhyeok Lee, Dogyoon Lee, Jungho Lee, Suhwan Cho, Heeseung Choi, Ig-Jae Kim, Sangyoun Lee

While these methods match language features with image features to effectively identify likely target objects, they often struggle to correctly understand contextual information in complex and ambiguous sentences and scenes.

Image Segmentation Semantic Segmentation

Treating Motion as Option with Output Selection for Unsupervised Video Object Segmentation

1 code implementation 26 Sep 2023 Suhwan Cho, Minhyeok Lee, Jungho Lee, MyeongAh Cho, Sangyoun Lee

Unsupervised video object segmentation (VOS) is a task that aims to detect the most salient object in a video without external guidance about the object.

Object Optical Flow Estimation +3

Adaptive Graph Convolution Module for Salient Object Detection

no code implementations 17 Mar 2023 Yongwoo Lee, Minhyeok Lee, Suhwan Cho, Sangyoun Lee

Salient object detection (SOD) is a task that involves identifying and segmenting the most visually prominent object in an image.

Object object-detection +2

Tsanet: Temporal and Scale Alignment for Unsupervised Video Object Segmentation

no code implementations 8 Mar 2023 Seunghoon Lee, Suhwan Cho, Dogyoon Lee, Minhyeok Lee, Sangyoun Lee

In recent works, two approaches to UVOS have been discussed: appearance-based and appearance-motion-based methods, each of which has its own limitations.

Decoder Object +4

One-Shot Video Inpainting

no code implementations 28 Feb 2023 Sangjin Lee, Suhwan Cho, Sangyoun Lee

Usually, a video sequence and object segmentation masks for all frames are required as the input for this task.

Object Segmentation +4

Two-stream Decoder Feature Normality Estimating Network for Industrial Anomaly Detection

no code implementations 20 Feb 2023 Chaewon Park, Minhyeok Lee, Suhwan Cho, Donghyeong Kim, Sangyoun Lee

Image reconstruction-based anomaly detection has recently been in the spotlight because of the difficulty of constructing anomaly datasets.

Anomaly Detection Decoder +2

Look Around for Anomalies: Weakly-Supervised Anomaly Detection via Context-Motion Relational Learning

no code implementations CVPR 2023 MyeongAh Cho, Minjung Kim, Sangwon Hwang, Chaewon Park, Kyungjae Lee, Sangyoun Lee

Furthermore, as the relationship between context and motion is important in order to identify the anomalies in complex and diverse scenes, we propose a Context-Motion Interrelation Module (CoMo), which models the relationship between the appearance of the surroundings and motion, rather than utilizing only temporal dependencies or motion information.

Relational Reasoning Supervised Anomaly Detection +1

Feature Disentanglement Learning with Switching and Aggregation for Video-based Person Re-Identification

no code implementations 16 Dec 2022 Minjung Kim, MyeongAh Cho, Sangyoun Lee

In video person re-identification (Re-ID), the network must consistently extract features of the target person from successive frames.

Disentanglement Video-Based Person Re-Identification

Occluded Person Re-Identification via Relational Adaptive Feature Correction Learning

no code implementations 9 Dec 2022 Minjung Kim, MyeongAh Cho, Heansung Lee, Suhwan Cho, Sangyoun Lee

Occluded person re-identification (Re-ID) in images captured by multiple cameras is challenging because the target person is occluded by pedestrians or objects, especially in crowded scenes.

Occluded Person Re-Identification

Leveraging Spatio-Temporal Dependency for Skeleton-Based Action Recognition

1 code implementation ICCV 2023 Jungho Lee, Minhyeok Lee, Suhwan Cho, Sungmin Woo, Sungjun Jang, Sangyoun Lee

In this paper, we propose the Spatio-Temporal Curve Network (STC-Net) to effectively leverage the spatio-temporal dependency of the human skeleton.

Action Recognition Skeleton Based Action Recognition

DP-NeRF: Deblurred Neural Radiance Field with Physical Scene Priors

1 code implementation CVPR 2023 Dogyoon Lee, Minhyeok Lee, Chajin Shin, Sangyoun Lee

The few studies that have investigated NeRF for blurred images have not considered geometric and appearance consistency in 3D space, which is one of the most important factors in 3D reconstruction.

3D Reconstruction NeRF +1

Boundary-aware Camouflaged Object Detection via Deformable Point Sampling

no code implementations 22 Nov 2022 Minhyeok Lee, Suhwan Cho, Chaewon Park, Dogyoon Lee, Jungho Lee, Sangyoun Lee

The proposed DPS-Net utilizes a Deformable Point Sampling transformer (DPS transformer) that can effectively capture sparse local boundary information of significant object boundaries in COD using a deformable point sampling method.

Object object-detection +2

FAPM: Fast Adaptive Patch Memory for Real-time Industrial Anomaly Detection

1 code implementation 14 Nov 2022 Donghyeong Kim, Chaewon Park, Suhwan Cho, Sangyoun Lee

Feature embedding-based methods have shown exceptional performance in detecting industrial anomalies by comparing features of target images with normal images.

Ranked #46 on Anomaly Detection on MVTec AD (using extra training data)

Anomaly Detection

Unsupervised Video Object Segmentation via Prototype Memory Network

1 code implementation 8 Sep 2022 Minhyeok Lee, Suhwan Cho, Seunghoon Lee, Chaewon Park, Sangyoun Lee

The proposed model effectively extracts the RGB and motion information by extracting superpixel-based component prototypes from the input RGB images and optical flow maps.

Object Optical Flow Estimation +4

Pixel-Level Equalized Matching for Video Object Segmentation

no code implementations 4 Sep 2022 Suhwan Cho, Woo Jin Kim, MyeongAh Cho, Seunghoon Lee, Minhyeok Lee, Chaewon Park, Sangyoun Lee

Feature similarity matching, which transfers the information of the reference frame to the query frame, is a key component in semi-supervised video object segmentation.

Object Semantic Segmentation +2

Expanded Adaptive Scaling Normalization for End to End Image Compression

1 code implementation 5 Aug 2022 Chajin Shin, Hyeongmin Lee, Hanbin Son, Sangjin Lee, Dogyoon Lee, Sangyoun Lee

Then, we increase the receptive field to make the adaptive rescaling module consider the spatial correlation.

Image Compression

NIR-to-VIS Face Recognition via Embedding Relations and Coordinates of the Pairwise Features

no code implementations 4 Aug 2022 MyeongAh Cho, Tae-young Chung, Taeoh Kim, Sangyoun Lee

With the proposed module, we achieve improvements of 14.81% in rank-1 accuracy and 15.47% in verification rate at 0.1% FAR compared to two baseline models.

Face Recognition Relation +1

N-RPN: Hard Example Learning for Region Proposal Networks

no code implementations 3 Aug 2022 MyeongAh Cho, Tae-young Chung, Hyeongmin Lee, Sangyoun Lee

The region proposal task is to generate a set of candidate regions that contain an object.

Region Proposal

SPSN: Superpixel Prototype Sampling Network for RGB-D Salient Object Detection

1 code implementation 16 Jul 2022 Minhyeok Lee, Chaewon Park, Suhwan Cho, Sangyoun Lee

However, despite advances in deep learning-based methods, RGB-D SOD is still challenging due to the large domain gap between an RGB image and the depth map and low-quality depth maps.

object-detection RGB-D Salient Object Detection +2

Exploring Temporally Dynamic Data Augmentation for Video Recognition

no code implementations 30 Jun 2022 Taeoh Kim, Jinhyung Kim, Minho Shim, Sangdoo Yun, Myunggu Kang, Dongyoon Wee, Sangyoun Lee

The magnitude of augmentation operations on each frame is changed by an effective mechanism, Fourier Sampling, that parameterizes diverse, smooth, and realistic temporal variations.

Action Segmentation Image Augmentation +3

Exploring Discontinuity for Video Frame Interpolation

1 code implementation CVPR 2023 Sangjin Lee, Hyeongmin Lee, Chajin Shin, Hanbin Son, Sangyoun Lee

Lastly, we propose loss functions that supervise the discontinuous motion areas and can be applied along with FTM and D-map.

Data Augmentation Video Frame Interpolation

RandomSEMO: Normality Learning Of Moving Objects For Video Anomaly Detection

no code implementations 13 Feb 2022 Chaewon Park, Minhyeok Lee, MyeongAh Cho, Sangyoun Lee

Moreover, MOLoss urges the model to focus on learning normal objects captured within RandomSEMO by amplifying the loss on the pixels near the moving objects.

Anomaly Detection Superpixels +1

Saliency Detection via Global Context Enhanced Feature Fusion and Edge Weighted Loss

no code implementations 13 Oct 2021 Chaewon Park, Minhyeok Lee, MyeongAh Cho, Sangyoun Lee

1) Indiscriminately integrating the encoder feature, which contains spatial information for multiple objects, and the decoder feature, which contains global information of the salient object, is likely to convey unnecessary details of non-salient objects to the decoder, hindering saliency detection.

Decoder Object +4

Pixel-Level Bijective Matching for Video Object Segmentation

1 code implementation 4 Oct 2021 Suhwan Cho, Heansung Lee, Minjung Kim, Sungjun Jang, Sangyoun Lee

Before finding the best matches for the query frame pixels, the optimal matches for the reference frame pixels are first considered to prevent each reference frame pixel from being overly referenced.

Object Semantic Segmentation +2

MKConv: Multidimensional Feature Representation for Point Cloud Analysis

no code implementations 27 Jul 2021 Sungmin Woo, Dogyoon Lee, Sangwon Hwang, Woojin Kim, Sangyoun Lee

In this paper, we present Multidimensional Kernel Convolution (MKConv), a novel convolution operator that learns to transform the point feature representation from a vector to a multidimensional matrix.

3D Part Segmentation 3D Point Cloud Classification

EdgeConv with Attention Module for Monocular Depth Estimation

no code implementations 16 Jun 2021 Minhyeok Lee, Sangwon Hwang, Chaewon Park, Sangyoun Lee

Monocular depth estimation is an especially important task in robotics and autonomous driving, where 3D structural information is essential.

Autonomous Driving Monocular Depth Estimation

Robust Lane Detection via Expanded Self Attention

1 code implementation 14 Feb 2021 Minhyeok Lee, Junhyeop Lee, Dogyoon Lee, Woojin Kim, Sangwon Hwang, Sangyoun Lee

Modern deep learning methods achieve high performance in lane detection, but it is still difficult to accurately detect lanes in challenging situations such as congested roads and extreme lighting conditions.

Decoder Lane Detection

Test-Time Adaptation for Out-of-distributed Image Inpainting

no code implementations 2 Feb 2021 Chajin Shin, Taeoh Kim, Sangjin Lee, Sangyoun Lee

From this test-time adaptation, our network can exploit externally learned image priors from the pre-trained features as well as the internal prior of the test image explicitly.

Image Inpainting Test-time Adaptation +1

A NIR-to-VIS face recognition via part adaptive and relation attention module

no code implementations 1 Feb 2021 Rushuang Xu, MyeongAh Cho, Sangyoun Lee

In the face recognition application scenario, we need to process facial images captured in various conditions, such as at night by near-infrared (NIR) surveillance cameras.

Face Recognition Heterogeneous Face Recognition +2

Multi-object tracking with self-supervised associating network

no code implementations 26 Oct 2020 Tae-young Chung, Heansung Lee, Myeong Ah Cho, Suhwan Cho, Sangyoun Lee

So in this paper, we propose a novel self-supervised learning method that uses many short videos without human labeling, and improve tracking performance through a re-identification network trained in a self-supervised manner, addressing the lack of training data.

Multi-Object Tracking Object +1

Unsupervised Video Anomaly Detection via Normalizing Flows with Implicit Latent Features

no code implementations 15 Oct 2020 MyeongAh Cho, Taeoh Kim, Woo Jin Kim, Suhwan Cho, Sangyoun Lee

For the complex distribution of normal scenes, we suggest normal density estimation of ITAE features through normalizing flow (NF)-based generative models to learn the tractable likelihoods and identify anomalies using out-of-distribution detection.

Anomaly Detection Decoder +4

Smoother Network Tuning and Interpolation for Continuous-level Image Processing

no code implementations 5 Oct 2020 Hyeongmin Lee, Taeoh Kim, Hanbin Son, Sangwook Baek, Minsu Cheon, Sangyoun Lee

Extensive results for various image processing tasks indicate that the performance of FTN is comparable in multiple continuous levels, and is significantly smoother and lighter than that of other frameworks.

PMVOS: Pixel-Level Matching-Based Video Object Segmentation

no code implementations 18 Sep 2020 Suhwan Cho, Heansung Lee, Sungmin Woo, Sungjun Jang, Sangyoun Lee

Semi-supervised video object segmentation (VOS) aims to segment arbitrary target objects in video when the ground truth segmentation mask of the initial frame is provided.

Object One-shot visual object segmentation +3

Learning Temporally Invariant and Localizable Features via Data Augmentation for Video Recognition

1 code implementation 13 Aug 2020 Taeoh Kim, Hyeongmin Lee, MyeongAh Cho, Ho Seong Lee, Dong Heon Cho, Sangyoun Lee

Based on our novel temporal data augmentation algorithms, video recognition performances are improved using only a limited amount of training data compared to the spatial-only data augmentation algorithms, including the 1st Visual Inductive Priors (VIPriors) for data-efficient action recognition challenge.

Action Recognition Data Augmentation +1

Extrapolative-Interpolative Cycle-Consistency Learning for Video Frame Extrapolation

no code implementations 27 May 2020 Sangjin Lee, Hyeongmin Lee, Taeoh Kim, Sangyoun Lee

Unlike previous studies, which have usually focused on the design of modules or the construction of networks, we propose a novel Extrapolative-Interpolative Cycle (EIC) loss using a pre-trained frame interpolation module to improve extrapolation performance.

False Positive Removal for 3D Vehicle Detection with Penetrated Point Classifier

no code implementations 27 May 2020 Sungmin Woo, Sangwon Hwang, Woojin Kim, Junhyeop Lee, Dogyoon Lee, Sangyoun Lee

Recently, researchers have been leveraging LiDAR point clouds for higher accuracy in 3D vehicle detection.

Regularized Adaptation for Stable and Efficient Continuous-Level Learning on Image Processing Networks

no code implementations 11 Mar 2020 Hyeongmin Lee, Taeoh Kim, Hanbin Son, Sangwook Baek, Minsu Cheon, Sangyoun Lee

In this paper, we propose a novel continuous-level learning framework using a Filter Transition Network (FTN), a non-linear module that easily adapts to new levels and is regularized to prevent undesirable side-effects.

Relational Deep Feature Learning for Heterogeneous Face Recognition

no code implementations 2 Mar 2020 MyeongAh Cho, Taeoh Kim, Ig-Jae Kim, Kyungjae Lee, Sangyoun Lee

Due to the lack of databases, HFR methods usually exploit features pre-trained on a large-scale visual database that contains general facial information.

Face Recognition Heterogeneous Face Recognition

CRVOS: Clue Refining Network for Video Object Segmentation

1 code implementation 10 Feb 2020 Suhwan Cho, MyeongAh Cho, Tae-young Chung, Heansung Lee, Sangyoun Lee

The encoder-decoder based methods for semi-supervised video object segmentation (Semi-VOS) have received extensive attention due to their superior performances.

Decoder Object +5
