no code implementations • 21 Apr 2025 • Donghyeong Kim, Chaewon Park, Suhwan Cho, Hyeonjeong Lim, Minseok Kang, Jungho Lee, Sangyoun Lee
Zero-shot anomaly detection (ZSAD) aims to identify anomalies in unseen categories by leveraging CLIP's zero-shot capabilities to match text prompts with visual features.
1 code implementation • 7 Mar 2025 • Jungho Lee, Donghyeong Kim, Dogyoon Lee, Suhwan Cho, Minhyeok Lee, Wonjoon Lee, Taeoh Kim, Dongyoon Wee, Sangyoun Lee
3D Gaussian Splatting (3DGS) has gained significant attention for its high-quality novel view rendering, motivating research to address real-world challenges.
1 code implementation • 5 Mar 2025 • Suhwan Cho, Seunghoon Lee, Minhyeok Lee, Jungho Lee, Sangyoun Lee
This reference is then utilized by a dedicated propagation module to track and segment the object across the entire video.
Ranked #1 on
Referring Video Object Segmentation
on Ref-DAVIS17
no code implementations • 20 Dec 2024 • Jungho Lee, Suhwan Cho, Taeoh Kim, Ho-Deok Jang, Minhyeok Lee, Geonho Cha, Dongyoon Wee, Dogyoon Lee, Sangyoun Lee
While conventional methods depend on sharp images for accurate scene reconstruction, real-world scenarios are often affected by defocus blur due to finite depth of field, making it essential to account for realistic 3D scene representation.
1 code implementation • 12 Dec 2024 • Suhwan Cho, Seoung Wug Oh, Sangyoun Lee, Joon-Young Lee
Powered by a strong generative model, our method not only significantly enhances frame-level quality for object removal but also synthesizes new content in the missing areas based on user-provided text prompts.
Ranked #1 on
Video Inpainting
on HQVI (2K)
no code implementations • 2 Dec 2024 • Sunghun Yang, Minhyeok Lee, Suhwan Cho, Jungho Lee, Sangyoun Lee
For static areas, the Masked Static (MS) module enhances temporal consistency by focusing on stable regions.
no code implementations • 22 Nov 2024 • Minhyeok Lee, Suhwan Cho, Jungho Lee, Sunghun Yang, Heeseung Choi, Ig-Jae Kim, Sangyoun Lee
Open-vocabulary semantic segmentation aims to assign pixel-level labels to images across an unlimited range of classes.
1 code implementation • 21 Nov 2024 • Suhwan Cho, Minhyeok Lee, Jungho Lee, Sangyoun Lee
This ability allows the model to generate plausible optical flows, preserving semantic integrity while reflecting the independent motion of scene elements.
Ranked #1 on
Video Salient Object Detection
on DAVSOD-easy35
(using extra training data)
no code implementations • 19 Sep 2024 • Chajin Shin, Sangjin Lee, Sangyoun Lee
Finally, we introduce the Auxiliary info-guided Parameter Estimation (APE) module, which predicts the approximation of the latent vector and estimates the probability distribution of these residuals.
no code implementations • 21 Aug 2024 • Minhyeok Lee, Suhwan Cho, Chajin Shin, Jungho Lee, Sunghun Yang, Sangyoun Lee
However, it has limitations such as the inaccuracy of optical flow prediction and the propagation of noise over time.
1 code implementation • 6 Aug 2024 • Juho Jung, Sangyoun Lee, Jooeon Kang, Yunjin Na
All current benchmarks for multimodal deepfake detection manipulate entire frames using various generation techniques, resulting in oversaturated detection accuracies exceeding 94% in video-level classification.
1 code implementation • 18 Jul 2024 • Sangyoun Lee, Juho Jung, Changdae Oh, Sunghee Yun
Temporal Action Localization (TAL) is a critical task in video analysis, identifying precise start and end times of actions.
Ranked #1 on
Temporal Action Localization
on HACS
1 code implementation • 16 Jul 2024 • Suhwan Cho, Minhyeok Lee, Jungho Lee, Donghyeong Kim, Seunghoon Lee, Sungmin Woo, Sangyoun Lee
Unsupervised video object segmentation (VOS), also known as video salient object detection, aims to detect the most prominent object in a video at the pixel level.
Ranked #1 on
Unsupervised Video Object Segmentation
on FBMS test
1 code implementation • 12 Jul 2024 • Sungmin Woo, Wonjoon Lee, Woo Jin Kim, Dogyoon Lee, Sangyoun Lee
Self-supervised multi-frame monocular depth estimation relies on the geometric consistency between successive frames under the assumption of a static scene.
no code implementations • 9 Jul 2024 • Dogyoon Lee, Donghyeong Kim, Jungho Lee, Minhyeok Lee, Seunghoon Lee, Sangyoun Lee
Recent studies construct deblurred neural radiance fields (DeRF) using dozens of blurry images, which is not practical when only a limited number of blurry images are available.
no code implementations • 4 Jul 2024 • Jungho Lee, Donghyeong Kim, Dogyoon Lee, Suhwan Cho, Minhyeok Lee, Sangyoun Lee
3D Gaussian Splatting (3DGS) has gained significant attention for its high-quality novel view rendering, motivating research to address real-world challenges.
1 code implementation • 10 Jun 2024 • Ui-Hyeop Shin, Sangyoun Lee, Taehan Kim, Hyung-Min Park
To achieve this, an asymmetric strategy is presented in which the encoder and decoder are partitioned to perform distinct processing in separation tasks.
Ranked #1 on
Speech Separation
on WSJ0-2mix
1 code implementation • 12 Mar 2024 • Jungho Lee, Dogyoon Lee, Minhyeok Lee, Donghyung Kim, Sangyoun Lee
Neural radiance fields (NeRF) have attracted considerable attention for their exceptional ability in synthesizing novel views with high fidelity.
no code implementations • 29 Jan 2024 • Sungmin Woo, Minjung Kim, Donghyeong Kim, Sungjun Jang, Sangyoun Lee
Multi-agent motion prediction is a crucial concern in autonomous driving, yet it remains a challenge owing to the ambiguous intentions of dynamic agents and their intricate interactions.
no code implementations • 29 Nov 2023 • Minhyeok Lee, Dogyoon Lee, Jungho Lee, Suhwan Cho, Heeseung Choi, Ig-Jae Kim, Sangyoun Lee
While these methods match language features with image features to effectively identify likely target objects, they often struggle to correctly understand contextual information in complex and ambiguous sentences and scenes.
1 code implementation • 26 Sep 2023 • Suhwan Cho, Minhyeok Lee, Jungho Lee, MyeongAh Cho, Sangyoun Lee
Unsupervised video object segmentation (VOS) is a task that aims to detect the most salient object in a video without external guidance about the object.
Ranked #3 on
Unsupervised Video Object Segmentation
on FBMS test
no code implementations • 17 Mar 2023 • Yongwoo Lee, Minhyeok Lee, Suhwan Cho, Sangyoun Lee
Salient object detection (SOD) is a task that involves identifying and segmenting the most visually prominent object in an image.
1 code implementation • CVPR 2024 • Minhyeok Lee, Suhwan Cho, Dogyoon Lee, Chaewon Park, Jungho Lee, Sangyoun Lee
Unsupervised video object segmentation aims to segment the most prominent object in a video sequence.
no code implementations • 8 Mar 2023 • Seunghoon Lee, Suhwan Cho, Dogyoon Lee, Minhyeok Lee, Sangyoun Lee
Recent works on UVOS can be divided into two approaches, appearance-based and appearance-motion-based methods, each of which has its own limitations.
no code implementations • 28 Feb 2023 • Sangjin Lee, Suhwan Cho, Sangyoun Lee
Usually, a video sequence and object segmentation masks for all frames are required as the input for this task.
no code implementations • 20 Feb 2023 • Chaewon Park, Minhyeok Lee, Suhwan Cho, Donghyeong Kim, Sangyoun Lee
Image reconstruction-based anomaly detection has recently been in the spotlight because of the difficulty of constructing anomaly datasets.
no code implementations • CVPR 2023 • MyeongAh Cho, Minjung Kim, Sangwon Hwang, Chaewon Park, Kyungjae Lee, Sangyoun Lee
Furthermore, as the relationship between context and motion is important in order to identify the anomalies in complex and diverse scenes, we propose a Context–Motion Interrelation Module (CoMo), which models the relationship between the appearance of the surroundings and motion, rather than utilizing only temporal dependencies or motion information.
no code implementations • 16 Dec 2022 • Minjung Kim, MyeongAh Cho, Sangyoun Lee
In video person re-identification (Re-ID), the network must consistently extract features of the target person from successive frames.
no code implementations • 9 Dec 2022 • Minjung Kim, MyeongAh Cho, Heansung Lee, Suhwan Cho, Sangyoun Lee
Occluded person re-identification (Re-ID) in images captured by multiple cameras is challenging because the target person is occluded by pedestrians or objects, especially in crowded scenes.
1 code implementation • ICCV 2023 • Jungho Lee, Minhyeok Lee, Suhwan Cho, Sungmin Woo, Sungjun Jang, Sangyoun Lee
In this paper, we propose the Spatio-Temporal Curve Network (STC-Net) to effectively leverage the spatio-temporal dependency of the human skeleton.
1 code implementation • CVPR 2024 • Suhwan Cho, Minhyeok Lee, Seunghoon Lee, Dogyoon Lee, Heeseung Choi, Ig-Jae Kim, Sangyoun Lee
Unsupervised video object segmentation (VOS) aims to detect and segment the most salient object in videos.
Ranked #2 on
Unsupervised Video Object Segmentation
on FBMS test
1 code implementation • CVPR 2023 • Dogyoon Lee, Minhyeok Lee, Chajin Shin, Sangyoun Lee
The few studies that have investigated NeRF for blurred images have not considered geometric and appearance consistency in 3D space, which is one of the most important factors in 3D reconstruction.
no code implementations • 22 Nov 2022 • Minhyeok Lee, Suhwan Cho, Chaewon Park, Dogyoon Lee, Jungho Lee, Sangyoun Lee
The proposed DPS-Net utilizes a Deformable Point Sampling transformer (DPS transformer) that can effectively capture sparse local boundary information of significant object boundaries in COD using a deformable point sampling method.
1 code implementation • 14 Nov 2022 • Donghyeong Kim, Chaewon Park, Suhwan Cho, Sangyoun Lee
Feature embedding-based methods have shown exceptional performance in detecting industrial anomalies by comparing features of target images with normal images.
Ranked #46 on
Anomaly Detection
on MVTec AD
(using extra training data)
1 code implementation • 8 Sep 2022 • Minhyeok Lee, Suhwan Cho, Seunghoon Lee, Chaewon Park, Sangyoun Lee
The proposed model effectively extracts the RGB and motion information by extracting superpixel-based component prototypes from the input RGB images and optical flow maps.
Ranked #9 on
Unsupervised Video Object Segmentation
on FBMS test
no code implementations • 4 Sep 2022 • Suhwan Cho, Woo Jin Kim, MyeongAh Cho, Seunghoon Lee, Minhyeok Lee, Chaewon Park, Sangyoun Lee
Feature similarity matching, which transfers the information of the reference frame to the query frame, is a key component in semi-supervised video object segmentation.
2 code implementations • 4 Sep 2022 • Suhwan Cho, Minhyeok Lee, Seunghoon Lee, Chaewon Park, Donghyeong Kim, Sangyoun Lee
Unsupervised video object segmentation (VOS) aims to detect the most salient object in a video sequence at the pixel level.
Ranked #6 on
Unsupervised Video Object Segmentation
on FBMS test
1 code implementation • ICCV 2023 • Jungho Lee, Minhyeok Lee, Dogyoon Lee, Sangyoun Lee
Graph convolutional networks (GCNs) are the most commonly used methods for skeleton-based action recognition and have achieved remarkable performance.
1 code implementation • 5 Aug 2022 • Chajin Shin, Hyeongmin Lee, Hanbin Son, Sangjin Lee, Dogyoon Lee, Sangyoun Lee
Then, we increase the receptive field to make the adaptive rescaling module consider the spatial correlation.
no code implementations • 4 Aug 2022 • MyeongAh Cho, Tae-young Chung, Taeoh Kim, Sangyoun Lee
With the proposed module, we achieve improvements of 14.81% in rank-1 accuracy and 15.47% in verification rate at 0.1% FAR compared to two baseline models.
no code implementations • 3 Aug 2022 • MyeongAh Cho, Tae-young Chung, Hyeongmin Lee, Sangyoun Lee
The region proposal task is to generate a set of candidate regions that contain an object.
1 code implementation • 16 Jul 2022 • Minhyeok Lee, Chaewon Park, Suhwan Cho, Sangyoun Lee
However, despite advances in deep learning-based methods, RGB-D SOD is still challenging due to the large domain gap between an RGB image and the depth map and low-quality depth maps.
Ranked #4 on
RGB-D Salient Object Detection
on NJU2K
1 code implementation • 14 Jul 2022 • Suhwan Cho, Heansung Lee, Minhyeok Lee, Chaewon Park, Sungjun Jang, Minjung Kim, Sangyoun Lee
Semi-supervised video object segmentation (VOS) aims to densely track certain designated objects in videos.
no code implementations • 30 Jun 2022 • Taeoh Kim, Jinhyung Kim, Minho Shim, Sangdoo Yun, Myunggu Kang, Dongyoon Wee, Sangyoun Lee
The magnitude of augmentation operations on each frame is changed by an effective mechanism, Fourier Sampling, which parameterizes diverse, smooth, and realistic temporal variations.
1 code implementation • CVPR 2023 • Sangjin Lee, Hyeongmin Lee, Chajin Shin, Hanbin Son, Sangyoun Lee
Lastly, we propose loss functions that supervise the discontinuous motion areas and can be applied along with FTM and D-map.
no code implementations • 13 Feb 2022 • Chaewon Park, Minhyeok Lee, MyeongAh Cho, Sangyoun Lee
Moreover, MOLoss urges the model to focus on learning normal objects captured within RandomSEMO by amplifying the loss on the pixels near the moving objects.
no code implementations • 13 Oct 2021 • Chaewon Park, Minhyeok Lee, MyeongAh Cho, Sangyoun Lee
1) Indiscriminately integrating the encoder feature, which contains spatial information for multiple objects, and the decoder feature, which contains global information of the salient object, is likely to convey unnecessary details of non-salient objects to the decoder, hindering saliency detection.
Ranked #1 on
RGB Salient Object Detection
on PASCAL-S
1 code implementation • 4 Oct 2021 • Suhwan Cho, Heansung Lee, Minjung Kim, Sungjun Jang, Sangyoun Lee
Before finding the best matches for the query frame pixels, the optimal matches for the reference frame pixels are first considered to prevent each reference frame pixel from being overly referenced.
no code implementations • 27 Jul 2021 • Sungmin Woo, Dogyoon Lee, Sangwon Hwang, Woojin Kim, Sangyoun Lee
In this paper, we present Multidimensional Kernel Convolution (MKConv), a novel convolution operator that learns to transform the point feature representation from a vector to a multidimensional matrix.
Ranked #16 on
3D Part Segmentation
on ShapeNet-Part
no code implementations • 16 Jun 2021 • Minhyeok Lee, Sangwon Hwang, Chaewon Park, Sangyoun Lee
Monocular depth estimation is an especially important task in robotics and autonomous driving, where 3D structural information is essential.
1 code implementation • 16 Jun 2021 • Chaewon Park, MyeongAh Cho, Minhyeok Lee, Sangyoun Lee
Video anomaly detection has gained significant attention due to the increasing requirements of automatic monitoring for surveillance videos.
Anomaly Detection In Surveillance Videos
Optical Flow Estimation
1 code implementation • 14 Feb 2021 • Minhyeok Lee, Junhyeop Lee, Dogyoon Lee, Woojin Kim, Sangwon Hwang, Sangyoun Lee
Modern deep learning methods achieve high performance in lane detection, but it is still difficult to accurately detect lanes in challenging situations such as congested roads and extreme lighting conditions.
Ranked #49 on
Lane Detection
on CULane
1 code implementation • CVPR 2021 • Dogyoon Lee, Jaeha Lee, Junhyeop Lee, Hyeongmin Lee, Minhyeok Lee, Sungmin Woo, Sangyoun Lee
Data augmentation is an effective regularization strategy to alleviate overfitting, which is an inherent drawback of deep neural networks.
Ranked #4 on
3D Point Cloud Classification
on ModelNet40-C
no code implementations • 2 Feb 2021 • Chajin Shin, Taeoh Kim, Sangjin Lee, Sangyoun Lee
From this test-time adaptation, our network can exploit externally learned image priors from the pre-trained features as well as the internal prior of the test image explicitly.
no code implementations • 1 Feb 2021 • Rushuang Xu, MyeongAh Cho, Sangyoun Lee
In the face recognition application scenario, we need to process facial images captured in various conditions, such as at night by near-infrared (NIR) surveillance cameras.
no code implementations • 26 Oct 2020 • Tae-young Chung, Heansung Lee, Myeong Ah Cho, Suhwan Cho, Sangyoun Lee
In this paper, we therefore propose a novel self-supervised learning method that uses many short videos without human labeling, and we improve the tracking performance through a re-identification network trained in this self-supervised manner to address the lack of training data.
no code implementations • 15 Oct 2020 • MyeongAh Cho, Taeoh Kim, Woo Jin Kim, Suhwan Cho, Sangyoun Lee
For the complex distribution of normal scenes, we suggest normal density estimation of ITAE features through normalizing flow (NF)-based generative models to learn the tractable likelihoods and identify anomalies using out-of-distribution detection.
no code implementations • 5 Oct 2020 • Hyeongmin Lee, Taeoh Kim, Hanbin Son, Sangwook Baek, Minsu Cheon, Sangyoun Lee
Extensive results for various image processing tasks indicate that the performance of FTN is comparable in multiple continuous levels, and is significantly smoother and lighter than that of other frameworks.
no code implementations • 30 Sep 2020 • Hanbin Son, Taeoh Kim, Hyeongmin Lee, Sangyoun Lee
The postprocessing network increases the quality of decoded images using example-based learning.
no code implementations • 18 Sep 2020 • Suhwan Cho, Heansung Lee, Sungmin Woo, Sungjun Jang, Sangyoun Lee
Semi-supervised video object segmentation (VOS) aims to segment arbitrary target objects in video when the ground truth segmentation mask of the initial frame is provided.
1 code implementation • 13 Aug 2020 • Taeoh Kim, Hyeongmin Lee, MyeongAh Cho, Ho Seong Lee, Dong Heon Cho, Sangyoun Lee
Based on our novel temporal data augmentation algorithms, video recognition performances are improved using only a limited amount of training data compared to the spatial-only data augmentation algorithms, including the 1st Visual Inductive Priors (VIPriors) for data-efficient action recognition challenge.
no code implementations • 27 May 2020 • Sangjin Lee, Hyeongmin Lee, Taeoh Kim, Sangyoun Lee
Unlike previous studies, which have usually focused on the design of modules or the construction of networks, we propose a novel Extrapolative-Interpolative Cycle (EIC) loss that uses a pre-trained frame interpolation module to improve extrapolation performance.
no code implementations • 27 May 2020 • Sungmin Woo, Sangwon Hwang, Woojin Kim, Junhyeop Lee, Dogyoon Lee, Sangyoun Lee
Recently, researchers have been leveraging LiDAR point cloud for higher accuracy in 3D vehicle detection.
no code implementations • 11 Mar 2020 • Hyeongmin Lee, Taeoh Kim, Hanbin Son, Sangwook Baek, Minsu Cheon, Sangyoun Lee
In this paper, we propose a novel continuous-level learning framework using a Filter Transition Network (FTN), a non-linear module that easily adapts to new levels and is regularized to prevent undesirable side-effects.
no code implementations • 2 Mar 2020 • MyeongAh Cho, Taeoh Kim, Ig-Jae Kim, Kyungjae Lee, Sangyoun Lee
Due to the lack of databases, HFR methods usually exploit features pre-trained on a large-scale visual database that contains general facial information.
1 code implementation • 10 Feb 2020 • Suhwan Cho, MyeongAh Cho, Tae-young Chung, Heansung Lee, Sangyoun Lee
The encoder-decoder based methods for semi-supervised video object segmentation (Semi-VOS) have received extensive attention due to their superior performances.
Ranked #60 on
Semi-Supervised Video Object Segmentation
on DAVIS 2016
no code implementations • 7 Jan 2020 • Joosung Lee, Sangwon Hwang, Kyungjae Lee, Woo Jin Kim, Junhyeop Lee, Tae-young Chung, Sangyoun Lee
Visual odometry is an essential key for a localization module in SLAM systems.
1 code implementation • CVPR 2020 • Hyeongmin Lee, Taeoh Kim, Tae-young Chung, Daehyun Pak, Yuseok Ban, Sangyoun Lee
Video frame interpolation is one of the most challenging tasks in video processing research.
Ranked #16 on
Video Frame Interpolation
on MSU Video Frame Interpolation
(LPIPS metric)