no code implementations • 29 May 2025 • Jeonghyeok Do, Sungpyo Kim, Geunhyuk Youk, Jaehyup Lee, Munchurl Kim
PAN-sharpening aims to fuse high-resolution panchromatic (PAN) images with low-resolution multi-spectral (MS) images to generate high-resolution multi-spectral (HRMS) outputs.
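The fusion described above can be illustrated with a classical component-substitution baseline (not the method of the paper above): a Brovey-style transform that injects the PAN image's spatial detail into an upsampled MS image. The function name and the mean-based intensity proxy are illustrative assumptions.

```python
import numpy as np

def brovey_pansharpen(pan, ms_up):
    """Classical Brovey component-substitution pan-sharpening (sketch).

    pan:   (H, W) high-resolution panchromatic image, float in [0, 1]
    ms_up: (H, W, C) multi-spectral image upsampled to PAN resolution
    """
    intensity = ms_up.mean(axis=2, keepdims=True)  # crude intensity proxy
    ratio = pan[..., None] / (intensity + 1e-8)    # spatial-detail injection ratio
    return np.clip(ms_up * ratio, 0.0, 1.0)
```

Each MS band is rescaled so its local intensity matches the PAN image, which transfers PAN spatial detail at the cost of some spectral distortion, the trade-off deep pan-sharpening models aim to overcome.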
no code implementations • 21 Apr 2025 • Minh-Quan Viet Bui, Jongmin Park, Juan Luis Gonzalez Bello, Jaeho Moon, Jihyong Oh, Munchurl Kim
We present MoBGS, a novel deblurring dynamic 3D Gaussian Splatting (3DGS) framework capable of reconstructing sharp and high-quality novel spatio-temporal views from blurry monocular videos in an end-to-end manner.
no code implementations • 28 Mar 2025 • Byeongjun Kwon, Munchurl Kim
Zero-shot depth estimation (DE) models exhibit strong generalization performance as they are trained on large-scale datasets.
no code implementations • CVPR 2025 • Sangwoon Kwak, Joonsoo Kim, Jun Young Jeong, Won-Sik Cheong, Jihyong Oh, Munchurl Kim
3D Gaussian Splatting (3DGS) has made significant strides in scene representation and neural rendering, with intense efforts focused on adapting it for dynamic scenes.
no code implementations • CVPR 2025 • Woojin Lee, Hyugjae Chang, Jaeho Moon, Jaehyup Lee, Munchurl Kim
To overcome this, we propose: (i) Adaptive Bounding Box Scaling (ABBS), which appropriately scales the GT HBoxes to match the size of each predicted RBox, ensuring more accurate scale prediction for RBoxes; and (ii) a Symmetric Prior Angle (SPA) loss that exploits the inherent symmetry of aerial objects for self-supervised learning, addressing a failure mode of previous methods, where learning breaks down if incorrect predictions are made consistently across all three augmented views (original, rotated, and flipped).
no code implementations • CVPR 2025 • Wonyong Seo, Jihyong Oh, Munchurl Kim
Existing Video Frame Interpolation (VFI) models tend to suffer from time-to-location ambiguity when trained on videos with non-uniform motion, such as acceleration, deceleration, and direction changes, which often yields blurred interpolated frames. In this paper, we propose (i) a novel motion description map, the Bidirectional Motion field (BiM), to effectively describe non-uniform motion; (ii) a BiM-guided Flow Net (BiMFN) with a Content-Aware Upsampling Network (CAUN) for precise optical flow estimation; and (iii) Knowledge Distillation for VFI-centric Flow supervision (KDVCF) to supervise the motion estimation of the VFI model with VFI-centric teacher flows. The proposed model is called the Bidirectional Motion field-guided VFI (BiM-VFI) model. Extensive experiments show that our BiM-VFI model significantly surpasses recent state-of-the-art VFI methods, with improvements of 26% and 45% in LPIPS and STLPIPS, respectively, yielding interpolated frames with far fewer blurs at arbitrary time instances.
no code implementations • 17 Dec 2024 • Samuel Teodoro, Agus Gunawan, Soo Ye Kim, Jihyong Oh, Munchurl Kim
Recent AI-based video editing has enabled users to edit videos through simple text prompts, significantly simplifying the editing process.
1 code implementation • 16 Dec 2024 • Wonyong Seo, Jihyong Oh, Munchurl Kim
Existing Video Frame Interpolation (VFI) models tend to suffer from time-to-location ambiguity when trained on videos with non-uniform motion, such as acceleration, deceleration, and direction changes, which often yields blurred interpolated frames.
no code implementations • CVPR 2025 • Jongmin Park, Minh-Quan Viet Bui, Juan Luis Gonzalez Bello, Jaeho Moon, Jihyong Oh, Munchurl Kim
Synthesizing novel views from in-the-wild monocular videos is challenging due to scene dynamics and the lack of multi-view cues.
no code implementations • 11 Dec 2024 • Jaeho Moon, Jeonghwan Yun, Jaehyun Kim, Jaehyup Lee, Munchurl Kim
In our DAKD pipeline, we present a diffusion-based SAR-JointNet that learns to generate realistic SAR images and their segmentation labels by effectively modeling their joint distribution while balancing the two modalities.
no code implementations • CVPR 2025 • Sungpyo Kim, Jeonghyeok Do, Jaehyup Lee, Munchurl Kim
U-Know-DiffPAN incorporates uncertainty-aware knowledge distillation to effectively transfer feature details from our teacher model to a student model.
1 code implementation • 16 Nov 2024 • Jeonghyeok Do, Jaehyup Lee, Munchurl Kim
Synthetic Aperture Radar (SAR) imagery provides robust environmental and temporal coverage (e.g., through cloud cover, across seasons, and during day-night cycles), yet its noise and unique structural patterns pose interpretation challenges, especially for non-experts.
1 code implementation • 16 Nov 2024 • Jeonghyeok Do, Munchurl Kim
In zero-shot skeleton-based action recognition, aligning skeleton features with the text features of action labels is essential for accurately predicting unseen actions.
Ranked #1 on Zero-Shot Skeletal Action Recognition on NTU RGB+D
no code implementations • 22 Aug 2024 • Jooyoung Lee, Se Yoon Jeong, Munchurl Kim
Unlike fixed- or variable-rate image coding, progressive image coding (PIC) aims to compress an image at various quality levels into a single bitstream, increasing the versatility of bitstream use and providing high compression efficiency compared to simulcast compression.
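The defining property of a progressive bitstream is that any prefix decodes to a coarser reconstruction of the same image. A minimal toy illustration of this property (a bit-plane coder, not the learned PIC scheme of the paper above; names are illustrative):

```python
import numpy as np

def encode_bitplanes(img):
    """Encode an 8-bit image as bit planes, MSB first.

    Any prefix of the returned list is a valid (coarser) encoding --
    the key property of a progressive bitstream.
    """
    return [((img >> b) & 1).astype(np.uint8) for b in range(7, -1, -1)]

def decode_bitplanes(planes):
    """Reconstruct from the first k bit planes; missing planes are zero."""
    img = np.zeros(planes[0].shape, dtype=np.uint8)
    for i, p in enumerate(planes):
        img |= p << (7 - i)
    return img
```

Decoding only the first four planes keeps bits 7..4 of every pixel, i.e., a quantized preview; decoding all eight recovers the image exactly, mirroring how a PIC decoder refines quality as more of the bitstream arrives.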
1 code implementation • 14 Mar 2024 • Jeonghyeok Do, Munchurl Kim
We categorize the key skeletal-temporal relations for action recognition into a total of four distinct types.
Ranked #1 on Human Interaction Recognition on NTU RGB+D
no code implementations • CVPR 2024 • Geunhyuk Youk, Jihyong Oh, Munchurl Kim
In this paper, we propose a novel flow-guided dynamic filtering (FGDF) and iterative feature refinement with multi-attention (FRMA), which constitutes our VSRDB framework, denoted as FMA-Net.
no code implementations • 21 Dec 2023 • Minh-Quan Viet Bui, Jongmin Park, Jihyong Oh, Munchurl Kim
In response, we propose a novel dynamic deblurring NeRF framework for blurry monocular video, called DyBluRF, consisting of a Base Ray Initialization (BRI) stage and a Motion Decomposition-based Deblurring (MDD) stage.
no code implementations • CVPR 2024 • Jaeho Moon, Juan Luis Gonzalez Bello, Byeongjun Kwon, Munchurl Kim
Subsequently, in the fine training stage, we refine the DE network to learn the detailed depth of the objects from the reprojection loss, while ensuring accurate DE on the moving object regions by employing our regularization loss with a cost-volume-based weighting factor.
no code implementations • CVPR 2024 • Juan Luis Gonzalez Bello, Munchurl Kim
In this paper, we are the first to incorporate view-dependent effects into single-image-based novel view synthesis (NVS) problems.
no code implementations • 13 Dec 2023 • Juan Luis Gonzalez Bello, Minh-Quan Viet Bui, Munchurl Kim
Recent advances in neural rendering have shown that, albeit slow, implicit compact models can learn a scene's geometries and view-dependent appearances from multiple views.
no code implementations • ICCV 2023 • Jongmin Park, Jooyoung Lee, Munchurl Kim
Recently, neural network (NN)-based image compression has been actively studied and has shown impressive performance compared to traditional methods.
no code implementations • CVPR 2023 • Agus Gunawan, Soo Ye Kim, Hyeonjun Sim, Jae-Ho Lee, Munchurl Kim
In order to modernize old photos, we propose a novel multi-reference-based old photo modernization (MROPM) framework, consisting of a network (MROPM-Net) and a novel synthetic data generation scheme.
1 code implementation • 8 Nov 2022 • Jooyoung Lee, Seyoon Jeong, Munchurl Kim
For this, we first generate a 3D importance map as the nature of input content to represent the underlying importance of the representation elements.
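As a rough, hypothetical sketch of how an importance map can gate representation elements before entropy coding (the function name and quantile-thresholding rule here are assumptions for illustration, not the paper's actual mechanism):

```python
import numpy as np

def apply_importance_map(latents, importance, rate):
    """Mask latent elements whose importance falls below a quantile (sketch).

    latents:    (C, H, W) analysis-transform output
    importance: (C, H, W) importance map in [0, 1]
    rate:       fraction of elements to keep (0 < rate <= 1)
    """
    thresh = np.quantile(importance, 1.0 - rate)
    mask = (importance >= thresh).astype(latents.dtype)
    return latents * mask  # zeroed elements cost (almost) no bits to code
```

Varying `rate` trades rate for distortion: low-importance elements (flat backgrounds, for instance) are dropped first, so bits concentrate on content the map deems important.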
no code implementations • 7 Nov 2022 • Andrey Ignatov, Radu Timofte, Jin Zhang, Feng Zhang, Gaocheng Yu, Zhe Ma, Hongbin Wang, Minsu Kwon, Haotian Qian, Wentao Tong, Pan Mu, Ziping Wang, Guangjing Yan, Brian Lee, Lei Fei, Huaijin Chen, Hyebin Cho, Byeongjun Kwon, Munchurl Kim, Mingyang Qian, Huixin Ma, Yanan Li, Xiaotao Wang, Lei Lei
In this Mobile AI challenge, the target was to develop an efficient end-to-end AI-based bokeh effect rendering approach that can run on modern smartphone GPUs using TensorFlow Lite.
no code implementations • CVPR 2022 • Soo Ye Kim, Jianming Zhang, Simon Niklaus, Yifei Fan, Simon Chen, Zhe Lin, Munchurl Kim
Depth maps are used in a wide range of applications from 3D rendering to 2D image effects such as Bokeh.
no code implementations • 18 May 2022 • Juan Luis Gonzalez Bello, Jaeho Moon, Munchurl Kim
Recently, much attention has been drawn to learning the underlying 3D structures of a scene from monocular videos in a fully self-supervised fashion.
1 code implementation • 19 Nov 2021 • Jihyong Oh, Munchurl Kim
In this paper, we propose a novel joint deblurring and multi-frame interpolation (DeMFI) framework, called DeMFI-Net, which accurately converts lower-frame-rate blurry videos into higher-frame-rate sharp videos, based on a flow-guided attentive-correlation-based feature bolstering (FAC-FB) module and recursive boosting (RB), in terms of multi-frame interpolation (MFI).
1 code implementation • CVPR 2021 • Jaehyup Lee, Soomin Seo, Munchurl Kim
Pan-sharpening is the process of merging a high-resolution (HR) panchromatic (PAN) image with its corresponding low-resolution (LR) multi-spectral (MS) image to create an HR, pan-sharpened MS image.
no code implementations • 16 Apr 2021 • Dac Tung Vu, Juan Luis Gonzalez, Munchurl Kim
In this work, we propose a novel network architecture consisting of three sub-networks to remove heavy rain from a single image without estimating rain streaks and fog separately.
1 code implementation • ICCV 2021 • Hyeonjun Sim, Jihyong Oh, Munchurl Kim
In this paper, we first present a dataset (X4K1000FPS) of 4K, 1000-fps videos with extreme motion to the research community for video frame interpolation (VFI), and propose an extreme VFI network, called XVFI-Net, that is the first to handle VFI for 4K videos with large motion.
no code implementations • 29 Mar 2021 • Jihyong Oh, Munchurl Kim
In this paper, we first propose a novel GAN-based multi-task learning (MTL) method for SAR target image generation, called PeaceGAN, which uses both pose angle and target class information, making it possible to produce SAR target images of desired target classes at intended pose angles.
1 code implementation • CVPR 2021 • Juan Luis Gonzalez Bello, Munchurl Kim
Our PLADE-Net is based on a new network architecture with neural positional encoding and a novel loss function that borrows from the closed-form solution of the matting Laplacian to learn pixel-level accurate depth estimation from stereo images.
1 code implementation • 17 Dec 2020 • Soo Ye Kim, Kfir Aberman, Nori Kanazawa, Rahul Garg, Neal Wadhwa, Huiwen Chang, Nikhil Karnad, Munchurl Kim, Orly Liba
Although deep learning has enabled a huge leap forward in image inpainting, current methods are often unable to synthesize realistic high-frequency details.
1 code implementation • CVPR 2021 • Soo Ye Kim, Hyeonjun Sim, Munchurl Kim
Blind super-resolution (SR) methods aim to generate a high quality high resolution image from a low resolution image containing unknown degradations.
Ranked #3 on Blind Super-Resolution on DIV2KRK - 2x upscaling
1 code implementation • NeurIPS 2020 • Juan Luis Gonzalez, Munchurl Kim
However, previous methods usually learn forward or backward image synthesis, but not depth estimation, as they cannot effectively neglect occlusions between the target and the reference images.
no code implementations • 7 May 2020 • Codruta O. Ancuti, Cosmin Ancuti, Florin-Alexandru Vasluianu, Radu Timofte, Jing Liu, Haiyan Wu, Yuan Xie, Yanyun Qu, Lizhuang Ma, Ziling Huang, Qili Deng, Ju-Chin Chao, Tsung-Shan Yang, Peng-Wen Chen, Po-Min Hsu, Tzu-Yi Liao, Chung-En Sun, Pei-Yuan Wu, Jeonghyeok Do, Jongmin Park, Munchurl Kim, Kareem Metwaly, Xuelu Li, Tiantong Guo, Vishal Monga, Mingzhao Yu, Venkateswararao Cherukuri, Shiue-Yuan Chuang, Tsung-Nan Lin, David Lee, Jerome Chang, Zhan-Han Wang, Yu-Bang Chang, Chang-Hong Lin, Yu Dong, Hong-Yu Zhou, Xiangzhen Kong, Sourya Dipta Das, Saikat Dutta, Xuan Zhao, Bing Ouyang, Dennis Estrada, Meiqi Wang, Tianqi Su, Siyi Chen, Bangyong Sun, Vincent Whannou de Dravo, Zhe Yu, Pratik Narang, Aryan Mehra, Navaneeth Raghunath, Murari Mandal
We focus on the proposed solutions and their results evaluated on NH-Haze, a novel dataset consisting of 55 pairs of real haze free and nonhomogeneous hazy images recorded outdoor.
no code implementations • ICLR 2020 • Juan Luis Gonzalez Bello, Munchurl Kim
Our proposed network architecture, the monster-net, is devised with a novel t-shaped adaptive kernel with globally and locally adaptive dilation, which can efficiently incorporate global camera shift and handle the local 3D geometries of the target image's pixels to synthesize natural-looking 3D panned views from a given 2D input image.
1 code implementation • 30 Dec 2019 • Jooyoung Lee, Seunghyun Cho, Munchurl Kim
To show the effectiveness of our proposed JointIQ-Net, extensive experiments were performed, showing that JointIQ-Net achieves a remarkable improvement in coding efficiency in terms of both PSNR and MS-SSIM, compared to previous learned image compression methods and conventional codecs such as VVC Intra (VTM 7.1), BPG, and JPEG2000.
1 code implementation • 16 Dec 2019 • Soo Ye Kim, Jihyong Oh, Munchurl Kim
In this paper, we first propose a joint VFI-SR framework for up-scaling the spatio-temporal resolution of videos from 2K 30 fps to 4K 60 fps.
no code implementations • 2 Oct 2019 • Juan Luis Gonzalez Bello, Munchurl Kim
Our proposed network architecture, the monster-net, is devised with a novel "t-shaped" adaptive kernel with globally and locally adaptive dilation, which can efficiently incorporate global camera shift and handle the local 3D geometries of the target image's pixels to synthesize natural-looking 3D panned views from a given 2D input image.
no code implementations • 25 Sep 2019 • Taimoor Tariq, Munchurl Kim
In this paper, to gain more insight, we link basic human visual perception to characteristics of learned deep CNN representations, a novel and first attempt to interpret them.
no code implementations • 20 Sep 2019 • Juan Luis Gonzalez Bello, Munchurl Kim
The 3D-zoom operation is the positive translation of the camera along the Z-axis, perpendicular to the image plane.
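Under a pinhole camera model, this forward translation magnifies each pixel by a depth-dependent factor, which is why 3D zoom produces parallax rather than uniform scaling. A minimal sketch of the geometry (the function and parameter names are illustrative assumptions):

```python
def zoomed_pixel(x, y, depth, tz, cx=0.0, cy=0.0):
    """Reproject a pixel under a forward camera translation tz along Z.

    Under a pinhole model, a 3D point at depth Z projects to f*X/Z;
    after moving the camera forward by tz its depth becomes Z - tz,
    so each pixel is magnified about the principal point (cx, cy)
    by the depth-dependent factor Z / (Z - tz).
    """
    scale = depth / (depth - tz)  # parallax: nearer points magnify more
    return (cx + (x - cx) * scale, cy + (y - cy) * scale)
```

A point at depth 4 under a translation of 2 is magnified by 2x, while a point at depth 100 barely moves; this depth-dependent displacement is the cue a 3D-zoom synthesis network must learn.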
1 code implementation • 10 Sep 2019 • Soo Ye Kim, Jihyong Oh, Munchurl Kim
Joint learning of super-resolution (SR) and inverse tone-mapping (ITM) has been explored recently, to convert legacy low resolution (LR) standard dynamic range (SDR) videos to high resolution (HR) high dynamic range (HDR) videos for the growing need of UHD HDR TV/broadcasting applications.
no code implementations • 13 Jun 2019 • Jae-Seok Choi, Yongwoo Kim, Munchurl Kim
Our proposed S3 loss can be very effectively utilized for pan-sharpening with various types of CNN structures, resulting in significant visual improvements on PS images with suppressed artifacts.
1 code implementation • ICCV 2019 • Soo Ye Kim, Jihyong Oh, Munchurl Kim
Joint SR and ITM is an intricate task, where high frequency details must be restored for SR, jointly with the local contrast, for ITM.
no code implementations • 5 Apr 2019 • Woonsung Park, Munchurl Kim
Drawing on lessons from conventional video coding, a B-frame coding structure is incorporated in our BP-DVC Net.
no code implementations • 30 Mar 2019 • Taimoor Tariq, Juan Luis Gonzalez, Munchurl Kim
We identify regions in input images, based on the underlying spatial frequency, which are not generally well reconstructed during Super-Resolution but are most important in terms of visual sensitivity.
no code implementations • 20 Mar 2019 • Juan Luis Gonzalez Bello, Munchurl Kim
Convolutional neural networks (CNNs) have shown state-of-the-art results for low-level computer vision problems such as stereo and monocular disparity estimation, but still have much room to improve in terms of accuracy, number of parameters, etc.
no code implementations • 14 Feb 2019 • Soomin Seo, Sehwan Ki, Munchurl Kim
Recently, owing to the strength of deep convolutional neural networks (CNNs), many CNN-based image quality assessment (IQA) models have been studied.
1 code implementation • 21 Dec 2018 • Soo Ye Kim, Jeongyeon Lim, Taeyoung Na, Munchurl Kim
In video super-resolution, the spatio-temporal coherence between and among frames must be exploited appropriately for accurate prediction of the high-resolution frames.
no code implementations • ECCV 2020 • Taimoor Tariq, Okan Tarhan Tursun, Munchurl Kim, Piotr Didyk
In particular, we focus our analysis on fundamental aspects of human perception, such as the contrast sensitivity and orientation selectivity.
no code implementations • 7 Oct 2018 • Juan Luis Gonzalez, Muhammad Sarmad, Hyunjoo J. Lee, Munchurl Kim
We show a supervised end-to-end training of our proposed networks for optical flow and disparity estimations, and an unsupervised end-to-end training for monocular depth and pose estimations.
no code implementations • 3 Oct 2018 • Andrey Ignatov, Radu Timofte, Thang Van Vu, Tung Minh Luu, Trung X. Pham, Cao Van Nguyen, Yongwoo Kim, Jae-Seok Choi, Munchurl Kim, Jie Huang, Jiewen Ran, Chen Xing, Xingguang Zhou, Pengfei Zhu, Mingrui Geng, Yawei Li, Eirikur Agustsson, Shuhang Gu, Luc van Gool, Etienne de Stoutz, Nikolay Kobyshev, Kehui Nie, Yan Zhao, Gen Li, Tong Tong, Qinquan Gao, Liu Hanwen, Pablo Navarrete Michelini, Zhu Dan, Hu Fengshuo, Zheng Hui, Xiumei Wang, Lirui Deng, Rang Meng, Jinghui Qin, Yukai Shi, Wushao Wen, Liang Lin, Ruicheng Feng, Shixiang Wu, Chao Dong, Yu Qiao, Subeesh Vasu, Nimisha Thekke Madam, Praveen Kandula, A. N. Rajagopalan, Jie Liu, Cheolkon Jung
This paper reviews the first challenge on efficient perceptual image enhancement with the focus on deploying deep learning models on smartphones.
no code implementations • 7 Nov 2017 • Jae-Seok Choi, Munchurl Kim
To the best of our knowledge, we are the first to incorporate MU into SR applications and show promising performance results.