Search Results for author: Seungryong Kim

Found 77 papers, 43 papers with code

DASC: Dense Adaptive Self-Correlation Descriptor for Multi-Modal and Multi-Spectral Correspondence

no code implementations CVPR 2015 Seungryong Kim, Dongbo Min, Bumsub Ham, Seungchul Ryu, Minh N. Do, Kwanghoon Sohn

To further improve the matching quality and runtime efficiency, we propose patch-wise receptive field pooling, in which a sampling pattern is optimized via discriminative learning.

Optical Flow Estimation
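
As a rough illustration of the self-correlation idea behind DASC, the sketch below builds a descriptor for a single pixel from normalized cross-correlations between patch pairs sampled inside its local support window. It is a minimal NumPy sketch: the function name, random sampling pattern, and parameters are illustrative stand-ins for the paper's learned, optimized sampling.

import numpy as np

def self_correlation_descriptor(image, y, x, num_pairs=32, support=15, patch=5, seed=0):
    # Toy self-correlation descriptor for the pixel (y, x): each entry is the
    # normalized cross-correlation between two patches inside the support
    # window, so the descriptor reflects local structure rather than absolute
    # intensities (the property exploited for multi-modal correspondence).
    rng = np.random.default_rng(seed)
    half_s, half_p = support // 2, patch // 2

    def grab(cy, cx):
        p = image[cy - half_p:cy + half_p + 1, cx - half_p:cx + half_p + 1].astype(np.float64)
        p = p - p.mean()
        return p / (np.linalg.norm(p) + 1e-8)

    desc = np.empty(num_pairs)
    for i in range(num_pairs):
        # random patch-pair centers; the real method optimizes this pattern
        (dy1, dx1), (dy2, dx2) = rng.integers(-half_s + half_p, half_s - half_p + 1, size=(2, 2))
        desc[i] = np.sum(grab(y + dy1, x + dx1) * grab(y + dy2, x + dx2))
    return desc

if __name__ == "__main__":
    img = np.random.rand(64, 64)
    print(self_correlation_descriptor(img, 32, 32).shape)  # (32,)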

Deep Self-Convolutional Activations Descriptor for Dense Cross-Modal Correspondence

no code implementations 21 Mar 2016 Seungryong Kim, Dongbo Min, Stephen Lin, Kwanghoon Sohn

We present a novel descriptor, called deep self-convolutional activations (DeSCA), designed for establishing dense correspondences between images taken under different imaging modalities, such as different spectral ranges or lighting conditions.

DASC: Robust Dense Descriptor for Multi-modal and Multi-spectral Correspondence Estimation

no code implementations 27 Apr 2016 Seungryong Kim, Dongbo Min, Bumsub Ham, Minh N. Do, Kwanghoon Sohn

In this paper, we propose a novel dense descriptor, called dense adaptive self-correlation (DASC), to estimate multi-modal and multi-spectral dense correspondences.

FCSS: Fully Convolutional Self-Similarity for Dense Semantic Correspondence

1 code implementation CVPR 2017 Seungryong Kim, Dongbo Min, Bumsub Ham, Sangryul Jeon, Stephen Lin, Kwanghoon Sohn

The sampling patterns of local structure and the self-similarity measure are jointly learned within the proposed network in an end-to-end and multi-scale manner.

Object, Semantic correspondence, +1

DCTM: Discrete-Continuous Transformation Matching for Semantic Flow

no code implementations ICCV 2017 Seungryong Kim, Dongbo Min, Stephen Lin, Kwanghoon Sohn

In this way, our approach draws solutions from the continuous space of affine transformations in a manner that can be computed efficiently through constant-time edge-aware filtering and a proposed affine-varying CNN-based descriptor.

Semantic correspondence

PARN: Pyramidal Affine Regression Networks for Dense Semantic Correspondence

no code implementations ECCV 2018 Sangryul Jeon, Seungryong Kim, Dongbo Min, Kwanghoon Sohn

To the best of our knowledge, it is the first work that attempts to estimate dense affine transformation fields in a coarse-to-fine manner within deep networks.

regression, Semantic correspondence

Recurrent Transformer Networks for Semantic Correspondence

1 code implementation NeurIPS 2018 Seungryong Kim, Stephen Lin, Sangryul Jeon, Dongbo Min, Kwanghoon Sohn

Our networks accomplish this through an iterative process of estimating spatial transformations between the input images and using these transformations to generate aligned convolutional activations.

General Classification, Semantic correspondence
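
The snippet below is a toy sketch of the iterative estimate-then-warp loop described above, assuming a simple residual-flow head in place of RTN's recurrent geometric matching module; the class and parameter names are illustrative.

import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyRecurrentAligner(nn.Module):
    # Repeatedly (1) estimates a residual transformation from the current
    # alignment and (2) uses the accumulated transformation to re-warp the
    # source features toward the target features.
    def __init__(self, channels=64, iters=3):
        super().__init__()
        self.iters = iters
        self.flow_head = nn.Sequential(
            nn.Conv2d(2 * channels, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 2, 3, padding=1),
        )

    def forward(self, feat_src, feat_tgt):
        b, _, h, w = feat_src.shape
        ys, xs = torch.meshgrid(torch.linspace(-1, 1, h), torch.linspace(-1, 1, w), indexing="ij")
        base_grid = torch.stack((xs, ys), dim=-1).expand(b, h, w, 2).to(feat_src)
        flow = torch.zeros(b, h, w, 2, device=feat_src.device)
        warped = feat_src
        for _ in range(self.iters):
            residual = self.flow_head(torch.cat([warped, feat_tgt], dim=1))
            flow = flow + residual.permute(0, 2, 3, 1)
            warped = F.grid_sample(feat_src, base_grid + flow, align_corners=True)
        return warped, flow

if __name__ == "__main__":
    f1, f2 = torch.randn(1, 64, 16, 16), torch.randn(1, 64, 16, 16)
    warped, flow = ToyRecurrentAligner()(f1, f2)
    print(warped.shape, flow.shape)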

Semantic Attribute Matching Networks

no code implementations CVPR 2019 Seungryong Kim, Dongbo Min, Somi Jeong, Sunok Kim, Sangryul Jeon, Kwanghoon Sohn

SAM-Net accomplishes this through an iterative process of establishing reliable correspondences by reducing the attribute discrepancy between the images and synthesizing attribute transferred images using the learned correspondences.

Attribute

Context-Aware Emotion Recognition Networks

1 code implementation ICCV 2019 Jiyoung Lee, Seungryong Kim, Sunok Kim, Jungin Park, Kwanghoon Sohn

We present deep networks for context-aware emotion recognition, called CAER-Net, that exploit not only human facial expression but also context information in a joint and boosting manner.

Emotion Classification, Emotion Recognition in Context

Joint Learning of Semantic Alignment and Object Landmark Detection

no code implementations ICCV 2019 Sangryul Jeon, Dongbo Min, Seungryong Kim, Kwanghoon Sohn

Based on the key insight that the two tasks can mutually provide supervision to each other, our networks accomplish this through a joint loss function that alternately imposes a consistency constraint between the two tasks, thereby boosting the performance and addressing the lack of training data in a principled manner.

Object

Cylindrical Convolutional Networks for Joint Object Detection and Viewpoint Estimation

no code implementations CVPR 2020 Sunghun Joung, Seungryong Kim, Hanjae Kim, Minsu Kim, Ig-Jae Kim, Junghyun Cho, Kwanghoon Sohn

To overcome this limitation, we introduce a learnable module, cylindrical convolutional networks (CCNs), that exploit cylindrical representation of a convolutional kernel defined in the 3D space.

Object, object-detection, +2

Volumetric Transformer Networks

no code implementations ECCV 2020 Seungryong Kim, Sabine Süsstrunk, Mathieu Salzmann

We design our VTN as an encoder-decoder network, with modules dedicated to letting the information flow across the feature channels, to account for the dependencies between the semantic parts.

Fine-Grained Image Recognition, Image Retrieval, +1

Adaptive confidence thresholding for monocular depth estimation

1 code implementation ICCV 2021 Hyesong Choi, Hunsang Lee, Sunkyung Kim, Sunok Kim, Seungryong Kim, Kwanghoon Sohn, Dongbo Min

To cope with the prediction error of the confidence map itself, we also leverage a threshold network that learns the threshold dynamically, conditioned on the pseudo depth maps.

Monocular Depth Estimation, Stereo Matching

Online Exemplar Fine-Tuning for Image-to-Image Translation

no code implementations 18 Nov 2020 Taewon Kang, Soohyun Kim, Sunwoo Kim, Seungryong Kim

Existing techniques to solve exemplar-based image-to-image translation within deep convolutional neural networks (CNNs) generally require a training phase to optimize the network parameters on domain-specific and task-specific benchmarks, thus having limited applicability and generalization ability.

Image-to-Image Translation, Translation

Cross-Domain Grouping and Alignment for Domain Adaptive Semantic Segmentation

1 code implementation 15 Dec 2020 Minsu Kim, Sunghun Joung, Seungryong Kim, Jungin Park, Ig-Jae Kim, Kwanghoon Sohn

Existing techniques to adapt semantic segmentation networks across the source and target domains within deep convolutional neural networks (CNNs) deal with all the samples from the two domains in a global or category-aware manner.

Clustering, Domain Adaptation, +2

On the confidence of stereo matching in a deep-learning era: a quantitative evaluation

1 code implementation 2 Jan 2021 Matteo Poggi, Seungryong Kim, Fabio Tosi, Sunok Kim, Filippo Aleotti, Dongbo Min, Kwanghoon Sohn, Stefano Mattoccia

Stereo matching is one of the most popular techniques to estimate dense depth maps by finding the disparity between matching pixels on two synchronized and rectified images.

Stereo Matching
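
As context for such a quantitative evaluation, the toy function below computes one classical handcrafted confidence measure, a peak-ratio-style score derived from the cost volume; the name and exact formulation are illustrative and not taken from the paper.

import numpy as np

def peak_ratio_confidence(cost_volume):
    # cost_volume: (H, W, D) matching costs, lower = better. Pixels whose best
    # cost is much smaller than the runner-up are unambiguous and get a high
    # confidence (ratio well above 1).
    sorted_costs = np.sort(cost_volume, axis=-1)
    best, second = sorted_costs[..., 0], sorted_costs[..., 1]
    return second / (best + 1e-8)

if __name__ == "__main__":
    cv = np.random.rand(4, 5, 64)
    print(peak_ratio_confidence(cv).shape)  # (4, 5)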

Modeling Object Dissimilarity for Deep Saliency Prediction

1 code implementation 8 Apr 2021 Bahar Aydemir, Deblina Bhattacharjee, Tong Zhang, Seungryong Kim, Mathieu Salzmann, Sabine Süsstrunk

Saliency prediction has made great strides over the past two decades, with current techniques modeling low-level information, such as color, intensity and size contrasts, and high-level ones, such as attention and gaze direction for entire objects.

Object, Saliency Prediction

CATs: Cost Aggregation Transformers for Visual Correspondence

1 code implementation NeurIPS 2021 Seokju Cho, Sunghwan Hong, Sangryul Jeon, Yunsung Lee, Kwanghoon Sohn, Seungryong Kim

We propose a novel cost aggregation network, called Cost Aggregation Transformers (CATs), to find dense correspondences between semantically similar images with additional challenges posed by large intra-class appearance and geometric variations.

Semantic correspondence
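
A minimal sketch of the cost-aggregation idea, under the assumption of a plain correlation volume refined by a standard transformer encoder; CATs itself additionally embeds appearance features and uses a more elaborate aggregation scheme, so this is only a structural stand-in.

import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyCostAggregator(nn.Module):
    def __init__(self, num_tgt_tokens, depth=2, heads=4):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model=num_tgt_tokens, nhead=heads, batch_first=True)
        self.aggregator = nn.TransformerEncoder(layer, num_layers=depth)

    def forward(self, feat_src, feat_tgt):
        b, c, h, w = feat_src.shape
        src = F.normalize(feat_src.flatten(2), dim=1)   # (b, c, hw)
        tgt = F.normalize(feat_tgt.flatten(2), dim=1)   # (b, c, hw)
        cost = torch.einsum("bcm,bcn->bmn", src, tgt)   # raw matching scores
        # each source position is a token holding its scores against every
        # target position; self-attention lets tokens disambiguate noisy matches
        cost = self.aggregator(cost)
        return cost.view(b, h, w, h, w)

if __name__ == "__main__":
    f1, f2 = torch.randn(1, 128, 8, 8), torch.randn(1, 128, 8, 8)
    print(ToyCostAggregator(num_tgt_tokens=8 * 8)(f1, f2).shape)  # (1, 8, 8, 8, 8)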

Deep Matching Prior: Test-Time Optimization for Dense Correspondence

1 code implementation ICCV 2021 Sunghwan Hong, Seungryong Kim

Conventional techniques to establish dense correspondences across visually or semantically similar images focused on designing a task-specific matching prior, which is difficult to model.

 Ranked #1 on Dense Pixel Correspondence Estimation on HPatches (using extra training data)

Dense Pixel Correspondence Estimation, Geometric Matching
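
To illustrate test-time optimization for dense matching, the sketch below optimizes a flow field per image pair with a feature-similarity data term and a smoothness term. DMP instead optimizes the weights of an untrained matching network, so treat this as a simplified stand-in rather than the paper's method.

import torch
import torch.nn.functional as F

def test_time_match(feat_src, feat_tgt, iters=100, lr=0.1, smooth_w=0.1):
    # Optimize a dense flow so that warped source features agree with the
    # target features; no training data or learned prior is involved.
    b, c, h, w = feat_src.shape
    ys, xs = torch.meshgrid(torch.linspace(-1, 1, h), torch.linspace(-1, 1, w), indexing="ij")
    base = torch.stack((xs, ys), -1).expand(b, h, w, 2).to(feat_src)
    flow = torch.zeros(b, h, w, 2, device=feat_src.device, requires_grad=True)
    opt = torch.optim.Adam([flow], lr=lr)
    for _ in range(iters):
        warped = F.grid_sample(feat_src, base + flow, align_corners=True)
        data = 1 - F.cosine_similarity(warped, feat_tgt, dim=1).mean()
        smooth = flow.diff(dim=1).abs().mean() + flow.diff(dim=2).abs().mean()
        loss = data + smooth_w * smooth
        opt.zero_grad(); loss.backward(); opt.step()
    return flow.detach()

if __name__ == "__main__":
    f1, f2 = torch.randn(1, 64, 32, 32), torch.randn(1, 64, 32, 32)
    print(test_time_match(f1, f2, iters=10).shape)  # (1, 32, 32, 2)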

Learning Canonical 3D Object Representation for Fine-Grained Recognition

no code implementations ICCV 2021 Sunghun Joung, Seungryong Kim, Minsu Kim, Ig-Jae Kim, Kwanghoon Sohn

By incorporating 3D shape and appearance jointly in a deep representation, our method learns the discriminative representation of the object and achieves competitive performance on fine-grained image recognition and vehicle re-identification.

3D Shape Reconstruction, Fine-Grained Image Recognition, +3

Deep Translation Prior: Test-time Training for Photorealistic Style Transfer

1 code implementation 12 Dec 2021 Sunwoo Kim, Soohyun Kim, Seungryong Kim

Recent techniques to solve photorealistic style transfer within deep convolutional neural networks (CNNs) generally require intensive training from large-scale datasets, thus having limited applicability and poor generalization ability to unseen images or styles.

Style Transfer, Translation

Call for Customized Conversation: Customized Conversation Grounding Persona and Knowledge

2 code implementations 16 Dec 2021 Yoonna Jang, Jungwoo Lim, Yuna Hur, Dongsuk Oh, Suhyune Son, Yeonsoo Lee, Donghoon Shin, Seungryong Kim, Heuiseok Lim

Humans usually have conversations by making use of prior knowledge about a topic and background information of the people whom they are talking to.

Cost Aggregation Is All You Need for Few-Shot Segmentation

2 code implementations 22 Dec 2021 Sunghwan Hong, Seokju Cho, Jisu Nam, Seungryong Kim

We introduce a novel cost aggregation network, dubbed Volumetric Aggregation with Transformers (VAT), to tackle the few-shot segmentation task by using both convolutions and transformers to efficiently handle high dimensional correlation maps between query and support.

Few-Shot Semantic Segmentation, Inductive Bias, +2

Memory-guided Image De-raining Using Time-Lapse Data

no code implementations 6 Jan 2022 Jaehoon Cho, Seungryong Kim, Kwanghoon Sohn

To address this problem, we propose a novel network architecture based on a memory network that explicitly helps to capture long-term rain streak information in the time-lapse data.

AggMatch: Aggregating Pseudo Labels for Semi-Supervised Learning

no code implementations 25 Jan 2022 Jiwon Kim, Kwangrok Ryoo, Gyuseong Lee, Seokju Cho, Junyoung Seo, Daehwan Kim, Hansang Cho, Seungryong Kim

In this paper, we address this limitation with a novel SSL framework for aggregating pseudo labels, called AggMatch, which refines initial pseudo labels by using different confident instances.

Pseudo Label

CATs++: Boosting Cost Aggregation with Convolutions and Transformers

1 code implementation 14 Feb 2022 Seokju Cho, Sunghwan Hong, Seungryong Kim

Cost aggregation is a highly important process in image matching tasks, which aims to disambiguate the noisy matching scores.

Semantic correspondence

InstaFormer: Instance-Aware Image-to-Image Translation with Transformer

1 code implementation CVPR 2022 Soohyun Kim, Jongbeom Baek, JiHye Park, Gyeongnyeon Kim, Seungryong Kim

By augmenting such tokens with an instance-level feature extracted from the content feature with respect to bounding box information, our framework is capable of learning an interaction between object instances and the global image, thus boosting the instance-awareness.

Image-to-Image Translation, Translation

Semi-Supervised Learning of Semantic Correspondence with Pseudo-Labels

no code implementations CVPR 2022 Jiwon Kim, Kwangrok Ryoo, Junyoung Seo, Gyuseong Lee, Daehwan Kim, Hansang Cho, Seungryong Kim

In this paper, we present a simple but effective solution for semantic correspondence that learns the networks in a semi-supervised manner by supplementing a few ground-truth correspondences with a large number of confident correspondences used as pseudo-labels, called SemiMatch.

Data Augmentation, Semantic correspondence, +1

Joint Learning of Feature Extraction and Cost Aggregation for Semantic Correspondence

no code implementations 5 Apr 2022 Jiwon Kim, Youngjo Min, Mira Kim, Seungryong Kim

In this paper, we propose a novel framework for jointly learning feature extraction and cost aggregation for semantic correspondence.

Semantic correspondence

AE-NeRF: Auto-Encoding Neural Radiance Fields for 3D-Aware Object Manipulation

no code implementations 28 Apr 2022 Mira Kim, Jaehoon Ko, Kyusun Cho, Junmyeong Choi, Daewon Choi, Seungryong Kim

We propose a novel framework for 3D-aware object manipulation, called Auto-Encoding Neural Radiance Fields (AE-NeRF).

Attribute Disentanglement

Cost Aggregation with 4D Convolutional Swin Transformer for Few-Shot Segmentation

1 code implementation 22 Jul 2022 Sunghwan Hong, Seokju Cho, Jisu Nam, Stephen Lin, Seungryong Kim

However, the tokenization of a correlation map for transformer processing can be detrimental, because the discontinuity at token boundaries reduces the local context available near the token edges and decreases inductive bias.

Few-Shot Semantic Segmentation, Inductive Bias, +1

ConMatch: Semi-Supervised Learning with Confidence-Guided Consistency Regularization

1 code implementation 18 Aug 2022 Jiwon Kim, Youngjo Min, Daehwan Kim, Gyuseong Lee, Junyoung Seo, Kwangrok Ryoo, Seungryong Kim

We present a novel semi-supervised learning framework that intelligently leverages the consistency regularization between the model's predictions from two strongly-augmented views of an image, weighted by a confidence of pseudo-label, dubbed ConMatch.

Pseudo Label
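
A compact sketch of confidence-weighted consistency regularization, assuming FixMatch-style pseudo-labels from a weakly augmented view and an externally supplied confidence score; ConMatch learns that confidence with a dedicated head, which is omitted here.

import torch
import torch.nn.functional as F

def confidence_weighted_consistency(logits_weak, logits_strong, confidence, tau=0.95):
    # Pseudo-labels come from the weak view; the per-sample cross-entropy on
    # the strong view is masked by a threshold and weighted by the confidence.
    with torch.no_grad():
        probs = logits_weak.softmax(dim=-1)
        max_prob, pseudo = probs.max(dim=-1)
        mask = (max_prob >= tau).float()
    ce = F.cross_entropy(logits_strong, pseudo, reduction="none")
    return (confidence * mask * ce).mean()

if __name__ == "__main__":
    lw, ls = torch.randn(8, 10), torch.randn(8, 10, requires_grad=True)
    loss = confidence_weighted_consistency(lw, ls, torch.rand(8))
    loss.backward()
    print(float(loss))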

LANIT: Language-Driven Image-to-Image Translation for Unlabeled Data

1 code implementation CVPR 2023 JiHye Park, Sunwoo Kim, Soohyun Kim, Seokju Cho, Jaejun Yoo, Youngjung Uh, Seungryong Kim

Existing techniques for image-to-image translation have commonly suffered from two critical problems: heavy reliance on per-sample domain annotation and/or the inability to handle multiple attributes per image.

Translation, Unsupervised Image-To-Image Translation

Integrative Feature and Cost Aggregation with Transformers for Dense Correspondence

no code implementations 19 Sep 2022 Sunghwan Hong, Seokju Cho, Seungryong Kim, Stephen Lin

The current state-of-the-art methods are Transformer-based approaches that focus on either feature descriptors or cost volume aggregation.

Geometric Matching, Semantic correspondence

MIDMs: Matching Interleaved Diffusion Models for Exemplar-based Image Translation

1 code implementation 22 Sep 2022 Junyoung Seo, Gyuseong Lee, Seokju Cho, Jiyoung Lee, Seungryong Kim

Specifically, we formulate a diffusion-based matching-and-generation framework that interleaves cross-domain matching and diffusion steps in the latent space by iteratively feeding the intermediate warp into the noising process and denoising it to generate a translated image.

Denoising, Translation
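
The skeleton below only illustrates the interleaving of cross-domain matching and denoising steps, with stand-in callables in place of the paper's correspondence and diffusion modules; it is a structural sketch, not an implementation of MIDMs.

import torch

def interleaved_matching_diffusion(match_and_warp, denoise_step, z_T, exemplar_feat, steps):
    # At every reverse step, matching re-warps the exemplar toward the current
    # latent estimate, and the warped result conditions the next denoising step.
    z = z_T
    for t in reversed(range(steps)):
        warped = match_and_warp(z, exemplar_feat)
        z = denoise_step(z, warped, t)
    return z

if __name__ == "__main__":
    z0 = interleaved_matching_diffusion(
        match_and_warp=lambda z, ex: 0.5 * z + 0.5 * ex,   # dummy "warp"
        denoise_step=lambda z, w, t: 0.9 * z + 0.1 * w,    # dummy denoiser
        z_T=torch.randn(1, 4, 32, 32),
        exemplar_feat=torch.randn(1, 4, 32, 32),
        steps=10,
    )
    print(z0.shape)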

Improving Sample Quality of Diffusion Models Using Self-Attention Guidance

4 code implementations ICCV 2023 Susung Hong, Gyuseong Lee, Wooseok Jang, Seungryong Kim

Denoising diffusion models (DDMs) have attracted attention for their exceptional generation quality and diversity.

Denoising, Image Generation

Towards Flexible Inductive Bias via Progressive Reparameterization Scheduling

no code implementations 4 Oct 2022 Yunsung Lee, Gyuseong Lee, Kwangrok Ryoo, Hyojun Go, JiHye Park, Seungryong Kim

In addition, through Fourier analysis of feature maps, which reveals the model's response patterns according to signal frequency changes, we observe which inductive bias is advantageous for each data scale.

Inductive Bias, Scheduling

Neural Matching Fields: Implicit Representation of Matching Fields for Visual Correspondence

1 code implementation 6 Oct 2022 Sunghwan Hong, Jisu Nam, Seokju Cho, Susung Hong, Sangryul Jeon, Dongbo Min, Seungryong Kim

Existing pipelines of semantic correspondence commonly include extracting high-level semantic features for the invariance against intra-class variations and background clutters.

Semantic correspondence

3D GAN Inversion with Pose Optimization

1 code implementation 13 Oct 2022 Jaehoon Ko, Kyusun Cho, Daewon Choi, Kwangrok Ryoo, Seungryong Kim

With the recent advances in NeRF-based 3D aware GANs quality, projecting an image into the latent space of these 3D-aware GANs has a natural advantage over 2D GAN inversion: not only does it allow multi-view consistent editing of the projected image, but it also enables 3D reconstruction and novel view synthesis when given only a single image.

3D Reconstruction, Image Reconstruction, +1

Controllable Style Transfer via Test-time Training of Implicit Neural Representation

1 code implementation 14 Oct 2022 Sunwoo Kim, Youngjo Min, Younghun Jung, Seungryong Kim

We propose a controllable style transfer framework based on Implicit Neural Representation that pixel-wisely controls the stylized output via test-time training.

Model Optimization, Style Transfer
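
A minimal implicit neural representation of the kind referred to above: a coordinate MLP mapping pixel positions to RGB, which test-time training would fit to a given content/style pair. The architecture, sizes, and the absence of any control inputs are illustrative simplifications.

import torch
import torch.nn as nn

class CoordinateMLP(nn.Module):
    # Maps (x, y) coordinates in [-1, 1] to RGB; querying it on a dense grid
    # reconstructs an image, and extra per-pixel control signals could be
    # appended to the input (omitted here).
    def __init__(self, hidden=256, layers=4):
        super().__init__()
        dims = [2] + [hidden] * layers + [3]
        blocks = []
        for i in range(len(dims) - 1):
            blocks.append(nn.Linear(dims[i], dims[i + 1]))
            if i < len(dims) - 2:
                blocks.append(nn.ReLU())
        self.net = nn.Sequential(*blocks)

    def forward(self, coords):
        return torch.sigmoid(self.net(coords))

if __name__ == "__main__":
    h = w = 64
    ys, xs = torch.meshgrid(torch.linspace(-1, 1, h), torch.linspace(-1, 1, w), indexing="ij")
    coords = torch.stack((xs, ys), -1).reshape(-1, 2)
    print(CoordinateMLP()(coords).reshape(h, w, 3).shape)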

SplitNet: Learnable Clean-Noisy Label Splitting for Learning with Noisy Labels

no code implementations 20 Nov 2022 Daehwan Kim, Kwangrok Ryoo, Hansang Cho, Seungryong Kim

To address this, some methods were proposed to automatically split clean and noisy labels, and learn a semi-supervised learner in a Learning with Noisy Labels (LNL) framework.

Learning with noisy labels

MaskingDepth: Masked Consistency Regularization for Semi-supervised Monocular Depth Estimation

1 code implementation 21 Dec 2022 Jongbeom Baek, Gyeongnyeon Kim, Seonghoon Park, Honggyu An, Matteo Poggi, Seungryong Kim

We propose MaskingDepth, a novel semi-supervised learning framework for monocular depth estimation to mitigate the reliance on large ground-truth depth quantities.

Data Augmentation, Domain Adaptation, +5

DiffFace: Diffusion-based Face Swapping with Facial Guidance

1 code implementation 27 Dec 2022 Kihong Kim, Yunho Kim, Seokju Cho, Junyoung Seo, Jisu Nam, Kychul Lee, Seungryong Kim, Kwanghee Lee

In this paper, we propose the first diffusion-based face swapping framework, called DiffFace, composed of training an ID-conditional DDPM, sampling with facial guidance, and target-preserving blending.

Face Swapping

GeCoNeRF: Few-shot Neural Radiance Fields via Geometric Consistency

1 code implementation 26 Jan 2023 Min-Seop Kwak, Jiuhn Song, Seungryong Kim

We present a novel framework to regularize Neural Radiance Field (NeRF) in a few-shot setting with a geometry-aware consistency regularization.

Let 2D Diffusion Model Know 3D-Consistency for Robust Text-to-3D Generation

1 code implementation 14 Mar 2023 Junyoung Seo, Wooseok Jang, Min-Seop Kwak, Hyeonsu Kim, Jaehoon Ko, Junho Kim, Jin-Hwa Kim, Jiyoung Lee, Seungryong Kim

Text-to-3D generation has recently shown rapid progress with the advent of score distillation, a methodology that uses pretrained text-to-2D diffusion models to optimize a neural radiance field (NeRF) in the zero-shot setting.

Single-View 3D Reconstruction, Text to 3D
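
The score distillation mentioned above can be summarized with the standard SDS surrogate loss sketched below, where eps_model is a stand-in for the pretrained text-conditioned 2D diffusion model and the weighting is one common choice; the paper's injection of 3D consistency into this objective is not shown.

import torch

def sds_loss(eps_model, x, text_emb, alphas_cumprod):
    # x is a differentiable rendering of the 3D representation. The residual
    # (eps_pred - eps) is detached, so the gradient of the returned scalar
    # with respect to x is w(t) * (eps_pred - eps), the SDS update direction.
    t = torch.randint(0, len(alphas_cumprod), (1,))
    a_t = alphas_cumprod[t].view(-1, 1, 1, 1)
    eps = torch.randn_like(x)
    x_t = a_t.sqrt() * x + (1 - a_t).sqrt() * eps   # noise the rendering
    eps_pred = eps_model(x_t, t, text_emb)
    w = 1 - a_t                                      # one common weighting
    return (w * (eps_pred - eps).detach() * x).sum()

if __name__ == "__main__":
    dummy_eps_model = lambda x_t, t, emb: torch.randn_like(x_t)   # stand-in network
    x = torch.rand(1, 3, 64, 64, requires_grad=True)
    alphas = torch.linspace(0.9999, 0.01, 1000)
    sds_loss(dummy_eps_model, x, None, alphas).backward()
    print(x.grad.abs().mean())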

Few-shot Neural Radiance Fields Under Unconstrained Illumination

no code implementations 21 Mar 2023 SeokYeong Lee, Junyong Choi, Seungryong Kim, Ig-Jae Kim, Junghyun Cho

In this paper, we introduce a new challenge for synthesizing novel view images in practical environments with limited input multi-view images and varying lighting conditions.

Novel View Synthesis

CAT-Seg: Cost Aggregation for Open-Vocabulary Semantic Segmentation

3 code implementations 21 Mar 2023 Seokju Cho, Heeseong Shin, Sunghwan Hong, Seungjun An, Seungjun Lee, Anurag Arnab, Paul Hongsuck Seo, Seungryong Kim

However, the problem of transferring these capabilities learned from image-level supervision to the pixel-level task of segmentation and addressing arbitrary unseen categories at inference makes this task challenging.

Image Segmentation, Open Vocabulary Semantic Segmentation, +3
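
A minimal sketch of the image-text cost volume behind open-vocabulary segmentation, assuming CLIP-style dense pixel embeddings and prompt embeddings; CAT-Seg's actual contribution is aggregating this volume rather than reading it off directly, which is omitted here.

import torch
import torch.nn.functional as F

def open_vocab_cost(pixel_feats, text_embs):
    # pixel_feats: (B, C, H, W) dense visual embeddings; text_embs: (K, C)
    # embeddings of K arbitrary class prompts. Cosine similarity yields a
    # (B, K, H, W) cost volume; argmax gives a naive per-pixel class map.
    p = F.normalize(pixel_feats, dim=1)
    t = F.normalize(text_embs, dim=1)
    cost = torch.einsum("bchw,kc->bkhw", p, t)
    return cost.argmax(dim=1), cost

if __name__ == "__main__":
    seg, cost = open_vocab_cost(torch.randn(1, 512, 16, 16), torch.randn(5, 512))
    print(seg.shape, cost.shape)  # (1, 16, 16) (1, 5, 16, 16)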

Debiasing Scores and Prompts of 2D Diffusion for View-consistent Text-to-3D Generation

1 code implementation NeurIPS 2023 Susung Hong, Donghoon Ahn, Seungryong Kim

In this work, we explore existing frameworks for score-distilling text-to-3D generation and identify the main causes of the view inconsistency problem -- the embedded bias of 2D diffusion models.

Language Modelling, Text to 3D

PartMix: Regularization Strategy to Learn Part Discovery for Visible-Infrared Person Re-identification

no code implementations CVPR 2023 Minsu Kim, Seungryong Kim, Jungin Park, Seongheon Park, Kwanghoon Sohn

Modern data augmentation using a mixture-based technique can regularize the models from overfitting to the training data in various computer vision applications, but a proper data augmentation technique tailored for the part-based Visible-Infrared person Re-IDentification (VI-ReID) models remains unexplored.

Contrastive Learning, Data Augmentation, +1

Panoramic Image-to-Image Translation

no code implementations 11 Apr 2023 Soohyun Kim, Junho Kim, Taekyung Kim, Hwan Heo, Seungryong Kim, Jiyoung Lee, Jin-Hwa Kim

This task is difficult due to the geometric distortion of panoramic images and the lack of a panoramic image dataset with diverse conditions, like weather or time.

Image-to-Image Translation, Translation

DirecT2V: Large Language Models are Frame-Level Directors for Zero-Shot Text-to-Video Generation

1 code implementation 23 May 2023 Susung Hong, Junyoung Seo, Heeseong Shin, Sunghwan Hong, Seungryong Kim

In the paradigm of AI-generated content (AIGC), there has been increasing attention to transferring knowledge from pre-trained text-to-image (T2I) models to text-to-video (T2V) generation.

Text-to-Video Generation, Video Generation, +1

DaRF: Boosting Radiance Fields from Sparse Inputs with Monocular Depth Adaptation

1 code implementation 30 May 2023 Jiuhn Song, Seonghoon Park, Honggyu An, Seokju Cho, Min-Seop Kwak, SungJin Cho, Seungryong Kim

Employing monocular depth estimation (MDE) networks, pretrained on large-scale RGB-D datasets, with powerful generalization capability would be a key to solving this problem; however, using MDE in conjunction with NeRF comes with a new set of challenges due to various ambiguity problems exhibited by monocular depths.

Monocular Depth Estimation, Novel View Synthesis

Diffusion Model for Dense Matching

1 code implementation 30 May 2023 Jisu Nam, Gyuseong Lee, Sunwoo Kim, Hyeonsu Kim, Hyoungwon Cho, Seyeon Kim, Seungryong Kim

The objective for establishing dense correspondence between paired images consists of two terms: a data term and a prior term.

Denoising

User-friendly Image Editing with Minimal Text Input: Leveraging Captioning and Injection Techniques

no code implementations 5 Jun 2023 Sunwoo Kim, Wooseok Jang, Hyunsu Kim, Junho Kim, Yunjey Choi, Seungryong Kim, Gayeong Lee

From the users' standpoint, prompt engineering is a labor-intensive process, and users prefer to provide a target word for editing instead of a full sentence.

Prompt Engineering, Sentence

Domain Generalization Using Large Pretrained Models with Mixture-of-Adapters

1 code implementation 17 Oct 2023 Gyuseong Lee, Wooseok Jang, Jin Hyeon Kim, Jaewoo Jung, Seungryong Kim

By using both PEFT and MoA methods, we effectively alleviate the performance deterioration caused by distribution shifts and achieve state-of-the-art performance on diverse DG benchmarks.

Domain Generalization

Match me if you can: Semantic Correspondence Learning with Unpaired Images

no code implementations 30 Nov 2023 Jiwon Kim, Byeongho Heo, Sangdoo Yun, Seungryong Kim, Dongyoon Han

Recent approaches for semantic correspondence have focused on obtaining high-quality correspondences using a complicated network, refining the ambiguous or noisy matching points.

Semantic correspondence

Self-Evolving Neural Radiance Fields

1 code implementation 2 Dec 2023 Jaewoo Jung, Jisang Han, Jiwon Kang, Seongchan Kim, Min-Seop Kwak, Seungryong Kim

We formulate few-shot NeRF into a teacher-student framework to guide the network to learn a more robust representation of the scene by training the student with additional pseudo labels generated from the teacher.

3D Reconstruction, Novel View Synthesis

Unifying Correspondence, Pose and NeRF for Pose-Free Novel View Synthesis from Stereo Pairs

1 code implementation 12 Dec 2023 Sunghwan Hong, Jaewoo Jung, Heeseong Shin, Jiaolong Yang, Seungryong Kim, Chong Luo

This work delves into the task of pose-free novel view synthesis from stereo pairs, a challenging and pioneering task in 3D vision.

Novel View Synthesis, Pose Estimation

Universal Noise Annotation: Unveiling the Impact of Noisy annotation on Object Detection

1 code implementation 21 Dec 2023 Kwangrok Ryoo, Yeonsik Jo, Seungjun Lee, Mira Kim, Ahra Jo, Seung Hwan Kim, Seungryong Kim, Soonyoung Lee

For the object detection task with noisy labels, it is important to consider not only categorization noise, as in image classification, but also localization noise, missing annotations, and bogus bounding boxes.

Image Classification, Object, +2

Context Enhanced Transformer for Single Image Object Detection

no code implementations 22 Dec 2023 Seungjun An, Seonghoon Park, Gyeongnyeon Kim, JeongYeol Baek, Byeongwon Lee, Seungryong Kim

With the increasing importance of video data in real-world applications, there is a rising need for efficient object detection methods that utilize temporal information.

Object, object-detection, +1

Retrieval-Augmented Score Distillation for Text-to-3D Generation

1 code implementation 5 Feb 2024 Junyoung Seo, Susung Hong, Wooseok Jang, Inès Hyeonsu Kim, Minseop Kwak, Doyup Lee, Seungryong Kim

We leverage the retrieved asset to incorporate its geometric prior in the variational objective and adapt the diffusion model's 2D prior toward view consistency, achieving drastic improvements in both geometry and fidelity of generated scenes.

Retrieval, Text to 3D

DreamMatcher: Appearance Matching Self-Attention for Semantically-Consistent Text-to-Image Personalization

1 code implementation 15 Feb 2024 Jisu Nam, Heesu Kim, Dongjae Lee, Siyoon Jin, Seungryong Kim, Seunggyu Chang

The objective of text-to-image (T2I) personalization is to customize a diffusion model to a user-provided reference concept, generating diverse images of the concept aligned with the target prompts.

Denoising

Hybrid Video Diffusion Models with 2D Triplane and 3D Wavelet Representation

no code implementations 21 Feb 2024 Kihong Kim, Haneol Lee, JiHye Park, Seyeon Kim, Kwanghee Lee, Seungryong Kim, Jaejun Yoo

Generating high-quality videos that synthesize desired realistic content is a challenging task due to the intricate high-dimensionality and complexity of videos.

Video Generation, Video Reconstruction

LatentSwap: An Efficient Latent Code Mapping Framework for Face Swapping

no code implementations 28 Feb 2024 Changho Choi, Minho Kim, Junhyeok Lee, Hyoung-Kyu Song, Younggeun Kim, Seungryong Kim

We show that our framework is applicable to other generators such as StyleNeRF, paving the way to 3D-aware face swapping, and is also compatible with other downstream StyleGAN2 generator tasks.

Face Swapping

HeteroSwitch: Characterizing and Taming System-Induced Data Heterogeneity in Federated Learning

no code implementations 7 Mar 2024 Gyudong Kim, Mehdi Ghasemi, Soroush Heidari, Seungryong Kim, Young Geun Kim, Sarma Vrudhula, Carole-Jean Wu

Such fragmentation introduces a new type of data heterogeneity in FL, namely system-induced data heterogeneity, as each device generates distinct data depending on its hardware and software configurations.

Domain Generalization, Fairness, +1

Relaxing Accurate Initialization Constraint for 3D Gaussian Splatting

1 code implementation 14 Mar 2024 Jaewoo Jung, Jisang Han, Honggyu An, Jiwon Kang, Seonghoon Park, Seungryong Kim

Through extensive analysis of SfM initialization in the frequency domain and analysis of a 1D regression task with multiple 1D Gaussians, we propose a novel optimization strategy dubbed RAIN-GS (Relaxing Accurate Initialization Constraint for 3D Gaussian Splatting), that successfully trains 3D Gaussians from random point clouds.

3D Reconstruction, Novel View Synthesis

Unifying Feature and Cost Aggregation with Transformers for Semantic and Visual Correspondence

no code implementations 17 Mar 2024 Sunghwan Hong, Seokju Cho, Seungryong Kim, Stephen Lin

In this work, we first show that feature aggregation and cost aggregation exhibit distinct characteristics and reveal the potential for substantial benefits stemming from the judicious use of both aggregation processes.

Geometric Matching

Self-Rectifying Diffusion Sampling with Perturbed-Attention Guidance

1 code implementation 26 Mar 2024 Donghoon Ahn, Hyoungwon Cho, Jaewon Min, Wooseok Jang, Jungwoo Kim, SeonHwa Kim, Hyun Hee Park, Kyong Hwan Jin, Seungryong Kim

These techniques are often not applicable in unconditional generation or in various downstream tasks such as image restoration.

Deblurring, Denoising, +2

Guided Semantic Flow

no code implementations ECCV 2020 Sangryul Jeon, Dongbo Min, Seungryong Kim, Jihwan Choe, Kwanghoon Sohn

Establishing dense semantic correspondences requires dealing with large geometric variations caused by the unconstrained setting of images.

Semantic correspondence
