Search Results for author: In So Kweon

Found 185 papers, 60 papers with code

Global-and-Local Relative Position Embedding for Unsupervised Video Summarization

no code implementations ECCV 2020 Yunjae Jung, Donghyeon Cho, Sanghyun Woo, In So Kweon

In order to summarize a content video properly, it is important to grasp the sequential structure of video as well as the long-term dependency between frames.

Computational Efficiency Position +1

360 in the Wild: Dataset for Depth Prediction and View Synthesis

no code implementations 27 Jun 2024 Kibaek Park, Francois Rameau, Jaesik Park, In So Kweon

The large abundance of perspective camera datasets facilitated the emergence of novel learning-based strategies for various tasks, such as camera localization, single image depth estimation, or view synthesis.

Camera Localization Depth Estimation +1

Exploring the Spectrum of Visio-Linguistic Compositionality and Recognition

1 code implementation 13 Jun 2024 Youngtaek Oh, Pyunghwan Ahn, Jinhyung Kim, Gwangmo Song, Soonyoung Lee, In So Kweon, Junmo Kim

Vision and language models (VLMs) such as CLIP have showcased remarkable zero-shot recognition abilities yet face challenges in visio-linguistic compositionality, particularly in linguistic comprehension and fine-grained image-text alignment.

Retrieval Zero-Shot Learning

Enhancing Temporal Consistency in Video Editing by Reconstructing Videos with 3D Gaussian Splatting

no code implementations 4 Jun 2024 Inkyu Shin, Qihang Yu, Xiaohui Shen, In So Kweon, Kuk-Jin Yoon, Liang-Chieh Chen

In the second stage, we leverage the reconstruction ability developed in the first stage to impose the temporal constraints on the video diffusion model.

Video Editing Video Reconstruction

MTMMC: A Large-Scale Real-World Multi-Modal Camera Tracking Benchmark

no code implementations CVPR 2024 Sanghyun Woo, KwanYong Park, Inkyu Shin, Myungchul Kim, In So Kweon

Multi-target multi-camera tracking is a crucial task that involves identifying and tracking individuals over time using video streams from multiple cameras.

Anomaly Detection Human Detection +1

Towards Understanding Dual BN In Hybrid Adversarial Training

no code implementations 28 Mar 2024 Chenshuang Zhang, Chaoning Zhang, Kang Zhang, Axi Niu, Junmo Kim, In So Kweon

There is a growing concern about applying batch normalization (BN) in adversarial training (AT), especially when the model is trained on both adversarial samples and clean samples (termed Hybrid-AT).

ImageNet-D: Benchmarking Neural Network Robustness on Diffusion Synthetic Object

1 code implementation CVPR 2024 Chenshuang Zhang, Fei Pan, Junmo Kim, In So Kweon, Chengzhi Mao

In this work, we introduce generative models as a data source for synthesizing hard images that benchmark deep models' robustness.

Benchmarking

DifAugGAN: A Practical Diffusion-style Data Augmentation for GAN-based Single Image Super-resolution

no code implementations 30 Nov 2023 Axi Niu, Kang Zhang, Joshua Tian Jin Tee, Trung X. Pham, Jinqiu Sun, Chang D. Yoo, In So Kweon, Yanning Zhang

It is well known that the adversarial optimization of GAN-based image super-resolution (SR) methods makes the preceding SR model generate unpleasant and undesirable artifacts, leading to large distortion.

Attribute Data Augmentation +1

Blurry Video Compression: A Trade-off between Visual Enhancement and Data Compression

no code implementations 8 Nov 2023 Dawit Mureja Argaw, Junsik Kim, In So Kweon

Existing video compression (VC) methods primarily aim to reduce the spatial and temporal redundancies between consecutive frames in a video while preserving its quality.

Data Compression Video Compression

Long-range Multimodal Pretraining for Movie Understanding

no code implementations ICCV 2023 Dawit Mureja Argaw, Joon-Young Lee, Markus Woodson, In So Kweon, Fabian Caba Heilbron

While great progress has been attained, there is still a need for a pretrained multimodal model that can perform well in the ever-growing set of movie understanding tasks the community has been establishing.

ACDMSR: Accelerated Conditional Diffusion Models for Single Image Super-Resolution

no code implementations 3 Jul 2023 Axi Niu, Pham Xuan Trung, Kang Zhang, Jinqiu Sun, Yu Zhu, In So Kweon, Yanning Zhang

To speed up inference and further enhance the performance, our research revisits diffusion models in image super-resolution and proposes a straightforward yet significant diffusion model-based super-resolution method called ACDMSR (accelerated conditional diffusion model for image super-resolution).

Denoising Image Super-Resolution +1

Learning from Multi-Perception Features for Real-Word Image Super-resolution

no code implementations 26 May 2023 Axi Niu, Kang Zhang, Trung X. Pham, Pei Wang, Jinqiu Sun, In So Kweon, Yanning Zhang

Currently, there are two popular approaches for addressing real-world image super-resolution problems: degradation-estimation-based and blind-based methods.

Image Super-Resolution

Attack-SAM: Towards Attacking Segment Anything Model With Adversarial Examples

no code implementations 1 May 2023 Chenshuang Zhang, Chaoning Zhang, Taegoo Kang, Donghun Kim, Sung-Ho Bae, In So Kweon

Beyond the basic goal of mask removal, we further investigate and find that it is possible to generate any desired mask by the adversarial attack.

Adversarial Attack Adversarial Robustness

Video-kMaX: A Simple Unified Approach for Online and Near-Online Video Panoptic Segmentation

no code implementations 10 Apr 2023 Inkyu Shin, Dahun Kim, Qihang Yu, Jun Xie, Hong-Seok Kim, Bradley Green, In So Kweon, Kuk-Jin Yoon, Liang-Chieh Chen

The meta architecture of the proposed Video-kMaX consists of two components: within clip segmenter (for clip-level segmentation) and cross-clip associater (for association beyond clips).

Segmentation Video Panoptic Segmentation +1

Complementary Random Masking for RGB-Thermal Semantic Segmentation

1 code implementation 30 Mar 2023 Ukcheol Shin, Kyunghyun Lee, In So Kweon, Jean Oh

Also, the proposed self-distillation loss encourages the network to extract complementary and meaningful representations from a single modality or complementary masked modalities.

Scene Understanding Semantic Segmentation +1

Hindi as a Second Language: Improving Visually Grounded Speech with Semantically Similar Samples

no code implementations 30 Mar 2023 Hyeonggon Ryu, Arda Senocak, In So Kweon, Joon Son Chung

The objective of this work is to explore the learning of visually grounded speech (VGS) models from a multilingual perspective.

Cross-Modal Retrieval Retrieval

TTA-COPE: Test-Time Adaptation for Category-Level Object Pose Estimation

no code implementations CVPR 2023 Taeyeop Lee, Jonathan Tremblay, Valts Blukis, Bowen Wen, Byeong-Uk Lee, Inkyu Shin, Stan Birchfield, In So Kweon, Kuk-Jin Yoon

Unlike previous unsupervised domain adaptation methods for category-level object pose estimation, our approach processes the test data in a sequential, online manner, and it does not require access to the source domain at runtime.

Object Pose Estimation +2

A Survey on Audio Diffusion Models: Text To Speech Synthesis and Enhancement in Generative AI

no code implementations 23 Mar 2023 Chenshuang Zhang, Chaoning Zhang, Sheng Zheng, Mengchun Zhang, Maryam Qamar, Sung-Ho Bae, In So Kweon

This work conducts a survey on audio diffusion models, which is complementary to existing surveys that either lack the recent progress of diffusion-based speech synthesis or highlight an overall picture of applying diffusion models in multiple fields.

Speech Enhancement Speech Synthesis +1

Self-Sufficient Framework for Continuous Sign Language Recognition

no code implementations 21 Mar 2023 Youngjoon Jang, Youngtaek Oh, Jae Won Cho, Myungchul Kim, Dong-Jin Kim, In So Kweon, Joon Son Chung

The goal of this work is to develop a self-sufficient framework for Continuous Sign Language Recognition (CSLR) that addresses key issues of sign language recognition.

Pseudo Label Sign Language Recognition

Text-to-image Diffusion Models in Generative AI: A Survey

no code implementations 14 Mar 2023 Chenshuang Zhang, Chaoning Zhang, Mengchun Zhang, In So Kweon

This survey reviews text-to-image diffusion models in the context that diffusion models have emerged to be popular for a wide range of generative tasks.

text-guided-image-editing

EcoTTA: Memory-Efficient Continual Test-time Adaptation via Self-distilled Regularization

1 code implementation CVPR 2023 Junha Song, Jungsoo Lee, In So Kweon, Sungha Choi

Second, our novel self-distilled regularization controls the output of the meta networks not to deviate significantly from the output of the frozen original networks, thereby preserving well-trained knowledge from the source domain.

Image Classification Semantic Segmentation +1

Semi-Supervised Image Captioning by Adversarially Propagating Labeled Data

no code implementations 26 Jan 2023 Dong-Jin Kim, Tae-Hyun Oh, Jinsoo Choi, In So Kweon

We present a novel data-efficient semi-supervised framework to improve the generalization of image captioning models.

Relational Captioning Sentence

ConvNeXt V2: Co-designing and Scaling ConvNets with Masked Autoencoders

13 code implementations CVPR 2023 Sanghyun Woo, Shoubhik Debnath, Ronghang Hu, Xinlei Chen, Zhuang Liu, In So Kweon, Saining Xie

This co-design of self-supervised learning techniques and architectural improvement results in a new model family called ConvNeXt V2, which significantly improves the performance of pure ConvNets on various recognition benchmarks, including ImageNet classification, COCO detection, and ADE20K segmentation.

Object Detection Representation Learning +2

Deep Depth Estimation From Thermal Image

1 code implementation CVPR 2023 Ukcheol Shin, Jinsun Park, In So Kweon

Secondly, we conduct an exhaustive validation process of monocular and stereo depth estimation algorithms designed on visible spectrum bands to benchmark their performance in the thermal image domain.

Autonomous Driving Self-Driving Cars +1

Single View Scene Scale Estimation Using Scale Field

no code implementations CVPR 2023 Byeong-Uk Lee, Jianming Zhang, Yannick Hold-Geoffroy, In So Kweon

In this paper, we propose a single image scale estimation method based on a novel scale field representation.

Spacetime Surface Regularization for Neural Dynamic Scene Reconstruction

no code implementations ICCV 2023 Jaesung Choe, Christopher Choy, Jaesik Park, In So Kweon, Anima Anandkumar

We propose an algorithm, 4DRegSDF, for the spacetime surface regularization to improve the fidelity of neural rendering and reconstruction in dynamic scenes.

Neural Rendering

Bridging Images and Videos: A Simple Learning Framework for Large Vocabulary Video Object Detection

no code implementations 20 Dec 2022 Sanghyun Woo, KwanYong Park, Seoung Wug Oh, In So Kweon, Joon-Young Lee

First, no tracking supervisions are in LVIS, which leads to inconsistent learning of detection (with LVIS and TAO) and tracking (only with TAO).

Video Object Detection

Tracking by Associating Clips

no code implementations 20 Dec 2022 Sanghyun Woo, KwanYong Park, Seoung Wug Oh, In So Kweon, Joon-Young Lee

The tracking-by-detection paradigm today has become the dominant method for multi-object tracking and works by detecting objects in each frame and then performing data association across frames.

Chunking Management +2

Learning Classifiers of Prototypes and Reciprocal Points for Universal Domain Adaptation

no code implementations 16 Dec 2022 Sungsu Hur, Inkyu Shin, KwanYong Park, Sanghyun Woo, In So Kweon

To successfully train our framework, we collect the partial, confident target samples that are classified as known or unknown through our proposed multi-criteria selection.

Universal Domain Adaptation

Test-time Adaptation in the Dynamic World with Compound Domain Knowledge Management

no code implementations 16 Dec 2022 Junha Song, KwanYong Park, Inkyu Shin, Sanghyun Woo, Chaoning Zhang, In So Kweon

In addition, to prevent overfitting of the TTA model, we devise novel regularization which modulates the adaptation rates using domain-similarity between the source and the current target domain.

Denoising Image Classification +4

MATE: Masked Autoencoders are Online 3D Test-Time Learners

1 code implementation ICCV 2023 M. Jehanzeb Mirza, Inkyu Shin, Wei Lin, Andreas Schriebl, Kunyang Sun, Jaesung Choe, Horst Possegger, Mateusz Kozinski, In So Kweon, Kun-Jin Yoon, Horst Bischof

Our MATE is the first Test-Time-Training (TTT) method designed for 3D data, which makes deep networks trained for point cloud classification robust to distribution shifts occurring in test data.

3D Object Classification Point Cloud Classification

Signing Outside the Studio: Benchmarking Background Robustness for Continuous Sign Language Recognition

1 code implementation 1 Nov 2022 Youngjoon Jang, Youngtaek Oh, Jae Won Cho, Dong-Jin Kim, Joon Son Chung, In So Kweon

Most existing Continuous Sign Language Recognition (CSLR) benchmarks have fixed backgrounds and are filmed in studios with a static monochromatic background.

Benchmarking Disentanglement +1

One-Shot Neural Fields for 3D Object Understanding

no code implementations 21 Oct 2022 Valts Blukis, Taeyeop Lee, Jonathan Tremblay, Bowen Wen, In So Kweon, Kuk-Jin Yoon, Dieter Fox, Stan Birchfield

At test-time, we build the representation from a single RGB input image observing the scene from only one viewpoint.

3D Reconstruction Decoder +3

Moving from 2D to 3D: volumetric medical image classification for rectal cancer staging

1 code implementation 13 Sep 2022 Joohyung Lee, Jieun Oh, Inkyu Shin, You-sung Kim, Dae Kyung Sohn, Tae-sung Kim, In So Kweon

In this study, we present a volumetric convolutional neural network to accurately discriminate T2 from T3 stage rectal cancer with rectal MR volumes.

Image Classification Medical Image Classification

Per-Clip Video Object Segmentation

1 code implementation CVPR 2022 KwanYong Park, Sanghyun Woo, Seoung Wug Oh, In So Kweon, Joon-Young Lee

In this per-clip inference scheme, we update the memory with an interval and simultaneously process a set of consecutive frames (i.e., a clip) between the memory updates.

Object Segmentation +3

Generative Bias for Robust Visual Question Answering

1 code implementation CVPR 2023 Jae Won Cho, Dong-Jin Kim, Hyeonggon Ryu, In So Kweon

In this work, in order to better learn the bias a target VQA model suffers from, we propose a generative method to train the bias model directly from the target model, called GenB.

Knowledge Distillation Question Answering +1

A Survey on Masked Autoencoder for Self-supervised Learning in Vision and Beyond

no code implementations 30 Jul 2022 Chaoning Zhang, Chenshuang Zhang, Junha Song, John Seon Keun Yi, Kang Zhang, In So Kweon

"Masked autoencoders are scalable vision learners", as the title of the MAE paper puts it, suggests that self-supervised learning (SSL) in vision might follow a similar trajectory as in NLP.

Contrastive Learning Denoising +1

Decoupled Adversarial Contrastive Learning for Self-supervised Adversarial Robustness

2 code implementations 22 Jul 2022 Chaoning Zhang, Kang Zhang, Chenshuang Zhang, Axi Niu, Jiu Feng, Chang D. Yoo, In So Kweon

Adversarial training (AT) for robust representation learning and self-supervised learning (SSL) for unsupervised representation learning are two active research fields.

Adversarial Robustness Contrastive Learning +3

DRL-ISP: Multi-Objective Camera ISP with Deep Reinforcement Learning

no code implementations 7 Jul 2022 Ukcheol Shin, Kyunghyun Lee, In So Kweon

In this paper, we propose a multi-objective camera ISP framework that utilizes Deep Reinforcement Learning (DRL) and a camera ISP toolbox that consists of network-based and conventional ISP tools.

Denoising Image Restoration +5

Dual Temperature Helps Contrastive Learning Without Many Negative Samples: Towards Understanding and Simplifying MoCo

2 code implementations CVPR 2022 Chaoning Zhang, Kang Zhang, Trung X. Pham, Axi Niu, Zhinan Qiao, Chang D. Yoo, In So Kweon

Contrastive learning (CL) is widely known to require many negative samples, 65536 in MoCo for instance, for which the performance of a dictionary-free framework is often inferior because the negative sample size (NSS) is limited by its mini-batch size (MBS).

Contrastive Learning

Investigating Top-$k$ White-Box and Transferable Black-box Attack

no code implementations 30 Mar 2022 Chaoning Zhang, Philipp Benz, Adil Karjauv, Jae Won Cho, Kang Zhang, In So Kweon

It is widely reported that stronger I-FGSM transfers worse than simple FGSM, leading to a popular belief that transferability is at odds with the white-box attack strength.

Long-term Video Frame Interpolation via Feature Propagation

no code implementations CVPR 2022 Dawit Mureja Argaw, In So Kweon

We argue that when there is a large gap between inputs, instead of estimating imprecise motion that will eventually lead to inaccurate interpolation, we can safely propagate from one side of the input up to a reliable time frame using the other input as a reference.

Motion Estimation Video Frame Interpolation

Audio-Visual Fusion Layers for Event Type Aware Video Recognition

no code implementations 12 Feb 2022 Arda Senocak, Junsik Kim, Tae-Hyun Oh, Hyeonggon Ryu, Dingzeyu Li, In So Kweon

The human brain is continuously inundated with multisensory information and its complex interactions coming from the outside world at any given moment.

Multi-Task Learning Video Recognition +1

Fast Adversarial Training with Noise Augmentation: A Unified Perspective on RandStart and GradAlign

no code implementations 11 Feb 2022 Axi Niu, Kang Zhang, Chaoning Zhang, Chenshuang Zhang, In So Kweon, Chang D. Yoo, Yanning Zhang

The former works only for a relatively small perturbation of 8/255 with the $\ell_\infty$ constraint, and GradAlign improves it by extending the perturbation size to 16/255 (with the $\ell_\infty$ constraint) but at the cost of being 3 to 4 times slower.

Data Augmentation

Learning Sound Localization Better From Semantically Similar Samples

no code implementations 7 Feb 2022 Arda Senocak, Hyeonggon Ryu, Junsik Kim, In So Kweon

Thus, these semantically correlated pairs, "hard positives", are mistakenly grouped as negatives.

Contrastive Learning

Maximizing Self-supervision from Thermal Image for Effective Self-supervised Learning of Depth and Ego-motion

1 code implementation 12 Jan 2022 Ukcheol Shin, Kyunghyun Lee, Byeong-Uk Lee, In So Kweon

Based on the analysis, we propose an effective thermal image mapping method that significantly increases image information, such as overall structure, contrast, and details, while preserving temporal consistency.

Depth Estimation Self-Supervised Learning

MC-Calib: A generic and robust calibration toolbox for multi-camera systems

1 code implementation Computer Vision and Image Understanding 2022 Francois Rameau, Jinsun Park, Oleksandr Bailo, In So Kweon

In this paper, we present MC-Calib, a novel and robust toolbox dedicated to the calibration of complex synchronized multi-camera systems using an arbitrary number of fiducial marker-based patterns.

Camera Calibration

Investigating Top-k White-Box and Transferable Black-Box Attack

no code implementations CVPR 2022 Chaoning Zhang, Philipp Benz, Adil Karjauv, Jae Won Cho, Kang Zhang, In So Kweon

It is widely reported that stronger I-FGSM transfers worse than simple FGSM, leading to a popular belief that transferability is at odds with the white-box attack strength.

Facial Depth and Normal Estimation using Single Dual-Pixel Camera

no code implementations 25 Nov 2021 Minjun Kang, Jaesung Choe, Hyowon Ha, Hae-Gon Jeon, Sunghoon Im, In So Kweon, Kuk-Jin Yoon

Many mobile manufacturers recently have adopted Dual-Pixel (DP) sensors in their flagship models for faster auto-focus and aesthetic image captures.

UDA-COPE: Unsupervised Domain Adaptation for Category-level Object Pose Estimation

no code implementations CVPR 2022 Taeyeop Lee, Byeong-Uk Lee, Inkyu Shin, Jaesung Choe, Ukcheol Shin, In So Kweon, Kuk-Jin Yoon

Inspired by recent multi-modal UDA techniques, the proposed method exploits a teacher-student self-supervised learning scheme to train a pose estimation network without using target domain pose labels.

6D Pose Estimation using RGBD Object +2

Deep Point Cloud Reconstruction

no code implementations ICLR 2022 Jaesung Choe, Byeongin Joung, Francois Rameau, Jaesik Park, In So Kweon

In particular, we further improve the performance of the transformer with a newly proposed module called amplified positional encoding.

Denoising Point Cloud Completion +2

Self-Supervised Real-time Video Stabilization

no code implementations 10 Nov 2021 Jinsoo Choi, Jaesik Park, In So Kweon

Videos are a popular media form, where online video streaming has recently gathered much popularity.

Video Stabilization

Attentive and Contrastive Learning for Joint Depth and Motion Field Estimation

no code implementations ICCV 2021 Seokju Lee, Francois Rameau, Fei Pan, In So Kweon

Experiments on KITTI, Cityscapes, and Waymo Open Dataset demonstrate the relevance of our approach and show that our method outperforms state-of-the-art algorithms for the tasks of self-supervised monocular depth estimation, object motion segmentation, monocular scene flow estimation, and visual odometry.

Contrastive Learning Monocular Depth Estimation +5

Adversarial Robustness Comparison of Vision Transformer and MLP-Mixer to CNNs

1 code implementation 6 Oct 2021 Philipp Benz, Soomin Ham, Chaoning Zhang, Adil Karjauv, In So Kweon

Thus, it is critical for the community to know whether the newly proposed ViT and MLP-Mixer are also vulnerable to adversarial attacks.

Adversarial Attack Adversarial Robustness

Early Stop And Adversarial Training Yield Better surrogate Model: Very Non-Robust Features Harm Adversarial Transferability

no code implementations 29 Sep 2021 Chaoning Zhang, Gyusang Cho, Philipp Benz, Kang Zhang, Chenshuang Zhang, Chan-Hyun Youn, In So Kweon

The transferability of adversarial examples (AE), known as adversarial transferability, has attracted significant attention because it can be exploited for Transferable Black-box Attacks (TBA).

Attribute

ACP++: Action Co-occurrence Priors for Human-Object Interaction Detection

1 code implementation 9 Sep 2021 Dong-Jin Kim, Xiao Sun, Jinsoo Choi, Stephen Lin, In So Kweon

A common problem in the task of human-object interaction (HOI) detection is that numerous HOI classes have only a small number of labeled examples, resulting in training sets with a long-tailed distribution.

Human-Object Interaction Detection

Category-Level Metric Scale Object Shape and Pose Estimation

no code implementations 1 Sep 2021 Taeyeop Lee, Byeong-Uk Lee, Myungchul Kim, In So Kweon

Our framework has two branches: the Metric Scale Object Shape branch (MSOS) and the Normalized Object Coordinate Space branch (NOCS).

Object object-detection +2

VolumeFusion: Deep Depth Fusion for 3D Scene Reconstruction

no code implementations ICCV 2021 Jaesung Choe, Sunghoon Im, Francois Rameau, Minjun Kang, In So Kweon

To reconstruct a 3D scene from a set of calibrated views, traditional multi-view stereo techniques rely on two distinct stages: local depth maps computation and global depth maps fusion.

3D Reconstruction 3D Scene Reconstruction +1

Learning Open-World Object Proposals without Learning to Classify

4 code implementations 15 Aug 2021 Dahun Kim, Tsung-Yi Lin, Anelia Angelova, In So Kweon, Weicheng Kuo

In this paper, we identify that the problem is that the binary classifiers in existing proposal methods tend to overfit to the training categories.

Object object-detection +4

Correlate-and-Excite: Real-Time Stereo Matching via Guided Cost Volume Excitation

1 code implementation 12 Aug 2021 Antyanta Bangunharcana, Jae Won Cho, Seokju Lee, In So Kweon, Kyung-Soo Kim, Soohyun Kim

The volumetric deep learning approach to stereo matching aggregates a cost volume computed from input left and right images using 3D convolutions.

Stereo Matching

LabOR: Labeling Only if Required for Domain Adaptive Semantic Segmentation

no code implementations ICCV 2021 Inkyu Shin, Dong-Jin Kim, Jae Won Cho, Sanghyun Woo, KwanYong Park, In So Kweon

In order to find the uncertain points, we generate an inconsistency mask using the proposed adaptive pixel selector and we label these segment-based regions to achieve near-supervised performance with only a small fraction (about 2.2%) of ground truth points, which we call "Segment based Pixel-Labeling (SPL)".

Semantic Segmentation Unsupervised Domain Adaptation

MCDAL: Maximum Classifier Discrepancy for Active Learning

1 code implementation 23 Jul 2021 Jae Won Cho, Dong-Jin Kim, Yunjae Jung, In So Kweon

Recent state-of-the-art active learning methods have mostly leveraged Generative Adversarial Networks (GAN) for sample acquisition; however, GAN is usually known to suffer from instability and sensitivity to hyper-parameters.

Active Learning Classification +3

Unsupervised Domain Adaptation for Video Semantic Segmentation

no code implementations 23 Jul 2021 Inkyu Shin, KwanYong Park, Sanghyun Woo, In So Kweon

In this work, we present a new video extension of this task, namely Unsupervised Domain Adaptation for Video Semantic Segmentation.

Semantic Segmentation Unsupervised Domain Adaptation +1

Learning to Associate Every Segment for Video Panoptic Segmentation

no code implementations CVPR 2021 Sanghyun Woo, Dahun Kim, Joon-Young Lee, In So Kweon

Temporal correspondence - linking pixels or objects across frames - is a fundamental supervisory signal for the video models.

Ranked #6 on Video Panoptic Segmentation on Cityscapes-VPS (using extra training data)

Video Panoptic Segmentation

DASO: Distribution-Aware Semantics-Oriented Pseudo-label for Imbalanced Semi-Supervised Learning

1 code implementation CVPR 2022 Youngtaek Oh, Dong-Jin Kim, In So Kweon

The capability of the traditional semi-supervised learning (SSL) methods is far from real-world application due to severely biased pseudo-labels caused by (1) class imbalance and (2) class distribution mismatch between labeled and unlabeled data.

imbalanced classification Pseudo Label +1

Restoration of Video Frames from a Single Blurred Image with Motion Understanding

no code implementations 19 Apr 2021 Dawit Mureja Argaw, Junsik Kim, Francois Rameau, Chaoning Zhang, In So Kweon

We formulate video restoration from a single blurred image as an inverse problem by setting the clean image sequence and its respective motion as latent factors, and the blurred image as an observation.

Decoder Video Restoration

Depth Completion using Plane-Residual Representation

no code implementations CVPR 2021 Byeong-Uk Lee, Kyunghyun Lee, In So Kweon

The basic framework of depth completion is to predict a pixel-wise dense depth map using very sparse input data.

Depth Completion Depth Estimation +2

Dealing with Missing Modalities in the Visual Question Answer-Difference Prediction Task through Knowledge Distillation

no code implementations 13 Apr 2021 Jae Won Cho, Dong-Jin Kim, Jinsoo Choi, Yunjae Jung, In So Kweon

In this work, we address the issues of missing modalities that have arisen from the Visual Question Answer-Difference prediction task and find a novel method to solve the task at hand.

Knowledge Distillation Visual Question Answering (VQA)

Universal Adversarial Training with Class-Wise Perturbations

no code implementations 7 Apr 2021 Philipp Benz, Chaoning Zhang, Adil Karjauv, In So Kweon

The SOTA universal adversarial training (UAT) method optimizes a single perturbation for all training samples in the mini-batch.

Adversarial Robustness

Volumetric Propagation Network: Stereo-LiDAR Fusion for Long-Range Depth Estimation

no code implementations 24 Mar 2021 Jaesung Choe, Kyungdon Joo, Tooba Imtiaz, In So Kweon

The key idea of our network is to exploit sparse and accurate point clouds as a cue for guiding correspondences of stereo images in a unified 3D volume space.

Depth Completion Sensor Fusion +3

Stereo Object Matching Network

no code implementations 23 Mar 2021 Jaesung Choe, Kyungdon Joo, Francois Rameau, In So Kweon

This paper presents a stereo object matching method that exploits both 2D contextual information from images as well as 3D object-level information.

3D Object Detection Depth Estimation +2

Motion-blurred Video Interpolation and Extrapolation

no code implementations 4 Mar 2021 Dawit Mureja Argaw, Junsik Kim, Francois Rameau, In So Kweon

Abrupt motion of the camera or objects in a scene results in a blurry video, and therefore recovering high-quality video requires two types of enhancements: visual enhancement and temporal upsampling.

Deblurring Optical Flow Estimation

Optical Flow Estimation from a Single Motion-blurred Image

no code implementations 4 Mar 2021 Dawit Mureja Argaw, Junsik Kim, Francois Rameau, Jae Won Cho, In So Kweon

A flow estimator network is then used to estimate optical flow from the decoded features in a coarse-to-fine manner.

Deblurring Optical Flow Estimation +1

A Survey On Universal Adversarial Attack

1 code implementation 2 Mar 2021 Chaoning Zhang, Philipp Benz, Chenguo Lin, Adil Karjauv, Jing Wu, In So Kweon

The intriguing phenomenon of adversarial examples has attracted significant attention in machine learning, and what might be more surprising to the community is the existence of universal adversarial perturbations (UAPs), i.e., a single perturbation that fools the target DNN for most images.

Adversarial Attack

Universal Adversarial Perturbations Through the Lens of Deep Steganography: Towards A Fourier Perspective

no code implementations 12 Feb 2021 Chaoning Zhang, Philipp Benz, Adil Karjauv, In So Kweon

We perform task-specific and joint analysis and reveal that (a) frequency is a key factor that influences their performance based on the proposed entropy metric for quantifying the frequency distribution; (b) their success can be attributed to a DNN being highly sensitive to high-frequency content.

Decoder

Learning Monocular Depth in Dynamic Scenes via Instance-Aware Projection Consistency

1 code implementation 4 Feb 2021 Seokju Lee, Sunghoon Im, Stephen Lin, In So Kweon

We present an end-to-end joint training framework that explicitly models 6-DoF motion of multiple dynamic objects, ego-motion and depth in a monocular camera setup without supervision.

Instance Segmentation Monocular Depth Estimation +5

Data-Free Universal Adversarial Perturbation and Black-Box Attack

no code implementations ICCV 2021 Chaoning Zhang, Philipp Benz, Adil Karjauv, In So Kweon

For a more practical universal attack, our investigation of untargeted UAP focuses on alleviating the dependence on the original training samples, from removing the need for sample labels to limiting the sample size.

Learning Representations by Contrasting Clusters While Bootstrapping Instances

no code implementations 1 Jan 2021 Junsoo Lee, Hojoon Lee, Inkyu Shin, Jaekyoung Bae, In So Kweon, Jaegul Choo

Learning visual representations using large-scale unlabelled images is a holy grail for most computer vision tasks.

Clustering Contrastive Learning +5

Towards Robust Data Hiding Against (JPEG) Compression: A Pseudo-Differentiable Deep Learning Approach

1 code implementation 30 Dec 2020 Chaoning Zhang, Adil Karjauv, Philipp Benz, In So Kweon

Recently, deep learning has shown great success in data hiding, while the non-differentiability of JPEG makes it challenging to train a deep pipeline for improving robustness against lossy compression.

The Devil is in the Boundary: Exploiting Boundary Representation for Basis-based Instance Segmentation

no code implementations 26 Nov 2020 Myungchul Kim, Sanghyun Woo, Dahun Kim, In So Kweon

In this work, we propose Boundary Basis based Instance Segmentation (B2Inst) to learn a global boundary representation that can complement existing global-mask-based methods that are often lacking high-frequency details.

Instance Segmentation Scene Understanding +2

Robustness May Be at Odds with Fairness: An Empirical Study on Class-wise Accuracy

no code implementations 26 Oct 2020 Philipp Benz, Chaoning Zhang, Adil Karjauv, In So Kweon

Adversarial training is the most widely used technique for improving adversarial robustness to strong white-box attacks.

Adversarial Robustness Autonomous Driving +1

Dense Relational Image Captioning via Multi-task Triple-Stream Networks

1 code implementation 8 Oct 2020 Dong-Jin Kim, Tae-Hyun Oh, Jinsoo Choi, In So Kweon

To this end, we propose the multi-task triple-stream network (MTTSNet), which consists of three recurrent units, one responsible for each POS, and is trained by jointly predicting the correct captions and the POS of each word.

Graph Generation Object +4

Revisiting Batch Normalization for Improving Corruption Robustness

no code implementations 7 Oct 2020 Philipp Benz, Chaoning Zhang, Adil Karjauv, In So Kweon

We find that simply estimating and adapting the BN statistics on a few (32, for instance) representative samples, without retraining the model, improves corruption robustness by a large margin on several benchmark datasets across a wide range of model architectures.

CD-UAP: Class Discriminative Universal Adversarial Perturbation

no code implementations 7 Oct 2020 Chaoning Zhang, Philipp Benz, Tooba Imtiaz, In So Kweon

Since the proposed attack generates a universal adversarial perturbation that is discriminative to targeted and non-targeted classes, we term it class discriminative universal adversarial perturbation (CD-UAP).

Double Targeted Universal Adversarial Perturbations

1 code implementation 7 Oct 2020 Philipp Benz, Chaoning Zhang, Tooba Imtiaz, In So Kweon

This universal perturbation attacks one targeted source class, shifting it to a sink class, while having a limited adversarial effect on other non-targeted source classes, so as to avoid raising suspicion.

Autonomous Driving

Detecting Human-Object Interactions with Action Co-occurrence Priors

1 code implementation 17 Jul 2020 Dong-Jin Kim, Xiao Sun, Jinsoo Choi, Stephen Lin, In So Kweon

A common problem in the human-object interaction (HOI) detection task is that numerous HOI classes have only a small number of labeled examples, resulting in training sets with a long-tailed distribution.

Human-Object Interaction Detection

Video Panoptic Segmentation

1 code implementation CVPR 2020 Dahun Kim, Sanghyun Woo, Joon-Young Lee, In So Kweon

In this paper, we propose and explore a new video extension of this task, called video panoptic segmentation.

Ranked #7 on Video Panoptic Segmentation on Cityscapes-VPS (using extra training data)

Instance Segmentation Segmentation +5

Instance-wise Depth and Motion Learning from Monocular Videos

1 code implementation 19 Dec 2019 Seokju Lee, Sunghoon Im, Stephen Lin, In So Kweon

We present an end-to-end joint training framework that explicitly models 6-DoF motion of multiple dynamic objects, ego-motion and depth in a monocular camera setup without supervision.

Instance Segmentation Monocular Depth Estimation +3

Learning Residual Flow as Dynamic Motion from Stereo Videos

no code implementations 16 Sep 2019 Seokju Lee, Sunghoon Im, Stephen Lin, In So Kweon

Based on rigid projective geometry, the estimated stereo depth is used to guide the camera motion estimation, and the depth and camera motion are used to guide the residual flow estimation.

Depth And Camera Motion Motion Estimation +4

Deep Iterative Frame Interpolation for Full-frame Video Stabilization

2 code implementations 5 Sep 2019 Jinsoo Choi, In So Kweon

We present a novel deep approach to video stabilization which can generate video frames without cropping and with low distortion.

Video Stabilization

Image Captioning with Very Scarce Supervised Data: Adversarial Semi-Supervised Learning Approach

no code implementations IJCNLP 2019 Dong-Jin Kim, Jinsoo Choi, Tae-Hyun Oh, In So Kweon

To this end, our proposed semi-supervised learning method assigns pseudo-labels to unpaired samples via Generative Adversarial Networks to learn the joint distribution of image and caption.

Image Captioning

Propose-and-Attend Single Shot Detector

no code implementations 30 Jul 2019 Ho-Deok Jang, Sanghyun Woo, Philipp Benz, Jinsun Park, In So Kweon

We present a simple yet effective prediction module for a one-stage detector.

Learning Loss for Active Learning

7 code implementations CVPR 2019 Donggeun Yoo, In So Kweon

In this paper, we propose a novel active learning method that is simple but task-agnostic, and works efficiently with deep networks.

Active Learning Image Classification +3

Deep Blind Video Decaptioning by Temporal Aggregation and Recurrence

1 code implementation CVPR 2019 Dahun Kim, Sanghyun Woo, Joon-Young Lee, In So Kweon

Blind video decaptioning is a problem of automatically removing text overlays and inpainting the occluded parts in videos without any input masks.

Decoder Video Denoising +2

DPSNet: End-to-end Deep Plane Sweep Stereo

1 code implementation ICLR 2019 Sunghoon Im, Hae-Gon Jeon, Stephen Lin, In So Kweon

The cost volume is constructed using a differentiable warping process that allows for end-to-end training of the network.

Optical Flow Estimation

Discriminative Feature Learning for Unsupervised Video Summarization

1 code implementation 24 Nov 2018 Yunjae Jung, Donghyeon Cho, Dahun Kim, Sanghyun Woo, In So Kweon

The proposed variance loss allows a network to predict output scores for each frame with high discrepancy, which enables effective feature learning and significantly improves model performance.

Supervised Video Summarization Unsupervised Video Summarization

Self-Supervised Video Representation Learning with Space-Time Cubic Puzzles

no code implementations 24 Nov 2018 Dahun Kim, Donghyeon Cho, In So Kweon

Self-supervised tasks such as colorization, inpainting and jigsaw puzzles have been utilized for visual representation learning on still images when labeled images are limited or absent altogether.

Colorization Representation Learning +2

LinkNet: Relational Embedding for Scene Graph

3 code implementations NeurIPS 2018 Sanghyun Woo, Dahun Kim, Donghyeon Cho, In So Kweon

In this paper, we present a method that improves scene graph generation by explicitly modeling inter-dependency among all object instances.

Graph Generation Scene Graph Generation

CBAM: Convolutional Block Attention Module

31 code implementations ECCV 2018 Sanghyun Woo, Jongchan Park, Joon-Young Lee, In So Kweon

We propose Convolutional Block Attention Module (CBAM), a simple yet effective attention module for feed-forward convolutional neural networks.

General Classification Image Classification +1

BAM: Bottleneck Attention Module

10 code implementations 17 Jul 2018 Jongchan Park, Sanghyun Woo, Joon-Young Lee, In So Kweon

In this work, we focus on the effect of attention in general deep neural networks.

Neural Architecture Search

Globally Optimal Inlier Set Maximization for Atlanta Frame Estimation

no code implementations CVPR 2018 Kyungdon Joo, Tae-Hyun Oh, In So Kweon, Jean-Charles Bazin

In this work, we describe man-made structures via an appropriate structure assumption, called Atlanta world, which contains a vertical direction (typically the gravity direction) and a set of horizontal directions orthogonal to the vertical direction.

Distort-and-Recover: Color Enhancement using Deep Reinforcement Learning

no code implementations CVPR 2018 Jongchan Park, Joon-Young Lee, Donggeun Yoo, In So Kweon

In addition, we present a 'distort-and-recover' training scheme which only requires high-quality reference images for training instead of input and retouched image pairs.

reinforcement-learning Reinforcement Learning (RL)

Robust Depth Estimation from Auto Bracketed Images

no code implementations CVPR 2018 Sunghoon Im, Hae-Gon Jeon, In So Kweon

As demand for advanced photographic applications on hand-held devices grows, these electronics require the capture of high quality depth.

Depth Estimation Stereo Matching +1

Learning to Localize Sound Source in Visual Scenes

no code implementations CVPR 2018 Arda Senocak, Tae-Hyun Oh, Junsik Kim, Ming-Hsuan Yang, In So Kweon

We show that even with a small amount of supervision, false conclusions can be corrected and the source of sound in a visual scene can be localized effectively.

Sound Source Localization

Learning Image Representations by Completing Damaged Jigsaw Puzzles

no code implementations 6 Feb 2018 Dahun Kim, Donghyeon Cho, Donggeun Yoo, In So Kweon

The recovery of the aforementioned damage pushes the network to obtain robust and general-purpose representations.

Colorization Representation Learning +2

Intelligent Assistant for People with Low Vision Abilities

1 code implementation PSIVT 2017 Oleksandr Bogdan, Oleg Yurchenko, Oleksandr Bailo, Francois Rameau, Donggeun Yoo, In So Kweon

This paper proposes a wearable system for visually impaired people that provides extensive feedback about their surrounding environment.

Question Answering

Light-weight place recognition and loop detection using road markings

1 code implementation 20 Oct 2017 Oleksandr Bailo, Francois Rameau, In So Kweon

In this paper, we propose an efficient algorithm for robust place recognition and loop detection using camera information only.

VPGNet: Vanishing Point Guided Network for Lane and Road Marking Detection and Recognition

3 code implementations ICCV 2017 Seokju Lee, Junsik Kim, Jae Shin Yoon, Seunghak Shin, Oleksandr Bailo, Namil Kim, Tae-Hee Lee, Hyun Seok Hong, Seung-Hoon Han, In So Kweon

In this paper, we propose a unified end-to-end trainable multi-task network, guided by a vanishing point, that jointly handles lane and road marking detection and recognition under adverse weather conditions.

Lane Detection

Deltille Grids for Geometric Camera Calibration

no code implementations ICCV 2017 Hyowon Ha, Michal Perdoch, Hatem Alismail, In So Kweon, Yaser Sheikh

The recent proliferation of high resolution cameras presents an opportunity to achieve unprecedented levels of precision in visual 3D reconstruction.

3D Reconstruction Camera Calibration

StairNet: Top-Down Semantic Aggregation for Accurate One Shot Detection

no code implementations 18 Sep 2017 Sanghyun Woo, Soonmin Hwang, In So Kweon

One-stage object detectors such as SSD or YOLO have already shown promising accuracy with a small memory footprint and fast speed.

Gradient-based Camera Exposure Control for Outdoor Mobile Platforms

no code implementations 24 Aug 2017 Inwook Shim, Tae-Hyun Oh, Joon-Young Lee, Jinwook Choi, Dong-Geol Choi, In So Kweon

We introduce a novel method to automatically adjust camera exposure for image processing and computer vision applications on mobile robot platforms.

Pedestrian Detection Stereo Matching +2

Two-Phase Learning for Weakly Supervised Object Localization

no code implementations ICCV 2017 Dahun Kim, Donghyeon Cho, Donggeun Yoo, In So Kweon

Weakly supervised semantic segmentation and localization have a problem of focusing only on the most important parts of an image since they use only image-level annotations.

Object Segmentation +5

Noise Robust Depth From Focus Using a Ring Difference Filter

no code implementations CVPR 2017 Jaeheung Surh, Hae-Gon Jeon, Yunwon Park, Sunghoon Im, Hyowon Ha, In So Kweon

With the result from the FM, the role of a DfF pipeline is to determine and recalculate unreliable measurements while enhancing those that are reliable.

A Unified Approach of Multi-scale Deep and Hand-crafted Features for Defocus Estimation

1 code implementation CVPR 2017 Jinsun Park, Yu-Wing Tai, Donghyeon Cho, In So Kweon

In this paper, we introduce robust and synergetic hand-crafted features and a simple but efficient deep feature from a convolutional neural network (CNN) architecture for defocus estimation.

Defocus Estimation Image Generation

Contextually Customized Video Summaries via Natural Language

no code implementations 6 Feb 2017 Jinsoo Choi, Tae-Hyun Oh, In So Kweon

Despite the challenging baselines, our method still achieves comparable or even superior performance.

Action-Driven Object Detection with Top-Down Visual Attentions

no code implementations 20 Dec 2016 Donggeun Yoo, Sunggyun Park, Kyunghyun Paeng, Joon-Young Lee, In So Kweon

In this paper, we present an "action-driven" detection mechanism using our "top-down" visual attention model.

Object object-detection +1

Refining Geometry from Depth Sensors using IR Shading Images

no code implementations 18 Aug 2016 Gyeongmin Choe, Jaesik Park, Yu-Wing Tai, In So Kweon

To resolve the ambiguity in our model between the normals and distances, we utilize an initial 3D mesh from the Kinect fusion and multi-view information to reliably estimate surface details that were not captured and reconstructed by the Kinect fusion.

3D Display Calibration by Visual Pattern Analysis

no code implementations 23 Jun 2016 Hyoseok Hwang, Hyun Sung Chang, Dongkyung Nam, In So Kweon

Experimental results demonstrate that our method is quite accurate, about a half order of magnitude higher than prior work; is efficient, spending less than 2 s on computation; and is robust to noise, working well at SNRs as low as 6 dB.

Simultaneous Estimation of Near IR BRDF and Fine-Scale Surface Geometry

no code implementations CVPR 2016 Gyeongmin Choe, Srinivasa G. Narasimhan, In So Kweon

Near-Infrared (NIR) images of most materials exhibit less texture or albedo variation, making them beneficial for vision tasks such as intrinsic image decomposition and structured light depth estimation.

Depth Estimation Intrinsic Image Decomposition +1

Efficient and Robust Color Consistency for Community Photo Collections

no code implementations CVPR 2016 Jaesik Park, Yu-Wing Tai, Sudipta N. Sinha, In So Kweon

We present a robust low-rank matrix factorization method to estimate the unknown parameters of this model.

High-Quality Depth From Uncalibrated Small Motion Clip

1 code implementation CVPR 2016 Hyowon Ha, Sunghoon Im, Jaesik Park, Hae-Gon Jeon, In So Kweon

We propose a novel approach that generates a high-quality depth map from a set of images captured with a small viewpoint variation, namely a small motion clip.

Camera Calibration Vocal Bursts Intensity Prediction

Video-Story Composition via Plot Analysis

no code implementations CVPR 2016 Jinsoo Choi, Tae-Hyun Oh, In So Kweon

Inspired by plot analysis of written stories, our method generates a sequence of video clips ordered in such a way that it reflects plot dynamics and content coherency.

Optical Flow Estimation Patch Matching

Globally Optimal Manhattan Frame Estimation in Real-Time

no code implementations CVPR 2016 Kyungdon Joo, Tae-Hyun Oh, Junsik Kim, In So Kweon

Given a set of surface normals, we pose a Manhattan Frame (MF) estimation problem as a consensus set maximization that maximizes the number of inliers over the rotation search space.

Video Stabilization

Robust and Globally Optimal Manhattan Frame Estimation in Near Real Time

no code implementations 12 May 2016 Kyungdon Joo, Tae-Hyun Oh, Junsik Kim, In So Kweon

Most man-made environments, such as urban and indoor scenes, consist of a set of parallel and orthogonal planar structures.

Clustering Video Stabilization