Search Results for author: Minsu Cho

Found 107 papers, 50 papers with code

In Defense of Lazy Visual Grounding for Open-Vocabulary Semantic Segmentation

1 code implementation 9 Aug 2024 Dahyun Kang, Minsu Cho

We present lazy visual grounding, a two-stage approach of unsupervised object mask discovery followed by object grounding, for open-vocabulary semantic segmentation.

Object Open Vocabulary Semantic Segmentation +5

Online Temporal Action Localization with Memory-Augmented Transformer

no code implementations 6 Aug 2024 Youngkil Song, Dongkeun Kim, Minsu Cho, Suha Kwak

We also propose a novel action localization method that observes the current input segment to predict the end time of the ongoing action and accesses the memory queue to estimate the start time of the action.

Temporal Action Localization

3D Geometric Shape Assembly via Efficient Point Cloud Matching

no code implementations 15 Jul 2024 Nahyuk Lee, Juhong Min, Junha Lee, SeungWook Kim, Kanghee Lee, Jaesik Park, Minsu Cho

Building upon PMT, we introduce a new framework, dubbed Proxy Match TransformeR (PMTR), for the geometric assembly task.

Burst Image Super-Resolution with Base Frame Selection

no code implementations 25 Jun 2024 Sanghyun Kim, Min Jung Lee, Woohyeok Kim, Deunsol Jung, Jaesung Rim, Sunghyun Cho, Minsu Cho

In this work, we explore using burst shots with non-uniform exposures to confront real-world practical scenarios by introducing a new benchmark dataset, dubbed Non-uniformly Exposed Burst Image (NEBI), that includes the burst frames at varying exposure times to obtain a broader range of irradiance and motion characteristics within a scene.

Burst Image Super-Resolution

Multi-view Image Prompted Multi-view Diffusion for Improved 3D Generation

no code implementations 26 Apr 2024 SeungWook Kim, Yichun Shi, Kejie Li, Minsu Cho, Peng Wang

Using images as prompts for 3D generation demonstrates particularly strong performance compared to using text prompts alone, since images provide more intuitive guidance for the 3D generation process.

3D Generation

Learning SO(3)-Invariant Semantic Correspondence via Local Shape Transform

no code implementations CVPR 2024 Chunghyun Park, SeungWook Kim, Jaesik Park, Minsu Cho

Establishing accurate 3D correspondences between shapes stands as a pivotal challenge with profound implications for computer vision and robotics.

Decoder Semantic correspondence

Enhancing 3D Fidelity of Text-to-3D using Cross-View Correspondences

no code implementations CVPR 2024 SeungWook Kim, Kejie Li, Xueqing Deng, Yichun Shi, Minsu Cho, Peng Wang

Leveraging multi-view diffusion models as priors for 3D optimization has alleviated the problem of 3D consistency, e.g., the Janus face problem or the content drift problem, in zero-shot text-to-3D models.

Common Sense Reasoning Text to 3D

Contrastive Mean-Shift Learning for Generalized Category Discovery

no code implementations CVPR 2024 Sua Choi, Dahyun Kang, Minsu Cho

We address the problem of generalized category discovery (GCD) that aims to partition a partially labeled collection of images; only a small part of the collection is labeled and the total number of target classes is unknown.

Clustering Contrastive Learning +1

Learning Correlation Structures for Vision Transformers

no code implementations CVPR 2024 Manjin Kim, Paul Hongsuck Seo, Cordelia Schmid, Minsu Cho

We introduce a new attention mechanism, dubbed structural self-attention (StructSA), that leverages rich correlation patterns naturally emerging in key-query interactions of attention.

Action Classification Action Recognition +2

NVS-Adapter: Plug-and-Play Novel View Synthesis from a Single Image

1 code implementation 12 Dec 2023 Yoonwoo Jeong, Jinwoo Lee, Chiheon Kim, Minsu Cho, Doyup Lee

Transfer learning of large-scale Text-to-Image (T2I) models has recently shown impressive potential for Novel View Synthesis (NVS) of diverse objects from a single image.

Novel View Synthesis Transfer Learning

Activity Grammars for Temporal Action Segmentation

1 code implementation NeurIPS 2023 Dayoung Gong, Joonseok Lee, Deunsol Jung, Suha Kwak, Minsu Cho

Sequence prediction on temporal data requires the ability to understand compositional structures of multi-level semantics beyond individual and contextual properties.

Action Segmentation Segmentation

Towards More Practical Group Activity Detection: A New Benchmark and Model

no code implementations 5 Dec 2023 Dongkeun Kim, Youngkil Song, Minsu Cho, Suha Kwak

Group activity detection (GAD) is the task of identifying members of each group and classifying the activity of the group at the same time in a video.

Action Detection Activity Detection

Efficient Semantic Matching with Hypercolumn Correlation

no code implementations 7 Nov 2023 SeungWook Kim, Juhong Min, Minsu Cho

Recent studies show that leveraging the match-wise relationships within the 4D correlation map yields significant improvements in establishing semantic correspondences - but at the cost of increased computation and latency.

Generalized Neural Sorting Networks with Error-Free Differentiable Swap Functions

no code implementations 11 Oct 2023 Jungtaek Kim, Jeongbeen Yoon, Minsu Cho

Sorting is a fundamental operation of all computer systems, having been a long-standing significant research topic.

PriViT: Vision Transformers for Fast Private Inference

1 code implementation 6 Oct 2023 Naren Dhyani, Jianqiao Mo, Minsu Cho, Ameya Joshi, Siddharth Garg, Brandon Reagen, Chinmay Hegde

The Vision Transformer (ViT) architecture has emerged as the backbone of choice for state-of-the-art deep models for computer vision applications.

Image Classification

Distilling Self-Supervised Vision Transformers for Weakly-Supervised Few-Shot Classification & Segmentation

no code implementations CVPR 2023 Dahyun Kang, Piotr Koniusz, Minsu Cho, Naila Murray

For this mixed setup, we propose to improve the pseudo-labels using a pseudo-label enhancer that was trained using the available ground-truth pixel-level labels.

Few-Shot Image Classification Pseudo Label +1

Stable and Consistent Prediction of 3D Characteristic Orientation via Invariant Residual Learning

no code implementations 20 Jun 2023 SeungWook Kim, Chunghyun Park, Yoonwoo Jeong, Jaesik Park, Minsu Cho

Learning to predict reliable characteristic orientations of 3D point clouds is an important yet challenging problem, as different point clouds of the same class may have largely varying appearances.

Relational Context Learning for Human-Object Interaction Detection

1 code implementation CVPR 2023 Sanghyun Kim, Deunsol Jung, Minsu Cho

Recent state-of-the-art methods for HOI detection typically build on transformer architectures with two decoder branches, one for human-object pair detection and the other for interaction classification.

Decoder Human-Object Interaction Detection +3

Devil's on the Edges: Selective Quad Attention for Scene Graph Generation

no code implementations CVPR 2023 Deunsol Jung, Sanghyun Kim, Won Hwa Kim, Minsu Cho

The edge selection module selects relevant object pairs, i.e., edges in the scene graph, which helps contextual reasoning, and the quad attention module then updates the edge features using both edge-to-node and edge-to-edge cross-attentions to capture contextual information between objects and object pairs.

Graph Generation Object +1

Learning Rotation-Equivariant Features for Visual Correspondence

no code implementations CVPR 2023 Jongmin Lee, Byungjin Kim, SeungWook Kim, Minsu Cho

The resultant features and their orientations are further processed by group aligning, a novel invariant mapping technique that shifts the group-equivariant features by their orientations along the group dimension.

Camera Pose Estimation Pose Estimation +1

Generalizable Implicit Neural Representations via Instance Pattern Composers

1 code implementation CVPR 2023 Chiheon Kim, Doyup Lee, Saehoon Kim, Minsu Cho, Wook-Shin Han

Despite recent advances in implicit neural representations (INRs), it remains challenging for a coordinate-based multi-layer perceptron (MLP) of INRs to learn a common representation across data instances and generalize it for unseen instances.

Meta-Learning

Few-shot Metric Learning: Online Adaptation of Embedding for Retrieval

no code implementations 14 Nov 2022 Deunsol Jung, Dahyun Kang, Suha Kwak, Minsu Cho

Metric learning aims to build a distance metric typically by learning an effective embedding function that maps similar objects into nearby points in its embedding space.

Image Retrieval Meta-Learning +2

Soft-Landing Strategy for Alleviating the Task Discrepancy Problem in Temporal Action Localization Tasks

1 code implementation CVPR 2023 Hyolim Kang, Hanjung Kim, Joungbin An, Minsu Cho, Seon Joo Kim

Temporal Action Localization (TAL) methods typically operate on top of feature sequences from a frozen snippet encoder that is pretrained with the Trimmed Action Classification (TAC) tasks, resulting in a task discrepancy problem.

Action Classification Computational Efficiency +1

Sequential Brick Assembly with Efficient Constraint Satisfaction

no code implementations 3 Oct 2022 Seokjun Ahn, Jungtaek Kim, Minsu Cho, Jaesik Park

The assembly problem is challenging since the number of possible structures increases exponentially with the number of available bricks, complicating the physical constraints to satisfy across bricks.

Bayesian Optimization Position

PeRFception: Perception using Radiance Fields

1 code implementation 24 Aug 2022 Yoonwoo Jeong, Seungjoo Shin, Junha Lee, Christopher Choy, Animashree Anandkumar, Minsu Cho, Jaesik Park

The recent progress in implicit 3D representation, i.e., Neural Radiance Fields (NeRFs), has made accurate and photorealistic 3D reconstruction possible in a differentiable manner.

3D Reconstruction Segmentation

Towards Sequence-Level Training for Visual Tracking

2 code implementations 11 Aug 2022 Minji Kim, Seungkwan Lee, Jungseul Ok, Bohyung Han, Minsu Cho

Despite the extensive adoption of machine learning on the task of visual object tracking, recent learning-based approaches have largely overlooked the fact that visual tracking is a sequence-level task in its nature; they rely heavily on frame-level training, which inevitably induces inconsistency between training and testing in terms of both data distributions and task objectives.

Data Augmentation Reinforcement Learning (RL) +1

Revisiting Self-Distillation

no code implementations 17 Jun 2022 Minh Pham, Minsu Cho, Ameya Joshi, Chinmay Hegde

We first show that even with a highly accurate teacher, self-distillation allows a student to surpass the teacher in all cases.

Knowledge Distillation Model Compression

Self-Supervised Learning of Image Scale and Orientation

1 code implementation 15 Jun 2022 Jongmin Lee, Yoonwoo Jeong, Minsu Cho

We study the problem of learning to assign a characteristic pose, i.e., scale and orientation, for an image region of interest.

Camera Pose Estimation Pose Estimation +1

Peripheral Vision Transformer

1 code implementation 14 Jun 2022 Juhong Min, Yucheng Zhao, Chong Luo, Minsu Cho

We propose to incorporate peripheral position encoding to the multi-head self-attention layers to let the network learn to partition the visual field into diverse peripheral regions given training data.

Image Classification

Draft-and-Revise: Effective Image Generation with Contextual RQ-Transformer

no code implementations 9 Jun 2022 Doyup Lee, Chiheon Kim, Saehoon Kim, Minsu Cho, Wook-Shin Han

After code stacks in the sequence are randomly masked, Contextual RQ-Transformer is trained to infill the masked code stacks based on the unmasked contexts of the image.

Conditional Image Generation Text-to-Image Generation

Learning to Assemble Geometric Shapes

1 code implementation 24 May 2022 Jinhwi Lee, Jungtaek Kim, Hyunsoo Chung, Jaesik Park, Minsu Cho

Assembling parts into an object is a combinatorial problem that arises in a variety of contexts in the real world and involves numerous applications in science and engineering.

TransforMatcher: Match-to-Match Attention for Semantic Correspondence

1 code implementation CVPR 2022 SeungWook Kim, Juhong Min, Minsu Cho

Establishing correspondences between images remains a challenging task, especially under large appearance changes due to different viewpoints or intra-class variations.

Semantic correspondence

Smooth-Reduce: Leveraging Patches for Improved Certified Robustness

no code implementations 12 May 2022 Ameya Joshi, Minh Pham, Minsu Cho, Leonid Boytsov, Filipe Condessa, J. Zico Kolter, Chinmay Hegde

Randomized smoothing (RS) has been shown to be a fast, scalable technique for certifying the robustness of deep neural network classifiers.

Self-Taught Metric Learning without Labels

no code implementations CVPR 2022 Sungyeon Kim, Dongwon Kim, Minsu Cho, Suha Kwak

At the heart of our framework lies an algorithm that investigates contexts of data on the embedding space to predict their class-equivalence relations as pseudo labels.

Metric Learning

Self-Supervised Equivariant Learning for Oriented Keypoint Detection

1 code implementation CVPR 2022 Jongmin Lee, Byungjin Kim, Minsu Cho

Detecting robust keypoints from an image is an integral part of many computer vision problems, and the characteristic orientation and scale of keypoints play an important role for keypoint description and matching.

Camera Pose Estimation Keypoint Detection +2

Detector-Free Weakly Supervised Group Activity Recognition

no code implementations CVPR 2022 Dongkeun Kim, Jinsung Lee, Minsu Cho, Suha Kwak

Group activity recognition is the task of understanding the activity conducted by a group of people as a whole in a multi-person video.

Group Activity Recognition

Reflection and Rotation Symmetry Detection via Equivariant Learning

1 code implementation CVPR 2022 Ahyun Seo, Byungjin Kim, Suha Kwak, Minsu Cho

The inherent challenge of detecting symmetries stems from arbitrary orientations of symmetry patterns; a reflection symmetry mirrors itself against an axis with a specific orientation while a rotation symmetry matches its rotated copy with a specific orientation.

Symmetry Detection

Integrative Few-Shot Learning for Classification and Segmentation

1 code implementation CVPR 2022 Dahyun Kang, Minsu Cho

We introduce the integrative task of few-shot classification and segmentation (FS-CS) that aims to both classify and segment target objects in a query image when the target classes are given with a few examples.

Classification Few-Shot Classification and Segmentation +3

Autoregressive Image Generation using Residual Quantization

3 code implementations CVPR 2022 Doyup Lee, Chiheon Kim, Saehoon Kim, Minsu Cho, Wook-Shin Han

However, we postulate that previous VQ cannot shorten the code sequence and generate high-fidelity images together in terms of the rate-distortion trade-off.

Conditional Image Generation Quantization +1

Selective Network Linearization for Efficient Private Inference

1 code implementation 4 Feb 2022 Minsu Cho, Ameya Joshi, Siddharth Garg, Brandon Reagen, Chinmay Hegde

To reduce PI latency we propose a gradient-based algorithm that selectively linearizes ReLUs while maintaining prediction accuracy.

Contrastive Regularization for Semi-Supervised Learning

no code implementations 17 Jan 2022 Doyup Lee, Sungwoong Kim, Ildoo Kim, Yeongjae Cheon, Minsu Cho, Wook-Shin Han

Consistency regularization on label predictions has become a fundamental technique in semi-supervised learning, but it still requires a large number of training iterations for high performance.

Semi-Supervised Image Classification

Fast Point Transformer

1 code implementation CVPR 2022 Chunghyun Park, Yoonwoo Jeong, Minsu Cho, Jaesik Park

The recent success of neural networks enables a better interpretation of 3D point clouds, but processing a large-scale 3D scene remains a challenging problem.

3D Semantic Segmentation Computational Efficiency +1

Semi-supervised Domain Adaptation via Sample-to-Sample Self-Distillation

1 code implementation 29 Nov 2021 Jeongbeen Yoon, Dahyun Kang, Minsu Cho

Semi-supervised domain adaptation (SSDA) is to adapt a learner to a new domain with only a small set of labeled samples when a large labeled dataset is given on a source domain.

Domain Adaptation Semi-supervised Domain Adaptation

Brick-by-Brick: Combinatorial Construction with Deep Reinforcement Learning

no code implementations NeurIPS 2021 Hyunsoo Chung, Jungtaek Kim, Boris Knyazev, Jinhwi Lee, Graham W. Taylor, Jaesik Park, Minsu Cho

Discovering a solution in a combinatorial space is prevalent in many real-world problems but it is also challenging due to diverse complex constraints and the vast number of possible combinations.

Object reinforcement-learning +1

Differentiable Spline Approximations

no code implementations NeurIPS 2021 Minsu Cho, Aditya Balu, Ameya Joshi, Anjana Deva Prasad, Biswajit Khara, Soumik Sarkar, Baskar Ganapathysubramanian, Adarsh Krishnamurthy, Chinmay Hegde

Overall, we show that leveraging this redesigned Jacobian in the form of a differentiable "layer" in predictive models leads to improved performance in diverse applications such as image segmentation, 3D point cloud reconstruction, and finite element analysis.

3D Point Cloud Reconstruction BIG-bench Machine Learning +3

Efficient Point Transformer for Large-scale 3D Scene Understanding

no code implementations 29 Sep 2021 Chunghyun Park, Yoonwoo Jeong, Minsu Cho, Jaesik Park

Although sparse convolution is efficient and scalable for large 3D scenes, the quantization artifacts impair geometric details and degrade prediction accuracy.

3D Semantic Segmentation Quantization +1

Rotation-Equivariant Keypoint Detection

no code implementations 29 Sep 2021 Jongmin Lee, Byungjin Kim, Minsu Cho

Therefore, we propose a rotation-invariant keypoint detection method using rotation-equivariant CNNs.

Keypoint Detection Translation

Visual TransforMatcher: Efficient Match-to-Match Attention for Visual Correspondence

no code implementations 29 Sep 2021 Seung Wook Kim, Juhong Min, Minsu Cho

Establishing correspondences between images remains a challenging task, especially under large appearance changes due to different viewpoints and intra-class variations.

Convolutional Hough Matching Networks for Robust and Efficient Visual Correspondence

1 code implementation 11 Sep 2021 Juhong Min, SeungWook Kim, Minsu Cho

To validate the proposed techniques, we develop the neural network with CHM layers that perform convolutional matching in the space of translation and scaling.

Geometric Matching Translation

Deep Hough Voting for Robust Global Registration

no code implementations ICCV 2021 Junha Lee, SeungWook Kim, Minsu Cho, Jaesik Park

We then construct a set of triplets of correspondences to cast votes on the 6D Hough space, representing the transformation parameters in sparse tensors.

Point Cloud Registration

Self-Calibrating Neural Radiance Fields

1 code implementation ICCV 2021 Yoonwoo Jeong, Seokjun Ahn, Christopher Choy, Animashree Anandkumar, Minsu Cho, Jaesik Park

We also propose a new geometric loss function, viz., projected ray distance loss, to incorporate geometric consistency for complex non-linear camera models.

Learning to Discover Reflection Symmetry via Polar Matching Convolution

no code implementations ICCV 2021 Ahyun Seo, Woohyeon Shim, Minsu Cho

The task of reflection symmetry detection remains challenging due to significant variations and ambiguities of symmetry patterns in the wild.

Self-Supervised Learning Symmetry Detection

Relational Embedding for Few-Shot Classification

1 code implementation ICCV 2021 Dahyun Kang, Heeseung Kwon, Juhong Min, Minsu Cho

We propose to address the problem of few-shot classification by meta-learning "what to observe" and "where to attend" in a relational perspective.

Classification Few-Shot Image Classification +1

Sphynx: ReLU-Efficient Network Design for Private Inference

no code implementations 17 Jun 2021 Minsu Cho, Zahra Ghodsi, Brandon Reagen, Siddharth Garg, Chinmay Hegde

The emergence of deep learning has been accompanied by privacy concerns surrounding users' data and service providers' models.

Hypercorrelation Squeeze for Few-Shot Segmentation

1 code implementation 4 Apr 2021 Juhong Min, Dahyun Kang, Minsu Cho

Few-shot semantic segmentation aims at learning to segment a target object from a query image using only a few annotated support images of the target class.

Feature Correlation Few-Shot Semantic Segmentation +1

Convolutional Hough Matching Networks

1 code implementation CVPR 2021 Juhong Min, Minsu Cho

Despite advances in feature representation, leveraging geometric relations is crucial for establishing reliable visual correspondences under large variations of images.

Geometric Matching Semantic correspondence +1

Embedding Transfer with Label Relaxation for Improved Metric Learning

2 code implementations CVPR 2021 Sungyeon Kim, Dongwon Kim, Minsu Cho, Suha Kwak

Our method exploits pairwise similarities between samples in the source embedding space as the knowledge, and transfers them through a loss used for learning target embedding models.

Knowledge Distillation Metric Learning

Learning Self-Similarity in Space and Time as Generalized Motion for Video Action Recognition

1 code implementation ICCV 2021 Heeseung Kwon, Manjin Kim, Suha Kwak, Minsu Cho

With a sufficient volume of the neighborhood in space and time, it effectively captures long-term interaction and fast motion in the video, leading to robust action recognition.

Ranked #19 on Action Recognition on Something-Something V1 (using extra training data)

Action Recognition Temporal Action Localization +1

Embedding Transfer via Smooth Contrastive Loss

no code implementations 1 Jan 2021 Sungyeon Kim, Dongwon Kim, Minsu Cho, Suha Kwak

To this end, we design a new loss called smooth contrastive loss, which pulls together or pushes apart a pair of samples in a target embedding space with strength determined by their semantic similarity in the source embedding space; an analysis of the loss reveals that this property enables more important pairs to contribute more to learning the target embedding space.

Metric Learning Semantic Similarity +1

Hypercorrelation Squeeze for Few-Shot Segmentation

no code implementations ICCV 2021 Juhong Min, Dahyun Kang, Minsu Cho

Few-shot semantic segmentation aims at learning to segment a target object from a query image using only a few annotated support images of the target class.

Feature Correlation Few-Shot Semantic Segmentation +2

Learning Self-Similarity in Space and Time as a Generalized Motion for Action Recognition

1 code implementation 1 Jan 2021 Heeseung Kwon, Manjin Kim, Suha Kwak, Minsu Cho

We leverage the whole volume of STSS and let our model learn to extract an effective motion representation from it.

Action Recognition Video Understanding

Pair-based Self-Distillation for Semi-supervised Domain Adaptation

no code implementations 1 Jan 2021 Jeongbeen Yoon, Dahyun Kang, Minsu Cho

Semi-supervised domain adaptation (SSDA) is to adapt a learner to a new domain with only a small set of labeled samples when a large labeled dataset is given on a source domain.

Domain Adaptation Semi-supervised Domain Adaptation

Combinatorial Bayesian Optimization with Random Mapping Functions to Convex Polytopes

no code implementations 26 Nov 2020 Jungtaek Kim, Seungjin Choi, Minsu Cho

The main idea is to use a random mapping which embeds the combinatorial space into a convex polytope in a continuous space, on which all essential processing is performed to determine a solution to the black-box optimization in the combinatorial space.

Bayesian Optimization

CircleGAN: Generative Adversarial Learning across Spherical Circles

1 code implementation NeurIPS 2020 Woohyeon Shim, Minsu Cho

We present a novel discriminator for GANs that improves realness and diversity of generated samples by learning a structured hypersphere embedding space using spherical circles.

Diversity Representation Learning

Fragment Relation Networks for Geometric Shape Assembly

no code implementations NeurIPS Workshop LMCA 2020 Jinhwi Lee, Jungtaek Kim, Hyunsoo Chung, Jaesik Park, Minsu Cho

Our model processes the candidate fragments in a permutation-equivariant manner and can generalize to cases with an arbitrary number of fragments and even with a different target object.

Object Relation

Diversified Mutual Learning for Deep Metric Learning

no code implementations 9 Sep 2020 Wonpyo Park, Wonjae Kim, Kihyun You, Minsu Cho

Mutual learning is an ensemble training strategy to improve generalization by transferring individual knowledge to each other while simultaneously training multiple models.

Diversity Metric Learning +1

Learning to Compose Hypercolumns for Visual Correspondence

1 code implementation ECCV 2020 Juhong Min, Jongmin Lee, Jean Ponce, Minsu Cho

Feature representation plays a crucial role in visual correspondence, and recent methods for image matching resort to deeply stacked convolutional layers.

object-detection Semantic correspondence

MotionSqueeze: Neural Motion Feature Learning for Video Understanding

2 code implementations ECCV 2020 Heeseung Kwon, Manjin Kim, Suha Kwak, Minsu Cho

As the frame-by-frame optical flows require heavy computation, incorporating motion information has remained a major computational bottleneck for video understanding.

Action Classification Action Recognition +2

IntegralAction: Pose-driven Feature Integration for Robust Human Action Recognition in Videos

2 code implementations 13 Jul 2020 Gyeongsik Moon, Heeseung Kwon, Kyoung Mu Lee, Minsu Cho

Most current action recognition methods heavily rely on appearance information by taking an RGB sequence of entire image regions as input.

Action Recognition In Videos Pose Estimation +1

Hyperparameter Optimization in Neural Networks via Structured Sparse Recovery

no code implementations 7 Jul 2020 Minsu Cho, Mohammadreza Soltani, Chinmay Hegde

In this paper, we study two important problems in the automated design of neural networks -- Hyper-parameter Optimization (HPO), and Neural Architecture Search (NAS) -- through the lens of sparse recovery methods.

Hyperparameter Optimization Neural Architecture Search

ESPN: Extremely Sparse Pruned Networks

1 code implementation 28 Jun 2020 Minsu Cho, Ameya Joshi, Chinmay Hegde

Deep neural networks are often highly overparameterized, prohibiting their use in compute-limited systems.

Network Pruning

Combinatorial 3D Shape Generation via Sequential Assembly

3 code implementations 16 Apr 2020 Jungtaek Kim, Hyunsoo Chung, Jinhwi Lee, Minsu Cho, Jaesik Park

To alleviate this consequence induced by a huge number of feasible combinations, we propose a combinatorial 3D shape generation framework.

3D Shape Generation Bayesian Optimization

Local-Global Video-Text Interactions for Temporal Grounding

1 code implementation CVPR 2020 Jonghwan Mun, Minsu Cho, Bohyung Han

This paper addresses the problem of text-to-video temporal grounding, which aims to identify the time interval in a video semantically relevant to a text query.

Proxy Anchor Loss for Deep Metric Learning

3 code implementations CVPR 2020 Sungyeon Kim, Dongwon Kim, Minsu Cho, Suha Kwak

The former class can leverage fine-grained semantic relations between data points, but slows convergence in general due to its high training complexity.

Ranked #10 on Metric Learning on CUB-200-2011 (using extra training data)

Fine-Grained Image Classification Fine-Grained Vehicle Classification +1

Freeze the Discriminator: a Simple Baseline for Fine-Tuning GANs

4 code implementations 25 Feb 2020 Sangwoo Mo, Minsu Cho, Jinwoo Shin

Generative adversarial networks (GANs) have shown outstanding performance on a wide range of problems in computer vision, graphics, and machine learning, but often require numerous training data and heavy computational resources.

10-shot image generation Image Generation +1

Real-Time Object Tracking via Meta-Learning: Efficient Model Adaptation and One-Shot Channel Pruning

no code implementations 25 Nov 2019 Ilchae Jung, Kihyun You, Hyeonwoo Noh, Minsu Cho, Bohyung Han

We propose a novel meta-learning framework for real-time object tracking with efficient model adaptation and channel pruning.

Meta-Learning Object +1

Mining GOLD Samples for Conditional GANs

1 code implementation NeurIPS 2019 Sangwoo Mo, Chiheon Kim, Sungwoong Kim, Minsu Cho, Jinwoo Shin

Conditional generative adversarial networks (cGANs) have gained considerable attention in recent years due to their class-wise controllability and superior quality for complex generation tasks.

Active Learning

Regularizing Neural Networks via Stochastic Branch Layers

no code implementations 3 Oct 2019 Wonpyo Park, Paul Hongsuck Seo, Bohyung Han, Minsu Cho

We introduce a novel stochastic regularization technique for deep neural networks, which decomposes a layer into multiple branches with different parameters and merges stochastically sampled combinations of the outputs from the branches during training.

SPair-71k: A Large-scale Benchmark for Semantic Correspondence

no code implementations 28 Aug 2019 Juhong Min, Jongmin Lee, Jean Ponce, Minsu Cho

In this paper, we present a new large-scale benchmark dataset of semantically paired images, SPair-71k, which contains 70,958 image pairs with diverse variations in viewpoint and scale.

Semantic correspondence

Hyperpixel Flow: Semantic Correspondence with Multi-layer Neural Features

1 code implementation ICCV 2019 Juhong Min, Jongmin Lee, Jean Ponce, Minsu Cho

Establishing visual correspondences under large intra-class variations requires analyzing images at different levels, from features linked to semantics and context to local patterns, while being invariant to instance-specific details.

Semantic correspondence

Instance-aware Image-to-Image Translation

1 code implementation ICLR 2019 Sangwoo Mo, Minsu Cho, Jinwoo Shin

Unsupervised image-to-image translation has gained considerable attention due to the recent impressive progress based on generative adversarial networks (GANs).

Semantic Segmentation Translation +1

Deep Metric Learning Beyond Binary Supervision

1 code implementation CVPR 2019 Sungyeon Kim, Minkyo Seo, Ivan Laptev, Minsu Cho, Suha Kwak

Metric Learning for visual similarity has mostly adopted binary supervision indicating whether a pair of images are of the same class or not.

Image Captioning Image Retrieval +4

Universal Bounding Box Regression and Its Applications

no code implementations 15 Apr 2019 Seungkwan Lee, Suha Kwak, Minsu Cho

Bounding-box regression is a popular technique to refine or predict localization boxes in recent object detection approaches.

Object object-detection +3

Relational Knowledge Distillation

3 code implementations CVPR 2019 Wonpyo Park, Dongju Kim, Yan Lu, Minsu Cho

Knowledge distillation aims at transferring knowledge acquired in one model (a teacher) to another model (a student) that is typically smaller.

Knowledge Distillation Metric Learning

InstaGAN: Instance-aware Image-to-Image Translation

1 code implementation 28 Dec 2018 Sangwoo Mo, Minsu Cho, Jinwoo Shin

Our comparative evaluation demonstrates the effectiveness of the proposed method on different image datasets, in particular, in the aforementioned challenging cases.

Semantic Segmentation Translation +1

Attentive Semantic Alignment with Offset-Aware Correlation Kernels

no code implementations ECCV 2018 Paul Hongsuck Seo, Jongmin Lee, Deunsol Jung, Bohyung Han, Minsu Cho

Semantic correspondence is the problem of establishing correspondences across images depicting different instances of the same object or scene class.

Semantic correspondence Translation

Multi-Object Tracking With Quadruplet Convolutional Neural Networks

no code implementations CVPR 2017 Jeany Son, Mooyeol Baek, Minsu Cho, Bohyung Han

We propose Quadruplet Convolutional Neural Networks (Quad-CNN) for multi-object tracking, which learn to associate object detections across frames using quadruplet losses.

Multi-Object Tracking Object +1

SCNet: Learning Semantic Correspondence

1 code implementation ICCV 2017 Kai Han, Rafael S. Rezende, Bumsub Ham, Kwan-Yee K. Wong, Minsu Cho, Cordelia Schmid, Jean Ponce

This paper addresses the problem of establishing semantic correspondences between images depicting different instances of the same object or scene category.

Semantic correspondence

Proposal Flow: Semantic Correspondences from Object Proposals

no code implementations 21 Mar 2017 Bumsub Ham, Minsu Cho, Cordelia Schmid, Jean Ponce

Finding image correspondences remains a challenging problem in the presence of intra-class variations and large changes in scene layout.

Object

Text-guided Attention Model for Image Captioning

1 code implementation 12 Dec 2016 Jonghwan Mun, Minsu Cho, Bohyung Han

Visual attention plays an important role in understanding images and demonstrates its effectiveness in generating natural language descriptions of images.

Image Captioning

Thin-Slicing for Pose: Learning to Understand Pose Without Explicit Pose Estimation

no code implementations CVPR 2016 Suha Kwak, Minsu Cho, Ivan Laptev

We address the problem of learning a pose-aware, compact embedding that projects images with similar human poses to be placed close-by in the embedding space.

Action Recognition Image Retrieval +3

Proposal Flow

no code implementations CVPR 2016 Bumsub Ham, Minsu Cho, Cordelia Schmid, Jean Ponce

Finding image correspondences remains a challenging problem in the presence of intra-class variations and large changes in scene layout. Semantic flow methods are designed to handle images depicting different instances of the same object or scene category.

Object

Robust Image Filtering Using Joint Static and Dynamic Guidance

no code implementations CVPR 2015 Bumsub Ham, Minsu Cho, Jean Ponce

Regularizing images under a guidance signal has been used in various tasks in computer vision and computational photography, particularly for noise reduction and joint upsampling.

Denoising Super-Resolution

Unsupervised Object Discovery and Tracking in Video Collections

no code implementations ICCV 2015 Suha Kwak, Minsu Cho, Ivan Laptev, Jean Ponce, Cordelia Schmid

This paper addresses the problem of automatically localizing dominant objects as spatio-temporal tubes in a noisy collection of videos with minimal or even no supervision.

Object Object Discovery +1

A General Multi-Graph Matching Approach via Graduated Consistency-regularized Boosting

no code implementations 20 Feb 2015 Junchi Yan, Minsu Cho, Hongyuan Zha, Xiaokang Yang, Stephen Chu

We propose multi-graph matching methods to incorporate the two aspects by boosting the affinity score, meanwhile gradually infusing the consistency as a regularizer.

Graph Matching

Unsupervised Object Discovery and Localization in the Wild: Part-based Matching with Bottom-up Region Proposals

no code implementations CVPR 2015 Minsu Cho, Suha Kwak, Cordelia Schmid, Jean Ponce

This paper addresses unsupervised discovery and localization of dominant objects from a noisy image collection with multiple object classes.

Object Object Discovery

Finding Matches in a Haystack: A Max-Pooling Strategy for Graph Matching in the Presence of Outliers

no code implementations CVPR 2014 Minsu Cho, Jian Sun, Olivier Duchenne, Jean Ponce

A major challenge in real-world feature matching problems is to tolerate the numerous outliers arising in typical visual tasks.

Graph Matching
