1 code implementation • 9 Aug 2024 • Dahyun Kang, Minsu Cho
We present lazy visual grounding, a two-stage approach of unsupervised object mask discovery followed by object grounding, for open-vocabulary semantic segmentation.
Ranked #2 on Open Vocabulary Semantic Segmentation on COCO-Stuff-171 (mIoU metric)
no code implementations • 6 Aug 2024 • Youngkil Song, Dongkeun Kim, Minsu Cho, Suha Kwak
We also propose a novel action localization method that observes the current input segment to predict the end time of the ongoing action and accesses the memory queue to estimate the start time of the action.
no code implementations • 29 Jul 2024 • Jinsung Lee, Taeoh Kim, Inwoong Lee, Minho Shim, Dongyoon Wee, Minsu Cho, Suha Kwak
Video action detection (VAD) aims to detect actors and classify their actions in a video.
no code implementations • 15 Jul 2024 • Nahyuk Lee, Juhong Min, Junha Lee, SeungWook Kim, Kanghee Lee, Jaesik Park, Minsu Cho
Building upon PMT, we introduce a new framework, dubbed Proxy Match TransformeR (PMTR), for the geometric assembly task.
no code implementations • 25 Jun 2024 • Sanghyun Kim, Min Jung Lee, Woohyeok Kim, Deunsol Jung, Jaesung Rim, Sunghyun Cho, Minsu Cho
In this work, we explore using burst shots with non-uniform exposures to confront real-world practical scenarios by introducing a new benchmark dataset, dubbed Non-uniformly Exposed Burst Image (NEBI), that includes the burst frames at varying exposure times to obtain a broader range of irradiance and motion characteristics within a scene.
no code implementations • 26 Apr 2024 • SeungWook Kim, Yichun Shi, Kejie Li, Minsu Cho, Peng Wang
Using images as prompts for 3D generation demonstrates particularly strong performance compared to using text prompts alone, for images provide more intuitive guidance for the 3D generation process.
no code implementations • CVPR 2024 • Chunghyun Park, SeungWook Kim, Jaesik Park, Minsu Cho
Establishing accurate 3D correspondences between shapes stands as a pivotal challenge with profound implications for computer vision and robotics.
no code implementations • CVPR 2024 • SeungWook Kim, Kejie Li, Xueqing Deng, Yichun Shi, Minsu Cho, Peng Wang
Leveraging multi-view diffusion models as priors for 3D optimization has alleviated the problem of 3D consistency, e.g., the Janus face problem or the content drift problem, in zero-shot text-to-3D models.
no code implementations • CVPR 2024 • Sua Choi, Dahyun Kang, Minsu Cho
We address the problem of generalized category discovery (GCD) that aims to partition a partially labeled collection of images; only a small part of the collection is labeled and the total number of target classes is unknown.
no code implementations • CVPR 2024 • Juhong Min, Shyamal Buch, Arsha Nagrani, Minsu Cho, Cordelia Schmid
This paper addresses the task of video question answering (videoQA) via a decomposed multi-stage, modular reasoning framework.
Ranked #6 on Zero-Shot Video Question Answer on NExT-QA
no code implementations • CVPR 2024 • Manjin Kim, Paul Hongsuck Seo, Cordelia Schmid, Minsu Cho
We introduce a new attention mechanism, dubbed structural self-attention (StructSA), that leverages rich correlation patterns naturally emerging in key-query interactions of attention.
Ranked #4 on Action Recognition on Diving-48
1 code implementation • 12 Dec 2023 • Yoonwoo Jeong, Jinwoo Lee, Chiheon Kim, Minsu Cho, Doyup Lee
Transfer learning of large-scale Text-to-Image (T2I) models has recently shown impressive potential for Novel View Synthesis (NVS) of diverse objects from a single image.
1 code implementation • NeurIPS 2023 • Dayoung Gong, Joonseok Lee, Deunsol Jung, Suha Kwak, Minsu Cho
Sequence prediction on temporal data requires the ability to understand compositional structures of multi-level semantics beyond individual and contextual properties.
no code implementations • 5 Dec 2023 • Dongkeun Kim, Youngkil Song, Minsu Cho, Suha Kwak
Group activity detection (GAD) is the task of identifying members of each group and classifying the activity of the group at the same time in a video.
no code implementations • 7 Nov 2023 • SeungWook Kim, Juhong Min, Minsu Cho
Recent studies show that leveraging the match-wise relationships within the 4D correlation map yields significant improvements in establishing semantic correspondences - but at the cost of increased computation and latency.
no code implementations • 11 Oct 2023 • Jungtaek Kim, Jeongbeen Yoon, Minsu Cho
Sorting is a fundamental operation of all computer systems, having been a long-standing significant research topic.
1 code implementation • 6 Oct 2023 • Naren Dhyani, Jianqiao Mo, Minsu Cho, Ameya Joshi, Siddharth Garg, Brandon Reagen, Chinmay Hegde
The Vision Transformer (ViT) architecture has emerged as the backbone of choice for state-of-the-art deep models for computer vision applications.
no code implementations • CVPR 2023 • Dahyun Kang, Piotr Koniusz, Minsu Cho, Naila Murray
For this mixed setup, we propose to improve the pseudo-labels using a pseudo-label enhancer that was trained using the available ground-truth pixel-level labels.
no code implementations • 20 Jun 2023 • SeungWook Kim, Chunghyun Park, Yoonwoo Jeong, Jaesik Park, Minsu Cho
Learning to predict reliable characteristic orientations of 3D point clouds is an important yet challenging problem, as different point clouds of the same class may have largely varying appearances.
1 code implementation • CVPR 2023 • Sanghyun Kim, Deunsol Jung, Minsu Cho
Recent state-of-the-art methods for HOI detection typically build on transformer architectures with two decoder branches, one for human-object pair detection and the other for interaction classification.
Ranked #2 on Human-Object Interaction Detection on V-COCO
no code implementations • CVPR 2023 • Deunsol Jung, Sanghyun Kim, Won Hwa Kim, Minsu Cho
The edge selection module selects relevant object pairs, i.e., edges in the scene graph, which helps contextual reasoning, and the quad attention module then updates the edge features using both edge-to-node and edge-to-edge cross-attentions to capture contextual information between objects and object pairs.
no code implementations • CVPR 2023 • Jongmin Lee, Byungjin Kim, SeungWook Kim, Minsu Cho
The resultant features and their orientations are further processed by group aligning, a novel invariant mapping technique that shifts the group-equivariant features by their orientations along the group dimension.
1 code implementation • CVPR 2023 • Chiheon Kim, Doyup Lee, Saehoon Kim, Minsu Cho, Wook-Shin Han
Despite recent advances in implicit neural representations (INRs), it remains challenging for a coordinate-based multi-layer perceptron (MLP) of INRs to learn a common representation across data instances and generalize it for unseen instances.
no code implementations • 14 Nov 2022 • Deunsol Jung, Dahyun Kang, Suha Kwak, Minsu Cho
Metric learning aims to build a distance metric typically by learning an effective embedding function that maps similar objects into nearby points in its embedding space.
1 code implementation • CVPR 2023 • Hyolim Kang, Hanjung Kim, Joungbin An, Minsu Cho, Seon Joo Kim
Temporal Action Localization (TAL) methods typically operate on top of feature sequences from a frozen snippet encoder that is pretrained with the Trimmed Action Classification (TAC) tasks, resulting in a task discrepancy problem.
no code implementations • 3 Oct 2022 • Seokjun Ahn, Jungtaek Kim, Minsu Cho, Jaesik Park
The assembly problem is challenging since the number of possible structures increases exponentially with the number of available bricks, complicating the physical constraints to satisfy across bricks.
1 code implementation • 24 Aug 2022 • Yoonwoo Jeong, Seungjoo Shin, Junha Lee, Christopher Choy, Animashree Anandkumar, Minsu Cho, Jaesik Park
The recent progress in implicit 3D representation, i.e., Neural Radiance Fields (NeRFs), has made accurate and photorealistic 3D reconstruction possible in a differentiable manner.
2 code implementations • 11 Aug 2022 • Minji Kim, Seungkwan Lee, Jungseul Ok, Bohyung Han, Minsu Cho
Despite the extensive adoption of machine learning on the task of visual object tracking, recent learning-based approaches have largely overlooked the fact that visual tracking is a sequence-level task in its nature; they rely heavily on frame-level training, which inevitably induces inconsistency between training and testing in terms of both data distributions and task objectives.
Ranked #19 on Visual Object Tracking on TrackingNet
no code implementations • 17 Jun 2022 • Minh Pham, Minsu Cho, Ameya Joshi, Chinmay Hegde
We first show that even with a highly accurate teacher, self-distillation allows a student to surpass the teacher in all cases.
1 code implementation • 15 Jun 2022 • Jongmin Lee, Yoonwoo Jeong, Minsu Cho
We study the problem of learning to assign a characteristic pose, i.e., scale and orientation, for an image region of interest.
1 code implementation • 14 Jun 2022 • Juhong Min, Yucheng Zhao, Chong Luo, Minsu Cho
We propose to incorporate peripheral position encoding into the multi-head self-attention layers to let the network learn to partition the visual field into diverse peripheral regions given training data.
no code implementations • 9 Jun 2022 • Doyup Lee, Chiheon Kim, Saehoon Kim, Minsu Cho, Wook-Shin Han
After code stacks in the sequence are randomly masked, Contextual RQ-Transformer is trained to infill the masked code stacks based on the unmasked contexts of the image.
Ranked #1 on Text-to-Image Generation on Conceptual Captions
no code implementations • CVPR 2022 • Dayoung Gong, Joonseok Lee, Manjin Kim, Seong Jong Ha, Minsu Cho
The task of predicting future actions from a video is crucial for a real-world agent interacting with others.
1 code implementation • 24 May 2022 • Jinhwi Lee, Jungtaek Kim, Hyunsoo Chung, Jaesik Park, Minsu Cho
Assembling parts into an object is a combinatorial problem that arises in a variety of contexts in the real world and involves numerous applications in science and engineering.
1 code implementation • CVPR 2022 • SeungWook Kim, Juhong Min, Minsu Cho
Establishing correspondences between images remains a challenging task, especially under large appearance changes due to different viewpoints or intra-class variations.
Ranked #10 on Semantic correspondence on SPair-71k
no code implementations • 12 May 2022 • Ameya Joshi, Minh Pham, Minsu Cho, Leonid Boytsov, Filipe Condessa, J. Zico Kolter, Chinmay Hegde
Randomized smoothing (RS) has been shown to be a fast, scalable technique for certifying the robustness of deep neural network classifiers.
no code implementations • CVPR 2022 • Sungyeon Kim, Dongwon Kim, Minsu Cho, Suha Kwak
At the heart of our framework lies an algorithm that investigates contexts of data on the embedding space to predict their class-equivalence relations as pseudo labels.
1 code implementation • CVPR 2022 • Jongmin Lee, Byungjin Kim, Minsu Cho
Detecting robust keypoints from an image is an integral part of many computer vision problems, and the characteristic orientation and scale of keypoints play an important role for keypoint description and matching.
no code implementations • CVPR 2022 • Dongkeun Kim, Jinsung Lee, Minsu Cho, Suha Kwak
Group activity recognition is the task of understanding the activity conducted by a group of people as a whole in a multi-person video.
1 code implementation • CVPR 2022 • Ahyun Seo, Byungjin Kim, Suha Kwak, Minsu Cho
The inherent challenge of detecting symmetries stems from arbitrary orientations of symmetry patterns; a reflection symmetry mirrors itself against an axis with a specific orientation while a rotation symmetry matches its rotated copy with a specific orientation.
1 code implementation • CVPR 2022 • Dahyun Kang, Minsu Cho
We introduce the integrative task of few-shot classification and segmentation (FS-CS) that aims to both classify and segment target objects in a query image when the target classes are given with a few examples.
3 code implementations • CVPR 2022 • Doyup Lee, Chiheon Kim, Saehoon Kim, Minsu Cho, Wook-Shin Han
However, we postulate that previous VQ cannot shorten the code sequence and generate high-fidelity images together in terms of the rate-distortion trade-off.
Ranked #2 on Text-to-Image Generation on Conceptual Captions
1 code implementation • 4 Feb 2022 • Minsu Cho, Ameya Joshi, Siddharth Garg, Brandon Reagen, Chinmay Hegde
To reduce PI latency we propose a gradient-based algorithm that selectively linearizes ReLUs while maintaining prediction accuracy.
no code implementations • 17 Jan 2022 • Doyup Lee, Sungwoong Kim, Ildoo Kim, Yeongjae Cheon, Minsu Cho, Wook-Shin Han
Consistency regularization on label predictions becomes a fundamental technique in semi-supervised learning, but it still requires a large number of training iterations for high performance.
1 code implementation • CVPR 2022 • Chunghyun Park, Yoonwoo Jeong, Minsu Cho, Jaesik Park
The recent success of neural networks enables a better interpretation of 3D point clouds, but processing a large-scale 3D scene remains a challenging problem.
Ranked #24 on Semantic Segmentation on S3DIS
1 code implementation • 29 Nov 2021 • Jeongbeen Yoon, Dahyun Kang, Minsu Cho
Semi-supervised domain adaptation (SSDA) is to adapt a learner to a new domain with only a small set of labeled samples when a large labeled dataset is given on a source domain.
1 code implementation • NeurIPS 2021 • Manjin Kim, Heeseung Kwon, Chunyu Wang, Suha Kwak, Minsu Cho
Convolution has been arguably the most important feature transform for modern neural networks, leading to the advance of deep learning.
Ranked #11 on Action Recognition on Diving-48
1 code implementation • NeurIPS 2021 • Minguk Kang, Woohyeon Shim, Minsu Cho, Jaesik Park
On this foundation, we propose the Rebooted Auxiliary Classifier Generative Adversarial Network (ReACGAN).
Ranked #1 on Image Generation on CIFAR-10 (NFE metric)
no code implementations • NeurIPS 2021 • Hyunsoo Chung, Jungtaek Kim, Boris Knyazev, Jinhwi Lee, Graham W. Taylor, Jaesik Park, Minsu Cho
Discovering a solution in a combinatorial space is prevalent in many real-world problems but it is also challenging due to diverse complex constraints and the vast number of possible combinations.
no code implementations • NeurIPS 2021 • Minsu Cho, Aditya Balu, Ameya Joshi, Anjana Deva Prasad, Biswajit Khara, Soumik Sarkar, Baskar Ganapathysubramanian, Adarsh Krishnamurthy, Chinmay Hegde
Overall, we show that leveraging this redesigned Jacobian in the form of a differentiable "layer" in predictive models leads to improved performance in diverse applications such as image segmentation, 3D point cloud reconstruction, and finite element analysis.
no code implementations • 29 Sep 2021 • Chunghyun Park, Yoonwoo Jeong, Minsu Cho, Jaesik Park
Although sparse convolution is efficient and scalable for large 3D scenes, the quantization artifacts impair geometric details and degrade prediction accuracy.
no code implementations • 29 Sep 2021 • Jongmin Lee, Byungjin Kim, Minsu Cho
Therefore, we propose a rotation-invariant keypoint detection method using rotation-equivariant CNNs.
no code implementations • 29 Sep 2021 • Seung Wook Kim, Juhong Min, Minsu Cho
Establishing correspondences between images remains a challenging task, especially under large appearance changes due to different viewpoints and intra-class variations.
1 code implementation • 11 Sep 2021 • Juhong Min, SeungWook Kim, Minsu Cho
To validate the proposed techniques, we develop the neural network with CHM layers that perform convolutional matching in the space of translation and scaling.
no code implementations • ICCV 2021 • Junha Lee, SeungWook Kim, Minsu Cho, Jaesik Park
We then construct a set of triplets of correspondences to cast votes on the 6D Hough space, representing the transformation parameters in sparse tensors.
1 code implementation • ICCV 2021 • Yoonwoo Jeong, Seokjun Ahn, Christopher Choy, Animashree Anandkumar, Minsu Cho, Jaesik Park
We also propose a new geometric loss function, viz., projected ray distance loss, to incorporate geometric consistency for complex non-linear camera models.
no code implementations • ICCV 2021 • Ahyun Seo, Woohyeon Shim, Minsu Cho
The task of reflection symmetry detection remains challenging due to significant variations and ambiguities of symmetry patterns in the wild.
1 code implementation • ICCV 2021 • Dahyun Kang, Heeseung Kwon, Juhong Min, Minsu Cho
We propose to address the problem of few-shot classification by meta-learning "what to observe" and "where to attend" in a relational perspective.
Ranked #17 on Few-Shot Image Classification on CUB 200 5-way 5-shot
no code implementations • 17 Jun 2021 • Minsu Cho, Zahra Ghodsi, Brandon Reagen, Siddharth Garg, Chinmay Hegde
The emergence of deep learning has been accompanied by privacy concerns surrounding users' data and service providers' models.
1 code implementation • 4 Apr 2021 • Juhong Min, Dahyun Kang, Minsu Cho
Few-shot semantic segmentation aims at learning to segment a target object from a query image using only a few annotated support images of the target class.
Ranked #13 on Few-Shot Semantic Segmentation on FSS-1000 (5-shot)
1 code implementation • CVPR 2021 • Juhong Min, Minsu Cho
Despite advances in feature representation, leveraging geometric relations is crucial for establishing reliable visual correspondences under large variations of images.
Ranked #4 on Semantic correspondence on PF-WILLOW
2 code implementations • CVPR 2021 • Sungyeon Kim, Dongwon Kim, Minsu Cho, Suha Kwak
Our method exploits pairwise similarities between samples in the source embedding space as the knowledge, and transfers them through a loss used for learning target embedding models.
1 code implementation • ICCV 2021 • Heeseung Kwon, Manjin Kim, Suha Kwak, Minsu Cho
With a sufficient volume of the neighborhood in space and time, it effectively captures long-term interaction and fast motion in the video, leading to robust action recognition.
Ranked #19 on Action Recognition on Something-Something V1 (using extra training data)
no code implementations • 1 Jan 2021 • Sungyeon Kim, Dongwon Kim, Minsu Cho, Suha Kwak
To this end, we design a new loss called smooth contrastive loss, which pulls together or pushes apart a pair of samples in a target embedding space with strength determined by their semantic similarity in the source embedding space; an analysis of the loss reveals that this property enables more important pairs to contribute more to learning the target embedding space.
no code implementations • ICCV 2021 • Juhong Min, Dahyun Kang, Minsu Cho
Few-shot semantic segmentation aims at learning to segment a target object from a query image using only a few annotated support images of the target class.
no code implementations • NeurIPS Workshop LMCA 2020 • Minsu Cho, Ameya Joshi, Xian Yeow Lee, Aditya Balu, Adarsh Krishnamurthy, Baskar Ganapathysubramanian, Soumik Sarkar, Chinmay Hegde
The paradigm of differentiable programming has considerably enhanced the scope of machine learning via the judicious use of gradient-based optimization.
1 code implementation • 1 Jan 2021 • Heeseung Kwon, Manjin Kim, Suha Kwak, Minsu Cho
We leverage the whole volume of STSS and let our model learn to extract an effective motion representation from it.
no code implementations • 1 Jan 2021 • Jeongbeen Yoon, Dahyun Kang, Minsu Cho
Semi-supervised domain adaptation (SSDA) is to adapt a learner to a new domain with only a small set of labeled samples when a large labeled dataset is given on a source domain.
no code implementations • 26 Nov 2020 • Jungtaek Kim, Seungjin Choi, Minsu Cho
The main idea is to use a random mapping which embeds the combinatorial space into a convex polytope in a continuous space, on which all essential processing is performed to determine a solution to the black-box optimization in the combinatorial space.
1 code implementation • NeurIPS 2020 • Woohyeon Shim, Minsu Cho
We present a novel discriminator for GANs that improves realness and diversity of generated samples by learning a structured hypersphere embedding space using spherical circles.
no code implementations • NeurIPS Workshop LMCA 2020 • Jinhwi Lee, Jungtaek Kim, Hyunsoo Chung, Jaesik Park, Minsu Cho
Our model processes the candidate fragments in a permutation-equivariant manner and can generalize to cases with an arbitrary number of fragments and even with a different target object.
no code implementations • 9 Sep 2020 • Wonpyo Park, Wonjae Kim, Kihyun You, Minsu Cho
Mutual learning is an ensemble training strategy to improve generalization by transferring individual knowledge to each other while simultaneously training multiple models.
1 code implementation • ECCV 2020 • Juhong Min, Jongmin Lee, Jean Ponce, Minsu Cho
Feature representation plays a crucial role in visual correspondence, and recent methods for image matching resort to deeply stacked convolutional layers.
Ranked #2 on Semantic correspondence on Caltech-101
2 code implementations • ECCV 2020 • Heeseung Kwon, Manjin Kim, Suha Kwak, Minsu Cho
As the frame-by-frame optical flows require heavy computation, incorporating motion information has remained a major computational bottleneck for video understanding.
Ranked #1 on Video Classification on Something-Something V2
2 code implementations • 13 Jul 2020 • Gyeongsik Moon, Heeseung Kwon, Kyoung Mu Lee, Minsu Cho
Most current action recognition methods heavily rely on appearance information by taking an RGB sequence of entire image regions as input.
no code implementations • 7 Jul 2020 • Minsu Cho, Mohammadreza Soltani, Chinmay Hegde
In this paper, we study two important problems in the automated design of neural networks -- Hyper-parameter Optimization (HPO), and Neural Architecture Search (NAS) -- through the lens of sparse recovery methods.
1 code implementation • 28 Jun 2020 • Minsu Cho, Ameya Joshi, Chinmay Hegde
Deep neural networks are often highly overparameterized, prohibiting their use in compute-limited systems.
3 code implementations • 16 Apr 2020 • Jungtaek Kim, Hyunsoo Chung, Jinhwi Lee, Minsu Cho, Jaesik Park
To alleviate this consequence induced by a huge number of feasible combinations, we propose a combinatorial 3D shape generation framework.
1 code implementation • CVPR 2020 • Jonghwan Mun, Minsu Cho, Bohyung Han
This paper addresses the problem of text-to-video temporal grounding, which aims to identify the time interval in a video semantically relevant to a text query.
3 code implementations • CVPR 2020 • Sungyeon Kim, Dongwon Kim, Minsu Cho, Suha Kwak
The former class can leverage fine-grained semantic relations between data points, but slows convergence in general due to its high training complexity.
Ranked #10 on Metric Learning on CUB-200-2011 (using extra training data)
4 code implementations • 25 Feb 2020 • Sangwoo Mo, Minsu Cho, Jinwoo Shin
Generative adversarial networks (GANs) have shown outstanding performance on a wide range of problems in computer vision, graphics, and machine learning, but often require large amounts of training data and heavy computational resources.
Ranked #5 on 10-shot image generation on Babies
no code implementations • 25 Nov 2019 • Ilchae Jung, Kihyun You, Hyeonwoo Noh, Minsu Cho, Bohyung Han
We propose a novel meta-learning framework for real-time object tracking with efficient model adaptation and channel pruning.
1 code implementation • NeurIPS 2019 • Sangwoo Mo, Chiheon Kim, Sungwoong Kim, Minsu Cho, Jinwoo Shin
Conditional generative adversarial networks (cGANs) have gained considerable attention in recent years due to their class-wise controllability and superior quality for complex generation tasks.
no code implementations • 3 Oct 2019 • Wonpyo Park, Paul Hongsuck Seo, Bohyung Han, Minsu Cho
We introduce a novel stochastic regularization technique for deep neural networks, which decomposes a layer into multiple branches with different parameters and merges stochastically sampled combinations of the outputs from the branches during training.
no code implementations • 28 Aug 2019 • Juhong Min, Jongmin Lee, Jean Ponce, Minsu Cho
In this paper, we present a new large-scale benchmark dataset of semantically paired images, SPair-71k, which contains 70,958 image pairs with diverse variations in viewpoint and scale.
1 code implementation • ICCV 2019 • Juhong Min, Jongmin Lee, Jean Ponce, Minsu Cho
Establishing visual correspondences under large intra-class variations requires analyzing images at different levels, from features linked to semantics and context to local patterns, while being invariant to instance-specific details.
Ranked #1 on Semantic correspondence on Caltech-101
1 code implementation • 7 Jun 2019 • Minsu Cho, Mohammadreza Soltani, Chinmay Hegde
Neural Architecture Search remains a very challenging meta-learning problem.
1 code implementation • ICLR 2019 • Sangwoo Mo, Minsu Cho, Jinwoo Shin
Unsupervised image-to-image translation has gained considerable attention due to the recent impressive progress based on generative adversarial networks (GANs).
no code implementations • 24 Apr 2019 • Minsu Cho, Chinmay Hegde
We propose a new algorithm for hyperparameter selection in machine learning algorithms.
1 code implementation • CVPR 2019 • Sungyeon Kim, Minkyo Seo, Ivan Laptev, Minsu Cho, Suha Kwak
Metric Learning for visual similarity has mostly adopted binary supervision indicating whether a pair of images are of the same class or not.
no code implementations • 15 Apr 2019 • Seungkwan Lee, Suha Kwak, Minsu Cho
Bounding-box regression is a popular technique to refine or predict localization boxes in recent object detection approaches.
3 code implementations • CVPR 2019 • Wonpyo Park, Dongju Kim, Yan Lu, Minsu Cho
Knowledge distillation aims at transferring knowledge acquired in one model (a teacher) to another model (a student) that is typically smaller.
1 code implementation • CVPR 2019 • Huy V. Vo, Francis Bach, Minsu Cho, Kai Han, Yann Lecun, Patrick Perez, Jean Ponce
Learning with complete or partial supervision is powerful but relies on ever-growing human annotation efforts.
Ranked #2 on Single-object colocalization on Object Discovery
1 code implementation • 28 Dec 2018 • Sangwoo Mo, Minsu Cho, Jinwoo Shin
Our comparative evaluation demonstrates the effectiveness of the proposed method on different image datasets, in particular, in the aforementioned challenging cases.
no code implementations • ECCV 2018 • Paul Hongsuck Seo, Jongmin Lee, Deunsol Jung, Bohyung Han, Minsu Cho
Semantic correspondence is the problem of establishing correspondences across images depicting different instances of the same object or scene class.
no code implementations • CVPR 2017 • Jeany Son, Mooyeol Baek, Minsu Cho, Bohyung Han
We propose Quadruplet Convolutional Neural Networks (Quad-CNN) for multi-object tracking, which learn to associate object detections across frames using quadruplet losses.
1 code implementation • ICCV 2017 • Kai Han, Rafael S. Rezende, Bumsub Ham, Kwan-Yee K. Wong, Minsu Cho, Cordelia Schmid, Jean Ponce
This paper addresses the problem of establishing semantic correspondences between images depicting different instances of the same object or scene category.
no code implementations • 21 Mar 2017 • Bumsub Ham, Minsu Cho, Cordelia Schmid, Jean Ponce
Finding image correspondences remains a challenging problem in the presence of intra-class variations and large changes in scene layout.
1 code implementation • 12 Dec 2016 • Jonghwan Mun, Minsu Cho, Bohyung Han
Visual attention plays an important role in understanding images and demonstrates its effectiveness in generating natural language descriptions of images.
1 code implementation • 14 Sep 2016 • Vadim Kantorov, Maxime Oquab, Minsu Cho, Ivan Laptev
The additive model encourages the predicted object region to be supported by its surrounding context region.
Ranked #4 on Weakly Supervised Object Detection on Charades
no code implementations • CVPR 2016 • Suha Kwak, Minsu Cho, Ivan Laptev
We address the problem of learning a pose-aware, compact embedding that projects images with similar human poses to be placed close-by in the embedding space.
no code implementations • CVPR 2016 • Bumsub Ham, Minsu Cho, Cordelia Schmid, Jean Ponce
Finding image correspondences remains a challenging problem in the presence of intra-class variations and large changes in scene layout. Semantic flow methods are designed to handle images depicting different instances of the same object or scene category.
no code implementations • CVPR 2015 • Bumsub Ham, Minsu Cho, Jean Ponce
Regularizing images under a guidance signal has been used in various tasks in computer vision and computational photography, particularly for noise reduction and joint upsampling.
no code implementations • ICCV 2015 • Suha Kwak, Minsu Cho, Ivan Laptev, Jean Ponce, Cordelia Schmid
This paper addresses the problem of automatically localizing dominant objects as spatio-temporal tubes in a noisy collection of videos with minimal or even no supervision.
no code implementations • 20 Feb 2015 • Junchi Yan, Minsu Cho, Hongyuan Zha, Xiaokang Yang, Stephen Chu
We propose multi-graph matching methods to incorporate the two aspects by boosting the affinity score, meanwhile gradually infusing the consistency as a regularizer.
no code implementations • CVPR 2015 • Minsu Cho, Suha Kwak, Cordelia Schmid, Jean Ponce
This paper addresses unsupervised discovery and localization of dominant objects from a noisy image collection with multiple object classes.
no code implementations • CVPR 2014 • Minsu Cho, Jian Sun, Olivier Duchenne, Jean Ponce
A major challenge in real-world feature matching problems is to tolerate the numerous outliers arising in typical visual tasks.