Search Results for author: Hyunwoo J. Kim

Found 54 papers, 36 papers with code

Robust Multimodal 3D Object Detection via Modality-Agnostic Decoding and Proximity-based Modality Ensemble

1 code implementation 27 Jul 2024 Juhan Cha, Minseok Joo, Jihwan Park, Sanghyeok Lee, Injae Kim, Hyunwoo J. Kim

Additionally, existing fusion methods overlook the detrimental impact of sensor noise induced by environmental changes on detection performance.

3D Object Detection object-detection

Diffusion Prior-Based Amortized Variational Inference for Noisy Inverse Problems

1 code implementation 23 Jul 2024 Sojin Lee, Dogyun Park, Inho Kong, Hyunwoo J. Kim

Recent studies on inverse problems have proposed posterior samplers that leverage the pre-trained diffusion models as powerful priors.

Ranked #1 on Image Super-Resolution on ImageNet (using extra training data)

Colorization Deblurring +7

Retrieval-Augmented Open-Vocabulary Object Detection

1 code implementation CVPR 2024 Jooyeon Kim, Eulrang Cho, Sehyung Kim, Hyunwoo J. Kim

Specifically, RALF consists of two modules: Retrieval Augmented Losses (RAL) and Retrieval-Augmented visual Features (RAF).

Ranked #10 on Open Vocabulary Object Detection on MSCOCO (using extra training data)

Language Modelling Large Language Model +6

Prompt Learning via Meta-Regularization

1 code implementation CVPR 2024 Jinyoung Park, Juyeon Ko, Hyunwoo J. Kim

Recently, prompt learning approaches have been explored to efficiently and effectively adapt the vision-language models to a variety of downstream tasks.

Domain Generalization General Knowledge +1

Groupwise Query Specialization and Quality-Aware Multi-Assignment for Transformer-based Visual Relationship Detection

1 code implementation CVPR 2024 Jongha Kim, Jihwan Park, Jinyoung Park, Jinyoung Kim, Sehyung Kim, Hyunwoo J. Kim

Groupwise Query Specialization trains a specialized query by dividing queries and relations into disjoint groups and directing a query in a specific query group solely toward relations in the corresponding relation group.

Relation Relationship Detection +2

vid-TLDR: Training Free Token merging for Light-weight Video Transformer

1 code implementation CVPR 2024 Joonmyung Choi, Sanghyeok Lee, Jaewon Chu, Minhyuk Choi, Hyunwoo J. Kim

To tackle these issues, we propose training free token merging for lightweight video Transformer (vid-TLDR) that aims to enhance the efficiency of video Transformers by merging the background tokens without additional training.

Ranked #2 on Video Retrieval on SSv2-template retrieval (using extra training data)

Action Recognition Computational Efficiency +5

Graph Elicitation for Guiding Multi-Step Reasoning in Large Language Models

no code implementations 16 Nov 2023 Jinyoung Park, Ameen Patel, Omar Zia Khan, Hyunwoo J. Kim, Joo-Kyung Kim

To deal with them, we propose a GE-Reasoning method, which directs LLMs to generate proper sub-questions and corresponding answers.

Multi-hop Question Answering Question Answering +2

UP-NeRF: Unconstrained Pose-Prior-Free Neural Radiance Fields

1 code implementation 7 Nov 2023 Injae Kim, Minhyuk Choi, Hyunwoo J. Kim

Neural Radiance Field (NeRF) has enabled novel view synthesis with high fidelity given images and camera poses.

Novel View Synthesis Pose Estimation

NuTrea: Neural Tree Search for Context-guided Multi-hop KGQA

1 code implementation NeurIPS 2023 Hyeong Kyu Choi, Seunghun Lee, Jaewon Chu, Hyunwoo J. Kim

Multi-hop Knowledge Graph Question Answering (KGQA) is a task that involves retrieving nodes from a knowledge graph (KG) to answer natural language questions.

Graph Question Answering Proper Noun +1

Large Language Models are Temporal and Causal Reasoners for Video Question Answering

1 code implementation 24 Oct 2023 Dohwan Ko, Ji Soo Lee, Wooyoung Kang, Byungseok Roh, Hyunwoo J. Kim

We observe that the LLMs provide effective priors in exploiting $\textit{linguistic shortcuts}$ for temporal and causal reasoning in Video Question Answering (VideoQA).

Natural Language Understanding Question Answering +2

Distribution-Aware Prompt Tuning for Vision-Language Models

1 code implementation ICCV 2023 Eulrang Cho, Jooyeon Kim, Hyunwoo J. Kim

Pre-trained vision-language models (VLMs) have shown impressive performance on various downstream tasks by utilizing knowledge learned from large data.

Semantic-Aware Implicit Template Learning via Part Deformation Consistency

1 code implementation ICCV 2023 Sihyeon Kim, Minseok Joo, Jaewon Lee, Juyeon Ko, Juhan Cha, Hyunwoo J. Kim

In this paper, we highlight the importance of part deformation consistency and propose a semantic-aware implicit template learning framework to enable semantically plausible deformation.

Concept Bottleneck with Visual Concept Filtering for Explainable Medical Image Classification

no code implementations 23 Aug 2023 Injae Kim, Jongha Kim, Joonmyung Choi, Hyunwoo J. Kim

However, those methods do not consider whether a concept is visually relevant or not, which is an important factor in computing meaningful concept scores.

Image Classification Medical Image Classification

Self-positioning Point-based Transformer for Point Cloud Understanding

1 code implementation CVPR 2023 Jinyoung Park, Sanghyeok Lee, Sihyeon Kim, Yunyang Xiong, Hyunwoo J. Kim

In this paper, we present a Self-Positioning point-based Transformer (SPoTr), which is designed to capture both local and global shape contexts with reduced complexity.

3D Part Segmentation Scene Segmentation +1

MELTR: Meta Loss Transformer for Learning to Fine-tune Video Foundation Models

1 code implementation CVPR 2023 Dohwan Ko, Joonmyung Choi, Hyeong Kyu Choi, Kyoung-Woon On, Byungseok Roh, Hyunwoo J. Kim

Therefore, we propose MEta Loss TRansformer (MELTR), a plug-in module that automatically and non-linearly combines various loss functions to aid learning the target task via auxiliary learning.

Auxiliary Learning Multimodal Sentiment Analysis +10

k-SALSA: k-anonymous synthetic averaging of retinal images via local style alignment

1 code implementation 20 Mar 2023 Minkyu Jeon, Hyeonjin Park, Hyunwoo J. Kim, Michael Morley, Hyunghoon Cho

While prior works have explored image de-identification strategies based on synthetic averaging of images in other domains (e.g., facial images), existing techniques face difficulty in preserving both privacy and clinical utility in retinal images, as we demonstrate in our work.

De-identification Generative Adversarial Network

Semantic-aware Occlusion Filtering Neural Radiance Fields in the Wild

no code implementations 5 Mar 2023 Jaewon Lee, Injae Kim, Hwan Heo, Hyunwoo J. Kim

We present a learning framework for reconstructing neural scene representations from a small number of unconstrained tourist photos.

Novel View Synthesis

Robust Camera Pose Refinement for Multi-Resolution Hash Encoding

no code implementations 3 Feb 2023 Hwan Heo, Taekyung Kim, Jiyoung Lee, Jaewon Lee, Soohyun Kim, Hyunwoo J. Kim, Jin-Hwa Kim

Multi-resolution hash encoding has recently been proposed to reduce the computational cost of neural renderings, such as NeRF.

Neural Rendering Novel View Synthesis

Domain Generalization Emerges from Dreaming

no code implementations 2 Feb 2023 Hwan Heo, Youngjin Oh, Jaewon Lee, Hyunwoo J. Kim

Recent studies have proven that DNNs, unlike human vision, tend to exploit texture information rather than shape.

Data Augmentation Domain Generalization +1

Relation-Aware Language-Graph Transformer for Question Answering

1 code implementation 2 Dec 2022 Jinyoung Park, Hyeong Kyu Choi, Juyeon Ko, Hyeonjin Park, Ji-Hoon Kim, Jisu Jeong, KyungMin Kim, Hyunwoo J. Kim

To address these issues, we propose Question Answering Transformer (QAT), which is designed to jointly reason over language and graphs with respect to entity relations in a unified manner.

Question Answering Relation

Invertible Monotone Operators for Normalizing Flows

1 code implementation 15 Oct 2022 Byeongkeun Ahn, Chiyoon Kim, Youngjoon Hong, Hyunwoo J. Kim

Normalizing flows model probability distributions by learning invertible transformations that transfer a simple distribution into complex distributions.

Density Estimation

TokenMixup: Efficient Attention-guided Token-level Data Augmentation for Transformers

1 code implementation 14 Oct 2022 Hyeong Kyu Choi, Joonmyung Choi, Hyunwoo J. Kim

To this end, we propose TokenMixup, an efficient attention-guided token-level data augmentation method that aims to maximize the saliency of a mixed set of tokens.

Data Augmentation Image Classification

SageMix: Saliency-Guided Mixup for Point Clouds

1 code implementation 13 Oct 2022 Sanghyeok Lee, Minkyu Jeon, Injae Kim, Yunyang Xiong, Hyunwoo J. Kim

Mixup is a simple and widely-used data augmentation technique that has proven effective in alleviating the problems of overfitting and data scarcity.

3D Part Segmentation 3D Point Cloud Classification +3

Deformable Graph Transformer

no code implementations 29 Jun 2022 Jinyoung Park, Seongjun Yun, Hyeonjin Park, Jaewoo Kang, Jisu Jeong, Kyung-Min Kim, Jung-Woo Ha, Hyunwoo J. Kim

Transformer-based models have recently shown success in representation learning on graph-structured data beyond natural language processing and computer vision.

Representation Learning

Consistency Learning via Decoding Path Augmentation for Transformers in Human Object Interaction Detection

1 code implementation CVPR 2022 Jihwan Park, Seungjun Lee, Hwan Heo, Hyeong Kyu Choi, Hyunwoo J. Kim

Motivated by various inference paths for HOI detection, we propose cross-path consistency learning (CPC), which is a novel end-to-end learning strategy to improve HOI detection for transformers by leveraging augmented decoding paths.

Human-Object Interaction Detection object-detection +1

Video-Text Representation Learning via Differentiable Weak Temporal Alignment

1 code implementation CVPR 2022 Dohwan Ko, Joonmyung Choi, Juyeon Ko, Shinyeong Noh, Kyoung-Woon On, Eun-Sol Kim, Hyunwoo J. Kim

In this paper, we propose a novel multi-modal self-supervised framework Video-Text Temporally Weak Alignment-based Contrastive Learning (VT-TWINS) to capture significant information from noisy and weakly correlated data using a variant of Dynamic Time Warping (DTW).

Contrastive Learning Dynamic Time Warping +1

Metropolis-Hastings Data Augmentation for Graph Neural Networks

no code implementations NeurIPS 2021 Hyeonjin Park, Seunghun Lee, Sihyeon Kim, Jinyoung Park, Jisu Jeong, Kyung-Min Kim, Jung-Woo Ha, Hyunwoo J. Kim

We also propose a simple and effective semi-supervised learning strategy with generated samples from MH-Aug. Our extensive experiments demonstrate that MH-Aug can generate a sequence of samples according to the target distribution to significantly improve the performance of GNNs.

Data Augmentation Diversity

Improving Object Detection, Multi-object Tracking, and Re-Identification for Disaster Response Drones

4 code implementations 5 Jan 2022 Chongkeun Paik, Hyunwoo J. Kim

In the second approach, although DeepSORT only processes a quarter of all frames due to hardware and time limitations, our model with DeepSORT (42.9%) outperforms FairMOT (71.4%) in terms of recall.

Disaster Response Multi-Object Tracking +3

Deformable Graph Convolutional Networks

1 code implementation 29 Dec 2021 Jinyoung Park, Sungdong Yoo, Jihwan Park, Hyunwoo J. Kim

To address the two common problems of graph convolution, in this paper, we propose Deformable Graph Convolutional Networks (Deformable GCNs) that adaptively perform convolution in multiple latent spaces and capture short/long-range dependencies between nodes.

Node Classification on Non-Homophilic (Heterophilic) Graphs Representation Learning

Graph Transformer Networks: Learning Meta-path Graphs to Improve GNNs

1 code implementation 11 Jun 2021 Seongjun Yun, Minbyul Jeong, Sungdong Yoo, Seunghun Lee, Sean S. Yi, Raehyun Kim, Jaewoo Kang, Hyunwoo J. Kim

Despite the success of GNNs, most existing GNNs are designed to learn node representations on the fixed and homogeneous graphs.

Node Classification

HOTR: End-to-End Human-Object Interaction Detection with Transformers

1 code implementation CVPR 2021 Bumsoo Kim, Junhyun Lee, Jaewoo Kang, Eun-Sol Kim, Hyunwoo J. Kim

Human-Object Interaction (HOI) detection is a task of identifying "a set of interactions" in an image, which involves the i) localization of the subject (i.e., humans) and target (i.e., objects) of interaction, and ii) the classification of the interaction labels.

Decoder Human-Object Interaction Detection +3

Robust Neural Networks inspired by Strong Stability Preserving Runge-Kutta methods

1 code implementation ECCV 2020 Byungjoo Kim, Bryce Chudomelka, Jinyoung Park, Jaewoo Kang, Youngjoon Hong, Hyunwoo J. Kim

Motivated by the SSP property and a generalized Runge-Kutta method, we propose Strong Stability Preserving networks (SSP networks) which improve robustness against adversarial attacks.

Unpaired Image Translation via Adaptive Convolution-based Normalization

no code implementations 29 Nov 2019 Wonwoong Cho, Kangyeol Kim, Eungyeup Kim, Hyunwoo J. Kim, Jaegul Choo

Disentangling content and style information of an image has played an important role in recent success in image translation.

Translation

Graph Transformer Networks

1 code implementation NeurIPS 2019 Seongjun Yun, Minbyul Jeong, Raehyun Kim, Jaewoo Kang, Hyunwoo J. Kim

In this paper, we propose Graph Transformer Networks (GTNs) that are capable of generating new graph structures, which involve identifying useful connections between unconnected nodes on the original graph, while learning effective node representation on the new graphs in an end-to-end fashion.

General Classification Heterogeneous Node Classification +2

ANTNets: Mobile Convolutional Neural Networks for Resource Efficient Image Classification

no code implementations 7 Apr 2019 Yunyang Xiong, Hyunwoo J. Kim, Varsha Hedau

It boosts the representational power by modeling, in a high dimensional space, interdependency of channels between a depthwise convolution layer and a projection layer in the ANTBlocks.

Classification General Classification +1

Efficient Relative Attribute Learning using Graph Neural Networks

1 code implementation ECCV 2018 Zihang Meng, Nagesh Adluru, Hyunwoo J. Kim, Glenn Fung, Vikas Singh

A sizable body of work on relative attributes provides compelling evidence that relating pairs of images along a continuum of strength pertaining to a visual attribute yields significant improvements in a wide variety of tasks in vision.

Attribute Clothing Attribute Recognition

Tensorize, Factorize and Regularize: Robust Visual Relationship Learning

no code implementations CVPR 2018 Seong Jae Hwang, Sathya N. Ravi, Zirui Tao, Hyunwoo J. Kim, Maxwell D. Collins, Vikas Singh

Visual relationships provide higher-level information of objects and their relations in an image – this enables a semantic understanding of the scene and helps downstream applications.

Relational Reasoning Relationship Detection +1

Sampling-free Uncertainty Estimation in Gated Recurrent Units with Exponential Families

no code implementations 19 Apr 2018 Seong Jae Hwang, Ronak Mehta, Hyunwoo J. Kim, Vikas Singh

There has recently been a concerted effort to derive mechanisms in vision and machine learning systems to offer uncertainty estimates of the predictions they make.

Finding Differentially Covarying Needles in a Temporally Evolving Haystack: A Scan Statistics Perspective

no code implementations 20 Nov 2017 Ronak Mehta, Hyunwoo J. Kim, Shulei Wang, Sterling C. Johnson, Ming Yuan, Vikas Singh

Recent results in coupled or temporal graphical models offer schemes for estimating the relationship structure between features when the data come from related (but distinct) longitudinal sources.

Riemannian Nonlinear Mixed Effects Models: Analyzing Longitudinal Deformations in Neuroimaging

no code implementations CVPR 2017 Hyunwoo J. Kim, Nagesh Adluru, Heemanshu Suri, Baba C. Vemuri, Sterling C. Johnson, Vikas Singh

Statistical machine learning models that operate on manifold-valued data are being extensively studied in vision, motivated by applications in activity recognition, feature tracking and medical imaging.

Activity Recognition regression

Latent Variable Graphical Model Selection Using Harmonic Analysis: Applications to the Human Connectome Project (HCP)

no code implementations CVPR 2016 Won Hwa Kim, Hyunwoo J. Kim, Nagesh Adluru, Vikas Singh

A major goal of imaging studies such as the (ongoing) Human Connectome Project (HCP) is to characterize the structural network map of the human brain and identify its associations with covariates such as genotype, risk factors, and so on that correspond to an individual.

Model Selection

Interpolation on the Manifold of K Component GMMs

no code implementations ICCV 2015 Hyunwoo J. Kim, Nagesh Adluru, Monami Banerjee, Baba C. Vemuri, Vikas Singh

Probability density functions (PDFs) are fundamental "objects" in mathematics with numerous applications in computer vision, machine learning and medical imaging.
