Search Results for author: Hyunwoo J. Kim

Found 51 papers, 32 papers with code

Retrieval-Augmented Open-Vocabulary Object Detection

2 code implementations • 8 Apr 2024 • Jooyeon Kim, Eulrang Cho, Sehyung Kim, Hyunwoo J. Kim

Specifically, RALF consists of two modules: Retrieval Augmented Losses (RAL) and Retrieval-Augmented visual Features (RAF).

Ranked #9 on Open Vocabulary Object Detection on MSCOCO (using extra training data)

Language Modelling • Large Language Model • +6

Prompt Learning via Meta-Regularization

1 code implementation • 1 Apr 2024 • Jinyoung Park, Juyeon Ko, Hyunwoo J. Kim

Recently, prompt learning approaches have been explored to efficiently and effectively adapt the vision-language models to a variety of downstream tasks.

Ranked #2 on Prompt Engineering on Stanford Cars

Domain Generalization • General Knowledge • +1

Groupwise Query Specialization and Quality-Aware Multi-Assignment for Transformer-based Visual Relationship Detection

1 code implementation • 26 Mar 2024 • Jongha Kim, Jihwan Park, Jinyoung Park, Jinyoung Kim, Sehyung Kim, Hyunwoo J. Kim

Groupwise Query Specialization trains a specialized query by dividing queries and relations into disjoint groups and directing a query in a specific query group solely toward relations in the corresponding relation group.

Ranked #1 on Scene Graph Generation on Visual Genome

Relation • Relationship Detection • +2

vid-TLDR: Training Free Token merging for Light-weight Video Transformer

1 code implementation • 20 Mar 2024 • Joonmyung Choi, Sanghyeok Lee, Jaewon Chu, Minhyuk Choi, Hyunwoo J. Kim

To tackle these issues, we propose training free token merging for lightweight video Transformer (vid-TLDR) that aims to enhance the efficiency of video Transformers by merging the background tokens without additional training.

Ranked #2 on Video Retrieval on SSv2-template retrieval (using extra training data)

Action Recognition • Computational Efficiency • +5

Multi-criteria Token Fusion with One-step-ahead Attention for Efficient Vision Transformers

1 code implementation • 15 Mar 2024 • Sanghyeok Lee, Joonmyung Choi, Hyunwoo J. Kim

Here, we argue that token fusion needs to consider diverse relations between tokens to minimize information loss.

Ranked #1 on Efficient ViTs on ImageNet-1K (With LV-ViT-S)

Computational Efficiency • Efficient ViTs

Stochastic Conditional Diffusion Models for Semantic Image Synthesis

no code implementations • 26 Feb 2024 • Juyeon Ko, Inho Kong, Dogyun Park, Hyunwoo J. Kim

This facilitates the generation of an image close to a clean image, enabling robust generation.

Image Generation

DDMI: Domain-Agnostic Latent Diffusion Models for Synthesizing High-Quality Implicit Neural Representations

1 code implementation • 23 Jan 2024 • Dogyun Park, Sihyeon Kim, Sojin Lee, Hyunwoo J. Kim

Arguably, this architecture limits the expressive power of generative models and results in low-quality INR generation.

Ranked #2 on Video Generation on Sky Time-lapse

3D Shape Generation • Image Generation • +1

UnionDet: Union-Level Detector Towards Real-Time Human-Object Interaction Detection

no code implementations • ECCV 2020 • Bumsoo Kim, Taeho Choi, Jaewoo Kang, Hyunwoo J. Kim

This is a major bottleneck in HOI detection inference time.

Human-Object Interaction Detection • object-detection • +1

Graph-Guided Reasoning for Multi-Hop Question Answering in Large Language Models

no code implementations • 16 Nov 2023 • Jinyoung Park, Ameen Patel, Omar Zia Khan, Hyunwoo J. Kim, Joo-Kyung Kim

Specifically, we first leverage LLMs to construct a "question/rationale graph" by using knowledge extraction prompting given the initial question and the rationales generated in the previous steps.

Multi-hop Question Answering • Question Answering

UP-NeRF: Unconstrained Pose-Prior-Free Neural Radiance Fields

1 code implementation • 7 Nov 2023 • Injae Kim, Minhyuk Choi, Hyunwoo J. Kim

Neural Radiance Field (NeRF) has enabled novel view synthesis with high fidelity given images and camera poses.

Novel View Synthesis • Pose Estimation

Large Language Models are Temporal and Causal Reasoners for Video Question Answering

1 code implementation • 24 Oct 2023 • Dohwan Ko, Ji Soo Lee, Wooyoung Kang, Byungseok Roh, Hyunwoo J. Kim

We observe that the LLMs provide effective priors in exploiting "linguistic shortcuts" for temporal and causal reasoning in Video Question Answering (VideoQA).

Ranked #1 on Video Question Answering on TVQA

Natural Language Understanding • Question Answering • +2

Distribution-Aware Prompt Tuning for Vision-Language Models

1 code implementation • ICCV 2023 • Eulrang Cho, Jooyeon Kim, Hyunwoo J. Kim

Pre-trained vision-language models (VLMs) have shown impressive performance on various downstream tasks by utilizing knowledge learned from large data.

Read-only Prompt Optimization for Vision-Language Few-shot Learning

1 code implementation • ICCV 2023 • Dongjun Lee, Seokwon Song, Jihee Suh, Joonmyung Choi, Sanghyeok Lee, Hyunwoo J. Kim

RPO leverages masked attention to prevent the internal representation shift in the pre-trained model.

Ranked #6 on Prompt Engineering on Caltech-101

Domain Generalization • Few-Shot Learning • +1

Semantic-Aware Implicit Template Learning via Part Deformation Consistency

1 code implementation • ICCV 2023 • Sihyeon Kim, Minseok Joo, Jaewon Lee, Juyeon Ko, Juhan Cha, Hyunwoo J. Kim

In this paper, we highlight the importance of part deformation consistency and propose a semantic-aware implicit template learning framework to enable semantically plausible deformation.

Concept Bottleneck with Visual Concept Filtering for Explainable Medical Image Classification

no code implementations • 23 Aug 2023 • Injae Kim, Jongha Kim, Joonmyung Choi, Hyunwoo J. Kim

However, those methods do not consider whether a concept is visually relevant or not, which is an important factor in computing meaningful concept scores.

Image Classification • Medical Image Classification

Open-vocabulary Video Question Answering: A New Benchmark for Evaluating the Generalizability of Video Question Answering Models

1 code implementation • ICCV 2023 • Dohwan Ko, Ji Soo Lee, Miso Choi, Jaewon Chu, Jihwan Park, Hyunwoo J. Kim

We hence propose a new benchmark, Open-vocabulary Video Question Answering (OVQA), to measure the generalizability of VideoQA models by considering rare and unseen answers.

Ranked #8 on Visual Question Answering (VQA) on MSRVTT-QA

Multiple-choice Question Answering +4

Self-positioning Point-based Transformer for Point Cloud Understanding

1 code implementation • CVPR 2023 • Jinyoung Park, Sanghyeok Lee, Sihyeon Kim, Yunyang Xiong, Hyunwoo J. Kim

In this paper, we present a Self-Positioning point-based Transformer (SPoTr), which is designed to capture both local and global shape contexts with reduced complexity.

Ranked #2 on 3D Part Segmentation on ShapeNet-Part

3D Part Segmentation • 3D Point Cloud Classification • +1

MELTR: Meta Loss Transformer for Learning to Fine-tune Video Foundation Models

1 code implementation • CVPR 2023 • Dohwan Ko, Joonmyung Choi, Hyeong Kyu Choi, Kyoung-Woon On, Byungseok Roh, Hyunwoo J. Kim

Therefore, we propose MEta Loss TRansformer (MELTR), a plug-in module that automatically and non-linearly combines various loss functions to aid learning the target task via auxiliary learning.

Ranked #2 on Video Captioning on YouCook2

Auxiliary Learning • Multimodal Sentiment Analysis • +10

k-SALSA: k-anonymous synthetic averaging of retinal images via local style alignment

1 code implementation • 20 Mar 2023 • Minkyu Jeon, Hyeonjin Park, Hyunwoo J. Kim, Michael Morley, Hyunghoon Cho

While prior works have explored image de-identification strategies based on synthetic averaging of images in other domains (e.g., facial images), existing techniques face difficulty in preserving both privacy and clinical utility in retinal images, as we demonstrate in our work.

De-identification • Generative Adversarial Network

Semantic-aware Occlusion Filtering Neural Radiance Fields in the Wild

no code implementations • 5 Mar 2023 • Jaewon Lee, Injae Kim, Hwan Heo, Hyunwoo J. Kim

We present a learning framework for reconstructing neural scene representations from a small number of unconstrained tourist photos.

Novel View Synthesis

Robust Camera Pose Refinement for Multi-Resolution Hash Encoding

no code implementations • 3 Feb 2023 • Hwan Heo, Taekyung Kim, Jiyoung Lee, Jaewon Lee, Soohyun Kim, Hyunwoo J. Kim, Jin-Hwa Kim

Multi-resolution hash encoding has recently been proposed to reduce the computational cost of neural renderings, such as NeRF.

Neural Rendering • Novel View Synthesis

Domain Generalization Emerges from Dreaming

no code implementations • 2 Feb 2023 • Hwan Heo, Youngjin Oh, Jaewon Lee, Hyunwoo J. Kim

Recent studies have proven that DNNs, unlike human vision, tend to exploit texture information rather than shape.

Data Augmentation • Domain Generalization • +1

Relation-Aware Language-Graph Transformer for Question Answering

1 code implementation • 2 Dec 2022 • Jinyoung Park, Hyeong Kyu Choi, Juyeon Ko, Hyeonjin Park, Ji-Hoon Kim, Jisu Jeong, KyungMin Kim, Hyunwoo J. Kim

To address these issues, we propose Question Answering Transformer (QAT), which is designed to jointly reason over language and graphs with respect to entity relations in a unified manner.

Question Answering • Relation

Invertible Monotone Operators for Normalizing Flows

1 code implementation • 15 Oct 2022 • Byeongkeun Ahn, Chiyoon Kim, Youngjoon Hong, Hyunwoo J. Kim

Normalizing flows model probability distributions by learning invertible transformations that transfer a simple distribution into complex distributions.

Density Estimation

TokenMixup: Efficient Attention-guided Token-level Data Augmentation for Transformers

1 code implementation • 14 Oct 2022 • Hyeong Kyu Choi, Joonmyung Choi, Hyunwoo J. Kim

To this end, we propose TokenMixup, an efficient attention-guided token-level data augmentation method that aims to maximize the saliency of a mixed set of tokens.

Ranked #65 on Image Classification on CIFAR-10

Data Augmentation • Image Classification

SageMix: Saliency-Guided Mixup for Point Clouds

1 code implementation • 13 Oct 2022 • Sanghyeok Lee, Minkyu Jeon, Injae Kim, Yunyang Xiong, Hyunwoo J. Kim

Mixup is a simple and widely-used data augmentation technique that has proven effective in alleviating the problems of overfitting and data scarcity.

Ranked #40 on 3D Part Segmentation on ShapeNet-Part

3D Part Segmentation • 3D Point Cloud Classification • +3

Deformable Graph Transformer

no code implementations • 29 Jun 2022 • Jinyoung Park, Seongjun Yun, Hyeonjin Park, Jaewoo Kang, Jisu Jeong, Kyung-Min Kim, Jung-Woo Ha, Hyunwoo J. Kim

Transformer-based models have recently shown success in representation learning on graph-structured data beyond natural language processing and computer vision.

Representation Learning

Neo-GNNs: Neighborhood Overlap-aware Graph Neural Networks for Link Prediction

1 code implementation • NeurIPS 2021 • Seongjun Yun, Seoyoon Kim, Junhyun Lee, Jaewoo Kang, Hyunwoo J. Kim

Graph Neural Networks (GNNs) have been widely applied to various fields for learning over graph-structured data.

Graph Classification • Link Prediction • +1

Consistency Learning via Decoding Path Augmentation for Transformers in Human Object Interaction Detection

1 code implementation • CVPR 2022 • Jihwan Park, Seungjun Lee, Hwan Heo, Hyeong Kyu Choi, Hyunwoo J. Kim

Motivated by various inference paths for HOI detection, we propose cross-path consistency learning (CPC), which is a novel end-to-end learning strategy to improve HOI detection for transformers by leveraging augmented decoding paths.

Ranked #1 on Human-Object Interaction Detection on V-COCO (MAP metric)

Human-Object Interaction Detection • object-detection • +1

Video-Text Representation Learning via Differentiable Weak Temporal Alignment

1 code implementation • CVPR 2022 • Dohwan Ko, Joonmyung Choi, Juyeon Ko, Shinyeong Noh, Kyoung-Woon On, Eun-Sol Kim, Hyunwoo J. Kim

In this paper, we propose a novel multi-modal self-supervised framework Video-Text Temporally Weak Alignment-based Contrastive Learning (VT-TWINS) to capture significant information from noisy and weakly correlated data using a variant of Dynamic Time Warping (DTW).

Contrastive Learning • Dynamic Time Warping • +1

Metropolis-Hastings Data Augmentation for Graph Neural Networks

no code implementations • NeurIPS 2021 • Hyeonjin Park, Seunghun Lee, Sihyeon Kim, Jinyoung Park, Jisu Jeong, Kyung-Min Kim, Jung-Woo Ha, Hyunwoo J. Kim

We also propose a simple and effective semi-supervised learning strategy with generated samples from MH-Aug. Our extensive experiments demonstrate that MH-Aug can generate a sequence of samples according to the target distribution to significantly improve the performance of GNNs.

Data Augmentation

Improving Object Detection, Multi-object Tracking, and Re-Identification for Disaster Response Drones

4 code implementations • 5 Jan 2022 • Chongkeun Paik, Hyunwoo J. Kim

In the second approach, although DeepSORT only processes a quarter of all frames due to hardware and time limitations, our model with DeepSORT (42.9%) outperforms FairMOT (71.4%) in terms of recall.

Disaster Response • Multi-Object Tracking • +3

Deformable Graph Convolutional Networks

1 code implementation • 29 Dec 2021 • Jinyoung Park, Sungdong Yoo, Jihwan Park, Hyunwoo J. Kim

To address the two common problems of graph convolution, in this paper, we propose Deformable Graph Convolutional Networks (Deformable GCNs) that adaptively perform convolution in multiple latent spaces and capture short/long-range dependencies between nodes.

Ranked #3 on Node Classification on Non-Homophilic (Heterophilic) Graphs on Cornell (48%/32%/20% fixed splits)

Node Classification on Non-Homophilic (Heterophilic) Graphs • Representation Learning

Point Cloud Augmentation with Weighted Local Transformations

1 code implementation • ICCV 2021 • Sihyeon Kim, Sanghyeok Lee, Dasol Hwang, Jaewon Lee, Seong Jae Hwang, Hyunwoo J. Kim

Although data augmentation is a standard approach to compensate for the scarcity of data, it has been less explored in the point cloud literature.

Ranked #11 on Point Cloud Classification on PointCloud-C

Data Augmentation • Point Cloud Classification

Graph Transformer Networks: Learning Meta-path Graphs to Improve GNNs

1 code implementation • 11 Jun 2021 • Seongjun Yun, Minbyul Jeong, Sungdong Yoo, Seunghun Lee, Sean S. Yi, Raehyun Kim, Jaewoo Kang, Hyunwoo J. Kim

Despite the success of GNNs, most existing GNNs are designed to learn node representations on the fixed and homogeneous graphs.

Node Classification

HOTR: End-to-End Human-Object Interaction Detection with Transformers

1 code implementation • CVPR 2021 • Bumsoo Kim, Junhyun Lee, Jaewoo Kang, Eun-Sol Kim, Hyunwoo J. Kim

Human-Object Interaction (HOI) detection is a task of identifying "a set of interactions" in an image, which involves the i) localization of the subject (i.e., humans) and target (i.e., objects) of interaction, and ii) the classification of the interaction labels.

Ranked #16 on Human-Object Interaction Detection on V-COCO

Human-Object Interaction Detection • Object • +2

Self-supervised Auxiliary Learning for Graph Neural Networks via Meta-Learning

1 code implementation • 1 Mar 2021 • Dasol Hwang, Jinyoung Park, Sunyoung Kwon, Kyung-Min Kim, Jung-Woo Ha, Hyunwoo J. Kim

Our method is learning to learn a primary task with various auxiliary tasks to improve generalization performance.

Auxiliary Learning • Link Prediction • +4

Robust Neural Networks inspired by Strong Stability Preserving Runge-Kutta methods

1 code implementation • ECCV 2020 • Byungjoo Kim, Bryce Chudomelka, Jinyoung Park, Jaewoo Kang, Youngjoon Hong, Hyunwoo J. Kim

Motivated by the SSP property and a generalized Runge-Kutta method, we propose Strong Stability Preserving networks (SSP networks) which improve robustness against adversarial attacks.

Self-supervised Auxiliary Learning with Meta-paths for Heterogeneous Graphs

1 code implementation • NeurIPS 2020 • Dasol Hwang, Jinyoung Park, Sunyoung Kwon, Kyung-Min Kim, Jung-Woo Ha, Hyunwoo J. Kim

Our proposed method is learning to learn a primary task by predicting meta-paths as auxiliary tasks.

Auxiliary Learning • Link Prediction • +3

Unpaired Image Translation via Adaptive Convolution-based Normalization

no code implementations • 29 Nov 2019 • Wonwoong Cho, Kangyeol Kim, Eungyeup Kim, Hyunwoo J. Kim, Jaegul Choo

Disentangling content and style information of an image has played an important role in recent success in image translation.

Translation

Graph Transformer Networks

1 code implementation • NeurIPS 2019 • Seongjun Yun, Minbyul Jeong, Raehyun Kim, Jaewoo Kang, Hyunwoo J. Kim

In this paper, we propose Graph Transformer Networks (GTNs) that are capable of generating new graph structures, which involve identifying useful connections between unconnected nodes on the original graph, while learning effective node representation on the new graphs in an end-to-end fashion.

General Classification • Link Prediction • +2

ANTNets: Mobile Convolutional Neural Networks for Resource Efficient Image Classification

no code implementations • 7 Apr 2019 • Yunyang Xiong, Hyunwoo J. Kim, Varsha Hedau

It boosts the representational power by modeling, in a high dimensional space, interdependency of channels between a depthwise convolution layer and a projection layer in the ANTBlocks.

Classification • General Classification • +1

Efficient Relative Attribute Learning using Graph Neural Networks

1 code implementation • ECCV 2018 • Zihang Meng, Nagesh Adluru, Hyunwoo J. Kim, Glenn Fung, Vikas Singh

A sizable body of work on relative attributes provides compelling evidence that relating pairs of images along a continuum of strength pertaining to a visual attribute yields significant improvements in a wide variety of tasks in vision.

Ranked #3 on Clothing Attribute Recognition on Clothing Attributes Dataset

Attribute • Clothing Attribute Recognition

Tensorize, Factorize and Regularize: Robust Visual Relationship Learning

no code implementations • CVPR 2018 • Seong Jae Hwang, Sathya N. Ravi, Zirui Tao, Hyunwoo J. Kim, Maxwell D. Collins, Vikas Singh

Visual relationships provide higher-level information of objects and their relations in an image; this enables a semantic understanding of the scene and helps downstream applications.

Relational Reasoning • Relationship Detection • +1

Sampling-free Uncertainty Estimation in Gated Recurrent Units with Exponential Families

no code implementations • 19 Apr 2018 • Seong Jae Hwang, Ronak Mehta, Hyunwoo J. Kim, Vikas Singh

There has recently been a concerted effort to derive mechanisms in vision and machine learning systems to offer uncertainty estimates of the predictions they make.

Maximizing Subset Accuracy with Recurrent Neural Networks in Multi-label Classification

no code implementations • NeurIPS 2017 • Jinseok Nam, Eneldo Loza Mencía, Hyunwoo J. Kim, Johannes Fürnkranz

Multi-label classification is the task of predicting a set of labels for a given input instance.

General Classification • Multi-Label Classification

Finding Differentially Covarying Needles in a Temporally Evolving Haystack: A Scan Statistics Perspective

no code implementations • 20 Nov 2017 • Ronak Mehta, Hyunwoo J. Kim, Shulei Wang, Sterling C. Johnson, Ming Yuan, Vikas Singh

Recent results in coupled or temporal graphical models offer schemes for estimating the relationship structure between features when the data come from related (but distinct) longitudinal sources.

Riemannian Nonlinear Mixed Effects Models: Analyzing Longitudinal Deformations in Neuroimaging

no code implementations • CVPR 2017 • Hyunwoo J. Kim, Nagesh Adluru, Heemanshu Suri, Baba C. Vemuri, Sterling C. Johnson, Vikas Singh

Statistical machine learning models that operate on manifold-valued data are being extensively studied in vision, motivated by applications in activity recognition, feature tracking and medical imaging.

Activity Recognition • regression

Latent Variable Graphical Model Selection Using Harmonic Analysis: Applications to the Human Connectome Project (HCP)

no code implementations • CVPR 2016 • Won Hwa Kim, Hyunwoo J. Kim, Nagesh Adluru, Vikas Singh

A major goal of imaging studies such as the (ongoing) Human Connectome Project (HCP) is to characterize the structural network map of the human brain and identify its associations with covariates such as genotype, risk factors, and so on that correspond to an individual.

Model Selection

Interpolation on the Manifold of K Component GMMs

no code implementations • ICCV 2015 • Hyunwoo J. Kim, Nagesh Adluru, Monami Banerjee, Baba C. Vemuri, Vikas Singh

Probability density functions (PDFs) are fundamental "objects" in mathematics with numerous applications in computer vision, machine learning and medical imaging.

Multivariate General Linear Models (MGLM) on Riemannian Manifolds with Applications to Statistical Analysis of Diffusion Weighted Images

no code implementations • CVPR 2014 • Hyunwoo J. Kim, Nagesh Adluru, Maxwell D. Collins, Moo K. Chung, Barbara B. Bendlin, Sterling C. Johnson, Richard J. Davidson, Vikas Singh

Linear regression is a parametric model which is ubiquitous in scientific analysis.

Dictionary Learning • regression
