Search Results for author: Jungin Park

Found 14 papers, 7 papers with code

Bridging Vision and Language Spaces with Assignment Prediction

1 code implementation 15 Apr 2024 Jungin Park, Jiyoung Lee, Kwanghoon Sohn

This paper introduces VLAP, a novel approach that bridges pretrained vision models and large language models (LLMs) to make frozen LLMs understand the visual world.

Cross-Modal Retrieval, Image Captioning, +3

Knowing Where to Focus: Event-aware Transformer for Video Grounding

1 code implementation ICCV 2023 Jinhyun Jang, Jungin Park, Jin Kim, Hyeongjun Kwon, Kwanghoon Sohn

Recent DETR-based video grounding models have made the model directly predict moment timestamps without any hand-crafted components, such as a pre-defined proposal or non-maximum suppression, by learning moment queries.

Moment Queries, Sentence, +1

PartMix: Regularization Strategy to Learn Part Discovery for Visible-Infrared Person Re-identification

no code implementations CVPR 2023 Minsu Kim, Seungryong Kim, Jungin Park, Seongheon Park, Kwanghoon Sohn

Modern data augmentation using a mixture-based technique can regularize the models from overfitting to the training data in various computer vision applications, but a proper data augmentation technique tailored for the part-based Visible-Infrared person Re-IDentification (VI-ReID) models remains unexplored.

Contrastive Learning, Data Augmentation, +1

Dual-path Adaptation from Image to Video Transformers

1 code implementation CVPR 2023 Jungin Park, Jiyoung Lee, Kwanghoon Sohn

In this paper, we efficiently transfer the surpassing representation power of the vision foundation models, such as ViT and Swin, for video understanding with only a few trainable parameters.

Action Classification, Action Recognition In Videos, +2

SimOn: A Simple Framework for Online Temporal Action Localization

1 code implementation 8 Nov 2022 Tuan N. Tang, Jungin Park, Kwonyoung Kim, Kwanghoon Sohn

In addition, the evaluation for Online Detection of Action Start (ODAS) demonstrates the effectiveness and robustness of our method in the online setting.

Temporal Action Localization

Language-free Training for Zero-shot Video Grounding

no code implementations 24 Oct 2022 Dahye Kim, Jungin Park, Jiyoung Lee, Seongheon Park, Kwanghoon Sohn

Given an untrimmed video and a language query depicting a specific temporal moment in the video, video grounding aims to localize the time interval by understanding the text and video simultaneously.

Video Grounding

PointFix: Learning to Fix Domain Bias for Robust Online Stereo Adaptation

no code implementations 27 Jul 2022 Kwonyoung Kim, Jungin Park, Jiyoung Lee, Dongbo Min, Kwanghoon Sohn

To mitigate this issue, we propose to incorporate an auxiliary point-selective network into a meta-learning framework, called PointFix, to provide a robust initialization of stereo models for online stereo adaptation.

Autonomous Driving, Meta-Learning

Probabilistic Representations for Video Contrastive Learning

no code implementations CVPR 2022 Jungin Park, Jiyoung Lee, Ig-Jae Kim, Kwanghoon Sohn

This paper presents Probabilistic Video Contrastive Learning, a self-supervised representation learning method that bridges contrastive learning with probabilistic representation.

Action Recognition, Contrastive Learning, +3

Self-balanced Learning For Domain Generalization

no code implementations 31 Aug 2021 Jin Kim, Jiyoung Lee, Jungin Park, Dongbo Min, Kwanghoon Sohn

Domain generalization aims to learn a prediction model on multi-domain source data such that the model can generalize to a target domain with unknown statistics.

Domain Generalization

Bridge to Answer: Structure-aware Graph Interaction Network for Video Question Answering

no code implementations CVPR 2021 Jungin Park, Jiyoung Lee, Kwanghoon Sohn

As a result, our method can learn the question conditioned visual representations attributed to appearance and motion that show powerful capability for video question answering.

Question Answering, Video Question Answering

Cross-Domain Grouping and Alignment for Domain Adaptive Semantic Segmentation

1 code implementation 15 Dec 2020 Minsu Kim, Sunghun Joung, Seungryong Kim, Jungin Park, Ig-Jae Kim, Kwanghoon Sohn

Existing techniques to adapt semantic segmentation networks across the source and target domains within deep convolutional neural networks (CNNs) deal with all the samples from the two domains in a global or category-aware manner.

Clustering, Domain Adaptation, +2

SumGraph: Video Summarization via Recursive Graph Modeling

no code implementations ECCV 2020 Jungin Park, Jiyoung Lee, Ig-Jae Kim, Kwanghoon Sohn

The goal of video summarization is to select keyframes that are visually diverse and can represent a whole story of an input video.

Video Summarization

Context-Aware Emotion Recognition Networks

1 code implementation ICCV 2019 Jiyoung Lee, Seungryong Kim, Sunok Kim, Jungin Park, Kwanghoon Sohn

We present deep networks for context-aware emotion recognition, called CAER-Net, that exploit not only human facial expression but also context information in a joint and boosting manner.

Emotion Classification, Emotion Recognition in Context
