Search Results for author: Jonghwan Mun

Found 12 papers, 5 papers with code

Boundary-aware Self-supervised Learning for Video Scene Segmentation

1 code implementation · 14 Jan 2022 · Jonghwan Mun, Minchul Shin, Gunsoo Han, Sangho Lee, Seongsu Ha, Joonseok Lee, Eun-Sol Kim

Inspired by this, we tackle video scene segmentation, the task of temporally localizing scene boundaries in a video, with a self-supervised learning framework in which we focus mainly on designing effective pretext tasks.

Scene Segmentation · Self-Supervised Learning

Winning the ICCV'2021 VALUE Challenge: Task-aware Ensemble and Transfer Learning with Visual Concepts

no code implementations · 13 Oct 2021 · Minchul Shin, Jonghwan Mun, Kyoung-Woon On, Woo-Young Kang, Gunsoo Han, Eun-Sol Kim

The VALUE (Video-And-Language Understanding Evaluation) benchmark is newly introduced to evaluate and analyze multi-modal representation learning algorithms on three video-and-language tasks: Retrieval, QA, and Captioning.

Benchmark · Representation Learning · +1

Boundary-aware Pre-training for Video Scene Segmentation

no code implementations · 29 Sep 2021 · Jonghwan Mun, Minchul Shin, Gunsoo Han, Sangho Lee, Seongsu Ha, Joonseok Lee, Eun-Sol Kim

Inspired by this, we tackle video scene segmentation, the task of temporally localizing scene boundaries in a video, with a self-supervised learning framework in which we focus mainly on designing effective pretext tasks.

Scene Segmentation · Self-Supervised Learning

Local-Global Video-Text Interactions for Temporal Grounding

1 code implementation · CVPR 2020 · Jonghwan Mun, Minsu Cho, Bohyung Han

This paper addresses the problem of text-to-video temporal grounding, which aims to identify the time interval in a video semantically relevant to a text query.

Towards Oracle Knowledge Distillation with Neural Architecture Search

no code implementations · 29 Nov 2019 · Minsoo Kang, Jonghwan Mun, Bohyung Han

We present a novel framework of knowledge distillation that is capable of learning powerful and efficient student models from ensemble teacher networks.

Knowledge Distillation · Neural Architecture Search

Streamlined Dense Video Captioning

1 code implementation · CVPR 2019 · Jonghwan Mun, Linjie Yang, Zhou Ren, Ning Xu, Bohyung Han

Dense video captioning is an extremely challenging task, since accurate and coherent description of events in a video requires holistic understanding of video content as well as contextual reasoning about individual events.

Dense Video Captioning

Learning to Specialize with Knowledge Distillation for Visual Question Answering

no code implementations · NeurIPS 2018 · Jonghwan Mun, Kimin Lee, Jinwoo Shin, Bohyung Han

The proposed framework is model-agnostic and applicable to tasks other than VQA, e.g., image classification with a large number of labels but few per-class examples, which is known to be difficult under existing MCL schemes.

General Classification · Knowledge Distillation · +3

Transfer Learning via Unsupervised Task Discovery for Visual Question Answering

1 code implementation · CVPR 2019 · Hyeonwoo Noh, Tae-hoon Kim, Jonghwan Mun, Bohyung Han

Specifically, we employ linguistic knowledge sources such as a structured lexical database (e.g., WordNet) and visual descriptions for unsupervised task discovery, and transfer a learned task-conditional visual classifier as an answering unit in a visual question answering model.

Question Answering · Transfer Learning · +1

Regularizing Deep Neural Networks by Noise: Its Interpretation and Optimization

no code implementations · NeurIPS 2017 · Hyeonwoo Noh, Tackgeun You, Jonghwan Mun, Bohyung Han

Overfitting is one of the most critical challenges in deep neural networks, and various regularization methods have been proposed to improve generalization performance.

Computer Vision

Text-guided Attention Model for Image Captioning

1 code implementation · 12 Dec 2016 · Jonghwan Mun, Minsu Cho, Bohyung Han

Visual attention plays an important role in understanding images and has demonstrated its effectiveness in generating natural language descriptions of images.

Benchmark · Image Captioning

MarioQA: Answering Questions by Watching Gameplay Videos

no code implementations · ICCV 2017 · Jonghwan Mun, Paul Hongsuck Seo, Ilchae Jung, Bohyung Han

To address this objective, we automatically generate a customized synthetic VideoQA dataset using Super Mario Bros. gameplay videos so that it contains events with different levels of reasoning complexity.

Question Answering · Video Question Answering
