Search Results for author: Jonghwan Mun

Found 17 papers, 8 papers with code

MarioQA: Answering Questions by Watching Gameplay Videos

no code implementations · ICCV 2017 · Jonghwan Mun, Paul Hongsuck Seo, Ilchae Jung, Bohyung Han

To address this objective, we automatically generate a customized synthetic VideoQA dataset using Super Mario Bros. gameplay videos so that it contains events with different levels of reasoning complexity.

Question Answering · Video Question Answering

Text-guided Attention Model for Image Captioning

1 code implementation · 12 Dec 2016 · Jonghwan Mun, Minsu Cho, Bohyung Han

Visual attention plays an important role in understanding images and has demonstrated its effectiveness in generating natural language descriptions of them.

Image Captioning
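The visual attention behind this line of work can be sketched as a simple soft-attention step: a query vector (standing in here for the guiding text or decoder state, an assumption for illustration) scores each image-region feature, and a softmax-weighted sum of those features gives the context vector. A generic sketch, not the paper's exact model:

```python
import numpy as np

def soft_attention(query, regions):
    # Score each region feature against the query, then take a
    # softmax-weighted sum of the region features.
    scores = regions @ query
    scores = scores - scores.max()                    # numerical stability
    weights = np.exp(scores) / np.exp(scores).sum()   # softmax over regions
    context = weights @ regions                       # attended context vector
    return context, weights

# Toy example: 3 one-hot region features; the query matches region 0.
regions = np.array([[1.0, 0.0, 0.0, 0.0],
                    [0.0, 1.0, 0.0, 0.0],
                    [0.0, 0.0, 1.0, 0.0]])
query = np.array([5.0, 0.0, 0.0, 0.0])
context, weights = soft_attention(query, regions)
```

The attention weights form a distribution over regions, so the context vector stays in the same feature space as the regions themselves.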

Regularizing Deep Neural Networks by Noise: Its Interpretation and Optimization

no code implementations · NeurIPS 2017 · Hyeonwoo Noh, Tackgeun You, Jonghwan Mun, Bohyung Han

Overfitting is one of the most critical challenges in deep neural networks, and various regularization methods have been proposed to improve generalization performance.
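Dropout is the canonical example of the noise-injection regularizers this paper analyzes; a minimal inverted-dropout sketch (a generic illustration, not the paper's formulation) looks like:

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout(x, p, training=True):
    # Inverted dropout: multiply activations by Bernoulli(1 - p) noise
    # and rescale, so expectations match between train and test time.
    if not training or p == 0.0:
        return x
    mask = rng.random(x.shape) >= p   # keep each unit with probability 1 - p
    return x * mask / (1.0 - p)
```

At test time (`training=False`) the input passes through unchanged, which is what makes the train-time rescaling necessary.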

Transfer Learning via Unsupervised Task Discovery for Visual Question Answering

1 code implementation · CVPR 2019 · Hyeonwoo Noh, Tae-hoon Kim, Jonghwan Mun, Bohyung Han

Specifically, we employ linguistic knowledge sources such as a structured lexical database (e.g., WordNet) and visual descriptions for unsupervised task discovery, and transfer a learned task-conditional visual classifier as an answering unit in a visual question answering model.

Question Answering · Transfer Learning +1

Learning to Specialize with Knowledge Distillation for Visual Question Answering

no code implementations · NeurIPS 2018 · Jonghwan Mun, Kimin Lee, Jinwoo Shin, Bohyung Han

The proposed framework is model-agnostic and applicable to tasks other than VQA, e.g., image classification with a large number of labels but few per-class examples, which is known to be difficult under existing MCL schemes.

General Classification · General Knowledge +5

Streamlined Dense Video Captioning

1 code implementation · CVPR 2019 · Jonghwan Mun, Linjie Yang, Zhou Ren, Ning Xu, Bohyung Han

Dense video captioning is an extremely challenging task since accurate and coherent description of events in a video requires holistic understanding of video contents as well as contextual reasoning of individual events.

Dense Video Captioning

Towards Oracle Knowledge Distillation with Neural Architecture Search

no code implementations · 29 Nov 2019 · Minsoo Kang, Jonghwan Mun, Bohyung Han

We present a novel framework of knowledge distillation that is capable of learning powerful and efficient student models from ensemble teacher networks.

Image Classification · Knowledge Distillation +1
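The underlying soft-target distillation objective is the standard KL divergence between temperature-softened teacher and student distributions (Hinton et al.); the sketch below is that generic loss, not this paper's exact oracle-distillation objective:

```python
import numpy as np

def softmax(z, T=1.0):
    # Temperature-softened softmax over a 1-D logit vector.
    z = np.asarray(z, dtype=float) / T
    z = z - z.max()   # numerical stability
    e = np.exp(z)
    return e / e.sum()

def distillation_loss(student_logits, teacher_logits, T=2.0):
    # KL(teacher || student) on softened distributions, scaled by T^2
    # to keep gradient magnitudes comparable across temperatures.
    p_t = softmax(teacher_logits, T)
    p_s = softmax(student_logits, T)
    return float(np.sum(p_t * (np.log(p_t) - np.log(p_s))) * T * T)
```

The loss vanishes when the student matches the teacher exactly and grows as their softened distributions diverge.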

Local-Global Video-Text Interactions for Temporal Grounding

1 code implementation · CVPR 2020 · Jonghwan Mun, Minsu Cho, Bohyung Han

This paper addresses the problem of text-to-video temporal grounding, which aims to identify the time interval in a video semantically relevant to a text query.
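Predictions for this task are typically scored by temporal intersection-over-union between the predicted and ground-truth intervals; a minimal sketch of that standard metric (not specific to this paper):

```python
def temporal_iou(pred, gt):
    # Intersection-over-union of two (start, end) time intervals.
    inter = max(0.0, min(pred[1], gt[1]) - max(pred[0], gt[0]))
    union = max(pred[1], gt[1]) - min(pred[0], gt[0])
    return inter / union if union > 0 else 0.0
```

Benchmarks usually report recall at IoU thresholds such as 0.5 or 0.7 computed with exactly this kind of overlap measure.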

Boundary-aware Pre-training for Video Scene Segmentation

no code implementations · 29 Sep 2021 · Jonghwan Mun, Minchul Shin, Gunsoo Han, Sangho Lee, Seongsu Ha, Joonseok Lee, Eun-Sol Kim

Inspired by this, we tackle video scene segmentation, the task of temporally localizing scene boundaries in a video, with a self-supervised learning framework in which we mainly focus on designing effective pretext tasks.

Scene Segmentation · Self-Supervised Learning

Winning the ICCV'2021 VALUE Challenge: Task-aware Ensemble and Transfer Learning with Visual Concepts

no code implementations · 13 Oct 2021 · Minchul Shin, Jonghwan Mun, Kyoung-Woon On, Woo-Young Kang, Gunsoo Han, Eun-Sol Kim

The VALUE (Video-And-Language Understanding Evaluation) benchmark is newly introduced to evaluate and analyze multi-modal representation learning algorithms on three video-and-language tasks: Retrieval, QA, and Captioning.

Model Optimization · Representation Learning +2

Boundary-aware Self-supervised Learning for Video Scene Segmentation

1 code implementation · 14 Jan 2022 · Jonghwan Mun, Minchul Shin, Gunsoo Han, Sangho Lee, Seongsu Ha, Joonseok Lee, Eun-Sol Kim

Inspired by this, we tackle video scene segmentation, the task of temporally localizing scene boundaries in a video, with a self-supervised learning framework in which we mainly focus on designing effective pretext tasks.

Scene Segmentation · Self-Supervised Learning

Learning to Generate Text-grounded Mask for Open-world Semantic Segmentation from Only Image-Text Pairs

1 code implementation · CVPR 2023 · Junbum Cha, Jonghwan Mun, Byungseok Roh

Existing open-world segmentation methods have shown impressive advances by employing contrastive learning (CL) to learn diverse visual concepts and transferring the learned image-level understanding to the segmentation task.

Contrastive Learning · Open Vocabulary Semantic Segmentation +4
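The image-text contrastive learning referred to here is commonly implemented as a symmetric InfoNCE loss over matched pairs. The following is a CLIP-style sketch of that generic objective, not the paper's exact loss:

```python
import numpy as np

def clip_style_loss(image_emb, text_emb, temperature=0.07):
    # Symmetric InfoNCE: each image should match its paired text
    # (the diagonal of the similarity matrix) and vice versa.
    img = image_emb / np.linalg.norm(image_emb, axis=1, keepdims=True)
    txt = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)
    logits = img @ txt.T / temperature
    n = logits.shape[0]

    def xent(l):
        # Cross-entropy with targets on the diagonal.
        l = l - l.max(axis=1, keepdims=True)
        logp = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -logp[np.arange(n), np.arange(n)].mean()

    return (xent(logits) + xent(logits.T)) / 2.0
```

Correctly aligned embeddings should score a lower loss than shuffled pairings, which is what drives the learned image-text alignment.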

Noise-aware Learning from Web-crawled Image-Text Data for Image Captioning

1 code implementation · ICCV 2023 · Wooyoung Kang, Jonghwan Mun, Sungjun Lee, Byungseok Roh

Image captioning is a straightforward task that can take advantage of large-scale web-crawled data, which provides rich knowledge about the visual world for a captioning model.

Image Captioning · Image Retrieval +1

Learning Pseudo-Labeler beyond Noun Concepts for Open-Vocabulary Object Detection

no code implementations · 4 Dec 2023 · Sunghun Kang, Junbum Cha, Jonghwan Mun, Byungseok Roh, Chang D. Yoo

Specifically, the proposed method aims to learn an arbitrary image-to-text mapping for pseudo-labeling of arbitrary concepts, named Pseudo-Labeling for Arbitrary Concepts (PLAC).

object-detection · Open Vocabulary Object Detection +2

Honeybee: Locality-enhanced Projector for Multimodal LLM

1 code implementation · 11 Dec 2023 · Junbum Cha, Wooyoung Kang, Jonghwan Mun, Byungseok Roh

In Multimodal Large Language Models (MLLMs), a visual projector plays a crucial role in bridging pre-trained vision encoders with LLMs, enabling profound visual understanding while harnessing the LLMs' robust capabilities.

Ranked #1 on Science Question Answering on ScienceQA (using extra training data)

Science Question Answering
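A visual projector of the kind discussed here can be as simple as a small MLP mapping vision-encoder tokens into the LLM embedding space; the sketch below is that common baseline with random placeholder weights, not Honeybee's locality-enhanced design:

```python
import numpy as np

def mlp_projector(vision_tokens, d_llm, hidden=None, seed=0):
    # Two-layer MLP with ReLU mapping (n_tokens, d_vis) vision features
    # to (n_tokens, d_llm); weights are random placeholders here.
    rng = np.random.default_rng(seed)
    n, d_vis = vision_tokens.shape
    hidden = hidden or d_llm
    w1 = rng.standard_normal((d_vis, hidden)) * 0.02
    w2 = rng.standard_normal((hidden, d_llm)) * 0.02
    h = np.maximum(vision_tokens @ w1, 0.0)   # hidden activation
    return h @ w2                             # tokens in LLM embedding space
```

Such a projector treats each visual token independently; locality-enhanced designs instead let nearby tokens interact (e.g., via convolution) before projection.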
