Search Results for author: Heeseung Yun

Found 9 papers, 7 papers with code

Character Grounding and Re-Identification in Story of Videos and Text Descriptions

no code implementations • ECCV 2020 • Youngjae Yu, Jongseok Kim, Heeseung Yun, Jiwan Chung, Gunhee Kim

We address character grounding and re-identification in multiple story-based videos like movies and associated text descriptions.

Gender Prediction


Dense 2D-3D Indoor Prediction with Sound via Aligned Cross-Modal Distillation

1 code implementation • ICCV 2023 • Heeseung Yun, Joonil Na, Gunhee Kim

Sound can convey significant information for spatial reasoning in our daily lives.

3D Scene Reconstruction Depth Estimation +3


Fusing Pre-Trained Language Models With Multimodal Prompts Through Reinforcement Learning

1 code implementation • CVPR 2023 • Youngjae Yu, Jiwan Chung, Heeseung Yun, Jack Hessel, Jae Sung Park, Ximing Lu, Rowan Zellers, Prithviraj Ammanabrolu, Ronan Le Bras, Gunhee Kim, Yejin Choi

Language models are capable of commonsense reasoning: domain-specific models can learn from explicit knowledge (e.g., commonsense graphs [6], ethical norms [25]), while larger models like GPT-3 manifest broad commonsense reasoning capacity.

Language Modelling reinforcement-learning +2


Panoramic Vision Transformer for Saliency Detection in 360° Videos

1 code implementation • 19 Sep 2022 • Heeseung Yun, Sehun Lee, Gunhee Kim

360° video saliency detection is one of the challenging benchmarks for 360° video understanding, since non-negligible distortion and discontinuity occur in the projection of any format of 360° videos, and the capture-worthy viewpoint in the omnidirectional sphere is ambiguous by nature.

Saliency Prediction Video Quality Assessment +2


Multimodal Knowledge Alignment with Reinforcement Learning

1 code implementation • 25 May 2022 • Youngjae Yu, Jiwan Chung, Heeseung Yun, Jack Hessel, JaeSung Park, Ximing Lu, Prithviraj Ammanabrolu, Rowan Zellers, Ronan Le Bras, Gunhee Kim, Yejin Choi

Large language models readily adapt to novel settings, even without task-specific training data.

Audio captioning Language Modelling +3


Pano-AVQA: Grounded Audio-Visual Question Answering on 360° Videos

1 code implementation • 11 Oct 2021 • Heeseung Yun, Youngjae Yu, Wonsuk Yang, Kangil Lee, Gunhee Kim

Previous benchmark tasks for panoramic videos are still limited in evaluating the semantic understanding of audio-visual relationships or the spherical spatial properties of the surroundings.

Audio-visual Question Answering Question Answering +2


Transitional Adaptation of Pretrained Models for Visual Storytelling

no code implementations • CVPR 2021 • Youngjae Yu, Jiwan Chung, Heeseung Yun, Jongseok Kim, Gunhee Kim

In this work, we claim that a transitional adaptation task is required between pretraining and finetuning to harmonize the visual encoder and the language model for challenging downstream target tasks like visual storytelling.

Ranked #1 on Visual Storytelling on VIST (ROUGE-L metric, using extra training data)

Image Captioning Language Modelling +3


Pano-AVQA: Grounded Audio-Visual Question Answering on 360° Videos

1 code implementation • ICCV 2021 • Heeseung Yun, Youngjae Yu, Wonsuk Yang, Kangil Lee, Gunhee Kim

Previous benchmark tasks for panoramic videos are still limited in evaluating the semantic understanding of audio-visual relationships or the spherical spatial properties of the surroundings.

Audio-visual Question Answering Question Answering +2


A Mobile Robot Generating Video Summaries of Seniors' Indoor Activities

1 code implementation • 30 Jan 2019 • Chih-Yuan Yang, Heeseung Yun, Srenavis Varadaraj, Jane Yung-jen Hsu

We develop a system which generates summaries from seniors' indoor-activity videos captured by a social robot to help remote family members know their seniors' daily activities at home.

Action Recognition Human Detection +3

