no code implementations • 29 Sep 2021 • Jonghwan Mun, Minchul Shin, Gunsoo Han, Sangho Lee, Seongsu Ha, Joonseok Lee, Eun-Sol Kim
Inspired by this, we tackle video scene segmentation, the task of temporally localizing scene boundaries in a video, with a self-supervised learning framework in which we mainly focus on designing effective pretext tasks.
no code implementations • 13 Oct 2021 • Minchul Shin, Jonghwan Mun, Kyoung-Woon On, Woo-Young Kang, Gunsoo Han, Eun-Sol Kim
The VALUE (Video-And-Language Understanding Evaluation) benchmark is newly introduced to evaluate and analyze multi-modal representation learning algorithms on three video-and-language tasks: Retrieval, QA, and Captioning.
1 code implementation • 14 Jan 2022 • Jonghwan Mun, Minchul Shin, Gunsoo Han, Sangho Lee, Seongsu Ha, Joonseok Lee, Eun-Sol Kim
Inspired by this, we tackle video scene segmentation, the task of temporally localizing scene boundaries in a video, with a self-supervised learning framework in which we mainly focus on designing effective pretext tasks.
no code implementations • 23 May 2023 • Eunbi Choi, Kyoung-Woon On, Gunsoo Han, Sungwoong Kim, Daniel Wontae Nam, DaeJin Jo, Seung Eun Rho, Taehwan Kwon, Minjoon Seo
Open-domain conversation systems integrate multiple conversation skills into a single system through a modular approach.
no code implementations • 10 Oct 2023 • DaeJin Jo, Daniel Wontae Nam, Gunsoo Han, Kyoung-Woon On, Taehwan Kwon, Seungeun Rho, Sungwoong Kim
A common practice in knowledge-grounded dialogue generation is to explicitly utilize intermediate steps (e.g., web-search, memory retrieval) with modular approaches.
no code implementations • 6 Apr 2024 • Seungjae Jung, Gunsoo Han, Daniel Wontae Nam, Kyoung-Woon On
In the process of this discovery, we identified two techniques for effective alignment: reward shift and underlying distribution matching.