Search Results for author: Eun-Sol Kim

Found 15 papers, 5 papers with code

Semantic Alignment with Calibrated Similarity for Multilingual Sentence Embedding

no code implementations Findings (EMNLP) 2021 Jiyeon Ham, Eun-Sol Kim

Predicting the similarity score consists of two sub-tasks: monolingual similarity evaluation and multilingual sentence retrieval.

Semantic Similarity Semantic Textual Similarity +2
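Both sub-tasks ultimately compare sentence embeddings: monolingual similarity scores a pair of sentences, and retrieval ranks candidates in another language by the same score. A minimal sketch using cosine similarity; the embeddings below are illustrative toy vectors, not outputs of the paper's model.

```python
import numpy as np

def cosine_sim(a, b):
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def retrieve(query, candidates):
    """Return the index of the candidate embedding most similar to the query."""
    return max(range(len(candidates)), key=lambda i: cosine_sim(query, candidates[i]))

# Toy example: the second candidate points almost the same way as the query.
query = np.array([1.0, 0.0])
cands = [np.array([0.0, 1.0]), np.array([0.9, 0.1])]
print(retrieve(query, cands))  # → 1
```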

Hypergraph Transformer: Weakly-supervised Multi-hop Reasoning for Knowledge-based Visual Question Answering

1 code implementation ACL 2022 Yu-Jung Heo, Eun-Sol Kim, Woo Suk Choi, Byoung-Tak Zhang

Knowledge-based visual question answering (QA) aims to answer a question which requires visually-grounded external knowledge beyond image content itself.

Question Answering Visual Question Answering

Video-Text Representation Learning via Differentiable Weak Temporal Alignment

1 code implementation CVPR 2022 Dohwan Ko, Joonmyung Choi, Juyeon Ko, Shinyeong Noh, Kyoung-Woon On, Eun-Sol Kim, Hyunwoo J. Kim

In this paper, we propose a novel multi-modal self-supervised framework, Video-Text Temporally Weak Alignment-based Contrastive Learning (VT-TWINS), to capture significant information from noisy and weakly correlated data using a variant of Dynamic Time Warping (DTW).

Contrastive Learning Dynamic Time Warping +1
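VT-TWINS builds on a differentiable, weak variant of DTW; the sketch below shows only the standard DTW alignment cost it generalizes, computed by dynamic programming over squared frame distances. This is a classic DTW illustration, not the paper's differentiable formulation.

```python
import numpy as np

def dtw_cost(x, y):
    """DTW alignment cost between two feature sequences,
    using squared Euclidean distance between frames."""
    n, m = len(x), len(y)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.sum((x[i - 1] - y[j - 1]) ** 2)
            # Best of: insertion, deletion, match of the previous cells.
            D[i, j] = d + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return float(D[n, m])

# Toy 1-D "video" and "text" feature sequences of different lengths.
video = np.array([[0.0], [1.0], [2.0]])
text = np.array([[0.0], [2.0]])
print(dtw_cost(video, text))  # → 1.0
```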

Boundary-aware Self-supervised Learning for Video Scene Segmentation

1 code implementation 14 Jan 2022 Jonghwan Mun, Minchul Shin, Gunsoo Han, Sangho Lee, Seongsu Ha, Joonseok Lee, Eun-Sol Kim

Inspired by this, we tackle video scene segmentation, the task of temporally localizing scene boundaries in a video, with a self-supervised learning framework in which we mainly focus on designing effective pretext tasks.

Scene Segmentation Self-Supervised Learning

Winning the ICCV'2021 VALUE Challenge: Task-aware Ensemble and Transfer Learning with Visual Concepts

no code implementations 13 Oct 2021 Minchul Shin, Jonghwan Mun, Kyoung-Woon On, Woo-Young Kang, Gunsoo Han, Eun-Sol Kim

The VALUE (Video-And-Language Understanding Evaluation) benchmark is newly introduced to evaluate and analyze multi-modal representation learning algorithms on three video-and-language tasks: Retrieval, QA, and Captioning.

Representation Learning Transfer Learning

Selective Token Generation for Few-shot Language Modeling

no code implementations 29 Sep 2021 DaeJin Jo, Taehwan Kwon, Sungwoong Kim, Eun-Sol Kim

Therefore, in this work, we develop a novel additive learning algorithm based on reinforcement learning (RL) for few-shot natural language generation (NLG) tasks.

Data-to-Text Generation Language Modelling +3

Boundary-aware Pre-training for Video Scene Segmentation

no code implementations 29 Sep 2021 Jonghwan Mun, Minchul Shin, Gunsoo Han, Sangho Lee, Seongsu Ha, Joonseok Lee, Eun-Sol Kim

Inspired by this, we tackle video scene segmentation, the task of temporally localizing scene boundaries in a video, with a self-supervised learning framework in which we mainly focus on designing effective pretext tasks.

Scene Segmentation Self-Supervised Learning

HOTR: End-to-End Human-Object Interaction Detection with Transformers

1 code implementation CVPR 2021 Bumsoo Kim, Junhyun Lee, Jaewoo Kang, Eun-Sol Kim, Hyunwoo J. Kim

Human-Object Interaction (HOI) detection is the task of identifying "a set of interactions" in an image, which involves i) localization of the subject (i.e., humans) and target (i.e., objects) of the interaction, and ii) classification of the interaction labels.

Human-Object Interaction Detection object-detection +1

Spectrally Similar Graph Pooling

no code implementations 1 Jan 2021 Kyoung-Woon On, Eun-Sol Kim, Il-Jae Kwon, Sangwoong Yoon, Byoung-Tak Zhang

To further investigate the effectiveness of our proposed method, we evaluate our approach on a real-world problem, image retrieval with visual scene graphs.

Image Retrieval

Image-to-Image Retrieval by Learning Similarity between Scene Graphs

1 code implementation 29 Dec 2020 Sangwoong Yoon, Woo Young Kang, Sungwook Jeon, SeongEun Lee, Changjin Han, Jonghun Park, Eun-Sol Kim

Based on this idea, we propose a novel approach for image-to-image retrieval using scene graph similarity measured by graph neural networks.

Graph Similarity Image Retrieval +1
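The paper measures similarity between scene graphs with graph neural networks. As a toy stand-in (not the paper's model), each graph can be embedded by one round of neighbor averaging followed by mean pooling, and the embeddings compared with cosine similarity:

```python
import numpy as np

def graph_embed(node_feats, adj):
    """One message-passing step (self + mean of neighbors), then mean-pool
    the node features into a single graph embedding."""
    deg = adj.sum(axis=1, keepdims=True).clip(min=1)
    h = node_feats + (adj @ node_feats) / deg
    return h.mean(axis=0)

def graph_sim(g1, g2):
    """Cosine similarity between two (node_features, adjacency) graphs."""
    a, b = graph_embed(*g1), graph_embed(*g2)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy scene graph: two nodes connected by one edge.
feats = np.array([[1.0, 0.0], [0.0, 1.0]])
adj = np.array([[0.0, 1.0], [1.0, 0.0]])
g = (feats, adj)
print(graph_sim(g, g))  # → 1.0
```

A learned GNN would replace the fixed averaging step with trained message and readout functions, but the retrieve-by-embedding-similarity structure is the same.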

Hypergraph Attention Networks for Multimodal Learning

no code implementations CVPR 2020 Eun-Sol Kim, Woo Young Kang, Kyoung-Woon On, Yu-Jung Heo, Byoung-Tak Zhang

HANs follow this process: constructing a common semantic space from symbolic graphs of each modality, matching the semantics between sub-structures of the symbolic graphs, constructing co-attention maps between the graphs in the semantic space, and integrating the multimodal inputs using the co-attention maps to obtain the final joint representation.
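A minimal sketch of a co-attention map between two graphs' node embeddings (an illustrative formulation, not the paper's exact one): a row-wise softmax over pairwise dot products, so each node of one modality attends over the nodes of the other.

```python
import numpy as np

def co_attention(nodes_a, nodes_b):
    """Co-attention map: rows index nodes of modality A, and each row is a
    softmax distribution over the nodes of modality B."""
    scores = nodes_a @ nodes_b.T
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    e = np.exp(scores)
    return e / e.sum(axis=1, keepdims=True)

# Toy node embeddings for two modalities (2 nodes vs. 3 nodes).
na = np.array([[1.0, 0.0], [0.0, 1.0]])
nb = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
att = co_attention(na, nb)
print(att.shape)  # → (2, 3); each row sums to 1
```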

Cut-Based Graph Learning Networks to Discover Compositional Structure of Sequential Video Data

no code implementations 17 Jan 2020 Kyoung-Woon On, Eun-Sol Kim, Yu-Jung Heo, Byoung-Tak Zhang

Here, we propose Cut-Based Graph Learning Networks (CB-GLNs) for learning video data by discovering the complex compositional structures of the video.

Graph Learning Video Understanding

Compositional Structure Learning for Sequential Video Data

no code implementations 3 Jul 2019 Kyoung-Woon On, Eun-Sol Kim, Yu-Jung Heo, Byoung-Tak Zhang

However, most sequential data, such as videos, have complex temporal dependencies that imply variable-length semantic flows and their compositions, which are hard to capture with conventional methods.

Visualizing Semantic Structures of Sequential Data by Learning Temporal Dependencies

no code implementations 20 Jan 2019 Kyoung-Woon On, Eun-Sol Kim, Yu-Jung Heo, Byoung-Tak Zhang

While conventional methods for sequential learning focus on interaction between consecutive inputs, we suggest a new method which captures composite semantic flows with variable-length dependencies.
