Search Results for author: Gunhee Kim

Found 76 papers, 42 papers with code

Character Grounding and Re-Identification in Story of Videos and Text Descriptions

no code implementations ECCV 2020 Youngjae Yu, Jongseok Kim, Heeseung Yun, Jiwan Chung, Gunhee Kim

We address character grounding and re-identification in multiple story-based videos like movies and associated text descriptions.

Gender Prediction

Rethinking Class Activation Mapping for Weakly Supervised Object Localization

1 code implementation ECCV 2020 Wonho Bae, Junhyug Noh, Gunhee Kim

Weakly supervised object localization (WSOL) is a task of localizing an object in an image only using image-level labels.

Weakly-Supervised Object Localization

Model-Agnostic Boundary-Adversarial Sampling for Test-Time Generalization in Few-Shot learning

1 code implementation ECCV 2020 Jaekyeom Kim, Hyoungseok Kim, Gunhee Kim

Few-shot learning is an important research problem that tackles one of the greatest challenges of machine learning: learning a new task from a limited amount of labeled data.

Few-Shot Learning Test

When Meta-Learning Meets Online and Continual Learning: A Survey

no code implementations 9 Nov 2023 Jaehyeon Son, Soochan Lee, Gunhee Kim

Over the past decade, deep neural networks have demonstrated significant success using the training scheme that involves mini-batch stochastic gradient descent on extensive datasets.

Continual Learning Meta-Learning

FANToM: A Benchmark for Stress-testing Machine Theory of Mind in Interactions

no code implementations 24 Oct 2023 Hyunwoo Kim, Melanie Sclar, Xuhui Zhou, Ronan Le Bras, Gunhee Kim, Yejin Choi, Maarten Sap

Theory of mind (ToM) evaluations currently focus on testing models using passive narratives that inherently lack interactivity.

Question Answering

Can Language Models Laugh at YouTube Short-form Videos?

1 code implementation 22 Oct 2023 Dayoon Ko, Sangho Lee, Gunhee Kim

Our ExFunTube is unique over existing datasets in that our videos cover a wide range of domains with various types of humor that necessitate a multimodal understanding of the content.

Recursion of Thought: A Divide-and-Conquer Approach to Multi-Context Reasoning with Language Models

1 code implementation 12 Jun 2023 Soochan Lee, Gunhee Kim

Generating intermediate steps, or Chain of Thought (CoT), is an effective way to significantly improve language models' (LM) multi-step reasoning capability.

KoSBi: A Dataset for Mitigating Social Bias Risks Towards Safer Large Language Model Application

1 code implementation 28 May 2023 Hwaran Lee, Seokhee Hong, Joonsuk Park, Takyoung Kim, Gunhee Kim, Jung-Woo Ha

Large language models (LLMs) learn not only natural text generation abilities but also social biases against different demographic groups from real-world data.

Language Modelling Large Language Model +1

MPCHAT: Towards Multimodal Persona-Grounded Conversation

1 code implementation 27 May 2023 Jaewoo Ahn, Yeda Song, Sangdoo Yun, Gunhee Kim

In order to build self-consistent personalized dialogue agents, previous research has mostly focused on textual persona that delivers personal facts or personalities.

Speaker Identification

Who Wrote this Code? Watermarking for Code Generation

1 code implementation 24 May 2023 Taehyun Lee, Seokhee Hong, Jaewoo Ahn, Ilgee Hong, Hwaran Lee, Sangdoo Yun, Jamin Shin, Gunhee Kim

Based on Kirchenbauer et al. (2023), we propose a new watermarking method, Selective WatErmarking via Entropy Thresholding (SWEET), that promotes "green" tokens only at positions where the token distribution has high entropy during generation, thereby preserving the correctness of the generated code.

Code Generation Text Detection

Fusing Pre-Trained Language Models With Multimodal Prompts Through Reinforcement Learning

1 code implementation CVPR 2023 Youngjae Yu, Jiwan Chung, Heeseung Yun, Jack Hessel, Jae Sung Park, Ximing Lu, Rowan Zellers, Prithviraj Ammanabrolu, Ronan Le Bras, Gunhee Kim, Yejin Choi

Language models are capable of commonsense reasoning: while domain-specific models can learn from explicit knowledge (e.g., commonsense graphs [6], ethical norms [25]), larger models like GPT-3 manifest broad commonsense reasoning capacity.

Language Modelling reinforcement-learning +2

Variational Laplace Autoencoders

no code implementations 30 Nov 2022 Yookoon Park, Chris Dongjoo Kim, Gunhee Kim

Based on the Laplace approximation of the latent variable posterior, VLAEs enhance the expressiveness of the posterior while reducing the amortization error.

Variational Inference

Panoramic Vision Transformer for Saliency Detection in 360° Videos

1 code implementation 19 Sep 2022 Heeseung Yun, Sehun Lee, Gunhee Kim

360° video saliency detection is one of the challenging benchmarks for 360° video understanding since non-negligible distortion and discontinuity occur in the projection of any format of 360° videos, and capture-worthy viewpoint in the omnidirectional sphere is ambiguous by nature.

Saliency Prediction Video Quality Assessment +2

LAVOLUTION: Measurement of Non-target Structural Displacement Calibrated by Structured Light

no code implementations 15 Sep 2022 Jongbin Won, Minhyuk Song, Gunhee Kim, Jong-Woong Park, Haemin Jeon

A jig for the four beams of structured light is designed and a corresponding alignment process is proposed.

ProsocialDialog: A Prosocial Backbone for Conversational Agents

1 code implementation 25 May 2022 Hyunwoo Kim, Youngjae Yu, Liwei Jiang, Ximing Lu, Daniel Khashabi, Gunhee Kim, Yejin Choi, Maarten Sap

With this dataset, we introduce a dialogue safety detection module, Canary, capable of generating rules of thumb (RoTs) given conversational context, and a socially-informed dialogue agent, Prost.

Dialogue Generation Dialogue Safety Prediction +2

Lipschitz-constrained Unsupervised Skill Discovery

no code implementations ICLR 2022 Seohong Park, Jongwook Choi, Jaekyeom Kim, Honglak Lee, Gunhee Kim

To address this issue, we propose Lipschitz-constrained Skill Discovery (LSD), which encourages the agent to discover more diverse, dynamic, and far-reaching skills.

Unsupervised Representation Learning via Neural Activation Coding

1 code implementation 7 Dec 2021 Yookoon Park, Sangho Lee, Gunhee Kim, David M. Blei

We argue that the deep encoder should maximize its nonlinear expressivity on the data for downstream predictors to take full advantage of its representation power.

Representation Learning Retrieval

Time Discretization-Invariant Safe Action Repetition for Policy Gradient Methods

1 code implementation NeurIPS 2021 Seohong Park, Jaekyeom Kim, Gunhee Kim

SAR can handle the stochasticity of environments by adaptively reacting to changes in states during action repetition.

Policy Gradient Methods

Continual Learning on Noisy Data Streams via Self-Purified Replay

no code implementations ICCV 2021 Chris Dongjoo Kim, Jinseo Jeong, Sangwoo Moon, Gunhee Kim

Continually learning in the real world must overcome many challenges, among which noisy labels are a common and inevitable issue.

Continual Learning Self-Supervised Learning

Pano-AVQA: Grounded Audio-Visual Question Answering on 360° Videos

1 code implementation 11 Oct 2021 Heeseung Yun, Youngjae Yu, Wonsuk Yang, Kangil Lee, Gunhee Kim

However, previous benchmark tasks for panoramic videos are still limited in evaluating the semantic understanding of audio-visual relationships or spherical spatial property in surroundings.

Audio-visual Question Answering Question Answering +1

Unsupervised Skill Discovery with Bottleneck Option Learning

1 code implementation 27 Jun 2021 Jaekyeom Kim, Seohong Park, Gunhee Kim

Having the ability to acquire inherent skills from environments without any external rewards or supervision like humans is an important problem.


Transitional Adaptation of Pretrained Models for Visual Storytelling

no code implementations CVPR 2021 Youngjae Yu, Jiwan Chung, Heeseung Yun, Jongseok Kim, Gunhee Kim

In this work, we claim that a transitional adaptation task is required between pretraining and finetuning to harmonize the visual encoder and the language model for challenging downstream target tasks like visual storytelling.

Ranked #1 on Visual Storytelling on VIST (ROUGE-L metric, using extra training data)

Image Captioning Language Modelling +2

StyleMix: Separating Content and Style for Enhanced Data Augmentation

1 code implementation CVPR 2021 Minui Hong, Jinwoo Choi, Gunhee Kim

In spite of the great success of deep neural networks for many challenging classification tasks, the learned networks are vulnerable to overfitting and adversarial attacks.

Data Augmentation

Drop-Bottleneck: Learning Discrete Compressed Representation for Noise-Robust Exploration

1 code implementation ICLR 2021 Jaekyeom Kim, Minjung Kim, Dongyeon Woo, Gunhee Kim

We propose a novel information bottleneck (IB) method named Drop-Bottleneck, which discretely drops features that are irrelevant to the target variable.

Adversarial Robustness Dimensionality Reduction

ACAV100M: Automatic Curation of Large-Scale Datasets for Audio-Visual Video Representation Learning

1 code implementation ICCV 2021 Sangho Lee, Jiwan Chung, Youngjae Yu, Gunhee Kim, Thomas Breuel, Gal Chechik, Yale Song

We demonstrate that our approach finds videos with high audio-visual correspondence and show that self-supervised models trained on our data achieve competitive performances compared to models trained on existing manually curated datasets.

Representation Learning

SEDONA: Search for Decoupled Neural Networks toward Greedy Block-wise Learning

1 code implementation ICLR 2021 Myeongjang Pyeon, Jihwan Moon, Taeyoung Hahn, Gunhee Kim

Backward locking and update locking are well-known sources of inefficiency in backpropagation that prevent layers from being updated concurrently.

Neural Architecture Search

Self-Supervised Learning of Compressed Video Representations

no code implementations ICLR 2021 Youngjae Yu, Sangho Lee, Gunhee Kim, Yale Song

We show that our approach achieves competitive performance on self-supervised learning of video representations with a considerable improvement in speed compared to the traditional methods.

Self-Supervised Learning

Pano-AVQA: Grounded Audio-Visual Question Answering on 360° Videos

1 code implementation ICCV 2021 Heeseung Yun, Youngjae Yu, Wonsuk Yang, Kangil Lee, Gunhee Kim

However, previous benchmark tasks for panoramic videos are still limited in evaluating the semantic understanding of audio-visual relationships or spherical spatial property in surroundings.

Audio-visual Question Answering Question Answering +1

Viewpoint-Agnostic Change Captioning With Cycle Consistency

1 code implementation ICCV 2021 Hoeseong Kim, Jongseok Kim, Hyungseok Lee, Hyunsung Park, Gunhee Kim

In addition, we propose a cycle consistency module that can potentially improve the performance of any change captioning networks in general by matching the composite feature of the generated caption and before image with the after image feature.

Characterizing Lookahead Dynamics of Smooth Games

no code implementations 1 Jan 2021 Junsoo Ha, Gunhee Kim

As multi-agent systems proliferate in machine learning research, games have attracted much attention as a framework to understand optimization of multiple interacting objectives.

Parameter Efficient Multimodal Transformers for Video Representation Learning

no code implementations ICLR 2021 Sangho Lee, Youngjae Yu, Gunhee Kim, Thomas Breuel, Jan Kautz, Yale Song

The recent success of Transformers in the language domain has motivated adapting it to a multimodal setting, where a new visual model is trained in tandem with an already pretrained language model.

Language Modelling Representation Learning

Augmenting Data for Sarcasm Detection with Unlabeled Conversation Context

no code implementations WS 2020 Hankyol Lee, Youngjae Yu, Gunhee Kim

We present a novel data augmentation technique, CRA (Contextual Response Augmentation), which utilizes conversational context to generate meaningful samples for training.

Data Augmentation Sarcasm Detection

Will I Sound Like Me? Improving Persona Consistency in Dialogues through Pragmatic Self-Consciousness

1 code implementation EMNLP 2020 Hyunwoo Kim, Byeongchang Kim, Gunhee Kim

Results on the Dialogue NLI (Welleck et al., 2019) and PersonaChat (Zhang et al., 2018) datasets show that our approach reduces contradiction and improves consistency of existing dialogue models.

Dialogue Generation Natural Language Inference

CurlingNet: Compositional Learning between Images and Text for Fashion IQ Data

1 code implementation 27 Mar 2020 Youngjae Yu, Seunghwan Lee, Yuncheol Choi, Gunhee Kim

In order to learn an effective image-text composition for the data in the fashion domain, our model proposes two key components as follows.

Image Retrieval

Sequential Latent Knowledge Selection for Knowledge-Grounded Dialogue

3 code implementations ICLR 2020 Byeongchang Kim, Jaewoo Ahn, Gunhee Kim

Knowledge-grounded dialogue is a task of generating an informative response based on both discourse context and external knowledge.

A Neural Dirichlet Process Mixture Model for Task-Free Continual Learning

1 code implementation ICLR 2020 Soochan Lee, Junsoo Ha, Dongsu Zhang, Gunhee Kim

Despite the growing interest in continual learning, most of its contemporary works have been studied in a rather restricted setting where tasks are clearly distinguishable, and task boundaries are known during training.

Continual Learning Image Classification +1

Self-Routing Capsule Networks

1 code implementation NeurIPS 2019 Taeyoung Hahn, Myeongjang Pyeon, Gunhee Kim

Capsule networks have recently gained a great deal of interest as a new architecture of neural networks that can be more robust to input perturbations than similar-sized CNNs.


AudioCaps: Generating Captions for Audios in The Wild

no code implementations NAACL 2019 Chris Dongjoo Kim, Byeongchang Kim, Hyunmin Lee, Gunhee Kim

We explore the problem of Audio Captioning: generating natural language description for any kind of audio in the wild, which has been surprisingly unexplored in previous research.

AudioCaps Audio captioning

IB-GAN: Disentangled Representation Learning with Information Bottleneck GAN

2 code implementations ICLR 2019 Insu Jeon, Wonkwang Lee, Gunhee Kim

The IB-GAN objective is similar to that of InfoGAN but has a crucial difference: a capacity regularization for mutual information is adopted, thanks to which the generator of IB-GAN can harness a latent representation in a disentangled and interpretable manner.


Harmonizing Maximum Likelihood with GANs for Multimodal Conditional Generation

no code implementations ICLR 2019 Soochan Lee, Junsoo Ha, Gunhee Kim

Recent advances in conditional image generation tasks, such as image-to-image translation and image inpainting, are largely attributed to the success of conditional GAN models, which are often optimized by the joint use of the GAN loss with the reconstruction loss.

Conditional Image Generation Image Inpainting +3

Discovery of Natural Language Concepts in Individual Units of CNNs

1 code implementation ICLR 2019 Seil Na, Yo Joong Choe, Dong-Hyun Lee, Gunhee Kim

Although deep convolutional networks have achieved improved performance in many natural language tasks, they have been treated as black boxes because they are difficult to interpret.

Concept Alignment General Classification +1

Abstractive Summarization of Reddit Posts with Multi-level Memory Networks

1 code implementation NAACL 2019 Byeongchang Kim, Hyunwoo Kim, Gunhee Kim

We address the problem of abstractive summarization in two directions: proposing a novel dataset and a new model.

Abstractive Text Summarization

A Joint Sequence Fusion Model for Video Question Answering and Retrieval

2 code implementations ECCV 2018 Youngjae Yu, Jongseok Kim, Gunhee Kim

We present an approach named JSFusion (Joint Sequence Fusion) that can measure semantic similarity between any pairs of multimodal sequence data (e.g., a video clip and a language sentence).

Multiple-choice Question Answering +6

Video Prediction with Appearance and Motion Conditions

no code implementations ICML 2018 Yunseok Jang, Gunhee Kim, Yale Song

Video prediction aims to generate realistic future frames by learning dynamic visual patterns.

Video Prediction

Improving Occlusion and Hard Negative Handling for Single-Stage Pedestrian Detectors

no code implementations CVPR 2018 Junhyug Noh, Soochan Lee, Beomsu Kim, Gunhee Kim

We propose methods of addressing two critical issues of pedestrian detection: (i) occlusion of target objects as false negative failure, and (ii) confusion with hard negative examples like vertical structures as false positive failure.

Occlusion Handling Pedestrian Detection

A Hierarchical Latent Structure for Variational Conversation Modeling

4 code implementations NAACL 2018 Yookoon Park, Jaemin Cho, Gunhee Kim

To solve the degeneration problem, we propose a novel model named Variational Hierarchical Conversation RNNs (VHCR), involving two key ideas of (1) using a hierarchical structure of latent variables, and (2) exploiting an utterance drop regularization.

Memorization Precedes Generation: Learning Unsupervised GANs with Memory Networks

1 code implementation ICLR 2018 Youngjin Kim, Minjung Kim, Gunhee Kim

We propose an approach to address two issues that commonly occur during training of unsupervised GANs.


A Deep Ranking Model for Spatio-Temporal Highlight Detection from a 360 Video

no code implementations 31 Jan 2018 Youngjae Yu, Sang-ho Lee, Joonil Na, Jaeyun Kang, Gunhee Kim

We address the problem of highlight detection from a 360 degree video by summarizing it both spatially and temporally.

Highlight Detection

A Read-Write Memory Network for Movie Story Understanding

1 code implementation ICCV 2017 Seil Na, Sang-ho Lee, Ji-Sung Kim, Gunhee Kim

We propose a novel memory network model named Read-Write Memory Network (RWMN) to perform question answering tasks for large-scale, multimodal movie story understanding.

Video Story QA

Supervising Neural Attention Models for Video Captioning by Human Gaze Data

no code implementations CVPR 2017 Youngjae Yu, Jongwook Choi, Yeonhwa Kim, Kyung Yoo, Sang-Hun Lee, Gunhee Kim

The attention mechanisms in deep neural networks are inspired by human attention, which sequentially focuses on the most relevant parts of the information over time to generate prediction output.

Descriptive Gaze Prediction +1

Poseidon: A System Architecture for Efficient GPU-based Deep Learning on Multiple Machines

no code implementations 19 Dec 2015 Hao Zhang, Zhiting Hu, Jinliang Wei, Pengtao Xie, Gunhee Kim, Qirong Ho, Eric Xing

To investigate how to adapt existing frameworks to efficiently support distributed GPUs, we propose Poseidon, a scalable system architecture for distributed inter-machine communication in existing DL frameworks.

Object Recognition

Expressing an Image Stream with a Sequence of Natural Sentences

1 code implementation NeurIPS 2015 Cesc C. Park, Gunhee Kim

We propose an approach for generating a sequence of natural sentences for an image stream.

Storyline Representation of Egocentric Videos With an Applications to Story-Based Search

no code implementations ICCV 2015 Bo Xiong, Gunhee Kim, Leonid Sigal

To address this, we propose a storyline representation that expresses an egocentric video as a set of story elements, jointly inferred through MRF inference, comprising actors, locations, supporting objects and events, depicted on a timeline.

Ranking and Retrieval of Image Sequences From Multiple Paragraph Queries

no code implementations CVPR 2015 Gunhee Kim, Seungwhan Moon, Leonid Sigal

While most previous work has dealt with the relations between a natural language sentence and an image or a video, our work extends to the relations between paragraphs and image sequences.


Joint Photo Stream and Blog Post Summarization and Exploration

no code implementations CVPR 2015 Gunhee Kim, Seungwhan Moon, Leonid Sigal

We alternate between solving the two coupled latent SVM problems, by first fixing the summarization and solving for the alignment from blog images to photo streams and vice versa.

Transfer Learning

Joint Summarization of Large-scale Collections of Web Images and Videos for Storyline Reconstruction

no code implementations CVPR 2014 Gunhee Kim, Leonid Sigal, Eric P. Xing

The reconstruction of storyline graphs is formulated as the inference of sparse time-varying directed graphs from a set of photo streams with assistance of videos.

Video Summarization

Reconstructing Storyline Graphs for Image Recommendation from Web Community Photos

no code implementations CVPR 2014 Gunhee Kim, Eric P. Xing

In this paper, we investigate an approach for reconstructing storyline graphs from large-scale collections of Internet images, and optionally other side information such as friendship graphs.

Jointly Aligning and Segmenting Multiple Web Photo Streams for the Inference of Collective Photo Storylines

no code implementations CVPR 2013 Gunhee Kim, Eric P. Xing

To this end, we design a scalable message-passing based optimization framework to jointly achieve both tasks for the whole input image set at once.

Image Segmentation Semantic Segmentation

Unsupervised Detection of Regions of Interest Using Iterative Link Analysis

no code implementations NeurIPS 2009 Gunhee Kim, Antonio Torralba

This paper proposes a fast and scalable alternating optimization technique to detect regions of interest (ROIs) in cluttered Web images without labels.

