Search Results for author: Byoung-Tak Zhang

Found 78 papers, 24 papers with code

Scene Graph Parsing via Abstract Meaning Representation in Pre-trained Language Models

no code implementations • NAACL (DLG4NLP) 2022 • Woo Suk Choi, Yu-Jung Heo, Dharani Punithan, Byoung-Tak Zhang

In this work, we propose applying abstract meaning representation (AMR) based semantic parsing models to parse textual descriptions of a visual scene into scene graphs; to the best of our knowledge, this is the first work to do so.

AMR Parsing Dependency Parsing

Language-agnostic Semantic Consistent Text-to-Image Generation

no code implementations • MML (ACL) 2022 • SeongJun Jung, Woo Suk Choi, SeongHo Choi, Byoung-Tak Zhang

Recent GAN-based text-to-image generation models have advanced to the point that they can generate photo-realistic images that semantically match their descriptions.

Generative Adversarial Network Multi-lingual Text-to-Image Generation +2

Devil’s Advocate: Novel Boosting Ensemble Method from Psychological Findings for Text Classification

1 code implementation • Findings (EMNLP) 2021 • Hwiyeol Jo, Jaeseo Lim, Byoung-Tak Zhang

We present a new form of ensemble method, Devil’s Advocate, which uses a deliberately dissenting model to force other submodels within the ensemble to better collaborate.

text-classification Text Classification

Continual Vision-and-Language Navigation

no code implementations • 22 Mar 2024 • Seongjun Jeong, Gi-Cheon Kang, SeongHo Choi, Joochan Kim, Byoung-Tak Zhang

For the training and evaluation of CVLN agents, we re-arrange existing VLN datasets to propose two datasets: CVLN-I, focused on navigation via initial-instruction interpretation, and CVLN-D, aimed at navigation through dialogue with other agents.

Continual Learning Navigate +1

Unveiling the Significance of Toddler-Inspired Reward Transition in Goal-Oriented Reinforcement Learning

no code implementations • 11 Mar 2024 • Junseok Park, Yoonsung Kim, Hee Bin Yoo, Min Whoo Lee, Kibeom Kim, Won-Seok Choi, Minsu Lee, Byoung-Tak Zhang

Toddlers evolve from free exploration with sparse feedback to exploiting prior experiences for goal-directed learning with denser rewards.

Reinforcement Learning (RL)

Multimodal Anomaly Detection based on Deep Auto-Encoder for Object Slip Perception of Mobile Manipulation Robots

no code implementations • 6 Mar 2024 • Youngjae Yoo, Chung-Yeon Lee, Byoung-Tak Zhang

The experimental results verified that the proposed framework reliably detects anomalies in object slip situations despite various object types and robot behaviors, and visual and auditory noise in the environment.

Anomaly Detection Object

DUEL: Duplicate Elimination on Active Memory for Self-Supervised Class-Imbalanced Learning

no code implementations • 14 Feb 2024 • Won-Seok Choi, Hyundo Lee, Dong-Sig Han, Junseok Park, Heeyeon Koo, Byoung-Tak Zhang

Recent machine learning algorithms have been developed using well-curated datasets, which often require substantial cost and resources.

Visual Hindsight Self-Imitation Learning for Interactive Navigation

no code implementations • 5 Dec 2023 • Kibeom Kim, Kisung Shin, Min Whoo Lee, Moonhoen Lee, Minsu Lee, Byoung-Tak Zhang

Interactive visual navigation tasks, which involve following instructions to reach and interact with specific targets, are challenging not only because successful experiences are very rare but also because the complex visual inputs require a substantial number of samples.

Imitation Learning Visual Navigation

Neural Collage Transfer: Artistic Reconstruction via Material Manipulation

1 code implementation • ICCV 2023 • Ganghun Lee, Minji Kim, Yunsu Lee, Minsu Lee, Byoung-Tak Zhang

Collage is a creative art form that uses diverse material scraps as a base unit to compose a single image.

PGA: Personalizing Grasping Agents with Single Human-Robot Interaction

1 code implementation • 19 Oct 2023 • Junghyun Kim, Gi-Cheon Kang, Jaein Kim, Seoyun Yang, Minjoon Jung, Byoung-Tak Zhang

Based on the acquired information, PGA pseudo-labels objects in the Reminiscence by our proposed label propagation algorithm.

Object Robotic Grasping

PROGrasp: Pragmatic Human-Robot Communication for Object Grasping

1 code implementation • 14 Sep 2023 • Gi-Cheon Kang, Junghyun Kim, Jaein Kim, Byoung-Tak Zhang

The robot should then identify the target object by interacting with a human user.

Object Object Discovery +1

GVCCI: Lifelong Learning of Visual Grounding for Language-Guided Robotic Manipulation

1 code implementation • 12 Jul 2023 • Junghyun Kim, Gi-Cheon Kang, Jaein Kim, Suyeon Shin, Byoung-Tak Zhang

Furthermore, the qualitative analysis shows that the unadapted VG model often fails to find correct objects due to a strong bias learned from the pre-training data.

Object Detection Visual Grounding

EXOT: Exit-aware Object Tracker for Safe Robotic Manipulation of Moving Object

1 code implementation • 8 Jun 2023 • Hyunseo Kim, Hye Jung Yoon, Minji Kim, Dong-Sig Han, Byoung-Tak Zhang

We evaluate our method on the first-person video benchmark dataset, TREK-150, and on the custom dataset, RMOT-223, that we collect from the UR5e robot.

Object Object Recognition

Overcoming Weak Visual-Textual Alignment for Video Moment Retrieval

1 code implementation • 5 Jun 2023 • Minjoon Jung, Youwon Jang, SeongHo Choi, Joochan Kim, Jin-Hwa Kim, Byoung-Tak Zhang

Video moment retrieval (VMR) identifies a specific moment in an untrimmed video for a given natural language query.

Ranked #6 on Moment Retrieval on Charades-STA

Moment Retrieval Natural Language Moment Retrieval +1

L-SA: Learning Under-Explored Targets in Multi-Target Reinforcement Learning

no code implementations • 23 May 2023 • Kibeom Kim, Hyundo Lee, Min Whoo Lee, Moonheon Lee, Minsu Lee, Byoung-Tak Zhang

Tasks that involve interaction with various targets are called multi-target tasks.

General Reinforcement Learning reinforcement-learning +1

Learning Geometry-aware Representations by Sketching

no code implementations • CVPR 2023 • Hyundo Lee, Inwoo Hwang, Hyunsung Go, Won-Seok Choi, Kibeom Kim, Byoung-Tak Zhang

Our method, coined Learning by Sketching (LBS), learns to convert an image into a set of colored strokes that explicitly incorporate the geometric information of the scene in a single inference step without requiring a sketch dataset.

Attribute Semantic Similarity +1

SelecMix: Debiased Learning by Contradicting-pair Sampling

1 code implementation • 4 Nov 2022 • Inwoo Hwang, Sangjun Lee, Yunhyeok Kwak, Seong Joon Oh, Damien Teney, Jin-Hwa Kim, Byoung-Tak Zhang

Experiments on standard benchmarks demonstrate the effectiveness of the method, in particular when label noise complicates the identification of bias-conflicting examples.

DUEL: Adaptive Duplicate Elimination on Working Memory for Self-Supervised Learning

no code implementations • 31 Oct 2022 • Won-Seok Choi, Dong-Sig Han, Hyundo Lee, Junseok Park, Byoung-Tak Zhang

In Self-Supervised Learning (SSL), frequent collisions, in which target data and its negative samples share the same class, are known to decrease performance.

Self-Supervised Learning

Modal-specific Pseudo Query Generation for Video Corpus Moment Retrieval

1 code implementation • 23 Oct 2022 • Minjoon Jung, SeongHo Choi, Joochan Kim, Jin-Hwa Kim, Byoung-Tak Zhang

Video corpus moment retrieval (VCMR) is the task of retrieving the most relevant video moment from a large video corpus using a natural language query.

Ranked #2 on Video Corpus Moment Retrieval on TVR

Moment Retrieval Multimodal Reasoning +3

Robust Imitation via Mirror Descent Inverse Reinforcement Learning

no code implementations • 20 Oct 2022 • Dong-Sig Han, Hyunseo Kim, Hyundo Lee, Je-Hwan Ryu, Byoung-Tak Zhang

Recently, adversarial imitation learning has shown a scalable reward acquisition method for inverse reinforcement learning (IRL) problems.

Density Estimation Imitation Learning +2

SGRAM: Improving Scene Graph Parsing via Abstract Meaning Representation

no code implementations • 17 Oct 2022 • Woo Suk Choi, Yu-Jung Heo, Byoung-Tak Zhang

To this end, we design a simple yet effective two-stage scene graph parsing framework utilizing abstract meaning representation, SGRAM (Scene GRaph parsing via Abstract Meaning representation): 1) transforming a textual description of an image into an AMR graph (Text-to-AMR) and 2) encoding the AMR graph into a Transformer-based language model to generate a scene graph (AMR-to-SG).

Dependency Parsing Graph Generation +5

Learning to Write with Coherence From Negative Examples

no code implementations • 22 Sep 2022 • Seonil Son, Jaeseo Lim, Youwon Jang, Jaeyoung Lee, Byoung-Tak Zhang

We compare our approach with Unlikelihood (UL) training in a text continuation task on commonsense natural language inference (NLI) corpora to show which method better models the coherence by avoiding unlikely continuations.

Natural Language Inference Sentence +1

On the Importance of Critical Period in Multi-stage Reinforcement Learning

no code implementations • 9 Aug 2022 • Junseok Park, Inwoo Hwang, Min Whoo Lee, Hyunseok Oh, Minsu Lee, Youngki Lee, Byoung-Tak Zhang

The initial years of an infant's life are known as the critical period, during which the overall development of learning performance is significantly impacted due to neural plasticity.

reinforcement-learning Reinforcement Learning (RL)

From Scratch to Sketch: Deep Decoupled Hierarchical Reinforcement Learning for Robotic Sketching Agent

1 code implementation • 9 Aug 2022 • Ganghun Lee, Minji Kim, Minsu Lee, Byoung-Tak Zhang

We present an automated learning framework for a robotic sketching agent that is capable of learning stroke-based rendering and motor control simultaneously.

Hierarchical Reinforcement Learning reinforcement-learning +1

Cross-Modal Alignment Learning of Vision-Language Conceptual Systems

no code implementations • 31 Jul 2022 • Taehyeong Kim, Hyeonseop Song, Byoung-Tak Zhang

Additionally, we also propose an aligned cross-modal representation learning method that learns semantic representations of visual objects and words in a self-supervised manner based on the cross-modal relational graph networks.

Representation Learning Zero-Shot Learning

The Dialog Must Go On: Improving Visual Dialog via Generative Self-Training

2 code implementations • CVPR 2023 • Gi-Cheon Kang, Sungdong Kim, Jin-Hwa Kim, Donghyun Kwak, Byoung-Tak Zhang

As a result, GST scales the amount of training data up to an order of magnitude that of VisDial (1.2M to 12.9M QA data).

Conditional Text Generation Out-of-Distribution Detection +1

Hypergraph Transformer: Weakly-supervised Multi-hop Reasoning for Knowledge-based Visual Question Answering

1 code implementation • ACL 2022 • Yu-Jung Heo, Eun-Sol Kim, Woo Suk Choi, Byoung-Tak Zhang

Knowledge-based visual question answering (QA) aims to answer a question which requires visually-grounded external knowledge beyond image content itself.

Question Answering Visual Question Answering

Toddler-Guidance Learning: Impacts of Critical Period on Multimodal AI Agents

no code implementations • 12 Jan 2022 • Junseok Park, Kwanyoung Park, Hyunseok Oh, Ganghun Lee, Minsu Lee, Youngki Lee, Byoung-Tak Zhang

To validate this hypothesis, we adapt this notion of critical periods to learning in AI agents and investigate the critical period in the virtual environment for AI agents.

Reinforcement Learning (RL) Transfer Learning

Smooth-Swap: A Simple Enhancement for Face-Swapping with Smoothness

no code implementations • CVPR 2022 • Jiseob Kim, Jihoon Lee, Byoung-Tak Zhang

Face-swapping models have been drawing attention for their compelling generation quality, but their complex architectures and loss functions often require careful tuning for successful training.

Contrastive Learning Face Swapping

Goal-Aware Cross-Entropy for Multi-Target Reinforcement Learning

1 code implementation • NeurIPS 2021 • Kibeom Kim, Min Whoo Lee, Yoonsung Kim, Je-Hwan Ryu, Minsu Lee, Byoung-Tak Zhang

Learning in a multi-target environment without prior knowledge about the targets requires a large number of samples and makes generalization difficult.

reinforcement-learning Reinforcement Learning (RL) +1

Toward a Human-Level Video Understanding Intelligence

no code implementations • 8 Oct 2021 • Yu-Jung Heo, Minsu Lee, SeongHo Choi, Woo Suk Choi, Minjung Shin, Minjoon Jung, Jeh-Kwang Ryu, Byoung-Tak Zhang

In this paper, we propose the Video Turing Test to provide effective and practical assessments of video understanding intelligence as well as human-likeness evaluation of AI agents.

Video Understanding

Mounting Video Metadata on Transformer-based Language Model for Open-ended Video Question Answering

no code implementations • 11 Aug 2021 • Donggeon Lee, SeongHo Choi, Youwon Jang, Byoung-Tak Zhang

In this paper, we challenge the existing multiple-choice video question answering by changing it to open-ended video question answering.

Language Modelling Multiple-choice +2

CogME: A Novel Evaluation Metric for Video Understanding Intelligence

no code implementations • 21 Jul 2021 • Minjung Shin, Jeonghoon Kim, SeongHo Choi, Yu-Jung Heo, Donghyun Kim, Minsu Lee, Byoung-Tak Zhang, Jeh-Kwang Ryu

Then we propose a top-down evaluation system for VideoQA, based on the cognitive process of humans and story elements: Cognitive Modules for Evaluation (CogME).

Question Answering Sentence +2

Attend What You Need: Motion-Appearance Synergistic Networks for Video Question Answering

1 code implementation • ACL 2021 • Ahjeong Seo, Gi-Cheon Kang, Joonhan Park, Byoung-Tak Zhang

MASN consists of a motion module, an appearance module, and a motion-appearance fusion module.

Question Answering Video Question Answering

M2FN: Multi-step Modality Fusion for Advertisement Image Assessment

no code implementations • 31 Jan 2021 • Kyung-Wha Park, Jung-Woo Ha, Junghoon Lee, Sunyoung Kwon, Kyung-Min Kim, Byoung-Tak Zhang

Assessing advertisements, specifically on the basis of user preferences and ad quality, is crucial to the marketing industry.

Marketing

Learning task-agnostic representation via toddler-inspired learning

no code implementations • 27 Jan 2021 • Kwanyoung Park, Junseok Park, Hyunseok Oh, Byoung-Tak Zhang, Youngki Lee

One of the inherent limitations of current AI systems, stemming from the passive learning mechanisms (e.g., supervised learning), is that they perform well on labeled datasets but cannot deduce knowledge on their own.

Image Classification Object Localization

Unbiased learning with State-Conditioned Rewards in Adversarial Imitation Learning

no code implementations • 1 Jan 2021 • Dong-Sig Han, Hyunseo Kim, Hyundo Lee, Je-Hwan Ryu, Byoung-Tak Zhang

The formulation draws a strong connection between adversarial learning and energy-based reinforcement learning; thus, the architecture is capable of recovering a reward function that induces a multi-modal policy.

Continuous Control Imitation Learning +2

Ruminating Word Representations with Random Noise Masking

no code implementations • 1 Jan 2021 • Hwiyeol Jo, Byoung-Tak Zhang

Through the re-training process, some of the noise can be compensated for, and the rest can be utilized to learn better representations.

text-classification Text Classification +1

Deep Quotient Manifold Modeling

no code implementations • 1 Jan 2021 • Jiseob Kim, Seungjae Jung, Hyundo Lee, Byoung-Tak Zhang

One of the difficulties in modeling real-world data is their complex multi-manifold structure due to discrete features.

ColdExpand: Semi-Supervised Graph Learning in Cold Start

no code implementations • 1 Jan 2021 • Il-Jae Kwon, Kyoung-Woon On, Dong-Geon Lee, Byoung-Tak Zhang

Most real-world graphs are dynamic and eventually face the cold start problem.

Graph Learning Link Prediction +1

Spectrally Similar Graph Pooling

no code implementations • 1 Jan 2021 • Kyoung-Woon On, Eun-Sol Kim, Il-Jae Kwon, Sangwoong Yoon, Byoung-Tak Zhang

To further investigate the effectiveness of our proposed method, we evaluate our approach on a real-world problem, image retrieval with visual scene graphs.

Image Retrieval Retrieval

Message Passing Adaptive Resonance Theory for Online Active Semi-supervised Learning

no code implementations • 2 Dec 2020 • Taehyeong Kim, Injune Hwang, Hyundo Lee, Hyunseo Kim, Won-Seok Choi, Joseph J. Lim, Byoung-Tak Zhang

Active learning is widely used to reduce labeling effort and training time by repeatedly querying only the most beneficial samples from unlabeled data.

Active Learning

Human-Like Active Learning: Machines Simulating the Human Learning Process

no code implementations • 7 Nov 2020 • Jaeseo Lim, Hwiyeol Jo, Byoung-Tak Zhang, Jooyong Park

In the end, we show not only that we can build a better machine training framework from the human experiment results, but also that the results of the human experiment can be empirically confirmed through imitated machine experiments: human-like active learning has a crucial effect on learning performance.

Active Learning Knowledge Distillation

Co-attentional Transformers for Story-Based Video Understanding

no code implementations • 27 Oct 2020 • Björn Bebensee, Byoung-Tak Zhang

Inspired by recent trends in vision and language learning, we explore applications of attention mechanisms for visio-lingual fusion within an application to story-based video understanding.

Question Answering Video Question Answering +1

Toward General Scene Graph: Integration of Visual Semantic Knowledge with Entity Synset Alignment

no code implementations • WS 2020 • Woo Suk Choi, Kyoung-Woon On, Yu-Jung Heo, Byoung-Tak Zhang

In the experiment, the integrated scene graph is applied to the image-caption retrieval task as a downstream task.

Retrieval

Pattern Denoising in Molecular Associative Memory using Pairwise Markov Random Field Models

no code implementations • 28 May 2020 • Dharani Punithan, Byoung-Tak Zhang

We propose an in silico molecular associative memory model for pattern learning, storage and denoising using a Pairwise Markov Random Field (PMRF) model.

Denoising

DramaQA: Character-Centered Video Story Understanding with Hierarchical QA

1 code implementation • 7 May 2020 • Seong-Ho Choi, Kyoung-Woon On, Yu-Jung Heo, Ahjeong Seo, Youwon Jang, Minsu Lee, Byoung-Tak Zhang

Despite recent progress in computer vision and natural language processing, developing a machine that can understand video stories is still hard to achieve due to the intrinsic difficulty of video stories.

Question Answering Video Question Answering +1

Reasoning Visual Dialog with Sparse Graph Learning and Knowledge Transfer

1 code implementation • Findings (EMNLP) 2021 • Gi-Cheon Kang, Junseok Park, Hwaran Lee, Byoung-Tak Zhang, Jin-Hwa Kim

Visual dialog is a task of answering a sequence of questions grounded in an image using the previous dialog history as context.

Graph Learning Graph structure learning +2

Cut-Based Graph Learning Networks to Discover Compositional Structure of Sequential Video Data

no code implementations • 17 Jan 2020 • Kyoung-Woon On, Eun-Sol Kim, Yu-Jung Heo, Byoung-Tak Zhang

Here, we propose Cut-Based Graph Learning Networks (CB-GLNs) for learning video data by discovering these complex structures of the video.

Graph Learning Video Understanding

Ruminating Word Representations with Random Noised Masker

no code implementations • 8 Nov 2019 • Hwiyeol Jo, Byoung-Tak Zhang

Next, we gradually add random noises to the word representations and repeat the training process from scratch, but initialize with the noised word representations.

text-classification Text Classification +1

Which Ads to Show? Advertisement Image Assessment with Auxiliary Information via Multi-step Modality Fusion

no code implementations • 6 Oct 2019 • Kyung-Wha Park, Junghoon Lee, Sunyoung Kwon, Jung-Woo Ha, Kyung-Min Kim, Byoung-Tak Zhang

Despite the crucial influence of image quality, auxiliary information of ad images such as tags and target subjects can also determine image preference.

Manifold Learning and Alignment with Generative Adversarial Networks

no code implementations • 25 Sep 2019 • Jiseob Kim, Seungjae Jung, Hyundo Lee, Byoung-Tak Zhang

We present a generative adversarial network (GAN) that conducts manifold learning and alignment (MLA): a task to learn the multi-manifold structure underlying data and to align those manifolds without any correspondence information.

Disentanglement Generative Adversarial Network

Discriminative Variational Autoencoder for Continual Learning with Generative Replay

no code implementations • 25 Sep 2019 • Woo-Young Kang, Cheol-Ho Han, Byoung-Tak Zhang

Generative replay (GR) is a method to alleviate catastrophic forgetting in continual learning (CL) by generating previous task data and learning them together with the data from new tasks.

Continual Learning Permuted-MNIST +2

Compositional Structure Learning for Sequential Video Data

no code implementations • 3 Jul 2019 • Kyoung-Woon On, Eun-Sol Kim, Yu-Jung Heo, Byoung-Tak Zhang

However, most sequential data, such as videos, have complex temporal dependencies that imply variable-length semantic flows and their compositions, which are hard for conventional methods to capture.

Encoder-Powered Generative Adversarial Networks

no code implementations • 3 Jun 2019 • Jiseob Kim, Seungjae Jung, Hyundo Lee, Byoung-Tak Zhang

We present an encoder-powered generative adversarial network (EncGAN) that is able to learn both the multi-manifold structure and the abstract features of data.

Generative Adversarial Network Style Transfer

Simulating Problem Difficulty in Arithmetic Cognition Through Dynamic Connectionist Models

no code implementations • 9 May 2019 • Sungjae Cho, Jaeseo Lim, Chris Hickey, Jung Ae Park, Byoung-Tak Zhang

Problem difficulty was operationalized by the number of carries involved in solving a given problem.

Constructing Hierarchical Q&A Datasets for Video Story Understanding

no code implementations • 1 Apr 2019 • Yu-Jung Heo, Kyoung-Woon On, SeongHo Choi, Jaeseo Lim, Jinah Kim, Jeh-Kwang Ryu, Byung-Chull Bae, Byoung-Tak Zhang

Video understanding is emerging as a new paradigm for studying human-like AI.

Video Understanding

Dual Attention Networks for Visual Reference Resolution in Visual Dialog

2 code implementations • IJCNLP 2019 • Gi-Cheon Kang, Jaeseo Lim, Byoung-Tak Zhang

Specifically, the REFER module learns latent relationships between a given question and a dialog history by employing a self-attention mechanism.

Ranked #2 on Visual Dialog on VisDial v0.9 val

Question Answering Visual Dialog +2

Visualizing Semantic Structures of Sequential Data by Learning Temporal Dependencies

no code implementations • 20 Jan 2019 • Kyoung-Woon On, Eun-Sol Kim, Yu-Jung Heo, Byoung-Tak Zhang

While conventional methods for sequential learning focus on interaction between consecutive inputs, we suggest a new method which captures composite semantic flows with variable-length dependencies.

Data Interpolations in Deep Generative Models under Non-Simply-Connected Manifold Topology

no code implementations • 20 Jan 2019 • Jiseob Kim, Byoung-Tak Zhang

Exploiting the deep generative model's remarkable ability to learn the data-manifold structure, some recent studies have proposed a geometric data interpolation method based on the geodesic curves on the learned data-manifold.

Multimodal Dual Attention Memory for Video Story Question Answering

no code implementations • ECCV 2018 • Kyung-Min Kim, Seong-Ho Choi, Jin-Hwa Kim, Byoung-Tak Zhang

We confirm the best performance of the dual attention mechanism combined with late fusion by ablation studies.

Question Answering

GLAC Net: GLocal Attention Cascading Networks for Multi-image Cued Story Generation

2 code implementations • 28 May 2018 • Taehyeong Kim, Min-Oh Heo, Seonil Son, Kyoung-Wha Park, Byoung-Tak Zhang

The task of multi-image cued story generation, such as the visual storytelling dataset (VIST) challenge, is to compose multiple coherent sentences from a given sequence of images.

Ranked #30 on Visual Storytelling on VIST (METEOR metric)

Sentence Visual Storytelling

Bilinear Attention Networks

8 code implementations • NeurIPS 2018 • Jin-Hwa Kim, Jaehyun Jun, Byoung-Tak Zhang

In this paper, we propose bilinear attention networks (BAN) that find bilinear attention distributions to utilize given vision-language information seamlessly.

Ranked #11 on Phrase Grounding on Flickr30k Entities Test

Visual Question Answering

Answerer in Questioner's Mind: Information Theoretic Approach to Goal-Oriented Visual Dialog

1 code implementation • NeurIPS 2018 • Sang-Woo Lee, Yu-Jung Heo, Byoung-Tak Zhang

Goal-oriented dialogue tasks occur when a questioner asks an action-oriented question and an answerer responds with the intent of letting the questioner know a correct action to take.

Goal-Oriented Dialog Visual Dialog

Understanding Local Minima in Neural Networks by Loss Surface Decomposition

no code implementations • ICLR 2018 • Hanock Kwak, Byoung-Tak Zhang

The parameter domain of the loss surface can be decomposed into regions in which activation values (zero or one for rectified linear units) are consistent.

Visual Explanations from Hadamard Product in Multimodal Deep Networks

no code implementations • 18 Dec 2017 • Jin-Hwa Kim, Byoung-Tak Zhang

Kim et al. (2016) show that the Hadamard product in multimodal deep networks, which is well-known for the joint function of visual question answering tasks, implicitly performs an attentional mechanism for visual inputs.

Question Answering Visual Question Answering

CoDraw: Collaborative Drawing as a Testbed for Grounded Goal-driven Communication

2 code implementations • ACL 2019 • Jin-Hwa Kim, Nikita Kitaev, Xinlei Chen, Marcus Rohrbach, Byoung-Tak Zhang, Yuandong Tian, Dhruv Batra, Devi Parikh

The game involves two players: a Teller and a Drawer.

Imitation Learning

Multi-focus Attention Network for Efficient Deep Reinforcement Learning

no code implementations • 13 Dec 2017 • Jinyoung Choi, Beom-Jin Lee, Byoung-Tak Zhang

In multi-agent cooperative task experiments, our model shows 20% faster learning than the existing state-of-the-art model.

reinforcement-learning Reinforcement Learning (RL)

DeepStory: Video Story QA by Deep Embedded Memory Networks

no code implementations • 4 Jul 2017 • Kyung-Min Kim, Min-Oh Heo, Seong-Ho Choi, Byoung-Tak Zhang

This is mainly due to 1) the reconstruction of video stories in a scene-dialogue combined form that utilizes the latent embedding and 2) attention.

Question Answering Video Story QA

Overcoming Catastrophic Forgetting by Incremental Moment Matching

1 code implementation • NeurIPS 2017 • Sang-Woo Lee, Jin-Hwa Kim, Jaehyun Jun, Jung-Woo Ha, Byoung-Tak Zhang

Catastrophic forgetting is a problem in which a neural network loses the information of the first task after training on the second task.

Transfer Learning

Micro-Objective Learning: Accelerating Deep Reinforcement Learning through the Discovery of Continuous Subgoals

no code implementations • 11 Mar 2017 • Sungtae Lee, Sang-Woo Lee, Jinyoung Choi, Dong-Hyun Kwak, Byoung-Tak Zhang

To solve this issue, the subgoal and option frameworks have been proposed.

Game of Go Montezuma's Revenge +2

Ways of Conditioning Generative Adversarial Networks

no code implementations • 4 Nov 2016 • Hanock Kwak, Byoung-Tak Zhang

GANs are generative models whose random samples realistically reflect natural images.

Hadamard Product for Low-rank Bilinear Pooling

8 code implementations • 14 Oct 2016 • Jin-Hwa Kim, Kyoung-Woon On, Woosang Lim, Jeonghee Kim, Jung-Woo Ha, Byoung-Tak Zhang

Bilinear models provide rich representations compared with linear models.

Visual Question Answering

Human Body Orientation Estimation using Convolutional Neural Network

no code implementations • 7 Sep 2016 • Jinyoung Choi, Beom-Jin Lee, Byoung-Tak Zhang

However, in most of the service robot applications, the user needs to move himself/herself to allow the robot to see him/her face to face.

Face Detection

Generating Images Part by Part with Composite Generative Adversarial Networks

no code implementations • 19 Jul 2016 • Hanock Kwak, Byoung-Tak Zhang

We propose a model called the composite generative adversarial network, which reveals the complex structure of images with multiple generators, each of which generates some part of the image.

Generative Adversarial Network Image Generation

Multimodal Residual Learning for Visual QA

1 code implementation • NeurIPS 2016 • Jin-Hwa Kim, Sang-Woo Lee, Dong-Hyun Kwak, Min-Oh Heo, Jeonghee Kim, Jung-Woo Ha, Byoung-Tak Zhang

We present Multimodal Residual Networks (MRN) for the multimodal residual learning of visual question-answering, which extends the idea of deep residual learning.

Ranked #6 on Visual Question Answering (VQA) on COCO Visual Question Answering (VQA) real images 1.0 multiple choice

Multiple-choice Question Answering +1

Dual Memory Architectures for Fast Deep Learning of Stream Data via an Online-Incremental-Transfer Strategy

no code implementations • 15 Jun 2015 • Sang-Woo Lee, Min-Oh Heo, Jiwon Kim, Jeonghee Kim, Byoung-Tak Zhang

The proposed architecture consists of deep representation learners and fast learnable shallow kernel networks, both of which synergize to track the information of new data.

Transfer Learning

Generative Local Metric Learning for Nearest Neighbor Classification

no code implementations • NeurIPS 2010 • Yung-Kyun Noh, Byoung-Tak Zhang, Daniel D. Lee

We consider the problem of learning a local metric to enhance the performance of nearest neighbor classification.

Classification Dimensionality Reduction +2
