no code implementations • 21 Apr 2024 • Suyeon Shin, Sujin Jeon, Junghyun Kim, Gi-Cheon Kang, Byoung-Tak Zhang
Embodied Instruction Following (EIF) is the task of executing natural language instructions by navigating and interacting with objects in 3D environments.
no code implementations • 22 Mar 2024 • Seongjun Jeong, Gi-Cheon Kang, SeongHo Choi, Joochan Kim, Byoung-Tak Zhang
For the training and evaluation of CVLN agents, we re-arrange existing VLN datasets to propose two datasets: CVLN-I, focused on navigation via initial-instruction interpretation, and CVLN-D, aimed at navigation through dialogue with other agents.
1 code implementation • 19 Oct 2023 • Junghyun Kim, Gi-Cheon Kang, Jaein Kim, Seoyun Yang, Minjoon Jung, Byoung-Tak Zhang
Based on the acquired information, PGA pseudo-labels objects in the Reminiscence using our proposed label propagation algorithm.
1 code implementation • 14 Sep 2023 • Gi-Cheon Kang, Junghyun Kim, Jaein Kim, Byoung-Tak Zhang
The robot should then identify the target object by interacting with a human user.
1 code implementation • 12 Jul 2023 • Junghyun Kim, Gi-Cheon Kang, Jaein Kim, Suyeon Shin, Byoung-Tak Zhang
Furthermore, the qualitative analysis shows that the unadapted VG model often fails to find correct objects due to a strong bias learned from the pre-training data.
2 code implementations • CVPR 2023 • Gi-Cheon Kang, Sungdong Kim, Jin-Hwa Kim, Donghyun Kwak, Byoung-Tak Zhang
As a result, GST scales the amount of training data by up to an order of magnitude compared to VisDial (1.2M to 12.9M QA data).
Conditional Text Generation • Out-of-Distribution Detection • +1
1 code implementation • ACL 2021 • Ahjeong Seo, Gi-Cheon Kang, Joonhan Park, Byoung-Tak Zhang
MASN consists of a motion module, an appearance module, and a motion-appearance fusion module.
1 code implementation • Findings (EMNLP) 2021 • Gi-Cheon Kang, Junseok Park, Hwaran Lee, Byoung-Tak Zhang, Jin-Hwa Kim
Visual dialog is the task of answering a sequence of questions grounded in an image, using the previous dialog history as context.
2 code implementations • IJCNLP 2019 • Gi-Cheon Kang, Jaeseo Lim, Byoung-Tak Zhang
Specifically, the REFER module learns latent relationships between a given question and a dialog history by employing a self-attention mechanism.
Ranked #2 on Visual Dialog on VisDial v0.9 val