no code implementations • MML (ACL) 2022 • SeongJun Jung, Woo Suk Choi, SeongHo Choi, Byoung-Tak Zhang
Recent GAN-based text-to-image generation models have advanced to the point where they can generate photo-realistic images that semantically match their descriptions.
no code implementations • NAACL (DLG4NLP) 2022 • Woo Suk Choi, Yu-Jung Heo, Dharani Punithan, Byoung-Tak Zhang
In this work, we propose applying abstract meaning representation (AMR) based semantic parsing models to parse textual descriptions of a visual scene into scene graphs; to the best of our knowledge, this is the first work to do so.
no code implementations • 17 Oct 2022 • Woo Suk Choi, Yu-Jung Heo, Byoung-Tak Zhang
To this end, we design a simple yet effective two-stage scene graph parsing framework utilizing abstract meaning representation, SGRAM (Scene GRaph parsing via Abstract Meaning representation): 1) transforming a textual description of an image into an AMR graph (Text-to-AMR) and 2) encoding the AMR graph into a Transformer-based language model to generate a scene graph (AMR-to-SG).
1 code implementation • ACL 2022 • Yu-Jung Heo, Eun-Sol Kim, Woo Suk Choi, Byoung-Tak Zhang
Knowledge-based visual question answering (QA) aims to answer a question that requires visually grounded external knowledge beyond the image content itself.
no code implementations • 8 Oct 2021 • Yu-Jung Heo, Minsu Lee, SeongHo Choi, Woo Suk Choi, Minjung Shin, Minjoon Jung, Jeh-Kwang Ryu, Byoung-Tak Zhang
In this paper, we propose the Video Turing Test to provide effective and practical assessments of video understanding intelligence as well as human-likeness evaluation of AI agents.
no code implementations • WS 2020 • Woo Suk Choi, Kyoung-Woon On, Yu-Jung Heo, Byoung-Tak Zhang
In the experiments, the integrated scene graph is applied to image-caption retrieval as a downstream task.