Search Results for author: Caren Han

Found 6 papers, 3 papers with code

RoViST: Learning Robust Metrics for Visual Storytelling

1 code implementation • Findings (NAACL) 2022 • Eileen Wang, Caren Han, Josiah Poon

We measure the reliability of our metric sets by analysing their correlation with human judgement scores on a sample of machine stories obtained from four state-of-the-art models trained on the Visual Storytelling Dataset (VIST).

Visual Grounding Visual Storytelling
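The reliability check described above correlates automatic metric scores with human ratings. A minimal sketch of that idea (illustrative only, not the RoViST code; all scores below are hypothetical) using a plain Pearson correlation:

```python
# Illustrative sketch, not the RoViST implementation: estimate a metric's
# reliability by correlating its scores with human judgement scores
# over a sample of machine-generated stories.

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

metric_scores = [0.71, 0.42, 0.88, 0.35, 0.60]  # hypothetical metric outputs
human_scores = [4.0, 2.5, 4.5, 2.0, 3.5]        # hypothetical human ratings
print(f"correlation with human judgement: {pearson(metric_scores, human_scores):.3f}")
```

A high positive correlation on held-out stories is the usual evidence that an automatic metric tracks human judgement. (The paper also reports rank-based correlations; this sketch shows Pearson only.)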

SceneGATE: Scene-Graph based co-Attention networks for TExt visual question answering

no code implementations • 16 Dec 2022 • Feiqi Cao, Siwen Luo, Felipe Nunez, Zean Wen, Josiah Poon, Caren Han

To explicitly teach the relations between the two modalities, we propose and integrate two attention modules, namely a scene graph-based semantic relation-aware attention and a positional relation-aware attention.

Optical Character Recognition (OCR) Question Answering +1
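The attention modules above let one modality attend over the other. A pure-Python sketch of the generic cross-modal (co-)attention pattern they build on (illustrative only, not the SceneGATE architecture; the feature vectors are hypothetical):

```python
import math

# Generic cross-modal attention sketch: each "text" query vector takes a
# weighted sum of "region" value vectors, weighted by scaled dot-product
# similarity with the region key vectors. Not the SceneGATE modules.

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def cross_attention(queries, keys, values):
    """Scaled dot-product attention of queries over key/value pairs."""
    d = len(queries[0])
    out = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, values))
                    for j in range(len(values[0]))])
    return out

# Hypothetical 2-D features: text tokens attend over OCR/object regions.
text_feats = [[1.0, 0.0], [0.0, 1.0]]
region_feats = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
attended = cross_attention(text_feats, region_feats, region_feats)
```

SceneGATE's modules additionally condition the attention on scene-graph semantic relations and on spatial positions; this sketch shows only the shared dot-product backbone.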

An Analysis of Deep Reinforcement Learning Agents for Text-based Games

no code implementations • 9 Sep 2022 • Chen Chen, Yue Dai, Josiah Poon, Caren Han

Text-based games (TBG) are complex environments that allow users or computer agents to make textual interactions and achieve game goals. In designing and training TBG agents, balancing the efficiency and performance of the agent models is a major challenge.

reinforcement-learning Reinforcement Learning (RL) +1

RoViST: Learning Robust Metrics for Visual Storytelling

1 code implementation • 8 May 2022 • Eileen Wang, Caren Han, Josiah Poon

We measure the reliability of our metric sets by analysing their correlation with human judgement scores on a sample of machine stories obtained from four state-of-the-art models trained on the Visual Storytelling Dataset (VIST).

Visual Grounding Visual Storytelling

Local Interpretations for Explainable Natural Language Processing: A Survey

no code implementations • 20 Mar 2021 • Siwen Luo, Hamish Ivison, Caren Han, Josiah Poon

As the use of deep learning techniques has grown across various fields over the past decade, complaints about the opaqueness of black-box models have increased, resulting in a greater focus on transparency in deep learning models.

Machine Translation Sentiment Analysis +1

VICTR: Visual Information Captured Text Representation for Text-to-Vision Multimodal Tasks

1 code implementation • COLING 2020 • Caren Han, Siqu Long, Siwen Luo, Kunze Wang, Josiah Poon

We propose a new visual contextual text representation for text-to-image multimodal tasks, VICTR, which captures rich visual semantic information of objects from the text input.

Dependency Parsing
