1 code implementation • COLING 2020 • Caren Han, Siqu Long, Siwen Luo, Kunze Wang, Josiah Poon
We propose a new visual contextual text representation for text-to-image multimodal tasks, VICTR, which captures rich visual semantic information of objects from the text input.
1 code implementation • Findings (NAACL) 2022 • Eileen Wang, Caren Han, Josiah Poon
We measure the reliability of our metric sets by analysing their correlation with human judgement scores on a sample of machine stories obtained from 4 state-of-the-art models trained on the Visual Storytelling Dataset (VIST).
1 code implementation • 21 Apr 2024 • Feiqi Cao, Caren Han, Hyunsuk Chung
In this work, we propose a novel tree-based explanation technique, PEACH (Pretrained-embedding Explanation Across Contextual and Hierarchical Structure), that can explain how text-based documents are classified by using any pretrained contextual embeddings in a tree-based human-interpretable manner.
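The core PEACH idea — classifying documents via a decision tree fitted over pretrained contextual embeddings, so each split becomes a human-readable rule — can be sketched as follows. This is a minimal illustration, not the paper's implementation: the synthetic Gaussian vectors stand in for real pretrained embeddings (e.g. BERT [CLS] vectors), and the feature names are hypothetical.

```python
# Hedged sketch of a tree-based explanation over contextual embeddings
# (assumption: PEACH's actual feature construction and tree method differ).
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(0)
# Stand-in for pretrained contextual document embeddings: two synthetic
# classes drawn from shifted Gaussians in an 8-dimensional space.
X_pos = rng.normal(loc=0.5, scale=1.0, size=(50, 8))
X_neg = rng.normal(loc=-0.5, scale=1.0, size=(50, 8))
X = np.vstack([X_pos, X_neg])
y = np.array([1] * 50 + [0] * 50)

# A shallow tree keeps the explanation human-interpretable.
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)
print(export_text(tree, feature_names=[f"dim_{i}" for i in range(8)]))
```

The exported text shows the classification as a sequence of threshold rules over embedding dimensions, which is the kind of hierarchical, human-readable structure the paper targets.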
no code implementations • 20 Mar 2021 • Siwen Luo, Hamish Ivison, Caren Han, Josiah Poon
As the use of deep learning techniques has grown across various fields over the past decade, complaints about the opaqueness of black-box models have increased, leading to a greater focus on transparency in deep learning models.
no code implementations • 9 Sep 2022 • Chen Chen, Yue Dai, Josiah Poon, Caren Han
Text-based games (TBG) are complex environments that allow users or computer agents to make textual interactions and achieve game goals. In TBG agent design and training, balancing the efficiency and performance of the agent models is a major challenge.
no code implementations • 16 Dec 2022 • Feiqi Cao, Siwen Luo, Felipe Nunez, Zean Wen, Josiah Poon, Caren Han
To explicitly teach the relations between the two modalities, we propose and integrate two attention modules, namely a scene graph-based semantic relation-aware attention and a positional relation-aware attention.
Optical Character Recognition (OCR) +3
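One common way to realise such relation-aware attention is to add a pairwise relation bias to the attention logits before the softmax. The sketch below shows that general pattern; it is an assumption-laden illustration, not the paper's exact formulation — the `rel_bias` matrix stands in for scores derived from scene-graph edges or relative positions.

```python
# Hedged sketch: scaled dot-product attention with an additive relation bias
# (assumption: the paper's two attention modules are formulated differently).
import numpy as np

def relation_aware_attention(Q, K, V, rel_bias):
    # Q, K, V: (n, d) query/key/value matrices; rel_bias: (n, n) pairwise
    # relation scores (e.g. from scene-graph edges or relative positions).
    d = Q.shape[-1]
    logits = Q @ K.T / np.sqrt(d) + rel_bias  # inject relation information
    # Numerically stable row-wise softmax over the biased logits.
    weights = np.exp(logits - logits.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

rng = np.random.default_rng(0)
n, d = 4, 8
Q, K, V = (rng.normal(size=(n, d)) for _ in range(3))
rel_bias = np.zeros((n, n))
rel_bias[0, 2] = 5.0  # pretend tokens 0 and 2 share a strong relation
out = relation_aware_attention(Q, K, V, rel_bias)
print(out.shape)
```

A large bias entry pulls token 0's attention toward token 2, which is how relational structure from a scene graph or positional layout can steer the attention distribution.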
no code implementations • 28 Feb 2024 • Yihao Ding, Lorenzo Vaiani, Caren Han, Jean Lee, Paolo Garza, Josiah Poon, Luca Cagliero
This paper presents a groundbreaking multimodal, multi-task, multi-teacher joint-grained knowledge distillation model for visually-rich form document understanding.