Search Results for author: Josiah Poon

Found 36 papers, 18 papers with code

RoViST: Learning Robust Metrics for Visual Storytelling

1 code implementation Findings (NAACL) 2022 Eileen Wang, Caren Han, Josiah Poon

We measure the reliability of our metric sets by analysing their correlation with human judgement scores on a sample of machine stories obtained from 4 state-of-the-art models trained on the Visual Storytelling Dataset (VIST).

Sentence Visual Grounding +1

M3-VRD: Multimodal Multi-task Multi-teacher Visually-Rich Form Document Understanding

no code implementations 28 Feb 2024 Yihao Ding, Lorenzo Vaiani, Caren Han, Jean Lee, Paolo Garza, Josiah Poon, Luca Cagliero

This paper presents a groundbreaking multimodal, multi-task, multi-teacher joint-grained knowledge distillation model for visually-rich form document understanding.

document understanding Knowledge Distillation

SCO-VIST: Social Interaction Commonsense Knowledge-based Visual Storytelling

no code implementations 1 Feb 2024 Eileen Wang, Soyeon Caren Han, Josiah Poon

This weighted story graph produces the storyline in a sequence of events using Floyd-Warshall's algorithm.

Image Captioning Visual Grounding +1

Re-Temp: Relation-Aware Temporal Representation Learning for Temporal Knowledge Graph Completion

no code implementations 24 Oct 2023 Kunze Wang, Soyeon Caren Han, Josiah Poon

Temporal Knowledge Graph Completion (TKGC) under the extrapolation setting aims to predict the missing entity from a fact in the future, posing a challenge that aligns more closely with real-world prediction problems.

Knowledge Graph Completion Relation +2

MC-DRE: Multi-Aspect Cross Integration for Drug Event/Entity Extraction

1 code implementation 12 Aug 2023 Jie Yang, Soyeon Caren Han, Siqu Long, Josiah Poon, Goran Nenadic

Extracting meaningful drug-related information chunks, such as adverse drug events (ADE), is crucial for preventing morbidity and saving many lives.

Event Detection Event Extraction +4

Workshop on Document Intelligence Understanding

no code implementations 31 Jul 2023 Soyeon Caren Han, Yihao Ding, Siwen Luo, Josiah Poon, HeeGuen Yoon, Zhe Huang, Paul Duuring, Eun Jung Holden

Document understanding and information extraction include different tasks to understand a document and extract valuable information automatically.

document understanding Visual Question Answering (VQA)

Tri-level Joint Natural Language Understanding for Multi-turn Conversational Datasets

1 code implementation 28 May 2023 Henry Weld, Sijia Hu, Siqu Long, Josiah Poon, Soyeon Caren Han

We present a novel tri-level joint natural language understanding approach that adds domain and explicitly exchanges semantic information between all levels.

Intent Detection Natural Language Understanding +3

SimCGNN: Simple Contrastive Graph Neural Network for Session-based Recommendation

no code implementations 8 Feb 2023 Yuan Cao, Xudong Zhang, Fan Zhang, Feifei Kou, Josiah Poon, Xiongnan Jin, Yongheng Wang, Jinpeng Chen

The session-based recommendation (SBR) problem, which focuses on next-item prediction for anonymous users, has received increasing attention from researchers.

Contrastive Learning Session-Based Recommendations

Spoken Language Understanding for Conversational AI: Recent Advances and Future Direction

no code implementations 21 Dec 2022 Soyeon Caren Han, Siqu Long, Henry Weld, Josiah Poon

This tutorial will discuss how the joint task is set up and introduce Spoken Language Understanding/Natural Language Understanding (SLU/NLU) with Deep Learning techniques.

intent-classification Intent Classification +7

SceneGATE: Scene-Graph based co-Attention networks for TExt visual question answering

no code implementations 16 Dec 2022 Feiqi Cao, Siwen Luo, Felipe Nunez, Zean Wen, Josiah Poon, Caren Han

To explicitly teach the relations between the two modalities, we propose and integrate two attention modules, namely a scene graph-based semantic relation-aware attention and a positional relation-aware attention.

Optical Character Recognition Optical Character Recognition (OCR) +3

SG-Shuffle: Multi-aspect Shuffle Transformer for Scene Graph Generation

no code implementations 9 Nov 2022 Anh Duc Bui, Soyeon Caren Han, Josiah Poon

Scene Graph Generation (SGG) serves as a comprehensive representation of images for human understanding as well as visual understanding tasks.

Graph Generation Scene Graph Generation

An Analysis of Deep Reinforcement Learning Agents for Text-based Games

no code implementations 9 Sep 2022 Chen Chen, Yue Dai, Josiah Poon, Caren Han

Text-based games (TBG) are complex environments which allow users or computer agents to make textual interactions and achieve game goals. In the TBG agent design and training process, balancing the efficiency and performance of the agent models is a major challenge.

reinforcement-learning Reinforcement Learning (RL) +1

SUPER-Rec: SUrrounding Position-Enhanced Representation for Recommendation

no code implementations 9 Sep 2022 Taejun Lim, Siqu Long, Josiah Poon, Soyeon Caren Han

Collaborative filtering problems are commonly solved based on matrix completion techniques which recover the missing values of user-item interaction matrices.

Collaborative Filtering Matrix Completion +4

Doc-GCN: Heterogeneous Graph Convolutional Networks for Document Layout Analysis

1 code implementation COLING 2022 Siwen Luo, Yihao Ding, Siqu Long, Josiah Poon, Soyeon Caren Han

Recognizing the layout of unstructured digital documents is crucial when parsing the documents into a structured, machine-readable format for downstream applications.

Component Classification Document Layout Analysis

Understanding Attention for Vision-and-Language Tasks

1 code implementation COLING 2022 Feiqi Cao, Soyeon Caren Han, Siqu Long, Changwei Xu, Josiah Poon

The attention mechanism has been used as an important component across Vision-and-Language (VL) tasks in order to bridge the semantic gap between visual and textual features.

Image Retrieval Question Answering +4

InducT-GCN: Inductive Graph Convolutional Networks for Text Classification

1 code implementation 1 Jun 2022 Kunze Wang, Soyeon Caren Han, Josiah Poon

Under extreme settings with no extra resources and a limited training set, can we still learn an inductive graph-based text classification model?

text-classification Text Classification

RoViST: Learning Robust Metrics for Visual Storytelling

1 code implementation 8 May 2022 Eileen Wang, Caren Han, Josiah Poon

We measure the reliability of our metric sets by analysing their correlation with human judgement scores on a sample of machine stories obtained from 4 state-of-the-art models trained on the Visual Storytelling Dataset (VIST).

Sentence Visual Grounding +1

Understanding Graph Convolutional Networks for Text Classification

1 code implementation 30 Mar 2022 Soyeon Caren Han, Zihan Yuan, Kunze Wang, Siqu Long, Josiah Poon

Graph Convolutional Networks (GCN) have been effective at tasks that have rich relational structure and can preserve global structure information of a dataset in graph embeddings.

graph construction text-classification +1

Bi-directional Joint Neural Networks for Intent Classification and Slot Filling

no code implementations 26 Feb 2022 Soyeon Caren Han, Siqu Long, Huichun Li, Henry Weld, Josiah Poon

In this paper, we propose a bi-directional joint model for intent classification and slot filling, which includes a multi-stage hierarchical process via BERT and bi-directional joint natural language understanding mechanisms, including intent2slot and slot2intent, to obtain mutual performance enhancement between intent classification and slot filling.

Classification intent-classification +6

GLocal-K: Global and Local Kernels for Recommender Systems

3 code implementations 27 Aug 2021 Soyeon Caren Han, Taejun Lim, Siqu Long, Bernd Burgstaller, Josiah Poon

Then, the pre-trained autoencoder is fine-tuned with the rating matrix, produced by a convolution-based global kernel, which captures the characteristics of each item.

Collaborative Filtering Matrix Completion +1

FedNLP: An interpretable NLP System to Decode Federal Reserve Communications

1 code implementation 11 Jun 2021 Jean Lee, Hoyoul Luis Youn, Nicholas Stevens, Josiah Poon, Soyeon Caren Han

The Federal Reserve System (the Fed) plays a significant role in affecting monetary policy and financial conditions worldwide.

Sentiment Analysis Text Classification

Local Interpretations for Explainable Natural Language Processing: A Survey

no code implementations 20 Mar 2021 Siwen Luo, Hamish Ivison, Caren Han, Josiah Poon

As the use of deep learning techniques has grown across various fields over the past decade, complaints about the opaqueness of the black-box models have increased, resulting in an increased focus on transparency in deep learning models.

Machine Translation Sentiment Analysis +1

Deep Structured Feature Networks for Table Detection and Tabular Data Extraction from Scanned Financial Document Images

no code implementations 20 Feb 2021 Siwen Luo, Mengting Wu, Yiwen Gong, Wanying Zhou, Josiah Poon

The main contributions of this paper are proposing the Financial Documents dataset with table-area annotations, the superior detection model and the rule-based layout segmentation technique for the tabular data extraction from PDF files.

Optical Character Recognition Optical Character Recognition (OCR) +1

Event-Driven LSTM For Forex Price Prediction

no code implementations 29 Jan 2021 Ling Qi, Matloob Khushi, Josiah Poon

The majority of studies in the field of AI guided financial trading focus on purely applying machine learning algorithms to continuous historical price and technical analysis data.

feature selection

A Survey on Extraction of Causal Relations from Natural Language Text

no code implementations 16 Jan 2021 Jie Yang, Soyeon Caren Han, Josiah Poon

Existing causality extraction techniques include knowledge-based, statistical machine learning (ML)-based, and deep learning-based approaches.

BIG-bench Machine Learning Feature Engineering +2

VICTR: Visual Information Captured Text Representation for Text-to-Vision Multimodal Tasks

1 code implementation COLING 2020 Caren Han, Siqu Long, Siwen Luo, Kunze Wang, Josiah Poon

We propose a new visual contextual text representation for text-to-image multimodal tasks, VICTR, which captures rich visual semantic information of objects from the text input.

Dependency Parsing Sentence

VICTR: Visual Information Captured Text Representation for Text-to-Image Multimodal Tasks

1 code implementation 7 Oct 2020 Soyeon Caren Han, Siqu Long, Siwen Luo, Kunze Wang, Josiah Poon

We propose a new visual contextual text representation for text-to-image multimodal tasks, VICTR, which captures rich visual semantic information of objects from the text input.

Ranked #24 on Text-to-Image Generation on MS COCO (Inception score metric)

Dependency Parsing Sentence +1

REXUP: I REason, I EXtract, I UPdate with Structured Compositional Reasoning for Visual Question Answering

1 code implementation 27 Jul 2020 Siwen Luo, Soyeon Caren Han, Kaiyuan Sun, Josiah Poon

Visual question answering (VQA) is a challenging multi-modal task that requires not only the semantic understanding of both images and questions, but also the sound perception of a step-by-step reasoning process that would lead to the correct answer.

Question Answering Visual Question Answering

AMI-Net+: A Novel Multi-Instance Neural Network for Medical Diagnosis from Incomplete and Imbalanced Data

1 code implementation 3 Jul 2019 Zeyuan Wang, Josiah Poon, Simon Poon

In medical real-world studies (RWS), fully utilizing the fragmentary and scarce information in model training to generate solid diagnosis results is a challenging task.

Medical Diagnosis

Attention-based Multi-instance Neural Network for Medical Diagnosis from Incomplete and Low Quality Data

1 code implementation 9 Apr 2019 Zeyuan Wang, Josiah Poon, Shiding Sun, Simon Poon

However, in many real-world cases, data is often of low quality for a variety of reasons, such as problems with data consistency, integrity, completeness, accuracy, etc.

General Classification Medical Diagnosis

CNN based Multi-Instance Multi-Task Learning for Syndrome Differentiation of Diabetic Patients

no code implementations 19 Dec 2018 Zeyuan Wang, Josiah Poon, Shiding Sun, Simon Poon

Inspired by this, we employ multi-instance multi-task learning combined with the convolutional neural network (MIMT-CNN) for syndrome differentiation, which takes region proposals as input and outputs image labels directly.

Multi-Task Learning object-detection +2
