Search Results for author: Jey Han Lau

Found 61 papers, 22 papers with code

Cloze Evaluation for Deeper Understanding of Commonsense Stories in Indonesian

no code implementations CSRR (ACL) 2022 Fajri Koto, Timothy Baldwin, Jey Han Lau

Story comprehension that involves complex causal and temporal relations is a critical task in NLP, but previous studies have focused predominantly on English, leaving open the question of how the findings generalize to other languages, such as Indonesian.

Cloze Test Zero-Shot Cross-Lingual Transfer

The patient is more dead than alive: exploring the current state of the multi-document summarisation of the biomedical literature

no code implementations ACL 2022 Yulia Otmakhova, Karin Verspoor, Timothy Baldwin, Jey Han Lau

Although multi-document summarisation (MDS) of the biomedical literature is a highly valuable task that has recently attracted substantial interest, evaluation of the quality of biomedical summaries lacks consistency and transparency.

Robust Task-Oriented Dialogue Generation with Contrastive Pre-training and Adversarial Filtering

no code implementations20 May 2022 Shiquan Yang, Xinting Huang, Jey Han Lau, Sarah Erfani

Data artifacts incentivize machine learning models to learn non-transferable generalizations by taking advantage of shortcuts in the data, and there is growing evidence that data artifacts play a role for the strong results that deep learning models achieve in recent natural language processing benchmarks.

Contrastive Learning Dialogue Generation +1

An Interpretable Neuro-Symbolic Reasoning Framework for Task-Oriented Dialogue Generation

1 code implementation ACL 2022 Shiquan Yang, Rui Zhang, Sarah Erfani, Jey Han Lau

To obtain a transparent reasoning process, we introduce neuro-symbolic to perform explicit reasoning that justifies model decisions by reasoning chains.

Dialogue Generation Task-Oriented Dialogue Systems

ITTC @ TREC 2021 Clinical Trials Track

no code implementations16 Feb 2022 Thinh Hung Truong, Yulia Otmakhova, Rahmad Mahendra, Timothy Baldwin, Jey Han Lau, Trevor Cohn, Lawrence Cavedon, Damiano Spina, Karin Verspoor

This paper describes the submissions of the Natural Language Processing (NLP) team from the Australian Research Council Industrial Transformation Training Centre (ITTC) for Cognitive Computing in Medical Technologies to the TREC 2021 Clinical Trials Track.

Natural Language Processing

Findings on Conversation Disentanglement

no code implementations ALTA 2021 Rongxin Zhu, Jey Han Lau, Jianzhong Qi

Conversation disentanglement, the task to identify separate threads in conversations, is an important pre-processing step in multi-party conversational NLP applications such as conversational question answering and conversation summarization.

Conversational Question Answering Conversation Disentanglement +2

IndoBERTweet: A Pretrained Language Model for Indonesian Twitter with Effective Domain-Specific Vocabulary Initialization

1 code implementation EMNLP 2021 Fajri Koto, Jey Han Lau, Timothy Baldwin

We present IndoBERTweet, the first large-scale pretrained model for Indonesian Twitter that is trained by extending a monolingually-trained Indonesian BERT model with additive domain-specific vocabulary.

Language Modelling

Evaluating the Efficacy of Summarization Evaluation across Languages

1 code implementation Findings (ACL) 2021 Fajri Koto, Jey Han Lau, Timothy Baldwin

We take a summarization corpus for eight different languages, and manually annotate generated summaries for focus (precision) and coverage (recall).

Automatic Classification of Neutralization Techniques in the Narrative of Climate Change Scepticism

no code implementations NAACL 2021 Shraey Bhatia, Jey Han Lau, Timothy Baldwin

Neutralisation techniques, e. g. denial of responsibility and denial of victim, are used in the narrative of climate change scepticism to justify lack of action or to promote an alternative view.

Discourse Probing of Pretrained Language Models

1 code implementation NAACL 2021 Fajri Koto, Jey Han Lau, Timothy Baldwin

Existing work on probing of pretrained language models (LMs) has predominantly focused on sentence-level syntactic tasks.

Pretrained Language Models

Top-down Discourse Parsing via Sequence Labelling

1 code implementation EACL 2021 Fajri Koto, Jey Han Lau, Timothy Baldwin

We introduce a top-down approach to discourse parsing that is conceptually simpler than its predecessors (Kobayashi et al., 2020; Zhang et al., 2020).

Discourse Parsing

FFCI: A Framework for Interpretable Automatic Evaluation of Summarization

2 code implementations27 Nov 2020 Fajri Koto, Timothy Baldwin, Jey Han Lau

In this paper, we propose FFCI, a framework for fine-grained summarization evaluation that comprises four elements: faithfulness (degree of factual consistency with the source), focus (precision of summary content relative to the reference), coverage (recall of summary content relative to the reference), and inter-sentential coherence (document fluency between adjacent sentences).

Question Answering Semantic Textual Similarity

IndoLEM and IndoBERT: A Benchmark Dataset and Pre-trained Language Model for Indonesian NLP

no code implementations COLING 2020 Fajri Koto, Afshin Rahimi, Jey Han Lau, Timothy Baldwin

Although the Indonesian language is spoken by almost 200 million people and the 10th most spoken language in the world, it is under-represented in NLP research.

Language Modelling

COVID-SEE: Scientific Evidence Explorer for COVID-19 Related Research

no code implementations18 Aug 2020 Karin Verspoor, Simon Šuster, Yulia Otmakhova, Shevon Mendis, Zenan Zhai, Biaoyan Fang, Jey Han Lau, Timothy Baldwin, Antonio Jimeno Yepes, David Martinez

We present COVID-SEE, a system for medical literature discovery based on the concept of information exploration, which builds on several distinct text analysis and natural language processing methods to structure and organise information in publications, and augments search by providing a visual overview supporting exploration of a collection to identify key articles of interest.

Natural Language Processing

Less is More: Rejecting Unreliable Reviews for Product Question Answering

1 code implementation9 Jul 2020 Shiwei Zhang, Xiuzhen Zhang, Jey Han Lau, Jeffrey Chan, Cecile Paris

In the literature, PQA is formulated as a retrieval problem with the goal to search for the most relevant reviews to answer a given product question.

Community Question Answering

Give Me Convenience and Give Her Death: Who Should Decide What Uses of NLP are Appropriate, and on What Basis?

no code implementations ACL 2020 Kobi Leins, Jey Han Lau, Timothy Baldwin

We focus in particular on the role of data statements in ethically assessing research, but also discuss the topic of dual use, and examine the outcomes of similar debates in other scientific disciplines.

Elephant in the Room: An Evaluation Framework for Assessing Adversarial Examples in NLP

no code implementations22 Jan 2020 Ying Xu, Xu Zhong, Antonio Jose Jimeno Yepes, Jey Han Lau

An adversarial example is an input transformed by small perturbations that machine learning models consistently misclassify.

Improved Document Modelling with a Neural Discourse Parser

1 code implementation ALTA 2019 Fajri Koto, Jey Han Lau, Timothy Baldwin

We empirically investigate the benefit of the proposed approach on two different tasks: abstractive summarization and popularity prediction of online petitions.

Abstractive Text Summarization Text Generation

Early Rumour Detection

no code implementations NAACL 2019 Kaimin Zhou, Chang Shu, Binyang Li, Jey Han Lau

Motivated by this, our paper focuses on the task of rumour detection; particularly, we are interested in understanding how early we can detect them.

reinforcement-learning Rumour Detection

From Shakespeare to Li-Bai: Adapting a Sonnet Model to Chinese Poetry

no code implementations ALTA 2019 Zhuohan Xie, Jey Han Lau, Trevor Cohn

In this paper, we adapt Deep-speare, a joint neural network model for English sonnets, to Chinese poetry.

Topic Intrusion for Automatic Topic Model Evaluation

no code implementations EMNLP 2018 Shraey Bhatia, Jey Han Lau, Timothy Baldwin

Topic coherence is increasingly being used to evaluate topic models and filter topics for end-user applications.

Information Retrieval Topic Models

The Influence of Context on Sentence Acceptability Judgements

1 code implementation ACL 2018 Jean-Philippe Bernardy, Shalom Lappin, Jey Han Lau

We investigate the influence that document context exerts on human acceptability judgements for English sentences, via two sets of experiments.

Language Modelling Machine Translation +1

Document Chunking and Learning Objective Generation for Instruction Design

no code implementations1 Jun 2018 Khoi-Nguyen Tran, Jey Han Lau, Danish Contractor, Utkarsh Gupta, Bikram Sengupta, Christopher J. Butler, Mukesh Mohania

Instructional Systems Design is the practice of creating of instructional experiences that make the acquisition of knowledge and skill more efficient, effective, and appealing.


Evaluating Word Embedding Hyper-Parameters for Similarity and Analogy Tasks

no code implementations11 Apr 2018 Maryam Fanaeepour, Adam Makarucha, Jey Han Lau

The versatility of word embeddings for various applications is attracting researchers from various fields.

Word Embeddings

Topically Driven Neural Language Model

1 code implementation ACL 2017 Jey Han Lau, Timothy Baldwin, Trevor Cohn

Language models are typically applied at the sentence level, without access to the broader document context.

Language Modelling

Multimodal Topic Labelling

no code implementations EACL 2017 Ionut Sorodoc, Jey Han Lau, Nikolaos Aletras, Timothy Baldwin

Automatic topic labelling is the task of generating a succinct label that summarises the theme or subject of a topic, with the intention of reducing the cognitive load of end-users when interpreting these topics.

Topic Models

An Empirical Evaluation of doc2vec with Practical Insights into Document Embedding Generation

4 code implementations WS 2016 Jey Han Lau, Timothy Baldwin

Recently, Le and Mikolov (2014) proposed doc2vec as an extension to word2vec (Mikolov et al., 2013a) to learn document-level embeddings.

Document Embedding Word Embeddings

Automatic Detection and Language Identification of Multilingual Documents

no code implementations TACL 2014 Marco Lui, Jey Han Lau, Timothy Baldwin

Language identification is the task of automatically detecting the language(s) present in a document based on the content of the document.

Language Identification Machine Translation

Cannot find the paper you are looking for? You can Submit a new open access paper.