Dimsum @LaySumm 20

1 code implementation EMNLP (sdp) 2020 Tiezheng Yu, Dan Su, Wenliang Dai, Pascale Fung

Lay summarization aims to generate lay summaries of scientific papers automatically.

Lay Summarization

Instruct-Align: Teaching Novel Languages with to LLMs through Alignment-based Cross-Lingual Instruction

no code implementations 23 May 2023 Samuel Cahyawijaya, Holy Lovenia, Tiezheng Yu, Willy Chung, Pascale Fung

Prior works on adapting new languages to LLMs find that naively adapting new languages to instruction-tuned LLMs will result in catastrophic forgetting, which in turn causes the loss of multitasking ability in these LLMs.

RHO ($ρ$): Reducing Hallucination in Open-domain Dialogues with Knowledge Grounding

1 code implementation 3 Dec 2022 Ziwei Ji, Zihan Liu, Nayeon Lee, Tiezheng Yu, Bryan Wilie, Min Zeng, Pascale Fung

Dialogue systems can leverage large pre-trained language models and knowledge to generate fluent and informative responses.

Representation Learning Re-Ranking

Enabling Classifiers to Make Judgements Explicitly Aligned with Human Values

no code implementations 14 Oct 2022 Yejin Bang, Tiezheng Yu, Andrea Madotto, Zhaojiang Lin, Mona Diab, Pascale Fung

Therefore, we introduce a framework for value-aligned classification that performs prediction based on explicitly written human values in the command.

Classification Few-Shot Learning +1

Kaggle Competition: Cantonese Audio-Visual Speech Recognition for In-car Commands

no code implementations 6 Jul 2022 Wenliang Dai, Samuel Cahyawijaya, Tiezheng Yu, Elham J. Barezi, Pascale Fung

With the rise of deep learning and intelligent vehicles, the smart assistant has become an essential in-car component to facilitate driving and provide extra functionalities.

Audio-Visual Speech Recognition speech-recognition +1

Towards Answering Open-ended Ethical Quandary Questions

no code implementations 12 May 2022 Yejin Bang, Nayeon Lee, Tiezheng Yu, Leila Khalatbari, Yan Xu, Samuel Cahyawijaya, Dan Su, Bryan Wilie, Romain Barraud, Elham J. Barezi, Andrea Madotto, Hayden Kee, Pascale Fung

We explore the current capability of LLMs in providing an answer with a deliberative exchange of different perspectives to an ethical quandary, in the approach of Socratic philosophy, instead of providing a closed answer like an oracle.

Few-Shot Learning Generative Question Answering +2

SNP2Vec: Scalable Self-Supervised Pre-Training for Genome-Wide Association Study

1 code implementation BioNLP (ACL) 2022 Samuel Cahyawijaya, Tiezheng Yu, Zihan Liu, Tiffany T. W. Mak, Xiaopu Zhou, Nancy Y. Ip, Pascale Fung

We apply SNP2Vec to perform long-sequence genomics modeling, and we evaluate the effectiveness of our approach on predicting Alzheimer's disease risk in a Chinese cohort.

NeuS: Neutral Multi-News Summarization for Mitigating Framing Bias

1 code implementation NAACL 2022 Nayeon Lee, Yejin Bang, Tiezheng Yu, Andrea Madotto, Pascale Fung

Based on our discovery that title provides a good signal for framing bias, we present NeuS-TITLE that learns to neutralize news content in hierarchical order from title to article.

Multi-Task Learning News Summarization

Survey of Hallucination in Natural Language Generation

no code implementations 8 Feb 2022 Ziwei Ji, Nayeon Lee, Rita Frieske, Tiezheng Yu, Dan Su, Yan Xu, Etsuko Ishii, Yejin Bang, Wenliang Dai, Andrea Madotto, Pascale Fung

This advancement has led to more fluent and coherent NLG, leading to improved development in downstream tasks such as abstractive summarization, dialogue generation and data-to-text generation.

Abstractive Text Summarization Data-to-Text Generation +3

Automatic Speech Recognition Datasets in Cantonese: A Survey and New Dataset

1 code implementation LREC 2022 Tiezheng Yu, Rita Frieske, Peng Xu, Samuel Cahyawijaya, Cheuk Tung Shadow Yiu, Holy Lovenia, Wenliang Dai, Elham J. Barezi, Qifeng Chen, Xiaojuan Ma, Bertram E. Shi, Pascale Fung

We further conduct experiments with Fairseq S2T Transformer, a state-of-the-art ASR model, on the biggest existing dataset, Common Voice zh-HK, and our proposed MDCC, and the results show the effectiveness of our dataset.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +3

ASCEND: A Spontaneous Chinese-English Dataset for Code-switching in Multi-turn Conversation

2 code implementations LREC 2022 Holy Lovenia, Samuel Cahyawijaya, Genta Indra Winata, Peng Xu, Xu Yan, Zihan Liu, Rita Frieske, Tiezheng Yu, Wenliang Dai, Elham J. Barezi, Qifeng Chen, Xiaojuan Ma, Bertram E. Shi, Pascale Fung

ASCEND (A Spontaneous Chinese-English Dataset) is a high-quality Mandarin Chinese-English code-switching corpus built on spontaneous multi-turn conversational dialogue sources collected in Hong Kong.

Vision Guided Generative Pre-trained Language Models for Multimodal Abstractive Summarization

1 code implementation EMNLP 2021 Tiezheng Yu, Wenliang Dai, Zihan Liu, Pascale Fung

Multimodal abstractive summarization (MAS) models that summarize videos (vision modality) and their corresponding transcripts (text modality) are able to extract the essential information from massive multimodal data on the Internet.

Abstractive Text Summarization Text Generation

AdaptSum: Towards Low-Resource Domain Adaptation for Abstractive Summarization

1 code implementation NAACL 2021 Tiezheng Yu, Zihan Liu, Pascale Fung

State-of-the-art abstractive summarization models generally rely on extensive labeled data, which lowers their generalization ability on domains where such data are not available.

Abstractive Text Summarization Domain Adaptation

Kungfupanda at SemEval-2020 Task 12: BERT-Based Multi-Task Learning for Offensive Language Detection

1 code implementation SEMEVAL 2020 Wenliang Dai, Tiezheng Yu, Zihan Liu, Pascale Fung

Nowadays, offensive content in social media has become a serious problem, and automatically detecting offensive language is an essential task.

Language Modelling Multi-Task Learning

Multi-hop Question Generation with Graph Convolutional Network

1 code implementation Findings of the Association for Computational Linguistics 2020 Dan Su, Yan Xu, Wenliang Dai, Ziwei Ji, Tiezheng Yu, Pascale Fung

Multi-hop Question Generation (QG) aims to generate answer-related questions by aggregating and reasoning over multiple scattered evidence from different paragraphs.

Question Generation Question-Generation

Modality-Transferable Emotion Embeddings for Low-Resource Multimodal Emotion Recognition

1 code implementation Asian Chapter of the Association for Computational Linguistics 2020 Wenliang Dai, Zihan Liu, Tiezheng Yu, Pascale Fung

Despite the recent achievements made in the multi-modal emotion recognition task, two problems still exist and have not been well investigated: 1) the relationships between different emotion categories are not utilized, which leads to sub-optimal performance; and 2) current models fail to cope well with low-resource emotions, especially unseen emotions.

Multimodal Emotion Recognition Word Embeddings

CAiRE-COVID: A Question Answering and Query-focused Multi-Document Summarization System for COVID-19 Scholarly Information Management

1 code implementation EMNLP (NLP-COVID19) 2020 Dan Su, Yan Xu, Tiezheng Yu, Farhad Bin Siddique, Elham J. Barezi, Pascale Fung

We present CAiRE-COVID, a real-time question answering (QA) and multi-document summarization system, which won one of the 10 tasks in the Kaggle COVID-19 Open Research Dataset Challenge, judged by medical experts.

Document Summarization Information Retrieval +3

Kungfupanda at SemEval-2020 Task 12: BERT-Based Multi-Task Learning for Offensive Language Detection

1 code implementation 28 Apr 2020 Wenliang Dai, Tiezheng Yu, Zihan Liu, Pascale Fung

Nowadays, offensive content in social media has become a serious problem, and automatically detecting offensive language is an essential task.

Abuse Detection Language Modelling +1

