1 code implementation • IJCNLP 2019 • Wen Xiao, Giuseppe Carenini
In this paper, we propose a novel neural single-document extractive summarization model for long documents that incorporates both the global context of the whole document and the local context within the current topic.
Ranked #19 on Text Summarization on PubMed
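As a rough illustration of the idea (not the authors' exact architecture; the module name, GRU encoder, and mean-pooling choices below are all assumptions), a sentence scorer that combines each sentence's representation with a global document context and a local topic context might look like:

```python
# Minimal sketch: score each sentence from its own encoding plus a global
# document vector and a local (topic-level) vector. Illustrative only.
import torch
import torch.nn as nn

class GlobalLocalScorer(nn.Module):
    def __init__(self, hidden: int):
        super().__init__()
        self.sent_enc = nn.GRU(hidden, hidden, batch_first=True, bidirectional=True)
        self.score = nn.Linear(6 * hidden, 1)  # sentence + global + local contexts

    def forward(self, sent_embs: torch.Tensor, topic_ids: torch.Tensor):
        # sent_embs: (num_sents, hidden) pre-computed sentence embeddings
        # topic_ids: (num_sents,) topic/section index of each sentence
        h, _ = self.sent_enc(sent_embs.unsqueeze(0))
        h = h.squeeze(0)                                # (num_sents, 2*hidden)
        global_ctx = h.mean(dim=0, keepdim=True).expand_as(h)
        local_ctx = torch.stack([                       # mean over each sentence's topic
            h[topic_ids == t].mean(dim=0) for t in topic_ids.tolist()
        ])
        return self.score(torch.cat([h, global_ctx, local_ctx], dim=-1)).squeeze(-1)
```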
2 code implementations • CVPR 2021 • Qingyong Hu, Bo Yang, Sheikh Khalid, Wen Xiao, Niki Trigoni, Andrew Markham
An essential prerequisite for unleashing the potential of supervised deep learning algorithms in the area of 3D scene understanding is the availability of large-scale and richly annotated datasets.
1 code implementation • AACL 2020 • Wen Xiao, Giuseppe Carenini
Our analysis of large summarization datasets indicates that redundancy is a very serious problem when summarizing long documents.
Ranked #15 on Text Summarization on PubMed
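One common way to quantify redundancy, shown here as an illustrative proxy rather than the paper's specific metric, is the unique n-gram ratio of a summary (lower means more repetition):

```python
# Fraction of unique n-grams in a token sequence; a rough redundancy proxy.
def unique_ngram_ratio(tokens: list[str], n: int = 3) -> float:
    ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    return len(set(ngrams)) / len(ngrams) if ngrams else 1.0

# "on the mat" repeats, so the ratio drops below 1.0
print(unique_ngram_ratio("the cat sat on the mat on the mat".split()))
```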
no code implementations • EMNLP (CODI) 2020 • Wen Xiao, Patrick Huber, Giuseppe Carenini
The multi-head self-attention of popular transformer models is widely used within Natural Language Processing (NLP), including for the task of extractive summarization.
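For reference, a single head of standard scaled dot-product self-attention can be sketched as follows; the softmax weights it produces are the quantities such analyses inspect, and transformer models run several heads in parallel and concatenate their outputs:

```python
import torch

def self_attention(x, w_q, w_k, w_v):
    # x: (seq_len, d_model); w_*: (d_model, d_head) projection matrices
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = (q @ k.transpose(-2, -1)) / k.shape[-1] ** 0.5
    weights = torch.softmax(scores, dim=-1)  # token-to-token attention weights
    return weights @ v, weights

x = torch.randn(5, 64)
w = [torch.randn(64, 16) for _ in range(3)]
out, attn = self_attention(x, *w)            # attn: (5, 5)
```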
1 code implementation • NAACL 2021 • Wen Xiao, Patrick Huber, Giuseppe Carenini
Previous work indicates that discourse information benefits summarization.
no code implementations • ACL 2021 • Linzi Xing, Wen Xiao, Giuseppe Carenini
In news articles, lead bias is a common phenomenon that usually dominates the learning signals for neural extractive summarizers, severely limiting their performance on data with a different bias, or even no bias at all.
no code implementations • ACL 2021 • Patrick Huber, Wen Xiao, Giuseppe Carenini
Aiming for a better integration of data-driven and linguistically inspired approaches, we explore whether RST Nuclearity, which assigns a binary assessment of importance between text segments, can be replaced by automatically generated, real-valued scores in what we call a Weighted-RST framework.
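The contrast can be pictured with a toy data structure (fields and the example weight below are made up, not the paper's representation): classic RST labels each relation's children as nucleus (more important) or satellite, whereas a weighted variant attaches a real-valued importance score instead.

```python
# Purely illustrative discourse-tree node contrasting binary nuclearity
# with a real-valued weight.
from dataclasses import dataclass
from typing import Optional, Union

@dataclass
class RSTNode:
    left: Union["RSTNode", str]     # child subtree or raw text segment
    right: Union["RSTNode", str]
    nuclearity: str                 # classic RST: "NS", "SN", or "NN"
    weight: Optional[float] = None  # weighted variant: importance of the left child

tree = RSTNode("It was raining,", "so we stayed in.", nuclearity="SN", weight=0.2)
```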
1 code implementation • 31 Aug 2021 • Raymond Li, Wen Xiao, Lanjun Wang, Hyeju Jang, Giuseppe Carenini
Transformers are the dominant architecture in NLP, but their training and fine-tuning are still very challenging.
2 code implementations • ACL 2022 • Wen Xiao, Iz Beltagy, Giuseppe Carenini, Arman Cohan
We introduce PRIMERA, a pre-trained model for multi-document representation with a focus on summarization that reduces the need for dataset-specific architectures and large amounts of labeled fine-tuning data.
Ranked #1 on Multi-Document Summarization on Multi-News
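A minimal usage sketch, assuming the public HuggingFace checkpoint allenai/PRIMERA (an LED-based encoder-decoder) and its <doc-sep> separator token for joining the input documents:

```python
import torch
from transformers import AutoTokenizer, LEDForConditionalGeneration

tok = AutoTokenizer.from_pretrained("allenai/PRIMERA")
model = LEDForConditionalGeneration.from_pretrained("allenai/PRIMERA")

docs = ["First article about the event ...", "Second article about the event ..."]
inputs = tok(" <doc-sep> ".join(docs), return_tensors="pt", truncation=True)

global_attention_mask = torch.zeros_like(inputs["input_ids"])
global_attention_mask[:, 0] = 1  # global attention on the leading token

summary_ids = model.generate(**inputs,
                             global_attention_mask=global_attention_mask,
                             max_length=256)
print(tok.decode(summary_ids[0], skip_special_tokens=True))
```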
1 code implementation • 10 Dec 2021 • Raymond Li, Wen Xiao, Linzi Xing, Lanjun Wang, Gabriel Murray, Giuseppe Carenini
The multi-head self-attention mechanism of the transformer model has been thoroughly investigated recently.
no code implementations • 12 Jan 2022 • Qingyong Hu, Bo Yang, Sheikh Khalid, Wen Xiao, Niki Trigoni, Andrew Markham
Each point in the dataset has been labelled with fine-grained semantic annotations, resulting in a dataset three times the size of the largest existing photogrammetric point cloud dataset.
1 code implementation • 7 Sep 2022 • Wen Xiao, Giuseppe Carenini
Despite the success of recent abstractive summarizers on automatic evaluation metrics, the generated summaries still contain factual inconsistencies with respect to the source document.
1 code implementation • 21 Dec 2022 • Wen Xiao, Lesly Miculicich, Yang Liu, Pengcheng He, Giuseppe Carenini
Content-Controllable Summarization generates summaries focused on the given control signals.
no code implementations • 12 Feb 2023 • Chuyuan Li, Patrick Huber, Wen Xiao, Maxime Amblard, Chloé Braud, Giuseppe Carenini
We explore approaches to building discourse structures for dialogues based on attention matrices from Pre-trained Language Models (PLMs).
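As a starting-point sketch (the model choice and the layer/head averaging below are assumptions, not the paper's recipe), token-level attention matrices can be extracted from a PLM like this, after which utterance-level discourse links can be derived, e.g. via a maximum spanning tree:

```python
import torch
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

inputs = tok("A: Are you coming? B: Yes, in a minute.", return_tensors="pt")
with torch.no_grad():
    out = model(**inputs, output_attentions=True)

# out.attentions: one (batch, heads, seq, seq) tensor per layer
attn = torch.stack(out.attentions).mean(dim=(0, 2))  # average layers and heads
print(attn.shape)                                    # (batch, seq, seq)
```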
1 code implementation • 4 May 2023 • Wen Xiao, Yujia Xie, Giuseppe Carenini, Pengcheng He
An inference-only large language model (ChatGPT) serves as both the generator and the editor, with a smaller model acting as the instructor to guide output generation.
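The described pipeline can be sketched abstractly as below; instruct, generate, and edit are hypothetical placeholders for the actual model calls, which the excerpt does not specify:

```python
# Abstract sketch of the generator/editor/instructor loop.
def summarize(source: str, instruct, generate, edit) -> str:
    guidance = instruct(source)           # smaller model proposes guidance
    draft = generate(source, guidance)    # LLM writes an initial summary
    return edit(source, draft, guidance)  # the same LLM revises its draft
```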
no code implementations • 21 Nov 2023 • Raymond Li, Ruixin Yang, Wen Xiao, Ahmed Aburaed, Gabriel Murray, Giuseppe Carenini
While transformer-based models have achieved state-of-the-art results in a variety of classification and generation tasks, their black-box nature makes them difficult to interpret.
no code implementations • 12 Dec 2023 • Yu Fu, Yufei Li, Wen Xiao, Cong Liu, Yue Dong
Recent developments in balancing the usefulness and safety of Large Language Models (LLMs) have raised a critical question: Are mainstream NLP tasks adequately aligned with safety considerations?
no code implementations • NAACL (DeeLIO) 2021 • Hyeju Jang, Seojin Bang, Wen Xiao, Giuseppe Carenini, Raymond Ng, Young ji Lee
Text classification has wide-ranging applications in various domains.