1 code implementation • EMNLP (ACL) 2021 • Raymond Li, Wen Xiao, Lanjun Wang, Hyeju Jang, Giuseppe Carenini
Transformers are the dominant architecture in NLP, but their training and fine-tuning are still very challenging.
no code implementations • NAACL (DeeLIO) 2021 • Hyeju Jang, Seojin Bang, Wen Xiao, Giuseppe Carenini, Raymond Ng, Young ji Lee
Text classification has wide-ranging applications in various domains.
no code implementations • EMNLP (CODI) 2020 • Grigorii Guz, Giuseppe Carenini
We present preliminary results on investigating the benefits of coreference resolution features for neural RST discourse parsing by considering different levels of coupling of the discourse parser with the coreference resolver.
2 code implementations • 22 Dec 2024 • Mahdi Mostajabdaveh, Timothy T. Yu, Samarendra Chandan Bindu Dash, Rindranirina Ramamonjison, Jabo Serge Byusa, Giuseppe Carenini, Zirui Zhou, Yong Zhang
In this paper, we introduce and apply Operations Research Question Answering (ORQA), a new benchmark designed to assess the generalization capabilities of Large Language Models (LLMs) in the specialized technical domain of Operations Research (OR).
1 code implementation • 9 Dec 2024 • Amirhossein Abaskohi, Spandana Gella, Giuseppe Carenini, Issam H. Laradji
Multimodal multihop question answering is a complex task that requires reasoning over multiple sources of information, such as images and text, to answer questions.
no code implementations • 27 Jun 2024 • Giuseppe Carenini, Jordon Johnson, Ali Salamatian
Automatically captioning visualizations is not new, but recent advances in large language models (LLMs) open exciting new possibilities.
no code implementations • 12 Jun 2024 • Yuxi Feng, Raymond Li, Zhenan Fan, Giuseppe Carenini, Mohammadreza Pourreza, Weiwei Zhang, Yong Zhang
While in-context learning (ICL) has proven to be an effective technique for improving the performance of Large Language Models (LLMs) on a variety of complex tasks, notably translating natural language questions into Structured Query Language (NL2SQL), how to select the most beneficial demonstration examples remains an open research problem.
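As a rough illustration of the problem setting only, one common baseline is to retrieve demonstrations by embedding similarity; the sketch below assumes a hypothetical `embed` function and a pool of annotated NL2SQL examples, and is not the selection strategy proposed in the paper.

```python
# Minimal sketch of similarity-based demonstration selection for in-context learning.
# `embed` (a sentence-embedding function) and the example pool are assumptions.
import numpy as np

def select_demonstrations(question, pool, embed, k=5):
    """Pick the k pool questions most similar to the target NL2SQL question."""
    q = embed(question)                                   # vector for the new question
    scores = []
    for ex in pool:
        v = embed(ex["question"])
        scores.append(float(np.dot(q, v) /                # cosine similarity
                            (np.linalg.norm(q) * np.linalg.norm(v) + 1e-9)))
    top = np.argsort(scores)[::-1][:k]                    # highest-scoring examples first
    return [pool[i] for i in top]                         # use as in-context demonstrations
```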
1 code implementation • 3 Apr 2024 • Amirhossein Abaskohi, Amirhossein Dabiriaghdam, Lele Wang, Giuseppe Carenini
Memes, combining text and images, frequently use metaphors to convey persuasive messages, shaping public opinion.
Tasks: Caption Generation, Hierarchical Multi-label Classification, +1
1 code implementation • 26 Mar 2024 • Felipe González-Pizarro, Giuseppe Carenini
This paper presents the first systematic and comprehensive evaluation of multimodal topic modeling of documents containing both text and images.
no code implementations • 30 Nov 2023 • Linzi Xing, Quan Tran, Fabian Caba, Franck Dernoncourt, Seunghyun Yoon, Zhaowen Wang, Trung Bui, Giuseppe Carenini
Video topic segmentation unveils the coarse-grained semantic structure underlying videos and is essential for other video understanding tasks.
no code implementations • 24 Nov 2023 • Linzi Xing, Brad Hackinen, Giuseppe Carenini
U.S. federal regulators receive over one million comment letters each year from businesses, interest groups, and members of the public, all advocating for changes to proposed regulations.
no code implementations • 21 Nov 2023 • Raymond Li, Ruixin Yang, Wen Xiao, Ahmed Aburaed, Gabriel Murray, Giuseppe Carenini
While transformer-based models have achieved state-of-the-art results in a variety of classification and generation tasks, their black-box nature makes them difficult to interpret.
no code implementations • 24 Oct 2023 • Raymond Li, Gabriel Murray, Giuseppe Carenini
In this work, we propose a method that combines two popular research areas by injecting linguistic structures into pre-trained language models in the parameter-efficient fine-tuning (PEFT) setting.
1 code implementation • 25 May 2023 • Raymond Li, Felipe González-Pizarro, Linzi Xing, Gabriel Murray, Giuseppe Carenini
The standard approach for neural topic modeling uses a variational autoencoder (VAE) framework that jointly minimizes the KL divergence between the estimated posterior and prior, in addition to the reconstruction loss.
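For reference, a minimal sketch of that standard VAE objective (reconstruction loss plus the KL term), assuming a Gaussian approximate posterior with a standard-normal prior; variable names are illustrative, not the paper's implementation.

```python
# Sketch of the usual neural topic model loss: bag-of-words reconstruction + KL divergence.
import torch

def neural_topic_model_loss(bow, recon_logits, mu, logvar):
    # Reconstruction: negative log-likelihood of the bag-of-words under the decoder
    recon = -(bow * torch.log_softmax(recon_logits, dim=-1)).sum(dim=-1)
    # KL divergence between q(z|x) = N(mu, sigma^2) and the standard normal prior
    kl = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).sum(dim=-1)
    return (recon + kl).mean()
```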
1 code implementation • 4 May 2023 • Wen Xiao, Yujia Xie, Giuseppe Carenini, Pengcheng He
The inference-only large language model (ChatGPT) serves as both the generator and editor, with a smaller model acting as the instructor to guide output generation.
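A highly simplified sketch of such a generate-then-edit loop is shown below; the function names are hypothetical and this is not the paper's exact procedure.

```python
# Toy sketch: a large inference-only LLM drafts and revises text while a smaller
# "instructor" model supplies editing instructions. All callables are assumptions.
def generate_with_instructor(source, llm_generate, llm_edit, instructor, rounds=2):
    draft = llm_generate(source)                       # large LLM drafts the output
    for _ in range(rounds):
        instruction = instructor(source, draft)        # small model critiques the draft
        draft = llm_edit(source, draft, instruction)   # large LLM revises accordingly
    return draft
```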
1 code implementation • 14 Mar 2023 • Rindranirina Ramamonjison, Timothy T. Yu, Raymond Li, Haley Li, Giuseppe Carenini, Bissan Ghaddar, Shiqi He, Mahdi Mostajabdaveh, Amin Banitalebi-Dehkordi, Zirui Zhou, Yong Zhang
The Natural Language for Optimization (NL4Opt) Competition was created to investigate methods of extracting the meaning and formulation of an optimization problem based on its text description.
no code implementations • 12 Feb 2023 • Chuyuan Li, Patrick Huber, Wen Xiao, Maxime Amblard, Chloé Braud, Giuseppe Carenini
As a result, we explore approaches to build discourse structures for dialogues, based on attention matrices from Pre-trained Language Models (PLMs).
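One simple way to turn attention matrices into a structure, shown here only as an illustrative sketch (the paper explores several approaches), is to treat inter-utterance attention mass as arc scores and attach each utterance to its highest-scoring predecessor.

```python
# Illustrative sketch: derive an unlabeled dependency-style discourse structure for a
# dialogue from a PLM attention matrix averaged over heads and layers (an assumption).
import numpy as np

def attention_to_structure(attn):
    """attn: (n, n) matrix of attention mass between n utterances (row attends to column)."""
    n = attn.shape[0]
    arcs = []
    for i in range(1, n):                    # utterance 0 acts as the root
        head = int(np.argmax(attn[i, :i]))   # attach to the most-attended previous utterance
        arcs.append((head, i))
    return arcs
```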
1 code implementation • 21 Dec 2022 • Wen Xiao, Lesly Miculicich, Yang Liu, Pengcheng He, Giuseppe Carenini
Content-Controllable Summarization generates summaries focused on the given controlling signals.
no code implementations • 18 Oct 2022 • Patrick Huber, Giuseppe Carenini
With a growing need for robust and general discourse structures in many downstream tasks and real-world applications, the current lack of high-quality, high-quantity discourse trees poses a severe shortcoming.
no code implementations • 18 Oct 2022 • Patrick Huber, Giuseppe Carenini
Discourse analysis and discourse parsing have shown great impact on many important problems in the field of Natural Language Processing (NLP).
no code implementations • 18 Oct 2022 • Patrick Huber, Giuseppe Carenini
Discourse parsing is an essential upstream task in Natural Language Processing with strong implications for many real-world applications.
1 code implementation • 21 Sep 2022 • Yan Liu, Maria Laricheva, Chiyu Zhang, Patrick Boutet, GuanYu Chen, Terence Tracey, Giuseppe Carenini, Richard Young
This study explores how natural language processing (NLP) methods, especially unsupervised machine learning, can assist psychologists in analyzing emotions and sentiments, and how topic modeling can identify common issues and challenges faced by young people with IDD and their families.
no code implementations • COLING (CODI, CRAC) 2022 • Linzi Xing, Patrick Huber, Giuseppe Carenini
Recent neural supervised topic segmentation models achieve markedly superior effectiveness over unsupervised methods, thanks to the availability of large-scale training corpora sampled from Wikipedia.
1 code implementation • 7 Sep 2022 • Wen Xiao, Giuseppe Carenini
Despite the success of recent abstractive summarizers on automatic evaluation metrics, the generated summaries still present factual inconsistencies with the source document.
1 code implementation • 12 Aug 2022 • Maria Laricheva, Chiyu Zhang, Yan Liu, GuanYu Chen, Terence Tracey, Richard Young, Giuseppe Carenini
Conversational data is essential in psychology because it can help researchers understand individuals' cognitive processes, emotions, and behaviors.
no code implementations • NAACL 2022 • Patrick Huber, Giuseppe Carenini
With a growing number of BERTology work analyzing different components of pre-trained language models, we extend this line of research through an in-depth analysis of discourse information in pre-trained and fine-tuned language models.
no code implementations • 12 Dec 2021 • Patrick Huber, Linzi Xing, Giuseppe Carenini
RST-style discourse parsing plays a vital role in many NLP tasks, revealing the underlying semantic/pragmatic structure of potentially complex and diverse documents.
1 code implementation • 10 Dec 2021 • Raymond Li, Wen Xiao, Linzi Xing, Lanjun Wang, Gabriel Murray, Giuseppe Carenini
The multi-head self-attention mechanism of the transformer model has been thoroughly investigated recently.
3 code implementations • ACL 2022 • Wen Xiao, Iz Beltagy, Giuseppe Carenini, Arman Cohan
We introduce PRIMERA, a pre-trained model for multi-document representation with a focus on summarization that reduces the need for dataset-specific architectures and large amounts of labeled fine-tuning data.
Ranked #1 on Multi-Document Summarization on Multi-News
1 code implementation • 31 Aug 2021 • Raymond Li, Wen Xiao, Lanjun Wang, Hyeju Jang, Giuseppe Carenini
Transformers are the dominant architecture in NLP, but their training and fine-tuning are still very challenging.
no code implementations • 30 Aug 2021 • Raymond Li, Enamul Hoque, Giuseppe Carenini, Richard Lester, Raymond Chau
The proliferation of text messaging for mobile health is generating a large volume of patient-doctor conversations that can be extremely valuable to health care professionals.
1 code implementation • SIGDIAL (ACL) 2021 • Linzi Xing, Giuseppe Carenini
Dialogue topic segmentation is critical in several dialogue modeling problems.
no code implementations • ACL 2021 • Patrick Huber, Wen Xiao, Giuseppe Carenini
Aiming for a better integration of data-driven and linguistically-inspired approaches, we explore whether RST Nuclearity, which assigns a binary assessment of importance between text segments, can be replaced by automatically generated, real-valued scores, in what we call a Weighted-RST framework.
no code implementations • ACL 2021 • Linzi Xing, Wen Xiao, Giuseppe Carenini
In news articles, lead bias is a common phenomenon that usually dominates the learning signals for neural extractive summarizers, severely limiting their performance on data with different or even no bias.
1 code implementation • NAACL 2021 • Wen Xiao, Patrick Huber, Giuseppe Carenini
Previous work indicates that discourse information benefits summarization.
no code implementations • 17 Dec 2020 • Patrick Huber, Giuseppe Carenini
In this paper, we infer general tree structures of natural text in multiple domains, showing promising results on a diverse set of tasks.
no code implementations • EMNLP (CODI) 2020 • Wen Xiao, Patrick Huber, Giuseppe Carenini
The multi-head self-attention of popular transformer models is widely used within Natural Language Processing (NLP), including for the task of extractive summarization.
no code implementations • COLING 2020 • Grigorii Guz, Patrick Huber, Giuseppe Carenini
RST-based discourse parsing is an important NLP task with numerous downstream applications, such as summarization, machine translation and opinion mining.
Ranked #18 on Discourse Parsing on RST-DT (Standard Parseval (Span) metric)
1 code implementation • Asian Chapter of the Association for Computational Linguistics 2020 • Wen Xiao, Giuseppe Carenini
Our analysis of large summarization datasets indicates that redundancy is a very serious problem when summarizing long documents.
Ranked #16 on Text Summarization on Pubmed
no code implementations • 6 Nov 2020 • Grigorii Guz, Patrick Huber, Giuseppe Carenini
RST-based discourse parsing is an important NLP task with numerous downstream applications, such as summarization, machine translation and opinion mining.
Ranked #9 on Discourse Parsing on Instructional-DT (Instr-DT)
1 code implementation • EMNLP 2020 • Patrick Huber, Giuseppe Carenini
The lack of large and diverse discourse treebanks hinders the application of data-driven approaches, such as deep-learning, to RST-style discourse parsing.
no code implementations • COLING 2020 • Patrick Huber, Giuseppe Carenini
Sentiment analysis, especially for long documents, plausibly requires methods capturing complex linguistics structures.
1 code implementation • 4 Nov 2020 • Tanzila Rahman, Shih-Han Chou, Leonid Sigal, Giuseppe Carenini
We also propose a multimodal fusion module to combine visual and textual information.
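As a toy illustration only (the paper's fusion design is more involved), a basic fusion module might concatenate the two modalities and project them into a shared space.

```python
# Minimal sketch of a concatenate-and-project multimodal fusion module; dimensions
# and the single linear layer are assumptions, not the paper's architecture.
import torch
import torch.nn as nn

class SimpleFusion(nn.Module):
    def __init__(self, vis_dim, txt_dim, out_dim):
        super().__init__()
        self.proj = nn.Linear(vis_dim + txt_dim, out_dim)

    def forward(self, vis_feats, txt_feats):
        fused = torch.cat([vis_feats, txt_feats], dim=-1)  # join the two modalities
        return torch.relu(self.proj(fused))                # shared multimodal representation
```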
no code implementations • DT4TP 2020 • Grigorii Guz, Giuseppe Carenini
With the goal of fostering more general and data-driven approaches to text structuring, we propose the new and domain-independent NLG task of structuring and ordering a (possibly large) set of EDUs.
no code implementations • Asian Chapter of the Association for Computational Linguistics 2020 • Linzi Xing, Brad Hackinen, Giuseppe Carenini, Francesco Trebbi
Topic segmentation is critical in key NLP tasks and recent works favor highly effective neural supervised approaches.
1 code implementation • Asian Chapter of the Association for Computational Linguistics 2020 • Grigorii Guz, Peyman Bateni, Darius Muglich, Giuseppe Carenini
We evaluate our approach on the Grammarly Corpus for Discourse Coherence (GCDC) and show that, when ensembled with the current state of the art, we can achieve new state-of-the-art accuracy on this benchmark.
Ranked #1 on Coherence Evaluation on GCDC + RST - F1
no code implementations • EMNLP (NLP-COVID19) 2020 • Hyeju Jang, Emily Rempel, Giuseppe Carenini, Naveed Janjua
We also examine people's sentiment about COVID-19 related issues.
Task: Aspect-Based Sentiment Analysis (ABSA)
no code implementations • WS 2019 • John-Jose Nunez, Giuseppe Carenini
Pre-trained word embeddings are becoming increasingly popular for natural language processing tasks.
no code implementations • IJCNLP 2019 • Patrick Huber, Giuseppe Carenini
Results indicate that while our parser does not yet match the performance of a parser trained and tested on the same dataset (intra-domain), it does perform remarkably well on the much more difficult and arguably more useful task of inter-domain discourse structure prediction, where the parser is trained on one domain and tested/applied on another one.
1 code implementation • IJCNLP 2019 • Wen Xiao, Giuseppe Carenini
In this paper, we propose a novel neural single document extractive summarization model for long documents, incorporating both the global context of the whole document and the local context within the current topic.
Ranked #20 on Text Summarization on Arxiv HEP-TH citation graph
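As a rough sketch of the global-plus-local context idea described in the entry above (the layer choices and combination scheme here are assumptions, not the paper's exact model), per-sentence extraction scores could combine each sentence with its topic segment and the whole document:

```python
# Toy sketch: score sentences using local (topic-segment) and global (document) context.
import torch
import torch.nn as nn

class GlobalLocalScorer(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.score = nn.Linear(3 * dim, 1)

    def forward(self, sent_vecs, topic_ids):
        """sent_vecs: (n, dim) sentence encodings; topic_ids: (n,) tensor of segment indices."""
        doc_vec = sent_vecs.mean(dim=0)                               # global document context
        scores = []
        for i, s in enumerate(sent_vecs):
            local = sent_vecs[topic_ids == topic_ids[i]].mean(dim=0)  # local topic context
            scores.append(self.score(torch.cat([s, local, doc_vec])))
        return torch.stack(scores).squeeze(-1)                        # per-sentence scores
```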
1 code implementation • IJCNLP 2019 • Linzi Xing, Michael J. Paul, Giuseppe Carenini
Probabilistic topic models such as latent Dirichlet allocation (LDA) are popularly used with Bayesian inference methods such as Gibbs sampling to learn posterior distributions over topic model parameters.
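For reference, the standard collapsed Gibbs sampling update used with LDA, where $n^{-i}_{d,k}$ counts words in document $d$ assigned to topic $k$ and $n^{-i}_{k,w_i}$ counts occurrences of word $w_i$ assigned to topic $k$ (both excluding position $i$), $V$ is the vocabulary size, and $\alpha, \beta$ are the Dirichlet hyperparameters:

```latex
% Probability of assigning topic k to the word at position i, given all other assignments
P(z_i = k \mid \mathbf{z}_{-i}, \mathbf{w}) \;\propto\;
  \left(n^{-i}_{d,k} + \alpha\right)\,
  \frac{n^{-i}_{k,w_i} + \beta}{n^{-i}_{k,\cdot} + V\beta}
```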
no code implementations • ACL 2019 • Shafiq Joty, Giuseppe Carenini, Raymond Ng, Gabriel Murray
Discourse processing is a suite of Natural Language Processing (NLP) tasks to uncover linguistic structures from texts at several levels, which can support many downstream applications.
no code implementations • IJCNLP 2017 • Shikib Mehri, Giuseppe Carenini
Thread disentanglement is a precursor to any high-level analysis of multiparticipant chats.
no code implementations • WS 2017 • Karan Singla, Evgeny Stepanov, Ali Orkan Bayer, Giuseppe Carenini, Giuseppe Riccardi
Summarization of spoken conversations is a challenging task, since it requires deep understanding of dialogs.
no code implementations • WS 2017 • Enamul Hoque, Giuseppe Carenini
With the proliferation of Web-based social media, asynchronous conversations have become very common for supporting online communication and collaboration.
no code implementations • WS 2017 • Bita Nejat, Giuseppe Carenini, Raymond Ng
Discourse Parsing and Sentiment Analysis are two fundamental tasks in Natural Language Processing that have been shown to be mutually beneficial.
Ranked #26 on Sentiment Analysis on SST-5 Fine-grained classification
1 code implementation • WS 2017 • Vaden Masrani, Gabriel Murray, Thalia Field, Giuseppe Carenini
We investigate if writers with dementia can be automatically distinguished from those without by analyzing linguistic markers in written text, in the form of blog posts.
no code implementations • WS 2017 • Jordon Johnson, Vaden Masrani, Giuseppe Carenini, Raymond Ng
We define and motivate the problem of summarizing partial email threads.
no code implementations • COLING 2016 • Kailang Jiang, Giuseppe Carenini, Raymond Ng
We propose a training data enrichment framework that relies on co-training of two different discourse parsers on unlabeled documents.
no code implementations • COLING 2016 • Enamul Hoque, Shafiq Joty, Lluís Màrquez, Alberto Barrón-Cedeño, Giovanni Da San Martino, Alessandro Moschitti, Preslav Nakov, Salvatore Romeo, Giuseppe Carenini
We present an interactive system to provide effective and efficient search capabilities in Community Question Answering (cQA) forums.
no code implementations • CL 2015 • Shafiq Joty, Giuseppe Carenini, Raymond T. Ng
Ranked #8 on Discourse Parsing on RST-DT (RST-Parseval (Span) metric)
no code implementations • 4 Feb 2014 • Shafiq Rayhan Joty, Giuseppe Carenini, Raymond T. Ng
Topic segmentation and labeling is often considered a prerequisite for higher-level conversation analysis and has been shown to be useful in many Natural Language Processing (NLP) applications.
no code implementations • AAAI 2008 • Jan Ulrich, Gabriel Murray, Giuseppe Carenini
Annotated email corpora are necessary for evaluation and training of machine learning summarization techniques.