no code implementations • sdp (COLING) 2022 • Sajad Sotudeh, Nazli Goharian
This paper presents our approach for the MuP 2022 shared task (Multi-Perspective Scientific Document Summarization), where the objective is to enable summarization models to explore methods for generating multi-perspective summaries for scientific papers.
no code implementations • EMNLP (sdp) 2020 • Sajad Sotudeh Gharebagh, Arman Cohan, Nazli Goharian
A two stage model that additionally includes an abstraction step using BART; and 3.
1 code implementation • LREC 2022 • Hrishikesh Kulkarni, Sean MacAvaney, Nazli Goharian, Ophir Frieder
To complement this evaluation, we propose a dynamic thresholding technique that adjusts the classifier’s sensitivity as a function of the number of posts a user has.
no code implementations • 1 Nov 2024 • Sajad Sotudeh, Nazli Goharian
Compared to the state of the art, our model outperforms on the QMSum benchmark (all metrics) and matches on the SQuALITY benchmark (2 metrics), as measured by ROUGE and BERTScore, while offering a lower training overhead.
1 code implementation • 25 Aug 2024 • Hrishikesh Kulkarni, Nazli Goharian, Ophir Frieder, Sean MacAvaney
We address hallucination by adapting an existing genetic generation approach with a new 'balanced fitness function' consisting of a cross-encoder model for relevance and an n-gram overlap metric to promote grounding.
1 code implementation • 25 Aug 2024 • Hrishikesh Kulkarni, Nazli Goharian, Ophir Frieder, Sean MacAvaney
For efficiency, approximation methods like HNSW are frequently used to approximate exhaustive dense retrieval.
1 code implementation • 31 Jul 2023 • Hrishikesh Kulkarni, Sean MacAvaney, Nazli Goharian, Ophir Frieder
We introduce 'LADR' (Lexically-Accelerated Dense Retrieval), a simple-yet-effective approach that improves the efficiency of existing dense retrieval models without compromising on retrieval effectiveness.
no code implementations • 14 Jul 2023 • Sajad Sotudeh, Nazli Goharian
Query-focused summarization (QFS) is a challenging task in natural language processing that generates summaries to address specific queries.
no code implementations • 2 Feb 2023 • Sajad Sotudeh, Nazli Goharian, Hanieh Deilamsalehy, Franck Dernoncourt
Automatically generating short summaries from users' online mental health posts could save counselors' reading time and reduce their fatigue so that they can provide timely responses to those seeking help for improving their mental state.
no code implementations • 2 Feb 2023 • Sajad Sotudeh, Hanieh Deilamsalehy, Franck Dernoncourt, Nazli Goharian
Recent Transformer-based summarization models have provided a promising approach to abstractive summarization.
no code implementations • LREC 2022 • Sajad Sotudeh, Nazli Goharian, Zachary Young
Some of these platforms, such as Reachout, are dedicated forums where the users register to seek help.
Ranked #1 on Text Summarization on MentSum
2 code implementations • NAACL 2022 • Sajad Sotudeh, Nazli Goharian
Recent interest in tackling this problem motivated the curation of two scientific datasets, arXiv-Long and PubMed-Long, containing human-written summaries of 400-600 words, hence providing a venue for research in generating long/extended summaries.
1 code implementation • EMNLP (newsum) 2021 • Sajad Sotudeh, Hanieh Deilamsalehy, Franck Dernoncourt, Nazli Goharian
Recent models in developing summarization systems consist of millions of parameters and the model performance is highly dependent on the abundance of training data.
Ranked #1 on Extreme Summarization on TLDR9+
1 code implementation • 3 Mar 2021 • Sean MacAvaney, Andrew Yates, Sergey Feldman, Doug Downey, Arman Cohan, Nazli Goharian
Managing the data for Information Retrieval (IR) experiments can be challenging.
no code implementations • EACL (WASSA) 2021 • Tong Xiang, Sean MacAvaney, Eugene Yang, Nazli Goharian
Despite the recent successes of transformer-based models in terms of effectiveness on a variety of tasks, their decisions often remain opaque to humans.
1 code implementation • 28 Dec 2020 • Sajad Sotudeh, Arman Cohan, Nazli Goharian
We then present our results on three long summarization datasets, arXiv-Long, PubMed-Long, and Longsumm.
Ranked #1 on Extended Summarization on arXiv-Long Test
2 code implementations • 2 Nov 2020 • Sean MacAvaney, Sergey Feldman, Nazli Goharian, Doug Downey, Arman Cohan
Pretrained contextualized language models such as BERT and T5 have established a new state-of-the-art for ad-hoc search.
no code implementations • EMNLP 2020 • Sean MacAvaney, Arman Cohan, Nazli Goharian
With worldwide concerns surrounding the Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2), there is a rapidly growing body of scientific literature on the virus.
no code implementations • SEMEVAL 2020 • Michael Kranzlein, Shabnam Behzad, Nazli Goharian
This paper presents our systems for SemEval 2020 Shared Task 11: Detection of Propaganda Techniques in News Articles.
no code implementations • SEMEVAL 2020 • Sajad Sotudeh, Tong Xiang, Hao-Ren Yao, Sean MacAvaney, Eugene Yang, Nazli Goharian, Ophir Frieder
Offensive language detection is an important and challenging task in natural language processing.
no code implementations • 18 May 2020 • Sean MacAvaney, Franck Dernoncourt, Walter Chang, Nazli Goharian, Ophir Frieder
We present an elegant and effective approach for addressing limitations in existing multi-label classification models by incorporating interaction matching, a concept shown to be useful for ad-hoc search result ranking.
1 code implementation • 5 May 2020 • Sean MacAvaney, Arman Cohan, Nazli Goharian
In this work, we present a search system called SLEDGE, which utilizes SciBERT to effectively re-rank articles.
no code implementations • ACL 2020 • Sajad Sotudeh, Nazli Goharian, Ross W. Filice
The sequence-to-sequence (seq2seq) network is a well-established model for the text summarization task.
1 code implementation • 29 Apr 2020 • Sean MacAvaney, Franco Maria Nardini, Raffaele Perego, Nicola Tonellotto, Nazli Goharian, Ophir Frieder
Deep pretrained transformer networks are effective at various ranking tasks, such as question answering and ad-hoc document ranking.
1 code implementation • 29 Apr 2020 • Sean MacAvaney, Franco Maria Nardini, Raffaele Perego, Nicola Tonellotto, Nazli Goharian, Ophir Frieder
We show that the proposed heuristics can be used to build a training curriculum that down-weights difficult samples early in the training process.
1 code implementation • 29 Apr 2020 • Sean MacAvaney, Franco Maria Nardini, Raffaele Perego, Nicola Tonellotto, Nazli Goharian, Ophir Frieder
We also observe that the performance is additive with the current leading first-stage retrieval methods, further narrowing the gap between inexpensive and cost-prohibitive passage ranking approaches.
no code implementations • 18 Jan 2020 • Sean MacAvaney, Arman Cohan, Nazli Goharian, Ross Filice
This allows medical practitioners to easily identify and learn from the reports in which their interpretation most substantially differed from that of the attending physician (who finalized the report).
1 code implementation • 30 Dec 2019 • Sean MacAvaney, Luca Soldaini, Nazli Goharian
While billions of non-English speaking users rely on search engines every day, the problem of ad-hoc information retrieval is rarely studied for non-English languages.
no code implementations • 14 May 2019 • Sean MacAvaney, Sajad Sotudeh, Arman Cohan, Nazli Goharian, Ish Talati, Ross W. Filice
Automatically generating accurate summaries from clinical reports could save a clinician's time, improve summary coverage, and reduce errors.
7 code implementations • 15 Apr 2019 • Sean MacAvaney, Andrew Yates, Arman Cohan, Nazli Goharian
We call this joint approach CEDR (Contextualized Embeddings for Document Ranking).
Ranked #3 on Ad-Hoc Information Retrieval on TREC Robust04
no code implementations • WS 2018 • Sean MacAvaney, Bart Desmet, Arman Cohan, Luca Soldaini, Andrew Yates, Ayah Zirikly, Nazli Goharian
Self-reported diagnosis statements have been widely employed in studying language related to mental health in social media.
1 code implementation • COLING 2018 • Arman Cohan, Bart Desmet, Andrew Yates, Luca Soldaini, Sean MacAvaney, Nazli Goharian
Mental health is a significant and growing public health concern.
no code implementations • WS 2018 • Luca Soldaini, Timothy Walsh, Arman Cohan, Julien Han, Nazli Goharian
In recent years, online communities have formed around suicide and self-harm prevention.
2 code implementations • NAACL 2018 • Arman Cohan, Franck Dernoncourt, Doo Soon Kim, Trung Bui, Seokhwan Kim, Walter Chang, Nazli Goharian
Neural abstractive summarization models have led to promising results in summarizing relatively short documents.
Ranked #4 on Unsupervised Extractive Summarization on Pubmed
1 code implementation • SEMEVAL 2018 • Sean MacAvaney, Luca Soldaini, Arman Cohan, Nazli Goharian
SemEval 2018 Task 7 focuses on relation extraction and classification in scientific literature.
no code implementations • EMNLP 2017 • Andrew Yates, Arman Cohan, Nazli Goharian
We propose methods for identifying posts in support communities that may indicate a risk of self-harm, and demonstrate that our approach outperforms strong previously proposed methods for identifying such posts.
no code implementations • 15 Aug 2017 • Arman Cohan, Allan Fong, Raj Ratwani, Nazli Goharian
Preventable medical errors are estimated to be among the leading causes of injury and death in the United States.
no code implementations • SEMEVAL 2017 • Sean MacAvaney, Arman Cohan, Nazli Goharian
Clinical TempEval 2017 (SemEval 2017 Task 12) addresses the task of cross-domain temporal extraction from clinical text.
no code implementations • 12 Jun 2017 • Arman Cohan, Nazli Goharian
We present a framework for scientific summarization which takes advantage of the citations and the scientific discourse structure.
no code implementations • 23 May 2017 • Arman Cohan, Nazli Goharian
Citation texts are sometimes not very informative or in some cases inaccurate by themselves; they need the appropriate context from the referenced paper to reflect its exact contributions.
1 code implementation • EMNLP 2015 • Arman Cohan, Nazli Goharian
We propose a summarization approach for scientific articles which takes advantage of citation-context and the document discourse model.
no code implementations • 23 Feb 2017 • Arman Cohan, Allan Fong, Nazli Goharian, Raj Ratwani
Medical errors are leading causes of death in the US and as such, prevention of these errors is paramount to promoting health care.
no code implementations • 22 Feb 2017 • Arman Cohan, Sydney Young, Andrew Yates, Nazli Goharian
Our analysis of the moderators' interaction with the users further indicates that without an automatic way to identify critical content, it is indeed challenging for the moderators to provide timely responses to the users in need.
no code implementations • LREC 2016 • Andrew Yates, Alek Kolcz, Nazli Goharian, Ophir Frieder
In this work we use a larger feed to investigate the effects of sampling on Twitter trend detection.
1 code implementation • LREC 2016 • Arman Cohan, Nazli Goharian
Finally, we propose an alternative metric for summarization evaluation which is based on the content relevance between a system generated summary and the corresponding human written summaries.
no code implementations • LREC 2014 • Andrew Yates, Jon Parker, Nazli Goharian, Ophir Frieder
With the rapid growth of social media, there is increasing potential to augment traditional public health surveillance methods with data from social media.