no code implementations • EMNLP (sdp) 2020 • Sajad Sotudeh Gharebagh, Arman Cohan, Nazli Goharian
A two-stage model that additionally includes an abstraction step using BART; and 3.
1 code implementation • EMNLP (newsum) 2021 • Sajad Sotudeh, Hanieh Deilamsalehy, Franck Dernoncourt, Nazli Goharian
Recent summarization models consist of millions of parameters, and their performance is highly dependent on the abundance of training data.
Ranked #1 on Extreme Summarization on TLDR9+
1 code implementation • 3 Mar 2021 • Sean MacAvaney, Andrew Yates, Sergey Feldman, Doug Downey, Arman Cohan, Nazli Goharian
Managing the data for Information Retrieval (IR) experiments can be challenging.
no code implementations • EACL (WASSA) 2021 • Tong Xiang, Sean MacAvaney, Eugene Yang, Nazli Goharian
Despite the recent successes of transformer-based models in terms of effectiveness on a variety of tasks, their decisions often remain opaque to humans.
1 code implementation • 28 Dec 2020 • Sajad Sotudeh, Arman Cohan, Nazli Goharian
We then present our results on three long summarization datasets, arXiv-Long, PubMed-Long, and Longsumm.
Ranked #1 on Extended Summarization on Longsumm Val
1 code implementation • 2 Nov 2020 • Sean MacAvaney, Sergey Feldman, Nazli Goharian, Doug Downey, Arman Cohan
We present a new comprehensive framework for Analyzing the Behavior of Neural IR ModeLs (ABNIRML), which includes new types of diagnostic tests that allow us to probe several characteristics (such as sensitivity to word order) that are not addressed by previous techniques.
no code implementations • EMNLP 2020 • Sean MacAvaney, Arman Cohan, Nazli Goharian
With worldwide concerns surrounding the Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2), there is a rapidly growing body of scientific literature on the virus.
no code implementations • SEMEVAL 2020 • Michael Kranzlein, Shabnam Behzad, Nazli Goharian
This paper presents our systems for SemEval 2020 Shared Task 11: Detection of Propaganda Techniques in News Articles.
no code implementations • SEMEVAL 2020 • Sajad Sotudeh, Tong Xiang, Hao-Ren Yao, Sean MacAvaney, Eugene Yang, Nazli Goharian, Ophir Frieder
Offensive language detection is an important and challenging task in natural language processing.
no code implementations • 18 May 2020 • Sean MacAvaney, Franck Dernoncourt, Walter Chang, Nazli Goharian, Ophir Frieder
We present an elegant and effective approach for addressing limitations in existing multi-label classification models by incorporating interaction matching, a concept shown to be useful for ad-hoc search result ranking.
1 code implementation • 5 May 2020 • Sean MacAvaney, Arman Cohan, Nazli Goharian
In this work, we present a search system called SLEDGE, which utilizes SciBERT to effectively re-rank articles.
no code implementations • ACL 2020 • Sajad Sotudeh, Nazli Goharian, Ross W. Filice
The sequence-to-sequence (seq2seq) network is a well-established model for the text summarization task.
1 code implementation • 29 Apr 2020 • Sean MacAvaney, Franco Maria Nardini, Raffaele Perego, Nicola Tonellotto, Nazli Goharian, Ophir Frieder
Deep pretrained transformer networks are effective at various ranking tasks, such as question answering and ad-hoc document ranking.
1 code implementation • 29 Apr 2020 • Sean MacAvaney, Franco Maria Nardini, Raffaele Perego, Nicola Tonellotto, Nazli Goharian, Ophir Frieder
We show that the proposed heuristics can be used to build a training curriculum that down-weights difficult samples early in the training process.
1 code implementation • 29 Apr 2020 • Sean MacAvaney, Franco Maria Nardini, Raffaele Perego, Nicola Tonellotto, Nazli Goharian, Ophir Frieder
We also observe that the performance is additive with the current leading first-stage retrieval methods, further narrowing the gap between inexpensive and cost-prohibitive passage ranking approaches.
no code implementations • 18 Jan 2020 • Sean MacAvaney, Arman Cohan, Nazli Goharian, Ross Filice
This allows medical practitioners to easily identify and learn from the reports in which their interpretation most substantially differed from that of the attending physician (who finalized the report).
1 code implementation • 30 Dec 2019 • Sean MacAvaney, Luca Soldaini, Nazli Goharian
While billions of non-English speaking users rely on search engines every day, the problem of ad-hoc information retrieval is rarely studied for non-English languages.
no code implementations • 14 May 2019 • Sean MacAvaney, Sajad Sotudeh, Arman Cohan, Nazli Goharian, Ish Talati, Ross W. Filice
Automatically generating accurate summaries from clinical reports could save a clinician's time, improve summary coverage, and reduce errors.
6 code implementations • 15 Apr 2019 • Sean MacAvaney, Andrew Yates, Arman Cohan, Nazli Goharian
We call this joint approach CEDR (Contextualized Embeddings for Document Ranking).
Ranked #3 on Ad-Hoc Information Retrieval on TREC Robust04
no code implementations • WS 2018 • Sean MacAvaney, Bart Desmet, Arman Cohan, Luca Soldaini, Andrew Yates, Ayah Zirikly, Nazli Goharian
Self-reported diagnosis statements have been widely employed in studying language related to mental health in social media.
no code implementations • COLING 2018 • Arman Cohan, Bart Desmet, Andrew Yates, Luca Soldaini, Sean MacAvaney, Nazli Goharian
Mental health is a significant and growing public health concern.
no code implementations • WS 2018 • Luca Soldaini, Timothy Walsh, Arman Cohan, Julien Han, Nazli Goharian
In recent years, online communities have formed around suicide and self-harm prevention.
2 code implementations • NAACL 2018 • Arman Cohan, Franck Dernoncourt, Doo Soon Kim, Trung Bui, Seokhwan Kim, Walter Chang, Nazli Goharian
Neural abstractive summarization models have led to promising results in summarizing relatively short documents.
Ranked #4 on Unsupervised Extractive Summarization on Pubmed
Abstractive Text Summarization
Unsupervised Extractive Summarization
1 code implementation • SEMEVAL 2018 • Sean MacAvaney, Luca Soldaini, Arman Cohan, Nazli Goharian
SemEval 2018 Task 7 focuses on relation extraction and classification in scientific literature.
no code implementations • EMNLP 2017 • Andrew Yates, Arman Cohan, Nazli Goharian
We propose methods for identifying posts in support communities that may indicate a risk of self-harm, and demonstrate that our approach outperforms strong previously proposed methods for identifying such posts.
no code implementations • 15 Aug 2017 • Arman Cohan, Allan Fong, Raj Ratwani, Nazli Goharian
Preventable medical errors are estimated to be among the leading causes of injury and death in the United States.
no code implementations • SEMEVAL 2017 • Sean MacAvaney, Arman Cohan, Nazli Goharian
Clinical TempEval 2017 (SemEval 2017 Task 12) addresses the task of cross-domain temporal extraction from clinical text.
no code implementations • 12 Jun 2017 • Arman Cohan, Nazli Goharian
We present a framework for scientific summarization which takes advantage of the citations and the scientific discourse structure.
no code implementations • 23 May 2017 • Arman Cohan, Nazli Goharian
Citation texts are sometimes not very informative or in some cases inaccurate by themselves; they need the appropriate context from the referenced paper to reflect its exact contributions.
1 code implementation • EMNLP 2015 • Arman Cohan, Nazli Goharian
We propose a summarization approach for scientific articles which takes advantage of citation-context and the document discourse model.
no code implementations • 23 Feb 2017 • Arman Cohan, Allan Fong, Nazli Goharian, Raj Ratwani
Medical errors are a leading cause of death in the US, and as such, prevention of these errors is paramount to promoting health care.
no code implementations • 22 Feb 2017 • Arman Cohan, Sydney Young, Andrew Yates, Nazli Goharian
Our analysis of the interaction between moderators and users further indicates that, without an automatic way to identify critical content, it is indeed challenging for moderators to provide a timely response to users in need.
no code implementations • LREC 2016 • Andrew Yates, Alek Kolcz, Nazli Goharian, Ophir Frieder
In this work we use a larger feed to investigate the effects of sampling on Twitter trend detection.
no code implementations • LREC 2016 • Arman Cohan, Nazli Goharian
Finally, we propose an alternative metric for summarization evaluation which is based on the content relevance between a system-generated summary and the corresponding human-written summaries.
no code implementations • LREC 2014 • Andrew Yates, Jon Parker, Nazli Goharian, Ophir Frieder
With the rapid growth of social media, there is increasing potential to augment traditional public health surveillance methods with data from social media.