no code implementations • NAACL (ACL) 2022 • Manoj Kumar, Yuval Merhav, Haidar Khan, Rahul Gupta, Anna Rumshisky, Wael Hamza
Use of synthetic data is rapidly emerging as a realistic alternative to manually annotating live traffic for industry-scale model building.
no code implementations • 4 Apr 2023 • Vladislav Lialin, Stephen Rawls, David Chan, Shalini Ghosh, Anna Rumshisky, Wael Hamza
The currently popular video-text data mining approach via automatic speech recognition (ASR), used in HowTo100M, provides low-quality captions that often do not refer to the video content.
Automatic Speech Recognition (ASR)
1 code implementation • 29 Mar 2023 • Namrata Shivagunde, Vladislav Lialin, Anna Rumshisky
Language model probing is often used to test specific capabilities of these models.
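A minimal sketch of the cloze-style probing setup this line refers to, assuming the Hugging Face `transformers` library; the model and prompt are illustrative, not the paper's actual probe:

```python
# Cloze-style LM probing: ask a masked LM to fill a blank and inspect
# the ranked candidates. Model and prompt are illustrative only.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")
for pred in fill_mask("The robin is a [MASK]."):
    print(f"{pred['token_str']:>12}  {pred['score']:.3f}")
```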
no code implementations • 28 Mar 2023 • Vladislav Lialin, Vijeta Deshpande, Anna Rumshisky
This paper presents a systematic overview and comparison of parameter-efficient fine-tuning methods covering over 40 papers published between February 2019 and February 2023.
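One family of methods covered by such surveys can be sketched in a few lines: a LoRA-style low-rank adapter that freezes the pre-trained weight and trains only a small update. A minimal PyTorch sketch with illustrative sizes, not tied to any specific paper's recipe:

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen pre-trained linear layer plus a trainable low-rank update."""
    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():        # freeze pre-trained weights
            p.requires_grad_(False)
        self.lora_a = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.lora_b = nn.Parameter(torch.zeros(base.out_features, rank))
        self.scale = alpha / rank

    def forward(self, x):
        return self.base(x) + self.scale * (x @ self.lora_a.T @ self.lora_b.T)

layer = LoRALinear(nn.Linear(768, 768))
out = layer(torch.randn(2, 10, 768))            # only lora_a/lora_b train
```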
no code implementations • 15 Nov 2022 • Saurabh Kulshreshtha, Anna Rumshisky
Multi-hop Question Generation is the task of generating questions which require the reader to reason over and combine information spread across multiple passages using several reasoning steps.
no code implementations • 8 Oct 2022 • Tzu-Hsiang Lin, Ta-Chung Chi, Anna Rumshisky
Recent advances in dialogue response selection (DRS) follow the task-adaptive pre-training (TAP) approach: a model is first initialized with BERT (Devlin et al., 2019) and then adapted to dialogue data with dialogue-specific or fine-grained pre-training tasks.
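The generic TAP recipe can be sketched with standard `transformers` components: continue masked-language-model pre-training on in-domain dialogue text before fine-tuning for response selection. Data handling is elided and the checkpoint name is illustrative:

```python
from transformers import (AutoModelForMaskedLM, AutoTokenizer,
                          DataCollatorForLanguageModeling)

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")
collator = DataCollatorForLanguageModeling(tokenizer, mlm_probability=0.15)
# Feed tokenized dialogue turns through a Trainer with this collator, then
# replace the MLM head with a response-selection head and fine-tune.
```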
1 code implementation • 2 Aug 2022 • Saleh Soltan, Shankar Ananthakrishnan, Jack FitzGerald, Rahul Gupta, Wael Hamza, Haidar Khan, Charith Peris, Stephen Rawls, Andy Rosenbaum, Anna Rumshisky, Chandana Satya Prakash, Mukund Sridhar, Fabian Triefenbach, Apurv Verma, Gokhan Tur, Prem Natarajan
In this work, we demonstrate that multilingual large-scale sequence-to-sequence (seq2seq) models, pre-trained on a mixture of denoising and Causal Language Modeling (CLM) tasks, are more efficient few-shot learners than decoder-only models on various tasks.
Ranked #8 on Natural Language Inference on CommitmentBank
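The two pre-training objectives the paper mixes can be illustrated schematically; the span corruption below is a simplification, not the paper's exact noising recipe:

```python
import random

def denoising_example(tokens, span=3):
    """Span corruption: mask a span, reconstruct the full sequence."""
    i = random.randrange(max(1, len(tokens) - span))
    corrupted = tokens[:i] + ["<mask>"] + tokens[i + span:]
    return corrupted, tokens                 # encoder input, decoder target

def clm_example(tokens, prefix=2):
    """Causal LM: given a prefix, predict the continuation."""
    return tokens[:prefix], tokens[prefix:]

toks = "the cat sat on the mat".split()
print(denoising_example(toks))
print(clm_example(toks))
```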
1 code implementation • NAACL (ClinicalNLP) 2022 • Eric Lehman, Vladislav Lialin, Katelyn Y. Legaspi, Anne Janelle R. Sy, Patricia Therese S. Pile, Nicole Rose I. Alberto, Richard Raymund R. Ragasa, Corinna Victoria M. Puyat, Isabelle Rose I. Alberto, Pia Gabrielle I. Alfonso, Marianne Taliño, Dana Moukheiber, Byron C. Wallace, Anna Rumshisky, Jenifer J. Liang, Preethi Raghavan, Leo Anthony Celi, Peter Szolovits
The questions are generated by medical experts from 100+ MIMIC-III discharge summaries.
1 code implementation • ACL 2022 • Vladislav Lialin, Kevin Zhao, Namrata Shivagunde, Anna Rumshisky
Existing analyses of pre-trained Transformers usually focus on only one or two model families at a time, overlooking the variability of architectures and pre-training objectives.
1 code implementation • ACL 2022 • Saurabh Kulshreshtha, Olga Kovaleva, Namrata Shivagunde, Anna Rumshisky
Solving crossword puzzles requires diverse reasoning capabilities, access to a vast amount of knowledge about language and the world, and the ability to satisfy the constraints imposed by the structure of the puzzle.
Natural Language Understanding
Open-Domain Question Answering
no code implementations • NAACL 2022 • Rahul Sharma, Anil Ramakrishna, Ansel MacLaughlin, Anna Rumshisky, Jimit Majmudar, Clement Chung, Salman Avestimehr, Rahul Gupta
Federated learning (FL) has recently emerged as a method for training ML models on edge devices using sensitive user data and is seen as a way to mitigate concerns over data privacy.
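The core FL aggregation step (FedAvg) is simple to sketch: the server averages client weights rather than collecting raw user data. A generic illustration, unrelated to the paper's specific setup:

```python
import torch

def fed_avg(client_state_dicts):
    """Average the parameters of several client models (FedAvg step)."""
    avg = {}
    for key in client_state_dicts[0]:
        avg[key] = torch.stack(
            [sd[key].float() for sd in client_state_dicts]).mean(dim=0)
    return avg

# server_model.load_state_dict(fed_avg([c.state_dict() for c in clients]))
```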
1 code implementation • 21 Jul 2021 • Mikhail Burtsev, Anna Rumshisky
Transformer-based encoder-decoder models produce a fused token-wise representation after every encoder layer.
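One way to add capacity beyond the token-wise representation, in the spirit of this line of work, is to prepend trainable memory vectors to the token embeddings; dimensions below are illustrative:

```python
import torch
import torch.nn as nn

num_mem, d_model = 10, 512
memory = nn.Parameter(torch.randn(num_mem, d_model) * 0.02)  # [mem] tokens
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True),
    num_layers=2)

tokens = torch.randn(4, 20, d_model)            # (batch, seq, dim)
with_mem = torch.cat([memory.expand(4, -1, -1), tokens], dim=1)
out = encoder(with_mem)                         # (4, 10 + 20, 512)
```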
no code implementations • 14 Jul 2021 • Christophe Dupuy, Radhika Arava, Rahul Gupta, Anna Rumshisky
However, the data used to train NLU models may contain private information such as addresses or phone numbers, particularly when drawn from human subjects.
no code implementations • Findings (ACL) 2021 • Olga Kovaleva, Saurabh Kulshreshtha, Anna Rogers, Anna Rumshisky
Multiple studies have shown that Transformers are remarkably robust to pruning.
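Robustness to pruning is easy to test in practice: Hugging Face BERT models accept a `head_mask` that zeroes out chosen attention heads at inference time. Which heads are disabled below is arbitrary:

```python
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

head_mask = torch.ones(12, 12)      # (num_layers, num_heads)
head_mask[0, :6] = 0                # disable half the heads in layer 0

inputs = tokenizer("Pruning test.", return_tensors="pt")
outputs = model(**inputs, head_mask=head_mask)
```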
no code implementations • COLING 2020 • Anna Rogers, Anna Rumshisky
Question answering, natural language inference and commonsense reasoning are increasingly popular as general NLP system benchmarks, driving both modeling and dataset work.
no code implementations • 15 Oct 2020 • Vladislav Lialin, Rahul Goel, Andrey Simanovsky, Anna Rumshisky, Rushin Shah
To reduce training time, one can fine-tune the previously trained model on each patch, but naive fine-tuning exhibits catastrophic forgetting: the model's performance degrades on data not represented in the patch.
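The failure mode and one common mitigation (experience replay) can be sketched generically; `training_step` is a hypothetical stand-in for a real training loop, and none of this is the paper's method:

```python
import random

def finetune(model, batches):
    for batch in batches:
        model.training_step(batch)   # hypothetical stand-in

def finetune_with_replay(model, patch, old_data, replay_frac=0.3):
    """Mix a sample of old data into the patch to reduce forgetting."""
    k = int(len(patch) * replay_frac)
    mixed = list(patch) + random.sample(old_data, k)
    random.shuffle(mixed)
    finetune(model, mixed)
```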
no code implementations • WS 2020 • Olga Kovaleva, Chaitanya Shivade, Satyananda Kashyap, Karina Kanjaria, Joy Wu, Deddeh Ballah, Adam Coy, Alexandros Karargyris, Yufan Guo, David Beymer, Anna Rumshisky, Vandana Mukherjee
Using MIMIC-CXR, an openly available database of chest X-ray images, we construct both a synthetic and a real-world dataset and provide baseline scores achieved by state-of-the-art models.
no code implementations • EMNLP 2020 • Sai Prasanna, Anna Rogers, Anna Rumshisky
Large Transformer-based models have been shown to be reducible to a smaller number of self-attention heads and layers.
no code implementations • 27 Feb 2020 • Anna Rogers, Olga Kovaleva, Anna Rumshisky
Transformer-based models have pushed the state of the art in many areas of NLP, but our understanding of what is behind their success is still limited.
no code implementations • WS 2019 • Anna Rogers, Olga Kovaleva, Anna Rumshisky
Calls to action on social media are known to be effective means of mobilization in social movements, and a frequent target of censorship.
no code implementations • 16 Oct 2019 • David Donahue, Yuanliang Meng, Anna Rumshisky
The first design features a sequence-to-sequence architecture with two separate NTM modules, one for each participant in the conversation.
2 code implementations • 16 Oct 2019 • David Donahue, Vladislav Lialin, Anna Rumshisky
The Transformer architecture has become increasingly popular over the past two years, owing to its impressive performance on a number of natural language processing (NLP) tasks.
no code implementations • 29 Aug 2019 • Anna Rogers, Marzena Karpinska, Ankita Gupta, Vladislav Lialin, Gregory Smelkov, Anna Rumshisky
For the past decade, temporal annotation has been sparse: only a small portion of event pairs in a text was annotated.
no code implementations • 28 Aug 2019 • Yuanliang Meng, Anna Rumshisky
This paper proposes a Transformer-based model to generate equations for math word problems.
no code implementations • IJCNLP 2019 • Olga Kovaleva, Alexey Romanov, Anna Rogers, Anna Rumshisky
BERT-based architectures currently give state-of-the-art performance on many NLP tasks, but little is known about the exact mechanisms that contribute to their success.
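Analyses of this kind typically start from the raw attention maps, which `transformers` exposes directly; the sentence below is just an example input:

```python
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

inputs = tokenizer("Attention is all you need.", return_tensors="pt")
with torch.no_grad():
    out = model(**inputs, output_attentions=True)

# One tensor per layer, each of shape (batch, heads, seq, seq).
print(len(out.attentions), out.attentions[0].shape)
```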
no code implementations • NAACL 2019 • Alexey Romanov, Maria De-Arteaga, Hanna Wallach, Jennifer Chayes, Christian Borgs, Alexandra Chouldechova, Sahin Geyik, Krishnaram Kenthapadi, Anna Rumshisky, Adam Tauman Kalai
In the context of mitigating bias in occupation classification, we propose a method for discouraging correlation between the predicted probability of an individual's true occupation and a word embedding of their name.
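An illustrative loss in the spirit of that idea adds a penalty on the covariance between the predicted probability of the true class and a scalar projection of the name embedding; the exact formulation in the paper may differ:

```python
import torch
import torch.nn.functional as F

def debiased_loss(logits, labels, name_proj, lam=1.0):
    """Cross-entropy plus a penalty on correlation with name information."""
    ce = F.cross_entropy(logits, labels)
    p_true = logits.softmax(-1).gather(1, labels[:, None]).squeeze(1)
    cov = ((p_true - p_true.mean()) * (name_proj - name_proj.mean())).mean()
    return ce + lam * cov.abs()
```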
no code implementations • 11 Oct 2018 • David Donahue, Anna Rumshisky
This is largely because sequences of text are discrete, and thus gradients cannot propagate from the discriminator to the generator.
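The problem, and one standard workaround (the Gumbel-softmax relaxation), fit in a few lines; this is a generic illustration, not the paper's approach:

```python
import torch
import torch.nn.functional as F

logits = torch.randn(1, 5, requires_grad=True)

hard = logits.argmax(dim=-1)                # discrete sample: no gradient
soft = F.gumbel_softmax(logits, tau=0.5)    # differentiable relaxation
soft.sum().backward()
print(hard, logits.grad is not None)        # gradients flow via soft path
```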
no code implementations • EMNLP 2018 • Olga Kovaleva, Anna Rumshisky, Alexey Romanov
This paper addresses the problem of representation learning.
1 code implementation • COLING 2018 • Yuanliang Meng, Anna Rumshisky
We propose a triad-based neural network system that generates affinity scores between entity mentions for coreference resolution.
2 code implementations • NAACL 2019 • Alexey Romanov, Anna Rumshisky, Anna Rogers, David Donahue
We show that the proposed method is capable of fine-grained controlled change of these aspects of the input sentence.
no code implementations • COLING 2018 • Anna Rogers, Shashwath Hosur Ananthakrishna, Anna Rumshisky
Attempts to find a single technique for general-purpose intrinsic evaluation of word embeddings have so far not been successful.
no code implementations • COLING 2018 • Anna Rogers, Alexey Romanov, Anna Rumshisky, Svitlana Volkova, Mikhail Gronas, Alex Gribov
This paper presents RuSentiment, a new dataset for sentiment analysis of social media posts in Russian, and a new set of comprehensive annotation guidelines that are extensible to other languages.
Ranked #2 on Sentiment Analysis on RuSentiment
no code implementations • ACL 2018 • Yuanliang Meng, Anna Rumshisky
We propose a context-aware neural network model for temporal information extraction.
no code implementations • 6 Mar 2018 • Willie Boag, Elena Sergeeva, Saurabh Kulshreshtha, Peter Szolovits, Anna Rumshisky, Tristan Naumann
Clinical notes often describe important aspects of a patient's stay and are therefore critical to medical research.
no code implementations • ICLR 2018 • Alexey Romanov, Anna Rumshisky
Learning a better representation with neural networks is a challenging problem, which has been tackled from different perspectives in the past few years.
no code implementations • IJCNLP 2017 • Peter Potash, Robin Bhattacharya, Anna Rumshisky
In this work, we provide insight into three key aspects related to predicting argument convincingness.
no code implementations • WS 2017 • Peter Potash, Alexey Romanov, Anna Rumshisky, Mikhail Gronas
We show that on the task of predicting which side is likely to prefer a given article, a Naive Bayes classifier can record 90.3% accuracy looking only at domain names of the news sources.
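A minimal version of that baseline with scikit-learn, using character n-grams over domain names; the toy domains and labels below are made up:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

domains = ["leftnews.example", "rightpost.example", "leftnews.example"]
labels = ["left", "right", "left"]          # illustrative labels only

clf = make_pipeline(CountVectorizer(analyzer="char_wb", ngram_range=(2, 4)),
                    MultinomialNB())
clf.fit(domains, labels)
print(clf.predict(["rightpost.example"]))
```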
no code implementations • EMNLP 2017 • Peter Potash, Anna Rumshisky
In this paper we introduce a practical first step towards the creation of an automated debate agent: a state-of-the-art recurrent predictive model for predicting debate winners.
no code implementations • SEMEVAL 2017 • Peter Potash, Alexey Romanov, Anna Rumshisky
This paper describes a new shared task for humor understanding that attempts to eschew the ubiquitous binary approach to humor detection and focus on comparative humor ranking instead.
no code implementations • SEMEVAL 2017 • David Donahue, Alexey Romanov, Anna Rumshisky
This paper describes the winning system for SemEval-2017 Task 6: #HashtagWars: Learning a Sense of Humor.
no code implementations • 1 May 2017 • Alexey Romanov, Anna Rumshisky
Learning a better representation with neural networks is a challenging problem, which has been tackled extensively from different perspectives in the past few years.
no code implementations • EMNLP 2017 • Yuanliang Meng, Anna Rumshisky, Alexey Romanov
In this paper, we propose to use a set of simple LSTM-based models, uniform in architecture, to recover different kinds of temporal relations from text.
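A schematic of such a relation classifier: encode the two event contexts with a shared LSTM, concatenate the final states, and classify. Sizes and the label set are illustrative:

```python
import torch
import torch.nn as nn

class RelationLSTM(nn.Module):
    def __init__(self, vocab=10000, dim=100, hidden=128, num_rels=6):
        super().__init__()
        self.emb = nn.Embedding(vocab, dim)
        self.lstm = nn.LSTM(dim, hidden, batch_first=True)
        self.out = nn.Linear(2 * hidden, num_rels)

    def forward(self, ctx_a, ctx_b):
        _, (ha, _) = self.lstm(self.emb(ctx_a))
        _, (hb, _) = self.lstm(self.emb(ctx_b))
        return self.out(torch.cat([ha[-1], hb[-1]], dim=-1))

model = RelationLSTM()
scores = model(torch.randint(0, 10000, (2, 12)),
               torch.randint(0, 10000, (2, 12)))   # (2, 6) relation logits
```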
no code implementations • EMNLP 2017 • Peter Potash, Alexey Romanov, Anna Rumshisky
One of the major goals in automated argumentation mining is to uncover the argument structure present in argumentative text.
no code implementations • WS 2018 • Peter Potash, Alexey Romanov, Anna Rumshisky
The goal of this paper is to develop evaluation methods for one such task, ghostwriting of rap lyrics, and to provide an explicit, quantifiable foundation for the goals and future directions of this task.
no code implementations • 9 Dec 2016 • Peter Potash, Alexey Romanov, Anna Rumshisky
Our best supervised system achieved 63.7% accuracy, suggesting that this task is much more difficult than comparable humor detection tasks.
no code implementations • 16 Oct 2015 • Weiyi Sun, Anna Rumshisky, Ozlem Uzuner
We analyze the RI-TIMEXes in temporally annotated corpora and propose two hypotheses regarding the normalization of RI-TIMEXes in the clinical narrative domain: the anchor point hypothesis and the anchor relation hypothesis.
no code implementations • LREC 2012 • Anna Rumshisky, Nick Botchan, Sophie Kushkuley, James Pustejovsky
In this paper, we explore different strategies for implementing a crowdsourcing methodology for a single-step construction of an empirically-derived sense inventory and the corresponding sense-annotated corpus.