no code implementations • 6 Mar 2018 • Willie Boag, Elena Sergeeva, Saurabh Kulshreshtha, Peter Szolovits, Anna Rumshisky, Tristan Naumann
Clinical notes often describe important aspects of a patient's stay and are therefore critical to medical research.
no code implementations • EMNLP 2017 • Yuanliang Meng, Anna Rumshisky, Alexey Romanov
In this paper, we propose to use a set of simple LSTM-based models with a uniform architecture to recover different kinds of temporal relations from text.
no code implementations • EMNLP 2017 • Peter Potash, Alexey Romanov, Anna Rumshisky
One of the major goals in automated argumentation mining is to uncover the argument structure present in argumentative text.
no code implementations • 1 May 2017 • Alexey Romanov, Anna Rumshisky
Learning a better representation with neural networks is a challenging problem, which has been tackled extensively from different perspectives in the past few years.
no code implementations • 9 Dec 2016 • Peter Potash, Alexey Romanov, Anna Rumshisky
Our best supervised system achieved 63.7% accuracy, suggesting that this task is much more difficult than comparable humor detection tasks.
no code implementations • WS 2018 • Peter Potash, Alexey Romanov, Anna Rumshisky
The goal of this paper is to develop evaluation methods for one such task, ghostwriting of rap lyrics, and to provide an explicit, quantifiable foundation for the goals and future directions of this task.
no code implementations • 16 Oct 2015 • Weiyi Sun, Anna Rumshisky, Ozlem Uzuner
We analyze the RI-TIMEXes in temporally annotated corpora and propose two hypotheses regarding the normalization of RI-TIMEXes in the clinical narrative domain: the anchor point hypothesis and the anchor relation hypothesis.
no code implementations • 11 Oct 2018 • David Donahue, Anna Rumshisky
This is largely because sequences of text are discrete, and thus gradients cannot propagate from the discriminator to the generator.
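The snippet above names the core obstacle for text GANs: sampling discrete tokens is non-differentiable. One standard workaround (named here for context, not as this paper's method) is the Gumbel-softmax relaxation, which replaces the hard sample with a soft, differentiable one. A minimal NumPy sketch, with illustrative logits:

```python
# Minimal sketch of the Gumbel-softmax relaxation. Sampling a token index
# from categorical logits blocks gradients; the soft sample below is a
# differentiable function of the logits instead.
import numpy as np

def gumbel_softmax_sample(logits, temperature=1.0, rng=None):
    """Draw a differentiable 'soft' one-hot sample from categorical logits."""
    rng = rng or np.random.default_rng(0)
    # Gumbel(0, 1) noise makes argmax(logits + noise) an exact categorical
    # sample; softmax with a temperature relaxes that argmax into a soft vector.
    gumbel = -np.log(-np.log(rng.uniform(size=logits.shape)))
    y = (logits + gumbel) / temperature
    y = np.exp(y - y.max())
    return y / y.sum()

logits = np.array([2.0, 0.5, -1.0])  # unnormalized vocabulary scores
soft = gumbel_softmax_sample(logits, temperature=0.5)
# soft is a probability vector; as temperature -> 0 it approaches one-hot
```

As the temperature anneals toward zero the soft sample approaches a true one-hot token, while gradients can still flow from a discriminator through `soft` to the generator's logits.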
no code implementations • EMNLP 2018 • Olga Kovaleva, Anna Rumshisky, Alexey Romanov
This paper addresses the problem of representation learning.
no code implementations • ACL 2018 • Yuanliang Meng, Anna Rumshisky
We propose a context-aware neural network model for temporal information extraction.
no code implementations • SEMEVAL 2017 • Peter Potash, Alexey Romanov, Anna Rumshisky
This paper describes a new shared task for humor understanding that attempts to eschew the ubiquitous binary approach to humor detection and focus on comparative humor ranking instead.
no code implementations • SEMEVAL 2017 • David Donahue, Alexey Romanov, Anna Rumshisky
This paper describes the winning system for SemEval-2017 Task 6: #HashtagWars: Learning a Sense of Humor.
no code implementations • EMNLP 2017 • Peter Potash, Anna Rumshisky
In this paper we introduce a practical first step towards the creation of an automated debate agent: a state-of-the-art recurrent predictive model for predicting debate winners.
no code implementations • WS 2017 • Peter Potash, Alexey Romanov, Anna Rumshisky, Mikhail Gronas
We show that on the task of predicting which side is likely to prefer a given article, a Naive Bayes classifier can reach 90.3% accuracy looking only at the domain names of the news sources.
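A toy sketch of that kind of classifier: Naive Bayes over character n-grams of domain names alone. The domains and labels below are made up for illustration; this is not the paper's dataset or pipeline.

```python
# Illustrative only: a Naive Bayes classifier that predicts which side
# prefers an article using nothing but the news source's domain name.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Hypothetical training data: (domain name, side that shares it)
domains = ["left-news.example", "blue-daily.example",
           "right-wire.example", "red-report.example"]
sides = ["left", "left", "right", "right"]

# Represent each domain name as a bag of character n-grams
vectorizer = CountVectorizer(analyzer="char", ngram_range=(2, 4))
X = vectorizer.fit_transform(domains)

clf = MultinomialNB()
clf.fit(X, sides)

# Predict the side for an unseen domain
pred = clf.predict(vectorizer.transform(["blue-wire.example"]))
```

Character n-grams let the model generalize to unseen domains that share fragments with known ones, which is one plausible reason a domain-name-only baseline can be so strong.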
no code implementations • COLING 2018 • Anna Rogers, Alexey Romanov, Anna Rumshisky, Svitlana Volkova, Mikhail Gronas, Alex Gribov
This paper presents RuSentiment, a new dataset for sentiment analysis of social media posts in Russian, and a new set of comprehensive annotation guidelines that are extensible to other languages.
Ranked #2 on Sentiment Analysis on RuSentiment
no code implementations • COLING 2018 • Anna Rogers, Shashwath Hosur Ananthakrishna, Anna Rumshisky
Attempts to find a single technique for general-purpose intrinsic evaluation of word embeddings have so far not been successful.
no code implementations • IJCNLP 2017 • Peter Potash, Robin Bhattacharya, Anna Rumshisky
In this work, we provide insight into three key aspects related to predicting argument convincingness.
no code implementations • ICLR 2018 • Alexey Romanov, Anna Rumshisky
Learning a better representation with neural networks is a challenging problem, which has been tackled from different perspectives in the past few years.
no code implementations • LREC 2012 • Anna Rumshisky, Nick Botchan, Sophie Kushkuley, James Pustejovsky
In this paper, we explore different strategies for implementing a crowdsourcing methodology for a single-step construction of an empirically-derived sense inventory and the corresponding sense-annotated corpus.
no code implementations • NAACL 2019 • Alexey Romanov, Maria De-Arteaga, Hanna Wallach, Jennifer Chayes, Christian Borgs, Alexandra Chouldechova, Sahin Geyik, Krishnaram Kenthapadi, Anna Rumshisky, Adam Tauman Kalai
In the context of mitigating bias in occupation classification, we propose a method for discouraging correlation between the predicted probability of an individual's true occupation and a word embedding of their name.
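A toy NumPy illustration of the kind of quantity such a method discourages: the correlation between a model's predicted probabilities and a scalar feature derived from a name embedding. The penalty form and data here are assumptions for illustration, not the paper's exact objective.

```python
# Illustrative correlation penalty: large when predictions leak
# information from a name-derived feature, small when they do not.
import numpy as np

def correlation_penalty(pred_prob, name_feature):
    """Absolute Pearson correlation between predictions and a name feature."""
    p = pred_prob - pred_prob.mean()
    n = name_feature - name_feature.mean()
    return abs((p * n).sum() / (np.sqrt((p**2).sum() * (n**2).sum()) + 1e-12))

rng = np.random.default_rng(0)
name_feat = rng.normal(size=100)        # e.g., one name-embedding dimension
biased_pred = 0.5 + 0.1 * name_feat     # predictions that leak the name
fair_pred = rng.uniform(size=100)       # predictions that ignore the name

leaky = correlation_penalty(biased_pred, name_feat)
clean = correlation_penalty(fair_pred, name_feat)
```

Adding a term like `leaky` to the task loss pushes the classifier toward predictions that carry no linear signal about the name.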
no code implementations • IJCNLP 2019 • Olga Kovaleva, Alexey Romanov, Anna Rogers, Anna Rumshisky
BERT-based architectures currently give state-of-the-art performance on many NLP tasks, but little is known about the exact mechanisms that contribute to their success.
no code implementations • 28 Aug 2019 • Yuanliang Meng, Anna Rumshisky
This paper proposes a Transformer-based model to generate equations for math word problems.
no code implementations • 29 Aug 2019 • Anna Rogers, Marzena Karpinska, Ankita Gupta, Vladislav Lialin, Gregory Smelkov, Anna Rumshisky
For the past decade, temporal annotation has been sparse: only a small portion of event pairs in a text was annotated.
no code implementations • 16 Oct 2019 • David Donahue, Yuanliang Meng, Anna Rumshisky
The first design features a sequence-to-sequence architecture with two separate NTM modules, one for each participant in the conversation.
no code implementations • WS 2019 • Anna Rogers, Olga Kovaleva, Anna Rumshisky
Calls to action on social media are known to be effective means of mobilization in social movements, and a frequent target of censorship.
no code implementations • 27 Feb 2020 • Anna Rogers, Olga Kovaleva, Anna Rumshisky
Transformer-based models have pushed the state of the art in many areas of NLP, but our understanding of what is behind their success is still limited.
no code implementations • EMNLP 2020 • Sai Prasanna, Anna Rogers, Anna Rumshisky
Large Transformer-based models were shown to be reducible to a smaller number of self-attention heads and layers.
no code implementations • WS 2020 • Olga Kovaleva, Chaitanya Shivade, Satyananda Kashyap, Karina Kanjaria, Joy Wu, Deddeh Ballah, Adam Coy, Alexandros Karargyris, Yufan Guo, David Beymer, Anna Rumshisky, Vandana Mukherjee
Using MIMIC-CXR, an openly available database of chest X-ray images, we construct both a synthetic and a real-world dataset and provide baseline scores achieved by state-of-the-art models.
no code implementations • 15 Oct 2020 • Vladislav Lialin, Rahul Goel, Andrey Simanovsky, Anna Rumshisky, Rushin Shah
To reduce training time, one can fine-tune the previously trained model on each patch, but naive fine-tuning exhibits catastrophic forgetting: degradation of the model's performance on data not represented in the patch.
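One standard remedy for that kind of forgetting (named here for context, not necessarily this paper's method) is rehearsal: mixing replayed examples from the original training set into each patch batch so the model keeps seeing old data. A minimal sketch with made-up data:

```python
# Rehearsal-style batching for fine-tuning on a data patch: each batch
# combines new patch examples with examples replayed from the old data.
import random

def patched_batches(patch, replay_buffer, batch_size=8, replay_frac=0.25,
                    seed=0):
    """Yield batches of patch examples mixed with replayed old examples."""
    rng = random.Random(seed)
    n_replay = int(batch_size * replay_frac)
    n_new = batch_size - n_replay
    for i in range(0, len(patch), n_new):
        batch = patch[i:i + n_new] + rng.sample(replay_buffer, n_replay)
        rng.shuffle(batch)
        yield batch

old_data = [f"old-{i}" for i in range(100)]   # stand-in for original data
patch = [f"patch-{i}" for i in range(12)]     # stand-in for the data patch
batches = list(patched_batches(patch, old_data))
```

The replay fraction trades off adaptation speed on the patch against retention of performance on the original distribution.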
no code implementations • Findings (ACL) 2021 • Olga Kovaleva, Saurabh Kulshreshtha, Anna Rogers, Anna Rumshisky
Multiple studies have shown that Transformers are remarkably robust to pruning.
no code implementations • 14 Jul 2021 • Christophe Dupuy, Radhika Arava, Rahul Gupta, Anna Rumshisky
However, the data used to train NLU models may contain private information such as addresses or phone numbers, particularly when drawn from human subjects.
no code implementations • COLING 2020 • Anna Rogers, Anna Rumshisky
Question answering, natural language inference and commonsense reasoning are increasingly popular as general NLP system benchmarks, driving both modeling and dataset work.
no code implementations • NAACL 2022 • Rahul Sharma, Anil Ramakrishna, Ansel MacLaughlin, Anna Rumshisky, Jimit Majmudar, Clement Chung, Salman Avestimehr, Rahul Gupta
Federated learning (FL) has recently emerged as a method for training ML models on edge devices using sensitive user data and is seen as a way to mitigate concerns over data privacy.
no code implementations • NAACL (ACL) 2022 • Manoj Kumar, Yuval Merhav, Haidar Khan, Rahul Gupta, Anna Rumshisky, Wael Hamza
Use of synthetic data is rapidly emerging as a realistic alternative to manually annotating live traffic for industry-scale model building.
no code implementations • 8 Oct 2022 • Tzu-Hsiang Lin, Ta-Chung Chi, Anna Rumshisky
Recent advancements in dialogue response selection (DRS) are based on the task-adaptive pre-training (TAP) approach: first initializing the model with BERT (Devlin et al., 2019), then adapting to dialogue data with dialogue-specific or fine-grained pre-training tasks.
no code implementations • 15 Nov 2022 • Saurabh Kulshreshtha, Anna Rumshisky
Multi-hop Question Generation is the task of generating questions which require the reader to reason over and combine information spread across multiple passages using several reasoning steps.
no code implementations • 4 Apr 2023 • Vladislav Lialin, Stephen Rawls, David Chan, Shalini Ghosh, Anna Rumshisky, Wael Hamza
The currently popular approach of mining video-text data via automatic speech recognition (ASR), as used in HowTo100M, provides low-quality captions that often do not refer to the video content.
Automatic Speech Recognition (ASR) +5
no code implementations • 14 Jun 2023 • Saleh Soltan, Andy Rosenbaum, Tobias Falke, Qin Lu, Anna Rumshisky, Wael Hamza
(2) Conversely, using an encoder to warm-start seq2seq training, we show that by unfreezing the encoder partway through training, we can match the task performance of a from-scratch seq2seq model.
no code implementations • 10 Nov 2023 • Sarah Pan, Vladislav Lialin, Sherin Muckatira, Anna Rumshisky
While recent advances have boosted LM proficiency in linguistic benchmarks, LMs consistently struggle to reason correctly on complex tasks like mathematics.
no code implementations • 24 Feb 2024 • Yao Qiang, Subhrangshu Nandi, Ninareh Mehrabi, Greg Ver Steeg, Anoop Kumar, Anna Rumshisky, Aram Galstyan
However, their performance on sequence labeling tasks such as intent classification and slot filling (IC-SF), which is a central component in personal assistant systems, lags significantly behind discriminative models.
1 code implementation • 2 Apr 2024 • Namrata Shivagunde, Vladislav Lialin, Sherin Muckatira, Anna Rumshisky
In contrast, the underlying pre-trained LLMs they use as a backbone are known to be brittle in this respect.
1 code implementation • 28 Mar 2023 • Vladislav Lialin, Vijeta Deshpande, Anna Rumshisky
This paper presents a systematic overview and comparison of parameter-efficient fine-tuning methods covering over 40 papers published between February 2019 and February 2023.
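One of the best-known families covered by such surveys is low-rank adaptation (LoRA): freeze a pre-trained weight matrix and learn only a low-rank additive update. A minimal NumPy sketch with illustrative shapes and rank:

```python
# LoRA in miniature: the frozen weight W is augmented with a trainable
# low-rank update B @ A, so only d*r*2 parameters are learned instead of d*d.
import numpy as np

rng = np.random.default_rng(0)
d, r = 64, 4                          # hidden size, adapter rank
W = rng.normal(size=(d, d))           # frozen pre-trained weight
A = rng.normal(size=(r, d)) * 0.01    # trainable rank-r factor
B = np.zeros((d, r))                  # zero-initialized: no change at start

def lora_forward(x):
    # Frozen path plus the low-rank trainable update
    return x @ W.T + x @ (B @ A).T

x = rng.normal(size=(2, d))
out = lora_forward(x)
# With B = 0, the adapter contributes nothing, so out == x @ W.T
```

Because `B` starts at zero, training begins exactly at the pre-trained model, and only the small factors `A` and `B` receive gradient updates.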
1 code implementation • 26 May 2023 • Vijeta Deshpande, Dan Pechi, Shree Thatte, Vladislav Lialin, Anna Rumshisky
The majority of recent scaling-law studies have focused on high-compute, high-parameter-count settings, leaving the question of when these abilities begin to emerge largely unanswered.
1 code implementation • 29 Mar 2023 • Namrata Shivagunde, Vladislav Lialin, Anna Rumshisky
Finally, we observe that while GPT-3 generated all the examples in ROLE-1500, it is only able to solve 24.6% of them during probing.
1 code implementation • 2 Apr 2024 • Sherin Muckatira, Vijeta Deshpande, Vladislav Lialin, Anna Rumshisky
Large language models can solve new tasks without task-specific fine-tuning.
1 code implementation • ACL 2022 • Saurabh Kulshreshtha, Olga Kovaleva, Namrata Shivagunde, Anna Rumshisky
Solving crossword puzzles requires diverse reasoning capabilities, access to a vast amount of knowledge about language and the world, and the ability to satisfy the constraints imposed by the structure of the puzzle.
Natural Language Understanding • Open-Domain Question Answering +1
2 code implementations • 16 Oct 2019 • David Donahue, Vladislav Lialin, Anna Rumshisky
The Transformer architecture has become increasingly popular over the past two years, owing to its impressive performance on a number of natural language processing (NLP) tasks.
1 code implementation • ACL 2022 • Vladislav Lialin, Kevin Zhao, Namrata Shivagunde, Anna Rumshisky
Existing pre-trained transformer analysis works usually focus only on one or two model families at a time, overlooking the variability of the architecture and pre-training objectives.
1 code implementation • COLING 2018 • Yuanliang Meng, Anna Rumshisky
We propose a triad-based neural network system that generates affinity scores between entity mentions for coreference resolution.
1 code implementation • NAACL (ClinicalNLP) 2022 • Eric Lehman, Vladislav Lialin, Katelyn Y. Legaspi, Anne Janelle R. Sy, Patricia Therese S. Pile, Nicole Rose I. Alberto, Richard Raymund R. Ragasa, Corinna Victoria M. Puyat, Isabelle Rose I. Alberto, Pia Gabrielle I. Alfonso, Marianne Taliño, Dana Moukheiber, Byron C. Wallace, Anna Rumshisky, Jenifer J. Liang, Preethi Raghavan, Leo Anthony Celi, Peter Szolovits
The questions are generated by medical experts from 100+ MIMIC-III discharge summaries.
2 code implementations • NAACL 2019 • Alexey Romanov, Anna Rumshisky, Anna Rogers, David Donahue
We show that the proposed method is capable of fine-grained controlled change of these aspects of the input sentence.
1 code implementation • 21 Jul 2021 • Mikhail Burtsev, Anna Rumshisky
Transformer-based encoder-decoder models produce a fused token-wise representation after every encoder layer.
1 code implementation • 2 Aug 2022 • Saleh Soltan, Shankar Ananthakrishnan, Jack FitzGerald, Rahul Gupta, Wael Hamza, Haidar Khan, Charith Peris, Stephen Rawls, Andy Rosenbaum, Anna Rumshisky, Chandana Satya Prakash, Mukund Sridhar, Fabian Triefenbach, Apurv Verma, Gokhan Tur, Prem Natarajan
In this work, we demonstrate that multilingual large-scale sequence-to-sequence (seq2seq) models, pre-trained on a mixture of denoising and Causal Language Modeling (CLM) tasks, are more efficient few-shot learners than decoder-only models on various tasks.
Ranked #14 on Natural Language Inference on CommitmentBank
3 code implementations • 11 Jul 2023 • Vladislav Lialin, Namrata Shivagunde, Sherin Muckatira, Anna Rumshisky
Despite the dominance and effectiveness of scaling, resulting in large networks with hundreds of billions of parameters, the necessity to train overparameterized models remains poorly understood, while training costs grow exponentially.