1 code implementation • EMNLP 2020 • Sida Gao, Matthew R. Gormley
Most recent improvements in NLP come from changes to the neural network architectures modeling the text input.
no code implementations • EMNLP 2021 • Yang Liu, Hua Cheng, Russell Klopfer, Matthew R. Gormley, Thomas Schaaf
Multi-label document classification (MLDC) problems can be challenging, especially for long documents with a large label set and a long-tail distribution over labels.
Ranked #2 on Medical Code Prediction on MIMIC-III
no code implementations • 14 Nov 2023 • Yilin Wang, Xinyi Hu, Matthew R. Gormley
In this paper, we introduce the entanglement model, aiming to combine character and subword language models.
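The excerpt does not specify the entanglement architecture; as a point of reference only, here is the simplest possible combination of the two granularities (mean-pooling character embeddings into each subword), a naive baseline sketch rather than the paper's model:

```python
import torch
import torch.nn as nn

# Naive illustration of combining character- and subword-level
# representations: character embeddings are mean-pooled within each
# subword and added to the subword embedding. The paper's entanglement
# model is a learned architecture, not this simple sum.

char_vocab, subword_vocab, d = 100, 500, 64
char_emb = nn.Embedding(char_vocab, d)
sub_emb = nn.Embedding(subword_vocab, d)

subwords = torch.tensor([42, 17])                         # two subword ids
chars = [torch.tensor([3, 9, 4]), torch.tensor([7, 2])]   # their characters

combined = torch.stack([
    sub_emb(sw) + char_emb(cs).mean(dim=0)  # pool chars into each subword
    for sw, cs in zip(subwords, chars)
])
print(combined.shape)  # torch.Size([2, 64])
```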
1 code implementation • 2 Oct 2023 • Amanda Bertsch, Alex Xie, Graham Neubig, Matthew R. Gormley
Minimum Bayes Risk (MBR) decoding is a method for choosing the outputs of a machine learning system based not on the output with the highest probability, but on the output with the lowest risk (expected error) among multiple candidates.
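A minimal sketch of the MBR selection rule described above; the candidate pool doubles as the set of pseudo-references, and the token-overlap F1 utility is an illustrative stand-in for the metrics (BLEU, BERTScore, etc.) that real MBR systems use:

```python
# Minimal Minimum Bayes Risk (MBR) decoding sketch.

def utility(hyp: str, ref: str) -> float:
    """Token-overlap F1 between a hypothesis and a pseudo-reference."""
    h, r = hyp.split(), ref.split()
    overlap = len(set(h) & set(r))
    if overlap == 0:
        return 0.0
    p, rec = overlap / len(h), overlap / len(r)
    return 2 * p * rec / (p + rec)

def mbr_decode(candidates):
    """Pick the candidate with highest expected utility (lowest risk),
    using the candidate pool itself as samples from the model."""
    best, best_score = None, float("-inf")
    for hyp in candidates:
        # Expected utility of hyp against all pseudo-references.
        score = sum(utility(hyp, ref) for ref in candidates) / len(candidates)
        if score > best_score:
            best, best_score = hyp, score
    return best

samples = ["the cat sat on the mat",
           "a cat sat on the mat",
           "the cat is sitting on a mat"]
print(mbr_decode(samples))
```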
1 code implementation • ACL 2023 • Hua Cheng, Rana Jafari, April Russell, Russell Klopfer, Edmond Lu, Benjamin Striner, Matthew R. Gormley
In this paper, we introduce MDACE, the first publicly available code evidence dataset, which is built on a subset of the MIMIC-III clinical records.
1 code implementation • 30 Jun 2023 • Yash Mathur, Sanketh Rangreji, Raghav Kapoor, Medha Palavalli, Amanda Bertsch, Matthew R. Gormley
For full-note summarization (Task B), we use a similar solution with k=1.
1 code implementation • NeurIPS 2023 • Amanda Bertsch, Uri Alon, Graham Neubig, Matthew R. Gormley
This kNN index can be kept in either GPU or CPU memory and queried in sub-linear time; this way, we can index practically unlimited input sequences, while every attention head in every decoder layer retrieves its top-k keys instead of attending to every key.
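A sketch of the retrieval idea: each query attends only to its top-k keys rather than all of them. Exact `torch.topk` over a tensor stands in here for the approximate kNN index a real system would keep in GPU or CPU memory:

```python
import torch

# Retrieval-based attention sketch: instead of attending over every
# encoder state, each query retrieves its top-k keys from a
# (potentially huge) index. The "index" here is just a tensor with
# exact top-k search, standing in for an approximate kNN index.

def topk_attention(query, keys, values, k=16):
    # query: (d,), keys/values: (n, d) with n possibly in the millions
    scores = keys @ query                      # (n,) dot-product scores
    top_scores, idx = torch.topk(scores, k)    # retrieve only k keys
    probs = torch.softmax(top_scores / keys.shape[-1] ** 0.5, dim=0)
    return probs @ values[idx]                 # (d,) attention output

n, d = 100_000, 64                             # very long input sequence
keys, values = torch.randn(n, d), torch.randn(n, d)
query = torch.randn(d)
print(topk_attention(query, keys, values, k=16).shape)  # torch.Size([64])
```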
no code implementations • 30 Nov 2022 • John Glover, Federico Fancellu, Vasudevan Jagannathan, Matthew R. Gormley, Thomas Schaaf
In this paper, we systematically compare different granularities of decomposition, from document to sub-sentence level, and we show that the answer is no.
1 code implementation • 21 Nov 2022 • Arindam Ghosh, Thomas Schaaf, Matthew R. Gormley
In this paper, we propose a calibration-aware adaptive focal loss called AdaFocal that utilizes the calibration properties of focal (and inverse-focal) loss and adaptively modifies $\gamma_t$ for different groups of samples, based on $\gamma_{t-1}$ from the previous step and knowledge of the model's under/over-confidence on the validation set.
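A toy sketch of this calibration-aware control loop: a focal loss whose $\gamma$ is nudged up when validation confidence exceeds accuracy (over-confidence) and down otherwise. The additive update, step size, and bounds here are assumptions for illustration; the paper's rule operates per sample group and differs in form:

```python
import torch
import torch.nn.functional as F

def focal_loss(logits, targets, gamma):
    # Focal loss: -(1 - p_t)^gamma * log(p_t)
    logp = F.log_softmax(logits, dim=-1)
    logp_t = logp.gather(1, targets.unsqueeze(1)).squeeze(1)
    p_t = logp_t.exp()
    return -((1 - p_t) ** gamma * logp_t).mean()

def update_gamma(gamma, val_confidence, val_accuracy, step=0.5,
                 g_min=-2.0, g_max=20.0):
    # Over-confident (confidence > accuracy): raise gamma to penalize
    # confident predictions more; under-confident: lower it. Negative
    # gamma corresponds to the inverse-focal regime.
    gamma = gamma + step if val_confidence > val_accuracy else gamma - step
    return max(g_min, min(g_max, gamma))

gamma = 1.0
logits = torch.randn(8, 5, requires_grad=True)
targets = torch.randint(0, 5, (8,))
loss = focal_loss(logits, targets, gamma)
loss.backward()
gamma = update_gamma(gamma, val_confidence=0.9, val_accuracy=0.8)
print(float(loss), gamma)
```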
1 code implementation • 27 Oct 2022 • Amanda Bertsch, Graham Neubig, Matthew R. Gormley
As a sample application, we demonstrate that applying perspective shifting to a dialogue summarization dataset (SAMSum) substantially improves the zero-shot performance of extractive news summarization models on this data.
1 code implementation • ACL 2022 • Joel Ruben Antony Moniz, Barun Patra, Matthew R. Gormley
When tasked with supporting multiple languages for a given problem, two approaches have arisen: training a model for each language with the annotation budget divided equally among them, and training on a high-resource language followed by zero-shot transfer to the remaining languages.
1 code implementation • Findings (EMNLP) 2021 • Longxiang Zhang, Renato Negrinho, Arindam Ghosh, Vasudevan Jagannathan, Hamid Reza Hassanzadeh, Thomas Schaaf, Matthew R. Gormley
We show that fluent and adequate summaries can be generated with limited training data by fine-tuning BART on a specially constructed dataset.
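For context, the mechanics of fine-tuning BART for summarization look roughly as follows (toy single example, padding in labels not masked; the paper's contribution is the specially constructed dataset, not this loop; requires the transformers and torch packages):

```python
import torch
from transformers import BartTokenizer, BartForConditionalGeneration

tok = BartTokenizer.from_pretrained("facebook/bart-base")
model = BartForConditionalGeneration.from_pretrained("facebook/bart-base")
opt = torch.optim.AdamW(model.parameters(), lr=3e-5)

src = ["patient reports mild headache for two days, no fever"]
tgt = ["Chief complaint: headache, 2 days."]

batch = tok(src, return_tensors="pt", padding=True, truncation=True)
labels = tok(tgt, return_tensors="pt", padding=True).input_ids

model.train()
loss = model(**batch, labels=labels).loss  # teacher-forced cross-entropy
loss.backward()
opt.step()
```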
no code implementations • ACL (SIGMORPHON) 2021 • Maria Ryskina, Eduard Hovy, Taylor Berg-Kirkpatrick, Matthew R. Gormley
Traditionally, character-level transduction problems have been solved with finite-state models designed to encode structural and linguistic knowledge of the underlying process, whereas recent approaches rely on the power and flexibility of sequence-to-sequence models with attention.
no code implementations • NAACL 2021 • Chu-Cheng Lin, Aaron Jaech, Xin Li, Matthew R. Gormley, Jason Eisner
Standard autoregressive language models perform only polynomial-time computation to compute the probability of the next symbol.
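A toy illustration of the autoregressive factorization behind this claim: the model computes $p(x) = \prod_t p(x_t \mid x_{<t})$, one fixed-cost conditional per symbol. A bigram table stands in for the neural conditional:

```python
import math

# Chain-rule scoring: each next-symbol conditional is one bounded
# computation, which is the source of the expressivity limits studied.
bigram = {("<s>", "a"): 0.6, ("<s>", "b"): 0.4,
          ("a", "a"): 0.1, ("a", "b"): 0.9,
          ("b", "a"): 0.5, ("b", "b"): 0.5}

def log_prob(seq):
    lp, prev = 0.0, "<s>"
    for sym in seq:
        lp += math.log(bigram[(prev, sym)])  # one conditional per symbol
        prev = sym
    return lp

print(log_prob(["a", "b", "b"]))
```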
1 code implementation • Findings (EMNLP) 2020 • Renato Negrinho, Matthew R. Gormley, Geoffrey J. Gordon
This approach leads to mismatches as, during training, the model is not exposed to its mistakes and does not use beam search.
1 code implementation • ACL 2020 • Maria Ryskina, Matthew R. Gormley, Taylor Berg-Kirkpatrick
Informal romanization is an idiosyncratic process used by humans in informal digital communication to encode non-Latin script languages into Latin character sets found on common keyboards.
1 code implementation • ACL 2019 • Barun Patra, Joel Ruben Antony Moniz, Sarthak Garg, Matthew R. Gormley, Graham Neubig
We then propose Bilingual Lexicon Induction with Semi-Supervision (BLISS), a semi-supervised approach that relaxes the isometric assumption while leveraging both limited aligned bilingual lexicons and a larger set of unaligned word embeddings, as well as a novel hubness filtering technique.
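A sketch of the hubness-filtering idea: induced translation pairs whose target vector is the nearest neighbor of too many source vectors (a "hub") are discarded. The threshold and nearest-neighbor induction here are illustrative assumptions, not BLISS's exact procedure:

```python
import numpy as np

rng = np.random.default_rng(0)
src = rng.normal(size=(200, 50))  # source word embeddings (aligned space)
tgt = rng.normal(size=(300, 50))  # target word embeddings

# Cosine similarity between every source and target vector.
src_n = src / np.linalg.norm(src, axis=1, keepdims=True)
tgt_n = tgt / np.linalg.norm(tgt, axis=1, keepdims=True)
sims = src_n @ tgt_n.T

nn_of_each_src = sims.argmax(axis=1)                  # induced pairs
counts = np.bincount(nn_of_each_src, minlength=len(tgt))
hub_threshold = 5                                     # assumed cutoff
keep = [(s, t) for s, t in enumerate(nn_of_each_src)
        if counts[t] <= hub_threshold]                # drop hub targets
print(len(keep), "pairs kept of", len(src))
```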
no code implementations • NAACL 2019 • Chu-Cheng Lin, Hao Zhu, Matthew R. Gormley, Jason Eisner
We introduce neural finite state transducers (NFSTs), a family of string transduction models defining joint and conditional probability distributions over pairs of strings.
1 code implementation • NeurIPS 2018 • Renato Negrinho, Matthew R. Gormley, Geoffrey J. Gordon
Beam search is widely used for approximate decoding in structured prediction problems.
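For reference, a generic beam search over an arbitrary per-step scoring function (toy distribution, fixed output length):

```python
import math

def beam_search(step_scores, beam_size=2, length=3):
    """Generic beam search: step_scores(prefix) returns a dict mapping
    each symbol to its log-probability of extending the prefix; beams
    keep the top-`beam_size` hypotheses by cumulative log-probability."""
    beams = [([], 0.0)]  # (prefix, cumulative log-prob)
    for _ in range(length):
        expanded = []
        for prefix, score in beams:
            for sym, lp in step_scores(prefix).items():
                expanded.append((prefix + [sym], score + lp))
        beams = sorted(expanded, key=lambda b: b[1], reverse=True)[:beam_size]
    return beams

# Toy conditional distribution that prefers alternating "a"/"b".
def step_scores(prefix):
    if prefix and prefix[-1] == "a":
        return {"a": math.log(0.2), "b": math.log(0.8)}
    return {"a": math.log(0.7), "b": math.log(0.3)}

print(beam_search(step_scores, beam_size=2))
```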
no code implementations • ACL 2018 • Chaitanya Malaviya, Matthew R. Gormley, Graham Neubig
Morphological analysis involves predicting the syntactic traits of a word (e.g., {POS: Noun, Case: Acc, Gender: Fem}).
no code implementations • TACL 2015 • Matthew R. Gormley, Mark Dredze, Jason Eisner
We show how to adjust the model parameters to compensate for the errors introduced by this approximation, by following the gradient of the actual loss on training data.
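A toy sketch of the approximation-aware idea: unroll a few iterations of an approximate inference procedure inside autograd and follow the gradient of the actual loss through them. The update rule and loss below are stand-ins, not the paper's belief-propagation setup:

```python
import torch

theta = torch.randn(4, requires_grad=True)   # model parameters (potentials)
target = torch.tensor([0.1, 0.2, 0.3, 0.4])  # desired marginals

def approximate_inference(theta, iters=3):
    # Truncated, approximate iterative updates (toy stand-in for BP).
    b = torch.full_like(theta, 0.25)          # initial beliefs
    for _ in range(iters):
        b = torch.softmax(theta + torch.log(b + 1e-9), dim=0)
    return b

opt = torch.optim.SGD([theta], lr=0.1)
for step in range(100):
    beliefs = approximate_inference(theta)
    loss = ((beliefs - target) ** 2).sum()    # loss on the approximate output
    opt.zero_grad()
    loss.backward()                           # gradient flows through inference
    opt.step()
print(approximate_inference(theta).detach())
```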
1 code implementation • EMNLP 2015 • Matthew R. Gormley, Mo Yu, Mark Dredze
We propose a Feature-rich Compositional Embedding Model (FCM) for relation extraction that is expressive, generalizes to new domains, and is easy to implement.
Ranked #1 on Relation Extraction on ACE 2005 (Cross Sentence metric)
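A sketch of the FCM scoring idea described above: each word contributes the outer product of a hand-crafted binary feature vector and its word embedding; the sum is scored against label-specific parameters. Dimensions and features here are toy assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
n_words, n_feats, emb_dim, n_labels = 6, 10, 32, 4

feats = rng.integers(0, 2, size=(n_words, n_feats))   # binary features f_i
embs = rng.normal(size=(n_words, emb_dim))            # word embeddings e_i
W = rng.normal(size=(n_labels, n_feats, emb_dim))     # label-specific params

# Sum over words of the outer products f_i (x) e_i.
S = np.einsum("wf,wd->fd", feats, embs)               # (n_feats, emb_dim)
scores = np.einsum("lfd,fd->l", W, S)                 # one score per label
probs = np.exp(scores - scores.max())
probs /= probs.sum()
print(probs)
```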