no code implementations • EMNLP 2021 • Clara Meister, Afra Amini, Tim Vieira, Ryan Cotterell
Beam search is the default decoding strategy for many sequence generation tasks in NLP.
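As a refresher on the procedure in question, the sketch below implements a minimal beam search; `next_log_probs` is a hypothetical stand-in for a model's next-token distribution, and real decoders add length normalization and other details omitted here.

```python
# Minimal beam search sketch (illustrative; not any particular system's decoder).
# next_log_probs(prefix) is a hypothetical function returning a dict
# {token: log p(token | prefix)}; eos is the end-of-sequence token.

def beam_search(next_log_probs, eos, beam_size=5, max_len=20):
    beams = [((), 0.0)]          # each hypothesis: (token tuple, summed log-prob)
    finished = []
    for _ in range(max_len):
        candidates = []
        for prefix, score in beams:
            for tok, lp in next_log_probs(prefix).items():
                candidates.append((prefix + (tok,), score + lp))
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for prefix, score in candidates[:beam_size]:   # keep the top-k extensions
            (finished if prefix[-1] == eos else beams).append((prefix, score))
        if not beams:
            break
    return max(finished + beams, key=lambda c: c[1])   # highest-scoring hypothesis
```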
no code implementations • NAACL 2022 • Zeerak Talat, Hagen Blix, Josef Valvoda, Maya Indira Ganesh, Ryan Cotterell, Adina Williams
Ethics is one of the longest standing intellectual endeavors of humanity.
1 code implementation • NAACL (SIGMORPHON) 2022 • Jordan Kodner, Salam Khalifa, Khuyagbaatar Batsuren, Hossep Dolatian, Ryan Cotterell, Faruk Akkus, Antonios Anastasopoulos, Taras Andrushko, Aryaman Arora, Nona Atanalov, Gábor Bella, Elena Budianskaya, Yustinus Ghanggo Ate, Omer Goldman, David Guriel, Simon Guriel, Silvia Guriel-Agiashvili, Witold Kieraś, Andrew Krizhanovsky, Natalia Krizhanovsky, Igor Marchenko, Magdalena Markowska, Polina Mashkovtseva, Maria Nepomniashchaya, Daria Rodionova, Karina Scheifer, Alexandra Sorova, Anastasia Yemelina, Jeremiah Young, Ekaterina Vylomova
The 2022 SIGMORPHON–UniMorph shared task on large-scale morphological inflection generation covered 33 typologically diverse languages from 11 top-level language families: Arabic (Modern Standard), Assamese, Braj, Chukchi, Eastern Armenian, Evenki, Georgian, Gothic, Gujarati, Hebrew, Hungarian, Itelmen, Karelian, Kazakh, Ket, Khalkha Mongolian, Kholosi, Korean, Lamahalot, Low German, Ludic, Magahi, Middle Low German, Old English, Old High German, Old Norse, Polish, Pomak, Slovak, Turkish, Upper Sorbian, Veps, and Xibe.
no code implementations • ACL 2022 • Clara Meister, Gian Wiher, Tiago Pimentel, Ryan Cotterell
When generating natural language from neural probabilistic models, high probability does not always coincide with high quality.
1 code implementation • EMNLP 2021 • Tiago Pimentel, Clara Meister, Elizabeth Salesky, Simone Teufel, Damián Blasi, Ryan Cotterell
We thus conclude that there is strong evidence of a surprisal–duration trade-off in operation, both across and within the world’s languages.
no code implementations • EMNLP 2020 • Arya D. McCarthy, Adina Williams, Shijia Liu, David Yarowsky, Ryan Cotterell
Of particular interest, languages on the same branch of our phylogenetic tree are notably similar, whereas languages from separate branches are no more similar than chance.
no code implementations • ACL (SIGMORPHON) 2021 • Tiago Pimentel, Maria Ryskina, Sabrina J. Mielke, Shijie Wu, Eleanor Chodroff, Brian Leonard, Garrett Nicolai, Yustinus Ghanggo Ate, Salam Khalifa, Nizar Habash, Charbel El-Khaissi, Omer Goldman, Michael Gasser, William Lane, Matt Coler, Arturo Oncevay, Jaime Rafael Montoya Samame, Gema Celeste Silva Villegas, Adam Ek, Jean-Philippe Bernardy, Andrey Shcherbakov, Aziyana Bayyr-ool, Karina Sheifer, Sofya Ganieva, Matvey Plugaryov, Elena Klyachko, Ali Salehi, Andrew Krizhanovsky, Natalia Krizhanovsky, Clara Vania, Sardana Ivanova, Aelita Salchak, Christopher Straughn, Zoey Liu, Jonathan North Washington, Duygu Ataman, Witold Kieraś, Marcin Woliński, Totok Suhardijanto, Niklas Stoehr, Zahroh Nuriah, Shyam Ratan, Francis M. Tyers, Edoardo M. Ponti, Grant Aiton, Richard J. Hatcher, Emily Prud'hommeaux, Ritesh Kumar, Mans Hulden, Botond Barta, Dorina Lakatos, Gábor Szolnok, Judit Ács, Mohit Raj, David Yarowsky, Ryan Cotterell, Ben Ambridge, Ekaterina Vylomova
This year's iteration of the SIGMORPHON Shared Task on morphological reinflection focuses on typological diversity and cross-lingual variation of morphosyntactic features.
1 code implementation • NAACL (SIGTYP) 2022 • Johann-Mattis List, Ekaterina Vylomova, Robert Forkel, Nathan Hill, Ryan Cotterell
This study describes the structure and the results of the SIGTYP 2022 shared task on the prediction of cognate reflexes from multilingual wordlists.
1 code implementation • EMNLP 2021 • Ran Zmigrod, Tim Vieira, Ryan Cotterell
In this paper, we adapt two spanning tree sampling algorithms to faithfully sample dependency trees from a graph subject to the root constraint.
1 code implementation • 14 Apr 2025 • Afra Amini, Tim Vieira, Ryan Cotterell
In this paper, we introduce a Rao--Blackwellized estimator that is also unbiased and provably has variance less than or equal to that of the standard Monte Carlo estimator.
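The estimator in the paper is specific to its setting, but the underlying Rao-Blackwell mechanism is generic: replacing part of a sampled quantity by its conditional expectation leaves the estimator unbiased while provably not increasing its variance. The toy example below (our own, not the paper's) makes the variance reduction visible.

```python
# Toy Rao-Blackwellization demo (illustrative only, not the paper's estimator).
# Target: E[X^2 + Y] with X, Y ~ N(0, 1) independent; the true value is 1.
# Conditioning on Y and integrating X out gives E[X^2 + Y | Y] = 1 + Y,
# an unbiased estimator with variance 1 instead of 3.
import random
import statistics

def plain_mc(n):
    return [random.gauss(0, 1) ** 2 + random.gauss(0, 1) for _ in range(n)]

def rao_blackwellized(n):
    return [1.0 + random.gauss(0, 1) for _ in range(n)]

n = 100_000
mc, rb = plain_mc(n), rao_blackwellized(n)
print("plain MC          mean, variance:", statistics.fmean(mc), statistics.pvariance(mc))
print("Rao-Blackwellized mean, variance:", statistics.fmean(rb), statistics.pvariance(rb))
```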
1 code implementation • 10 Apr 2025 • Alex Warstadt, Aaron Mueller, Leshem Choshen, Ethan Wilcox, Chengxu Zhuang, Juan Ciro, Rafael Mosquera, Bhargavi Paranjape, Adina Williams, Tal Linzen, Ryan Cotterell
These intensive resource demands limit the ability of researchers to train new models and use existing models as developmentally plausible cognitive models.
no code implementations • 18 Mar 2025 • Selim Jerad, Anej Svete, Jiaoda Li, Ryan Cotterell
We show that this no longer holds with only leftmost-hard attention -- in that case, they correspond to a \emph{strictly weaker} fragment of LTL.
no code implementations • 15 Feb 2025 • Lucas Charpentier, Leshem Choshen, Ryan Cotterell, Mustafa Omer Gul, Michael Hu, Jaap Jumelet, Tal Linzen, Jing Liu, Aaron Mueller, Candace Ross, Raj Sanjay Shah, Alex Warstadt, Ethan Wilcox, Adina Williams
We also call for papers outside the competition in any relevant areas.
no code implementations • 6 Dec 2024 • Michael Y. Hu, Aaron Mueller, Candace Ross, Adina Williams, Tal Linzen, Chengxu Zhuang, Ryan Cotterell, Leshem Choshen, Alex Warstadt, Ethan Gotlieb Wilcox
No submissions outperformed the baselines in the multimodal track.
no code implementations • 4 Dec 2024 • Tim Vieira, Ben LeBrun, Mario Giulianelli, Juan Luis Gastaldi, Brian DuSell, John Terilla, Timothy J. O'Donnell, Ryan Cotterell
Modern language models are internally -- and mathematically -- distributions over token strings rather than \emph{character} strings, posing numerous challenges for programmers building user applications on top of them.
1 code implementation • 12 Nov 2024 • Tianyu Liu, Jirui Qi, Paul He, Arianna Bisazza, Mrinmaya Sachan, Ryan Cotterell
Based on these findings, we propose two methods that use question likelihood as a gauge for selecting and constructing prompts that lead to better performance.
1 code implementation • 11 Nov 2024 • Julian Minder, Kevin Du, Niklas Stoehr, Giovanni Monea, Chris Wendler, Robert West, Ryan Cotterell
In this paper, we search for a knob which controls this sensitivity, determining whether language models answer from the context or their prior knowledge.
2 code implementations • 11 Nov 2024 • Alexandra Butoi, Ghazal Khalighinejad, Anej Svete, Josef Valvoda, Ryan Cotterell, Brian DuSell
We provide results on a variety of languages across the Chomsky hierarchy for three neural architectures: a simple RNN, an LSTM, and a causally-masked transformer.
1 code implementation • 11 Nov 2024 • Shauli Ravfogel, Anej Svete, Vésteinn Snæbjarnarson, Ryan Cotterell
Based on this observation, we propose a framework for generating true string counterfactuals by reformulating language models as structural equation models via the Gumbel-max trick; we call this framework Gumbel counterfactual generation.
no code implementations • 9 Nov 2024 • Clemente Pasti, Talu Karagöz, Anej Svete, Franz Nowak, Reda Boumasmoud, Ryan Cotterell
Extracting finite state automata (FSAs) from black-box models offers a powerful approach to gaining interpretable insights into complex model behaviors.
no code implementations • 21 Oct 2024 • Eleftheria Tsipidi, Franz Nowak, Ryan Cotterell, Ethan Wilcox, Mario Giulianelli, Alex Warstadt
While these fluctuations can be viewed as theoretically uninteresting noise on top of a uniform target, another explanation is that UID is not the only functional pressure regulating information content in a language.
no code implementations • 18 Oct 2024 • Tianyu Liu, Kevin Du, Mrinmaya Sachan, Ryan Cotterell
However, exactly computing susceptibility is difficult and, thus, Du et al. (2024) falls back on a Monte Carlo approximation.
1 code implementation • 16 Oct 2024 • Samuel Kiegeland, Ethan Gotlieb Wilcox, Afra Amini, David Robert Reich, Ryan Cotterell
Numerous previous studies have sought to determine to what extent language models, pretrained on natural language text, can serve as useful models of human cognition.
no code implementations • 7 Oct 2024 • Niklas Stoehr, Kevin Du, Vésteinn Snæbjarnarson, Robert West, Ryan Cotterell, Aaron Schein
Given the prompt "Rome is in", can we steer a language model to flip its prediction of an incorrect token "France" to a correct token "Italy" by only multiplying a few relevant activation vectors with scalars?
no code implementations • 3 Oct 2024 • Anej Svete, Nadav Borenstein, Mike Zhou, Isabelle Augenstein, Ryan Cotterell
Much theoretical work has described the ability of transformers to represent formal languages.
1 code implementation • 3 Oct 2024 • Mario Giulianelli, Luca Malagutti, Juan Luis Gastaldi, Brian DuSell, Tim Vieira, Ryan Cotterell
The paper argues that token-level language models should be (approximately) marginalized into character-level language models before they are used in psycholinguistic studies to compute the surprisal of a region of interest. The marginalized character-level language model can then be used to compute the surprisal of an arbitrary character substring, which we term a focal area, that the experimenter may wish to use as a predictor.
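Concretely, once a (marginalized) character-level model is available, the surprisal of a focal area is just a difference of two prefix log-probabilities. With $\pi(\cdot)$ denoting the character-level prefix probability (notation ours, not the paper's), the surprisal of the characters $c_{i..j}$ given everything preceding them is

$s(c_{i..j}) = -\log \pi(c_{\leq j}) + \log \pi(c_{<i}),$

i.e., the negative log-probability of the focal area conditioned on its left context.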
1 code implementation • 16 Sep 2024 • Mario Giulianelli, Andreas Opedal, Ryan Cotterell
We introduce a generalization of classic information-theoretic measures of predictive uncertainty in online language processing, based on the simulation of expected continuations of incremental linguistic contexts.
1 code implementation • 12 Sep 2024 • Andreas Opedal, Eleanor Chodroff, Ryan Cotterell, Ethan Gotlieb Wilcox
Another one is the pointwise mutual information (PMI) between a unit and its context, which turns out to yield the same predictive power as surprisal when controlling for unigram frequency.
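The equivalence is visible from the definition: PMI differs from (negative) surprisal only by the unigram log-probability (notation ours),

$\mathrm{PMI}(w; c) = \log \frac{p(w \mid c)}{p(w)} = -\log p(w) - \bigl(-\log p(w \mid c)\bigr),$

so once the unigram term is held fixed, the two predictors are linearly related and carry the same information.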
1 code implementation • 27 Jul 2024 • Ionut Constantinescu, Tiago Pimentel, Ryan Cotterell, Alex Warstadt
We vary the age of exposure by training LMs on language pairs in various experimental conditions, and find that LMs, which lack any direct analog to innate maturational stages, do not show CP effects when the age of exposure of L2 is delayed.
no code implementations • 16 Jul 2024 • Juan Luis Gastaldi, John Terilla, Luca Malagutti, Brian DuSell, Tim Vieira, Ryan Cotterell
The present paper contributes to addressing this theoretical gap by proposing a unified formal framework for representing and analyzing tokenizer models.
no code implementations • 8 Jul 2024 • Afra Amini, Tim Vieira, Ryan Cotterell
To the extent this fine-tuning is successful and we end up with a good approximation, we have reduced the inference cost by a factor of N. Our experiments on a controlled generation task suggest that, while variational BoN (vBoN) is not as effective as BoN at aligning language models, it comes close: vBoN appears on the Pareto frontier of reward and KL divergence more often than models trained with a KL-constrained RL objective.
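For orientation, plain Best-of-N sampling, the baseline that the variational method approximates, is sketched below; `generate` and `reward` are hypothetical placeholders for a language-model sampler and a reward model.

```python
# Plain Best-of-N (BoN) sampling: draw N completions, keep the highest-reward one.
# generate(prompt) and reward(prompt, completion) are hypothetical placeholders,
# not a specific library API.

def best_of_n(prompt, generate, reward, n=16):
    completions = [generate(prompt) for _ in range(n)]        # N forward passes
    return max(completions, key=lambda y: reward(prompt, y))  # return the best one
```

The variational scheme instead fine-tunes the model so that a single sample approximates the BoN output distribution, which is where the factor-of-N saving mentioned above comes from.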
no code implementations • 20 Jun 2024 • Franz Nowak, Anej Svete, Alexandra Butoi, Ryan Cotterell
We present several results on the representational capacity of recurrent and transformer LMs with CoT reasoning, showing that they can represent the same family of distributions over strings as probabilistic Turing machines.
no code implementations • 14 Jun 2024 • Naaman Tan, Josef Valvoda, Tianyu Liu, Anej Svete, Yanxia Qin, Kan Min-Yen, Ryan Cotterell
The relationship between the quality of a string, as judged by a human reader, and its probability $p(\boldsymbol{y})$ under a language model undergirds the development of better language models.
no code implementations • 7 Jun 2024 • Amanda Doucette, Ryan Cotterell, Morgan Sonderegger, Timothy J. O'Donnell
It has been claimed that within a language, morphologically irregular words are more likely to be phonotactically simple and morphologically regular words are more likely to be phonotactically complex.
1 code implementation • 6 Jun 2024 • Jiaoda Li, Yifan Hou, Mrinmaya Sachan, Ryan Cotterell
Large language models (LLMs) exhibit an intriguing ability to learn a novel task from in-context examples presented in a demonstration, termed in-context learning (ICL).
no code implementations • 6 Jun 2024 • Nadav Borenstein, Anej Svete, Robin Chan, Josef Valvoda, Franz Nowak, Isabelle Augenstein, Eleanor Chodroff, Ryan Cotterell
We find that the RLM rank, which corresponds to the size of linear space spanned by the logits of its conditional distributions, and the expected length of sampled strings are strong and significant predictors of learnability for both RNNs and Transformers.
no code implementations • 4 Jun 2024 • Robin SM Chan, Reda Boumasmoud, Anej Svete, Yuxin Ren, Qipeng Guo, Zhijing Jin, Shauli Ravfogel, Mrinmaya Sachan, Bernhard Schölkopf, Mennatallah El-Assady, Ryan Cotterell
In this spirit, we study the properties of \emph{affine} alignment of language encoders and its implications on extrinsic similarity.
1 code implementation • 29 May 2024 • Anej Svete, Franz Nowak, Anisha Mohamed Sahabdeen, Ryan Cotterell
The recent successes and spread of large neural language models (LMs) call for a thorough understanding of their computational ability.
no code implementations • EMNLP 2015 • Thomas Muller, Ryan Cotterell, Alexander Fraser, Hinrich Schütze
We present LEMMING, a modular log-linear model that jointly models lemmatization and tagging and supports the integration of arbitrary global features.
1 code implementation • 7 May 2024 • Jiaoda Li, Jennifer C. White, Mrinmaya Sachan, Ryan Cotterell
Natural languages are believed to be (mildly) context-sensitive.
no code implementations • 23 Apr 2024 • Anej Svete, Ryan Cotterell
This provides a first step towards understanding the mechanisms that transformer LMs can use to represent probability distributions over strings.
no code implementations • IJCNLP 2017 • Ryan Cotterell, Kevin Duh
Low-resource named entity recognition is still an open problem in NLP.
no code implementations • CONLL 2015 • Ryan Cotterell, Thomas Müller, Alexander Fraser, Hinrich Schütze
We present labeled morphological segmentation, an alternative view of morphological processing that unifies several tasks.
1 code implementation • 9 Apr 2024 • Leshem Choshen, Ryan Cotterell, Michael Y. Hu, Tal Linzen, Aaron Mueller, Candace Ross, Alex Warstadt, Ethan Wilcox, Adina Williams, Chengxu Zhuang
The big changes for this year's competition are as follows: First, we replace the loose track with a paper track, which allows (for example) non-model-based submissions, novel cognitively-inspired benchmarks, or analysis techniques.
no code implementations • 6 Apr 2024 • Kevin Du, Vésteinn Snæbjarnarson, Niklas Stoehr, Jennifer C. White, Aaron Schein, Ryan Cotterell
To answer a question, language models often need to integrate prior knowledge learned during pretraining and new information presented in context.
1 code implementation • 25 Mar 2024 • Josef Valvoda, Ryan Cotterell
Current legal outcome prediction models - a staple of legal NLP - do not explain their reasoning.
no code implementations • 25 Mar 2024 • Luca Malagutti, Andrius Buinovskij, Anej Svete, Clara Meister, Afra Amini, Ryan Cotterell
For nearly three decades, language models derived from the $n$-gram assumption held the state of the art on the task.
no code implementations • 28 Feb 2024 • Laura Manduchi, Kushagra Pandey, Robert Bamler, Ryan Cotterell, Sina Däubener, Sophie Fellenz, Asja Fischer, Thomas Gärtner, Matthias Kirchler, Marius Kloft, Yingzhen Li, Christoph Lippert, Gerard de Melo, Eric Nalisnick, Björn Ommer, Rajesh Ranganath, Maja Rudolph, Karen Ullrich, Guy Van Den Broeck, Julia E Vogt, Yixin Wang, Florian Wenzel, Frank Wood, Stephan Mandt, Vincent Fortuin
The field of deep generative modeling has grown rapidly and consistently over the years.
no code implementations • 24 Feb 2024 • Anej Svete, Robin Shing Moon Chan, Ryan Cotterell
However, a closer inspection of Hewitt et al.'s (2020) construction shows that it is not inherently limited to hierarchical structures.
1 code implementation • 17 Feb 2024 • Matan Avitan, Ryan Cotterell, Yoav Goldberg, Shauli Ravfogel
Interventions targeting the representation space of language models (LMs) have emerged as an effective means to influence model behavior.
2 code implementations • 16 Feb 2024 • Afra Amini, Tim Vieira, Ryan Cotterell
DPO, as originally formulated, relies on binary preference data and fine-tunes a language model to increase the likelihood of a preferred response over a dispreferred response.
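For reference, the binary-preference objective referred to here is the standard DPO loss of Rafailov et al. (2023), with policy $\pi_\theta$, reference model $\pi_{\mathrm{ref}}$, preferred response $y_w$, and dispreferred response $y_l$:

$\mathcal{L}_{\mathrm{DPO}}(\theta) = -\,\mathbb{E}_{(x,\, y_w,\, y_l)}\left[\log \sigma\!\left(\beta \log \frac{\pi_\theta(y_w \mid x)}{\pi_{\mathrm{ref}}(y_w \mid x)} - \beta \log \frac{\pi_\theta(y_l \mid x)}{\pi_{\mathrm{ref}}(y_l \mid x)}\right)\right].$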
1 code implementation • 15 Feb 2024 • Shashwat Singh, Shauli Ravfogel, Jonathan Herzig, Roee Aharoni, Ryan Cotterell, Ponnurangam Kumaraguru
In the case of neural language models, an encoding of the undesirable behavior is often present in the model's representations.
1 code implementation • 31 Jan 2024 • Andreas Opedal, Alessandro Stolfo, Haruki Shirakami, Ying Jiao, Ryan Cotterell, Bernhard Schölkopf, Abulhair Saparov, Mrinmaya Sachan
We construct tests for each one in order to understand whether current LLMs display the same cognitive biases as children in these steps.
no code implementations • 29 Dec 2023 • Li Du, Afra Amini, Lucas Torroba Hennigen, Xinyan Velocity Yu, Jason Eisner, Holden Lee, Ryan Cotterell
Recent papers have demonstrated the possibility of energy-based text generation by adapting gradient-based sampling algorithms, a paradigm of MCMC algorithms that promises fast convergence.
1 code implementation • 6 Dec 2023 • Tiago Pimentel, Clara Meister, Ethan Gotlieb Wilcox, Kyle Mahowald, Ryan Cotterell
Under this method, we find that a language's word lengths should instead be proportional to the surprisal's expectation plus its variance-to-mean ratio.
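Written out, with $\ell(w)$ the length of word $w$ and $s(w)$ its surprisal in context (notation ours), the proposed relationship is

$\ell(w) \propto \mathbb{E}[s(w)] + \frac{\mathrm{Var}[s(w)]}{\mathbb{E}[s(w)]},$

in contrast to a proposal based on the expectation alone.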
no code implementations • 1 Dec 2023 • Josef Valvoda, Alec Thompson, Ryan Cotterell, Simone Teufel
The introduction of large public legal datasets has brought about a renaissance in legal NLP.
1 code implementation • 30 Nov 2023 • Karolina Stańczak, Kevin Du, Adina Williams, Isabelle Augenstein, Ryan Cotterell
However, when we control for the meaning of the noun, the relationship between grammatical gender and adjective choice is near zero and insignificant.
1 code implementation • 28 Nov 2023 • Lukas Wolf, Tiago Pimentel, Evelina Fedorenko, Ryan Cotterell, Alex Warstadt, Ethan Wilcox, Tamar Regev
Using a large spoken corpus of English audiobooks, we extract prosodic features aligned to individual words and test how well they can be predicted from LLM embeddings, compared to non-contextual word embeddings.
no code implementations • 27 Nov 2023 • Andreas Opedal, Eleftheria Tsipidi, Tiago Pimentel, Ryan Cotterell, Tim Vieira
The left-corner transformation (Rosenkrantz and Lewis, 1970) is used to remove left recursion from context-free grammars, which is an important step towards making the grammar parsable top-down with simple techniques.
no code implementations • 7 Nov 2023 • Ryan Cotterell, Anej Svete, Clara Meister, Tianyu Liu, Li Du
Large language models have become one of the most commonly deployed NLP inventions.
no code implementations • 23 Oct 2023 • Alexandra Butoi, Tim Vieira, Ryan Cotterell, David Chiang
From these, we also immediately obtain stringsum and allsum algorithms for TAG, LIG, PAA, and EPDA.
1 code implementation • 19 Oct 2023 • Franz Nowak, Anej Svete, Li Du, Ryan Cotterell
We extend the Turing completeness result to the probabilistic case, showing how a rationally weighted RLM with unbounded computation time can simulate any deterministic probabilistic Turing machine (PTM) with rationally weighted transitions.
1 code implementation • 8 Oct 2023 • Anej Svete, Ryan Cotterell
These results present a first step towards characterizing the classes of distributions RNN LMs can represent and thus help us understand their capabilities and limitations.
no code implementations • 27 Aug 2023 • Ivan Baburin, Ryan Cotterell
In this paper we establish an abstraction of on-the-fly determinization of finite-state automata using transition monoids and demonstrate how it can be applied to bound the asymptotics.
1 code implementation • 27 Jul 2023 • Clément Guerner, Tianyu Liu, Anej Svete, Alexander Warstadt, Ryan Cotterell
The linear subspace hypothesis (Bolukbasi et al., 2016) states that, in a language model's representation space, all information about a concept such as verbal number is encoded in a linear subspace.
no code implementations • 7 Jul 2023 • Ethan Gotlieb Wilcox, Tiago Pimentel, Clara Meister, Ryan Cotterell, Roger P. Levy
We address this gap in the current literature by investigating the relationship between surprisal and reading times in eleven different languages, distributed across five language families.
1 code implementation • 7 Jul 2023 • Clara Meister, Tiago Pimentel, Luca Malagutti, Ethan G. Wilcox, Ryan Cotterell
While this trade-off is not reflected in standard metrics of distribution quality (such as perplexity), we find that several precision-emphasizing measures indeed indicate that sampling adapters can lead to probability distributions more aligned with the true distribution.
1 code implementation • 6 Jul 2023 • Andreas Opedal, Ran Zmigrod, Tim Vieira, Ryan Cotterell, Jason Eisner
This paper provides a reference description, in the form of a deduction system, of Earley's (1970) context-free parsing algorithm with various speed-ups.
1 code implementation • 6 Jul 2023 • Kevin Du, Lucas Torroba Hennigen, Niklas Stoehr, Alexander Warstadt, Ryan Cotterell
Many popular feature-attribution methods for interpreting deep neural networks rely on computing the gradients of a model's output with respect to its inputs.
1 code implementation • 29 Jun 2023 • Vilém Zouhar, Clara Meister, Juan Luis Gastaldi, Li Du, Tim Vieira, Mrinmaya Sachan, Ryan Cotterell
Via submodular functions, we prove that the iterative greedy version is a $\frac{1}{{\sigma(\boldsymbol{\mu}^\star)}}(1-e^{-{\sigma(\boldsymbol{\mu}^\star)}})$-approximation of an optimal merge sequence, where ${\sigma(\boldsymbol{\mu}^\star)}$ is the total backward curvature with respect to the optimal merge sequence $\boldsymbol{\mu}^\star$.
1 code implementation • 29 Jun 2023 • Vilém Zouhar, Clara Meister, Juan Luis Gastaldi, Li Du, Mrinmaya Sachan, Ryan Cotterell
Subword tokenization is a key part of many NLP pipelines.
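For orientation, the procedure under study is the greedy merge loop of byte-pair encoding (BPE); the sketch below is the textbook training loop, not the paper's formalization, and it ignores the engineering details of real tokenizers.

```python
# Textbook BPE training loop: repeatedly merge the most frequent adjacent
# symbol pair. Illustrative sketch only.
from collections import Counter

def train_bpe(words, num_merges):
    corpus = Counter(tuple(w) for w in words)   # each word as a tuple of symbols
    merges = []
    for _ in range(num_merges):
        pairs = Counter()
        for word, freq in corpus.items():
            for a, b in zip(word, word[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        best = max(pairs, key=pairs.get)        # the greedy merge choice
        merges.append(best)
        new_corpus = Counter()
        for word, freq in corpus.items():
            out, i = [], 0
            while i < len(word):
                if i + 1 < len(word) and (word[i], word[i + 1]) == best:
                    out.append(word[i] + word[i + 1])
                    i += 2
                else:
                    out.append(word[i])
                    i += 1
            new_corpus[tuple(out)] += freq
        corpus = new_corpus
    return merges

print(train_bpe(["low", "lower", "lowest", "newer", "wider"], num_merges=5))
```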
1 code implementation • 8 Jun 2023 • Afra Amini, Tianyu Liu, Ryan Cotterell
We introduce a novel dependency parser, the hexatagger, that constructs dependency trees by tagging the words in a sentence with elements from a finite set of possible tags.
1 code implementation • NeurIPS 2023 • Nora Belrose, David Schneider-Joseph, Shauli Ravfogel, Ryan Cotterell, Edward Raff, Stella Biderman
Concept erasure aims to remove specified features from an embedding.
1 code implementation • 6 Jun 2023 • Thomas Hikaru Clark, Clara Meister, Tiago Pimentel, Michael Hahn, Ryan Cotterell, Richard Futrell, Roger Levy
Here, we ask whether a pressure for UID may have influenced word order patterns cross-linguistically.
no code implementations • 6 Jun 2023 • Alexandra Butoi, Ryan Cotterell, David Chiang
Furthermore, using an even stricter notion of equivalence called d-strong equivalence, we make precise the intuition that a CFG controlling a CFG is a TAG, a PDA controlling a PDA is an embedded PDA, and a PDA controlling a CFG is a LIG.
1 code implementation • NeurIPS 2023 • Afra Amini, Li Du, Ryan Cotterell
In this paper, we take an important step toward building a principled approach for sampling from language models with gradient-based methods.
no code implementations • 24 May 2023 • Tianyu Liu, Afra Amini, Mrinmaya Sachan, Ryan Cotterell
We show that these exhaustive comparisons can be avoided, and, moreover, the complexity of such tasks can be reduced to linear by casting the relation between tokens as a partial order over the string.
1 code implementation • 23 May 2023 • Yuxin Ren, Qipeng Guo, Zhijing Jin, Shauli Ravfogel, Mrinmaya Sachan, Bernhard Schölkopf, Ryan Cotterell
Transformer models bring propelling advances in various NLP tasks, thus inducing lots of interpretability research on the learned representations of the models.
2 code implementations • 22 May 2023 • Wangchunshu Zhou, Yuchen Eleanor Jiang, Peng Cui, Tiannan Wang, Zhenxin Xiao, Yifan Hou, Ryan Cotterell, Mrinmaya Sachan
In addition to producing AI-generated content (AIGC), we also demonstrate the possibility of using RecurrentGPT as an interactive fiction that directly interacts with consumers.
no code implementations • 18 May 2023 • Wangchunshu Zhou, Yuchen Eleanor Jiang, Ryan Cotterell, Mrinmaya Sachan
To achieve this, we train a meta controller that predicts the number of in-context examples suitable for the generalist model to make a good prediction based on the performance-efficiency trade-off for a specific input.
1 code implementation • 18 May 2023 • Yuchen Eleanor Jiang, Tianyu Liu, Shuming Ma, Dongdong Zhang, Mrinmaya Sachan, Ryan Cotterell
Several recent papers claim human parity at sentence-level Machine Translation (MT), especially in high-resource languages.
no code implementations • 27 Apr 2023 • Wangchunshu Zhou, Yuchen Eleanor Jiang, Ethan Wilcox, Ryan Cotterell, Mrinmaya Sachan
Large language models generate fluent texts and can follow natural language instructions to solve a wide range of tasks without task-specific training.
1 code implementation • ICCV 2023 • Idan Schwartz, Vésteinn Snæbjarnarson, Hila Chefer, Ryan Cotterell, Serge Belongie, Lior Wolf, Sagie Benaim
This approach has two disadvantages: (i) supervised datasets are generally small compared to large-scale scraped text-image datasets on which text-to-image models are trained, affecting the quality and diversity of the generated images, or (ii) the input is a hard-coded label, as opposed to free-form text, limiting the control over the generated images.
1 code implementation • 17 Jan 2023 • Anej Svete, Benjamin Dayan, Tim Vieira, Ryan Cotterell, Jason Eisner
The pathsum in ordinary acyclic WFSAs is efficiently computed by the backward algorithm in time $O(|E|)$, where $E$ is the set of transitions.
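For reference, the baseline mentioned here is the standard backward algorithm for acyclic WFSAs, which touches every transition exactly once; below is a minimal sketch in the real (sum-product) semiring with hypothetical input structures.

```python
# Backward algorithm for an acyclic WFSA in the real semiring (illustrative).
# states: states in topological order; arcs: (src, dst, weight) triples;
# initial / final: dicts of initial and final weights (absent entries are 0).

def pathsum_acyclic(states, arcs, initial, final):
    out = {q: [] for q in states}
    for src, dst, w in arcs:
        out[src].append((dst, w))
    beta = {}
    for q in reversed(states):                   # reverse topological order
        beta[q] = final.get(q, 0.0)
        for dst, w in out[q]:
            beta[q] += w * beta[dst]             # each arc visited once: O(|E|)
    return sum(initial.get(q, 0.0) * beta[q] for q in states)
```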
no code implementations • 20 Dec 2022 • Li Du, Lucas Torroba Hennigen, Tiago Pimentel, Clara Meister, Jason Eisner, Ryan Cotterell
Language modeling, a central task in natural language processing, involves estimating a probability distribution over strings.
1 code implementation • 8 Dec 2022 • Niklas Stoehr, Benjamin J. Radford, Ryan Cotterell, Aaron Schein
For discrete data, SSMs commonly do so through a state-to-action emission matrix and a state-to-state transition matrix.
1 code implementation • 25 Nov 2022 • Tiago Pimentel, Clara Meister, Ethan G. Wilcox, Roger Levy, Ryan Cotterell
We assess the effect of anticipation on reading by comparing how well surprisal and contextual entropy predict reading times on four naturalistic reading datasets: two self-paced and two eye-tracking.
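For reference, the two predictors compared here are, for a word $w_t$ with preceding context $\boldsymbol{w}_{<t}$ (notation ours),

$s(w_t) = -\log p(w_t \mid \boldsymbol{w}_{<t}), \qquad H_t = -\sum_{w \in \mathcal{V}} p(w \mid \boldsymbol{w}_{<t}) \log p(w \mid \boldsymbol{w}_{<t}),$

where surprisal measures the information carried by the word that actually occurred, and contextual entropy measures the reader's uncertainty about it beforehand.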
1 code implementation • 23 Nov 2022 • Jennifer C. White, Ryan Cotterell
Recent work has shown that despite their impressive capabilities, text-to-image diffusion models such as DALL-E 2 (Ramesh et al., 2022) can display strange behaviours when a prompt contains a word with multiple possible meanings, often generating images containing both senses of the word (Rassin et al., 2022).
1 code implementation • 14 Nov 2022 • Afra Amini, Ryan Cotterell
There have been many proposals to reduce constituency parsing to tagging in the literature.
no code implementations • 11 Nov 2022 • Tiago Pimentel, Josef Valvoda, Niklas Stoehr, Ryan Cotterell
This shift in perspective leads us to propose a new principle for probing, the architectural bottleneck principle: In order to estimate how much information a given component could extract, a probe should look exactly like the component.
1 code implementation • 26 Oct 2022 • Tianyu Liu, Yuchen Jiang, Nicholas Monath, Ryan Cotterell, Mrinmaya Sachan
Recent years have seen a paradigm shift in NLP towards using pretrained language models ({PLM}) for a wide range of tasks.
Ranked #1 on Relation Extraction on CoNLL04 (NER Micro F1 metric)
1 code implementation • 26 Oct 2022 • Yuchen Eleanor Jiang, Tianyu Liu, Shuming Ma, Dongdong Zhang, Mrinmaya Sachan, Ryan Cotterell
The BWB corpus consists of Chinese novels translated by experts into English, and the annotated test set is designed to probe the ability of machine translation systems to model various discourse phenomena.
no code implementations • 26 Oct 2022 • Yuchen Eleanor Jiang, Ryan Cotterell, Mrinmaya Sachan
Our analysis further shows that contextualized embeddings contain much of the coherence information, which helps explain why CT can only provide little gains to modern neural coreference resolvers which make use of pretrained representations.
3 code implementations • 24 Oct 2022 • Liam van der Poel, Ryan Cotterell, Clara Meister
Despite significant progress in the quality of language generated from abstractive summarization models, these models still exhibit the tendency to hallucinate, i.e., output content not supported by the source document.
no code implementations • 18 Oct 2022 • Shauli Ravfogel, Yoav Goldberg, Ryan Cotterell
Methods for erasing human-interpretable concepts from neural representations that assume linearity have been found to be tractable and useful.
1 code implementation • 13 Oct 2022 • Alexandra Butoi, Brian DuSell, Tim Vieira, Ryan Cotterell, David Chiang
Weighted pushdown automata (WPDAs) are at the core of many natural language processing tasks, like syntax-based statistical machine translation and transition-based dependency parsing.
1 code implementation • 8 Oct 2022 • Niklas Stoehr, Lucas Torroba Hennigen, Josef Valvoda, Robert West, Ryan Cotterell, Aaron Schein
It is based only on the action category ("what") and disregards the subject ("who") and object ("to whom") of an event, as well as contextual information, like associated casualty count, that should contribute to the perception of an event's "intensity".
no code implementations • 6 Oct 2022 • Dieuwke Hupkes, Mario Giulianelli, Verna Dankers, Mikel Artetxe, Yanai Elazar, Tiago Pimentel, Christos Christodoulopoulos, Karim Lasri, Naomi Saphra, Arabella Sinclair, Dennis Ulmer, Florian Schottmann, Khuyagbaatar Batsuren, Kaiser Sun, Koustuv Sinha, Leila Khalatbari, Maria Ryskina, Rita Frieske, Ryan Cotterell, Zhijing Jin
We present a taxonomy for characterising and understanding generalisation research in NLP.
1 code implementation • COLING 2022 • Jennifer C. White, Ryan Cotterell
The ability to generalize compositionally is key to understanding the potentially infinite number of sentences that can be constructed in a human language from only a finite number of words.
1 code implementation • 14 Sep 2022 • Clemente Pasti, Andreas Opedal, Tiago Pimentel, Tim Vieira, Jason Eisner, Ryan Cotterell
It shows, by a simple construction, that the intersection of a context-free language and a regular language is itself context-free.
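For orientation, the simple construction in question is the classic Bar-Hillel product of a grammar and an automaton: each nonterminal is decorated with a pair of automaton states, so that a binary rule $A \to B\,C$ and states $q, s, r$ yield

$\langle q, A, r\rangle \to \langle q, B, s\rangle \, \langle s, C, r\rangle,$

and a terminal rule $A \to a$ yields $\langle q, A, r\rangle \to a$ whenever the automaton can move from $q$ to $r$ while reading $a$. This is the textbook version; the paper itself works in a more general setting.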
no code implementations • 17 Aug 2022 • Rita Sevastjanova, Eren Cakmak, Shauli Ravfogel, Ryan Cotterell, Mennatallah El-Assady
The simplicity of adapter training and composition comes along with new challenges, such as maintaining an overview of adapter properties and effectively comparing their produced embedding spaces.
1 code implementation • COLING 2022 • Josef Valvoda, Naomi Saphra, Jonathan Rawski, Adina Williams, Ryan Cotterell
Recombining known primitive concepts into larger novel combinations is a quintessentially human cognitive capability.
1 code implementation • 17 Aug 2022 • Josef Valvoda, Ryan Cotterell, Simone Teufel
In contrast, we turn our focus to negative outcomes here, and introduce a new task of negative outcome prediction.
1 code implementation • NAACL 2022 • Jiaoda Li, Ryan Cotterell, Mrinmaya Sachan
We then examine the usefulness of a specific linguistic property for pre-training by removing the heads that are essential to that property and evaluating the resulting model's performance on language modeling.
1 code implementation • NAACL (SIGMORPHON) 2022 • Khuyagbaatar Batsuren, Gábor Bella, Aryaman Arora, Viktor Martinović, Kyle Gorman, Zdeněk Žabokrtský, Amarsanaa Ganbold, Šárka Dohnalová, Magda Ševčíková, Kateřina Pelegrinová, Fausto Giunchiglia, Ryan Cotterell, Ekaterina Vylomova
The SIGMORPHON 2022 shared task on morpheme segmentation challenged systems to decompose a word into a sequence of morphemes and covered most types of morphology: compounds, derivations, and inflections.
Ranked #8 on Morpheme Segmentation on UniMorph 4.0
1 code implementation • 31 May 2022 • Tiago Pimentel, Clara Meister, Ryan Cotterell
As we show, however, this is not a tight approximation -- in either theory or practice.
1 code implementation • 14 May 2022 • Afra Amini, Tiago Pimentel, Clara Meister, Ryan Cotterell
Probing has become a go-to methodology for interpreting and analyzing deep neural models in natural language processing.
1 code implementation • NAACL 2022 • Tianyu Liu, Yuchen Eleanor Jiang, Ryan Cotterell, Mrinmaya Sachan
Many natural language processing tasks, e.g., coreference resolution and semantic role labeling, require selecting text spans and making decisions about them.
no code implementations • LREC 2022 • Khuyagbaatar Batsuren, Omer Goldman, Salam Khalifa, Nizar Habash, Witold Kieraś, Gábor Bella, Brian Leonard, Garrett Nicolai, Kyle Gorman, Yustinus Ghanggo Ate, Maria Ryskina, Sabrina J. Mielke, Elena Budianskaya, Charbel El-Khaissi, Tiago Pimentel, Michael Gasser, William Lane, Mohit Raj, Matt Coler, Jaime Rafael Montoya Samame, Delio Siticonatzi Camaiteri, Benoît Sagot, Esaú Zumaeta Rojas, Didier López Francis, Arturo Oncevay, Juan López Bautista, Gema Celeste Silva Villegas, Lucas Torroba Hennigen, Adam Ek, David Guriel, Peter Dirix, Jean-Philippe Bernardy, Andrey Scherbakov, Aziyana Bayyr-ool, Antonios Anastasopoulos, Roberto Zariquiey, Karina Sheifer, Sofya Ganieva, Hilaria Cruz, Ritván Karahóǧa, Stella Markantonatou, George Pavlidis, Matvey Plugaryov, Elena Klyachko, Ali Salehi, Candy Angulo, Jatayu Baxi, Andrew Krizhanovsky, Natalia Krizhanovskaya, Elizabeth Salesky, Clara Vania, Sardana Ivanova, Jennifer White, Rowan Hall Maudslay, Josef Valvoda, Ran Zmigrod, Paula Czarnowska, Irene Nikkarinen, Aelita Salchak, Brijesh Bhatt, Christopher Straughn, Zoey Liu, Jonathan North Washington, Yuval Pinter, Duygu Ataman, Marcin Wolinski, Totok Suhardijanto, Anna Yablonskaya, Niklas Stoehr, Hossep Dolatian, Zahroh Nuriah, Shyam Ratan, Francis M. Tyers, Edoardo M. Ponti, Grant Aiton, Aryaman Arora, Richard J. Hatcher, Ritesh Kumar, Jeremiah Young, Daria Rodionova, Anastasia Yemelina, Taras Andrushko, Igor Marchenko, Polina Mashkovtseva, Alexandra Serova, Emily Prud'hommeaux, Maria Nepomniashchaya, Fausto Giunchiglia, Eleanor Chodroff, Mans Hulden, Miikka Silfverberg, Arya D. McCarthy, David Yarowsky, Ryan Cotterell, Reut Tsarfaty, Ekaterina Vylomova
The project comprises two major thrusts: a language-independent feature schema for rich morphological annotation and a type-level resource of annotated data in diverse languages realizing that schema.
2 code implementations • NAACL 2022 • Karolina Stańczak, Edoardo Ponti, Lucas Torroba Hennigen, Ryan Cotterell, Isabelle Augenstein
The success of multilingual pre-trained models is underpinned by their ability to learn representations shared by multiple languages even in absence of any explicit supervision.
1 code implementation • NAACL 2022 • Ran Zmigrod, Tim Vieira, Ryan Cotterell
However, practitioners rely on Monte Carlo approximation to perform this test due to a lack of a suitable exact algorithm.
no code implementations • ACL 2022 • Karim Lasri, Tiago Pimentel, Alessandro Lenci, Thierry Poibeau, Ryan Cotterell
We also find that BERT uses a separate encoding of grammatical number for nouns and verbs.
no code implementations • ACL 2022 • Aryaman Arora, Clara Meister, Ryan Cotterell
Shannon entropy is often a quantity of interest to linguists studying the communicative capacity of human language.
no code implementations • 31 Mar 2022 • Clara Meister, Gian Wiher, Tiago Pimentel, Ryan Cotterell
Specifically, we posit that human-like language should contain an amount of information (quantified as negative log-probability) that is close to the entropy of the distribution over natural strings.
no code implementations • ACL 2022 • Clara Meister, Tiago Pimentel, Thomas Hikaru Clark, Ryan Cotterell, Roger Levy
Numerous analyses of reading time (RT) data have been implemented -- all in an effort to better understand the cognitive processes driving reading comprehension.
no code implementations • 29 Mar 2022 • Gian Wiher, Clara Meister, Ryan Cotterell
For example, the nature of the diversity-quality trade-off in language generation is very task-specific; the length bias often attributed to beam search is not constant across tasks.
3 code implementations • 1 Feb 2022 • Clara Meister, Tiago Pimentel, Gian Wiher, Ryan Cotterell
Automatic and human evaluations show that, in comparison to nucleus and top-k sampling, locally typical sampling offers competitive performance (in both abstractive summarization and story generation) in terms of quality while consistently reducing degenerate repetitions.
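For orientation, the locally typical sampling rule can be sketched as follows: keep the tokens whose surprisal is closest to the conditional entropy until a target probability mass tau is covered, then renormalize and sample. The code below is illustrative; details (tie-breaking, default mass) may differ from the paper's exact formulation.

```python
# Sketch of locally typical sampling (illustrative, not a reference implementation).
import math
import random

def locally_typical_sample(probs, tau=0.95):
    # probs: dict {token: p(token | context)}, assumed to sum to 1.
    items = [(t, p) for t, p in probs.items() if p > 0]
    entropy = -sum(p * math.log(p) for _, p in items)
    # Rank tokens by how close their surprisal is to the conditional entropy.
    items.sort(key=lambda tp: abs(-math.log(tp[1]) - entropy))
    kept, mass = [], 0.0
    for tok, p in items:
        kept.append((tok, p))
        mass += p
        if mass >= tau:
            break
    total = sum(p for _, p in kept)
    tokens, weights = zip(*kept)
    return random.choices(tokens, weights=[p / total for p in weights])[0]
```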
2 code implementations • 28 Jan 2022 • Shauli Ravfogel, Michael Twiton, Yoav Goldberg, Ryan Cotterell
Modern neural models trained on textual data rely on pre-trained representations that emerge without direct supervision.
1 code implementation • 28 Jan 2022 • Shauli Ravfogel, Francisco Vargas, Yoav Goldberg, Ryan Cotterell
One prominent approach for the identification of concepts in neural representations is searching for a linear subspace whose erasure prevents the prediction of the concept from the representations.
2 code implementations • 20 Jan 2022 • Karolina Stańczak, Lucas Torroba Hennigen, Adina Williams, Ryan Cotterell, Isabelle Augenstein
The success of pre-trained contextualized representations has prompted researchers to analyze them for the presence of linguistic information.
no code implementations • 7 Nov 2021 • Zeerak Talat, Hagen Blix, Josef Valvoda, Maya Indira Ganesh, Ryan Cotterell, Adina Williams
Ethics is one of the longest standing intellectual endeavors of humanity.
1 code implementation • ACL 2022 • Alexander Immer, Lucas Torroba Hennigen, Vincent Fortuin, Ryan Cotterell
Such performance improvements have motivated researchers to quantify and understand the linguistic information encoded in these representations.
1 code implementation • 30 Sep 2021 • Tiago Pimentel, Clara Meister, Elizabeth Salesky, Simone Teufel, Damián Blasi, Ryan Cotterell
We thus conclude that there is strong evidence of a surprisal--duration trade-off in operation, both across and within the world's languages.
1 code implementation • EMNLP 2021 • Tiago Pimentel, Clara Meister, Simone Teufel, Ryan Cotterell
Homophony's widespread presence in natural languages is a controversial topic.
1 code implementation • EMNLP 2021 • Niklas Stoehr, Lucas Torroba Hennigen, Samin Ahbab, Robert West, Ryan Cotterell
We do this by devising a set of textual and graph-based features which represent each of the causes.
no code implementations • EMNLP 2021 • Clara Meister, Tiago Pimentel, Patrick Haller, Lena Jäger, Ryan Cotterell, Roger Levy
The uniform information density (UID) hypothesis posits a preference among language users for utterances structured such that information is distributed uniformly across a signal.
1 code implementation • 22 Sep 2021 • Clara Meister, Afra Amini, Tim Vieira, Ryan Cotterell
In this work, we propose a new method for turning beam search into a stochastic process: Conditional Poisson stochastic beam search.
1 code implementation • Findings (EMNLP) 2021 • Damian Pascual, Beni Egressy, Clara Meister, Ryan Cotterell, Roger Wattenhofer
Large pre-trained language models have repeatedly shown their ability to produce fluent text.
no code implementations • 14 Sep 2021 • Ran Zmigrod, Tim Vieira, Ryan Cotterell
Colbourn (1996)'s sampling algorithm has a running time of $\mathcal{O}(N^3)$, which is often greater than the mean hitting time of a directed graph.
no code implementations • Findings (EMNLP) 2021 • Tim Vieira, Ryan Cotterell, Jason Eisner
To this end, we describe a set of program transformations, a simple metric for assessing the efficiency of a transformed program, and a heuristic search procedure to improve this metric.
1 code implementation • EMNLP 2021 • Tiago Pimentel, Ryan Cotterell
Pimentel et al. (2020) recently analysed probing from an information-theoretic perspective.
2 code implementations • 10 Aug 2021 • Jiaoda Li, Ryan Cotterell, Mrinmaya Sachan
Multi-head attention, a collection of several attention mechanisms that independently attend to different parts of the input, is the key ingredient in the Transformer.
no code implementations • IJCNLP 2019 • Edoardo Maria Ponti, Ivan Vulić, Ryan Cotterell, Roi Reichart, Anna Korhonen
Motivated by this question, we aim at constructing an informative prior over neural weights, in order to adapt quickly to held-out languages in the task of character-level language modeling.
1 code implementation • ACL 2021 • Ran Zmigrod, Tim Vieira, Ryan Cotterell
Furthermore, we present a novel extension of the algorithm for decoding the K-best dependency trees of a graph which are subject to a root constraint.
no code implementations • ACL 2021 • Clara Meister, Martina Forster, Ryan Cotterell
Beam search is a go-to strategy for decoding neural sequence models.
no code implementations • NAACL (SIGTYP) 2021 • Elizabeth Salesky, Badr M. Abdullah, Sabrina J. Mielke, Elena Klyachko, Oleg Serikov, Edoardo Ponti, Ritesh Kumar, Ryan Cotterell, Ekaterina Vylomova
While language identification is a fundamental speech and language processing task, for many languages and language families it remains a challenging task.
1 code implementation • Findings (ACL) 2021 • Irene Nikkarinen, Tiago Pimentel, Damián E. Blasi, Ryan Cotterell
The unigram distribution is the non-contextual probability of finding a specific word form in a corpus.
no code implementations • NAACL 2021 • Rowan Hall Maudslay, Ryan Cotterell
One method of doing so, which is frequently cited to support the claim that models like BERT encode syntax, is called probing; probes are small supervised models trained to extract linguistic information from another model's output.
no code implementations • ACL 2021 • Clara Meister, Stefan Lazov, Isabelle Augenstein, Ryan Cotterell
Sparse attention has been claimed to increase model interpretability under the assumption that it highlights influential inputs.
1 code implementation • ACL 2021 • Jennifer C. White, Ryan Cotterell
Since language models are used to model a wide variety of languages, it is natural to ask whether the neural architectures used for the task have inductive biases towards modeling particular types of languages.
1 code implementation • 1 Jun 2021 • Ran Zmigrod, Tim Vieira, Ryan Cotterell
Furthermore, we present a novel extension of the algorithm for decoding the $K$-best dependency trees of a graph which are subject to a root constraint.
1 code implementation • ACL 2021 • Ran Zmigrod, Tim Vieira, Ryan Cotterell
In the case of second-order derivatives, our scheme runs in the optimal $\mathcal{O}(A^2 N^4)$ time where $A$ is the alphabet size and $N$ is the number of states.
no code implementations • ACL 2021 • Clara Meister, Ryan Cotterell
As concrete examples, text generated under the nucleus sampling scheme adheres more closely to the type–token relationship of natural language than text produced using standard ancestral sampling; text from LSTMs reflects the natural language distributions over length, stopwords, and symbols surprisingly well.
no code implementations • NAACL 2021 • Jennifer C. White, Tiago Pimentel, Naomi Saphra, Ryan Cotterell
Probes are models devised to investigate the encoding of knowledge -- e.g., syntactic structure -- in contextual representations.
no code implementations • ACL 2021 • Jason Wei, Clara Meister, Ryan Cotterell
The uniform information density (UID) hypothesis, which posits that speakers behaving optimally tend to distribute information uniformly across a linguistic signal, has gained traction in psycholinguistics as an explanation for certain syntactic, morphological, and prosodic choices.
no code implementations • NAACL 2021 • Tiago Pimentel, Irene Nikkarinen, Kyle Mahowald, Ryan Cotterell, Damián Blasi
Examining corpora from 7 typologically diverse languages, we use those upper bounds to quantify the lexicon's optimality and to explore the relative costs of major constraints on natural codes.
1 code implementation • 15 Apr 2021 • Karolina Stańczak, Sagnik Ray Choudhury, Tiago Pimentel, Ryan Cotterell, Isabelle Augenstein
Recent research has demonstrated that large pre-trained language models reflect societal biases expressed in natural language.
2 code implementations • NAACL 2021 • Tiago Pimentel, Brian Roark, Søren Wichmann, Ryan Cotterell, Damián Blasi
It is not a new idea that there are small, cross-linguistic associations between the forms and meanings of words.
2 code implementations • NAACL 2022 • Yuchen Eleanor Jiang, Tianyu Liu, Shuming Ma, Dongdong Zhang, Jian Yang, Haoyang Huang, Rico Sennrich, Ryan Cotterell, Mrinmaya Sachan, Ming Zhou
Standard automatic metrics, e.g., BLEU, are not reliable for document-level MT evaluation.
no code implementations • EACL 2021 • Martina Forster, Clara Meister, Ryan Cotterell
Yet, on word-level tasks, exact inference of these models reveals the empty string is often the global optimum.
1 code implementation • 10 Feb 2021 • Shijie Wu, Edoardo Maria Ponti, Ryan Cotterell
As the main contribution of our work, we implement the phonological generative system as a neural model differentiable end-to-end, rather than as a set of rules or constraints.
1 code implementation • EACL 2021 • Tiago Pimentel, Ryan Cotterell, Brian Roark
Psycholinguistic studies of human word processing and lexical access provide ample evidence of the preferred nature of word-initial versus word-final segments, e.g., in terms of attention paid by listeners (greater) or the likelihood of reduction by speakers (lower).
3 code implementations • 30 Nov 2020 • Emanuele Bugliarello, Ryan Cotterell, Naoaki Okazaki, Desmond Elliott
Large-scale pretraining and task-specific fine-tuning is now the standard methodology for many tasks in computer vision and natural language processing.
no code implementations • COLING 2020 • Paula Czarnowska, Sebastian Ruder, Ryan Cotterell, Ann Copestake
We propose a novel morphologically aware probability model for bilingual lexicon induction, which jointly models lexeme translation and inflectional morphology in a structured way.
no code implementations • EMNLP (SIGTYP) 2020 • Johannes Bjerva, Elizabeth Salesky, Sabrina J. Mielke, Aditi Chaudhary, Giuseppe G. A. Celano, Edoardo M. Ponti, Ekaterina Vylomova, Ryan Cotterell, Isabelle Augenstein
Typological knowledge bases (KBs) such as WALS (Dryer and Haspelmath, 2013) contain information about linguistic properties of the world's languages.
no code implementations • EMNLP 2020 • Jun Yen Leung, Guy Emerson, Ryan Cotterell
Across languages, multiple consecutive adjectives modifying a noun (e.g., "the big red dog") follow certain unmarked ordering rules.
1 code implementation • EMNLP 2020 • Clara Meister, Tim Vieira, Ryan Cotterell
This implies that the MAP objective alone does not express the properties we desire in text, which merits the question: if beam search is the answer, what was the question?
1 code implementation • EMNLP 2020 • Lucas Torroba Hennigen, Adina Williams, Ryan Cotterell
Most modern NLP systems make use of pre-trained contextual representations that attain astonishingly high performance on a variety of tasks.
1 code implementation • EMNLP 2020 • Ran Zmigrod, Tim Vieira, Ryan Cotterell
The connection between dependency trees and spanning trees is exploited by the NLP community to train and to decode graph-based dependency parsers.
1 code implementation • EMNLP 2020 • Tiago Pimentel, Naomi Saphra, Adina Williams, Ryan Cotterell
In our contribution to this discussion, we argue for a probe metric that reflects the fundamental trade-off between probe complexity and performance: the Pareto hypervolume.
1 code implementation • EMNLP 2020 • Tiago Pimentel, Rowan Hall Maudslay, Damián Blasi, Ryan Cotterell
For a language to be clear and efficiently encoded, we posit that the lexical ambiguity of a word type should correlate with how much information context provides about it, on average.
1 code implementation • EMNLP 2020 • Francisco Vargas, Ryan Cotterell
Their method takes pre-trained word representations as input and attempts to isolate a linear subspace that captures most of the gender bias in the representations.
no code implementations • 29 Aug 2020 • Ran Zmigrod, Tim Vieira, Ryan Cotterell
We propose unified algorithms for the important cases of first-order expectations and second-order expectations in edge-factored, non-projective spanning-tree models.
1 code implementation • 8 Jul 2020 • Clara Meister, Tim Vieira, Ryan Cotterell
Decoding for many NLP tasks requires an effective heuristic algorithm for approximating exact search since the problem of searching the full output space is often intractable, or impractical in many settings.
no code implementations • WS 2020 • Rowan Hall Maudslay, Tiago Pimentel, Ryan Cotterell, Simone Teufel
We report the results of our system on the Metaphor Detection Shared Task at the Second Workshop on Figurative Language Processing 2020.
1 code implementation • WS 2020 • Ekaterina Vylomova, Jennifer White, Elizabeth Salesky, Sabrina J. Mielke, Shijie Wu, Edoardo Ponti, Rowan Hall Maudslay, Ran Zmigrod, Josef Valvoda, Svetlana Toldova, Francis Tyers, Elena Klyachko, Ilya Yegorov, Natalia Krizhanovsky, Paula Czarnowska, Irene Nikkarinen, Andrew Krizhanovsky, Tiago Pimentel, Lucas Torroba Hennigen, Christo Kirov, Garrett Nicolai, Adina Williams, Antonios Anastasopoulos, Hilaria Cruz, Eleanor Chodroff, Ryan Cotterell, Miikka Silfverberg, Mans Hulden
Systems were developed using data from 45 languages and just 5 language families, fine-tuned with data from an additional 45 languages and 10 language families (13 in total), and evaluated on all 90 languages.
no code implementations • ACL 2020 • Elizabeth Salesky, Eleanor Chodroff, Tiago Pimentel, Matthew Wiesner, Ryan Cotterell, Alan W. Black, Jason Eisner
A major hurdle in data-driven research on typology is having sufficient data in many languages to draw meaningful conclusions.
3 code implementations • EACL 2021 • Shijie Wu, Ryan Cotterell, Mans Hulden
The transformer has been shown to outperform recurrent neural network-based sequence-to-sequence models in various word-level NLP tasks.
1 code implementation • TACL 2020 • Tiago Pimentel, Brian Roark, Ryan Cotterell
We present methods for calculating a measure of phonotactic complexity---bits per phoneme---that permits a straightforward cross-linguistic comparison.
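The measure itself is the average negative log2-probability a phone-level language model assigns to each phone; the sketch below uses a hypothetical `phone_lm_prob` and is not the paper's implementation.

```python
# Bits per phoneme: average -log2 p(phone | preceding phones) over a corpus.
# phone_lm_prob(context, phone) is a hypothetical placeholder for the phone-level
# language model; corpus is a list of phone sequences (lists of phone symbols).
import math

def bits_per_phoneme(corpus, phone_lm_prob):
    total_bits, total_phones = 0.0, 0
    for phones in corpus:
        for i, phone in enumerate(phones):
            p = phone_lm_prob(tuple(phones[:i]), phone)
            total_bits += -math.log2(p)
            total_phones += 1
    return total_bits / total_phones
```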
1 code implementation • ACL 2020 • Emanuele Bugliarello, Sabrina J. Mielke, Antonios Anastasopoulos, Ryan Cotterell, Naoaki Okazaki
The performance of neural machine translation systems is commonly evaluated in terms of BLEU.
1 code implementation • ACL 2020 • Alexander Erdmann, Micha Elsner, Shijie Wu, Ryan Cotterell, Nizar Habash
Our benchmark system first makes use of word embeddings and string similarity to cluster forms by cell and by paradigm.
1 code implementation • ACL 2020 • Rowan Hall Maudslay, Josef Valvoda, Tiago Pimentel, Adina Williams, Ryan Cotterell
One such probe is the structural probe (Hewitt and Manning, 2019), designed to quantify the extent to which syntactic information is encoded in contextualised word representations.
no code implementations • 3 May 2020 • Adina Williams, Ryan Cotterell, Lawrence Wolf-Sonkin, Damián Blasi, Hanna Wallach
We also find that there are statistically significant relationships between the grammatical genders of inanimate nouns and the verbs that take those nouns as direct objects, as indirect objects, and as subjects.
no code implementations • ACL 2020 • Clara Meister, Elizabeth Salesky, Ryan Cotterell
Prior work has explored directly regularizing the output distributions of probabilistic models to alleviate peaky (i.e., over-confident) predictions, a common sign of overfitting.
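One familiar instance of such output-distribution regularization is label smoothing, which mixes the one-hot target with the uniform distribution over the vocabulary $\mathcal{V}$ (shown here for orientation; the paper considers a broader family of regularizers):

$\tilde{q}(w) = (1 - \epsilon)\,\mathbb{1}\{w = w^{\star}\} + \frac{\epsilon}{|\mathcal{V}|},$

where $w^{\star}$ is the gold token and $\epsilon$ the smoothing weight.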
1 code implementation • ACL 2020 • Adina Williams, Tiago Pimentel, Arya D. McCarthy, Hagen Blix, Eleanor Chodroff, Ryan Cotterell
We find for two Indo-European languages (Czech and German) that form and meaning respectively share significant amounts of information with class (and contribute additional information above and beyond gender).
no code implementations • LREC 2020 • Arya D. McCarthy, Christo Kirov, Matteo Grella, Amrit Nidhi, Patrick Xia, Kyle Gorman, Ekaterina Vylomova, Sabrina J. Mielke, Garrett Nicolai, Miikka Silfverberg, Timofey Arkhangelskiy, Nataly Krizhanovsky, Andrew Krizhanovsky, Elena Klyachko, Alexey Sorokin, John Mansfield, Valts Ernštreits, Yuval Pinter, Cassandra L. Jacobs, Ryan Cotterell, Mans Hulden, David Yarowsky
The project comprises two major thrusts: a language-independent feature schema for rich morphological annotation and a type-level resource of annotated data in diverse languages realizing that schema.
1 code implementation • ACL 2020 • Tiago Pimentel, Josef Valvoda, Rowan Hall Maudslay, Ran Zmigrod, Adina Williams, Ryan Cotterell
The success of neural networks on a diverse set of NLP tasks has led researchers to question how much these networks actually "know" about natural language.
1 code implementation • 30 Jan 2020 • Edoardo M. Ponti, Ivan Vulić, Ryan Cotterell, Marinela Parovic, Roi Reichart, Anna Korhonen
In this work, we propose a Bayesian generative model for the space of neural parameters.
no code implementations • EMNLP 2016 • Ryan Cotterell, Arun Kumar, Hinrich Schütze
Morphological segmentation has traditionally been modeled with non-hierarchical models, which yield flat segmentations as output.
no code implementations • CONLL 2019 • Kyle Gorman, Arya D. McCarthy, Ryan Cotterell, Ekaterina Vylomova, Miikka Silfverberg, Magdalena Markowska
We conduct a manual error analysis of the CoNLL-SIGMORPHON Shared Task on Morphological Reinflection.
no code implementations • IJCNLP 2019 • Adina Williams, Ryan Cotterell, Lawrence Wolf-Sonkin, Damián Blasi, Hanna Wallach
To that end, we use canonical correlation analysis to correlate the grammatical gender of inanimate nouns with an externally grounded definition of their lexical semantics.
no code implementations • WS 2019 • Arya D. McCarthy, Ekaterina Vylomova, Shijie Wu, Chaitanya Malaviya, Lawrence Wolf-Sonkin, Garrett Nicolai, Christo Kirov, Miikka Silfverberg, Sabrina J. Mielke, Jeffrey Heinz, Ryan Cotterell, Mans Hulden
The SIGMORPHON 2019 shared task on cross-lingual transfer and contextual analysis in morphology examined transfer learning of inflection between 100 language pairs, as well as contextual lemmatization and morphosyntactic description in 66 languages.
no code implementations • IJCNLP 2019 • Paula Czarnowska, Sebastian Ruder, Edouard Grave, Ryan Cotterell, Ann Copestake
Human translators routinely have to translate rare inflections of words - due to the Zipfian distribution of words in a language.
1 code implementation • IJCNLP 2019 • Pei Zhou, Weijia Shi, Jieyu Zhao, Kuan-Hao Huang, Muhao Chen, Ryan Cotterell, Kai-Wei Chang
Recent studies have shown that word embeddings exhibit gender bias inherited from the training corpora.
no code implementations • IJCNLP 2019 • Rowan Hall Maudslay, Hila Gonen, Ryan Cotterell, Simone Teufel
An alternative approach is Counterfactual Data Augmentation (CDA), in which a corpus is duplicated and augmented to remove bias, e.g., by swapping all inherently-gendered words in the copy.
no code implementations • WS 2019 • Tiago Pimentel, Brian Roark, Ryan Cotterell
In this work, we propose the use of phone-level language models to estimate phonotactic complexity, measured in bits per phoneme, which makes cross-linguistic comparison straightforward.
no code implementations • 4 Jul 2019 • Ryan Cotterell, Hinrich Schütze
Linguistic similarity is multi-faceted.
no code implementations • ACL 2019 • Damian Blasi, Ryan Cotterell, Lawrence Wolf-Sonkin, Sabine Stoll, Balthasar Bickel, Marco Baroni
Embedding a clause inside another ("the girl [who likes cars [that run fast]] has arrived") is a fundamental resource that has been argued to be a key driver of linguistic expressiveness.
1 code implementation • ACL 2019 • Shijie Wu, Ryan Cotterell, Timothy J. O'Donnell
We present a study of morphological irregularity.
no code implementations • ACL 2019 • Johannes Bjerva, Yova Kementchedjhieva, Ryan Cotterell, Isabelle Augenstein
The study of linguistic typology is rooted in the implications we find between linguistic features, such as the fact that languages with object-verb word ordering tend to have post-positions.
1 code implementation • ACL 2019 • Tiago Pimentel, Arya D. McCarthy, Damián E. Blasi, Brian Roark, Ryan Cotterell
A longstanding debate in semiotics centers on the relationship between linguistic signs and their corresponding semantics: is there an arbitrary relationship between a word form and its meaning, or does some systematic phenomenon pervade?
no code implementations • ACL 2019 • Sabrina J. Mielke, Ryan Cotterell, Kyle Gorman, Brian Roark, Jason Eisner
Trying to answer the question of what features difficult languages have in common, we try and fail to reproduce our earlier (Cotterell et al., 2018) observation about morphological complexity and instead reveal far simpler statistics of the data that seem to drive complexity in a much larger sample.
no code implementations • ACL 2019 • Alexander Hoyle, Lawrence Wolf-Sonkin, Hanna Wallach, Isabelle Augenstein, Ryan Cotterell
Studying the ways in which language is gendered has long been an area of interest in sociolinguistics.
no code implementations • ACL 2019 • Ran Zmigrod, Sabrina J. Mielke, Hanna Wallach, Ryan Cotterell
Gender stereotypes are manifest in most of the world's languages and are consequently propagated or amplified by NLP systems.
2 code implementations • ACL 2019 • Shijie Wu, Ryan Cotterell
Our models achieve state-of-the-art performance on morphological inflection.
no code implementations • NAACL 2019 • Ekaterina Vylomova, Ryan Cotterell, Timothy Baldwin, Trevor Cohn, Jason Eisner
Critical to natural language generation is the production of correctly inflected text.
2 code implementations • NAACL 2019 • Jieyu Zhao, Tianlu Wang, Mark Yatskar, Ryan Cotterell, Vicente Ordonez, Kai-Wei Chang
In this paper, we quantify, analyze and mitigate gender bias exhibited in ELMo's contextualized word vectors.
1 code implementation • NAACL 2019 • Alexander Hoyle, Lawrence Wolf-Sonkin, Hanna Wallach, Ryan Cotterell, Isabelle Augenstein
When assigning quantitative labels to a dataset, different methodologies may rely on different scales.
no code implementations • NAACL 2019 • Chaitanya Malaviya, Shijie Wu, Ryan Cotterell
English verbs have multiple forms.
1 code implementation • NAACL 2019 • Johannes Bjerva, Yova Kementchedjhieva, Ryan Cotterell, Isabelle Augenstein
In the principles-and-parameters framework, the structural features of languages depend on parameters that may be toggled on or off, with a single parameter often dictating the status of multiple features.