Search Results for author: Clara Meister

Found 39 papers, 18 papers with code

Conditional Poisson Stochastic Beams

no code implementations EMNLP 2021 Clara Meister, Afra Amini, Tim Vieira, Ryan Cotterell

Beam search is the default decoding strategy for many sequence generation tasks in NLP.

A surprisal–duration trade-off across and within the world’s languages

1 code implementation EMNLP 2021 Tiago Pimentel, Clara Meister, Elizabeth Salesky, Simone Teufel, Damián Blasi, Ryan Cotterell

We thus conclude that there is strong evidence of a surprisal–duration trade-off in operation, both across and within the world’s languages.

High probability or low information? The probability–quality paradox in language generation

no code implementations ACL 2022 Clara Meister, Gian Wiher, Tiago Pimentel, Ryan Cotterell

When generating natural language from neural probabilistic models, high probability does not always coincide with high quality.

Text Generation

Causal Estimation of Memorisation Profiles

1 code implementation6 Jun 2024 Pietro Lesci, Clara Meister, Thomas Hofmann, Andreas Vlachos, Tiago Pimentel

Understanding memorisation in language models has practical and societal implications, e. g., studying models' training dynamics or preventing copyright infringements.

counterfactual Econometrics

The Role of $n$-gram Smoothing in the Age of Neural Networks

no code implementations25 Mar 2024 Luca Malagutti, Andrius Buinovskij, Anej Svete, Clara Meister, Afra Amini, Ryan Cotterell

For nearly three decades, language models derived from the $n$-gram assumption held the state of the art on the task.

Language Modelling Machine Translation

Revisiting the Optimality of Word Lengths

no code implementations6 Dec 2023 Tiago Pimentel, Clara Meister, Ethan Gotlieb Wilcox, Kyle Mahowald, Ryan Cotterell

Under this method, we find that a language's word lengths should instead be proportional to the surprisal's expectation plus its variance-to-mean ratio.

Formal Aspects of Language Modeling

no code implementations7 Nov 2023 Ryan Cotterell, Anej Svete, Clara Meister, Tianyu Liu, Li Du

Large language models have become one of the most commonly deployed NLP inventions.

Language Modelling

Testing the Predictions of Surprisal Theory in 11 Languages

no code implementations7 Jul 2023 Ethan Gotlieb Wilcox, Tiago Pimentel, Clara Meister, Ryan Cotterell, Roger P. Levy

We address this gap in the current literature by investigating the relationship between surprisal and reading times in eleven different languages, distributed across five language families.

On the Efficacy of Sampling Adapters

1 code implementation7 Jul 2023 Clara Meister, Tiago Pimentel, Luca Malagutti, Ethan G. Wilcox, Ryan Cotterell

While this trade-off is not reflected in standard metrics of distribution quality (such as perplexity), we find that several precision-emphasizing measures indeed indicate that sampling adapters can lead to probability distributions more aligned with the true distribution.

Text Generation

A Formal Perspective on Byte-Pair Encoding

1 code implementation29 Jun 2023 Vilém Zouhar, Clara Meister, Juan Luis Gastaldi, Li Du, Tim Vieira, Mrinmaya Sachan, Ryan Cotterell

Via submodular functions, we prove that the iterative greedy version is a $\frac{1}{{\sigma(\boldsymbol{\mu}^\star)}}(1-e^{-{\sigma(\boldsymbol{\mu}^\star)}})$-approximation of an optimal merge sequence, where ${\sigma(\boldsymbol{\mu}^\star)}$ is the total backward curvature with respect to the optimal merge sequence $\boldsymbol{\mu}^\star$.

Combinatorial Optimization

A Measure-Theoretic Characterization of Tight Language Models

no code implementations20 Dec 2022 Li Du, Lucas Torroba Hennigen, Tiago Pimentel, Clara Meister, Jason Eisner, Ryan Cotterell

Language modeling, a central task in natural language processing, involves estimating a probability distribution over strings.

Language Modelling

A Natural Bias for Language Generation Models

no code implementations19 Dec 2022 Clara Meister, Wojciech Stokowiec, Tiago Pimentel, Lei Yu, Laura Rimell, Adhiguna Kuncoro

After just a few hundred training updates, a standard probabilistic model for language generation has likely not yet learnt many semantic or syntactic rules of natural language, making it difficult to estimate the probability distribution over next tokens.

Machine Translation Text Generation

On the Effect of Anticipation on Reading Times

1 code implementation25 Nov 2022 Tiago Pimentel, Clara Meister, Ethan G. Wilcox, Roger Levy, Ryan Cotterell

We assess the effect of anticipation on reading by comparing how well surprisal and contextual entropy predict reading times on four naturalistic reading datasets: two self-paced and two eye-tracking.

Mutual Information Alleviates Hallucinations in Abstractive Summarization

2 code implementations24 Oct 2022 Liam van der Poel, Ryan Cotterell, Clara Meister

Despite significant progress in the quality of language generated from abstractive summarization models, these models still exhibit the tendency to hallucinate, i. e., output content not supported by the source document.

Abstractive Text Summarization

Naturalistic Causal Probing for Morpho-Syntax

1 code implementation14 May 2022 Afra Amini, Tiago Pimentel, Clara Meister, Ryan Cotterell

Probing has become a go-to methodology for interpreting and analyzing deep neural models in natural language processing.

Sentence

Estimating the Entropy of Linguistic Distributions

no code implementations ACL 2022 Aryaman Arora, Clara Meister, Ryan Cotterell

Shannon entropy is often a quantity of interest to linguists studying the communicative capacity of human language.

On the probability-quality paradox in language generation

no code implementations31 Mar 2022 Clara Meister, Gian Wiher, Tiago Pimentel, Ryan Cotterell

Specifically, we posit that human-like language should contain an amount of information (quantified as negative log-probability) that is close to the entropy of the distribution over natural strings.

Text Generation

Analyzing Wrap-Up Effects through an Information-Theoretic Lens

no code implementations ACL 2022 Clara Meister, Tiago Pimentel, Thomas Hikaru Clark, Ryan Cotterell, Roger Levy

Numerous analyses of reading time (RT) data have been implemented -- all in an effort to better understand the cognitive processes driving reading comprehension.

Reading Comprehension Sentence

On Decoding Strategies for Neural Text Generators

no code implementations29 Mar 2022 Gian Wiher, Clara Meister, Ryan Cotterell

For example, the nature of the diversity-quality trade-off in language generation is very task-specific; the length bias often attributed to beam search is not constant across tasks.

Machine Translation Story Generation

Locally Typical Sampling

3 code implementations1 Feb 2022 Clara Meister, Tiago Pimentel, Gian Wiher, Ryan Cotterell

Automatic and human evaluations show that, in comparison to nucleus and top-k sampling, locally typical sampling offers competitive performance (in both abstractive summarization and story generation) in terms of quality while consistently reducing degenerate repetitions.

Abstractive Text Summarization Story Generation

A surprisal--duration trade-off across and within the world's languages

1 code implementation30 Sep 2021 Tiago Pimentel, Clara Meister, Elizabeth Salesky, Simone Teufel, Damián Blasi, Ryan Cotterell

We thus conclude that there is strong evidence of a surprisal--duration trade-off in operation, both across and within the world's languages.

On Homophony and Rényi Entropy

1 code implementation EMNLP 2021 Tiago Pimentel, Clara Meister, Simone Teufel, Ryan Cotterell

Homophony's widespread presence in natural languages is a controversial topic.

Revisiting the Uniform Information Density Hypothesis

no code implementations EMNLP 2021 Clara Meister, Tiago Pimentel, Patrick Haller, Lena Jäger, Ryan Cotterell, Roger Levy

The uniform information density (UID) hypothesis posits a preference among language users for utterances structured such that information is distributed uniformly across a signal.

Linguistic Acceptability Sentence

Conditional Poisson Stochastic Beam Search

1 code implementation22 Sep 2021 Clara Meister, Afra Amini, Tim Vieira, Ryan Cotterell

In this work, we propose a new method for turning beam search into a stochastic process: Conditional Poisson stochastic beam search.

Is Sparse Attention more Interpretable?

no code implementations ACL 2021 Clara Meister, Stefan Lazov, Isabelle Augenstein, Ryan Cotterell

Sparse attention has been claimed to increase model interpretability under the assumption that it highlights influential inputs.

text-classification Text Classification

Language Model Evaluation Beyond Perplexity

no code implementations ACL 2021 Clara Meister, Ryan Cotterell

As concrete examples, text generated under the nucleus sampling scheme adheres more closely to the type--token relationship of natural language than text produced using standard ancestral sampling; text from LSTMs reflects the natural language distributions over length, stopwords, and symbols surprisingly well.

Language Modelling

A Cognitive Regularizer for Language Modeling

no code implementations ACL 2021 Jason Wei, Clara Meister, Ryan Cotterell

The uniform information density (UID) hypothesis, which posits that speakers behaving optimally tend to distribute information uniformly across a linguistic signal, has gained traction in psycholinguistics as an explanation for certain syntactic, morphological, and prosodic choices.

Inductive Bias Language Modelling

If beam search is the answer, what was the question?

1 code implementation EMNLP 2020 Clara Meister, Tim Vieira, Ryan Cotterell

This implies that the MAP objective alone does not express the properties we desire in text, which merits the question: if beam search is the answer, what was the question?

Machine Translation Text Generation +1

Best-First Beam Search

1 code implementation8 Jul 2020 Clara Meister, Tim Vieira, Ryan Cotterell

Decoding for many NLP tasks requires an effective heuristic algorithm for approximating exact search since the problem of searching the full output space is often intractable, or impractical in many settings.

Generalized Entropy Regularization or: There's Nothing Special about Label Smoothing

no code implementations ACL 2020 Clara Meister, Elizabeth Salesky, Ryan Cotterell

Prior work has explored directly regularizing the output distributions of probabilistic models to alleviate peaky (i. e. over-confident) predictions, a common sign of overfitting.

Text Generation

Testing Machine Translation via Referential Transparency

no code implementations22 Apr 2020 Pinjia He, Clara Meister, Zhendong Su

Machine translation software has seen rapid progress in recent years due to the advancement of deep neural networks.

Machine Translation Medical Diagnosis +1

Structure-Invariant Testing for Machine Translation

2 code implementations19 Jul 2019 Pinjia He, Clara Meister, Zhendong Su

Despite its apparent importance, validating the robustness of machine translation systems is very difficult and has, therefore, been much under-explored.

Dependency Parsing Machine Translation +3

Cannot find the paper you are looking for? You can Submit a new open access paper.