Search Results for author: Maxime Peyrard

Found 36 papers, 27 papers with code

The Era of Semantic Decoding

no code implementations 21 Mar 2024 Maxime Peyrard, Martin Josifoski, Robert West

We refer to these orchestrated interactions among semantic processors, optimizing and searching in semantic space, as semantic decoding algorithms.

Symbolic Autoencoding for Self-Supervised Sequence Learning

no code implementations 16 Feb 2024 Mohammad Hossein Amani, Nicolas Mario Baldwin, Amin Mansouri, Martin Josifoski, Maxime Peyrard, Robert West

Traditional language models, adept at next-token prediction in text sequences, often struggle with transduction tasks between distinct symbolic systems, particularly when parallel data is scarce.

Weakly-supervised Learning

A Glitch in the Matrix? Locating and Detecting Language Model Grounding with Fakepedia

1 code implementation 4 Dec 2023 Giovanni Monea, Maxime Peyrard, Martin Josifoski, Vishrav Chaudhary, Jason Eisner, Emre Kiciman, Hamid Palangi, Barun Patra, Robert West

Yet the mechanisms underlying this contextual grounding remain unknown, especially in situations where contextual information contradicts factual knowledge stored in the parameters, which LLMs also excel at recalling.

counterfactual, Language Modelling

Fly-Swat or Cannon? Cost-Effective Language Model Choice via Meta-Modeling

2 code implementations 11 Aug 2023 Marija Šakota, Maxime Peyrard, Robert West

For a wide variety of tasks, inputs can be phrased as natural language prompts for an LM, from whose output the solution can then be extracted.

Language Modelling

Grammar-Constrained Decoding for Structured NLP Tasks without Finetuning

2 code implementations 23 May 2023 Saibo Geng, Martin Josifoski, Maxime Peyrard, Robert West

In this work, we demonstrate that formal grammars can describe the output space for a much wider range of tasks and argue that GCD can serve as a unified framework for structured NLP tasks in general.

Code Generation, Constituency Parsing

REFINER: Reasoning Feedback on Intermediate Representations

1 code implementation 4 Apr 2023 Debjit Paul, Mete Ismayilzada, Maxime Peyrard, Beatriz Borges, Antoine Bosselut, Robert West, Boi Faltings

Language models (LMs) have recently shown remarkable performance on reasoning tasks by explicitly generating intermediate inferences, e.g., chain-of-thought prompting.

Exploiting Asymmetry for Synthetic Training Data Generation: SynthIE and the Case of Information Extraction

1 code implementation 7 Mar 2023 Martin Josifoski, Marija Sakota, Maxime Peyrard, Robert West

This work shows that useful data can be synthetically generated even for tasks that cannot be solved directly by LLMs: for problems with structured outputs, it is possible to prompt an LLM to perform the task in the reverse direction, by generating plausible input text for a target output structure.

Synthetic Data Generation

Language Model Decoding as Likelihood-Utility Alignment

1 code implementation 13 Oct 2022 Martin Josifoski, Maxime Peyrard, Frano Rajic, Jiheng Wei, Debjit Paul, Valentin Hartmann, Barun Patra, Vishrav Chaudhary, Emre Kiciman, Boi Faltings, Robert West

Specifically, by analyzing the correlation between the likelihood and the utility of predictions across a diverse set of tasks, we provide empirical evidence supporting the proposed taxonomy and a set of principles to structure reasoning when choosing a decoding algorithm.

Language Modelling, Text Generation

Distribution inference risks: Identifying and mitigating sources of leakage

2 code implementations 18 Sep 2022 Valentin Hartmann, Léo Meynent, Maxime Peyrard, Dimitrios Dimitriadis, Shruti Tople, Robert West

We identify three sources of leakage: (1) memorizing specific information about the $\mathbb{E}[Y|X]$ (expected label given the feature values) of interest to the adversary, (2) wrong inductive bias of the model, and (3) finiteness of the training data.

Inductive Bias

The Glass Ceiling of Automatic Evaluation in Natural Language Generation

no code implementations 31 Aug 2022 Pierre Colombo, Maxime Peyrard, Nathan Noiry, Robert West, Pablo Piantanida

Automatic evaluation metrics capable of replacing human judgments are critical to allowing fast development of new methods.

Text Generation

Predicting is not Understanding: Recognizing and Addressing Underspecification in Machine Learning

no code implementations 6 Jul 2022 Damien Teney, Maxime Peyrard, Ehsan Abbasnejad

Underspecification refers to the existence of multiple models that are indistinguishable in their in-domain accuracy, even though they differ in other desirable properties such as out-of-distribution (OOD) performance.

BIG-bench Machine Learning, Model Selection

Descartes: Generating Short Descriptions of Wikipedia Articles

1 code implementation 20 May 2022 Marija Sakota, Maxime Peyrard, Robert West

Wikipedia is one of the richest knowledge sources on the Web today.

On the Context-Free Ambiguity of Emoji

1 code implementation 17 Jan 2022 Justyna Czestochowska, Kristina Gligoric, Maxime Peyrard, Yann Mentha, Michal Bien, Andrea Grutter, Anita Auer, Aris Xanthos, Robert West

We find that with 30 annotations per emoji, 16 emojis (1.2%) are completely unambiguous, whereas 55 emojis (4.3%) are so ambiguous that their descriptions are indistinguishable from randomly chosen descriptions.

GenIE: Generative Information Extraction

1 code implementation NAACL 2022 Martin Josifoski, Nicola De Cao, Maxime Peyrard, Fabio Petroni, Robert West

Structured and grounded representation of text is typically formalized by closed information extraction, the problem of extracting an exhaustive set of (subject, relation, object) triplets that are consistent with a predefined set of entities and relations from a knowledge base schema.

Better than Average: Paired Evaluation of NLP Systems

1 code implementation ACL 2021 Maxime Peyrard, Wei Zhao, Steffen Eger, Robert West

Evaluation in NLP is usually done by comparing the scores of competing systems independently averaged over a common set of test instances.

Invariant Language Modeling

1 code implementation 16 Oct 2021 Maxime Peyrard, Sarvjeet Singh Ghotra, Martin Josifoski, Vidhan Agarwal, Barun Patra, Dean Carignan, Emre Kiciman, Robert West

In particular, we adapt a game-theoretic formulation of IRM (IRM-games) to language models, where the invariance emerges from a specific training schedule in which all the environments compete to optimize their own environment-specific loss by updating subsets of the model in a round-robin fashion.

Domain Generalization, Language Modelling

Laughing Heads: Can Transformers Detect What Makes a Sentence Funny?

1 code implementation 19 May 2021 Maxime Peyrard, Beatriz Borges, Kristina Gligorić, Robert West

We make progress in both respects by training and analyzing transformer-based humor recognition models on a recently introduced dataset consisting of minimal pairs of aligned sentences, one serious, the other humorous.

Benchmarking, Sentence

KLearn: Background Knowledge Inference from Summarization Data

1 code implementation Findings of the Association for Computational Linguistics 2020 Maxime Peyrard, Robert West

The goal of text summarization is to compress documents to the relevant information while excluding background information already known to the receiver.

Text Summarization

A Ladder of Causal Distances

1 code implementation 5 May 2020 Maxime Peyrard, Robert West

Causal discovery, the task of automatically constructing a causal model from data, is of major significance across the sciences.

Benchmarking, Causal Discovery

On the Limitations of Cross-lingual Encoders as Exposed by Reference-Free Machine Translation Evaluation

1 code implementation ACL 2020 Wei Zhao, Goran Glavaš, Maxime Peyrard, Yang Gao, Robert West, Steffen Eger

We systematically investigate a range of metrics based on state-of-the-art cross-lingual semantic representations obtained with pretrained M-BERT and LASER.

Language Modelling, Machine Translation

Studying Summarization Evaluation Metrics in the Appropriate Scoring Range

1 code implementation ACL 2019 Maxime Peyrard

In summarization, automatic evaluation metrics are usually compared based on their ability to correlate with human judgments.

Objective Function Learning to Match Human Judgements for Optimization-Based Summarization

no code implementations NAACL 2018 Maxime Peyrard, Iryna Gurevych

Supervised summarization systems usually rely on supervision at the sentence or n-gram level provided by automatic metrics like ROUGE, which act as noisy proxies for human judgments.


Live Blog Corpus for Summarization

1 code implementation LREC 2018 Avinesh P. V. S., Maxime Peyrard, Christian M. Meyer

Live blogs are an increasingly popular news format to cover breaking news and live events in online journalism.

A Simple Theoretical Model of Importance for Summarization

no code implementations ACL 2019 Maxime Peyrard

Research on summarization has mainly been driven by empirical approaches, crafting systems to perform well on standard datasets with the notion of information Importance remaining latent.


Supervised Learning of Automatic Pyramid for Optimization-Based Multi-Document Summarization

no code implementations ACL 2017 Maxime Peyrard, Judith Eckle-Kohler

We present a new supervised framework that learns to estimate automatic Pyramid scores and uses them for optimization-based extractive multi-document summarization.

Document Summarization, Multi-Document Summarization

A Principled Framework for Evaluating Summarizers: Comparing Models of Summary Quality against Human Judgments

1 code implementation ACL 2017 Maxime Peyrard, Judith Eckle-Kohler

We present a new framework for evaluating extractive summarizers, which is based on a principled representation as an optimization problem.

The Next Step for Multi-Document Summarization: A Heterogeneous Multi-Genre Corpus Built with a Novel Construction Approach

1 code implementation COLING 2016 Markus Zopf, Maxime Peyrard, Judith Eckle-Kohler

In a detailed analysis, we show that our new corpus is significantly different from the homogeneous corpora commonly used, and that it is heterogeneous along several dimensions.

Document Summarization, Multi-Document Summarization
