Search Results for author: Martin Josifoski

Found 17 papers, 11 papers with code

The Era of Semantic Decoding

no code implementations 21 Mar 2024 Maxime Peyrard, Martin Josifoski, Robert West

We refer to these orchestrated interactions among semantic processors, optimizing and searching in semantic space, as semantic decoding algorithms.

Symbolic Autoencoding for Self-Supervised Sequence Learning

no code implementations 16 Feb 2024 Mohammad Hossein Amani, Nicolas Mario Baldwin, Amin Mansouri, Martin Josifoski, Maxime Peyrard, Robert West

Traditional language models, adept at next-token prediction in text sequences, often struggle with transduction tasks between distinct symbolic systems, particularly when parallel data is scarce.

Weakly-supervised Learning

Sketch-Guided Constrained Decoding for Boosting Blackbox Large Language Models without Logit Access

no code implementations 18 Jan 2024 Saibo Geng, Berkay Döner, Chris Wendler, Martin Josifoski, Robert West

This paper introduces sketch-guided constrained decoding (SGCD), a novel approach to constrained decoding for blackbox LLMs, which operates without access to the logits of the blackbox LLM.

Constituency Parsing, Language Modelling, +1

A Glitch in the Matrix? Locating and Detecting Language Model Grounding with Fakepedia

1 code implementation 4 Dec 2023 Giovanni Monea, Maxime Peyrard, Martin Josifoski, Vishrav Chaudhary, Jason Eisner, Emre Kiciman, Hamid Palangi, Barun Patra, Robert West

Yet the mechanisms underlying this contextual grounding remain unknown, especially in situations where contextual information contradicts factual knowledge stored in the parameters, which LLMs also excel at recalling.

counterfactual, Language Modelling, +1

Grammar-Constrained Decoding for Structured NLP Tasks without Finetuning

2 code implementations 23 May 2023 Saibo Geng, Martin Josifoski, Maxime Peyrard, Robert West

In this work, we demonstrate that formal grammars can describe the output space for a much wider range of tasks and argue that GCD can serve as a unified framework for structured NLP tasks in general.

Code Generation, Constituency Parsing, +2

Exploiting Asymmetry for Synthetic Training Data Generation: SynthIE and the Case of Information Extraction

1 code implementation 7 Mar 2023 Martin Josifoski, Marija Sakota, Maxime Peyrard, Robert West

This work shows that useful data can be synthetically generated even for tasks that cannot be solved directly by LLMs: for problems with structured outputs, it is possible to prompt an LLM to perform the task in the reverse direction, by generating plausible input text for a target output structure.

Synthetic Data Generation

Scalable PAC-Bayesian Meta-Learning via the PAC-Optimal Hyper-Posterior: From Theory to Practice

no code implementations 14 Nov 2022 Jonas Rothfuss, Martin Josifoski, Vincent Fortuin, Andreas Krause

Meta-Learning aims to speed up the learning process on new tasks by acquiring useful inductive biases from datasets of related learning tasks.

Gaussian Processes, Meta-Learning, +1

Language Model Decoding as Likelihood-Utility Alignment

1 code implementation 13 Oct 2022 Martin Josifoski, Maxime Peyrard, Frano Rajic, Jiheng Wei, Debjit Paul, Valentin Hartmann, Barun Patra, Vishrav Chaudhary, Emre Kiciman, Boi Faltings, Robert West

Specifically, by analyzing the correlation between the likelihood and the utility of predictions across a diverse set of tasks, we provide empirical evidence supporting the proposed taxonomy and a set of principles to structure reasoning when choosing a decoding algorithm.

Language Modelling, Text Generation

GenIE: Generative Information Extraction

1 code implementation NAACL 2022 Martin Josifoski, Nicola De Cao, Maxime Peyrard, Fabio Petroni, Robert West

Structured and grounded representation of text is typically formalized by closed information extraction, the problem of extracting an exhaustive set of (subject, relation, object) triplets that are consistent with a predefined set of entities and relations from a knowledge base schema.

Invariant Language Modeling

1 code implementation 16 Oct 2021 Maxime Peyrard, Sarvjeet Singh Ghotra, Martin Josifoski, Vidhan Agarwal, Barun Patra, Dean Carignan, Emre Kiciman, Robert West

In particular, we adapt a game-theoretic formulation of IRM (IRM-games) to language models, where the invariance emerges from a specific training schedule in which all the environments compete to optimize their own environment-specific loss by updating subsets of the model in a round-robin fashion.

Domain Generalization, Language Modelling

Meta-Learning Bayesian Neural Network Priors Based on PAC-Bayesian Theory

no code implementations 1 Jan 2021 Jonas Rothfuss, Martin Josifoski, Andreas Krause

Bayesian deep learning is a promising approach towards improved uncertainty quantification and sample efficiency.

Meta-Learning, Uncertainty Quantification, +1

Scalable Zero-shot Entity Linking with Dense Entity Retrieval

3 code implementations EMNLP 2020 Ledell Wu, Fabio Petroni, Martin Josifoski, Sebastian Riedel, Luke Zettlemoyer

This paper introduces a conceptually simple, scalable, and highly effective BERT-based entity linking model, along with an extensive evaluation of its accuracy-speed trade-off.

Entity Embeddings, Entity Linking, +3

Crosslingual Document Embedding as Reduced-Rank Ridge Regression

1 code implementation 8 Apr 2019 Martin Josifoski, Ivan S. Paskov, Hristo S. Paskov, Martin Jaggi, Robert West

Finally, although not trained for embedding sentences and words, it also achieves competitive performance on crosslingual sentence and word retrieval tasks.

Document Embedding, regression, +2
