1 code implementation • 11 Mar 2025 • Jason Becker, Chris Wendler, Peter Baylies, Robert West, Christian Wressnegger
We train Latent-CLIP on 2.7B pairs of latent images and descriptive texts, and show that it matches the zero-shot classification performance of similarly sized CLIP models on both the ImageNet benchmark and an LDM-generated version of it, demonstrating its effectiveness in assessing both real and generated content.
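As a rough illustration (not the paper's code), CLIP-style zero-shot classification reduces to cosine similarity between normalized image and text embeddings; the `latent_encoder` and `text_encoder` callables below are assumptions for the sketch, with the Latent-CLIP-specific twist being that the image tower consumes VAE latents rather than pixels:

```python
import torch
import torch.nn.functional as F

def zero_shot_classify(latent_encoder, text_encoder, latents, class_prompts):
    """CLIP-style zero-shot classification, here over latent images.

    latent_encoder: maps VAE latents to the shared embedding space (assumed).
    text_encoder:   maps tokenized class prompts to the same space (assumed).
    """
    image_emb = F.normalize(latent_encoder(latents), dim=-1)    # (B, D)
    text_emb = F.normalize(text_encoder(class_prompts), dim=-1) # (C, D)
    logits = image_emb @ text_emb.T                             # cosine similarities (B, C)
    return logits.argmax(dim=-1)                                # predicted class per latent
```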
1 code implementation • 10 Jan 2025 • Jannik Brinkmann, Chris Wendler, Christian Bartelt, Aaron Mueller
In large language models (LLMs), how are multiple languages learned and encoded?
no code implementations • 4 Dec 2024 • Saibo Geng, Sankalp Gambhir, Chris Wendler, Robert West
Tokenization is an important preprocessing step in the training and inference of large language models (LLMs).
1 code implementation • 13 Nov 2024 • Clément Dumas, Chris Wendler, Veniamin Veselovsky, Giovanni Monea, Robert West
In this paper, we address this question by analyzing latent representations (latents) during a word translation task in transformer-based LLMs.
1 code implementation • 11 Nov 2024 • Julian Minder, Kevin Du, Niklas Stoehr, Giovanni Monea, Chris Wendler, Robert West, Ryan Cotterell
In this paper, we search for a knob which controls this sensitivity, determining whether language models answer from the context or their prior knowledge.
1 code implementation • 28 Oct 2024 • Viacheslav Surkov, Chris Wendler, Mikhail Terekhov, Justin Deschenaux, Robert West, Caglar Gulcehre
We investigated the possibility of using SAEs to learn interpretable features for few-step text-to-image diffusion models, such as SDXL Turbo.
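For context, a minimal sparse autoencoder of the kind commonly used in interpretability work might look as follows; the ReLU encoder, the L1 sparsity penalty, and all dimensions are generic assumptions, not details taken from the paper:

```python
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    """Minimal SAE sketch; the SDXL Turbo activations it would be
    trained on are not reproduced here."""
    def __init__(self, d_model: int, d_hidden: int):
        super().__init__()
        self.encoder = nn.Linear(d_model, d_hidden)
        self.decoder = nn.Linear(d_hidden, d_model)

    def forward(self, x):
        z = torch.relu(self.encoder(x))  # sparse feature activations
        x_hat = self.decoder(z)          # reconstruction of the input
        return x_hat, z

def sae_loss(x, x_hat, z, l1_coeff=1e-3):
    # reconstruction error plus an L1 penalty encouraging sparse features
    return ((x - x_hat) ** 2).mean() + l1_coeff * z.abs().mean()
```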
1 code implementation • 15 Jul 2024 • Wenhao Zhu, Sizhe Liu, ShuJian Huang, Shuaijie She, Chris Wendler, Jiajun Chen
Decoding by contrasting layers (DoLa) is designed to improve the generation quality of large language models (LLMs) by contrasting the prediction probabilities between an early-exit output (amateur logits) and the final output (expert logits).
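A minimal sketch of this layer-contrastive scoring, assuming the amateur and expert logits are already extracted and using a generic plausibility cutoff `alpha` (the exact constraint used by DoLa may differ):

```python
import math
import torch
import torch.nn.functional as F

def layer_contrast_decode(expert_logits, amateur_logits, alpha=0.1):
    """Score tokens by the difference of expert and amateur log-probabilities,
    restricted to tokens the expert itself finds plausible."""
    log_p_expert = F.log_softmax(expert_logits, dim=-1)
    log_p_amateur = F.log_softmax(amateur_logits, dim=-1)
    # plausibility mask: keep tokens with p_expert >= alpha * max p_expert
    plausible = log_p_expert >= (
        log_p_expert.max(dim=-1, keepdim=True).values + math.log(alpha)
    )
    scores = torch.where(
        plausible,
        log_p_expert - log_p_amateur,
        torch.full_like(log_p_expert, float("-inf")),
    )
    return scores.argmax(dim=-1)  # greedily pick the contrast-maximizing token
```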
1 code implementation • 16 Feb 2024 • Chris Wendler, Veniamin Veselovsky, Giovanni Monea, Robert West
Tracking intermediate embeddings through their high-dimensional space reveals three distinct phases, whereby intermediate embeddings (1) start far away from output token embeddings; (2) already allow for decoding a semantically correct next token in the middle layers, but give higher probability to its version in English than in the input language; (3) finally move into an input-language-specific region of the embedding space.
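Analyses of this kind typically rely on the logit lens: decoding each layer's residual-stream embedding with the model's unembedding matrix. A minimal sketch, with `hidden_states`, `unembedding`, and `final_norm` assumed to come from the model under study:

```python
import torch

def logit_lens(hidden_states, unembedding, final_norm=None):
    """Decode each layer's embedding to its most likely next token.

    hidden_states: iterable of (d_model,) residual-stream vectors, one per layer.
    unembedding:   (vocab, d_model) output embedding matrix.
    final_norm:    the model's final LayerNorm/RMSNorm, if applicable.
    """
    decoded = []
    for h in hidden_states:
        if final_norm is not None:
            h = final_norm(h)           # normalize as the final layer would
        logits = h @ unembedding.T      # (vocab,)
        decoded.append(logits.argmax().item())
    return decoded                      # most likely next token per layer
```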
1 code implementation • 18 Jan 2024 • Saibo Geng, Berkay Döner, Chris Wendler, Martin Josifoski, Robert West
This paper introduces sketch-guided constrained decoding (SGCD), a novel approach to constrained decoding for black-box LLMs, which operates without access to the logits of the black-box LLM.
no code implementations • 29 Aug 2023 • Mathieu Chevalley, Jacob Sackett-Sanders, Yusuf Roohani, Pascal Notin, Artemy Bakulin, Dariusz Brzezinski, Kaiwen Deng, Yuanfang Guan, Justin Hong, Michael Ibrahim, Wojciech Kotlowski, Marcin Kowiel, Panagiotis Misiakos, Achille Nazaret, Markus Püschel, Chris Wendler, Arash Mehrjou, Patrick Schwab
In drug discovery, mapping interactions between genes within cellular systems is a crucial early step.
1 code implementation • NeurIPS 2023 • Panagiotis Misiakos, Chris Wendler, Markus Püschel
We prove identifiability in this new setting and show that the true DAG is the global minimizer of the $L^0$-norm of the vector of root causes.
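A hedged formalization of this statement, with notation assumed rather than copied from the paper: if data $X$ arises from sparse root causes $C$ through the weighted adjacency matrix $A$ of a DAG, the root causes induced by a candidate adjacency can be recovered linearly, and the true DAG minimizes their $L^0$-norm:

```latex
% Sketch only; requires amsmath. Assumed linear model: X = C (I - A)^{-1},
% so a candidate adjacency B induces root causes C(B) = X (I - B).
\[
  A \;=\; \operatorname*{arg\,min}_{B \,:\, B \text{ is a DAG}}
  \bigl\lVert X (I - B) \bigr\rVert_0
\]
% i.e., the true DAG is the global minimizer of the L^0-norm of the root causes.
```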
no code implementations • 16 Sep 2022 • Bastian Seifert, Chris Wendler, Markus Püschel
Specifically, we model the spread of an infection on such a DAG obtained from real-world contact tracing data and learn the infection signal from samples assuming sparsity in the Fourier domain.
1 code implementation • 10 Feb 2022 • Romeo Valentin, Claudio Ferrari, Jérémy Scheurer, Andisheh Amrollahi, Chris Wendler, Max B. Paulus
We pose this task as a supervised learning problem: First, we compile a large dataset of solver performance for various configurations and all provided MILP instances.
3 code implementations • 1 Oct 2020 • Chris Wendler, Andisheh Amrollahi, Bastian Seifert, Andreas Krause, Markus Püschel
Many applications of machine learning on discrete domains, such as learning preference functions in recommender systems or auctions, can be reduced to estimating a set function that is sparse in the Fourier domain.
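One classical Fourier basis for set functions is the Walsh-Hadamard basis; the sketch below computes the fast Walsh-Hadamard transform of a set function stored as a length-2^n bitmask-indexed array. Note this is only an illustration: the paper's algorithms also handle non-orthogonal Fourier bases, which this sketch does not cover.

```python
import numpy as np

def set_function_wht(s):
    """Fast Walsh-Hadamard transform of a set function.

    s: length-2^n array where index A is the bitmask encoding of subset A.
    Returns the spectrum; a Fourier-sparse set function has few nonzeros here.
    """
    s = np.asarray(s, dtype=float).copy()
    n = s.size
    h = 1
    while h < n:
        for i in range(0, n, 2 * h):
            for j in range(i, i + h):
                a, b = s[j], s[j + h]
                s[j], s[j + h] = a + b, a - b  # butterfly update
        h *= 2
    return s
```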
no code implementations • 28 Jan 2020 • Markus Püschel, Chris Wendler
Set functions are functions (or signals) indexed by the powerset (the set of all subsets) of a finite set N. They are fundamental and ubiquitous in many application domains and have been used, for example, to formally describe or quantify loss functions for semantic image segmentation, the informativeness of sensors in sensor networks, the utility of sets of items in recommender systems, cooperative games in game theory, or bidders in combinatorial auctions.
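Concretely, a set function on n elements can be stored as an array of length 2^n indexed by subset bitmasks; the toy sketch below (the coverage example and its values are invented for illustration) shows the encoding:

```python
# A set function on N = {0, ..., n-1} stored as a length-2^n array,
# where subset A is the bitmask with bit i set iff element i is in A.
n = 3
coverage = [0.0] * (2 ** n)  # e.g., a toy sensor-coverage utility

def subset_value(f, members):
    """Look up f(A) for a subset A given as an iterable of element indices."""
    mask = 0
    for i in members:
        mask |= 1 << i
    return f[mask]

coverage[0b011] = 0.8                   # value of the set {0, 1}
print(subset_value(coverage, [0, 1]))   # prints 0.8
```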
1 code implementation • NeurIPS 2019 • Chris Wendler, Dan Alistarh, Markus Püschel
We present a novel class of convolutional neural networks (CNNs) for set functions, i.e., data indexed with the powerset of a finite set.