Search Results for author: Björn Deiseroth

Found 10 papers, 5 papers with code

Mechanistic Design and Scaling of Hybrid Architectures

no code implementations 26 Mar 2024 Michael Poli, Armin W Thomas, Eric Nguyen, Pragaash Ponnusamy, Björn Deiseroth, Kristian Kersting, Taiji Suzuki, Brian Hie, Stefano Ermon, Christopher Ré, Ce Zhang, Stefano Massaroli

The development of deep learning architectures is a resource-demanding process, due to a vast design space, long prototyping times, and high compute costs associated with at-scale model training and evaluation.

AtMan: Understanding Transformer Predictions Through Memory Efficient Attention Manipulation

1 code implementation NeurIPS 2023 Björn Deiseroth, Mayukh Deb, Samuel Weinbach, Manuel Brack, Patrick Schramowski, Kristian Kersting

Generative transformer models have become increasingly complex, with large numbers of parameters and the ability to process multiple input modalities.

Safe Latent Diffusion: Mitigating Inappropriate Degeneration in Diffusion Models

2 code implementations CVPR 2023 Patrick Schramowski, Manuel Brack, Björn Deiseroth, Kristian Kersting

Text-conditioned image generation models have recently achieved astonishing results in image quality and text alignment and are consequently employed in a fast-growing number of applications.

Image Generation

ILLUME: Rationalizing Vision-Language Models through Human Interactions

1 code implementation 17 Aug 2022 Manuel Brack, Patrick Schramowski, Björn Deiseroth, Kristian Kersting

Bootstrapping from pre-trained language models has proven to be an efficient approach for building vision-language models (VLMs) for tasks such as image captioning or visual question answering.

Image Captioning, Question Answering

Do Multilingual Language Models Capture Differing Moral Norms?

no code implementations 18 Mar 2022 Katharina Hämmerl, Björn Deiseroth, Patrick Schramowski, Jindřich Libovický, Alexander Fraser, Kristian Kersting

Massively multilingual sentence representations are trained on large corpora of uncurated data, with a very imbalanced proportion of languages included in the training.

Sentence XLM-R
