1 code implementation • 9 Nov 2023 • Johannes Hagemann, Samuel Weinbach, Konstantin Dobler, Maximilian Schall, Gerard de Melo
In this work, we conduct a comprehensive ablation study of possible training configurations for large language models.
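As a rough illustration of what such an ablation can look like, the sketch below enumerates a small hypothetical grid of parallelization and precision settings; the axes and values are assumptions for illustration, not the paper's actual search space.

```python
# A minimal sketch of an ablation grid over training configurations.
# The configuration axes (parallelism degrees, activation checkpointing,
# precision) are illustrative assumptions, not the paper's search space.
from itertools import product

tensor_parallel = [1, 2, 4]
pipeline_parallel = [1, 2]
activation_checkpointing = [False, True]
precision = ["bf16", "fp16"]

for tp, pp, ckpt, prec in product(
    tensor_parallel, pipeline_parallel, activation_checkpointing, precision
):
    config = {
        "tensor_parallel_size": tp,
        "pipeline_parallel_size": pp,
        "activation_checkpointing": ckpt,
        "precision": prec,
    }
    # In a real study, each config would be launched as a training run
    # and its throughput and loss curve recorded for comparison.
    print(config)
```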
no code implementations • 12 Oct 2023 • Mehdi Ali, Michael Fromm, Klaudia Thellmann, Richard Rutmann, Max Lübbering, Johannes Leveling, Katrin Klug, Jan Ebert, Niclas Doll, Jasper Schulze Buschhoff, Charvi Jain, Alexander Arno Weber, Lena Jurkschat, Hammam Abdelwahab, Chelsea John, Pedro Ortiz Suarez, Malte Ostendorff, Samuel Weinbach, Rafet Sifa, Stefan Kesselheim, Nicolas Flores-Herr
The recent success of Large Language Models (LLMs) has been driven predominantly by curating the composition of the training dataset, scaling model architectures and dataset sizes, and advancing pretraining objectives, leaving the influence of the tokenizer as a blind spot.
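One way to make tokenizer influence concrete is to compare tokenizer fertility (tokens produced per word) across tokenizers. The sketch below is a minimal illustration using the Hugging Face transformers library; the tokenizer names are examples and are not taken from the paper.

```python
# A minimal sketch of measuring tokenizer "fertility" (tokens per word),
# one common way to quantify tokenizer influence. The tokenizer names
# below are just examples; any tokenizers on the Hugging Face Hub work.
from transformers import AutoTokenizer

text = "Large language models are sensitive to how their input is tokenized."

for name in ["gpt2", "bert-base-multilingual-cased"]:
    tok = AutoTokenizer.from_pretrained(name)
    n_tokens = len(tok.tokenize(text))
    n_words = len(text.split())
    print(f"{name}: {n_tokens} tokens, fertility = {n_tokens / n_words:.2f}")
```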
1 code implementation • NeurIPS 2023 • Marco Bellagente, Manuel Brack, Hannah Teufel, Felix Friedrich, Björn Deiseroth, Constantin Eichenberg, Andrew Dai, Robert Baldock, Souradeep Nanda, Koen Oostermeijer, Andres Felipe Cruz-Salinas, Patrick Schramowski, Kristian Kersting, Samuel Weinbach
The recent popularity of text-to-image diffusion models (DMs) can largely be attributed to the intuitive interface they provide to users.
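That interface is simply a plain-language prompt. As a minimal sketch (using the diffusers library and an example checkpoint, not the paper's model), text-to-image generation looks like this:

```python
# A minimal sketch of the text-to-image interface via the diffusers
# library. The checkpoint name is an example, not the paper's model.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-1", torch_dtype=torch.float16
)
pipe = pipe.to("cuda")

# The "intuitive interface": a plain-language prompt in, an image out.
image = pipe("a watercolor painting of a lighthouse at dusk").images[0]
image.save("lighthouse.png")
```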
1 code implementation • NeurIPS 2023 • Björn Deiseroth, Mayukh Deb, Samuel Weinbach, Manuel Brack, Patrick Schramowski, Kristian Kersting
Generative transformer models have become increasingly complex, with large numbers of parameters and the ability to process multiple input modalities.
no code implementations • 6 Dec 2022 • Samuel Weinbach, Marco Bellagente, Constantin Eichenberg, Andrew Dai, Robert Baldock, Souradeep Nanda, Björn Deiseroth, Koen Oostermeijer, Hannah Teufel, Andres Felipe Cruz-Salinas
We introduce M-VADER: a diffusion model (DM) for image generation where the output can be specified using arbitrary combinations of images and text.
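As a toy sketch of the general idea, multimodal conditioning can be pictured as embedding images and text into a shared space and concatenating them into a single conditioning sequence for the diffusion model. The encoders and dimensions below are placeholders, not M-VADER's actual architecture.

```python
# A toy sketch of multimodal conditioning: embed images and text into a
# shared space and concatenate the sequences into one conditioning
# context. Encoders and dimensions are placeholders, not M-VADER's
# actual architecture.
import torch
import torch.nn as nn

d_model = 768
image_encoder = nn.Linear(512, d_model)       # stand-in for a vision encoder
text_encoder = nn.Embedding(32000, d_model)   # stand-in for a text encoder

image_features = torch.randn(1, 16, 512)      # 16 image patch features
text_ids = torch.randint(0, 32000, (1, 8))    # 8 text tokens

conditioning = torch.cat(
    [image_encoder(image_features), text_encoder(text_ids)], dim=1
)  # shape (1, 24, d_model): one sequence mixing both modalities
print(conditioning.shape)
```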
5 code implementations • BigScience (ACL) 2022 • Sid Black, Stella Biderman, Eric Hallahan, Quentin Anthony, Leo Gao, Laurence Golding, Horace He, Connor Leahy, Kyle McDonell, Jason Phang, Michael Pieler, USVSN Sai Prashanth, Shivanshu Purohit, Laria Reynolds, Jonathan Tow, Ben Wang, Samuel Weinbach
We introduce GPT-NeoX-20B, a 20 billion parameter autoregressive language model trained on the Pile, whose weights will be made freely and openly available to the public through a permissive license.
Ranked #86 on Multi-task Language Understanding on MMLU
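The released weights are hosted under the EleutherAI/gpt-neox-20b repository on the Hugging Face Hub and can be loaded with the transformers library. The snippet below is a standard usage sketch, not code from the paper; note that the full 20B model requires on the order of 40 GB of memory in half precision.

```python
# Standard usage sketch for loading the released GPT-NeoX-20B weights
# with the transformers library (not code from the paper). The full
# model needs roughly 40 GB of memory in half precision.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neox-20b")
model = AutoModelForCausalLM.from_pretrained("EleutherAI/gpt-neox-20b")

inputs = tokenizer("GPT-NeoX-20B is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0]))
```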
1 code implementation • 9 Dec 2021 • Constantin Eichenberg, Sidney Black, Samuel Weinbach, Letitia Parcalabescu, Anette Frank
Large-scale pretraining is fast becoming the norm in Vision-Language (VL) modeling.
no code implementations • 12 Nov 2020 • Jonas Andrulis, Ole Meyer, Grégory Schott, Samuel Weinbach, Volker Gruhn
For strategic problems, intelligent systems based on Deep Reinforcement Learning (DRL) have demonstrated an impressive ability to learn solutions that go far beyond human capabilities, especially in complex scenarios.