no code implementations • 19 Feb 2025 • Shaona Ghosh, Heather Frase, Adina Williams, Sarah Luger, Paul Röttger, Fazl Barez, Sean McGregor, Kenneth Fricklas, Mala Kumar, Quentin Feuillade--Montixi, Kurt Bollacker, Felix Friedrich, Ryan Tsang, Bertie Vidgen, Alicia Parrish, Chris Knotz, Eleonora Presani, Jonathan Bennion, Marisa Ferrara Boston, Mike Kuniavsky, Wiebke Hutiri, James Ezick, Malek Ben Salem, Rajat Sahay, Sujata Goswami, Usman Gohar, Ben Huang, Supheakmungkol Sarin, Elie Alhajjar, Canyu Chen, Roman Eng, Kashyap Ramanandula Manjusha, Virendra Mehta, Eileen Long, Murali Emani, Natan Vidra, Benjamin Rukundo, Abolfazl Shahbazi, Kongtao Chen, Rajat Ghosh, Vithursan Thangarasa, Pierre Peigné, Abhinav Singh, Max Bartolo, Satyapriya Krishna, Mubashara Akhtar, Rafael Gold, Cody Coleman, Luis Oala, Vassil Tashev, Joseph Marvin Imperial, Amy Russ, Sasidhar Kunapuli, Nicolas Miailhe, Julien Delaunay, Bhaktipriya Radharapu, Rajat Shinde, Tuesday, Debojyoti Dutta, Declan Grabb, Ananya Gangavarapu, Saurav Sahay, Agasthya Gangavarapu, Patrick Schramowski, Stephen Singam, Tom David, Xudong Han, Priyanka Mary Mammen, Tarunima Prabhakar, Venelin Kovatchev, Ahmed Ahmed, Kelvin N. Manyeki, Sandeep Madireddy, Foutse khomh, Fedor Zhdanov, Joachim Baumann, Nina Vasan, Xianjun Yang, Carlos Mougn, Jibin Rajan Varghese, Hussain Chinoy, Seshakrishna Jitendar, Manil Maskey, Claire V. Hardgrove, TianHao Li, Aakash Gupta, Emil Joswin, Yifan Mai, Shachi H Kumar, Cigdem Patlak, Kevin Lu, Vincent Alessi, Sree Bhargavi Balija, Chenhe Gu, Robert Sullivan, James Gealy, Matt Lavrisa, James Goel, Peter Mattson, Percy Liang, Joaquin Vanschoren
This work represents a crucial step toward establishing global standards for AI risk and reliability evaluation while acknowledging the need for continued development in areas such as multiturn interactions, multimodal understanding, coverage of additional languages, and emerging hazard categories.
1 code implementation • 17 Jan 2025 • Paul Röttger, Giuseppe Attanasio, Felix Friedrich, Janis Goldzycher, Alicia Parrish, Rishabh Bhardwaj, Chiara Di Bonaventura, Roman Eng, Gaia El Khoury Geagea, Sujata Goswami, Jieun Han, Dirk Hovy, Seogyeong Jeong, Paloma Jeretič, Flor Miriam Plaza-del-Arco, Donya Rooein, Patrick Schramowski, Anastassia Shaitarova, Xudong Shen, Richard Willats, Andrea Zugarini, Bertie Vidgen
Finally, we explore the automation of VLM safety assessments, finding even the best safety classifiers to be lacking.
no code implementations • 19 Dec 2024 • Felix Friedrich, Simone Tedeschi, Patrick Schramowski, Manuel Brack, Roberto Navigli, Huu Nguyen, Bo Li, Kristian Kersting
Building safe Large Language Models (LLMs) across multiple languages is essential in ensuring both safe access and linguistic diversity.
1 code implementation • 11 Nov 2024 • Ruben Härle, Felix Friedrich, Manuel Brack, Björn Deiseroth, Patrick Schramowski, Kristian Kersting
Large Language Models (LLMs) have demonstrated remarkable capabilities in generating human-like text, but their output may not be aligned with the user or even produce harmful content.
no code implementations • 8 Oct 2024 • Subarnaduti Paul, Manuel Brack, Patrick Schramowski, Kristian Kersting, Martin Mundt
Deep networks are frequently tuned to novel tasks and continue learning from ongoing data streams.
no code implementations • 3 Jul 2024 • Simon Ostermann, Kevin Baum, Christoph Endres, Julia Masloh, Patrick Schramowski
Prompt injection (both direct and indirect) and jailbreaking are now recognized as significant issues for large language models (LLMs), particularly due to their potential for harm in application-integrated contexts.
1 code implementation • 27 Jun 2024 • Björn Deiseroth, Manuel Brack, Patrick Schramowski, Kristian Kersting, Samuel Weinbach
Tokenizers are crucial for encoding information in Large Language Models, but their development has recently stagnated, and they contain inherent weaknesses.
1 code implementation • 7 Jun 2024 • Lukas Helff, Felix Friedrich, Manuel Brack, Kristian Kersting, Patrick Schramowski
This paper introduces LlavaGuard, a suite of VLM-based vision safeguards that address the critical need for reliable guardrails in the era of large-scale data and models.
1 code implementation • 18 Apr 2024 • Bertie Vidgen, Adarsh Agrawal, Ahmed M. Ahmed, Victor Akinwande, Namir Al-Nuaimi, Najla Alfaraj, Elie Alhajjar, Lora Aroyo, Trupti Bavalatti, Max Bartolo, Borhane Blili-Hamelin, Kurt Bollacker, Rishi Bomassani, Marisa Ferrara Boston, Siméon Campos, Kal Chakra, Canyu Chen, Cody Coleman, Zacharie Delpierre Coudert, Leon Derczynski, Debojyoti Dutta, Ian Eisenberg, James Ezick, Heather Frase, Brian Fuller, Ram Gandikota, Agasthya Gangavarapu, Ananya Gangavarapu, James Gealy, Rajat Ghosh, James Goel, Usman Gohar, Sujata Goswami, Scott A. Hale, Wiebke Hutiri, Joseph Marvin Imperial, Surgan Jandial, Nick Judd, Felix Juefei-Xu, Foutse khomh, Bhavya Kailkhura, Hannah Rose Kirk, Kevin Klyman, Chris Knotz, Michael Kuchnik, Shachi H. Kumar, Srijan Kumar, Chris Lengerich, Bo Li, Zeyi Liao, Eileen Peters Long, Victor Lu, Sarah Luger, Yifan Mai, Priyanka Mary Mammen, Kelvin Manyeki, Sean McGregor, Virendra Mehta, Shafee Mohammed, Emanuel Moss, Lama Nachman, Dinesh Jinenhally Naganna, Amin Nikanjam, Besmira Nushi, Luis Oala, Iftach Orr, Alicia Parrish, Cigdem Patlak, William Pietri, Forough Poursabzi-Sangdeh, Eleonora Presani, Fabrizio Puletti, Paul Röttger, Saurav Sahay, Tim Santos, Nino Scherrer, Alice Schoenauer Sebag, Patrick Schramowski, Abolfazl Shahbazi, Vin Sharma, Xudong Shen, Vamsi Sistla, Leonard Tang, Davide Testuggine, Vithursan Thangarasa, Elizabeth Anne Watkins, Rebecca Weiss, Chris Welty, Tyler Wilbers, Adina Williams, Carole-Jean Wu, Poonam Yadav, Xianjun Yang, Yi Zeng, Wenhui Zhang, Fedor Zhdanov, Jiacheng Zhu, Percy Liang, Peter Mattson, Joaquin Vanschoren
We created a new taxonomy of 13 hazard categories, of which 7 have tests in the v0.5 benchmark.
2 code implementations • 6 Apr 2024 • Simone Tedeschi, Felix Friedrich, Patrick Schramowski, Kristian Kersting, Roberto Navigli, Huu Nguyen, Bo Li
When building Large Language Models (LLMs), it is paramount to bear safety in mind and protect them with guardrails.
1 code implementation • 21 Feb 2024 • Hikaru Shindo, Manuel Brack, Gopika Sudhakaran, Devendra Singh Dhami, Patrick Schramowski, Kristian Kersting
To remedy this issue, we propose DeiSAM -- a combination of large pre-trained neural networks with differentiable logic reasoners -- for deictic promptable segmentation.
1 code implementation • 29 Jan 2024 • Felix Friedrich, Katharina Hämmerl, Patrick Schramowski, Manuel Brack, Jindrich Libovicky, Kristian Kersting, Alexander Fraser
Our results show that not only do models exhibit strong gender biases but they also behave differently across languages.
1 code implementation • CVPR 2024 • Manuel Brack, Felix Friedrich, Katharina Kornmeier, Linoy Tsaban, Patrick Schramowski, Kristian Kersting, Apolinário Passos
Our results demonstrate the capabilities of LEDITS++ and its improvements over previous methods.
no code implementations • 2 Nov 2023 • Björn Deiseroth, Max Meuer, Nikolas Gritsch, Constantin Eichenberg, Patrick Schramowski, Matthias Aßenmacher, Kristian Kersting
Large Language Models (LLMs) have reshaped natural language processing with their impressive capabilities.
no code implementations • 20 Sep 2023 • Manuel Brack, Patrick Schramowski, Kristian Kersting
Text-conditioned image generation models have recently achieved astonishing image quality and alignment results.
no code implementations • 28 May 2023 • Manuel Brack, Felix Friedrich, Patrick Schramowski, Kristian Kersting
Text-conditioned image generation models have recently achieved astonishing results in image quality and text alignment and are consequently employed in a fast-growing number of applications.
1 code implementation • NeurIPS 2023 • Marco Bellagente, Manuel Brack, Hannah Teufel, Felix Friedrich, Björn Deiseroth, Constantin Eichenberg, Andrew Dai, Robert Baldock, Souradeep Nanda, Koen Oostermeijer, Andres Felipe Cruz-Salinas, Patrick Schramowski, Kristian Kersting, Samuel Weinbach
The recent popularity of text-to-image diffusion models (DM) can largely be attributed to the intuitive interface they provide to users.
1 code implementation • 16 Mar 2023 • Lukas Struppek, Dominik Hintersdorf, Felix Friedrich, Manuel Brack, Patrick Schramowski, Kristian Kersting
Neural network-based image classifiers are powerful tools for computer vision tasks, but they inadvertently reveal sensitive attribute information about their classes, raising concerns about their privacy.
1 code implementation • 7 Feb 2023 • Felix Friedrich, Manuel Brack, Lukas Struppek, Dominik Hintersdorf, Patrick Schramowski, Sasha Luccioni, Kristian Kersting
Generative AI models have recently achieved astonishing results in quality and are consequently employed in a fast-growing number of applications.
1 code implementation • NeurIPS 2023 • Manuel Brack, Felix Friedrich, Dominik Hintersdorf, Lukas Struppek, Patrick Schramowski, Kristian Kersting
This leaves the user with little semantic control.
1 code implementation • NeurIPS 2023 • Björn Deiseroth, Mayukh Deb, Samuel Weinbach, Manuel Brack, Patrick Schramowski, Kristian Kersting
Generative transformer models have become increasingly complex, with large numbers of parameters and the ability to process multiple input modalities.
2 code implementations • 12 Dec 2022 • Manuel Brack, Patrick Schramowski, Felix Friedrich, Dominik Hintersdorf, Kristian Kersting
Large, text-conditioned generative diffusion models have recently gained a lot of attention for their impressive performance in generating high-fidelity images from text alone.
1 code implementation • 14 Nov 2022 • Katharina Hämmerl, Björn Deiseroth, Patrick Schramowski, Jindřich Libovický, Constantin A. Rothkopf, Alexander Fraser, Kristian Kersting
Do the models capture moral norms from English and impose them on other languages?
2 code implementations • CVPR 2023 • Patrick Schramowski, Manuel Brack, Björn Deiseroth, Kristian Kersting
Text-conditioned image generation models have recently achieved astonishing results in image quality and text alignment and are consequently employed in a fast-growing number of applications.
1 code implementation • 19 Oct 2022 • Felix Friedrich, Wolfgang Stammer, Patrick Schramowski, Kristian Kersting
In this work, we question the current common practice of storing all information in the model parameters and propose the Revision Transformer (RiT) to facilitate easy model updating.
4 code implementations • NeurIPS 2022 Datasets and Benchmarks 2022 • Christoph Schuhmann, Romain Beaumont, Richard Vencu, Cade Gordon, Ross Wightman, Mehdi Cherti, Theo Coombes, Aarush Katta, Clayton Mullis, Mitchell Wortsman, Patrick Schramowski, Srivatsa Kundurthy, Katherine Crowson, Ludwig Schmidt, Robert Kaczmarczyk, Jenia Jitsev
We show successful replication and fine-tuning of foundational models like CLIP, GLIDE and Stable Diffusion using the dataset, and discuss further experiments enabled with an openly available dataset of this scale.
2 code implementations • 19 Sep 2022 • Lukas Struppek, Dominik Hintersdorf, Felix Friedrich, Manuel Brack, Patrick Schramowski, Kristian Kersting
Models for text-to-image synthesis, such as DALL-E 2 and Stable Diffusion, have recently drawn a lot of interest from academia and the general public.
3 code implementations • 15 Sep 2022 • Dominik Hintersdorf, Lukas Struppek, Manuel Brack, Felix Friedrich, Patrick Schramowski, Kristian Kersting
Our large-scale experiments on CLIP demonstrate that individuals used for training can be identified with very high accuracy.
no code implementations • 29 Aug 2022 • Björn Deiseroth, Patrick Schramowski, Hikaru Shindo, Devendra Singh Dhami, Kristian Kersting
Text-to-image models have recently achieved remarkable success with seemingly accurate samples in photo-realistic quality.
1 code implementation • 17 Aug 2022 • Manuel Brack, Patrick Schramowski, Björn Deiseroth, Kristian Kersting
Bootstrapping from pre-trained language models has been proven to be an efficient approach for building vision-language models (VLM) for tasks such as image captioning or visual question answering.
no code implementations • 18 Mar 2022 • Katharina Hämmerl, Björn Deiseroth, Patrick Schramowski, Jindřich Libovický, Alexander Fraser, Kristian Kersting
Massively multilingual sentence representations are trained on large corpora of uncurated data, with a very imbalanced proportion of languages included in the training.
3 code implementations • 4 Mar 2022 • Felix Friedrich, Wolfgang Stammer, Patrick Schramowski, Kristian Kersting
In addition, we discuss existing and introduce novel measures and benchmarks for evaluating the overall abilities of an XIL method.
2 code implementations • 14 Feb 2022 • Patrick Schramowski, Christopher Tauchmann, Kristian Kersting
This calls for increased dataset documentation, e.g., using datasheets.
1 code implementation • CVPR 2022 • Wolfgang Stammer, Marius Memmel, Patrick Schramowski, Kristian Kersting
In this work, we show the advantages of prototype representations for understanding and revising the latent space of neural concept learners.
1 code implementation • 8 Oct 2021 • Patrick Schramowski, Kristian Kersting
Probing or fine-tuning (large-scale) pre-trained models results in state-of-the-art performance for many NLP tasks and, more recently, even for computer vision tasks when combined with image data.
1 code implementation • 2 Sep 2021 • Felix Friedrich, Patrick Schramowski, Christopher Tauchmann, Kristian Kersting
Transformer language models are state of the art in a multitude of NLP tasks.
1 code implementation • 8 Mar 2021 • Patrick Schramowski, Cigdem Turan, Nico Andersen, Constantin A. Rothkopf, Kristian Kersting
That is, we show that these norms can be captured geometrically by a direction, which can be computed, e.g., by a PCA, in the embedding space, reflecting well the agreement of phrases to social norms implicitly expressed in the training texts and providing a path for attenuating or even preventing toxic degeneration in LMs.
4 code implementations • 18 Feb 2021 • Quentin Delfosse, Patrick Schramowski, Martin Mundt, Alejandro Molina, Kristian Kersting
Latest insights from biology show that intelligence not only emerges from the connections between neurons but that individual neurons shoulder more computational responsibility than previously anticipated.
Ranked #3 on Atari Games on Atari 2600 Skiing (using extra training data)
3 code implementations • CVPR 2021 • Wolfgang Stammer, Patrick Schramowski, Kristian Kersting
Most explanation methods in deep learning map importance estimates for a model's prediction back to the original input space.
1 code implementation • 15 Jan 2020 • Patrick Schramowski, Wolfgang Stammer, Stefano Teso, Anna Brugger, Xiaoting Shao, Hans-Georg Luigs, Anne-Katrin Mahlein, Kristian Kersting
Deep neural networks have shown excellent performances in many real-world applications.
no code implementations • 11 Dec 2019 • Patrick Schramowski, Cigdem Turan, Sophie Jentzsch, Constantin Rothkopf, Kristian Kersting
But has BERT also a better moral compass?
no code implementations • 25 Sep 2019 • Nadine Behrmann, Patrick Schramowski, Kristian Kersting
However, by studying the characteristics of the local error function we show that including the partial derivatives of the initial value problem is favorable.
5 code implementations • ICLR 2020 • Alejandro Molina, Patrick Schramowski, Kristian Kersting
The performance of deep network learning strongly depends on the choice of the non-linear activation function associated with each neuron.
no code implementations • 12 Mar 2018 • Patrick Schramowski, Christian Bauckhage, Kristian Kersting
The move from hand-designed to learned optimizers in machine learning has been quite successful for gradient-based and -free optimizers.