Search Results for author: Patrick Schramowski

Found 44 papers, 32 papers with code

AILuminate: Introducing v1.0 of the AI Risk and Reliability Benchmark from MLCommons

no code implementations • 19 Feb 2025 • Shaona Ghosh, Heather Frase, Adina Williams, Sarah Luger, Paul Röttger, Fazl Barez, Sean McGregor, Kenneth Fricklas, Mala Kumar, Quentin Feuillade--Montixi, Kurt Bollacker, Felix Friedrich, Ryan Tsang, Bertie Vidgen, Alicia Parrish, Chris Knotz, Eleonora Presani, Jonathan Bennion, Marisa Ferrara Boston, Mike Kuniavsky, Wiebke Hutiri, James Ezick, Malek Ben Salem, Rajat Sahay, Sujata Goswami, Usman Gohar, Ben Huang, Supheakmungkol Sarin, Elie Alhajjar, Canyu Chen, Roman Eng, Kashyap Ramanandula Manjusha, Virendra Mehta, Eileen Long, Murali Emani, Natan Vidra, Benjamin Rukundo, Abolfazl Shahbazi, Kongtao Chen, Rajat Ghosh, Vithursan Thangarasa, Pierre Peigné, Abhinav Singh, Max Bartolo, Satyapriya Krishna, Mubashara Akhtar, Rafael Gold, Cody Coleman, Luis Oala, Vassil Tashev, Joseph Marvin Imperial, Amy Russ, Sasidhar Kunapuli, Nicolas Miailhe, Julien Delaunay, Bhaktipriya Radharapu, Rajat Shinde, Tuesday, Debojyoti Dutta, Declan Grabb, Ananya Gangavarapu, Saurav Sahay, Agasthya Gangavarapu, Patrick Schramowski, Stephen Singam, Tom David, Xudong Han, Priyanka Mary Mammen, Tarunima Prabhakar, Venelin Kovatchev, Ahmed Ahmed, Kelvin N. Manyeki, Sandeep Madireddy, Foutse Khomh, Fedor Zhdanov, Joachim Baumann, Nina Vasan, Xianjun Yang, Carlos Mougn, Jibin Rajan Varghese, Hussain Chinoy, Seshakrishna Jitendar, Manil Maskey, Claire V. Hardgrove, TianHao Li, Aakash Gupta, Emil Joswin, Yifan Mai, Shachi H Kumar, Cigdem Patlak, Kevin Lu, Vincent Alessi, Sree Bhargavi Balija, Chenhe Gu, Robert Sullivan, James Gealy, Matt Lavrisa, James Goel, Peter Mattson, Percy Liang, Joaquin Vanschoren

This work represents a crucial step toward establishing global standards for AI risk and reliability evaluation while acknowledging the need for continued development in areas such as multiturn interactions, multimodal understanding, coverage of additional languages, and emerging hazard categories.

LLMs Lost in Translation: M-ALERT uncovers Cross-Linguistic Safety Gaps

no code implementations • 19 Dec 2024 • Felix Friedrich, Simone Tedeschi, Patrick Schramowski, Manuel Brack, Roberto Navigli, Huu Nguyen, Bo Li, Kristian Kersting

Building safe Large Language Models (LLMs) across multiple languages is essential for ensuring both safe access and linguistic diversity.

Diversity

SCAR: Sparse Conditioned Autoencoders for Concept Detection and Steering in LLMs

1 code implementation • 11 Nov 2024 • Ruben Härle, Felix Friedrich, Manuel Brack, Björn Deiseroth, Patrick Schramowski, Kristian Kersting

Large Language Models (LLMs) have demonstrated remarkable capabilities in generating human-like text, but their output may not align with user intent and can even be harmful.

Text Generation
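
As a rough illustration of the mechanism named in the title, the sketch below wires a sparse autoencoder into a transformer hidden state and overwrites a single concept-aligned latent to steer the output. The dimensions, the choice of latent index 0 as the concept unit, and the hard overwrite rule are assumptions for illustration, not the paper's exact training recipe.

import torch
import torch.nn as nn

class SparseConditionedAutoencoder(nn.Module):
    def __init__(self, d_model=768, d_latent=4096):
        super().__init__()
        self.encoder = nn.Linear(d_model, d_latent)
        self.decoder = nn.Linear(d_latent, d_model)

    def forward(self, h):
        z = torch.relu(self.encoder(h))   # sparse latent features
        return self.decoder(z)

    def steer(self, h, concept_idx=0, value=0.0):
        # Detect the concept via one conditioned latent, overwrite it, decode.
        z = torch.relu(self.encoder(h))
        strength = z[..., concept_idx]    # detection signal for the concept
        z[..., concept_idx] = value       # e.g. 0.0 suppresses the concept
        return self.decoder(z), strength

sae = SparseConditionedAutoencoder()
hidden = torch.randn(1, 10, 768)          # one layer's hidden states
steered, detected = sae.steer(hidden)     # feed `steered` back into the stream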

Soft Begging: Modular and Efficient Shielding of LLMs against Prompt Injection and Jailbreaking based on Prompt Tuning

no code implementations • 3 Jul 2024 • Simon Ostermann, Kevin Baum, Christoph Endres, Julia Masloh, Patrick Schramowski

Prompt injection (both direct and indirect) and jailbreaking are now recognized as significant issues for large language models (LLMs), particularly due to their potential for harm in application-integrated contexts.
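
The "soft" prompts in question are trainable prompt embeddings in the sense of prompt tuning. A minimal PyTorch sketch of that building block, assuming a frozen decoder model: a block of learnable vectors is prepended to the input embeddings, and only those vectors are updated, here with the goal of counteracting injected instructions.

import torch
import torch.nn as nn

class SoftPrompt(nn.Module):
    """Trainable prefix embeddings prepended to a frozen model's inputs."""
    def __init__(self, n_tokens=20, d_model=768):
        super().__init__()
        self.prompt = nn.Parameter(torch.randn(n_tokens, d_model) * 0.02)

    def forward(self, input_embeds):
        prefix = self.prompt.unsqueeze(0).expand(input_embeds.shape[0], -1, -1)
        return torch.cat([prefix, input_embeds], dim=1)

shield = SoftPrompt()
embeds = torch.randn(2, 12, 768)   # embeddings of a (possibly injected) prompt
guarded = shield(embeds)           # only shield.prompt receives gradients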

T-FREE: Subword Tokenizer-Free Generative LLMs via Sparse Representations for Memory-Efficient Embeddings

1 code implementation • 27 Jun 2024 • Björn Deiseroth, Manuel Brack, Patrick Schramowski, Kristian Kersting, Samuel Weinbach

Tokenizers are crucial for encoding information in Large Language Models, but their development has recently stagnated, and they contain inherent weaknesses.

Cross-Lingual Transfer · Transfer Learning
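
As a rough sketch of the sparse-representation idea in the title: each word activates the embedding rows addressed by hashes of its character trigrams, and the word vector is their sum, so no subword vocabulary is needed. The hash function, table size, and boundary marker below are illustrative assumptions.

import hashlib
import torch

N_ROWS, D_MODEL = 32768, 768
table = torch.nn.Embedding(N_ROWS, D_MODEL)

def trigram_ids(word):
    padded = f"_{word}_"                 # mark word boundaries
    grams = [padded[i:i + 3] for i in range(len(padded) - 2)]
    return [int(hashlib.md5(g.encode()).hexdigest(), 16) % N_ROWS for g in grams]

def embed(word):
    ids = torch.tensor(trigram_ids(word))
    return table(ids).sum(dim=0)         # sum of the activated rows

vec = embed("tokenizer")                 # no subword vocabulary involved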

LlavaGuard: An Open VLM-based Framework for Safeguarding Vision Datasets and Models

1 code implementation • 7 Jun 2024 • Lukas Helff, Felix Friedrich, Manuel Brack, Kristian Kersting, Patrick Schramowski

This paper introduces LlavaGuard, a suite of VLM-based vision safeguards that address the critical need for reliable guardrails in the era of large-scale data and models.
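
In spirit, such a safeguard prompts a vision-language model with a safety policy and asks for a structured assessment. A hedged sketch using a generic open VLM via transformers; the base model, prompt template, and rating schema are stand-in assumptions, not the released LlavaGuard artifacts.

import json
import torch
from PIL import Image
from transformers import AutoProcessor, LlavaForConditionalGeneration

model_id = "llava-hf/llava-1.5-7b-hf"    # stand-in open VLM, not LlavaGuard
processor = AutoProcessor.from_pretrained(model_id)
model = LlavaForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto")

policy = ("Assess the image against these categories: hate, violence, "
          "sexual content, self-harm. Answer as JSON with keys "
          "'rating' (safe/unsafe) and 'category'.")
prompt = f"USER: <image>\n{policy} ASSISTANT:"

image = Image.open("example.jpg")        # hypothetical input image
inputs = processor(text=prompt, images=image, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=128)
reply = processor.decode(out[0], skip_special_tokens=True)
assessment = json.loads(reply.split("ASSISTANT:")[-1].strip())  # may need cleanup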

Introducing v0.5 of the AI Safety Benchmark from MLCommons

1 code implementation • 18 Apr 2024 • Bertie Vidgen, Adarsh Agrawal, Ahmed M. Ahmed, Victor Akinwande, Namir Al-Nuaimi, Najla Alfaraj, Elie Alhajjar, Lora Aroyo, Trupti Bavalatti, Max Bartolo, Borhane Blili-Hamelin, Kurt Bollacker, Rishi Bomassani, Marisa Ferrara Boston, Siméon Campos, Kal Chakra, Canyu Chen, Cody Coleman, Zacharie Delpierre Coudert, Leon Derczynski, Debojyoti Dutta, Ian Eisenberg, James Ezick, Heather Frase, Brian Fuller, Ram Gandikota, Agasthya Gangavarapu, Ananya Gangavarapu, James Gealy, Rajat Ghosh, James Goel, Usman Gohar, Sujata Goswami, Scott A. Hale, Wiebke Hutiri, Joseph Marvin Imperial, Surgan Jandial, Nick Judd, Felix Juefei-Xu, Foutse Khomh, Bhavya Kailkhura, Hannah Rose Kirk, Kevin Klyman, Chris Knotz, Michael Kuchnik, Shachi H. Kumar, Srijan Kumar, Chris Lengerich, Bo Li, Zeyi Liao, Eileen Peters Long, Victor Lu, Sarah Luger, Yifan Mai, Priyanka Mary Mammen, Kelvin Manyeki, Sean McGregor, Virendra Mehta, Shafee Mohammed, Emanuel Moss, Lama Nachman, Dinesh Jinenhally Naganna, Amin Nikanjam, Besmira Nushi, Luis Oala, Iftach Orr, Alicia Parrish, Cigdem Patlak, William Pietri, Forough Poursabzi-Sangdeh, Eleonora Presani, Fabrizio Puletti, Paul Röttger, Saurav Sahay, Tim Santos, Nino Scherrer, Alice Schoenauer Sebag, Patrick Schramowski, Abolfazl Shahbazi, Vin Sharma, Xudong Shen, Vamsi Sistla, Leonard Tang, Davide Testuggine, Vithursan Thangarasa, Elizabeth Anne Watkins, Rebecca Weiss, Chris Welty, Tyler Wilbers, Adina Williams, Carole-Jean Wu, Poonam Yadav, Xianjun Yang, Yi Zeng, Wenhui Zhang, Fedor Zhdanov, Jiacheng Zhu, Percy Liang, Peter Mattson, Joaquin Vanschoren

We created a new taxonomy of 13 hazard categories, of which 7 have tests in the v0.5 benchmark.

DeiSAM: Segment Anything with Deictic Prompting

1 code implementation • 21 Feb 2024 • Hikaru Shindo, Manuel Brack, Gopika Sudhakaran, Devendra Singh Dhami, Patrick Schramowski, Kristian Kersting

To remedy this issue, we propose DeiSAM, a combination of large pre-trained neural networks with differentiable logic reasoners, for deictic promptable segmentation.

Image Segmentation · Segmentation +1
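
A toy illustration of deictic prompting: a description like "the object on the table" is resolved by matching over a scene graph, and the matched region would then prompt a promptable segmenter such as SAM. The scene-graph format and the hard rule matcher below are assumptions; DeiSAM itself uses differentiable forward reasoning rather than this discrete matcher.

scene_graph = [
    {"id": 1, "label": "cup",   "relation": ("on", 2), "box": (40, 30, 80, 70)},
    {"id": 2, "label": "table", "relation": None,      "box": (0, 60, 200, 160)},
]

def resolve_deictic(relation, anchor_label):
    """Return objects standing in `relation` to an object named `anchor_label`."""
    anchors = {o["id"] for o in scene_graph if o["label"] == anchor_label}
    return [o for o in scene_graph
            if o["relation"] and o["relation"][0] == relation
            and o["relation"][1] in anchors]

# "Segment the object on the table" -> parsed (e.g. by an LLM) into a rule:
for obj in resolve_deictic("on", "table"):
    print(obj["label"], obj["box"])   # the box would prompt SAM for the mask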

Distilling Adversarial Prompts from Safety Benchmarks: Report for the Adversarial Nibbler Challenge

no code implementations • 20 Sep 2023 • Manuel Brack, Patrick Schramowski, Kristian Kersting

Text-conditioned image generation models have recently achieved astonishing image quality and alignment results.

Image Generation

Mitigating Inappropriateness in Image Generation: Can there be Value in Reflecting the World's Ugliness?

no code implementations • 28 May 2023 • Manuel Brack, Felix Friedrich, Patrick Schramowski, Kristian Kersting

Text-conditioned image generation models have recently achieved astonishing results in image quality and text alignment and are consequently employed in a fast-growing number of applications.

Image Generation

Class Attribute Inference Attacks: Inferring Sensitive Class Information by Diffusion-Based Attribute Manipulations

1 code implementation • 16 Mar 2023 • Lukas Struppek, Dominik Hintersdorf, Felix Friedrich, Manuel Brack, Patrick Schramowski, Kristian Kersting

Neural network-based image classifiers are powerful tools for computer vision tasks, but they inadvertently reveal sensitive attribute information about the classes they were trained on, raising privacy concerns.

Attribute · Face Recognition +2

Fair Diffusion: Instructing Text-to-Image Generation Models on Fairness

1 code implementation • 7 Feb 2023 • Felix Friedrich, Manuel Brack, Lukas Struppek, Dominik Hintersdorf, Patrick Schramowski, Sasha Luccioni, Kristian Kersting

Generative AI models have recently achieved astonishing results in quality and are consequently employed in a fast-growing number of applications.

Fairness · Text-to-Image Generation

AtMan: Understanding Transformer Predictions Through Memory Efficient Attention Manipulation

1 code implementation • NeurIPS 2023 • Björn Deiseroth, Mayukh Deb, Samuel Weinbach, Manuel Brack, Patrick Schramowski, Kristian Kersting

Generative transformer models have become increasingly complex, with large numbers of parameters and the ability to process multiple input modalities.
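
AtMan's idea, in rough terms: perturb the attention paid to one input token, re-run the forward pass, and read the change in the target loss as that token's relevance, which avoids storing gradients. A toy single-head sketch of such a perturbation; the full masking of one token (rather than the paper's graded suppression) and the loss readout are simplified assumptions.

import torch
import torch.nn.functional as F

def attention(q, k, v, suppress=None):
    scores = (q @ k.transpose(-2, -1)) / k.shape[-1] ** 0.5
    if suppress is not None:
        scores[..., suppress] = float("-inf")   # remove attention to one token
    return torch.softmax(scores, dim=-1) @ v

torch.manual_seed(0)
q, k, v = (torch.randn(5, 16) for _ in range(3))
target = torch.randn(5, 16)

base = F.mse_loss(attention(q, k, v), target)
for i in range(5):
    perturbed = F.mse_loss(attention(q, k, v, suppress=i), target)
    print(f"token {i}: relevance {float(perturbed - base):+.4f}")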

The Stable Artist: Steering Semantics in Diffusion Latent Space

2 code implementations • 12 Dec 2022 • Manuel Brack, Patrick Schramowski, Felix Friedrich, Dominik Hintersdorf, Kristian Kersting

Large, text-conditioned generative diffusion models have recently gained a lot of attention for their impressive performance in generating high-fidelity images from text alone.

Image Generation

Safe Latent Diffusion: Mitigating Inappropriate Degeneration in Diffusion Models

2 code implementations • CVPR 2023 • Patrick Schramowski, Manuel Brack, Björn Deiseroth, Kristian Kersting

Text-conditioned image generation models have recently achieved astonishing results in image quality and text alignment and are consequently employed in a fast-growing number of applications.

Image Generation · Image to text
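
The core mechanism adds a safety term to classifier-free guidance, steering the denoiser's noise estimate away from an "unsafe concept" prompt. A compressed sketch of that combination; the full method additionally uses warm-up steps, element-wise thresholds, and momentum, all omitted here.

import torch

def sld_noise_estimate(eps_uncond, eps_text, eps_unsafe,
                       guidance_scale=7.5, safety_scale=7.5):
    # standard classifier-free guidance toward the user prompt
    cfg = eps_uncond + guidance_scale * (eps_text - eps_uncond)
    # safety guidance away from the unsafe-concept direction
    safety = safety_scale * (eps_unsafe - eps_uncond)
    return cfg - safety

# eps_* come from the same UNet evaluated with empty, user, and unsafe prompts
shape = (1, 4, 64, 64)
eps_u, eps_t, eps_s = (torch.randn(shape) for _ in range(3))
noise_pred = sld_noise_estimate(eps_u, eps_t, eps_s)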

Revision Transformers: Instructing Language Models to Change their Values

1 code implementation • 19 Oct 2022 • Felix Friedrich, Wolfgang Stammer, Patrick Schramowski, Kristian Kersting

In this work, we question the current common practice of storing all information in the model parameters and propose the Revision Transformer (RiT) to facilitate easy model updating.

Information Retrieval · Retrieval +1
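
The alternative to storing everything in the weights is an external, editable memory consulted at inference time. A minimal sketch of that retrieval step, assuming a sentence-embedding encoder; the revision store and prompt format are illustrative choices, not the paper's exact setup.

import numpy as np
from sentence_transformers import SentenceTransformer

encoder = SentenceTransformer("all-MiniLM-L6-v2")
revisions = [
    "Eating meat is a personal choice; do not condemn it outright.",
    "Jaywalking is illegal in some countries and should be discouraged.",
]
index = encoder.encode(revisions, normalize_embeddings=True)

def revise_prompt(query):
    # retrieve the most relevant stored revision and prepend it
    q = encoder.encode([query], normalize_embeddings=True)
    best = revisions[int(np.argmax(index @ q.T))]
    return f"Revision: {best}\nQuestion: {query}\nAnswer:"

print(revise_prompt("Is it wrong to eat meat?"))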

Exploiting Cultural Biases via Homoglyphs in Text-to-Image Synthesis

2 code implementations • 19 Sep 2022 • Lukas Struppek, Dominik Hintersdorf, Felix Friedrich, Manuel Brack, Patrick Schramowski, Kristian Kersting

Models for text-to-image synthesis, such as DALL-E 2 and Stable Diffusion, have recently drawn a lot of interest from academia and the general public.

Image Generation
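
The attack surface is plain text: a single homoglyph, visually identical to its Latin counterpart, changes the underlying codepoints and can shift generations toward the substituted script's cultural context. A minimal illustration (the prompt is arbitrary):

# Swap Latin 'o' for the visually identical Cyrillic 'о' (U+043E).
text = "A photo of an actor"
homoglyph = text.replace("o", "\u043e")
print(text == homoglyph)                      # False: the strings differ
print([hex(ord(c)) for c in homoglyph[:7]])   # codepoints reveal the swap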

Does CLIP Know My Face?

3 code implementations • 15 Sep 2022 • Dominik Hintersdorf, Lukas Struppek, Manuel Brack, Felix Friedrich, Patrick Schramowski, Kristian Kersting

Our large-scale experiments on CLIP demonstrate that individuals used for training can be identified with very high accuracy.

Inference Attack
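
The underlying probe is simple: if a person appeared in the training data, CLIP tends to score their photo highly against a caption containing their name. A sketch of that identity inference test; the candidate names, image path, and prompt template are hypothetical stand-ins.

import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

candidates = ["Alice Example", "Bob Example", "Carol Example"]  # hypothetical
texts = [f"a photo of {name}" for name in candidates]
image = Image.open("person.jpg")                                # hypothetical

inputs = processor(text=texts, images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    probs = model(**inputs).logits_per_image.softmax(dim=-1)
print(dict(zip(candidates, probs[0].tolist())))  # a peaked score hints at memorization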

ILLUME: Rationalizing Vision-Language Models through Human Interactions

1 code implementation • 17 Aug 2022 • Manuel Brack, Patrick Schramowski, Björn Deiseroth, Kristian Kersting

Bootstrapping from pre-trained language models has proven to be an efficient approach for building vision-language models (VLM) for tasks such as image captioning or visual question answering.

Image Captioning · Question Answering +2

Do Multilingual Language Models Capture Differing Moral Norms?

no code implementations • 18 Mar 2022 • Katharina Hämmerl, Björn Deiseroth, Patrick Schramowski, Jindřich Libovický, Alexander Fraser, Kristian Kersting

Massively multilingual sentence representations are trained on large corpora of uncurated data, with a very imbalanced proportion of languages included in the training.

Sentence · XLM-R

A Typology for Exploring the Mitigation of Shortcut Behavior

3 code implementations • 4 Mar 2022 • Felix Friedrich, Wolfgang Stammer, Patrick Schramowski, Kristian Kersting

In addition, we discuss existing measures and benchmarks for evaluating the overall abilities of an XIL method and introduce novel ones.

BIG-bench Machine Learning

Interactive Disentanglement: Learning Concepts by Interacting with their Prototype Representations

1 code implementation • CVPR 2022 • Wolfgang Stammer, Marius Memmel, Patrick Schramowski, Kristian Kersting

In this work, we show the advantages of prototype representations for understanding and revising the latent space of neural concept learners.

Disentanglement

Inferring Offensiveness In Images From Natural Language Supervision

1 code implementation • 8 Oct 2021 • Patrick Schramowski, Kristian Kersting

Probing or fine-tuning (large-scale) pre-trained models results in state-of-the-art performance for many NLP tasks and, more recently, even for computer vision tasks when combined with image data.

Large Pre-trained Language Models Contain Human-like Biases of What is Right and Wrong to Do

1 code implementation • 8 Mar 2021 • Patrick Schramowski, Cigdem Turan, Nico Andersen, Constantin A. Rothkopf, Kristian Kersting

That is, we show that these norms can be captured geometrically by a direction in the embedding space, computable e.g. via PCA. This direction reflects how well phrases agree with the social norms implicitly expressed in the training texts and provides a path for attenuating or even preventing toxic degeneration in LMs.

General Knowledge
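
Concretely, that means embedding contrasting do/don't phrases with a sentence encoder, fitting a PCA, and scoring new phrases by their projection on the first principal component. A small sketch under those assumptions; the encoder and the tiny phrase set are stand-ins for the paper's full setup, and the sign of the axis is arbitrary until calibrated.

from sentence_transformers import SentenceTransformer
from sklearn.decomposition import PCA

encoder = SentenceTransformer("all-MiniLM-L6-v2")   # stand-in sentence encoder
phrases = ["help people", "comfort a friend", "be honest",
           "harm people", "steal money", "lie to a friend"]
X = encoder.encode(phrases)

pca = PCA(n_components=1).fit(X)                    # first PC = "moral direction"

def moral_score(text):
    return float(pca.transform(encoder.encode([text]))[0, 0])

for t in ["volunteer at a shelter", "insult a stranger"]:
    print(t, round(moral_score(t), 3))              # projection along the norm axis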

Adaptive Rational Activations to Boost Deep Reinforcement Learning

4 code implementations • 18 Feb 2021 • Quentin Delfosse, Patrick Schramowski, Martin Mundt, Alejandro Molina, Kristian Kersting

Recent insights from biology show that intelligence not only emerges from the connections between neurons, but that individual neurons also shoulder more computational responsibility than previously anticipated.

Ranked #3 on Atari Games on Atari 2600 Skiing (using extra training data)

Atari Games · Deep Reinforcement Learning +4

Right for the Right Concept: Revising Neuro-Symbolic Concepts by Interacting with their Explanations

3 code implementations • CVPR 2021 • Wolfgang Stammer, Patrick Schramowski, Kristian Kersting

Most explanation methods in deep learning map importance estimates for a model's prediction back to the original input space.

Meta-Learning Runge-Kutta

no code implementations • 25 Sep 2019 • Nadine Behrmann, Patrick Schramowski, Kristian Kersting

However, by studying the characteristics of the local error function, we show that including the partial derivatives of the initial value problem is favorable.

Meta-Learning · Numerical Integration

Padé Activation Units: End-to-end Learning of Flexible Activation Functions in Deep Networks

5 code implementations • ICLR 2020 • Alejandro Molina, Patrick Schramowski, Kristian Kersting

The performance of deep network learning strongly depends on the choice of the non-linear activation function associated with each neuron.
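
A Padé Activation Unit makes that choice learnable: the activation is a rational function P(x)/Q(x) with trainable coefficients, in the paper's "safe" form Q(x) = 1 + |b1·x + ... + bn·x^n| so the denominator stays positive. A minimal PyTorch sketch; the degrees and initialization below are illustrative, not the paper's tuned defaults.

import torch
import torch.nn as nn

class PAU(nn.Module):
    def __init__(self, m=5, n=4):
        super().__init__()
        self.a = nn.Parameter(torch.randn(m + 1) * 0.1)  # numerator coefficients
        self.b = nn.Parameter(torch.randn(n) * 0.1)      # denominator coefficients

    def forward(self, x):
        num_powers = torch.stack([x ** i for i in range(self.a.numel())], dim=-1)
        den_powers = torch.stack([x ** (i + 1) for i in range(self.b.numel())], dim=-1)
        P = (num_powers * self.a).sum(dim=-1)
        Q = 1.0 + (den_powers * self.b).sum(dim=-1).abs()  # "safe" denominator
        return P / Q

act = PAU()                            # drop-in replacement for e.g. nn.ReLU()
y = act(torch.linspace(-3, 3, 7))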

Neural Conditional Gradients

no code implementations • 12 Mar 2018 • Patrick Schramowski, Christian Bauckhage, Kristian Kersting

The move from hand-designed to learned optimizers in machine learning has been quite successful for both gradient-based and gradient-free optimizers.
