Search Results for author: Björn Ommer

Found 55 papers, 35 papers with code

Continuous, Subject-Specific Attribute Control in T2I Models by Identifying Semantic Directions

1 code implementation 25 Mar 2024 Stefan Andreas Baumann, Felix Krause, Michael Neumayr, Nick Stracke, Vincent Tao Hu, Björn Ommer

We demonstrate that these directions can be used to augment the prompt text input with fine-grained control over attributes of specific subjects in a compositional manner (control over multiple attributes of a single subject) without having to adapt the diffusion model.

Attribute

ZigMa: A DiT-style Zigzag Mamba Diffusion Model

1 code implementation 20 Mar 2024 Vincent Tao Hu, Stefan Andreas Baumann, Ming Gui, Olga Grebenkova, Pingchuan Ma, Johannes Fischer, Björn Ommer

The diffusion model has long been plagued by scalability and quadratic complexity issues, especially within transformer-based structures.

Quantum Denoising Diffusion Models

no code implementations 13 Jan 2024 Michael Kölle, Gerhard Stenzel, Jonas Stein, Sebastian Zielinski, Björn Ommer, Claudia Linnhoff-Popien

In recent years, machine learning models like DALL-E, Craiyon, and Stable Diffusion have gained significant attention for their ability to generate high-resolution images from concise descriptions.

Denoising Image Generation +2

Boosting Latent Diffusion with Flow Matching

2 code implementations 12 Dec 2023 Johannes S. Fischer, Ming Gui, Pingchuan Ma, Nick Stracke, Stefan A. Baumann, Björn Ommer

We demonstrate that introducing FM between the Diffusion model and the convolutional decoder offers high-resolution image synthesis with reduced computational cost and model size.

Image Generation

State of the Art on Diffusion Models for Visual Computing

no code implementations 11 Oct 2023 Ryan Po, Wang Yifan, Vladislav Golyanik, Kfir Aberman, Jonathan T. Barron, Amit H. Bermano, Eric Ryan Chan, Tali Dekel, Aleksander Holynski, Angjoo Kanazawa, C. Karen Liu, Lingjie Liu, Ben Mildenhall, Matthias Nießner, Björn Ommer, Christian Theobalt, Peter Wonka, Gordon Wetzstein

The field of visual computing is rapidly advancing due to the emergence of generative artificial intelligence (AI), which unlocks unprecedented capabilities for the generation, editing, and reconstruction of images, videos, and 3D scenes.

SceneGenie: Scene Graph Guided Diffusion Models for Image Synthesis

no code implementations 28 Apr 2023 Azade Farshad, Yousef Yeganeh, Yu Chi, Chengzhi Shen, Björn Ommer, Nassir Navab

To address this limitation, we propose a novel guidance approach for the sampling process in the diffusion model that leverages bounding box and segmentation map information at inference time without additional training data.

Image Generation from Scene Graphs Segmentation +1

Cross-Image-Attention for Conditional Embeddings in Deep Metric Learning

no code implementations CVPR 2023 Dmytro Kotovenko, Pingchuan Ma, Timo Milbich, Björn Ommer

Experiments on established DML benchmarks show that our cross-attention conditional embedding during training improves the underlying standard DML pipeline significantly so that it outperforms the state-of-the-art.

Metric Learning

Text-Guided Synthesis of Artistic Images with Retrieval-Augmented Diffusion Models

1 code implementation 26 Jul 2022 Robin Rombach, Andreas Blattmann, Björn Ommer

In RDMs, a set of nearest neighbors is retrieved from an external database during training for each training instance, and the diffusion model is conditioned on these informative samples.

Image Generation Prompt Engineering +1

ArtFID: Quantitative Evaluation of Neural Style Transfer

1 code implementation 25 Jul 2022 Matthias Wright, Björn Ommer

The field of neural style transfer has experienced a surge of research exploring different avenues ranging from optimization-based approaches and feed-forward models to meta-learning methods.

Benchmarking Meta-Learning +1

Semi-Parametric Neural Image Synthesis

2 code implementations 25 Apr 2022 Andreas Blattmann, Robin Rombach, Kaan Oktay, Jonas Müller, Björn Ommer

Much of this success is due to the scalability of these architectures and hence caused by a dramatic increase in model complexity and in the computational resources invested in training these models.

Image Generation Retrieval

High-Resolution Image Synthesis with Latent Diffusion Models

32 code implementations CVPR 2022 Robin Rombach, Andreas Blattmann, Dominik Lorenz, Patrick Esser, Björn Ommer

By decomposing the image formation process into a sequential application of denoising autoencoders, diffusion models (DMs) achieve state-of-the-art synthesis results on image data and beyond.

Denoising Image Inpainting +5

Unsupervised View-Invariant Human Posture Representation

no code implementations 17 Sep 2021 Faegheh Sardari, Björn Ommer, Majid Mirmehdi

Most recent view-invariant action recognition and performance assessment approaches rely on a large amount of annotated 3D skeleton data to extract view-invariant features.

3D Action Recognition 3D Pose Estimation +4

Improving Deep Metric Learning by Divide and Conquer

1 code implementation 9 Sep 2021 Artsiom Sanakoyeu, Pingchuan Ma, Vadim Tschernezki, Björn Ommer

We propose to build a more expressive representation by jointly splitting the embedding space and the data hierarchically into smaller sub-parts.

Image Retrieval Metric Learning +1

ImageBART: Bidirectional Context with Multinomial Diffusion for Autoregressive Image Synthesis

no code implementations NeurIPS 2021 Patrick Esser, Robin Rombach, Andreas Blattmann, Björn Ommer

Thus, in contrast to pure autoregressive models, it can solve free-form image inpainting and, in the case of conditional models, local, text-guided image modification without requiring mask-specific training.

Image Inpainting Text-to-Image Generation

Characterizing Generalization under Out-Of-Distribution Shifts in Deep Metric Learning

2 code implementations NeurIPS 2021 Timo Milbich, Karsten Roth, Samarth Sinha, Ludwig Schmidt, Marzyeh Ghassemi, Björn Ommer

Finally, we propose few-shot DML as an efficient way to consistently improve generalization in response to unknown test shifts presented in ooDML.

Metric Learning

Object Retrieval and Localization in Large Art Collections using Deep Multi-Style Feature Fusion and Iterative Voting

no code implementations 14 Jul 2021 Nikolai Ufer, Sabine Lang, Björn Ommer

In the following, we introduce an algorithm that allows users to search for image regions containing specific motifs or objects and find similar regions in an extensive dataset, helping art historians to analyze large digitized art collections.

Retrieval

iPOKE: Poking a Still Image for Controlled Stochastic Video Synthesis

2 code implementations ICCV 2021 Andreas Blattmann, Timo Milbich, Michael Dorkenwald, Björn Ommer

There will be distinctive movement, despite evident variations caused by the stochastic nature of our world.

Object

Understanding Object Dynamics for Interactive Image-to-Video Synthesis

1 code implementation CVPR 2021 Andreas Blattmann, Timo Milbich, Michael Dorkenwald, Björn Ommer

Given a static image of an object and a local poking of a pixel, the approach then predicts how the object would deform over time.

Object Video Prediction

High-Resolution Complex Scene Synthesis with Transformers

no code implementations 13 May 2021 Manuel Jahn, Robin Rombach, Björn Ommer

The use of coarse-grained layouts for controllable synthesis of complex scene images via deep generative models has recently gained popularity.

Vocal Bursts Intensity Prediction

Stochastic Image-to-Video Synthesis using cINNs

1 code implementation CVPR 2021 Michael Dorkenwald, Timo Milbich, Andreas Blattmann, Robin Rombach, Konstantinos G. Derpanis, Björn Ommer

Video understanding calls for a model to learn the characteristic interplay between static scene content and its dynamics: Given an image, the model must be able to predict a future progression of the portrayed scene and, conversely, a video should be explained in terms of its static image content and all the remaining characteristics not present in the initial frame.

Video Understanding

Behavior-Driven Synthesis of Human Dynamics

1 code implementation CVPR 2021 Andreas Blattmann, Timo Milbich, Michael Dorkenwald, Björn Ommer

Using this representation, we are able to change the behavior of a person depicted in an arbitrary posture, or to even directly transfer behavior observed in a given video sequence.

Human Dynamics

Shape or Texture: Disentangling Discriminative Features in CNNs

no code implementations ICLR 2021 Md Amirul Islam, Matthew Kowal, Patrick Esser, Sen Jia, Björn Ommer, Konstantinos G. Derpanis, Neil Bruce

Contrasting the previous evidence that neurons in the later layers of a Convolutional Neural Network (CNN) respond to complex object shapes, recent studies have shown that CNNs actually exhibit a 'texture bias': given an image with both texture and shape cues (e.g., a stylized image), a CNN is biased towards predicting the category corresponding to the texture.

Taming Transformers for High-Resolution Image Synthesis

12 code implementations CVPR 2021 Patrick Esser, Robin Rombach, Björn Ommer

We demonstrate how combining the effectiveness of the inductive bias of CNNs with the expressivity of transformers enables them to model and thereby synthesize high-resolution images.

DeepFake Detection Image Outpainting +4

A Note on Data Biases in Generative Models

1 code implementation 4 Dec 2020 Patrick Esser, Robin Rombach, Björn Ommer

It is tempting to think that machines are less prone to unfairness and prejudice.

S2SD: Simultaneous Similarity-based Self-Distillation for Deep Metric Learning

1 code implementation 17 Sep 2020 Karsten Roth, Timo Milbich, Björn Ommer, Joseph Paul Cohen, Marzyeh Ghassemi

Deep Metric Learning (DML) provides a crucial tool for visual similarity and zero-shot applications by learning generalizing embedding spaces, although recent work in DML has shown strong performance saturation across training objectives.

Ranked #10 on Metric Learning on CARS196 (using extra training data)

Knowledge Distillation Metric Learning +1

Unsupervised Part Discovery by Unsupervised Disentanglement

1 code implementation 9 Sep 2020 Sandro Braun, Patrick Esser, Björn Ommer

Our approach leverages a generative model consisting of two disentangled representations for an object's shape and appearance and a latent variable for the part segmentation.

Disentanglement Segmentation

Making Sense of CNNs: Interpreting Deep Representations & Their Invariances with INNs

1 code implementation ECCV 2020 Robin Rombach, Patrick Esser, Björn Ommer

To open such a black box, it is, therefore, crucial to uncover the different semantic concepts a model has learned as well as those that it has learned to be invariant to.

Network-to-Network Translation with Conditional Invertible Neural Networks

1 code implementation NeurIPS 2020 Robin Rombach, Patrick Esser, Björn Ommer

Given the ever-increasing computational costs of modern machine learning models, we need to find new ways to reuse such expert models and thus tap into the resources that have been invested in their creation.

Image-to-Image Translation Text-to-Image Generation +1

A Disentangling Invertible Interpretation Network for Explaining Latent Representations

2 code implementations CVPR 2020 Patrick Esser, Robin Rombach, Björn Ommer

We formulate interpretation as a translation of hidden representations onto semantic concepts that are comprehensible to the user.

Image Generation Image Manipulation

Sharing Matters for Generalization in Deep Metric Learning

no code implementations 12 Apr 2020 Timo Milbich, Karsten Roth, Biagio Brattoli, Björn Ommer

The common paradigm is discriminative metric learning, which seeks an embedding that separates different training classes.

Metric Learning

PADS: Policy-Adapted Sampling for Visual Similarity Learning

1 code implementation CVPR 2020 Karsten Roth, Timo Milbich, Björn Ommer

Learning visual similarity requires to learn relations, typically between triplets of images.

Ranked #17 on Metric Learning on CUB-200-2011 (using extra training data)

Metric Learning

A Content Transformation Block For Image Style Transfer

1 code implementation CVPR 2019 Dmytro Kotovenko, Artsiom Sanakoyeu, Pingchuan Ma, Sabine Lang, Björn Ommer

Recent work has significantly improved the representation of color and texture and computational speed and image resolution.

Image Generation Style Transfer

Revisiting Training Strategies and Generalization Performance in Deep Metric Learning

8 code implementations ICML 2020 Karsten Roth, Timo Milbich, Samarth Sinha, Prateek Gupta, Björn Ommer, Joseph Paul Cohen

Deep Metric Learning (DML) is arguably one of the most influential lines of research for learning visual similarities with many proposed approaches every year.

Metric Learning

Unsupervised Representation Learning by Discovering Reliable Image Relations

no code implementations 18 Nov 2019 Timo Milbich, Omair Ghori, Ferran Diego, Björn Ommer

To nevertheless find those relations which can be reliably utilized for learning, we follow a divide-and-conquer strategy: We find reliable similarities by extracting compact groups of images and reliable dissimilarities by partitioning these groups into subsets, converting the complicated overall problem into few reliable local subproblems.

Representation Learning Transfer Learning

Unsupervised Robust Disentangling of Latent Characteristics for Image Synthesis

no code implementations ICCV 2019 Patrick Esser, Johannes Haux, Björn Ommer

In experiments on diverse object categories, the approach successfully recombines pose and appearance to reconstruct and retarget novel synthesized images.

Disentanglement Image Generation +1

MIC: Mining Interclass Characteristics for Improved Metric Learning

2 code implementations ICCV 2019 Karsten Roth, Biagio Brattoli, Björn Ommer

In contrast, we propose to explicitly learn the latent characteristics that are shared by and go across object classes.

Ranked #19 on Metric Learning on CUB-200-2011 (using extra training data)

Image Retrieval Metric Learning +1

Divide and Conquer the Embedding Space for Metric Learning

1 code implementation CVPR 2019 Artsiom Sanakoyeu, Vadim Tschernezki, Uta Büchler, Björn Ommer

Approaches for learning a single distance metric often struggle to encode all different types of relationships and do not generalize well.

Clustering Metric Learning +1

Cross and Learn: Cross-Modal Self-Supervision

1 code implementation 9 Nov 2018 Nawid Sayed, Biagio Brattoli, Björn Ommer

In this paper we present a self-supervised method for representation learning utilizing two different modalities.

Action Recognition Optical Flow Estimation +3

Improving Spatiotemporal Self-Supervision by Deep Reinforcement Learning

no code implementations ECCV 2018 Uta Büchler, Biagio Brattoli, Björn Ommer

Self-supervised learning of convolutional neural networks can harness large amounts of cheap unlabeled data to train powerful feature representations.

General Classification reinforcement-learning +5

A Style-Aware Content Loss for Real-time HD Style Transfer

9 code implementations ECCV 2018 Artsiom Sanakoyeu, Dmytro Kotovenko, Sabine Lang, Björn Ommer

These and our qualitative results ranging from small image patches to megapixel stylistic images and videos show that our approach better captures the subtle nature in which a style affects content.

Image Stylization Video Style Transfer

Deep Unsupervised Learning of Visual Similarities

no code implementations 22 Feb 2018 Artsiom Sanakoyeu, Miguel A. Bautista, Björn Ommer

Exemplar learning of visual similarities in an unsupervised manner is a problem of paramount importance to Computer Vision.

Self-supervised Learning of Pose Embeddings from Spatiotemporal Relations in Videos

no code implementations ICCV 2017 Ömer Sümer, Tobias Dencker, Björn Ommer

Human pose analysis is presently dominated by deep convolutional networks trained with extensive manual annotations of joint locations and beyond.

Pose Estimation Retrieval +1

Deep Unsupervised Similarity Learning using Partially Ordered Sets

2 code implementations CVPR 2017 Miguel A. Bautista, Artsiom Sanakoyeu, Björn Ommer

Similarity learning is then formulated as a partial ordering task with soft correspondences of all samples to classes.

Pose Estimation

CliqueCNN: Deep Unsupervised Exemplar Learning

1 code implementation NeurIPS 2016 Miguel A. Bautista, Artsiom Sanakoyeu, Ekaterina Sutter, Björn Ommer

Exemplar learning is a powerful paradigm for discovering visual similarities in an unsupervised manner.

Spatio-temporal Video Parsing for Abnormality Detection

no code implementations 22 Feb 2015 Borislav Antić, Björn Ommer

The goal of video parsing is to find a set of indispensable normal spatio-temporal object hypotheses that jointly explain all the foreground of a video, while, at the same time, being supported by normal training samples.

Anomaly Detection
