Search Results for author: Björn Ommer

Found 55 papers, 35 papers with code

Continuous, Subject-Specific Attribute Control in T2I Models by Identifying Semantic Directions

1 code implementation • 25 Mar 2024 • Stefan Andreas Baumann, Felix Krause, Michael Neumayr, Nick Stracke, Vincent Tao Hu, Björn Ommer

We demonstrate that these directions can be used to augment the prompt text input with fine-grained control over attributes of specific subjects in a compositional manner (control over multiple attributes of a single subject) without having to adapt the diffusion model.

Attribute

Enabling Visual Composition and Animation in Unsupervised Video Generation

no code implementations • 21 Mar 2024 • Aram Davtyan, Sepehr Sameni, Björn Ommer, Paolo Favaro

We call our model CAGE for visual Composition and Animation for video GEneration.

Video Generation

DepthFM: Fast Monocular Depth Estimation with Flow Matching

no code implementations • 20 Mar 2024 • Ming Gui, Johannes S. Fischer, Ulrich Prestel, Pingchuan Ma, Dmytro Kotovenko, Olga Grebenkova, Stefan Andreas Baumann, Vincent Tao Hu, Björn Ommer

Due to the generative nature of our approach, our model reliably predicts the confidence of its depth estimates.

Monocular Depth Estimation

ZigMa: A DiT-style Zigzag Mamba Diffusion Model

1 code implementation • 20 Mar 2024 • Vincent Tao Hu, Stefan Andreas Baumann, Ming Gui, Olga Grebenkova, Pingchuan Ma, Johannes Fischer, Björn Ommer

The diffusion model has long been plagued by scalability and quadratic complexity issues, especially within transformer-based structures.

123

On the Challenges and Opportunities in Generative AI

no code implementations • 28 Feb 2024 • Laura Manduchi, Kushagra Pandey, Robert Bamler, Ryan Cotterell, Sina Däubener, Sophie Fellenz, Asja Fischer, Thomas Gärtner, Matthias Kirchler, Marius Kloft, Yingzhen Li, Christoph Lippert, Gerard de Melo, Eric Nalisnick, Björn Ommer, Rajesh Ranganath, Maja Rudolph, Karen Ullrich, Guy Van Den Broeck, Julia E Vogt, Yixin Wang, Florian Wenzel, Frank Wood, Stephan Mandt, Vincent Fortuin

The field of deep generative modeling has grown rapidly and consistently over the years.

Quantum Denoising Diffusion Models

no code implementations • 13 Jan 2024 • Michael Kölle, Gerhard Stenzel, Jonas Stein, Sebastian Zielinski, Björn Ommer, Claudia Linnhoff-Popien

In recent years, machine learning models like DALL-E, Craiyon, and Stable Diffusion have gained significant attention for their ability to generate high-resolution images from concise descriptions.

Denoising • Image Generation • +2

Boosting Latent Diffusion with Flow Matching

2 code implementations • 12 Dec 2023 • Johannes S. Fischer, Ming Gui, Pingchuan Ma, Nick Stracke, Stefan A. Baumann, Björn Ommer

We demonstrate that introducing FM between the Diffusion model and the convolutional decoder offers high-resolution image synthesis with reduced computational cost and model size.

Image Generation

State of the Art on Diffusion Models for Visual Computing

no code implementations • 11 Oct 2023 • Ryan Po, Wang Yifan, Vladislav Golyanik, Kfir Aberman, Jonathan T. Barron, Amit H. Bermano, Eric Ryan Chan, Tali Dekel, Aleksander Holynski, Angjoo Kanazawa, C. Karen Liu, Lingjie Liu, Ben Mildenhall, Matthias Nießner, Björn Ommer, Christian Theobalt, Peter Wonka, Gordon Wetzstein

The field of visual computing is rapidly advancing due to the emergence of generative artificial intelligence (AI), which unlocks unprecedented capabilities for the generation, editing, and reconstruction of images, videos, and 3D scenes.

SceneGenie: Scene Graph Guided Diffusion Models for Image Synthesis

no code implementations • 28 Apr 2023 • Azade Farshad, Yousef Yeganeh, Yu Chi, Chengzhi Shen, Björn Ommer, Nassir Navab

To address this limitation, we propose a novel guidance approach for the sampling process in the diffusion model that leverages bounding box and segmentation map information at inference time without additional training data.

Image Generation from Scene Graphs • Segmentation • +1

Cross-Image-Attention for Conditional Embeddings in Deep Metric Learning

no code implementations • CVPR 2023 • Dmytro Kotovenko, Pingchuan Ma, Timo Milbich, Björn Ommer

Experiments on established DML benchmarks show that our cross-attention conditional embedding during training improves the underlying standard DML pipeline significantly so that it outperforms the state-of-the-art.

Metric Learning

Text-Guided Synthesis of Artistic Images with Retrieval-Augmented Diffusion Models

1 code implementation • 26 Jul 2022 • Robin Rombach, Andreas Blattmann, Björn Ommer

In RDMs, a set of nearest neighbors is retrieved from an external database during training for each training instance, and the diffusion model is conditioned on these informative samples.

Image Generation • Prompt Engineering • +1

10,543

ArtFID: Quantitative Evaluation of Neural Style Transfer

1 code implementation • 25 Jul 2022 • Matthias Wright, Björn Ommer

The field of neural style transfer has experienced a surge of research exploring different avenues ranging from optimization-based approaches and feed-forward models to meta-learning methods.

Benchmarking • Meta-Learning • +1

Semi-Parametric Neural Image Synthesis

2 code implementations • 25 Apr 2022 • Andreas Blattmann, Robin Rombach, Kaan Oktay, Jonas Müller, Björn Ommer

Much of this success is due to the scalability of these architectures and hence caused by a dramatic increase in model complexity and in the computational resources invested in training these models.

Image Generation • Retrieval

10,543

High-Resolution Image Synthesis with Latent Diffusion Models

32 code implementations • CVPR 2022 • Robin Rombach, Andreas Blattmann, Dominik Lorenz, Patrick Esser, Björn Ommer

By decomposing the image formation process into a sequential application of denoising autoencoders, diffusion models (DMs) achieve state-of-the-art synthesis results on image data and beyond.

Ranked #2 on Layout-to-Image Generation on COCO-Stuff 256x256

Denoising • Image Inpainting • +5

65,347

Unsupervised View-Invariant Human Posture Representation

no code implementations • 17 Sep 2021 • Faegheh Sardari, Björn Ommer, Majid Mirmehdi

Most recent view-invariant action recognition and performance assessment approaches rely on a large amount of annotated 3D skeleton data to extract view-invariant features.

3D Action Recognition • 3D Pose Estimation • +4

Improving Deep Metric Learning by Divide and Conquer

1 code implementation • 9 Sep 2021 • Artsiom Sanakoyeu, Pingchuan Ma, Vadim Tschernezki, Björn Ommer

We propose to build a more expressive representation by jointly splitting the embedding space and the data hierarchically into smaller sub-parts.

Image Retrieval • Metric Learning • +1

ImageBART: Bidirectional Context with Multinomial Diffusion for Autoregressive Image Synthesis

no code implementations • NeurIPS 2021 • Patrick Esser, Robin Rombach, Andreas Blattmann, Björn Ommer

Thus, in contrast to pure autoregressive models, it can solve free-form image inpainting and, in the case of conditional models, local, text-guided image modification without requiring mask-specific training.

Ranked #4 on Text-to-Image Generation on Conceptual Captions

Image Inpainting • Text-to-Image Generation

Characterizing Generalization under Out-Of-Distribution Shifts in Deep Metric Learning

2 code implementations • NeurIPS 2021 • Timo Milbich, Karsten Roth, Samarth Sinha, Ludwig Schmidt, Marzyeh Ghassemi, Björn Ommer

Finally, we propose few-shot DML as an efficient way to consistently improve generalization in response to unknown test shifts presented in ooDML.

Metric Learning

Object Retrieval and Localization in Large Art Collections using Deep Multi-Style Feature Fusion and Iterative Voting

no code implementations • 14 Jul 2021 • Nikolai Ufer, Sabine Lang, Björn Ommer

In the following, we introduce an algorithm that allows users to search for image regions containing specific motifs or objects and find similar regions in an extensive dataset, helping art historians to analyze large digitized art collections.

Retrieval

iPOKE: Poking a Still Image for Controlled Stochastic Video Synthesis

2 code implementations • ICCV 2021 • Andreas Blattmann, Timo Milbich, Michael Dorkenwald, Björn Ommer

There will be distinctive movement, despite evident variations caused by the stochastic nature of our world.

Object

Understanding Object Dynamics for Interactive Image-to-Video Synthesis

1 code implementation • CVPR 2021 • Andreas Blattmann, Timo Milbich, Michael Dorkenwald, Björn Ommer

Given a static image of an object and a local poking of a pixel, the approach then predicts how the object would deform over time.

Object • Video Prediction

High-Resolution Complex Scene Synthesis with Transformers

no code implementations • 13 May 2021 • Manuel Jahn, Robin Rombach, Björn Ommer

The use of coarse-grained layouts for controllable synthesis of complex scene images via deep generative models has recently gained popularity.

Vocal Bursts Intensity Prediction

Stochastic Image-to-Video Synthesis using cINNs

1 code implementation • CVPR 2021 • Michael Dorkenwald, Timo Milbich, Andreas Blattmann, Robin Rombach, Konstantinos G. Derpanis, Björn Ommer

Video understanding calls for a model to learn the characteristic interplay between static scene content and its dynamics: Given an image, the model must be able to predict a future progression of the portrayed scene and, conversely, a video should be explained in terms of its static image content and all the remaining characteristics not present in the initial frame.

Video Understanding

179

Geometry-Free View Synthesis: Transformers and no 3D Priors

1 code implementation • ICCV 2021 • Robin Rombach, Patrick Esser, Björn Ommer

Is a geometric model required to synthesize novel views from a single image?

Ranked #1 on Novel View Synthesis on RealEstate10K

Novel View Synthesis

361

Rethinking Style Transfer: From Pixels to Parameterized Brushstrokes

2 code implementations • CVPR 2021 • Dmytro Kotovenko, Matthias Wright, Arthur Heimbrecht, Björn Ommer

There have been many successful implementations of neural style transfer in recent years.

Style Transfer

160

Behavior-Driven Synthesis of Human Dynamics

1 code implementation • CVPR 2021 • Andreas Blattmann, Timo Milbich, Michael Dorkenwald, Björn Ommer

Using this representation, we are able to change the behavior of a person depicted in an arbitrary posture, or to even directly transfer behavior observed in a given video sequence.

Human Dynamics

Shape or Texture: Disentangling Discriminative Features in CNNs

no code implementations • ICLR 2021 • Md Amirul Islam, Matthew Kowal, Patrick Esser, Sen Jia, Björn Ommer, Konstantinos G. Derpanis, Neil Bruce

Contrasting the previous evidence that neurons in the later layers of a Convolutional Neural Network (CNN) respond to complex object shapes, recent studies have shown that CNNs actually exhibit a 'texture bias': given an image with both texture and shape cues (e.g., a stylized image), a CNN is biased towards predicting the category corresponding to the texture.

Taming Transformers for High-Resolution Image Synthesis

12 code implementations • CVPR 2021 • Patrick Esser, Robin Rombach, Björn Ommer

We demonstrate how combining the effectiveness of the inductive bias of CNNs with the expressivity of transformers enables them to model and thereby synthesize high-resolution images.

Ranked #3 on Text-to-Image Generation on LHQC

DeepFake Detection • Image Outpainting • +4

5,356

A Note on Data Biases in Generative Models

1 code implementation • 4 Dec 2020 • Patrick Esser, Robin Rombach, Björn Ommer

It is tempting to think that machines are less prone to unfairness and prejudice.

219

S2SD: Simultaneous Similarity-based Self-Distillation for Deep Metric Learning

1 code implementation • 17 Sep 2020 • Karsten Roth, Timo Milbich, Björn Ommer, Joseph Paul Cohen, Marzyeh Ghassemi

Deep Metric Learning (DML) provides a crucial tool for visual similarity and zero-shot applications by learning generalizing embedding spaces, although recent work in DML has shown strong performance saturation across training objectives.

Ranked #10 on Metric Learning on CARS196 (using extra training data)

Knowledge Distillation • Metric Learning • +1

Unsupervised Part Discovery by Unsupervised Disentanglement

1 code implementation • 9 Sep 2020 • Sandro Braun, Patrick Esser, Björn Ommer

Our approach leverages a generative model consisting of two disentangled representations for an object's shape and appearance and a latent variable for the part segmentation.

Disentanglement • Segmentation

Making Sense of CNNs: Interpreting Deep Representations & Their Invariances with INNs

1 code implementation • ECCV 2020 • Robin Rombach, Patrick Esser, Björn Ommer

To open such a black box, it is, therefore, crucial to uncover the different semantic concepts a model has learned as well as those that it has learned to be invariant to.

Network-to-Network Translation with Conditional Invertible Neural Networks

1 code implementation • NeurIPS 2020 • Robin Rombach, Patrick Esser, Björn Ommer

Given the ever-increasing computational costs of modern machine learning models, we need to find new ways to reuse such expert models and thus tap into the resources that have been invested in their creation.

Image-to-Image Translation • Text-to-Image Generation • +1

219

DiVA: Diverse Visual Feature Aggregation for Deep Metric Learning

2 code implementations • ECCV 2020 • Timo Milbich, Karsten Roth, Homanga Bharadhwaj, Samarth Sinha, Yoshua Bengio, Björn Ommer, Joseph Paul Cohen

Visual Similarity plays an important role in many computer vision applications.

Ranked #13 on Metric Learning on CUB-200-2011 (using extra training data)

Metric Learning

A Disentangling Invertible Interpretation Network for Explaining Latent Representations

2 code implementations • CVPR 2020 • Patrick Esser, Robin Rombach, Björn Ommer

We formulate interpretation as a translation of hidden representations onto semantic concepts that are comprehensible to the user.

Image Generation • Image Manipulation

120

Sharing Matters for Generalization in Deep Metric Learning

no code implementations • 12 Apr 2020 • Timo Milbich, Karsten Roth, Biagio Brattoli, Björn Ommer

The common paradigm is discriminative metric learning, which seeks an embedding that separates different training classes.

Metric Learning

Learning Multi-Scale Photo Exposure Correction

2 code implementations • CVPR 2021 • Mahmoud Afifi, Konstantinos G. Derpanis, Björn Ommer, Michael S. Brown

In contrast, our proposed method targets both over- and underexposure errors in photographs.

Ranked #3 on Image Enhancement on Exposure-Errors

Image Enhancement

482

PADS: Policy-Adapted Sampling for Visual Similarity Learning

1 code implementation • CVPR 2020 • Karsten Roth, Timo Milbich, Björn Ommer

Learning visual similarity requires learning relations, typically between triplets of images.

Ranked #17 on Metric Learning on CUB-200-2011 (using extra training data)

Metric Learning

A Content Transformation Block For Image Style Transfer

1 code implementation • CVPR 2019 • Dmytro Kotovenko, Artsiom Sanakoyeu, Pingchuan Ma, Sabine Lang, Björn Ommer

Recent work has significantly improved the representation of color and texture and computational speed and image resolution.

Image Generation • Style Transfer

Revisiting Training Strategies and Generalization Performance in Deep Metric Learning

8 code implementations • ICML 2020 • Karsten Roth, Timo Milbich, Samarth Sinha, Prateek Gupta, Björn Ommer, Joseph Paul Cohen

Deep Metric Learning (DML) is arguably one of the most influential lines of research for learning visual similarities with many proposed approaches every year.

Metric Learning

570

Unsupervised Representation Learning by Discovering Reliable Image Relations

no code implementations • 18 Nov 2019 • Timo Milbich, Omair Ghori, Ferran Diego, Björn Ommer

To nevertheless find those relations which can be reliably utilized for learning, we follow a divide-and-conquer strategy: We find reliable similarities by extracting compact groups of images and reliable dissimilarities by partitioning these groups into subsets, converting the complicated overall problem into few reliable local subproblems.

Representation Learning • Transfer Learning

Unsupervised Robust Disentangling of Latent Characteristics for Image Synthesis

no code implementations • ICCV 2019 • Patrick Esser, Johannes Haux, Björn Ommer

In experiments on diverse object categories, the approach successfully recombines pose and appearance to reconstruct and retarget novel synthesized images.

Disentanglement • Image Generation • +1

MIC: Mining Interclass Characteristics for Improved Metric Learning

2 code implementations • ICCV 2019 • Karsten Roth, Biagio Brattoli, Björn Ommer

In contrast, we propose to explicitly learn the latent characteristics that are shared by and go across object classes.

Ranked #19 on Metric Learning on CUB-200-2011 (using extra training data)

Image Retrieval • Metric Learning • +1

Multi-Scale Convolutions for Learning Context Aware Feature Representations

no code implementations • 17 Jun 2019 • Nikolai Ufer, Kam To Lui, Katja Schwarz, Paul Warkentin, Björn Ommer

Finding semantic correspondences is a challenging problem.

Metric Learning

Divide and Conquer the Embedding Space for Metric Learning

1 code implementation • CVPR 2019 • Artsiom Sanakoyeu, Vadim Tschernezki, Uta Büchler, Björn Ommer

Approaches for learning a single distance metric often struggle to encode all different types of relationships and do not generalize well.

Clustering • Metric Learning • +1

262

Unsupervised Part-Based Disentangling of Object Shape and Appearance

2 code implementations • CVPR 2019 • Dominik Lorenz, Leonard Bereska, Timo Milbich, Björn Ommer

Large intra-class variation is the result of changes in multiple object characteristics.

Ranked #3 on Unsupervised Human Pose Estimation on Human3.6M

Image Generation • Object • +4

Cross and Learn: Cross-Modal Self-Supervision

1 code implementation • 9 Nov 2018 • Nawid Sayed, Biagio Brattoli, Björn Ommer

In this paper we present a self-supervised method for representation learning utilizing two different modalities.

Action Recognition • Optical Flow Estimation • +3

Improving Spatiotemporal Self-Supervision by Deep Reinforcement Learning

no code implementations • ECCV 2018 • Uta Büchler, Biagio Brattoli, Björn Ommer

Self-supervised learning of convolutional neural networks can harness large amounts of cheap unlabeled data to train powerful feature representations.

General Classification • reinforcement-learning • +5

A Style-Aware Content Loss for Real-time HD Style Transfer

9 code implementations • ECCV 2018 • Artsiom Sanakoyeu, Dmytro Kotovenko, Sabine Lang, Björn Ommer

These and our qualitative results ranging from small image patches to megapixel stylistic images and videos show that our approach better captures the subtle nature in which a style affects content.

Image Stylization • Video Style Transfer

722

A Variational U-Net for Conditional Appearance and Shape Generation

2 code implementations • CVPR 2018 • Patrick Esser, Ekaterina Sutter, Björn Ommer

Experiments show that the model enables conditional image generation and transfer.

Conditional Image Generation

497

Deep Unsupervised Learning of Visual Similarities

no code implementations • 22 Feb 2018 • Artsiom Sanakoyeu, Miguel A. Bautista, Björn Ommer

Exemplar learning of visual similarities in an unsupervised manner is a problem of paramount importance to Computer Vision.

Self-supervised Learning of Pose Embeddings from Spatiotemporal Relations in Videos

no code implementations • ICCV 2017 • Ömer Sümer, Tobias Dencker, Björn Ommer

Human pose analysis is presently dominated by deep convolutional networks trained with extensive manual annotations of joint locations and beyond.

Pose Estimation • Retrieval • +1

Deep Unsupervised Similarity Learning using Partially Ordered Sets

2 code implementations • CVPR 2017 • Miguel A. Bautista, Artsiom Sanakoyeu, Björn Ommer

Similarity learning is then formulated as a partial ordering task with soft correspondences of all samples to classes.

Pose Estimation

143

CliqueCNN: Deep Unsupervised Exemplar Learning

1 code implementation • NeurIPS 2016 • Miguel A. Bautista, Artsiom Sanakoyeu, Ekaterina Sutter, Björn Ommer

Exemplar learning is a powerful paradigm for discovering visual similarities in an unsupervised manner.

Spatio-temporal Video Parsing for Abnormality Detection

no code implementations • 22 Feb 2015 • Borislav Antić, Björn Ommer

The goal of video parsing is to find a set of indispensable normal spatio-temporal object hypotheses that jointly explain all the foreground of a video, while, at the same time, being supported by normal training samples.

Anomaly Detection
