1 code implementation • 25 Mar 2024 • Stefan Andreas Baumann, Felix Krause, Michael Neumayr, Nick Stracke, Vincent Tao Hu, Björn Ommer
We demonstrate that these directions can be used to augment the prompt text input with fine-grained control over attributes of specific subjects in a compositional manner (control over multiple attributes of a single subject) without having to adapt the diffusion model.
1 code implementation • 20 Mar 2024 • Vincent Tao Hu, Stefan Andreas Baumann, Ming Gui, Olga Grebenkova, Pingchuan Ma, Johannes Fischer, Björn Ommer
Diffusion models have long been plagued by scalability and quadratic-complexity issues, especially in transformer-based architectures.
no code implementations • 20 Mar 2024 • Ming Gui, Johannes S. Fischer, Ulrich Prestel, Pingchuan Ma, Dmytro Kotovenko, Olga Grebenkova, Stefan Andreas Baumann, Vincent Tao Hu, Björn Ommer
Due to the generative nature of our approach, our model reliably predicts the confidence of its depth estimates.
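A generative depth model can expose confidence by being sampled repeatedly. The sketch below is purely illustrative, not the paper's actual procedure: `depth_with_confidence` and the `model` callable are hypothetical names, and per-pixel standard deviation across samples is used as one plausible uncertainty proxy.

```python
import torch

def depth_with_confidence(model, image, n_samples=8):
    """Hypothetical sketch: sample a stochastic depth model several
    times; the per-pixel mean is the depth estimate and the per-pixel
    standard deviation serves as a confidence proxy (high std = low
    confidence)."""
    samples = torch.stack([model(image) for _ in range(n_samples)])
    depth = samples.mean(dim=0)
    uncertainty = samples.std(dim=0)
    return depth, uncertainty
```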
1 code implementation • 16 Feb 2024 • Divin Yan, Lu Qi, Vincent Tao Hu, Ming-Hsuan Yang, Meng Tang
To address the observed appearance overlap between synthesized images of rare classes and tail classes, we propose a method based on contrastive learning to minimize the overlap between distributions of synthetic images for different classes.
no code implementations • 17 Dec 2023 • Vincent Tao Hu, David W Zhang, Pascal Mettes, Meng Tang, Deli Zhao, Cees G. M. Snoek
Flow Matching is an emerging generative modeling technique that offers the advantage of simple and efficient training.
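The "simple and efficient training" of Flow Matching can be sketched as a plain regression objective: pick a point on the straight path between a noise sample and a data sample, and regress the model's velocity field onto the constant path velocity. This minimal form (with a hypothetical `model(x, t)` signature) is a common textbook variant, not necessarily the paper's exact formulation.

```python
import torch

def flow_matching_loss(model, x1):
    """Minimal conditional flow matching loss with straight-line paths:
    regress the predicted velocity onto the constant target x1 - x0."""
    x0 = torch.randn_like(x1)           # noise endpoint
    t = torch.rand(x1.shape[0], 1)      # uniform time in [0, 1]
    xt = (1 - t) * x0 + t * x1          # point on the straight path
    v_target = x1 - x0                  # constant path velocity
    v_pred = model(xt, t)               # model's predicted velocity
    return ((v_pred - v_target) ** 2).mean()
```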
no code implementations • 14 Dec 2023 • Vincent Tao Hu, Wenzhe Yin, Pingchuan Ma, Yunlu Chen, Basura Fernando, Yuki M Asano, Efstratios Gavves, Pascal Mettes, Björn Ommer, Cees G. M. Snoek
In this paper, we propose \emph{Motion Flow Matching}, a novel generative model designed for human motion generation featuring efficient sampling and effectiveness in motion editing applications.
no code implementations • 14 Dec 2023 • Vincent Tao Hu, Yunlu Chen, Mathilde Caron, Yuki M. Asano, Cees G. M. Snoek, Björn Ommer
However, recent studies have revealed that the feature representation derived from the diffusion model itself is also discriminative for numerous downstream tasks, which prompts us to propose a framework to extract guidance from, and specifically for, diffusion models.
no code implementations • 28 Nov 2023 • Jacob Schnell, Jieke Wang, Lu Qi, Vincent Tao Hu, Meng Tang
We propose a generative data augmentation method that leverages a ControlNet diffusion model conditioned on semantic scribbles to produce high-quality training data.
no code implementations • 24 Nov 2023 • Eslam Mohamed BAKR, Liangbing Zhao, Vincent Tao Hu, Matthieu Cord, Patrick Perez, Mohamed Elhoseiny
Diffusion-based generative models excel in perceptually impressive synthesis but face challenges in interpretability.
1 code implementation • CVPR 2023 • Vincent Tao Hu, David W Zhang, Yuki M. Asano, Gertjan J. Burghouts, Cees G. M. Snoek
Diffusion models have demonstrated remarkable progress in image generation quality, especially when guidance is used to control the generative process.
1 code implementation • 18 Jun 2021 • Martine Toering, Ioannis Gatopoulos, Maarten Stol, Vincent Tao Hu
Instance-level contrastive learning techniques, which rely on data augmentation and a contrastive loss function, have found great success in the domain of visual representation learning.
Ranked #3 on Self-supervised Video Retrieval on HMDB51
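The "contrastive loss function" referenced above typically takes the InfoNCE form: each embedding should match its augmented counterpart and repel every other instance in the batch. The sketch below is a generic instance-level contrastive loss, not the paper's exact objective; the `info_nce` name and temperature value are illustrative.

```python
import torch
import torch.nn.functional as F

def info_nce(z1, z2, temperature=0.1):
    """Instance-level contrastive (InfoNCE) loss over a batch: the
    positives for row i sit on the diagonal of the similarity matrix,
    all other rows act as negatives."""
    z1 = F.normalize(z1, dim=1)
    z2 = F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / temperature       # pairwise cosine similarities
    targets = torch.arange(z1.shape[0])      # positive pairs on the diagonal
    return F.cross_entropy(logits, targets)
```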
1 code implementation • ECCV 2020 • Yunlu Chen, Vincent Tao Hu, Efstratios Gavves, Thomas Mensink, Pascal Mettes, Pengwan Yang, Cees G. M. Snoek
In this paper, we define data augmentation between point clouds as a shortest path linear interpolation.
Ranked #3 on 3D Point Cloud Data Augmentation on ModelNet40
3D Point Cloud Classification • 3D Point Cloud Data Augmentation • +2
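Shortest-path linear interpolation between point clouds can be sketched as follows (a simplified illustration of the idea, not the paper's exact algorithm): match the two clouds point-to-point via an optimal one-to-one assignment under squared Euclidean cost, then linearly interpolate matched pairs. The function name `point_cloud_interp` is hypothetical.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def point_cloud_interp(a, b, lam):
    """Interpolate two (N, 3) point clouds: solve the optimal
    one-to-one assignment under squared Euclidean cost, then blend
    matched pairs linearly with mixing weight lam in [0, 1]."""
    cost = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)  # N x N pairwise costs
    rows, cols = linear_sum_assignment(cost)               # optimal matching
    return (1 - lam) * a[rows] + lam * b[cols]
```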
1 code implementation • ECCV 2020 • Pengwan Yang, Vincent Tao Hu, Pascal Mettes, Cees G. M. Snoek
The start and end of an action in a long untrimmed video are determined based on just a handful of trimmed video examples containing the same action, without knowing their common class label.