no code implementations • 23 Dec 2022 • Hang Li, Jindong Gu, Rajat Koner, Sahand Sharifzadeh, Volker Tresp
In this work, we argue that the best text or caption for a given image is the one that would generate the image most similar to the given image.
1 code implementation • 22 Aug 2022 • Rajat Koner, Tanveer Hannan, Suprosanna Shit, Sahand Sharifzadeh, Matthias Schubert, Thomas Seidl, Volker Tresp
We propose three novel components to model short-term and long-term dependency and temporal coherence.
Ranked #1 on Video Instance Segmentation on Youtube-VIS 2022 Validation (using extra training data)
1 code implementation • 19 Mar 2022 • Suprosanna Shit, Rajat Koner, Bastian Wittmann, Johannes Paetzold, Ivan Ezhov, Hongwei Li, Jiazhen Pan, Sahand Sharifzadeh, Georgios Kaissis, Volker Tresp, Bjoern Menze
We leverage direct set-based object prediction and incorporate the interaction among the objects to learn an object-relation representation jointly.
no code implementations • 16 Mar 2022 • Poulami Sinhamahapatra, Rajat Koner, Karsten Roscher, Stephan Günnemann
It is essential for safety-critical applications of deep neural networks to determine when new inputs are significantly different from the training distribution.
1 code implementation • 14 Feb 2022 • Tanveer Hannan, Rajat Koner, Jonathan Kobold, Matthias Schubert
Video Object Segmentation (VOS) has been targeted by various fully-supervised and self-supervised approaches.
1 code implementation • 19 Jul 2021 • Rajat Koner, Poulami Sinhamahapatra, Karsten Roscher, Stephan Günnemann, Volker Tresp
A serious problem in image classification is that a trained model might perform well for input data that originates from the same distribution as the data available for model training, but perform much worse for out-of-distribution (OOD) samples.
1 code implementation • 13 Jul 2021 • Rajat Koner, Hang Li, Marcel Hildebrandt, Deepan Das, Volker Tresp, Stephan Günnemann
We conduct an experimental study on the challenging GQA dataset, based on both manually curated and automatically generated scene graphs.
1 code implementation • 12 Jul 2021 • Rajat Koner, Poulami Sinhamahapatra, Volker Tresp
Identifying objects in an image and their mutual relationships as a scene graph leads to a deep understanding of image content.
no code implementations • 2 Jul 2020 • Marcel Hildebrandt, Hang Li, Rajat Koner, Volker Tresp, Stephan Günnemann
We propose a novel method that approaches the task by performing context-driven, sequential reasoning based on the objects and their semantic and spatial relationships present in the scene.
2 code implementations • 13 Apr 2020 • Rajat Koner, Suprosanna Shit, Volker Tresp
In this work, we propose a novel transformer formulation for scene graph generation and relation prediction.
1 code implementation • 2 May 2019 • Sahand Sharifzadeh, Sina Moayed Baharlou, Max Berrendorf, Rajat Koner, Volker Tresp
We argue that depth maps can additionally provide valuable information on object relations, e.g., helping to detect not only spatial relations, such as standing behind, but also non-spatial relations, such as holding.
Ranked #1 on Relationship Detection on VRD