Search Results for author: Rajat Koner

Found 13 papers, 9 papers with code

LookupViT: Compressing visual information to a limited number of tokens

no code implementations • 17 Jul 2024 • Rajat Koner, Gagan Jain, Prateek Jain, Volker Tresp, Sujoy Paul

We show LookupViT's effectiveness on multiple domains: (a) image classification (ImageNet-1K and ImageNet-21K), (b) video classification (Kinetics400 and Something-Something V2), and (c) image captioning (COCO-Captions) with a frozen encoder.

Image Captioning, Image Classification, +1

Do DALL-E and Flamingo Understand Each Other?

no code implementations • ICCV 2023 • Hang Li, Jindong Gu, Rajat Koner, Sahand Sharifzadeh, Volker Tresp

To study this question, we propose a reconstruction task where Flamingo generates a description for a given image and DALL-E uses this description as input to synthesize a new image.

Image Captioning, Image Reconstruction, +3

Relationformer: A Unified Framework for Image-to-Graph Generation

1 code implementation • 19 Mar 2022 • Suprosanna Shit, Rajat Koner, Bastian Wittmann, Johannes Paetzold, Ivan Ezhov, Hongwei Li, Jiazhen Pan, Sahand Sharifzadeh, Georgios Kaissis, Volker Tresp, Bjoern Menze

We leverage direct set-based object prediction and incorporate the interaction among the objects to learn an object-relation representation jointly.

Graph Generation, Object, +4

Is it all a cluster game? -- Exploring Out-of-Distribution Detection based on Clustering in the Embedding Space

no code implementations • 16 Mar 2022 • Poulami Sinhamahapatra, Rajat Koner, Karsten Roscher, Stephan Günnemann

It is essential for safety-critical applications of deep neural networks to determine when new inputs are significantly different from the training distribution.

Contrastive Learning, Out-of-Distribution Detection, +1

Box Supervised Video Segmentation Proposal Network

1 code implementation • 14 Feb 2022 • Tanveer Hannan, Rajat Koner, Jonathan Kobold, Matthias Schubert

Video Object Segmentation (VOS) has been targeted by various fully-supervised and self-supervised approaches.

Image Segmentation, Motion Compensation, +6

OODformer: Out-Of-Distribution Detection Transformer

1 code implementation • 19 Jul 2021 • Rajat Koner, Poulami Sinhamahapatra, Karsten Roscher, Stephan Günnemann, Volker Tresp

A serious problem in image classification is that a trained model might perform well for input data that originates from the same distribution as the data available for model training, but perform much worse for out-of-distribution (OOD) samples.

Contrastive Learning, Out-of-Distribution Detection, +1

Graphhopper: Multi-Hop Scene Graph Reasoning for Visual Question Answering

1 code implementation • 13 Jul 2021 • Rajat Koner, Hang Li, Marcel Hildebrandt, Deepan Das, Volker Tresp, Stephan Günnemann

We conduct an experimental study on the challenging dataset GQA, based on both manually curated and automatically generated scene graphs.

Navigate, Question Answering, +1

Scenes and Surroundings: Scene Graph Generation using Relation Transformer

1 code implementation • 12 Jul 2021 • Rajat Koner, Poulami Sinhamahapatra, Volker Tresp

Identifying objects in an image and their mutual relationships as a scene graph leads to a deep understanding of image content.

Graph Generation, Object, +2

Scene Graph Reasoning for Visual Question Answering

no code implementations • 2 Jul 2020 • Marcel Hildebrandt, Hang Li, Rajat Koner, Volker Tresp, Stephan Günnemann

We propose a novel method that approaches the task by performing context-driven, sequential reasoning based on the objects and their semantic and spatial relationships present in the scene.

Navigate, Question Answering, +1

Relation Transformer Network

2 code implementations • 13 Apr 2020 • Rajat Koner, Suprosanna Shit, Volker Tresp

In this work, we propose a novel transformer formulation for scene graph generation and relation prediction.

Decoder, Graph Generation, +3

Improving Visual Relation Detection using Depth Maps

1 code implementation • 2 May 2019 • Sahand Sharifzadeh, Sina Moayed Baharlou, Max Berrendorf, Rajat Koner, Volker Tresp

We argue that depth maps can additionally provide valuable information on object relations, e.g., helping to detect not only spatial relations, such as standing behind, but also non-spatial relations, such as holding.

Object, Relation, +2
