no code implementations • 27 Oct 2019 • Shyamgopal Karthik, Abhinav Moudgil, Vineet Gandhi
Recent works have proposed several long-term tracking benchmarks and highlighted the importance of moving towards long-duration tracking to bridge the gap with application requirements.
no code implementations • 4 Jun 2020 • Shyamgopal Karthik, Ameya Prabhu, Vineet Gandhi
Multi-object tracking has seen a lot of progress recently, albeit with substantial annotation costs for developing better and larger labeled datasets.
no code implementations • ICLR 2021 • Shyamgopal Karthik, Ameya Prabhu, Puneet K. Dokania, Vineet Gandhi
There has been increasing interest in building deep hierarchy-aware classifiers, aiming to quantify and reduce the severity of mistakes and not just count the number of errors.
no code implementations • 25 Aug 2021 • Shyamgopal Karthik, Jérome Revaud, Boris Chidlovskii
In addition, the resulting learned representations are remarkably robust to label noise when fine-tuned with an imbalance- and noise-resistant loss function.
1 code implementation • NeurIPS 2023 • Kanishk Jain, Shyamgopal Karthik, Vineet Gandhi
We investigate the problem of reducing mistake severity for fine-grained classification.
1 code implementation • 1 Apr 2021 • Shyamgopal Karthik, Ameya Prabhu, Puneet K. Dokania, Vineet Gandhi
There has been increasing interest in building deep hierarchy-aware classifiers that aim to quantify and reduce the severity of mistakes, and not just reduce the number of errors.
1 code implementation • CVPR 2022 • Shyamgopal Karthik, Massimiliano Mancini, Zeynep Akata
The goal of open-world compositional zero-shot learning (OW-CZSL) is to recognize compositions of states and objects in images, given only a subset of them during training and no prior on the unseen compositions.
1 code implementation • 24 Sep 2021 • Jeet Vora, Swetanjal Dutta, Kanishk Jain, Shyamgopal Karthik, Vineet Gandhi
Multi-view Detection (MVD) is highly effective for occlusion reasoning in a crowded environment.
Ranked #2 on Multiview Detection on GMVD
1 code implementation • 22 May 2023 • Shyamgopal Karthik, Karsten Roth, Massimiliano Mancini, Zeynep Akata
Despite their impressive capabilities, diffusion-based text-to-image (T2I) models can lack faithfulness to the text prompt, where generated images may not contain all the mentioned objects, attributes or relations.
1 code implementation • ICCV 2023 • Uddeshya Upadhyay, Shyamgopal Karthik, Massimiliano Mancini, Zeynep Akata
We propose ProbVLM, a probabilistic adapter that estimates probability distributions for the embeddings of pre-trained VLMs via inter/intra-modal alignment in a post-hoc manner, without needing large-scale datasets or compute.
1 code implementation • 13 Oct 2023 • Shyamgopal Karthik, Karsten Roth, Massimiliano Mancini, Zeynep Akata
Finally, we show that CIReVL makes CIR human-understandable by composing image and text in a modular fashion in the language domain, which makes it intervenable and allows failure cases to be re-aligned post hoc.
Ranked #1 on Zero-Shot Composed Image Retrieval (ZS-CIR) on CIRCO
1 code implementation • 14 Jul 2022 • Uddeshya Upadhyay, Shyamgopal Karthik, Yanbei Chen, Massimiliano Mancini, Zeynep Akata
Moreover, many of the high-performing deep learning models that are already trained and deployed are non-Bayesian in nature and do not provide uncertainty estimates.
1 code implementation • 11 Dec 2020 • Samyak Jain, Pradeep Yarlagadda, Shreyank Jyoti, Shyamgopal Karthik, Ramanathan Subramanian, Vineet Gandhi
We also explore a variation of the ViNet architecture that augments the decoder with audio features.