1 code implementation • 28 Sep 2022 • Nasib Ullah, Partha Pratim Mohanta
In video captioning, there are two kinds of hallucination: object and action hallucination.
no code implementations • 7 Feb 2022 • Prithwish Jana, Partha Pratim Mohanta
Object detection serves as a significant step in improving the performance of complex downstream computer vision tasks.
no code implementations • 14 Nov 2021 • Prithwish Jana, Swarnabja Bhaumik, Partha Pratim Mohanta
To corroborate the effectiveness of the proposed method, we evaluate the video classification task by comparing our dynamic cropping technique with random cropping on three benchmark datasets, viz.
no code implementations • 3 Nov 2021 • Swarnabja Bhaumik, Prithwish Jana, Partha Pratim Mohanta
Further exploring multistream models, we conceive a multi-tier fusion strategy for the spatial and temporal wings of a network.
no code implementations • 25 Jul 2021 • Nasib Ullah, Partha Pratim Mohanta
A significant drawback of existing video captioning methods is that they are optimized with a cross-entropy loss function, which is uncorrelated with the de facto evaluation metrics (BLEU, METEOR, CIDEr, ROUGE).
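The mismatch between cross-entropy and overlap-based metrics can be seen in a toy example. The sketch below is illustrative only and is not the method of the paper above: the vocabulary, the two hypothetical models (`model_a`, `model_b`), their per-step probabilities, and the helper names are all assumptions, and clipped unigram precision is used as a simplified stand-in for BLEU. Under these assumptions, the model with the lower cross-entropy produces the caption with the worse overlap score.

```python
import math
from collections import Counter

def cross_entropy(step_probs, reference):
    """Mean negative log-likelihood of the reference tokens (teacher forcing)."""
    return -sum(math.log(p[tok]) for p, tok in zip(step_probs, reference)) / len(reference)

def greedy_decode(step_probs):
    """Pick the most probable token at each step (simplified: no feedback of own output)."""
    return [max(p, key=p.get) for p in step_probs]

def unigram_precision(candidate, reference):
    """Clipped unigram precision, a simplified stand-in for BLEU-1."""
    ref_counts = Counter(reference)
    cand_counts = Counter(candidate)
    matched = sum(min(c, ref_counts[t]) for t, c in cand_counts.items())
    return matched / len(candidate)

reference = ["a", "dog", "runs"]

# Hypothetical model A: confident on steps 1 and 3, slightly prefers the wrong object at step 2.
model_a = [
    {"a": 0.90, "man": 0.04, "dog": 0.03, "runs": 0.03},
    {"a": 0.00, "man": 0.55, "dog": 0.45, "runs": 0.00},
    {"a": 0.04, "man": 0.03, "dog": 0.03, "runs": 0.90},
]

# Hypothetical model B: never confident, but the reference token is always the top choice.
model_b = [
    {"a": 0.4, "man": 0.2, "dog": 0.2, "runs": 0.2},
    {"a": 0.2, "man": 0.2, "dog": 0.4, "runs": 0.2},
    {"a": 0.2, "man": 0.2, "dog": 0.2, "runs": 0.4},
]

for name, model in [("A", model_a), ("B", model_b)]:
    caption = greedy_decode(model)
    print(name, caption,
          "cross-entropy:", round(cross_entropy(model, reference), 3),
          "unigram precision:", round(unigram_precision(caption, reference), 3))
```

Here model A decodes "a man runs" (an object error) yet has the lower cross-entropy, while model B decodes the reference exactly despite a higher loss, so ranking models by cross-entropy alone would pick the worse caption in this constructed case.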