1 code implementation • 28 Sep 2022 • Nasib Ullah, Partha Pratim Mohanta
In video captioning, there are two kinds of hallucination: object and action hallucination.
1 code implementation • 28 Mar 2022 • Gnana Praveen Rajasekar, Wheidima Carneiro de Melo, Nasib Ullah, Haseeb Aslam, Osama Zeeshan, Théo Denorme, Marco Pedersoli, Alessandro Koerich, Simon Bacon, Patrick Cardinal, Eric Granger
Specifically, we propose a joint cross-attention model that relies on the complementary relationships between the audio and visual (A-V) modalities to extract salient features across them, allowing for accurate prediction of continuous values of valence and arousal.
no code implementations • 25 Jul 2021 • Nasib Ullah, Partha Pratim Mohanta
A significant drawback of existing video captioning methods is that they are optimized with a cross-entropy loss function, which is uncorrelated with the de facto evaluation metrics (BLEU, METEOR, CIDEr, ROUGE).