no code implementations • 16 Jan 2025 • Thao Minh Le, Vuong Le, Kien Do, Sunil Gupta, Svetha Venkatesh, Truyen Tran
This paper introduces a new problem, Causal Abductive Reasoning on Video Events (CARVE), which involves identifying causal relationships between events in a video and generating hypotheses about causal chains that account for the occurrence of a target event.
no code implementations • 2 Jul 2024 • Long Hoang Dang, Thao Minh Le, Vuong Le, Tu Minh Phuong, Truyen Tran
To address the compositional nature of questions, the deliberation step decomposes complex questions into a sequence of subquestions.
no code implementations • ICCV 2023 • Hung Tran, Vuong Le, Svetha Venkatesh, Truyen Tran
To bridge that gap, this work proposes to model two concurrent mechanisms that jointly control human motion: the Persistent process that runs continually on the global scale, and the Transient sub-processes that operate intermittently on the local context of the human while interacting with objects.
1 code implementation • 8 Jul 2022 • Hoang-Anh Pham, Thao Minh Le, Vuong Le, Tu Minh Phuong, Truyen Tran
To tackle these challenges, we present a new object-centric framework for video dialog that supports neural reasoning, dubbed COST (Conversation about Objects in Space-Time).
no code implementations • 25 May 2022 • Thao Minh Le, Vuong Le, Sunil Gupta, Svetha Venkatesh, Truyen Tran
This grounding guides the attention mechanism inside VQA models through a duality of mechanisms: pre-training attention weight calculation and directly guiding the weights at inference time on a case-by-case basis.
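The inference-time half of this duality can be illustrated with a minimal sketch: blend the model's raw attention logits with an external grounding prior, then renormalize. The function name, the log-blending rule, and the `alpha` trade-off parameter below are illustrative assumptions, not the paper's actual formulation.

```python
import numpy as np

def guide_attention(model_scores, grounding_prior, alpha=0.5):
    """Blend a model's raw attention logits with an external grounding
    prior at inference time, then renormalize with a softmax.
    alpha=0 recovers the unguided model; alpha>0 pulls weights
    toward regions the grounding prior marks as relevant."""
    blended = (1 - alpha) * model_scores + alpha * np.log(grounding_prior + 1e-8)
    exp = np.exp(blended - blended.max())  # numerically stable softmax
    return exp / exp.sum()

scores = np.array([2.0, 0.5, 0.1])   # raw attention logits over 3 image regions
prior = np.array([0.1, 0.8, 0.1])    # grounding evidence favors region 1
weights = guide_attention(scores, prior)
```

With the prior applied, attention mass shifts toward region 1 even though the model's own logits favored region 0.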
no code implementations • 21 Apr 2022 • Hung Tran, Vuong Le, Svetha Venkatesh, Truyen Tran
We propose to model the persistent-transient duality in human behavior using a parent-child multi-channel neural network, which features a parent persistent channel that manages the global dynamics and children transient channels that are initiated and terminated on-demand to handle detailed interactive actions.
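The parent-child channel idea can be sketched as a toy recurrence: a persistent parent state updates at every step, while a transient child state is created when an interaction begins and discarded when it ends. All class names, update rules, and the additive output combination below are illustrative stand-ins, not the paper's architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

class PersistentTransientModel:
    """Toy parent-child multi-channel recurrence (illustrative only)."""

    def __init__(self, dim):
        self.W_p = rng.standard_normal((dim, dim)) * 0.1  # parent transition
        self.W_t = rng.standard_normal((dim, dim)) * 0.1  # child transition
        self.parent = np.zeros(dim)
        self.child = None  # transient channel exists only during an interaction

    def step(self, x, interacting):
        # The persistent channel runs continually on the global dynamics.
        self.parent = np.tanh(self.W_p @ self.parent + x)
        if interacting:
            if self.child is None:                # initiate on demand,
                self.child = self.parent.copy()   # seeded from the parent state
            self.child = np.tanh(self.W_t @ self.child + x)
        else:
            self.child = None                     # terminate when interaction ends
        return self.parent if self.child is None else self.parent + self.child

model = PersistentTransientModel(dim=4)
outputs = [model.step(rng.standard_normal(4), interacting=(3 <= t <= 6))
           for t in range(10)]
```

The key design point is that the child channel's lifetime is data-driven: it exists only for the steps where an interaction is flagged, so detailed interactive dynamics never pollute the global state outside that window.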
no code implementations • 13 Oct 2021 • Thomas P Quinn, Sunil Gupta, Svetha Venkatesh, Vuong Le
This article is a field guide to transparent model design.
no code implementations • 29 Sep 2021 • Majid Abdolshah, Hung Le, Thommen Karimpanal George, Vuong Le, Sunil Gupta, Santu Rana, Svetha Venkatesh
Whilst Generative Adversarial Networks (GANs) generate visually appealing high resolution images, the latent representations (or codes) of these models do not allow controllable changes on the semantic attributes of the generated images.
no code implementations • 25 Jun 2021 • Long Hoang Dang, Thao Minh Le, Vuong Le, Truyen Tran
Toward reaching this goal, we propose an object-oriented reasoning approach in which video is abstracted as a dynamic stream of interacting objects.
1 code implementation • 20 May 2021 • Binh Nguyen-Thai, Vuong Le, Catherine Morgan, Nadia Badawi, Truyen Tran, Svetha Venkatesh
The absence or abnormality of fidgety movements of joints or limbs is strongly indicative of cerebral palsy in infants.
no code implementations • 12 Apr 2021 • Long Hoang Dang, Thao Minh Le, Vuong Le, Truyen Tran
Video question answering (Video QA) presents a powerful testbed for human-like intelligent behaviors.
no code implementations • CVPR 2021 • Romero Morais, Vuong Le, Svetha Venkatesh, Truyen Tran
Their interactions are sparse in time, hence more faithful to the true underlying nature and more robust in inference and learning.
no code implementations • 10 Dec 2020 • Thomas P. Quinn, Stephan Jacobs, Manisha Senadeera, Vuong Le, Simon Coghlan
Our title alludes to the three Christmas ghosts encountered by Ebenezer Scrooge in "A Christmas Carol", who guide Ebenezer through the past, present, and future of Christmas holiday events.
no code implementations • 5 Nov 2020 • Hung Tran, Vuong Le, Truyen Tran
We design a Goal-driven Trajectory Prediction model, a dual-channel neural network that realizes this intuition.
no code implementations • 18 Oct 2020 • Thao Minh Le, Vuong Le, Svetha Venkatesh, Truyen Tran
Video QA challenges modelers on multiple fronts.
1 code implementation • 20 Aug 2020 • Romero Morais, Vuong Le, Truyen Tran, Svetha Venkatesh
We propose Hierarchical Encoder-Refresher-Anticipator, a multi-level neural machine that can learn the structure of human activities by observing a partial hierarchy of events and rolling out such structure into a future prediction at multiple levels of abstraction.
no code implementations • 18 Aug 2020 • Thomas P. Quinn, Manisha Senadeera, Stephan Jacobs, Simon Coghlan, Vuong Le
These consequences could erode public trust in AI, which could in turn undermine trust in our healthcare institutions.
no code implementations • 10 Jun 2020 • Haripriya Harikumar, Vuong Le, Santu Rana, Sourangshu Bhattacharya, Sunil Gupta, Svetha Venkatesh
Recently, it has been shown that deep learning models are vulnerable to Trojan attacks, where an attacker can install a backdoor during training time to make the resultant model misidentify samples contaminated with a small trigger patch.
1 code implementation • 30 Apr 2020 • Thao Minh Le, Vuong Le, Svetha Venkatesh, Truyen Tran
We present Language-binding Object Graph Network, the first neural reasoning method with dynamic relational structures across both visual and textual domains with applications in visual question answering.
1 code implementation • CVPR 2020 • Thao Minh Le, Vuong Le, Svetha Venkatesh, Truyen Tran
Video question answering (VideoQA) is challenging as it requires modeling capacity to distill dynamic visual artifacts and distant relations and to associate them with linguistic concepts.
Ranked #3 on Audio-Visual Question Answering (AVQA) on the AVQA benchmark.
no code implementations • 10 Jul 2019 • Thao Minh Le, Vuong Le, Svetha Venkatesh, Truyen Tran
While recent advances in lingual and visual question answering have enabled sophisticated representations and neural reasoning mechanisms, major challenges in Video QA remain on dynamic grounding of concepts, relations and actions to support the reasoning process.
5 code implementations • ICCV 2019 • Dong Gong, Lingqiao Liu, Vuong Le, Budhaditya Saha, Moussa Reda Mansour, Svetha Venkatesh, Anton van den Hengel
At the test stage, the learned memory will be fixed, and the reconstruction is obtained from a few selected memory records of the normal data.
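The test-stage read from a fixed memory can be sketched with cosine-similarity addressing over a small prototype bank, keeping only a few selected records. The hard top-k selection here is an illustrative stand-in for the paper's sparse addressing; the prototype matrix and latent codes are toy values.

```python
import numpy as np

def memory_read(z, memory, top_k=2):
    """Reconstruct a latent code from a fixed memory of normal patterns.
    Addressing uses cosine similarity; only the top-k records contribute
    (a hard, illustrative stand-in for sparse soft addressing)."""
    sims = memory @ z / (np.linalg.norm(memory, axis=1) * np.linalg.norm(z) + 1e-8)
    idx = np.argsort(sims)[-top_k:]      # keep a few selected memory records
    w = np.exp(sims[idx])
    w = w / w.sum()
    return w @ memory[idx]               # convex combination of memory items

memory = np.eye(4)                       # 4 "normal" prototypes, fixed at test time
z_normal = np.array([0.9, 0.1, 0.0, 0.0])   # near a prototype: reconstructs well
z_anomaly = np.array([0.5, 0.5, 0.5, 0.5])  # far from all prototypes
err_normal = np.linalg.norm(memory_read(z_normal, memory) - z_normal)
err_anomaly = np.linalg.norm(memory_read(z_anomaly, memory) - z_anomaly)
```

Because anomalous inputs can only be reconstructed as combinations of normal memory records, their reconstruction error stays high, which is what makes the error usable as an anomaly score.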
1 code implementation • CVPR 2019 • Romero Morais, Vuong Le, Truyen Tran, Budhaditya Saha, Moussa Mansour, Svetha Venkatesh
Appearance features have been widely used in video anomaly detection even though they contain complex entangled factors.
Ranked #6 on Video Anomaly Detection on HR-UBnormal.