1 code implementation • ICCV 2021 • Deniz Engin, François Schnitzler, Ngoc Q. K. Duong, Yannis Avrithis
High-level understanding of stories in video such as movies and TV shows from raw data is extremely challenging.
Ranked #1 on Video Question Answering on KnowIT VQA
1 code implementation • 18 Oct 2020 • Huy Phan, Huy Le Nguyen, Oliver Y. Chén, Philipp Koch, Ngoc Q. K. Duong, Ian McLoughlin, Alfred Mertins
Existing generative adversarial networks (GANs) for speech enhancement solely rely on the convolution operation, which may obscure temporal dependencies across the sequence input.
no code implementations • 11 Sep 2020 • Huy Phan, Lam Pham, Philipp Koch, Ngoc Q. K. Duong, Ian McLoughlin, Alfred Mertins
Sound event localization and detection (SELD) has commonly been tackled using multitask models.
no code implementations • ICCV 2019 • Romain Cohendet, Claire-Hélène Demarty, Ngoc Q. K. Duong, Martin Engilberge
Humans share a strong tendency to memorize/forget some of the visual information they encounter.
no code implementations • 19 Apr 2018 • Sanjeel Parekh, Slim Essid, Alexey Ozerov, Ngoc Q. K. Duong, Patrick Pérez, Gaël Richard
Audio-visual representation learning is an important task from the perspective of designing machines with the ability to understand complex events.
no code implementations • 27 Mar 2018 • Huy V. Vo, Ngoc Q. K. Duong, Patrick Pérez
Scene-agnostic visual inpainting remains very challenging despite progress in patch-based methods.
no code implementations • 24 Feb 2015 • Ngoc Q. K. Duong, Hien-Thanh Duong
Audio fingerprinting, also known as audio hashing, is a well-known and powerful technique for audio identification and synchronization.