no code implementations • 30 Jan 2024 • Mehdi Noroozi, Isma Hadji, Brais Martinez, Adrian Bulat, Georgios Tzimiropoulos
We show that the combination of a spatially distilled U-Net and a fine-tuned decoder, using only a single step, outperforms state-of-the-art methods that require 200 steps.
no code implementations • 24 Jan 2024 • Hai X. Pham, Isma Hadji, Xinnuo Xu, Ziedune Degutyte, Jay Rainey, Evangelos Kazakos, Afsaneh Fazly, Georgios Tzimiropoulos, Brais Martinez
The key technological enabler is a novel mechanism for automatic question-answer generation from procedural text which can ingest large amounts of textual instructions and produce exhaustive in-domain QA training data.
no code implementations • ICCV 2023 • Mohamed Ashraf Abdelsalam, Samrudhdhi B. Rangrej, Isma Hadji, Nikita Dvornik, Konstantinos G. Derpanis, Afsaneh Fazly
While most previous work focuses on the problem of data scarcity in procedural video datasets, another core challenge of future anticipation is how to account for multiple plausible future realizations in natural settings.
no code implementations • CVPR 2023 • Nikita Dvornik, Isma Hadji, Ran Zhang, Konstantinos G. Derpanis, Animesh Garg, Richard P. Wildes, Allan D. Jepson
This motivates the need to temporally localize the instruction steps in such videos, i.e., the task called key-step localization.
1 code implementation • 10 Oct 2022 • Nikita Dvornik, Isma Hadji, Hai Pham, Dhaivat Bhatt, Brais Martinez, Afsaneh Fazly, Allan D. Jepson
In this setup, we seek the optimal step ordering consistent with the procedure flow graph and a given video.
1 code implementation • CVPR 2022 • He Zhao, Isma Hadji, Nikita Dvornik, Konstantinos G. Derpanis, Richard P. Wildes, Allan D. Jepson
Our model is based on a transformer equipped with a memory module, which maps the start and goal observations to a sequence of plausible actions.
no code implementations • NeurIPS 2021 • Nikita Dvornik, Isma Hadji, Konstantinos G. Derpanis, Animesh Garg, Allan D. Jepson
In our experiments, we show that Drop-DTW is a robust similarity measure for sequence retrieval and demonstrate its effectiveness as a training loss on diverse applications.
1 code implementation • CVPR 2021 • Isma Hadji, Konstantinos G. Derpanis, Allan D. Jepson
We introduce a weakly supervised method for representation learning based on aligning temporal sequences (e.g., videos) of the same process (e.g., human action).
no code implementations • 30 Nov 2020 • Isma Hadji, Richard P. Wildes
A standard explanation of this result is that these filters reflect the structure of the images that they have been exposed to during training: Natural images are typically locally composed of oriented contours at various scales, and oriented bandpass filters are matched to such structure.
no code implementations • ECCV 2018 • Isma Hadji, Richard P. Wildes
This paper introduces a new large-scale dynamic texture dataset.
3 code implementations • 23 Mar 2018 • Isma Hadji, Richard P. Wildes
This document will review the most prominent proposals using multilayer convolutional architectures.
1 code implementation • ICCV 2017 • Isma Hadji, Richard P. Wildes
Another key aspect of the network is its recurrent nature, whereby the output of each layer of processing feeds back to the input.