Search Results for author: Isma Hadji

Found 12 papers, 5 papers with code

You Only Need One Step: Fast Super-Resolution with Stable Diffusion via Scale Distillation

no code implementations • 30 Jan 2024 • Mehdi Noroozi, Isma Hadji, Brais Martinez, Adrian Bulat, Georgios Tzimiropoulos

We show that the combination of a spatially distilled U-Net and a fine-tuned decoder, using only a single step, outperforms state-of-the-art methods that require 200 steps.

Image Super-Resolution

Graph Guided Question Answer Generation for Procedural Question-Answering

no code implementations • 24 Jan 2024 • Hai X. Pham, Isma Hadji, Xinnuo Xu, Ziedune Degutyte, Jay Rainey, Evangelos Kazakos, Afsaneh Fazly, Georgios Tzimiropoulos, Brais Martinez

The key technological enabler is a novel mechanism for automatic question-answer generation from procedural text which can ingest large amounts of textual instructions and produce exhaustive in-domain QA training data.

Answer Generation, Question-Answer-Generation, +1

GePSAn: Generative Procedure Step Anticipation in Cooking Videos

no code implementations • ICCV 2023 • Mohamed Ashraf Abdelsalam, Samrudhdhi B. Rangrej, Isma Hadji, Nikita Dvornik, Konstantinos G. Derpanis, Afsaneh Fazly

While most previous work focuses on the problem of data scarcity in procedural video datasets, another core challenge of future anticipation is how to account for multiple plausible future realizations in natural settings.

StepFormer: Self-supervised Step Discovery and Localization in Instructional Videos

no code implementations • CVPR 2023 • Nikita Dvornik, Isma Hadji, Ran Zhang, Konstantinos G. Derpanis, Animesh Garg, Richard P. Wildes, Allan D. Jepson

This motivates the need to temporally localize the instruction steps in such videos, i.e., the task called key-step localization.

Graph2Vid: Flow graph to Video Grounding for Weakly-supervised Multi-Step Localization

1 code implementation • 10 Oct 2022 • Nikita Dvornik, Isma Hadji, Hai Pham, Dhaivat Bhatt, Brais Martinez, Afsaneh Fazly, Allan D. Jepson

In this setup, we seek the optimal step ordering consistent with the procedure flow graph and a given video.

Video Grounding

P3IV: Probabilistic Procedure Planning from Instructional Videos with Weak Supervision

1 code implementation • CVPR 2022 • He Zhao, Isma Hadji, Nikita Dvornik, Konstantinos G. Derpanis, Richard P. Wildes, Allan D. Jepson

Our model is based on a transformer equipped with a memory module, which maps the start and goal observations to a sequence of plausible actions.

Drop-DTW: Aligning Common Signal Between Sequences While Dropping Outliers

no code implementations • NeurIPS 2021 • Nikita Dvornik, Isma Hadji, Konstantinos G. Derpanis, Animesh Garg, Allan D. Jepson

In our experiments, we show that Drop-DTW is a robust similarity measure for sequence retrieval and demonstrate its effectiveness as a training loss on diverse applications.

Dynamic Time Warping, Representation Learning, +1

Representation Learning via Global Temporal Alignment and Cycle-Consistency

1 code implementation • CVPR 2021 • Isma Hadji, Konstantinos G. Derpanis, Allan D. Jepson

We introduce a weakly supervised method for representation learning based on aligning temporal sequences (e.g., videos) of the same process (e.g., human action).

Action Classification, Dynamic Time Warping, +5

Why Convolutional Networks Learn Oriented Bandpass Filters: Theory and Empirical Support

no code implementations • 30 Nov 2020 • Isma Hadji, Richard P. Wildes

A standard explanation of this result is that these filters reflect the structure of the images that they have been exposed to during training: Natural images typically are locally composed of oriented contours at various scales and oriented bandpass filters are matched to such structure.