no code implementations • ECCV 2020 • Dimitri Zhukov, Jean-Baptiste Alayrac, Ivan Laptev, Josef Sivic
The annotation is particularly difficult for temporal action localization, where large parts of the video contain no action and consist only of background.
no code implementations • 9 Sep 2021 • Dimitri Zhukov, Ignacio Rocco, Ivan Laptev, Josef Sivic, Johannes L. Schönberger, Bugra Tekin, Marc Pollefeys
Contrary to the standard scenario of instance-level 3D reconstruction, where identical objects or scenes are present in all views, objects in different instructional videos may have large appearance variations given varying conditions and versions of the same product.
4 code implementations • ICCV 2019 • Antoine Miech, Dimitri Zhukov, Jean-Baptiste Alayrac, Makarand Tapaswi, Ivan Laptev, Josef Sivic
In this work, we propose instead to learn such embeddings from video data with readily available natural language annotations in the form of automatically transcribed narrations.
Ranked #4 on Temporal Action Localization on CrossTask
Action Localization • Long Video Retrieval (Background Removed) • +3
2 code implementations • CVPR 2019 • Dimitri Zhukov, Jean-Baptiste Alayrac, Ramazan Gokberk Cinbis, David Fouhey, Ivan Laptev, Josef Sivic
In this paper we investigate learning visual models for the steps of ordinary tasks using weak supervision via instructional narrations and an ordered list of steps instead of strong supervision via temporal annotations.
Ranked #5 on Temporal Action Localization on CrossTask