1 code implementation • 8 Jun 2021 • Ting-Ting Xie, Christos Tzelepis, Fan Fu, Ioannis Patras
Learning to localize actions in long, cluttered, and untrimmed videos is a hard task, that in the literature has typically been addressed assuming the availability of large amounts of annotated training samples for each class -- either in a fully-supervised setting, where action boundaries are known, or in a weakly-supervised setting, where only class labels are known for each video.
no code implementations • 8 Mar 2021 • Fan Fu, TingTing Xie, Ioannis Patras, Sepehr Jalali
Understanding interactions between objects in an image is an important element for generating captions.