Search Results for author: Elad Ben-Avraham

Found 4 papers, 1 paper with code

Structured Video Tokens @ Ego4D PNR Temporal Localization Challenge 2022

no code implementations • 15 Jun 2022 • Elad Ben-Avraham, Roei Herzig, Karttikeya Mangalam, Amir Bar, Anna Rohrbach, Leonid Karlinsky, Trevor Darrell, Amir Globerson

First, as both images and videos contain structured information, we enrich a transformer model with a set of object tokens that can be used across images and videos.

Point-of-no-return (PNR) temporal localization, Temporal Localization

Bringing Image Scene Structure to Video via Frame-Clip Consistency of Object Tokens

no code implementations • 13 Jun 2022 • Elad Ben-Avraham, Roei Herzig, Karttikeya Mangalam, Amir Bar, Anna Rohrbach, Leonid Karlinsky, Trevor Darrell, Amir Globerson

We explore a particular instantiation of scene structure, namely a Hand-Object Graph, consisting of hands and objects with their locations as nodes, and physical relations of contact/no-contact as edges.

Action Recognition, Video Understanding
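
The Hand-Object Graph described in the entry above can be pictured as a small data structure. The following is only an illustrative sketch, not the authors' implementation: the class names (`NodeKind`, `Node`, `HandObjectGraph`), the `(x1, y1, x2, y2)` box format, and the helper methods are all assumptions introduced here.

```python
from dataclasses import dataclass, field
from enum import Enum


class NodeKind(Enum):
    HAND = "hand"
    OBJECT = "object"


@dataclass(frozen=True)
class Node:
    kind: NodeKind
    # Location of the hand or object; the (x1, y1, x2, y2) format is an assumption.
    box: tuple


@dataclass
class HandObjectGraph:
    """Hypothetical container: hands/objects as nodes, contact flags as edges."""
    nodes: list = field(default_factory=list)
    # Maps (hand index, object index) to True (contact) or False (no contact).
    edges: dict = field(default_factory=dict)

    def add_node(self, kind, box):
        """Append a node and return its index."""
        self.nodes.append(Node(kind, tuple(box)))
        return len(self.nodes) - 1

    def set_contact(self, hand, obj, in_contact):
        """Record a contact/no-contact relation between a hand and an object."""
        if self.nodes[hand].kind is not NodeKind.HAND:
            raise ValueError("first index must refer to a hand node")
        if self.nodes[obj].kind is not NodeKind.OBJECT:
            raise ValueError("second index must refer to an object node")
        self.edges[(hand, obj)] = bool(in_contact)

    def objects_in_contact(self, hand):
        """Return indices of objects currently marked as touching the given hand."""
        return [o for (h, o), touching in self.edges.items() if h == hand and touching]


# Example frame: one hand touching a cup but not a plate (made-up coordinates).
graph = HandObjectGraph()
left_hand = graph.add_node(NodeKind.HAND, (10, 20, 50, 60))
cup = graph.add_node(NodeKind.OBJECT, (40, 30, 90, 80))
plate = graph.add_node(NodeKind.OBJECT, (120, 40, 200, 90))
graph.set_contact(left_hand, cup, True)
graph.set_contact(left_hand, plate, False)
```

Storing edges only between hand and object nodes mirrors the contact/no-contact relation named in the abstract; how the paper itself encodes these annotations is not stated in this listing.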

Object-Region Video Transformers

1 code implementation • CVPR 2022 • Roei Herzig, Elad Ben-Avraham, Karttikeya Mangalam, Amir Bar, Gal Chechik, Anna Rohrbach, Trevor Darrell, Amir Globerson

In this work, we present Object-Region Video Transformers (ORViT), an object-centric approach that extends video transformer layers with a block that directly incorporates object representations.

Action Detection, Few-Shot action recognition, +3
