Action Triplet Recognition
8 papers with code • 6 benchmarks • 4 datasets
Recognising action as a triplet of subject verb and object. Example HOI = Human Object Interaction, Surgical IVT = Instrument Verb Target, etc.
Recently, researchers have made significant progress combining the advances in deep learning for learning feature representations with reinforcement learning.
Rendezvous: Attention Mechanisms for the Recognition of Surgical Action Triplets in Endoscopic Videos
To achieve this task, we introduce our new model, the Rendezvous (RDV), which recognizes triplets directly from surgical videos by leveraging attention at two different levels.
In this paper, we present the challenge setup and assessment of the state-of-the-art deep learning methods proposed by the participants during the challenge.
Recognition of surgical activity is an essential component to develop context-aware decision support for the operating room.
CholecTriplet2022: Show me a tool and tell me the triplet -- an endoscopic vision challenge for surgical action triplet detection
This paper presents the CholecTriplet2022 challenge, which extends surgical action triplet modeling from recognition to detection.
Correct transfer of these methods to surgery, as described and conducted in this work, leads to substantial performance gains over generic uses of SSL - up to 7. 4% on phase recognition and 20% on tool presence detection - as well as state-of-the-art semi-supervised phase recognition approaches by up to 14%.
Focusing more on the verbs, our RiT explores the connectedness of current and past frames to learn temporal attention-based features for enhanced triplet recognition.