Rendezvous in Time: An Attention-based Temporal Fusion approach for Surgical Triplet Recognition

30 Nov 2022 · Saurav Sharma, Chinedu Innocent Nwoye, Didier Mutter, Nicolas Padoy

One of the recent advances in surgical AI is the recognition of surgical activities as triplets of (instrument, verb, target). Although they provide detailed information for computer-assisted intervention, current triplet recognition approaches rely only on single-frame features. Exploiting the temporal cues from earlier frames would improve the recognition of surgical action triplets from videos. In this paper, we propose Rendezvous in Time (RiT), a deep learning model that extends the state-of-the-art model, Rendezvous, with temporal modeling. Focusing more on the verbs, RiT explores the connectedness of current and past frames to learn temporal attention-based features for enhanced triplet recognition. We validate our proposal on the challenging surgical triplet dataset, CholecT45, demonstrating improved recognition of the verb and triplet, along with other interactions involving the verb such as (instrument, verb). Qualitative results show that RiT produces smoother predictions for most triplet instances than state-of-the-art methods. In summary, we present a novel attention-based approach that leverages the temporal fusion of video frames to model the evolution of surgical actions and exploits this temporal context for surgical triplet recognition.
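To make the idea of attention-based temporal fusion concrete, here is a minimal sketch in plain Python. It is not the RiT architecture: it assumes the current frame's feature vector serves as the query and that every frame in the clip (past and current) serves as both key and value, with no learned projections. The function name `temporal_attention_fusion`, the clip layout, and the toy feature values are all hypothetical and chosen only for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def temporal_attention_fusion(frame_features):
    """Fuse a clip's per-frame features into one vector for the current frame.

    frame_features: list of equal-length feature vectors ordered oldest to
    newest; the last entry is the current frame (an assumed layout).
    Returns (fused_vector, attention_weights).
    """
    query = frame_features[-1]  # the current frame acts as the query
    dim = len(query)
    scale = 1.0 / math.sqrt(dim)
    # Scaled dot-product similarity between the current frame and each frame.
    scores = [
        scale * sum(q * k for q, k in zip(query, frame))
        for frame in frame_features
    ]
    weights = softmax(scores)
    # Weighted sum of frame features (frames double as keys and values).
    fused = [
        sum(w * frame[i] for w, frame in zip(weights, frame_features))
        for i in range(dim)
    ]
    return fused, weights


# Toy clip: three past frames plus the current one, 2-D features (made-up values).
clip = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, 0.0]]
fused, weights = temporal_attention_fusion(clip)
```

In this toy clip, the second frame is orthogonal to the current frame and therefore receives the smallest weight. A real model would typically learn separate query, key, and value projections and operate on high-dimensional features from a visual backbone.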

Code

camma-public/rendezvous-in-time (official)

Tasks

Action Triplet Recognition

Datasets

CholecT50

CholecT45

Results from the Paper

Ranked #1 on Action Triplet Recognition on CholecT45 (cross-val)

Task	Dataset	Model	Metric Name	Metric Value	Global Rank
Action Triplet Recognition	CholecT45 (cross-val)	RiT: Rendezvous-in-Time	mAP	29.7±2.6	#1
Action Triplet Recognition	CholecT50 (Challenge)	RiT: Rendezvous-in-Time	mAP	30.94	#11

Methods

Rendezvous
