arXiv preprint 2019

Comparative Analysis of CNN-based Spatiotemporal Reasoning in Videos

fubel/stmodeling

All techniques are trained end-to-end together with a CNN feature extraction part and evaluated on two publicly available benchmarks: the Jester and the Something-Something datasets.

ACTION RECOGNITION IN VIDEOS, HUMAN-OBJECT INTERACTION DETECTION