Prior work has studied different visual modalities in isolation and developed separate architectures for recognition of images, videos, and 3D data.
Ranked #1 on Scene Recognition on SUN-RGBD (using extra training data)
18 Nov 2021 • Haoqi Fan, Tullie Murrell, Heng Wang, Kalyan Vasudev Alwala, Yanghao Li, Yilei Li, Bo Xiong, Nikhila Ravi, Meng Li, Haichuan Yang, Jitendra Malik, Ross Girshick, Matt Feiszli, Aaron Adcock, Wan-Yen Lo, Christoph Feichtenhofer
We introduce PyTorchVideo, an open-source deep-learning library that provides a rich set of modular, efficient, and reproducible components for a variety of video understanding tasks, including classification, detection, self-supervised learning, and low-level processing.
We present Worldsheet, a method for novel view synthesis using just a single RGB image as input.
We address these challenges by introducing PyTorch3D, a library of modular, efficient, and differentiable operators for 3D deep learning.
We propose C3DPO, a method for extracting 3D models of deformable objects from 2D keypoint annotations in unconstrained images.