Deep Patch Visual Odometry

princeton-vl/dpvo 8 Aug 2022

We propose Deep Patch Visual Odometry (DPVO), a new deep learning system for monocular Visual Odometry (VO).

Monocular Visual Odometry

62
1.56 stars / hour

3D Vision with Transformers: A Survey

lahoud/3d-vision-transformers 8 Aug 2022

The success of the transformer architecture in natural language processing has recently triggered attention in the computer vision field.

Natural Language Processing, Pose Estimation

86
1.40 stars / hour

Reconstructing 3D Human Pose by Watching Humans in the Mirror

zju3dv/EasyMocap CVPR 2021

In this paper, we introduce the new task of reconstructing 3D human pose from a single image in which we can see the person and the person's image through a mirror.

3D Pose Estimation

1,617
0.75 stars / hour

Elucidating the Design Space of Diffusion-Based Generative Models

lucidrains/imagen-pytorch 1 Jun 2022

We argue that the theory and practice of diffusion-based generative models are currently unnecessarily convoluted and seek to remedy the situation by presenting a design space that clearly separates the concrete design choices.

Image Generation

4,656
0.68 stars / hour

MobileNeRF: Exploiting the Polygon Rasterization Pipeline for Efficient Neural Field Rendering on Mobile Architectures

google-research/jax3d 30 Jul 2022

Neural Radiance Fields (NeRFs) have demonstrated amazing ability to synthesize images of 3D scenes from novel views.

Novel View Synthesis

201
0.66 stars / hour

YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors

wongkinyiu/yolov7 6 Jul 2022

YOLOv7 surpasses all known object detectors in both speed and accuracy in the range from 5 FPS to 160 FPS and has the highest accuracy 56.8% AP among all known real-time object detectors with 30 FPS or higher on GPU V100.

Real-Time Object Detection

4,320
0.63 stars / hour

Ivy: Templated Deep Learning for Inter-Framework Portability

ivy-dl/ivy 4 Feb 2021

We introduce Ivy, a templated Deep Learning (DL) framework which abstracts existing DL frameworks.

4,918
0.59 stars / hour

Expanding Language-Image Pretrained Models for General Video Recognition

microsoft/videox 4 Aug 2022

Extensive experiments demonstrate that our approach is effective and can be generalized to different video recognition scenarios.

Action Classification, Action Recognition, +2

361
0.53 stars / hour