3D Vision with Transformers: A Survey

lahoud/3d-vision-transformers 8 Aug 2022

The success of the transformer architecture in natural language processing has recently triggered attention in the computer vision field.

Natural Language Processing, Pose Estimation

66
1.67 stars / hour

Hybrid Spectrogram and Waveform Source Separation

facebookresearch/demucs 5 Nov 2021

Source separation models work in either the spectrogram or the waveform domain.

Music Source Separation

3,896
0.78 stars / hour

MobileNeRF: Exploiting the Polygon Rasterization Pipeline for Efficient Neural Field Rendering on Mobile Architectures

google-research/jax3d 30 Jul 2022

Neural Radiance Fields (NeRFs) have demonstrated amazing ability to synthesize images of 3D scenes from novel views.

Novel View Synthesis

190
0.73 stars / hour

Elucidating the Design Space of Diffusion-Based Generative Models

lucidrains/imagen-pytorch 1 Jun 2022

We argue that the theory and practice of diffusion-based generative models are currently unnecessarily convoluted and seek to remedy the situation by presenting a design space that clearly separates the concrete design choices.

Image Generation

4,615
0.66 stars / hour

Reconstructing 3D Human Pose by Watching Humans in the Mirror

zju3dv/EasyMocap CVPR 2021

In this paper, we introduce the new task of reconstructing 3D human pose from a single image in which we can see the person and the person's image through a mirror.

3D Pose Estimation

1,567
0.60 stars / hour

Multi-Scale 2D Temporal Adjacent Networks for Moment Localization with Natural Language

microsoft/2D-TAN 4 Dec 2020

Moment localization with natural language is a challenging problem because a target moment may take place in the context of other temporal moments in the untrimmed video.

356
0.55 stars / hour

YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors

wongkinyiu/yolov7 6 Jul 2022

YOLOv7 surpasses all known object detectors in both speed and accuracy in the range from 5 FPS to 160 FPS and has the highest accuracy, 56.8% AP, among all known real-time object detectors with 30 FPS or higher on GPU V100.

Real-Time Object Detection

4,256
0.54 stars / hour

Ivy: Templated Deep Learning for Inter-Framework Portability

ivy-dl/ivy 4 Feb 2021

We introduce Ivy, a templated Deep Learning (DL) framework which abstracts existing DL frameworks.

4,870
0.52 stars / hour