Omnivore: A Single Model for Many Visual Modalities

facebookresearch/omnivore 20 Jan 2022

Prior work has studied different visual modalities in isolation and developed separate architectures for recognition of images, videos, and 3D data.

Action Classification Action Recognition +3

1.11 stars / hour

Stitch it in Time: GAN-Based Facial Editing of Real Videos

rotemtzaban/STIT 20 Jan 2022

The ability of Generative Adversarial Networks to encode rich semantics within their latent space has been widely adopted for facial image editing.

Facial Editing

0.72 stars / hour
0.72 stars / hour

A ConvNet for the 2020s

facebookresearch/ConvNeXt 10 Jan 2022

The "Roaring 20s" of visual recognition began with the introduction of Vision Transformers (ViTs), which quickly superseded ConvNets as the state-of-the-art image classification model.

 Ranked #1 on Domain Generalization on ImageNet-Sketch (using extra training data)

Domain Generalization Image Classification +2

0.63 stars / hour

Instant Neural Graphics Primitives with a Multiresolution Hash Encoding

nvlabs/instant-ngp 16 Jan 2022

Neural graphics primitives, parameterized by fully connected neural networks, can be costly to train and evaluate.

Neural Radiance Caching

0.63 stars / hour

Colossal-AI: A Unified Deep Learning System For Large-Scale Parallel Training

hpcaitech/colossalai 28 Oct 2021

The Transformer architecture has improved the performance of deep learning models in domains such as Computer Vision and Natural Language Processing.

0.62 stars / hour

Deep Learning for Identifying Metastatic Breast Cancer

3dimaging/DeepLearningCamelyon 18 Jun 2016

The International Symposium on Biomedical Imaging (ISBI) held a grand challenge to evaluate computational systems for the automated detection of metastatic breast cancer in whole slide images of sentinel lymph node biopsies.

General Classification Image Classification +1

0.61 stars / hour

General-Purpose Question-Answering with Macaw

allenai/macaw 6 Sep 2021

Despite the successes of pretrained language models, there are still few high-quality, general-purpose QA systems that are freely available.

Generative Question Answering

0.46 stars / hour

Explaining in Style: Training a GAN to explain a classifier in StyleSpace

google/explaining-in-style ICCV 2021

A natural source for such attributes is the StyleSpace of StyleGAN, which is known to generate semantically meaningful dimensions in the image.

Image Classification

0.39 stars / hour

CoAtNet: Marrying Convolution and Attention for All Data Sizes

xmu-xiaoma666/External-Attention-pytorch NeurIPS 2021

Transformers have attracted increasing interests in computer vision, but they still fall behind state-of-the-art convolutional networks.

 Ranked #1 on Image Classification on ImageNet (using extra training data)

Image Classification

0.31 stars / hour