Omnivore: A Single Model for Many Visual Modalities

facebookresearch/omnivore 20 Jan 2022

Prior work has studied different visual modalities in isolation and developed separate architectures for recognition of images, videos, and 3D data.

Action Classification, Action Recognition +3

55
1.88 stars / hour
48
1.86 stars / hour

Instant Neural Graphics Primitives with a Multiresolution Hash Encoding

nvlabs/instant-ngp 16 Jan 2022

Neural graphics primitives, parameterized by fully connected neural networks, can be costly to train and evaluate.

2,014
1.63 stars / hour

A ConvNet for the 2020s

facebookresearch/ConvNeXt 10 Jan 2022

The "Roaring 20s" of visual recognition began with the introduction of Vision Transformers (ViTs), which quickly superseded ConvNets as the state-of-the-art image classification model.

Ranked #1 on Domain Generalization on ImageNet-Sketch (using extra training data)

Domain Generalization, Image Classification +2

2,540
1.30 stars / hour

Extracting Triangular 3D Models, Materials, and Lighting From Images

nvlabs/tiny-cuda-nn 24 Nov 2021

We present an efficient method for joint optimization of topology, materials and lighting from multi-view image observations.

686
1.00 stars / hour

Masked Autoencoders Are Scalable Vision Learners

facebookresearch/mae 11 Nov 2021

Our MAE approach is simple: we mask random patches of the input image and reconstruct the missing pixels.

Domain Generalization, Object Detection +3

2,254
0.80 stars / hour
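
The masking step in the MAE summary above (hide random patches, then reconstruct the missing pixels) can be illustrated with a small NumPy sketch. This is not the code from facebookresearch/mae: the function names `random_patch_mask` and `masked_mse`, the patch size, the 0.75 mask ratio, and the choice to score the reconstruction with a mean squared error over the hidden patches are all assumptions made for illustration, and the encoder and decoder are left out entirely.

```python
import numpy as np


def random_patch_mask(image, patch_size=16, mask_ratio=0.75, seed=0):
    """Split an (H, W, C) image into non-overlapping patches and hide a random subset.

    Hypothetical helper for illustration; patch_size and mask_ratio are assumed defaults.
    Returns (visible_patches, visible_idx, masked_patches, masked_idx).
    """
    h, w, c = image.shape
    if h % patch_size or w % patch_size:
        raise ValueError("image height and width must be multiples of patch_size")
    gh, gw = h // patch_size, w // patch_size

    # (gh, p, gw, p, c) -> (gh, gw, p, p, c) -> (num_patches, p * p * c)
    patches = (
        image.reshape(gh, patch_size, gw, patch_size, c)
        .transpose(0, 2, 1, 3, 4)
        .reshape(gh * gw, -1)
    )

    # Shuffle the patch indices and hide the first num_masked of them.
    rng = np.random.default_rng(seed)
    num_masked = int(round(mask_ratio * gh * gw))
    perm = rng.permutation(gh * gw)
    masked_idx = np.sort(perm[:num_masked])
    visible_idx = np.sort(perm[num_masked:])
    return patches[visible_idx], visible_idx, patches[masked_idx], masked_idx


def masked_mse(predicted, target):
    """Mean squared pixel error over the hidden patches only (assumed loss)."""
    return float(np.mean((predicted - target) ** 2))


# Usage: a 32x32 RGB image cut into 8x8 patches gives 16 patches in total.
image = np.arange(32 * 32 * 3, dtype=np.float32).reshape(32, 32, 3)
visible, visible_idx, targets, masked_idx = random_patch_mask(
    image, patch_size=8, mask_ratio=0.75
)

# A real model would predict `targets` from `visible`; a zero prediction stands in here.
baseline_loss = masked_mse(np.zeros_like(targets), targets)
```

In this sketch only the visible patches would be passed to a model, and the hidden ones serve purely as reconstruction targets.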

The effect of information controls on developers in China: An analysis of censorship in Chinese open source projects

citizenlab/chat-censorship COLING 2018

Censorship of Internet content in China is understood to operate through a system of intermediary liability whereby service providers are liable for the content on their platforms.

469
0.72 stars / hour

Colossal-AI: A Unified Deep Learning System For Large-Scale Parallel Training

hpcaitech/colossalai 28 Oct 2021

The Transformer architecture has improved the performance of deep learning models in domains such as Computer Vision and Natural Language Processing.

609
0.58 stars / hour

UnifiedSKG: Unifying and Multi-Tasking Structured Knowledge Grounding with Text-to-Text Language Models

hkunlp/unifiedskg 16 Jan 2022

Structured knowledge grounding (SKG) leverages structured knowledge to complete user requests, such as semantic parsing over databases and question answering over knowledge bases.

Few-Shot Learning, Question Answering +2

83
0.55 stars / hour

Deep Learning for Identifying Metastatic Breast Cancer

3dimaging/DeepLearningCamelyon 18 Jun 2016

The International Symposium on Biomedical Imaging (ISBI) held a grand challenge to evaluate computational systems for the automated detection of metastatic breast cancer in whole slide images of sentinel lymph node biopsies.

General Classification, Image Classification +1

60
0.54 stars / hour