TorchScale: Transformers at Scale

microsoft/torchscale 23 Nov 2022

Large Transformers have achieved state-of-the-art performance across many tasks.

Language Modelling, Machine Translation, +1

506
2.65 stars / hour

Human-level play in the game of Diplomacy by combining language models with strategic reasoning

facebookresearch/diplomacy_cicero Science 2022

Despite much progress in training AI systems to imitate human language, building agents that use language to communicate intentionally with humans in interactive environments remains a major challenge.

620
2.53 stars / hour

DiffusionDet: Diffusion Model for Object Detection

shoufachen/diffusiondet 17 Nov 2022

At inference, the model progressively refines a set of randomly generated boxes into the output results.

Denoising, object-detection, +1

1,247
1.01 stars / hour

Towards Robust Blind Face Restoration with Codebook Lookup Transformer

sczhou/codeformer 22 Jun 2022

In this paper, we demonstrate that a learned discrete codebook prior in a small proxy space largely reduces the uncertainty and ambiguity of restoration mapping by casting blind face restoration as a code prediction task, while providing rich visual atoms for generating high-quality faces.

Blind Face Restoration

1,481
0.73 stars / hour

SinDiffusion: Learning a Diffusion Model from a Single Natural Image

weilunwang/sindiffusion 22 Nov 2022

We present SinDiffusion, leveraging denoising diffusion models to capture the internal distribution of patches from a single natural image.

Denoising, Image Generation, +1

88
0.72 stars / hour

MetaFormer Baselines for Vision

facebookresearch/xformers 24 Oct 2022

By simply applying depthwise separable convolutions as the token mixer in the bottom stages and vanilla self-attention in the top stages, the resulting model CAFormer sets a new record on ImageNet-1K: it achieves an accuracy of 85.5% at 224x224 resolution, under normal supervised training without external data or distillation.

Image Classification

1,689
0.72 stars / hour

EVA: Exploring the Limits of Masked Visual Representation Learning at Scale

baaivision/eva 14 Nov 2022

We launch EVA, a vision-centric foundation model to explore the limits of visual representation at scale using only publicly accessible data.

Ranked #1 on Object Detection on LVIS v1.0 val (using extra training data)

Action Classification, Action Recognition, +6

213
0.57 stars / hour

LiT: Zero-Shot Transfer with Locked-image text Tuning

mlfoundations/open_clip CVPR 2022

This paper presents contrastive-tuning, a simple method employing contrastive training to align image and text models while still taking advantage of their pre-training.

Image Classification, Retrieval, +2

2,378
0.51 stars / hour

Galactica: A Large Language Model for Science

paperswithcode/galai 16 Nov 2022

We believe these results demonstrate the potential for language models as a new interface for science.

Classification, Language Modelling, +4

1,617
0.49 stars / hour