We present the Pathways Autoregressive Text-to-Image (Parti) model, which generates high-fidelity photorealistic images and supports content-rich synthesis involving complex compositions and world knowledge.
Ranked #1 on Text-to-Image Generation on COCO
The 2D-3D coordinates and corresponding weights are treated as intermediate variables learned by minimizing the KL divergence between the predicted and target pose distributions.
Ranked #4 on 6D Pose Estimation using RGB on LineMOD
We introduce Codex, a GPT language model fine-tuned on publicly available code from GitHub, and study its Python code-writing capabilities.
Ranked #1 on Code Generation on APPS
We introduce Nocturne, a new 2D driving simulator for investigating multi-agent coordination under partial observability.
In recent years, deep generative models have attracted increasing interest due to their capacity to model complex distributions.
We introduce ArtBench-10, the first class-balanced, high-quality, cleanly annotated, and standardized dataset for benchmarking artwork generation.
The modified VTE is termed the Strided Transformer Encoder (STE), which is built upon the outputs of VTE.
Ranked #1 on 3D Human Pose Estimation on HumanEva-I
We present a generative image inpainting system to complete images with free-form masks and guidance.
Ranked #3 on Image Inpainting on Places2 val
Autonomous agents have made great strides in specialist domains like Atari games and Go.
Deviating from such models, we here introduce Latent Image Animator (LIA), a self-supervised auto-encoder that evades the need for structure representation.