The 2D-3D coordinates and corresponding weights are treated as intermediate variables learned by minimizing the KL divergence between the predicted and target pose distribution.
Ranked #4 on 6D Pose Estimation using RGB on LineMOD
We introduce Codex, a GPT language model fine-tuned on publicly available code from GitHub, and study its Python code-writing capabilities.
Ranked #1 on Code Generation on APPS
In recent years, deep generative models have attracted increasing interest due to their capacity to model complex distributions.
We introduce Nocturne, a new 2D driving simulator for investigating multi-agent coordination under partial observability.
Autonomous agents have made great strides in specialist domains like Atari games and Go.
Text-to-image generation has traditionally focused on finding better modeling assumptions for training on a fixed dataset.
Ranked #12 on Text-to-Image Generation on COCO (using extra training data)
We present a generative image inpainting system to complete images with free-form mask and guidance.
Ranked #3 on Image Inpainting on Places2 val
We introduce ArtBench-10, the first class-balanced, high-quality, cleanly annotated, and standardized dataset for benchmarking artwork generation.
In this paper, we introduce HaGRID (HAnd Gesture Recognition Image Dataset), an enormous dataset for hand gesture recognition (HGR) systems.
Our EdgeNeXt model with 1.3M parameters achieves 71.2% top-1 accuracy on ImageNet-1K, outperforming MobileViT with an absolute gain of 2.2% and a 28% reduction in FLOPs.
Ranked #37 on Semantic Segmentation on PASCAL VOC 2012 test