EPro-PnP: Generalized End-to-End Probabilistic Perspective-n-Points for Monocular Object Pose Estimation

tjiiv-cprg/epro-pnp CVPR 2022

The 2D-3D coordinates and corresponding weights are treated as intermediate variables learned by minimizing the KL divergence between the predicted and target pose distribution.

3D Object Detection 6D Pose Estimation using RGB +1

Evaluating Large Language Models Trained on Code

codedotal/gpt-code-clippy 7 Jul 2021

We introduce Codex, a GPT language model fine-tuned on publicly available code from GitHub, and study its Python code-writing capabilities.

Code Generation Language Modelling

Pythae: Unifying Generative Autoencoders in Python -- A Benchmarking Use Case

clementchadebec/benchmark_VAE 16 Jun 2022

In recent years, deep generative models have attracted increasing interest due to their capacity to model complex distributions.

Density Estimation Image Reconstruction +1

Free-Form Image Inpainting with Gated Convolution

zuruoke/watermark-removal ICCV 2019

We present a generative image inpainting system to complete images with free-form mask and guidance.

Feature Selection Image Inpainting

Scaling Autoregressive Models for Content-Rich Text-to-Image Generation

lucidrains/parti-pytorch 22 Jun 2022

We present the Pathways Autoregressive Text-to-Image (Parti) model, which generates high-fidelity photorealistic images and supports content-rich synthesis involving complex compositions and world knowledge.

Machine Translation Text to image generation +1

The ArtBench Dataset: Benchmarking Generative Models with Artworks

liaopeiyuan/artbench 22 Jun 2022

We introduce ArtBench-10, the first class-balanced, high-quality, cleanly annotated, and standardized dataset for benchmarking artwork generation.

Conditional Image Generation Unconditional Image Generation

Nocturne: a scalable driving benchmark for bringing multi-agent learning one step closer to the real world

facebookresearch/nocturne 20 Jun 2022

We introduce Nocturne, a new 2D driving simulator for investigating multi-agent coordination under partial observability.

Imitation Learning

Zero-Shot Text-to-Image Generation

borisdayma/dalle-mini 24 Feb 2021

Text-to-image generation has traditionally focused on finding better modeling assumptions for training on a fixed dataset.

Ranked #12 on Text-to-Image Generation on COCO (using extra training data)

Text to image generation Zero-Shot Text-to-Image Generation

HaGRID -- HAnd Gesture Recognition Image Dataset

hukenovs/hagrid 16 Jun 2022

In this paper, we introduce an enormous dataset, HaGRID (HAnd Gesture Recognition Image Dataset), for hand gesture recognition (HGR) systems.

Hand Detection Hand Gesture Recognition +1

RegionCLIP: Region-based Language-Image Pretraining

microsoft/regionclip CVPR 2022

We show that directly applying vision-language models such as CLIP to recognize image regions for object detection leads to poor performance due to a domain shift: CLIP was trained to match an image as a whole to a text description, without capturing the fine-grained alignment between image regions and text spans.

Image Classification Object Detection +2

