OnePose: One-Shot Object Pose Estimation without CAD Models

zju3dv/OnePose 24 May 2022

We propose a new method named OnePose for object pose estimation.

6D Pose Estimation Graph Attention

SymForce: Symbolic Computation and Code Generation for Robotics

symforce-org/symforce 17 Apr 2022

We present SymForce, a library for fast symbolic computation, code generation, and nonlinear optimization for robotics applications like computer vision, motion planning, and controls.

Code Generation Motion Planning

Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding

lucidrains/imagen-pytorch 23 May 2022

We present Imagen, a text-to-image diffusion model with an unprecedented degree of photorealism and a deep level of language understanding.

Ranked #1 on Text-to-Image Generation on COCO (using extra training data)

Language Modelling Zero-Shot Text-to-Image Generation

Conformer: Convolution-augmented Transformer for Speech Recognition

PaddlePaddle/DeepSpeech 16 May 2020

Recently, Transformer- and convolutional neural network (CNN)-based models have shown promising results in automatic speech recognition (ASR), outperforming recurrent neural networks (RNNs).

Automatic Speech Recognition

MuJoCo: A physics engine for model-based control

deepmind/mujoco IEEE/RSJ IROS 2012

To facilitate optimal control applications and in particular sampling and finite differencing, the dynamics can be evaluated for different states and controls in parallel.

Uniform Masking: Enabling MAE Pre-training for Pyramid-based Vision Transformers with Locality

implus/um-mae 20 May 2022

Masked AutoEncoder (MAE) has recently led the trend in visual self-supervision with an elegant asymmetric encoder-decoder design, which significantly optimizes both pre-training efficiency and fine-tuning accuracy.

Object Detection

Ivy: Templated Deep Learning for Inter-Framework Portability

ivy-dl/ivy 4 Feb 2021

We introduce Ivy, a templated Deep Learning (DL) framework which abstracts existing DL frameworks.

Planning with Diffusion for Flexible Behavior Synthesis

jannerm/diffuser 20 May 2022

Model-based reinforcement learning methods often use learning only for the purpose of estimating an approximate dynamics model, offloading the rest of the decision-making work to classical trajectory optimizers.

Decision Making Denoising

BEVerse: Unified Perception and Prediction in Birds-Eye-View for Vision-Centric Autonomous Driving

zhangyp15/beverse 19 May 2022

Specifically, BEVerse first performs shared feature extraction and lifting to generate 4D BEV representations from multi-timestamp and multi-view images.

3D Object Detection Autonomous Driving

