We propose 3DETR, an end-to-end Transformer-based object detection model for 3D point clouds.
Though many attempts have been made in blind super-resolution to restore low-resolution images with unknown and complex degradations, they are still far from addressing general real-world degraded images.
In this paper, we present DeepSIM, a generative model for conditional image manipulation based on a single image.
We introduce RAFT-Stereo, a new deep architecture for rectified stereo based on the optical flow network RAFT.
We introduce a robust, real-time, high-resolution human video matting method that achieves new state-of-the-art performance.
Ranked #1 on Video Matting on VideoMatte240K