Software debugging has been shown to consume upwards of half of developers' time.
Multi-object tracking (MOT) aims to estimate the bounding boxes and identities of objects in videos.
Ranked #1 on Multi-Object Tracking on MOT17 (using extra training data)
We introduce DeepNets-1M, a large-scale dataset of diverse computational graphs of neural architectures, and use it to explore parameter prediction on CIFAR-10 and ImageNet.
We find that one of the main reasons for that is the lack of an effective receptive field in both the inpainting network and the loss function.
We present a method that decomposes, or "unwraps", an input video into a set of layered 2D atlases, each providing a unified representation of the appearance of an object (or background) over the video.
Scenic is an open-source JAX library with a focus on Transformer-based models for computer vision research and beyond.
The style-based GAN (StyleGAN) architecture achieved state-of-the-art results for generating high-quality images, but it lacks explicit and precise control over camera poses.