Software debugging has been shown to consume upwards of half of developers' time.
We introduce a large-scale dataset of diverse computational graphs of neural architectures, DeepNets-1M, and use it to explore parameter prediction on CIFAR-10 and ImageNet.
Multi-object tracking (MOT) aims at estimating bounding boxes and identities of objects in videos.
Ranked #1 on Multi-Object Tracking on MOT17 (using extra training data)
In contrast, with NeRV, we can use any neural network compression method as a proxy for video compression, and achieve comparable performance to traditional frame-based video compression approaches (H.264, HEVC, etc.).
Scenic is an open-source JAX library with a focus on Transformer-based models for computer vision research and beyond.
We present a method that decomposes, or "unwraps", an input video into a set of layered 2D atlases, each providing a unified representation of the appearance of an object (or background) over the video.
We find that one of the main reasons for that is the lack of an effective receptive field in both the inpainting network and the loss function.