Prototypical Cross-Attention Networks for Multiple Object Tracking and Segmentation

SysCV/pcan NeurIPS 2021

We propose Prototypical Cross-Attention Network (PCAN), capable of leveraging rich spatio-temporal information for online multiple object tracking and segmentation.

BR-IDL/PaddleViT 5 Oct 2021

PaddleViT: State-of-the-art Visual Transformer and MLP Models for PaddlePaddle 2.0+

Attention Mechanisms in Computer Vision: A Survey

MenghaoGuo/Awesome-Vision-Attentions 15 Nov 2021

Humans can naturally and effectively find salient regions in complex scenes.

TransMVSNet: Global Context-aware Multi-view Stereo Network with Transformers

megviirobot/transmvsnet 29 Nov 2021

We analogize MVS back to its nature of a feature matching task and therefore propose a powerful Feature Matching Transformer (FMT) to leverage intra- (self-) and inter- (cross-) attention to aggregate long-range context information within and across images.
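
The difference between intra- (self-) and inter- (cross-) attention can be illustrated with a minimal pure-Python sketch. It is not the FMT implementation: the function names are hypothetical, there are no learned projections, positional encodings, or multiple heads, and features are plain lists of float vectors.

```python
import math

def attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length float vectors."""
    dim = len(keys[0])
    out = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(dim).
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in keys]
        # Numerically stable softmax over the scores.
        peak = max(scores)
        weights = [math.exp(s - peak) for s in scores]
        total = sum(weights)
        # Weighted average of the value vectors.
        out.append([
            sum(w / total * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ])
    return out

def intra_attention(feats):
    """Self-attention: queries, keys, and values all come from one image."""
    return attention(feats, feats, feats)

def inter_attention(src_feats, ref_feats):
    """Cross-attention: queries from one image, keys and values from another."""
    return attention(src_feats, ref_feats, ref_feats)
```

In this sketch, `intra_attention` aggregates context within one image's features, while `inter_attention` lets each source feature gather context from a second (reference) image.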

DanceTrack: Multi-Object Tracking in Uniform Appearance and Diverse Motion

DanceTrack/DanceTrack 29 Nov 2021

A typical pipeline for multi-object tracking (MOT) is to use a detector for object localization, followed by re-identification (re-ID) for object association.

MetaFormer is Actually What You Need for Vision

sail-sg/poolformer 22 Nov 2021

Based on this observation, we hypothesize that the general architecture of the transformers, instead of the specific token mixer module, is more essential to the model's performance.
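
The hypothesis can be made concrete with a toy block whose token mixer is a pluggable argument. This is only a sketch under simplifying assumptions: normalization layers are omitted, tokens form a 1-D sequence of float vectors, and the names `metaformer_block`, `pooling_mixer`, and `no_mixing` are hypothetical rather than taken from the repository.

```python
def metaformer_block(tokens, token_mixer, channel_mlp):
    """Residual token mixing, then a residual per-token channel MLP.

    Normalization layers are left out to keep the sketch short.
    """
    mixed = token_mixer(tokens)
    tokens = [[t + m for t, m in zip(tok, mix)] for tok, mix in zip(tokens, mixed)]
    return [[t + c for t, c in zip(tok, channel_mlp(tok))] for tok in tokens]

def pooling_mixer(tokens, window=3):
    """Average each token with its neighbors, then subtract the token itself.

    The subtraction compensates for the residual connection in the block.
    """
    half = window // 2
    out = []
    for i, tok in enumerate(tokens):
        neighbors = tokens[max(0, i - half): i + half + 1]
        avg = [sum(col) / len(neighbors) for col in zip(*neighbors)]
        out.append([a - t for a, t in zip(avg, tok)])
    return out

def no_mixing(tokens):
    """Baseline mixer that contributes nothing, so tokens never interact."""
    return [[0.0] * len(tok) for tok in tokens]
```

Swapping `pooling_mixer` for `no_mixing` (or for an attention function) changes only the mixer argument; the surrounding block structure stays the same.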

Robust High-Resolution Video Matting with Temporal Guidance

PeterL1n/RobustVideoMatting 25 Aug 2021

We introduce a robust, real-time, high-resolution human video matting method that achieves new state-of-the-art performance.

Mesa: A Memory-saving Training Framework for Transformers

zhuang-group/mesa 22 Nov 2021

Specifically, Mesa uses exact activations during forward pass while storing a low-precision version of activations to reduce memory consumption during training.
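
The idea of an exact forward pass paired with low-precision saved activations can be sketched in pure Python. The 8-bit uniform quantizer and the toy `SquareLayer` below are assumptions made for illustration; the excerpt does not specify Mesa's actual quantization scheme, and this code does not reproduce it.

```python
def quantize_8bit(values):
    """Uniformly quantize floats to byte codes; returns (codes, lo, scale)."""
    lo, hi = min(values), max(values)
    scale = (hi - lo) / 255 or 1.0  # fall back to 1.0 when all values are equal
    codes = bytes(round((v - lo) / scale) for v in values)
    return codes, lo, scale

def dequantize_8bit(codes, lo, scale):
    """Recover approximate floats from byte codes."""
    return [lo + c * scale for c in codes]

class SquareLayer:
    """Toy layer y = x * x whose backward pass needs the input x."""

    def forward(self, x):
        # Keep only a 1-byte-per-value copy of the input for backward.
        self.saved = quantize_8bit(x)
        # The output itself is computed from the exact input.
        return [v * v for v in x]

    def backward(self, grad_out):
        # dy/dx = 2x, evaluated on the dequantized (approximate) input.
        x_approx = dequantize_8bit(*self.saved)
        return [2.0 * xa * g for xa, g in zip(x_approx, grad_out)]
```

In this sketch the forward output is unaffected by quantization; only the gradient carries a small error, bounded by the quantization step.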


HybVIO: Pushing the Limits of Real-time Visual-inertial Odometry

SpectacularAI/HybVIO 22 Jun 2021

We present HybVIO, a novel hybrid approach for combining filtering-based visual-inertial odometry (VIO) with optimization-based SLAM.

Projected GANs Converge Faster

autonomousvision/projected_gan NeurIPS 2021

Generative Adversarial Networks (GANs) produce high-quality images but are challenging to train.

