Attention Mechanisms in Computer Vision: A Survey

MenghaoGuo/Awesome-Vision-Attentions 15 Nov 2021

Humans can naturally and effectively find salient regions in complex scenes.

Image Classification, Image Generation +4

934
0.73 stars / hour

Robust High-Resolution Video Matting with Temporal Guidance

PeterL1n/RobustVideoMatting 25 Aug 2021

We introduce a robust, real-time, high-resolution human video matting method that achieves new state-of-the-art performance.

Video Matting

4,817
0.69 stars / hour

PaddleViT

BR-IDL/PaddleViT NeurIPS 2021

PaddleViT: State-of-the-art Visual Transformer and MLP Models for PaddlePaddle 2.0+

Image Classification

582
0.56 stars / hour

TransMVSNet: Global Context-aware Multi-view Stereo Network with Transformers

megviirobot/transmvsnet 29 Nov 2021

We analogize MVS back to its nature of a feature matching task and therefore propose a powerful Feature Matching Transformer (FMT) to leverage intra- (self-) and inter- (cross-) attention to aggregate long-range context information within and across images.

36
0.54 stars / hour
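
The intra- (self-) and inter- (cross-) attention named in the TransMVSNet summary can be illustrated with a minimal sketch. This is not the authors' Feature Matching Transformer: plain softmax scaled dot-product attention is swapped in for illustration, without learned projections, and the function names and the choice of which image supplies the queries are assumptions.

```python
import math

def attention(queries, keys, values):
    """Scaled dot-product attention over lists of feature vectors."""
    d = len(queries[0])
    out = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d).
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in keys]
        m = max(scores)  # subtract the max for numerical stability
        weights = [math.exp(s - m) for s in scores]
        z = sum(weights)
        # Weighted average of the values.
        out.append([
            sum((w / z) * v[c] for w, v in zip(weights, values))
            for c in range(len(values[0]))
        ])
    return out

def intra_attention(feats):
    """Self-attention: queries, keys and values all come from one image."""
    return attention(feats, feats, feats)

def inter_attention(src_feats, ref_feats):
    """Cross-attention: queries from one image, keys and values from another (assumed direction)."""
    return attention(src_feats, ref_feats, ref_feats)
```

Each output vector is a convex combination of the value vectors, so intra-attention aggregates context within one image while inter-attention pulls context across images.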

HybVIO: Pushing the Limits of Real-time Visual-inertial Odometry

SpectacularAI/HybVIO 22 Jun 2021

We present HybVIO, a novel hybrid approach for combining filtering-based visual-inertial odometry (VIO) with optimization-based SLAM.

69
0.51 stars / hour

VaxNeRF: Revisiting the Classic for Voxel-Accelerated Neural Radiance Field

naruya/vaxnerf 25 Nov 2021

We hope VaxNeRF -- a careful combination of a classic technique with a deep method (that arguably replaced it) -- can empower and accelerate new NeRF extensions and applications, with its simplicity, portability, and reliable performance gains.

3D Reconstruction, Meta-Learning

36
0.48 stars / hour

Mesa: A Memory-saving Training Framework for Transformers

zhuang-group/mesa 22 Nov 2021

Specifically, Mesa uses exact activations during forward pass while storing a low-precision version of activations to reduce memory consumption during training.

Quantization

82
0.47 stars / hour
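
The idea in the Mesa summary, computing the forward pass with exact activations while keeping only a low-precision copy for the backward pass, can be sketched on a toy operation. This is not Mesa's implementation: the 8-bit uniform quantizer, the elementwise square operation, and all names below are assumptions chosen to keep the example small.

```python
def quantize(values, bits=8):
    """Uniformly quantize floats to integer codes in [0, 2**bits - 1]."""
    lo, hi = min(values), max(values)
    levels = (1 << bits) - 1
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = [round((v - lo) / scale) for v in values]
    return codes, scale, lo

def dequantize(codes, scale, lo):
    """Map integer codes back to approximate float values."""
    return [c * scale + lo for c in codes]

class SquareOp:
    """Toy op y = x**2 whose gradient (2x) depends on the saved input."""

    def forward(self, x):
        # Save only the low-precision copy of the activation for backward.
        self.saved = quantize(x)
        # The output itself is computed from the exact activation.
        return [v * v for v in x]

    def backward(self, grad_out):
        # Reconstruct an approximate activation and use it in the gradient.
        x_hat = dequantize(*self.saved)
        return [2.0 * xh * g for xh, g in zip(x_hat, grad_out)]
```

In this sketch the forward output is unaffected by quantization; only the gradient carries a small error, bounded here by the quantization step.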

MetaFormer is Actually What You Need for Vision

sail-sg/poolformer 22 Nov 2021

Based on this observation, we hypothesize that the general architecture of the transformers, instead of the specific token mixer module, is more essential to the model's performance.

Image Classification, Semantic Segmentation

356
0.45 stars / hour
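
The MetaFormer hypothesis, that the general block structure matters more than the specific token mixer, can be sketched as a block that accepts any token mixer as an argument. This is a simplified 1D illustration, not the sail-sg/poolformer code: the per-token normalization, the neighbour-averaging mixer, and every function name are assumptions.

```python
from typing import Callable, List

Tokens = List[List[float]]  # shape: [num_tokens][channels]

def layer_norm(x: Tokens, eps: float = 1e-5) -> Tokens:
    """Normalize each token to zero mean and unit variance over channels."""
    out = []
    for tok in x:
        mean = sum(tok) / len(tok)
        var = sum((v - mean) ** 2 for v in tok) / len(tok)
        out.append([(v - mean) / (var + eps) ** 0.5 for v in tok])
    return out

def pool_mixer(x: Tokens, window: int = 3) -> Tokens:
    """Parameter-free mixer: average over neighbouring tokens, minus the input."""
    half = window // 2
    out = []
    for i in range(len(x)):
        nbrs = x[max(0, i - half): min(len(x), i + half + 1)]
        avg = [sum(col) / len(nbrs) for col in zip(*nbrs)]
        out.append([a - v for a, v in zip(avg, x[i])])
    return out

def residual_add(a: Tokens, b: Tokens) -> Tokens:
    """Elementwise sum used for the skip connections."""
    return [[u + v for u, v in zip(ra, rb)] for ra, rb in zip(a, b)]

def metaformer_block(
    x: Tokens,
    token_mixer: Callable[[Tokens], Tokens],
    channel_mlp: Callable[[Tokens], Tokens],
) -> Tokens:
    """Generic block: norm -> token mixer -> residual, then norm -> MLP -> residual."""
    x = residual_add(x, token_mixer(layer_norm(x)))
    x = residual_add(x, channel_mlp(layer_norm(x)))
    return x
```

Swapping `pool_mixer` for an attention function or any other callable leaves `metaformer_block` unchanged, which is the point of the abstraction.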

Differentiable Drawing and Sketching

jonhare/DifferentiableSketching 30 Mar 2021

We present a bottom-up differentiable relaxation of the process of drawing points, lines and curves into a pixel raster.

Drawing Pictures

57
0.43 stars / hour
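
A differentiable relaxation of drawing a point into a pixel raster can be illustrated as follows. This is a minimal sketch under stated assumptions, not the jonhare/DifferentiableSketching code: pixel intensity is taken to decay smoothly with squared distance from the point, and the function names and the `sigma` parameter are hypothetical.

```python
import math

def soft_point(px, py, x, y, sigma=1.0):
    """Intensity at pixel (x, y) for a point drawn at (px, py)."""
    d2 = (x - px) ** 2 + (y - py) ** 2
    return math.exp(-d2 / sigma ** 2)

def d_soft_point_d_px(px, py, x, y, sigma=1.0):
    """Analytic derivative of the intensity with respect to px."""
    return soft_point(px, py, x, y, sigma) * 2.0 * (x - px) / sigma ** 2

def soft_point_raster(px, py, width, height, sigma=1.0):
    """Render the point into a height x width grid of intensities."""
    return [
        [soft_point(px, py, x, y, sigma) for x in range(width)]
        for y in range(height)
    ]
```

Because every pixel value is a smooth function of the point's coordinates, a loss computed on the raster can be differentiated back to `px` and `py`, unlike a hard rasterizer that switches individual pixels on or off.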

LibFewShot: A Comprehensive Library for Few-shot Learning

rl-vig/libfewshot 10 Sep 2021

Furthermore, based on LibFewShot, we provide comprehensive evaluations on multiple benchmark datasets with multiple backbone architectures to evaluate common pitfalls and effects of different training tricks.

Data Augmentation, Few-Shot Image Classification +1

401
0.38 stars / hour