Chinese CLIP: Contrastive Vision-Language Pretraining in Chinese

ofa-sys/chinese-clip 2 Nov 2022

The tremendous success of CLIP (Radford et al., 2021) has promoted the research and application of contrastive learning for vision-language pretraining.

Contrastive Learning, Image Classification, +9 more

Video Swin Transformer

open-mmlab/mmaction2 CVPR 2022

The vision community is witnessing a modeling shift from CNNs to Transformers, where pure Transformer architectures have attained top accuracy on the major video recognition benchmarks.

Ranked #17 on Action Classification on Kinetics-600 (using extra training data)

Action Classification, Action Recognition, +4 more

MIC: Masked Image Consistency for Context-Enhanced Domain Adaptation

lhoyer/mic 2 Dec 2022

MIC significantly improves the state-of-the-art performance across the different recognition tasks for synthetic-to-real, day-to-nighttime, and clear-to-adverse-weather UDA.

Image Classification, Object Detection, +4 more

Towards Multi-spatiotemporal-scale Generalized PDE Modeling

microsoft/pdearena 30 Sep 2022

Finally, we show promising results on generalization to different PDE parameters and time-scales with a single surrogate model.

PDE Surrogate Modeling

D$^2$LV: A Data-Driven and Local-Verification Approach for Image Copy Detection

wangwenhao0716/isc-track1-submission 13 Nov 2021

In this paper, a data-driven and local-verification (D$^2$LV) approach is proposed to compete in the Image Similarity Challenge: Matching Track at NeurIPS'21.

Copy Detection, Unsupervised Pre-training

One is All: Bridging the Gap Between Neural Radiance Fields Architectures with Progressive Volume Distillation

megvii-research/AAAI2023-PVD 29 Nov 2022

In this paper, we propose Progressive Volume Distillation (PVD), a systematic distillation method that allows any-to-any conversions between different architectures, including MLP, sparse or low-rank tensors, hashtables and their compositions.

Ranked #1 on Novel View Synthesis on NeRF (Average PSNR metric)

3D Reconstruction, Neural Rendering, +1 more

Instant Neural Graphics Primitives with a Multiresolution Hash Encoding

nerfstudio-project/nerfstudio 16 Jan 2022

Neural graphics primitives, parameterized by fully connected neural networks, can be costly to train and evaluate.

3D Reconstruction, 3D Shape Reconstruction, +2 more

KERPLE: Kernelized Relative Positional Embedding for Length Extrapolation

eleutherai/gpt-neox 20 May 2022

Relative positional embeddings (RPE) have received considerable attention since RPEs effectively model the relative distance among tokens and enable length extrapolation.

Language Modelling

GD-MAE: Generative Decoder for MAE Pre-training on LiDAR Point Clouds

nightmare-n/gd-mae 6 Dec 2022

In contrast to previous 3D MAE frameworks, which either design a complex decoder to infer masked information from maintained regions or adopt sophisticated masking strategies, we instead propose a much simpler paradigm.

