Thin-Plate Spline Motion Model for Image Animation

yoyo-nb/thin-plate-spline-motion-model 27 Mar 2022

Firstly, we propose thin-plate spline motion estimation to produce a more flexible optical flow, which warps the feature maps of the source image to the feature domain of the driving image.

Image Animation, Motion Estimation, +1

Neural 3D Scene Reconstruction with the Manhattan-world Assumption

zju3dv/manhattan_sdf 5 May 2022

Based on the Manhattan-world assumption, planar constraints are employed to regularize the geometry in floor and wall regions predicted by a 2D semantic segmentation network.

2D Semantic Segmentation, 3D Reconstruction, +2

Flamingo: a Visual Language Model for Few-Shot Learning

lucidrains/flamingo-pytorch DeepMind 2022

Building models that can be rapidly adapted to numerous tasks using only a handful of annotated examples is an open challenge for multimodal machine learning research.

Few-Shot Learning, Language Modelling, +6

microsoft/unilm 18 Apr 2022

Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities

Document AI, Document Image Classification, +8

VGAER: Graph Neural Network Reconstruction based Community Detection

qcydm/vgaer 8 Jan 2022

Community detection is a fundamental problem in network science, but only a few community detection algorithms are based on graph neural networks, and unsupervised ones among them are almost nonexistent.

Community Detection

ConvMAE: Masked Convolution Meets Masked Autoencoders

alpha-vl/convmae 8 May 2022

Masked auto-encoding for feature pretraining and multi-scale hybrid convolution-transformer architectures can further unleash the potential of ViT, leading to state-of-the-art performance on image classification, detection and semantic segmentation.

Image Classification, Object Detection, +1

Focal Sparse Convolutional Networks for 3D Object Detection

dvlab-research/deepvision3d 26 Apr 2022

In this paper, we introduce two new modules to enhance the capability of Sparse CNNs, both of which are based on making feature sparsity learnable with position-wise importance prediction.

3D Object Detection

User Guide for KOTE: Korean Online Comments Emotions Dataset

searle-j/kote 11 May 2022

The emotion taxonomy of the 43 emotions is systematically established by cluster analysis of Korean emotion concepts expressed on word embedding space.

Sentiment Analysis

Multiview Stereo with Cascaded Epipolar RAFT

princeton-vl/cer-mvs 9 May 2022

CER-MVS is significantly different from prior work in multiview stereo.

Optical Flow Estimation

Not Low-Resource Anymore: Aligner Ensembling, Batch Filtering, and New Datasets for Bengali-English Machine Translation

csebuetnlp/banglanmt EMNLP 2020

With the segmenter and the two methods combined, we compile a high-quality Bengali-English parallel corpus comprising 2.75 million sentence pairs, more than 2 million of which were not available before.

Machine Translation, Sentence Segmentation, +1

