SVTR: Scene Text Recognition with a Single Visual Model

PaddlePaddle/PaddleOCR 30 Apr 2022

Dominant scene text recognition models commonly contain two building blocks, a visual model for feature extraction and a sequence model for text transcription.

Scene Text Recognition

0.35 stars / hour

READ: Large-Scale Neural Scene Rendering for Autonomous Driving

JOP-Lee/READ-Large-Scale-Neural-Scene-Rendering-for-Autonomous-Driving 11 May 2022

In this paper, a large-scale neural rendering method is proposed to synthesize the autonomous driving scene (READ), which makes it possible to synthesize large-scale driving scenarios on a PC through a variety of sampling schemes.

3D Scene Reconstruction Autonomous Driving +4

0.31 stars / hour


microsoft/unilm 20 Apr 2022

Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities

Language Modelling

0.30 stars / hour

Neural 3D Scene Reconstruction with the Manhattan-world Assumption

zju3dv/manhattan_sdf 5 May 2022

Based on the Manhattan-world assumption, planar constraints are employed to regularize the geometry in floor and wall regions predicted by a 2D semantic segmentation network.

2D Semantic Segmentation 3D Reconstruction +2

0.30 stars / hour

HeadNeRF: A Real-time NeRF-based Parametric Head Model

crishy1995/headnerf 10 Dec 2021

Unlike existing parametric models, we use neural radiance fields as a novel 3D proxy instead of the traditional 3D textured mesh, which enables HeadNeRF to generate high-fidelity images.

Neural Rendering

0.29 stars / hour

View Synthesis with Sculpted Neural Points

princeton-vl/snp 12 May 2022

We address the task of view synthesis, which can be posed as recovering a rendering function that renders new views from a set of existing images.

0.29 stars / hour

Flamingo: a Visual Language Model for Few-Shot Learning

lucidrains/flamingo-pytorch DeepMind 2022

Building models that can be rapidly adapted to numerous tasks using only a handful of annotated examples is an open challenge for multimodal machine learning research.

Few-Shot Learning Language Modelling +6

0.28 stars / hour

DeepDPM: Deep Clustering With an Unknown Number of Clusters

bgu-cs-vil/deepdpm 27 Mar 2022

Using a split/merge framework, a dynamic architecture that adapts to the changing K, and a novel loss, our proposed method outperforms existing nonparametric methods (both classical and deep ones).

Deep Nonparametric Clustering Model Selection +2

0.27 stars / hour

VGAER: Graph Neural Network Reconstruction based Community Detection

qcydm/vgaer 8 Jan 2022

Community detection is a fundamental and important issue in network science, but there are only a few community detection algorithms based on graph neural networks, and unsupervised ones among them are almost nonexistent.

Community Detection

0.26 stars / hour

RITA: a Study on Scaling Up Generative Protein Sequence Models

lightonai/rita 11 May 2022

In this work we introduce RITA: a suite of autoregressive generative models for protein sequences, with up to 1.2 billion parameters, trained on over 280 million protein sequences belonging to the UniRef-100 database.

0.26 stars / hour