Trending Research

An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale

ICLR 2021 google-research/vision_transformer

While the Transformer architecture has become the de-facto standard for natural language processing tasks, its applications to computer vision remain limited.

Ranked #1 on Image Classification on CIFAR-10 (using extra training data)

FINE-GRAINED IMAGE CLASSIFICATION

505
3.60 stars / hour

Castle in the Sky: Dynamic Sky Replacement and Harmonization in Videos

22 Oct 2020 jiupinjia/SkyAR

This paper proposes a vision-based method for video sky replacement and harmonization, which can automatically generate realistic and dramatic sky backgrounds in videos with controllable styles.

MOTION ESTIMATION

383
2.72 stars / hour

mT5: A massively multilingual pre-trained text-to-text transformer

22 Oct 2020 google-research/multilingual-t5

The recent "Text-to-Text Transfer Transformer" (T5) leveraged a unified text-to-text format and scale to attain state-of-the-art results on a wide variety of English-language NLP tasks.

208
2.12 stars / hour

LambdaNetworks: Modeling Long-Range Interactions Without Attention

ICLR 2021 lucidrains/lambda-networks

We present a general framework for capturing long-range interactions between an input and structured contextual information (e.g., a pixel surrounded by other pixels).

IMAGE CLASSIFICATION, INSTANCE SEGMENTATION, OBJECT DETECTION, SCENE SEGMENTATION

868
1.13 stars / hour

FaceShifter: Towards High Fidelity And Occlusion Aware Face Swapping

31 Dec 2019 mindslab-ai/faceshifter

We propose a novel attributes encoder for extracting multi-level target face attributes, and a new generator with carefully designed Adaptive Attentional Denormalization (AAD) layers to adaptively integrate the identity and the attributes for face synthesis.

FACE GENERATION, FACE SWAPPING

108
1.12 stars / hour

Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context

ACL 2019 lab-ml/labml_nn

Transformers have a potential of learning longer-term dependency, but are limited by a fixed-length context in the setting of language modeling.

LANGUAGE MODELLING

187
0.89 stars / hour

FashionBERT: Text and Image Matching with Adaptive Loss for Cross-modal Retrieval

20 May 2020 alibaba/EasyTransfer

In this paper, we address the text and image matching in cross-modal retrieval of the fashion industry.

CROSS-MODAL RETRIEVAL

263
0.86 stars / hour

Training Simplification and Model Simplification for Deep Learning: A Minimal Effort Back Propagation Method

17 Nov 2017 jklj077/meProp

Based on the sparsified gradients, we further simplify the model by eliminating the rows or columns that are seldom updated, which will reduce the computational cost both in the training and decoding, and potentially accelerate decoding in real-world applications.

97
0.86 stars / hour

Proximal Policy Optimization Algorithms

20 Jul 2017 lab-ml/nn

We propose a new family of policy gradient methods for reinforcement learning, which alternate between sampling data through interaction with the environment, and optimizing a "surrogate" objective function using stochastic gradient ascent.

DOTA 2, POLICY GRADIENT METHODS

186
0.81 stars / hour