# Trending Research

Ordered by GitHub stars accumulated in the last 3 days
##### BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
We introduce a new language representation model called BERT, which stands for Bidirectional Encoder Representations from Transformers. Unlike recent language representation models, BERT is designed to pre-train deep bidirectional representations by jointly conditioning on both left and right context in all layers.
7,013
1.43 stars / hour
##### Rotation Equivariant CNNs for Digital Pathology
We propose a new model for digital pathology segmentation, based on the observation that histopathology images are inherently symmetric under rotation and reflection. Utilizing recent findings on rotation equivariant CNNs, the proposed model leverages these symmetries in a principled manner.

90
1.11 stars / hour
##### Self-Attention Generative Adversarial Networks
In this paper, we propose the Self-Attention Generative Adversarial Network (SAGAN) which allows attention-driven, long-range dependency modeling for image generation tasks. Traditional convolutional GANs generate high-resolution details as a function of only spatially local points in lower-resolution feature maps.
3,638
0.96 stars / hour
##### Progressive Growing of GANs for Improved Quality, Stability, and Variation
We describe a new training methodology for generative adversarial networks. The key idea is to grow both the generator and discriminator progressively: starting from a low resolution, we add new layers that model increasingly fine details as training progresses.
3,638
0.96 stars / hour
##### GANs Trained by a Two Time-Scale Update Rule Converge to a Local Nash Equilibrium
Generative Adversarial Networks (GANs) excel at creating realistic images with complex models for which maximum likelihood is infeasible. However, the convergence of GAN training has still not been proved.
3,638
0.96 stars / hour
##### Rethinking floating point for deep learning
Uniform quantization using integer multiply-add has been thoroughly investigated, which requires learning many quantization parameters, fine-tuning training or other prerequisites. In 16 bits, our log float multiply-add is 0.59x the power and 0.68x the area of IEEE 754 float16 fused multiply-add, maintaining the same significand precision and dynamic range, proving useful for training ASICs as well.
166
0.57 stars / hour
##### Extended Isolation Forest
We present an extension to the model-free anomaly detection algorithm, Isolation Forest. This approach results in improved score maps.

46
0.39 stars / hour
##### Variational Discriminator Bottleneck: Improving Imitation Learning, Inverse RL, and GANs by Constraining Information Flow
Adversarial learning methods have been proposed for a wide range of applications, but the training of adversarial models can be notoriously unstable. By enforcing a constraint on the mutual information between the observations and the discriminator's internal representation, we can effectively modulate the discriminator's accuracy and maintain useful and informative gradients.

33
0.39 stars / hour
##### Horizon
A platform for Applied Reinforcement Learning (Applied RL)

1,169
0.37 stars / hour
##### Memory-Efficient Implementation of DenseNets
The DenseNet architecture is highly computationally efficient as a result of feature reuse. A 264-layer DenseNet (73M parameters), which previously would have been infeasible to train, can now be trained on a single workstation with 8 NVIDIA Tesla M40 GPUs.
18
0.36 stars / hour
##### Deep Interest Network for Click-Through Rate Prediction
Click-through rate prediction is an essential task in industrial applications, such as online advertising. In this way, user features are compressed into a fixed-length representation vector, regardless of what the candidate ads are.

117
0.33 stars / hour
##### Deep Interest Evolution Network for Click-Through Rate Prediction
For a CTR prediction model, it is necessary to capture the latent user interest behind the user behavior data. As user interests are diverse, especially in e-commerce systems, we propose an interest evolving layer to capture the interest evolving process that is relative to the target item.

117
0.33 stars / hour
##### Revisiting Unreasonable Effectiveness of Data in Deep Learning Era
The success of deep learning in vision can be attributed to: (a) models with high capacity; (b) increased computational power; and (c) availability of large-scale labeled data. What will happen if we increase the dataset size by 10x or 100x?
1,159
0.29 stars / hour
##### Fully Supervised Speaker Diarization
In this paper, we propose a fully supervised speaker diarization approach, named unbounded interleaved-state recurrent neural networks (UIS-RNN). Given extracted speaker-discriminative embeddings (a.k.a.

26
0.28 stars / hour
##### models
Models and examples built with TensorFlow
44,262
0.26 stars / hour
##### ClariNet: Parallel Wave Generation in End-to-End Text-to-Speech
In this work, we propose an alternative solution for parallel wave generation by WaveNet. In contrast to parallel WaveNet (Oord et al., 2018), we distill a Gaussian inverse autoregressive flow from the autoregressive WaveNet by minimizing a novel regularized KL divergence between their highly-peaked output distributions.

126
0.25 stars / hour
##### TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems
TensorFlow is an interface for expressing machine learning algorithms, and an implementation for executing such algorithms. A computation expressed using TensorFlow can be executed with little or no change on a wide variety of heterogeneous systems, ranging from mobile devices such as phones and tablets up to large-scale distributed systems of hundreds of machines and thousands of computational devices such as GPU cards.
114,272
0.24 stars / hour
##### Deep Residual Learning for Image Recognition
We present a residual learning framework to ease the training of networks that are substantially deeper than those used previously. Deep residual nets are foundations of our submissions to ILSVRC & COCO 2015 competitions, where we also won the 1st places on the tasks of ImageNet detection, ImageNet localization, COCO detection, and COCO segmentation.
9,528
0.24 stars / hour
##### AllenNLP: A Deep Semantic Natural Language Processing Platform
This paper describes AllenNLP, a platform for research on deep learning methods in natural language understanding. AllenNLP is designed to support researchers who want to build novel language understanding models quickly and easily.

4,116
0.24 stars / hour
##### BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
We introduce a new language representation model called BERT, which stands for Bidirectional Encoder Representations from Transformers. Unlike recent language representation models, BERT is designed to pre-train deep bidirectional representations by jointly conditioning on both left and right context in all layers.
1,583
0.20 stars / hour
##### tensor2tensor
Library of deep learning models and datasets designed to make deep learning more accessible and accelerate ML research.

5,649
0.17 stars / hour
##### Relational inductive biases, deep learning, and graph networks
This has been due, in part, to cheap data and cheap compute resources, which have fit the natural strengths of deep learning. As a companion to this paper, we have released an open-source software library for building graph networks, with demonstrations of how to use them in practice.

2,078
0.17 stars / hour
##### Progressive Neural Architecture Search
We propose a new method for learning the structure of convolutional neural networks (CNNs) that is more efficient than recent state-of-the-art methods based on reinforcement learning and evolutionary algorithms. Our approach uses a sequential model-based optimization (SMBO) strategy, in which we search for structures in order of increasing complexity, while simultaneously learning a surrogate model to guide the search through structure space.
2,589
0.16 stars / hour
##### Albumentations: fast and flexible image augmentations
In the computer vision domain, image augmentations have become a common implicit regularization technique to combat overfitting in deep convolutional neural networks and are ubiquitously used to improve performance. We provide examples of image augmentations for different computer vision tasks and show that Albumentations is faster than other commonly used image augmentation tools on most commonly used image transformations.
1,313
0.16 stars / hour
##### Horovod: fast and easy distributed deep learning in TensorFlow
Training modern deep learning models requires large amounts of computation, often provided by GPUs. Depending on the particular methods employed, this communication may entail anywhere from negligible to significant overhead.

4,236
0.16 stars / hour
##### Mask R-CNN
Our approach efficiently detects objects in an image while simultaneously generating a high-quality segmentation mask for each instance. We show top results in all three tracks of the COCO suite of challenges, including instance segmentation, bounding-box object detection, and person keypoint detection.
8,630
0.14 stars / hour
##### Glow: Generative Flow with Invertible 1x1 Convolutions
Flow-based generative models (Dinh et al., 2014) are conceptually attractive due to tractability of the exact log-likelihood, tractability of exact latent-variable inference, and parallelizability of both training and synthesis. In this paper we propose Glow, a simple type of generative flow using an invertible 1x1 convolution.
1,963
0.14 stars / hour
##### Benchmarking Deep Reinforcement Learning for Continuous Control
Recently, researchers have made significant progress combining the advances in deep learning for learning feature representations with reinforcement learning. Some notable examples include training agents to play Atari games based on raw pixel data and to acquire advanced manipulation skills using raw sensory inputs.
86
0.14 stars / hour
##### Progressive Growing of GANs for Improved Quality, Stability, and Variation
We describe a new training methodology for generative adversarial networks. The key idea is to grow both the generator and discriminator progressively: starting from a low resolution, we add new layers that model increasingly fine details as training progresses.
3,603
0.14 stars / hour
##### Multi-Goal Reinforcement Learning: Challenging Robotics Environments and Request for Research
The purpose of this technical report is two-fold. First of all, it introduces a suite of challenging continuous control tasks (integrated with OpenAI Gym) based on currently existing robotics hardware.
14,300
0.13 stars / hour
##### OpenAI Gym
OpenAI Gym is a toolkit for reinforcement learning research. It includes a growing collection of benchmark problems that expose a common interface, and a website where people can share their results and compare the performance of algorithms.
14,300
0.13 stars / hour
##### Gather-Excite: Exploiting Feature Context in Convolutional Neural Networks
In this work, we propose a simple, lightweight approach for better context exploitation in CNNs. We also propose a parametric gather-excite operator pair which yields further performance gains, relate it to the recently-introduced Squeeze-and-Excitation Networks, and analyse the effects of these changes to the CNN feature activation statistics.
51
0.12 stars / hour
##### DropBlock: A regularization method for convolutional networks
This lack of success of dropout for convolutional layers is perhaps due to the fact that activation units in convolutional layers are spatially correlated so information can still flow through convolutional networks despite dropout. In this paper, we introduce DropBlock, a form of structured dropout, where units in a contiguous region of a feature map are dropped together.
10
0.12 stars / hour
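
The structured dropout described in the DropBlock entry above can be illustrated with a minimal pure-Python sketch. It only shows the core idea of zeroing a contiguous region of a 2D feature map. The helper name `drop_block` and its parameters are hypothetical, the block position is passed in explicitly to keep the example deterministic, and the random sampling and rescaling a real implementation would need are left out.

```python
def drop_block(feature_map, top, left, block_size):
    """Return a copy of a 2D feature map with one square block zeroed.

    Hypothetical helper for illustration only: a real implementation would
    sample block positions at random and rescale the surviving activations.
    """
    height = len(feature_map)
    width = len(feature_map[0])
    dropped = [row[:] for row in feature_map]  # copy so the input is untouched
    # Zero every unit inside the block, clipping at the feature map border.
    for r in range(top, min(top + block_size, height)):
        for c in range(left, min(left + block_size, width)):
            dropped[r][c] = 0.0
    return dropped


# A 4x4 map of ones with a 2x2 block dropped from its centre.
feature_map = [[1.0] * 4 for _ in range(4)]
result = drop_block(feature_map, top=1, left=1, block_size=2)
```

Dropping neighbouring units together is what distinguishes this from ordinary dropout, which would zero units independently and leave their spatially correlated neighbours in place.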
##### BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
We introduce a new language representation model called BERT, which stands for Bidirectional Encoder Representations from Transformers. Unlike recent language representation models, BERT is designed to pre-train deep bidirectional representations by jointly conditioning on both left and right context in all layers.
71
0.12 stars / hour
##### Glow: Generative Flow with Invertible 1x1 Convolutions
Flow-based generative models (Dinh et al., 2014) are conceptually attractive due to tractability of the exact log-likelihood, tractability of exact latent-variable inference, and parallelizability of both training and synthesis. In this paper we propose Glow, a simple type of generative flow using an invertible 1x1 convolution.
44
0.12 stars / hour
##### Auto-Keras: Efficient Neural Architecture Search with Network Morphism
Neural architecture search (NAS) has been proposed to automatically tune deep neural networks, but existing search algorithms usually suffer from expensive computational cost. Network morphism, which keeps the functionality of a neural network while changing its neural architecture, could be helpful for NAS by enabling a more efficient training during the search.
3,621
0.12 stars / hour
##### Subword Regularization: Improving Neural Network Translation Models with Multiple Subword Candidates
Subword units are an effective way to alleviate the open vocabulary problems in neural machine translation (NMT). While sentences are usually converted into unique subword sequences, subword segmentation is potentially ambiguous and multiple segmentations are possible even with the same vocabulary.

1,613
0.12 stars / hour
##### Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation
Also, most NMT systems have difficulty with rare words. These issues have hindered NMT's use in practical deployments and services, where both accuracy and speed are essential.

1,613
0.12 stars / hour
##### Consistent Individualized Feature Attribution for Tree Ensembles
Interpreting predictions from tree ensemble methods such as gradient boosting machines and random forests is important, yet feature attribution for trees is often heuristic and not individualized for each prediction. Here we show that popular feature attribution methods are inconsistent, meaning they can lower a feature's assigned importance when the true impact of that feature actually increases.

2,601
0.12 stars / hour
##### Backpropagation through the Void: Optimizing control variates for black-box gradient estimation
Gradient-based optimization is the foundation of deep learning and reinforcement learning. We introduce a general framework for learning low-variance, unbiased gradient estimators for black-box functions of random variables.
102
0.12 stars / hour
##### ELEGANT: Exchanging Latent Encodings with GAN for Transferring Multiple Face Attributes
Recent studies on face attribute transfer have achieved great success. A lot of models are able to transfer face attributes with an input image.
104
0.12 stars / hour
##### Real-Time Rotation-Invariant Face Detection with Progressive Calibration Networks
Rotation-invariant face detection, i.e. detecting faces with arbitrary rotation-in-plane (RIP) angles, is widely required in unconstrained applications but remains a challenging task, due to the large variations in face appearance. To address this problem more efficiently, we propose Progressive Calibration Networks (PCN) to perform rotation-invariant face detection in a coarse-to-fine manner.
567
0.12 stars / hour
##### Focal Loss for Dense Object Detection
We propose to address this class imbalance by reshaping the standard cross entropy loss such that it down-weights the loss assigned to well-classified examples. Our novel Focal Loss focuses training on a sparse set of hard examples and prevents the vast number of easy negatives from overwhelming the detector during training.
1,881
0.11 stars / hour
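
The reshaping of cross entropy described in the Focal Loss entry above can be sketched in a few lines of Python. This is a minimal illustration, assuming the commonly cited form in which the cross entropy is multiplied by `(1 - p) ** gamma`, where `p` is the probability assigned to the true class. The function names, the `gamma` parameter and its default value are assumptions that do not appear in the excerpt, and any class-balancing weight is omitted.

```python
import math


def cross_entropy(p_true):
    """Standard cross entropy for the probability given to the true class."""
    return -math.log(p_true)


def focal_loss(p_true, gamma=2.0):
    """Cross entropy scaled by (1 - p_true) ** gamma.

    gamma is an assumed focusing parameter; gamma = 0 recovers plain
    cross entropy, larger values shrink the loss of confident predictions.
    """
    return (1.0 - p_true) ** gamma * cross_entropy(p_true)


easy = focal_loss(0.9)  # well-classified example, loss is strongly reduced
hard = focal_loss(0.1)  # misclassified example, loss stays close to cross entropy
```

With this form, an example classified with probability 0.9 keeps only a small fraction of its cross-entropy loss, while a hard example at 0.1 keeps most of it, which is how the many easy negatives stop dominating training.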
##### Memory Augmented Policy Optimization for Program Synthesis and Semantic Parsing
We present Memory Augmented Policy Optimization (MAPO), a simple and novel way to leverage a memory buffer of promising trajectories to reduce the variance of policy gradient estimate. MAPO is applicable to deterministic environments with discrete actions, such as structured prediction and combinatorial optimization tasks.
86
0.11 stars / hour
##### Neural Symbolic Machines: Learning Semantic Parsers on Freebase with Weak Supervision
Harnessing the statistical power of neural networks to perform language understanding and symbolic reasoning is difficult, when it requires executing efficient discrete operations against a large knowledge-base. In this work, we introduce a Neural Symbolic Machine, which contains (a) a neural "programmer", i.e., a sequence-to-sequence model that maps language utterances to programs and utilizes a key-variable memory to handle compositionality (b) a symbolic "computer", i.e., a Lisp interpreter that performs program execution, and helps find good programs by pruning the search space.

86
0.11 stars / hour
##### IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures
In this work we aim to solve a large collection of tasks using a single reinforcement learning agent with a single set of parameters. A key challenge is to handle the increased amount of data and extended training time.
86
0.11 stars / hour
##### PWC-Net: CNNs for Optical Flow Using Pyramid, Warping, and Cost Volume
We present a compact but effective CNN model for optical flow, called PWC-Net. It then uses the warped features and features of the first image to construct a cost volume, which is processed by a CNN to estimate the optical flow.
331
0.11 stars / hour
##### Models Matter, So Does Training: An Empirical Study of CNNs for Optical Flow Estimation
We investigate two crucial and closely related aspects of CNNs for optical flow estimation: models and training. First, we design a compact but effective CNN model, called PWC-Net, according to simple and well-established principles: pyramidal processing, warping, and cost volume processing.

331
0.11 stars / hour
##### Video-to-Video Synthesis
We study the problem of video-to-video synthesis, whose goal is to learn a mapping function from an input source video (e.g., a sequence of semantic segmentation masks) to an output photorealistic video that precisely depicts the content of the source video. Without understanding temporal dynamics, directly applying existing image synthesis approaches to an input video often results in temporally incoherent videos of low visual quality.
5,011
0.10 stars / hour
##### Bag of Tricks for Efficient Text Classification
This paper explores a simple and efficient baseline for text classification. Our experiments show that our fast text classifier fastText is often on par with deep learning classifiers in terms of accuracy, and many orders of magnitude faster for training and evaluation.

3,128
0.09 stars / hour
##### A Sensitivity Analysis of (and Practitioners' Guide to) Convolutional Neural Networks for Sentence Classification
Convolutional Neural Networks (CNNs) have recently achieved remarkably strong performance on the practically important task of sentence classification (Kim 2014, Kalchbrenner 2014, Johnson 2014). However, these models require practitioners to specify an exact model architecture and set accompanying hyperparameters, including the filter region size, regularization parameters, and so on.

3,128
0.09 stars / hour
##### Neural Machine Translation by Jointly Learning to Align and Translate
Neural machine translation is a recently proposed approach to machine translation. In this paper, we conjecture that the use of a fixed-length vector is a bottleneck in improving the performance of this basic encoder-decoder architecture, and propose to extend this by allowing a model to automatically (soft-)search for parts of a source sentence that are relevant to predicting a target word, without having to form these parts as a hard segment explicitly.

3,128
0.09 stars / hour
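
The "(soft-)search" mentioned in the entry above can be pictured as a weighted average over the source positions instead of a single fixed-length vector. The following is a rough sketch under stated assumptions: the names `softmax` and `soft_search` are hypothetical, and a plain dot product stands in for the relevance score, which is a simplification and not the scoring function of the paper.

```python
import math


def softmax(scores):
    """Turn arbitrary scores into weights that are positive and sum to 1."""
    shift = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def soft_search(query, source_states):
    """Weight each source state by its relevance to the query.

    Assumption: relevance is a dot product, used here only as a stand-in.
    Returns the weights and the weighted average (context vector).
    """
    scores = [sum(q * h for q, h in zip(query, state)) for state in source_states]
    weights = softmax(scores)
    dim = len(source_states[0])
    context = [
        sum(w * state[i] for w, state in zip(weights, source_states))
        for i in range(dim)
    ]
    return weights, context


# Three source positions with 2-dimensional states and one decoder query.
source_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights, context = soft_search([2.0, 0.0], source_states)
```

Because the weights are recomputed for every target word, the model can draw on different parts of the source sentence at each step without committing to a hard segment.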
##### Convolutional Neural Networks on Graphs with Fast Localized Spectral Filtering
In this work, we are interested in generalizing convolutional neural networks (CNNs) from low-dimensional regular grids, where image, video and speech are represented, to high-dimensional irregular domains, such as social networks, brain connectomes or words' embedding, represented by graphs. We present a formulation of CNNs in the context of spectral graph theory, which provides the necessary mathematical background and efficient numerical schemes to design fast localized convolutional filters on graphs.
20
0.09 stars / hour
##### DeepSphere: Efficient spherical Convolutional Neural Network with HEALPix sampling for cosmological applications
We present a spherical CNN for analysis of full and partial HEALPix maps, which we call DeepSphere. This way, DeepSphere is a special case of a graph CNN, tailored to the HEALPix sampling of the sphere.

20
0.09 stars / hour
##### IQA: Visual Question Answering in Interactive Environments
The agent must navigate around the scene, acquire visual understanding of scene elements, interact with objects (e.g. open refrigerators) and plan for a series of actions conditioned on the question. Our experiments show that our proposed model outperforms popular single controller based methods on IQUAD V1.
46
0.09 stars / hour
##### A Structured Self-attentive Sentence Embedding
This paper proposes a new model for extracting an interpretable sentence embedding by introducing self-attention. Instead of using a vector, we use a 2-D matrix to represent the embedding, with each row of the matrix attending on a different part of the sentence.

220
0.09 stars / hour
##### cilantro: a lean, versatile, and efficient library for point cloud data processing
We introduce cilantro, an open-source C++ library for geometric and general-purpose point cloud data processing. The library provides functionality that covers low-level point cloud operations, spatial reasoning, various methods for point cloud segmentation and generic data clustering, flexible algorithms for robust or local geometric alignment, model fitting, as well as powerful visualization tools.

109
0.09 stars / hour
##### Analogical Reasoning on Chinese Morphological and Semantic Relations
Analogical reasoning is effective in capturing linguistic regularities. This paper proposes an analogical reasoning task on Chinese.
3,142
0.09 stars / hour
##### Attention Is All You Need
The dominant sequence transduction models are based on complex recurrent or convolutional neural networks in an encoder-decoder configuration. The best performing models also connect the encoder and decoder through an attention mechanism.
194
0.09 stars / hour
##### Image Inpainting for Irregular Holes Using Partial Convolutions
Existing deep learning based image inpainting methods use a standard convolutional network over the corrupted image, using convolutional filter responses conditioned on both valid pixels as well as the substitute values in the masked holes (typically the mean value). This often leads to artifacts such as color discrepancy and blurriness.
334
0.09 stars / hour
##### Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks
Image-to-image translation is a class of vision and graphics problems where the goal is to learn the mapping between an input image and an output image using a training set of aligned image pairs. Our goal is to learn a mapping $G: X \rightarrow Y$ such that the distribution of images from $G(X)$ is indistinguishable from the distribution $Y$ using an adversarial loss.
5,814
0.09 stars / hour
##### Image-to-Image Translation with Conditional Adversarial Networks
We investigate conditional adversarial networks as a general-purpose solution to image-to-image translation problems. These networks not only learn the mapping from input image to output image, but also learn a loss function to train this mapping.
5,814
0.09 stars / hour
##### xDeepFM: Combining Explicit and Implicit Feature Interactions for Recommender Systems
With the great success of deep neural networks (DNNs) in various fields, recently researchers have proposed several DNN-based factorization models to learn both low- and high-order feature interactions. On one hand, the xDeepFM is able to learn certain bounded-degree feature interactions explicitly; on the other hand, it can learn arbitrary low- and high-order feature interactions implicitly.

153
0.09 stars / hour
##### BPEmb: Tokenization-free Pre-trained Subword Embeddings in 275 Languages
We present BPEmb, a collection of pre-trained subword unit embeddings in 275 languages, based on Byte-Pair Encoding (BPE). In an evaluation using fine-grained entity typing as testbed, BPEmb performs competitively, and for some languages better than alternative subword approaches, while requiring vastly fewer resources and no tokenization.

283
0.09 stars / hour
##### Prioritized Experience Replay
Experience replay lets online reinforcement learning agents remember and reuse experiences from the past. In prior work, experience transitions were uniformly sampled from a replay memory.
6,307
0.09 stars / hour
##### Implicit Quantile Networks for Distributional Reinforcement Learning
In this work, we build on recent advances in distributional reinforcement learning to give a generally applicable, flexible, and state-of-the-art distributional variant of DQN. We achieve this by using quantile regression to approximate the full quantile function for the state-action return distribution.
6,307
0.09 stars / hour
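
Quantile regression, which the Implicit Quantile Networks entry above relies on, can be illustrated with the plain quantile (pinball) loss. This sketch assumes the textbook, unsmoothed form of the loss; whether and how the paper modifies it is not covered by the excerpt. The names `quantile_loss` and `best_candidate` and the sample returns are hypothetical.

```python
def quantile_loss(prediction, target, tau):
    """Pinball loss: under-estimates cost tau, over-estimates cost 1 - tau."""
    error = target - prediction
    return tau * error if error >= 0 else (tau - 1.0) * error


def best_candidate(samples, tau):
    """Pick the sample value that minimises the total pinball loss.

    Hypothetical helper: a brute-force search over the samples themselves,
    used here in place of gradient-based training.
    """
    return min(samples, key=lambda c: sum(quantile_loss(c, x, tau) for x in samples))


# Made-up returns with one large outlier.
returns = [1.0, 2.0, 3.0, 4.0, 100.0]
median_estimate = best_candidate(returns, tau=0.5)
upper_estimate = best_candidate(returns, tau=0.9)
```

Minimising this loss for different values of `tau` recovers different points of the return distribution, which is why it suits an approach that approximates the full quantile function instead of a single expected value.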
##### Deep Reinforcement Learning
We draw a big picture, filled with details. We start with background of artificial intelligence, machine learning, deep learning, and reinforcement learning (RL), with resources.

6,307
0.09 stars / hour
##### Context Encoding for Semantic Segmentation
Recent work has made significant progress in improving spatial resolution for pixelwise labeling with Fully Convolutional Network (FCN) framework by employing Dilated/Atrous convolution, utilizing multi-scale features and refining boundaries. In this paper, we explore the impact of global contextual information in semantic segmentation by introducing the Context Encoding Module, which captures the semantic context of scenes and selectively highlights class-dependent featuremaps.
512
0.08 stars / hour
##### Deep TEN: Texture Encoding Network
We propose a Deep Texture Encoding Network (Deep-TEN) with a novel Encoding Layer integrated on top of convolutional layers, which ports the entire dictionary learning and encoding pipeline into a single model. The representation is orderless and therefore is particularly useful for material and texture recognition.
512
0.08 stars / hour
##### fairseq
Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
2,433
0.08 stars / hour