# Trending Research

Ordered by accumulated GitHub stars in last 3 days
##### Context is Everything: Finding Meaning Statistically in Semantic Spaces
This paper introduces Contextual Salience (CoSal), a simple and explicit measure of a word's importance in context which is a more theoretically natural, practically simpler, and more accurate replacement to tf-idf. A word vector space generated with both bigram phrases and unigram tokens reveals that contextually significant words disproportionately define phrases.
89
0.91 stars / hour
##### Proximal Policy Optimization Algorithms
We propose a new family of policy gradient methods for reinforcement learning, which alternate between sampling data through interaction with the environment, and optimizing a "surrogate" objective function using stochastic gradient ascent. Whereas standard policy gradient methods perform one gradient update per data sample, we propose a novel objective function that enables multiple epochs of minibatch updates.
86
0.82 stars / hour
##### Learning regression and verification networks for long-term visual tracking
It is difficult to determine the presence of the target and re-search the target in the entire image. Within the proposed collaborative framework, we develop a matching based regression module and a classification based verification module for long-term visual tracking.
22
0.67 stars / hour
##### Efficient Algorithms for t-distributed Stochastic Neighborhood Embedding
t-distributed Stochastic Neighborhood Embedding (t-SNE) is a method for dimensionality reduction and visualization that has become widely popular in recent years. Efficient implementations of t-SNE are available, but they scale poorly to datasets with hundreds of thousands to millions of high dimensional data-points.
241
0.58 stars / hour
##### Video-to-Video Synthesis
We study the problem of video-to-video synthesis, whose goal is to learn a mapping function from an input source video (e.g., a sequence of semantic segmentation masks) to an output photorealistic video that precisely depicts the content of the source video. Without understanding temporal dynamics, directly applying existing image synthesis approaches to an input video often results in temporally incoherent videos of low visual quality.
3,745
0.50 stars / hour
##### Consensus-Driven Propagation in Massive Unlabeled Data for Face Recognition
Face recognition has witnessed great progress in recent years, mainly attributed to the high-capacity model designed and the abundant labeled data collected. However, it becomes more and more prohibitive to scale up the current million-level identity annotations.
84
0.44 stars / hour
##### Dual Attention Network for Scene Segmentation
Specifically, we append two types of attention modules on top of traditional dilated FCN, which model the semantic interdependencies in spatial and channel dimensions respectively. The position attention module selectively aggregates the features at each position by a weighted sum of the features at all positions.
187
0.44 stars / hour
##### Deep Interest Network for Click-Through Rate Prediction
Click-through rate prediction is an essential task in industrial applications, such as online advertising. In this way, user features are compressed into a fixed-length representation vector, in regardless of what candidate ads are.
43
0.40 stars / hour
##### Deep Interest Evolution Network for Click-Through Rate Prediction
For CTR prediction model, it is necessary to capture the latent user interest behind the user behavior data. As user interests are diverse, especially in the e-commerce system, we propose interest evolving layer to capture interest evolving process that is relative to the target item.
43
0.40 stars / hour
##### TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems
TensorFlow is an interface for expressing machine learning algorithms, and an implementation for executing such algorithms. A computation expressed using TensorFlow can be executed with little or no change on a wide variety of heterogeneous systems, ranging from mobile devices such as phones and tablets up to large-scale distributed systems of hundreds of machines and thousands of computational devices such as GPU cards.
109,790
0.37 stars / hour
##### models
Models and examples built with TensorFlow
41,574
0.33 stars / hour
##### Adversarial Risk and the Dangers of Evaluating Against Weak Attacks
We motivate 'adversarial risk' as an objective for achieving models robust to worst-case inputs. We then frame commonly used attacks and evaluation metrics as defining a tractable surrogate objective to the true adversarial risk.
89
0.32 stars / hour
##### Focal Loss in 3D Object Detection
3D object detection is still an open problem in autonomous driving scenes. Robots recognize and localize key objects from sparse inputs, and suffer from a larger continuous searching space as well as serious fore-background imbalance compared to the image-based detection.
11
0.29 stars / hour
##### ESRGAN: Enhanced Super-Resolution Generative Adversarial Networks
The Super-Resolution Generative Adversarial Network (SRGAN) is a seminal work that is capable of generating realistic textures during single image super-resolution. To further enhance the visual quality, we thoroughly study three key components of SRGAN - network architecture, adversarial loss and perceptual loss, and improve each of them to derive an Enhanced SRGAN (ESRGAN).
118
0.29 stars / hour
##### Medical Image Synthesis for Data Augmentation and Anonymization using Generative Adversarial Networks
Medical imaging data sets are often imbalanced as pathologic findings are generally rare, which introduces significant challenges when training deep learning models. In this work, we propose a method to generate synthetic abnormal MRI images with brain tumors by training a generative adversarial network using two publicly available data sets of brain MRI.
21
0.29 stars / hour
##### A Probabilistic U-Net for Segmentation of Ambiguous Images
To this end we propose a generative segmentation model based on a combination of a U-Net with a conditional variational autoencoder that is capable of efficiently producing an unlimited number of plausible hypotheses. We show on a lung abnormalities segmentation task and on a Cityscapes segmentation task that our model reproduces the possible segmentation variants as well as the frequencies with which they occur, doing so significantly better than published approaches.
52
0.26 stars / hour
##### Commonsense for Generative Multi-Hop Question Answering Tasks
Reading comprehension QA tasks have seen a recent surge in popularity, yet most works have focused on fact-finding extractive QA. We instead focus on a more challenging multi-hop generative task (NarrativeQA), which requires the model to reason, gather, and synthesize disjoint pieces of information within the context to generate an answer.
9
0.25 stars / hour
##### An Integral Pose Regression System for the ECCV2018 PoseTrack Challenge
For the ECCV 2018 PoseTrack Challenge, we present a 3D human pose estimation system based mainly on the integral human pose regression method. We show a comprehensive ablation study to examine the key performance factors of the proposed system.
32
0.25 stars / hour
##### Integral Human Pose Regression
State-of-the-art human pose estimation methods are based on heat map representation. In spite of the good performance, the representation has a few issues in nature, such as not differentiable and quantization error.
32
0.25 stars / hour
##### Deep Speech: Scaling up end-to-end speech recognition
We present a state-of-the-art speech recognition system developed using end-to-end deep learning. Our architecture is significantly simpler than traditional speech systems, which rely on laboriously engineered processing pipelines; these traditional systems also tend to perform poorly when used in noisy environments.
7,878
0.24 stars / hour
##### OCNet: Object Context Network for Scene Parsing
According to that the label of each pixel $\mathit{P}$ is defined as the category of the object it belongs to, we propose the pixel-wise Object Context that consists of the objects belonging to the same category with pixel $\mathit{P}$. Since the ground truth objects that the pixel $\mathit{P}$ belonging to is unavailable, we employ the self-attention method to approximate the objects by learning a pixel-wise similarity map.
159
0.23 stars / hour
##### Detectron
FAIR's research platform for object detection research, implementing popular algorithms like Mask R-CNN and RetinaNet.
16,297
0.20 stars / hour
##### Axiomatic Attribution for Deep Networks
We study the problem of attributing the prediction of a deep network to its input features, a problem previously studied by several other works. We identify two fundamental axioms---Sensitivity and Implementation Invariance that attribution methods ought to satisfy.
2,067
0.19 stars / hour
##### SmoothGrad: removing noise by adding noise
Explaining the output of a deep network remains a challenge. In the case of an image classifier, one type of explanation is to identify pixels that strongly influence the final decision.
2,067
0.19 stars / hour
##### Temporal Relational Reasoning in Videos
Temporal relational reasoning, the ability to link meaningful transformations of objects or entities over time, is a fundamental property of intelligent species. In this paper, we introduce an effective and interpretable network module, the Temporal Relation Network (TRN), designed to learn and reason about temporal dependencies between video frames at multiple time scales.
230
0.18 stars / hour
Our approach efficiently detects objects in an image while simultaneously generating a high-quality segmentation mask for each instance. We show top results in all three tracks of the COCO suite of challenges, including instance segmentation, bounding-box object detection, and person keypoint detection.
7,645
0.18 stars / hour
##### Dense Object Nets: Learning Dense Visual Object Descriptors By and For Robotic Manipulation
What is the right object representation for manipulation? In this paper we present Dense Object Nets, which build on recent developments in self-supervised dense descriptor learning, as a consistent object representation for visual understanding and manipulation.
199
0.17 stars / hour
##### Consistent Individualized Feature Attribution for Tree Ensembles
Interpreting predictions from tree ensemble methods such as gradient boosting machines and random forests is important, yet feature attribution for trees is often heuristic and not individualized for each prediction. Here we show that popular feature attribution methods are inconsistent, meaning they can lower a feature's assigned importance when the true impact of that feature actually increases.
2,069
0.17 stars / hour
##### Axiomatic Attribution for Deep Networks
We study the problem of attributing the prediction of a deep network to its input features, a problem previously studied by several other works. We identify two fundamental axioms---Sensitivity and Implementation Invariance that attribution methods ought to satisfy.
2,069
0.17 stars / hour
##### SmoothGrad: removing noise by adding noise
Explaining the output of a deep network remains a challenge. In the case of an image classifier, one type of explanation is to identify pixels that strongly influence the final decision.
2,069
0.17 stars / hour
##### Billion-scale similarity search with GPUs
Similarity search finds application in specialized database systems handling complex data such as images or videos, which are typically represented by high-dimensional features and require specific indexing structures. We propose a design for k-selection that operates at up to 55% of theoretical peak performance, enabling a nearest neighbor implementation that is 8.5x faster than prior GPU state of the art.
4,612
0.16 stars / hour
##### Polysemous codes
This paper considers the problem of approximate nearest neighbor search in the compressed domain. We introduce polysemous codes, which offer both the distance estimation quality of product quantization and the efficient comparison of binary codes with Hamming distance.
4,612
0.16 stars / hour
##### Towards End-to-End Lane Detection: an Instance Segmentation Approach
Modern cars are incorporating an increasing number of driver assist features, among which automatic lane keeping. By doing so, we ensure a lane fitting which is robust against road plane changes, unlike existing approaches that rely on a fixed, pre-defined transformation.
174
0.16 stars / hour
##### Deep Exemplar-based Colorization
More importantly, as opposed to other learning-based colorization methods, our network allows the user to achieve customizable results by simply feeding different references. The colorization can be performed fully automatically by simply picking the top reference suggestion.
75
0.15 stars / hour
##### Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network
To achieve this, we propose a perceptual loss function which consists of an adversarial loss and a content loss. The adversarial loss pushes our solution to the natural image manifold using a discriminator network that is trained to differentiate between the super-resolved images and original photo-realistic images.
80
0.15 stars / hour
##### Recovering Realistic Texture in Image Super-resolution by Deep Spatial Feature Transform
In this paper, we show that it is possible to recover textures faithful to semantic classes. In particular, we only need to modulate features of a few intermediate layers in a single network conditioned on semantic segmentation probability maps.
80
0.15 stars / hour
##### UMAP: Uniform Manifold Approximation and Projection for Dimension Reduction
UMAP (Uniform Manifold Approximation and Projection) is a novel manifold learning technique for dimension reduction. UMAP is constructed from a theoretical framework based in Riemannian geometry and algebraic topology.
1,869
0.15 stars / hour
##### Texar: A Modularized, Versatile, and Extensible Toolkit for Text Generation
We introduce Texar, an open-source toolkit aiming to support the broad set of text generation tasks that transforms any inputs into natural language, such as machine translation, summarization, dialog, content manipulation, and so forth. With the design goals of modularity, versatility, and extensibility in mind, Texar extracts common patterns underlying the diverse tasks and methodologies, creates a library of highly reusable modules and functionalities, and allows arbitrary model architectures and algorithmic paradigms.
487
0.15 stars / hour
##### Analogical Reasoning on Chinese Morphological and Semantic Relations
Analogical reasoning is effective in capturing linguistic regularities. This paper proposes an analogical reasoning task on Chinese.
2,598
0.15 stars / hour
##### Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks
Image-to-image translation is a class of vision and graphics problems where the goal is to learn the mapping between an input image and an output image using a training set of aligned image pairs. Our goal is to learn a mapping $G: X \rightarrow Y$ such that the distribution of images from $G(X)$ is indistinguishable from the distribution $Y$ using an adversarial loss.
5,236
0.15 stars / hour
##### Image-to-Image Translation with Conditional Adversarial Networks
We investigate conditional adversarial networks as a general-purpose solution to image-to-image translation problems. These networks not only learn the mapping from input image to output image, but also learn a loss function to train this mapping.
5,236
0.15 stars / hour
##### Efficient Neural Architecture Search with Network Morphism
Network morphism, which keeps the functionality of a neural network while changing its neural architecture, could be helpful for NAS by enabling a more efficient training during the search. In this paper, we propose a novel framework enabling Bayesian optimization to guide the network morphism for efficient neural architecture search by introducing a neural network kernel and a tree-structured acquisition function optimization algorithm.
3,162
0.15 stars / hour
##### Scaling Neural Machine Translation
Sequence to sequence learning models still require several days to reach state of the art performance on large benchmark datasets using a single machine. This paper shows that reduced precision and large batch training can speedup training by nearly 5x on a single 8-GPU machine with careful tuning and implementation.
2,110
0.15 stars / hour
##### Classical Structured Prediction Losses for Sequence to Sequence Learning
There has been much recent work on training neural attention models at the sequence-level using either reinforcement learning-style methods or by optimizing the beam. In this paper, we survey a range of classical objective functions that have been widely used to train linear models for structured prediction and apply them to neural sequence to sequence models.
2,110
0.15 stars / hour
##### Hierarchical Neural Story Generation
We explore story generation: creative systems that can build coherent and fluent passages of text about a topic. We collect a large dataset of 300K human-written stories paired with writing prompts from an online forum.
2,110
0.15 stars / hour
##### fairseq
Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
2,110
0.15 stars / hour
##### Path Aggregation Network for Instance Segmentation
The way that information propagates in neural networks is of great importance. In this paper, we propose Path Aggregation Network (PANet) aiming at boosting information flow in proposal-based instance segmentation framework.
214
0.15 stars / hour
##### A Fully Progressive Approach to Single-Image Super-Resolution
Recent deep learning approaches to single image super-resolution have achieved impressive results in terms of traditional error measures and perceptual quality. However, in each case it remains challenging to achieve high quality results for large upsampling factors.
355
0.14 stars / hour
##### NiftyNet: a deep-learning platform for medical imaging
NiftyNet provides a modular deep-learning pipeline for a range of medical imaging applications including segmentation, regression, image generation and representation learning applications. NiftyNet enables researchers to rapidly develop and distribute deep learning solutions for segmentation, regression, image generation and representation learning applications, or extend the platform to new applications.
621
0.14 stars / hour
##### Phrase-Based & Neural Unsupervised Machine Translation
Machine translation systems achieve near human-level performance on some languages, yet their effectiveness strongly relies on the availability of large amounts of parallel sentences, which hinders their applicability to the majority of language pairs. On low-resource languages like English-Urdu and English-Romanian, our methods achieve even better results than semi-supervised and supervised approaches leveraging the paucity of available bitexts.
728
0.14 stars / hour
##### Unsupervised Machine Translation Using Monolingual Corpora Only
Machine translation has recently achieved impressive performance thanks to recent advances in deep learning and the availability of large-scale parallel corpora. By learning to reconstruct in both languages from this shared feature space, the model effectively learns to translate without using any labeled data.
728
0.14 stars / hour
##### AllenNLP: A Deep Semantic Natural Language Processing Platform
This paper describes AllenNLP, a platform for research on deep learning methods in natural language understanding. AllenNLP is designed to support researchers who want to build novel language understanding models quickly and easily.
3,330
0.14 stars / hour
##### Progressive Neural Architecture Search
We propose a new method for learning the structure of convolutional neural networks (CNNs) that is more efficient than recent state-of-the-art methods based on reinforcement learning and evolutionary algorithms. Our approach uses a sequential model-based optimization (SMBO) strategy, in which we search for structures in order of increasing complexity, while simultaneously learning a surrogate model to guide the search through structure space.
2,068
0.14 stars / hour
##### Generalisation in humans and deep neural networks
We compare the robustness of humans and current convolutional deep neural networks (DNNs) on object recognition under twelve different types of image degradations. First, using three well known DNNs (ResNet-152, VGG-19, GoogLeNet) we find the human visual system to be more robust to nearly all of the tested image manipulations, and we observe progressively diverging classification error-patterns between humans and DNNs when the signal gets weaker.
26
0.14 stars / hour
##### Self-Attention Generative Adversarial Networks
In this paper, we propose the Self-Attention Generative Adversarial Network (SAGAN) which allows attention-driven, long-range dependency modeling for image generation tasks. Traditional convolutional GANs generate high-resolution details as a function of only spatially local points in lower-resolution feature maps.
460
0.14 stars / hour
##### Spectral Normalization for Generative Adversarial Networks
One of the challenges in the study of generative adversarial networks is the instability of its training. In this paper, we propose a novel weight normalization technique called spectral normalization to stabilize the training of the discriminator.
460
0.14 stars / hour
##### CoQA: A Conversational Question Answering Challenge
Humans gather information by engaging in conversations involving a series of interconnected questions and answers. For machines to assist in information gathering, it is therefore essential to enable them to answer conversational questions.
749
0.13 stars / hour
##### XGBoost: A Scalable Tree Boosting System
In this paper, we describe a scalable end-to-end tree boosting system called XGBoost, which is used widely by data scientists to achieve state-of-the-art results on many machine learning challenges. We propose a novel sparsity-aware algorithm for sparse data and weighted quantile sketch for approximate tree learning.
13,502
0.13 stars / hour
##### Distractor-aware Siamese Networks for Visual Object Tracking
In this paper, we focus on learning distractor-aware Siamese networks for accurate and long-term tracking. During the off-line training phase, an effective sampling strategy is introduced to control this distribution and make the model focus on the semantic distractors.
291
0.13 stars / hour
##### The Lovász-Softmax loss: A tractable surrogate for the optimization of the intersection-over-union measure in neural networks
The Jaccard index, also referred to as the intersection-over-union score, is commonly employed in the evaluation of image segmentation results given its perceptual qualities, scale invariance - which lends appropriate relevance to small objects, and appropriate counting of false negatives, in comparison to per-pixel losses. We present a method for direct optimization of the mean intersection-over-union loss in neural networks, in the context of semantic image segmentation, based on the convex Lov\'asz extension of submodular losses.
228
0.13 stars / hour
##### Neural Factorization Machines for Sparse Predictive Analytics
Many predictive tasks of web applications need to model categorical variables, such as user IDs and demographics like genders and occupations. However, FM models feature interactions in a linear way, which can be insufficient for capturing the non-linear and complex inherent structure of real-world data.
117
0.13 stars / hour
##### xDeepFM: Combining Explicit and Implicit Feature Interactions for Recommender Systems
With the great success of deep neural networks (DNNs) in various fields, recently researchers have proposed several DNN-based factorization model to learn both low- and high-order feature interactions. On one hand, the xDeepFM is able to learn certain bounded-degree feature interactions explicitly; on the other hand, it can learn arbitrary low- and high-order feature interactions implicitly.
117
0.13 stars / hour
##### One-Shot Unsupervised Cross Domain Translation
Given a single image x from domain A and a set of images from domain B, our task is to generate the analogous of x in B. We argue that this task could be a key AI capability that underlines the ability of cognitive agents to act in the world and present empirical evidence that the existing unsupervised domain translation methods fail on this task.
59
0.13 stars / hour
##### Style Transfer Through Back-Translation
This paper introduces a new method for automatic style transfer. We first learn a latent representation of the input sentence which is grounded in a language translation model in order to better preserve the meaning of the sentence while reducing stylistic properties.
39
0.13 stars / hour
##### Focal Loss for Dense Object Detection
We propose to address this class imbalance by reshaping the standard cross entropy loss such that it down-weights the loss assigned to well-classified examples. Our novel Focal Loss focuses training on a sparse set of hard examples and prevents the vast number of easy negatives from overwhelming the detector during training.
1,628
0.13 stars / hour
##### FaceNet: A Unified Embedding for Face Recognition and Clustering
Despite significant recent advances in the field of face recognition, implementing face verification and recognition efficiently at scale presents serious challenges to current approaches. On the widely used Labeled Faces in the Wild (LFW) dataset, our system achieves a new record accuracy of 99.63%.
5,767
0.13 stars / hour
##### RecoGym: A Reinforcement Learning Environment for the problem of Product Recommendation in Online Advertising
Recommender Systems are becoming ubiquitous in many settings and take many forms, from product recommendation in e-commerce stores, to query suggestions in search engines, to friend recommendation in social networks. Current research directions which are largely based upon supervised learning from historical data appear to be showing diminishing returns with a lot of practitioners report a discrepancy between improvements in offline metrics for supervised learning and the online performance of the newly proposed models.
5
0.12 stars / hour
##### SCA-CNN: Spatial and Channel-wise Attention in Convolutional Networks for Image Captioning
Visual attention has been successfully applied in structural prediction tasks such as visual captioning and question answering. Existing visual attention models are generally spatial, i.e., the attention is modeled as spatial probabilities that re-weight the last conv-layer feature map of a CNN encoding an input image.
80
0.12 stars / hour
##### Horovod: fast and easy distributed deep learning in TensorFlow
Training modern deep learning models requires large amounts of computation, often provided by GPUs. Depending on the particular methods employed, this communication may entail anywhere from negligible to significant overhead.
3,744
0.12 stars / hour
##### Learning Named Entity Tagger using Domain-Specific Dictionary
Recent advances in deep neural models allow us to build reliable named entity recognition (NER) systems without handcrafting features. However, such methods require large amounts of manually-labeled training data.
20
0.12 stars / hour
##### Automated Phrase Mining from Massive Text Corpora
As one of the fundamental tasks in text analysis, phrase mining aims at extracting quality phrases from a text corpus. Since one can easily obtain many quality phrases from public knowledge bases to a scale that is much larger than that produced by human experts, in this paper, we propose a novel framework for automated phrase mining, AutoPhrase, which leverages this large amount of high-quality phrases in an effective way and achieves better performance compared to limited human labeled phrases.
20
0.12 stars / hour
##### Generative Image Inpainting with Contextual Attention
Motivated by these observations, we propose a new deep generative model-based approach which can not only synthesize novel image structures but also explicitly utilize surrounding image features as references during network training to make better predictions. The model is a feed-forward, fully convolutional neural network which can process images with multiple holes at arbitrary locations and with variable sizes during the test time.
406
0.12 stars / hour
##### Free-Form Image Inpainting with Gated Convolution
We present a novel deep learning based image inpainting system to complete images with free-form masks and inputs. Furthermore, visualization of learned feature representations reveals the effectiveness of gated convolution and provides an interpretation of how the proposed neural network fills in missing regions.
406
0.12 stars / hour
##### Ray: A Distributed Framework for Emerging AI Applications
The next generation of AI applications will continuously interact with the environment and learn from these interactions. These applications impose new and demanding systems requirements, both in terms of performance and flexibility.
4,245
0.12 stars / hour