# Top Papers

Ordered by total GitHub stars
##### TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems
TensorFlow is an interface for expressing machine learning algorithms, and an implementation for executing such algorithms. A computation expressed using TensorFlow can be executed with little or no change on a wide variety of heterogeneous systems, ranging from mobile devices such as phones and tablets up to large-scale distributed systems of hundreds of machines and thousands of computational devices such as GPU cards.
107,618
##### models
Models and examples built with TensorFlow
40,033
##### Caffe: Convolutional Architecture for Fast Feature Embedding
The framework is a BSD-licensed C++ library with Python and MATLAB bindings for training and deploying general-purpose convolutional neural networks and other deep models efficiently on commodity architectures. Caffe fits industry and internet-scale media needs by CUDA GPU computation, processing over 40 million images a day on a single K40 or Titan GPU ($\approx$ 2.5 ms per image).
25,209
##### A Neural Algorithm of Artistic Style
In fine art, especially painting, humans have mastered the skill to create unique visual experiences through composing a complex interplay between the content and style of an image. Thus far the algorithmic basis of this process is unknown and there exists no artificial system with similar capabilities.
15,998
##### Detectron
FAIR's research platform for object detection research, implementing popular algorithms like Mask R-CNN and RetinaNet.
15,787
##### Enriching Word Vectors with Subword Information
Continuous word representations, trained on large unlabeled corpora are useful for many natural language processing tasks. A vector representation is associated to each character $n$-gram; words being represented as the sum of these representations.
15,217
##### FastText.zip: Compressing text classification models
We consider the problem of producing compact architectures for text classification, such that the full model fits in a limited amount of memory. After considering different solutions inspired by the hashing literature, we propose a method built upon product quantization to store word embeddings.
15,217
##### Bag of Tricks for Efficient Text Classification
This paper explores a simple and efficient baseline for text classification. Our experiments show that our fast text classifier fastText is often on par with deep learning classifiers in terms of accuracy, and many orders of magnitude faster for training and evaluation.
15,217
##### Multi-Goal Reinforcement Learning: Challenging Robotics Environments and Request for Research
The purpose of this technical report is two-fold. First of all, it introduces a suite of challenging continuous control tasks (integrated with OpenAI Gym) based on currently existing robotics hardware.
13,218
##### OpenAI Gym
OpenAI Gym is a toolkit for reinforcement learning research. It includes a growing collection of benchmark problems that expose a common interface, and a website where people can share their results and compare the performance of algorithms.
13,218
##### XGBoost: A Scalable Tree Boosting System
In this paper, we describe a scalable end-to-end tree boosting system called XGBoost, which is used widely by data scientists to achieve state-of-the-art results on many machine learning challenges. We propose a novel sparsity-aware algorithm for sparse data and weighted quantile sketch for approximate tree learning.
13,144
##### Image Super-Resolution Using Deep Convolutional Networks
The mapping is represented as a deep convolutional neural network (CNN) that takes the low-resolution image as the input and outputs the high-resolution one. We further show that traditional sparse-coding-based SR methods can also be viewed as a deep convolutional network.
11,796
##### Deep Photo Style Transfer
This paper introduces a deep-learning approach to photographic style transfer that handles a large variety of image content while faithfully transferring the reference style. Our approach builds upon the recent work on painterly transfer that separates style from the content of an image by considering different layers of a neural network.
11,025
##### pix2code: Generating Code from a Graphical User Interface Screenshot
Transforming a graphical user interface screenshot created by a designer into computer code is a typical task conducted by a developer in order to build customized software, websites, and mobile applications. In this paper, we show that deep learning methods can be leveraged to train a model end-to-end to automatically generate code from a single input image with over 77% of accuracy for three different platforms (i.e. iOS, Android and web-based technologies).
9,390
##### A Closed-form Solution to Photorealistic Image Stylization
Photorealistic image stylization concerns transferring style of a reference photo to a content photo with the constraint that the stylized photo should remain photorealistic. The proposed method consists of a stylization step and a smoothing step.
8,480
##### Generating Sequences With Recurrent Neural Networks
This paper shows how Long Short-term Memory recurrent neural networks can be used to generate complex sequences with long-range structure, simply by predicting one data point at a time. The approach is demonstrated for text (where the data are discrete) and online handwriting (where the data are real-valued).
8,342
##### Deep Speech: Scaling up end-to-end speech recognition
We present a state-of-the-art speech recognition system developed using end-to-end deep learning. Our architecture is significantly simpler than traditional speech systems, which rely on laboriously engineered processing pipelines; these traditional systems also tend to perform poorly when used in noisy environments.
7,539
Our approach efficiently detects objects in an image while simultaneously generating a high-quality segmentation mask for each instance. We show top results in all three tracks of the COCO suite of challenges, including instance segmentation, bounding-box object detection, and person keypoint detection.
7,149
##### Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks
Image-to-image translation is a class of vision and graphics problems where the goal is to learn the mapping between an input image and an output image using a training set of aligned image pairs. Our goal is to learn a mapping $G: X \rightarrow Y$ such that the distribution of images from $G(X)$ is indistinguishable from the distribution $Y$ using an adversarial loss.
6,907
##### AirSim: High-Fidelity Visual and Physical Simulation for Autonomous Vehicles
Developing and testing algorithms for autonomous vehicles in real world is an expensive and time consuming process. Also, in order to utilize recent advances in machine intelligence and deep learning we need to collect a large amount of annotated training data in a variety of conditions and environments.
5,544
##### FaceNet: A Unified Embedding for Face Recognition and Clustering
Despite significant recent advances in the field of face recognition, implementing face verification and recognition efficiently at scale presents serious challenges to current approaches. On the widely used Labeled Faces in the Wild (LFW) dataset, our system achieves a new record accuracy of 99.63%.
5,481
##### DeepMind Lab
DeepMind Lab is a first-person 3D game platform designed for research and development of general artificial intelligence and machine learning systems. DeepMind Lab can be used to study how autonomous artificial agents may learn complex tasks in large, partially observed, and visually diverse worlds.
5,238
##### Image-to-Image Translation with Conditional Adversarial Networks
We investigate conditional adversarial networks as a general-purpose solution to image-to-image translation problems. These networks not only learn the mapping from input image to output image, but also learn a loss function to train this mapping.
5,072
##### Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks
Image-to-image translation is a class of vision and graphics problems where the goal is to learn the mapping between an input image and an output image using a training set of aligned image pairs. Our goal is to learn a mapping $G: X \rightarrow Y$ such that the distribution of images from $G(X)$ is indistinguishable from the distribution $Y$ using an adversarial loss.
4,894
##### Image-to-Image Translation with Conditional Adversarial Networks
We investigate conditional adversarial networks as a general-purpose solution to image-to-image translation problems. These networks not only learn the mapping from input image to output image, but also learn a loss function to train this mapping.
4,894
##### StarCraft II: A New Challenge for Reinforcement Learning
Finally, we present initial baseline results for canonical deep reinforcement learning agents applied to the StarCraft II domain. On the mini-games, these agents learn to achieve a level of play that is comparable to a novice player.
4,870
##### Show and Tell: A Neural Image Caption Generator
Experiments on several datasets show the accuracy of the model and the fluency of the language it learns solely from image descriptions. For instance, while the current state-of-the-art BLEU-1 score (the higher the better) on the Pascal dataset is 25, our approach yields 59, to be compared to human performance around 69.
4,786
##### tensor2tensor
Library of deep learning models and datasets designed to make deep learning more accessible and accelerate ML research.
4,725
##### Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
In this work, we introduce a Region Proposal Network (RPN) that shares full-image convolutional features with the detection network, thus enabling nearly cost-free region proposals. We further merge RPN and Fast R-CNN into a single network by sharing their convolutional features---using the recently popular terminology of neural networks with 'attention' mechanisms, the RPN component tells the unified network where to look.
4,666
##### Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks
In recent years, supervised learning with convolutional networks (CNNs) has seen huge adoption in computer vision applications. Comparatively, unsupervised learning with CNNs has received less attention.
4,584
##### Evaluation of a Tree-based Pipeline Optimization Tool for Automating Data Science
As the field of data science continues to grow, there will be an ever-increasing demand for tools that make machine learning accessible to non-experts. In this paper, we introduce the concept of tree-based pipeline optimization for automating one of the most tedious parts of machine learning---pipeline design.
4,456
##### Billion-scale similarity search with GPUs
Similarity search finds application in specialized database systems handling complex data such as images or videos, which are typically represented by high-dimensional features and require specific indexing structures. We propose a design for k-selection that operates at up to 55% of theoretical peak performance, enabling a nearest neighbor implementation that is 8.5x faster than prior GPU state of the art.
4,375
##### Polysemous codes
This paper considers the problem of approximate nearest neighbor search in the compressed domain. We introduce polysemous codes, which offer both the distance estimation quality of product quantization and the efficient comparison of binary codes with Hamming distance.
4,375
##### "Why Should I Trust You?": Explaining the Predictions of Any Classifier
Despite widespread adoption, machine learning models remain mostly black boxes. Understanding the reasons behind predictions is, however, quite important in assessing trust, which is fundamental if one plans to take action based on a prediction, or when choosing whether to deploy a new model.
4,364
##### A guide to convolution arithmetic for deep learning
We introduce a guide to help deep learning practitioners understand and manipulate convolutional neural network architectures. The guide clarifies the relationship between various properties (input shape, kernel shape, zero padding, strides and output shape) of convolutional, pooling and transposed convolutional layers, as well as the relationship between convolutional and transposed convolutional layers.
4,339
##### Practical recommendations for gradient-based training of deep architectures
Learning algorithms related to artificial neural networks and in particular for Deep Learning may seem to involve many bells and whistles, called hyper-parameters. It also discusses how to deal with the fact that more interesting results can be obtained when allowing one to adjust many hyper-parameters.
4,329
##### Practical recommendations for gradient-based training of deep architectures
Learning algorithms related to artificial neural networks and in particular for Deep Learning may seem to involve many bells and whistles, called hyper-parameters. It also discusses how to deal with the fact that more interesting results can be obtained when allowing one to adjust many hyper-parameters.
4,329
##### Increasing the Action Gap: New Operators for Reinforcement Learning
This paper introduces new optimality-preserving operators on Q-functions. Extending the idea of a locally consistent operator, we then derive sufficient conditions for an operator to preserve optimality, leading to a family of operators which includes our consistent Bellman operator.
4,130
##### A Neural Algorithm of Artistic Style
In fine art, especially painting, humans have mastered the skill to create unique visual experiences through composing a complex interplay between the content and style of an image. Thus far the algorithmic basis of this process is unknown and there exists no artificial system with similar capabilities.
4,077
##### Massive Exploration of Neural Machine Translation Architectures
Neural Machine Translation (NMT) has shown remarkable progress over the past few years with production systems now being deployed to end-users. One major drawback of current architectures is that they are expensive to train, typically requiring days to weeks of GPU time to converge.
4,046
##### Deep Painterly Harmonization
Copying an element from a photo and pasting it into a painting is a challenging task. Applying photo compositing techniques in this context yields subpar results that look like a collage --- and existing painterly stylization algorithms, which are global, perform poorly when applied locally.
3,980
##### Very Deep Convolutional Networks for Large-Scale Image Recognition
In this work we investigate the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting. Our main contribution is a thorough evaluation of networks of increasing depth using an architecture with very small (3x3) convolution filters, which shows that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 weight layers.
3,912
##### Deep Residual Learning for Image Recognition
We present a residual learning framework to ease the training of networks that are substantially deeper than those used previously. Deep residual nets are foundations of our submissions to ILSVRC & COCO 2015 competitions, where we also won the 1st places on the tasks of ImageNet detection, ImageNet localization, COCO detection, and COCO segmentation.
3,912
##### Rethinking the Inception Architecture for Computer Vision
Convolutional networks are at the core of most state-of-the-art computer vision solutions for a wide variety of tasks. Since 2014 very deep convolutional networks started to become mainstream, yielding substantial gains in various benchmarks.
3,912
##### Ray: A Distributed Framework for Emerging AI Applications
The next generation of AI applications will continuously interact with the environment and learn from these interactions. These applications impose new and demanding systems requirements, both in terms of performance and flexibility.
3,909
##### Fashion-MNIST: a Novel Image Dataset for Benchmarking Machine Learning Algorithms
We present Fashion-MNIST, a new dataset comprising of 28x28 grayscale images of 70,000 fashion products from 10 categories, with 7,000 images per category. The training set has 60,000 images and the test set has 10,000 images.
3,833
##### Convolutional Neural Networks for Sentence Classification
We report on a series of experiments with convolutional neural networks (CNN) trained on top of pre-trained word vectors for sentence-level classification tasks. We show that a simple CNN with little hyperparameter tuning and static vectors achieves excellent results on multiple benchmarks.
3,706
##### A Sensitivity Analysis of (and Practitioners' Guide to) Convolutional Neural Networks for Sentence Classification
Convolutional Neural Networks (CNNs) have recently achieved remarkably strong performance on the practically important task of sentence classification (kim 2014, kalchbrenner 2014, johnson 2014). However, these models require practitioners to specify an exact model architecture and set accompanying hyperparameters, including the filter region size, regularization parameters, and so on.
3,706
##### Learning a Driving Simulator
Comma.ai's approach to Artificial Intelligence for self-driving cars is based on an agent that learns to clone driver behaviors and plans maneuvers by simulating future events in the road. This paper illustrates one of our research approaches for driving simulation.
3,677
The move from hand-designed features to learned features in machine learning has been wildly successful. In spite of this, optimization algorithms are still designed by hand.
3,562
##### ParlAI
A framework for training and evaluating AI models on a variety of openly available dialog datasets.
3,551
##### You Only Look Once: Unified, Real-Time Object Detection
A single neural network predicts bounding boxes and class probabilities directly from full images in one evaluation. A smaller version of the network, Fast YOLO, processes an astounding 155 frames per second while still achieving double the mAP of other real-time detectors.
3,502
##### YOLO9000: Better, Faster, Stronger
YOLO9000 gets 19.7 mAP on the ImageNet detection validation set despite only having detection data for 44 of the 200 classes. On the 156 classes not in COCO, YOLO9000 gets 16.0 mAP.
3,502
##### Horovod: fast and easy distributed deep learning in TensorFlow
Training modern deep learning models requires large amounts of computation, often provided by GPUs. Depending on the particular methods employed, this communication may entail anywhere from negligible to significant overhead.
3,442
##### First-Pass Large Vocabulary Continuous Speech Recognition using Bi-Directional Recurrent DNNs
Recent work demonstrated the feasibility of discarding the HMM sequence modeling framework by directly predicting transcript text from audio. This approach to decoding enables first-pass speech recognition with a language model, completely unaided by the cumbersome infrastructure of HMM-based systems.
3,329
##### Deep Speech 2: End-to-End Speech Recognition in English and Mandarin
We show that an end-to-end deep learning approach can be used to recognize either English or Mandarin Chinese speech--two vastly different languages. Key to our approach is our application of HPC techniques, resulting in a 7x speedup over our previous system.
3,329
##### Image-to-Image Translation with Conditional Adversarial Networks
We investigate conditional adversarial networks as a general-purpose solution to image-to-image translation problems. These networks not only learn the mapping from input image to output image, but also learn a loss function to train this mapping.
3,318
##### Combining Markov Random Fields and Convolutional Neural Networks for Image Synthesis
This paper studies a combination of generative Markov random field (MRF) models and discriminatively trained deep convolutional neural networks (dCNNs) for synthesizing 2D images. The generative MRF acts on higher-levels of a dCNN feature pyramid, controling the image layout at an abstract level.
3,233
##### DensePose: Dense Human Pose Estimation In The Wild
In this work, we establish dense correspondences between RGB image and a surface-based representation of the human body, a task we refer to as dense human pose estimation. We first gather dense correspondences for 50K persons appearing in the COCO dataset by introducing an efficient annotation pipeline.
3,215
##### CatBoost: unbiased boosting with categorical features
This paper presents the key algorithmic techniques behind CatBoost, a state-of-the-art open-source gradient boosting toolkit. Their combination leads to CatBoost outperforming other publicly available boosting implementations in terms of quality on a variety of datasets.
3,101
##### Convolutional Sequence to Sequence Learning
The prevalent approach to sequence to sequence learning maps an input sequence to a variable length output sequence via recurrent neural networks. We introduce an architecture based entirely on convolutional neural networks.
3,039
##### A Convolutional Encoder Model for Neural Machine Translation
The prevalent approach to neural machine translation relies on bi-directional LSTMs to encode the source sentence. In this paper we present a faster and simpler architecture based on a succession of convolutional layers.
3,039
##### Neural Machine Translation of Rare Words with Subword Units
Neural machine translation (NMT) models typically operate with a fixed vocabulary, but translation is an open-vocabulary problem. Previous work addresses the translation of out-of-vocabulary words by backing off to a dictionary.
3,039
##### Progressive Growing of GANs for Improved Quality, Stability, and Variation
We describe a new training methodology for generative adversarial networks. The key idea is to grow both the generator and discriminator progressively: starting from a low resolution, we add new layers that model increasingly fine details as training progresses.
3,035
##### Dynamic Routing Between Capsules
We use the length of the activity vector to represent the probability that the entity exists and its orientation to represent the instantiation parameters. Active capsules at one level make predictions, via transformation matrices, for the instantiation parameters of higher-level capsules.
2,965
##### AllenNLP: A Deep Semantic Natural Language Processing Platform
This paper describes AllenNLP, a platform for research on deep learning methods in natural language understanding. AllenNLP is designed to support researchers who want to build novel language understanding models quickly and easily.
2,964
##### Distributed and parallel time series feature extraction for industrial big data applications
The all-relevant problem of feature selection is the identification of all strongly and weakly relevant attributes. This problem is especially hard to solve for time series classification and regression in industrial applications such as predictive maintenance or production line optimization, for which each label or regression target is associated with several time series and meta-information simultaneously.
2,946
##### Clinical Concept Embeddings Learned from Massive Sources of Multimodal Medical Data
Word embeddings are a popular approach to unsupervised learning of word relationships that are widely used in natural language processing. In this article, we present a new set of embeddings for medical concepts learned using an extremely large collection of multimodal medical data.
2,906
##### PubMed 200k RCT: a Dataset for Sequential Sentence Classification in Medical Abstracts
The dataset consists of approximately 200,000 abstracts of randomized controlled trials, totaling 2.3 million sentences. First, the majority of datasets for sequential short-text classification (i.e., classification of short texts that appear in sequences) are small: we hope that releasing a new large dataset will help develop more accurate algorithms for this task.
2,906
##### StarGAN: Unified Generative Adversarial Networks for Multi-Domain Image-to-Image Translation
Recent studies have shown remarkable success in image-to-image translation for two domains. To address this limitation, we propose StarGAN, a novel and scalable approach that can perform image-to-image translations for multiple domains using only a single model.
2,890
##### cleverhans
An adversarial example library for constructing attacks, building defenses, and benchmarking both
2,892
##### DeCAF: A Deep Convolutional Activation Feature for Generic Visual Recognition
We evaluate whether features extracted from the activation of a deep convolutional network trained in a fully supervised fashion on a large, fixed set of object recognition tasks can be re-purposed to novel generic tasks. Our generic tasks may differ significantly from the originally trained tasks and there may be insufficient labeled or unlabeled data to conventionally train or adapt a deep architecture to the new tasks.
2,862
##### Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks
In recent years, supervised learning with convolutional networks (CNNs) has seen huge adoption in computer vision applications. Comparatively, unsupervised learning with CNNs has received less attention.
2,816
##### Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields
We present an approach to efficiently detect the 2D pose of multiple people in an image. The approach uses a nonparametric representation, which we refer to as Part Affinity Fields (PAFs), to learn to associate body parts with individuals in the image.
2,789
##### Densely Connected Convolutional Networks
Recent work has shown that convolutional networks can be substantially deeper, more accurate, and efficient to train if they contain shorter connections between layers close to the input and those close to the output. In this paper, we embrace this observation and introduce the Dense Convolutional Network (DenseNet), which connects each layer to every other layer in a feed-forward fashion.
2,770
##### Memory-Efficient Implementation of DenseNets
The DenseNet architecture is highly computationally efficient as a result of feature reuse. A 264-layer DenseNet (73M parameters), which previously would have been infeasible to train, can now be trained on a single workstation with 8 NVIDIA Tesla M40 GPUs.
2,770
##### Instance Normalization: The Missing Ingredient for Fast Stylization
It this paper we revisit the fast stylization method introduced in Ulyanov et. We show how a small change in the stylization architecture results in a significant qualitative improvement in the generated images.
2,669
##### Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm
The game of chess is the most widely-studied domain in the history of artificial intelligence. The strongest programs are based on a combination of sophisticated search techniques, domain-specific adaptations, and handcrafted evaluation functions that have been refined by human experts over several decades.
2,660
##### Understanding Neural Networks Through Deep Visualization
Recent years have produced great advances in training large, deep neural networks (DNNs), including notable successes in training convolutional neural networks (convnets) to recognize natural images. The first is a tool that visualizes the activations produced on each layer of a trained convnet as it processes an image or video (e.g. a live webcam stream).
2,647
##### Pylearn2: a machine learning research library
Pylearn2 is a machine learning research library. This does not just mean that it is a collection of machine learning algorithms that share a common API; it means that it has been designed for flexibility and extensibility in order to facilitate research projects that involve new or unusual use cases.
2,609
##### A Neural Algorithm of Artistic Style
In fine art, especially painting, humans have mastered the skill to create unique visual experiences through composing a complex interplay between the content and style of an image. Thus far the algorithmic basis of this process is unknown and there exists no artificial system with similar capabilities.
2,546
##### WaveNet: A Generative Model for Raw Audio
This paper introduces WaveNet, a deep neural network for generating raw audio waveforms. The model is fully probabilistic and autoregressive, with the predictive distribution for each audio sample conditioned on all previous ones; nonetheless we show that it can be efficiently trained on data with tens of thousands of samples per second of audio.
2,520
This paper proposes to tackle open- domain question answering using Wikipedia as the unique knowledge source: the answer to any factoid question is a text span in a Wikipedia article. This task of machine reading at scale combines the challenges of document retrieval (finding the relevant articles) with that of machine comprehension of text (identifying the answer spans from those articles).
2,517
##### Efficient Neural Architecture Search with Network Morphism
Network morphism, which keeps the functionality of a neural network while changing its neural architecture, could be helpful for NAS by enabling a more efficient training during the search. In this paper, we propose a novel framework enabling Bayesian optimization to guide the network morphism for efficient neural architecture search by introducing a neural network kernel and a tree-structured acquisition function optimization algorithm.
2,471
##### scikit-image: Image processing in Python
scikit-image is an image processing library that implements algorithms and utilities for use in research, education and industry applications. It is released under the liberal "Modified BSD" open source license, provides a well-documented API in the Python programming language, and is developed by an active, international team of collaborators.
2,457
##### DyNet: The Dynamic Neural Network Toolkit
We describe DyNet, a toolkit for implementing neural network models based on dynamic declaration of network structure. In the static declaration strategy that is used in toolkits like Theano, CNTK, and TensorFlow, the user first defines a computation graph (a symbolic representation of the computation), and then examples are fed into an engine that executes this computation and computes its derivatives.
2,412
##### DyNet: The Dynamic Neural Network Toolkit
We describe DyNet, a toolkit for implementing neural network models based on dynamic declaration of network structure. In the static declaration strategy that is used in toolkits like Theano, CNTK, and TensorFlow, the user first defines a computation graph (a symbolic representation of the computation), and then examples are fed into an engine that executes this computation and computes its derivatives.
2,412
##### OverFeat: Integrated Recognition, Localization and Detection using Convolutional Networks
We show how a multiscale and sliding window approach can be efficiently implemented within a ConvNet. This integrated framework is the winner of the localization task of the ImageNet Large Scale Visual Recognition Challenge 2013 (ILSVRC2013) and obtained very competitive results for the detection and classifications tasks.
2,392
##### Very Deep Convolutional Networks for Large-Scale Image Recognition
In this work we investigate the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting. Our main contribution is a thorough evaluation of networks of increasing depth using an architecture with very small (3x3) convolution filters, which shows that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 weight layers.
2,392
##### Imbalanced-learn: A Python Toolbox to Tackle the Curse of Imbalanced Datasets in Machine Learning
Imbalanced-learn is an open-source python toolbox aiming at providing a wide range of methods to cope with the problem of imbalanced dataset frequently encountered in machine learning and pattern recognition. The implemented state-of-the-art methods can be categorized into 4 groups: (i) under-sampling, (ii) over-sampling, (iii) combination of over- and under-sampling, and (iv) ensemble learning methods.
2,376
##### Analogical Reasoning on Chinese Morphological and Semantic Relations
Analogical reasoning is effective in capturing linguistic regularities. This paper proposes an analogical reasoning task on Chinese.
2,360
##### A Neural Algorithm of Artistic Style
In fine art, especially painting, humans have mastered the skill to create unique visual experiences through composing a complex interplay between the content and style of an image. Thus far the algorithmic basis of this process is unknown and there exists no artificial system with similar capabilities.
2,210
##### Going Deeper with Convolutions
We propose a deep convolutional neural network architecture codenamed "Inception", which was responsible for setting the new state of the art for classification and detection in the ImageNet Large-Scale Visual Recognition Challenge 2014 (ILSVRC 2014). The main hallmark of this architecture is the improved utilization of the computing resources inside the network.
2,210
##### High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs
We present a new method for synthesizing high-resolution photo-realistic images from semantic label maps using conditional generative adversarial networks (conditional GANs). Conditional GANs have enabled a variety of applications, but the results are often limited to low-resolution and still far from realistic.
2,193
##### SSD: Single Shot MultiBox Detector
We present a method for detecting objects in images using a single deep neural network. Experimental results on the PASCAL VOC, MS COCO, and ILSVRC datasets confirm that SSD has comparable accuracy to methods that utilize an additional object proposal step and is much faster, while providing a unified framework for both training and inference.
2,166
##### Augmentor: An Image Augmentation Library for Machine Learning
The generation of artificial data based on existing observations, known as data augmentation, is a technique used in machine learning to improve model accuracy, generalisation, and to control overfitting. Augmentor is a software package, available in both Python and Julia versions, that provides a high level API for the expansion of image data using a stochastic, pipeline-based approach which effectively allows for images to be sampled from a distribution of augmented images at runtime.
2,157
##### StarSpace: Embed All The Things!
We present StarSpace, a general-purpose neural embedding model that can solve a wide variety of problems: labeling tasks such as text classification, ranking tasks such as information retrieval/web search, collaborative filtering-based or content-based recommendation, embedding of multi-relational graphs, and learning word, sentence or document level embeddings. In each case the model works by embedding those entities comprised of discrete features and comparing them against each other -- learning similarities dependent on the task.
2,121
##### Better Computer Go Player with Neural Network and Long-term Prediction
Competing with top human players in the ancient game of Go has been a long-term goal of artificial intelligence. Against human players, the newest versions, darkfores2, achieve a stable 3d level on KGS Go Server as a ranked bot, a substantial improvement upon the estimated 4k-5k ranks for DCNN reported in Clark & Storkey (2015) based on games against other machine players.
2,098
##### Fast R-CNN
Fast R-CNN builds on previous work to efficiently classify object proposals using deep convolutional networks. Fast R-CNN trains the very deep VGG16 network 9x faster than R-CNN, is 213x faster at test-time, and achieves a higher mAP on PASCAL VOC 2012.
2,091