Search Results for author: Sebastian Ruder

Found 76 papers, 50 papers with code

XTREME: A Massively Multilingual Multi-task Benchmark for Evaluating Cross-lingual Generalisation

2 code implementations ICML 2020 Junjie Hu, Sebastian Ruder, Aditya Siddhant, Graham Neubig, Orhan Firat, Melvin Johnson

However, these broad-coverage benchmarks have been mostly limited to English, and despite an increasing interest in multilingual models, a benchmark that enables the comprehensive evaluation of such methods on a diverse range of languages and tasks is still missing.

Retrieval Zero-Shot Cross-Lingual Transfer

Writing System and Speaker Metadata for 2,800+ Language Varieties

1 code implementation LREC 2022 Daan van Esch, Tamar Lucassen, Sebastian Ruder, Isaac Caswell, Clara Rivera

We describe an open-source dataset providing metadata for about 2,800 language varieties used in the world today.

Multi-Domain Multilingual Question Answering

1 code implementation EMNLP (ACL) 2021 Sebastian Ruder, Avi Sil

Question answering (QA) is one of the most challenging and impactful tasks in natural language processing.

Cross-Lingual Transfer Domain Adaptation +2

MAD-G: Multilingual Adapter Generation for Efficient Cross-Lingual Transfer

no code implementations Findings (EMNLP) 2021 Alan Ansell, Edoardo Maria Ponti, Jonas Pfeiffer, Sebastian Ruder, Goran Glavaš, Ivan Vulić, Anna Korhonen

While offering (1) improved fine-tuning efficiency (by a factor of around 50 in our experiments), (2) a smaller parameter budget, and (3) increased language coverage, MAD-G remains competitive with more expensive methods for language-specific adapter training across the board.

Dependency Parsing named-entity-recognition +4

Modular Deep Learning

no code implementations22 Feb 2023 Jonas Pfeiffer, Sebastian Ruder, Ivan Vulić, Edoardo Maria Ponti

Modular deep learning has emerged as a promising solution to these challenges.

Causal Inference Transfer Learning

QAmeleon: Multilingual QA with Only 5 Examples

no code implementations15 Nov 2022 Priyanka Agrawal, Chris Alberti, Fantine Huot, Joshua Maynez, Ji Ma, Sebastian Ruder, Kuzman Ganchev, Dipanjan Das, Mirella Lapata

The availability of large, high-quality datasets has been one of the main drivers of recent progress in question answering (QA).

Few-Shot Learning Question Answering

TaTa: A Multilingual Table-to-Text Dataset for African Languages

1 code implementation31 Oct 2022 Sebastian Gehrmann, Sebastian Ruder, Vitaly Nikolaev, Jan A. Botha, Michael Chavinda, Ankur Parikh, Clara Rivera

To address this lack of data, we create Table-to-Text in African languages (TaTa), the first large multilingual table-to-text dataset with a focus on African languages.

Data-to-Text Generation

Language Models are Multilingual Chain-of-Thought Reasoners

1 code implementation6 Oct 2022 Freda Shi, Mirac Suzgun, Markus Freitag, Xuezhi Wang, Suraj Srivats, Soroush Vosoughi, Hyung Won Chung, Yi Tay, Sebastian Ruder, Denny Zhou, Dipanjan Das, Jason Wei

Finally, we show that the multilingual reasoning abilities of language models extend to other tasks such as commonsense reasoning and word-in-context semantic judgment.

GSM8K
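
The paper above evaluates chain-of-thought prompting on multilingual grade-school math problems (MGSM). As a rough illustration only (the benchmark's actual prompts differ), a few-shot chain-of-thought exemplar for a French question might look like the following sketch:

```python
# Hypothetical few-shot chain-of-thought prompt in French; illustrative only,
# not the prompt used in the paper.
prompt_template = (
    "Q: Roger a 5 balles de tennis. Il achète 2 boîtes de 3 balles chacune. "
    "Combien de balles a-t-il maintenant ?\n"
    "A: Réfléchissons étape par étape. Roger commence avec 5 balles. "
    "2 boîtes de 3 balles font 6 balles. 5 + 6 = 11. La réponse est 11.\n\n"
    "Q: {question}\n"
    "A: Réfléchissons étape par étape."
)
print(prompt_template.format(question="Combien font 4 boîtes de 6 balles ?"))
```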

Square One Bias in NLP: Towards a Multi-Dimensional Exploration of the Research Manifold

1 code implementation Findings (ACL) 2022 Sebastian Ruder, Ivan Vulić, Anders Søgaard

Most work targeting multilinguality, for example, considers only accuracy; most work on fairness or interpretability considers only English; and so on.

Fairness

Evaluating Inclusivity, Equity, and Accessibility of NLP Technology: A Case Study for Indian Languages

no code implementations25 May 2022 Simran Khanuja, Sebastian Ruder, Partha Talukdar

In order for NLP technology to be widely applicable, fair, and useful, it needs to serve a diverse set of speakers across the world's languages, be equitable, i.e., not unduly biased towards any particular language, and be inclusive of all users, particularly in low-resource settings where compute constraints are common.

XTREME-S: Evaluating Cross-lingual Speech Representations

no code implementations21 Mar 2022 Alexis Conneau, Ankur Bapna, Yu Zhang, Min Ma, Patrick von Platen, Anton Lozhkov, Colin Cherry, Ye Jia, Clara Rivera, Mihir Kale, Daan van Esch, Vera Axelrod, Simran Khanuja, Jonathan H. Clark, Orhan Firat, Michael Auli, Sebastian Ruder, Jason Riesa, Melvin Johnson

Covering 102 languages from 10+ language families, 3 different domains and 4 task families, XTREME-S aims to simplify multilingual speech representation evaluation, as well as catalyze research in "universal" speech representation learning.

Representation Learning Retrieval +4

Expanding Pretrained Models to Thousands More Languages via Lexicon-based Adaptation

1 code implementation ACL 2022 Xinyi Wang, Sebastian Ruder, Graham Neubig

The performance of multilingual pretrained models is highly dependent on the availability of monolingual or parallel text present in a target language.

NaijaSenti: A Nigerian Twitter Sentiment Corpus for Multilingual Sentiment Analysis

2 code implementations LREC 2022 Shamsuddeen Hassan Muhammad, David Ifeoluwa Adelani, Sebastian Ruder, Ibrahim Said Ahmad, Idris Abdulmumin, Bello Shehu Bello, Monojit Choudhury, Chris Chinenye Emezue, Saheed Salahudeen Abdullahi, Anuoluwapo Aremu, Alipio Jeorge, Pavel Brazdil

We introduce the first large-scale human-annotated Twitter sentiment dataset for the four most widely spoken languages in Nigeria (Hausa, Igbo, Nigerian-Pidgin, and Yorùbá), consisting of around 30,000 annotated tweets per language (and 14,000 for Nigerian-Pidgin), including a significant fraction of code-mixed tweets.

Sentiment Analysis

NL-Augmenter: A Framework for Task-Sensitive Natural Language Augmentation

2 code implementations6 Dec 2021 Kaustubh D. Dhole, Varun Gangal, Sebastian Gehrmann, Aadesh Gupta, Zhenhao Li, Saad Mahamood, Abinaya Mahendiran, Simon Mille, Ashish Shrivastava, Samson Tan, Tongshuang Wu, Jascha Sohl-Dickstein, Jinho D. Choi, Eduard Hovy, Ondrej Dusek, Sebastian Ruder, Sajant Anand, Nagender Aneja, Rabin Banjade, Lisa Barthe, Hanna Behnke, Ian Berlot-Attwell, Connor Boyle, Caroline Brun, Marco Antonio Sobrevilla Cabezudo, Samuel Cahyawijaya, Emile Chapuis, Wanxiang Che, Mukund Choudhary, Christian Clauss, Pierre Colombo, Filip Cornell, Gautier Dagan, Mayukh Das, Tanay Dixit, Thomas Dopierre, Paul-Alexis Dray, Suchitra Dubey, Tatiana Ekeinhor, Marco Di Giovanni, Tanya Goyal, Rishabh Gupta, Louanes Hamla, Sang Han, Fabrice Harel-Canada, Antoine Honore, Ishan Jindal, Przemyslaw K. Joniak, Denis Kleyko, Venelin Kovatchev, Kalpesh Krishna, Ashutosh Kumar, Stefan Langer, Seungjae Ryan Lee, Corey James Levinson, Hualou Liang, Kaizhao Liang, Zhexiong Liu, Andrey Lukyanenko, Vukosi Marivate, Gerard de Melo, Simon Meoni, Maxime Meyer, Afnan Mir, Nafise Sadat Moosavi, Niklas Muennighoff, Timothy Sum Hon Mun, Kenton Murray, Marcin Namysl, Maria Obedkova, Priti Oli, Nivranshu Pasricha, Jan Pfister, Richard Plant, Vinay Prabhu, Vasile Pais, Libo Qin, Shahab Raji, Pawan Kumar Rajpoot, Vikas Raunak, Roy Rinberg, Nicolas Roberts, Juan Diego Rodriguez, Claude Roux, Vasconcellos P. H. S., Ananya B. Sai, Robin M. Schmidt, Thomas Scialom, Tshephisho Sefara, Saqib N. Shamsi, Xudong Shen, Haoyue Shi, Yiwen Shi, Anna Shvets, Nick Siegel, Damien Sileo, Jamie Simon, Chandan Singh, Roman Sitelew, Priyank Soni, Taylor Sorensen, William Soto, Aman Srivastava, KV Aditya Srivatsa, Tony Sun, Mukund Varma T, A Tabassum, Fiona Anting Tan, Ryan Teehan, Mo Tiwari, Marie Tolkiehn, Athena Wang, Zijian Wang, Gloria Wang, Zijie J. Wang, Fuxuan Wei, Bryan Wilie, Genta Indra Winata, Xinyi Wu, Witold Wydmański, Tianbao Xie, Usama Yaseen, Michael A. Yee, Jing Zhang, Yue Zhang

Data augmentation is an important component in the robustness evaluation of models in natural language processing (NLP) and in enhancing the diversity of the data they are trained on.

Data Augmentation
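
NL-Augmenter collects many such transformations and filters behind a common interface. The snippet below is a toy perturbation in that spirit (it is not one of the framework's actual transformations), showing the kind of character-level noise used to probe robustness:

```python
import random

def swap_characters(text, prob=0.05, seed=0):
    """Toy perturbation: randomly swap adjacent characters to simulate typos."""
    rng = random.Random(seed)
    chars = list(text)
    for i in range(len(chars) - 1):
        if rng.random() < prob:
            chars[i], chars[i + 1] = chars[i + 1], chars[i]
    return "".join(chars)

print(swap_characters("Data augmentation improves robustness evaluation."))
```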

ExT5: Towards Extreme Multi-Task Scaling for Transfer Learning

2 code implementations ICLR 2022 Vamsi Aribandi, Yi Tay, Tal Schuster, Jinfeng Rao, Huaixiu Steven Zheng, Sanket Vaibhav Mehta, Honglei Zhuang, Vinh Q. Tran, Dara Bahri, Jianmo Ni, Jai Gupta, Kai Hui, Sebastian Ruder, Donald Metzler

Despite the recent success of multi-task learning and transfer learning for natural language processing (NLP), few works have systematically studied the effect of scaling up the number of tasks during pre-training.

Denoising Multi-Task Learning

Balancing Average and Worst-case Accuracy in Multitask Learning

no code implementations12 Oct 2021 Paul Michel, Sebastian Ruder, Dani Yogatama

When training and evaluating machine learning models on a large number of tasks, it is important to look not only at average task accuracy, which may be biased by easy or redundant tasks, but also at worst-case accuracy (i.e., the performance on the task with the lowest accuracy).

Image Classification Language Modelling
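
A minimal sketch of the reporting the abstract argues for, computing both the average and the worst-case accuracy over a set of tasks (the task names and scores below are made up):

```python
import numpy as np

def summarize_multitask(accuracies):
    """Report average and worst-case task accuracy, since the average alone
    can be inflated by easy or redundant tasks."""
    scores = np.asarray(list(accuracies.values()))
    worst_task = min(accuracies, key=accuracies.get)
    return {"average": scores.mean(), "worst_case": scores.min(), "worst_task": worst_task}

print(summarize_multitask({"qa": 0.81, "ner": 0.92, "parsing": 0.64}))
```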

Compacter: Efficient Low-Rank Hypercomplex Adapter Layers

2 code implementations NeurIPS 2021 Rabeeh Karimi Mahabadi, James Henderson, Sebastian Ruder

In this work, we propose Compacter, a method for fine-tuning large-scale language models with a better trade-off between task performance and the number of trainable parameters than prior work.
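
Compacter builds its adapter weights from sums of Kronecker products of small shared matrices and low-rank factors (parameterized hypercomplex multiplication). The PyTorch layer below is a toy sketch of that idea under assumed dimensions, not the authors' implementation (which also shares the small matrices across layers and places the layer inside a bottleneck adapter):

```python
import torch
import torch.nn as nn

class PHMLinear(nn.Module):
    """Toy parameterized-hypercomplex linear layer: the weight matrix is a sum
    of Kronecker products of small matrices A_i and low-rank factors B_i."""
    def __init__(self, in_dim, out_dim, n=4, rank=1):
        super().__init__()
        assert in_dim % n == 0 and out_dim % n == 0
        self.A = nn.Parameter(torch.randn(n, n, n) * 0.01)              # small shared matrices
        self.s = nn.Parameter(torch.randn(n, in_dim // n, rank) * 0.01)  # low-rank factor 1
        self.t = nn.Parameter(torch.randn(n, rank, out_dim // n) * 0.01) # low-rank factor 2

    def forward(self, x):
        B = torch.matmul(self.s, self.t)  # (n, in_dim/n, out_dim/n) low-rank components
        W = sum(torch.kron(self.A[i], B[i]) for i in range(self.A.size(0)))
        return x @ W
```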

Memorisation versus Generalisation in Pre-trained Language Models

1 code implementation ACL 2022 Michael Tänzer, Sebastian Ruder, Marek Rei

State-of-the-art pre-trained language models have been shown to memorise facts and perform well with limited amounts of training data.

Few-Shot Learning Low Resource Named Entity Recognition +3

MasakhaNER: Named Entity Recognition for African Languages

1 code implementation22 Mar 2021 David Ifeoluwa Adelani, Jade Abbott, Graham Neubig, Daniel D'souza, Julia Kreutzer, Constantine Lignos, Chester Palen-Michel, Happy Buzaaba, Shruti Rijhwani, Sebastian Ruder, Stephen Mayhew, Israel Abebe Azime, Shamsuddeen Muhammad, Chris Chinenye Emezue, Joyce Nakatumba-Nabende, Perez Ogayo, Anuoluwapo Aremu, Catherine Gitau, Derguene Mbaye, Jesujoba Alabi, Seid Muhie Yimam, Tajuddeen Gwadabe, Ignatius Ezeani, Rubungo Andre Niyongabo, Jonathan Mukiibi, Verrah Otiende, Iroro Orife, Davis David, Samba Ngom, Tosin Adewumi, Paul Rayson, Mofetoluwa Adeyemi, Gerald Muriuki, Emmanuel Anebi, Chiamaka Chukwuneke, Nkiruka Odu, Eric Peter Wairagala, Samuel Oyerinde, Clemencia Siro, Tobius Saul Bateesa, Temilola Oloyede, Yvonne Wambui, Victor Akinode, Deborah Nabagereka, Maurice Katusiime, Ayodele Awokoya, Mouhamadane MBOUP, Dibora Gebreyohannes, Henok Tilaye, Kelechi Nwaike, Degaga Wolde, Abdoulaye Faye, Blessing Sibanda, Orevaoghene Ahia, Bonaventure F. P. Dossou, Kelechi Ogueji, Thierno Ibrahima DIOP, Abdoulaye Diallo, Adewale Akinfaderin, Tendai Marengereke, Salomey Osei

We take a step towards addressing the under-representation of the African continent in NLP research by creating the first large publicly available high-quality dataset for named entity recognition (NER) in ten African languages, bringing together a variety of stakeholders.

named-entity-recognition Named Entity Recognition +2

Multi-view Subword Regularization

1 code implementation NAACL 2021 Xinyi Wang, Sebastian Ruder, Graham Neubig

Multilingual pretrained representations generally rely on subword segmentation algorithms to create a shared multilingual vocabulary.

Cross-Lingual Transfer
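
Multi-view subword regularization, roughly, combines the standard deterministic segmentation of an input with probabilistically sampled alternatives. Assuming the SentencePiece Python API and a hypothetical multilingual.model file, producing both views might look like this:

```python
import sentencepiece as spm

# Hypothetical model path; any trained SentencePiece model would do.
sp = spm.SentencePieceProcessor(model_file="multilingual.model")

text = "unbelievable"
best = sp.encode(text, out_type=str)  # deterministic, single-best segmentation
samples = [
    sp.encode(text, out_type=str, enable_sampling=True, alpha=0.1, nbest_size=-1)
    for _ in range(3)
]  # sampled segmentations providing the second "view"
```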

Mind the Gap: Assessing Temporal Generalization in Neural Language Models

1 code implementation NeurIPS 2021 Angeliki Lazaridou, Adhiguna Kuncoro, Elena Gribovskaya, Devang Agrawal, Adam Liska, Tayfun Terzi, Mai Gimenez, Cyprien de Masson d'Autume, Tomas Kocisky, Sebastian Ruder, Dani Yogatama, Kris Cao, Susannah Young, Phil Blunsom

Hence, given the compilation of ever-larger language modelling datasets, combined with the growing list of language-model-based NLP applications that require up-to-date factual knowledge about the world, we argue that now is the right time to rethink the static way in which we currently train and evaluate our language models, and develop adaptive language models that can remain up-to-date with respect to our ever-changing and non-stationary world.

Language Modelling

How Good is Your Tokenizer? On the Monolingual Performance of Multilingual Language Models

1 code implementation ACL 2021 Phillip Rust, Jonas Pfeiffer, Ivan Vulić, Sebastian Ruder, Iryna Gurevych

In this work, we provide a systematic and comprehensive empirical comparison of pretrained multilingual language models versus their monolingual counterparts with regard to their monolingual task performance.

Pretrained Multilingual Language Models

UNKs Everywhere: Adapting Multilingual Language Models to New Scripts

2 code implementations EMNLP 2021 Jonas Pfeiffer, Ivan Vulić, Iryna Gurevych, Sebastian Ruder

The ultimate challenge is dealing with under-resourced languages not covered at all by the models and written in scripts unseen during pretraining.

Cross-Lingual Transfer

Morphologically Aware Word-Level Translation

no code implementations COLING 2020 Paula Czarnowska, Sebastian Ruder, Ryan Cotterell, Ann Copestake

We propose a novel morphologically aware probability model for bilingual lexicon induction, which jointly models lexeme translation and inflectional morphology in a structured way.

Bilingual Lexicon Induction Translation

Long Range Arena: A Benchmark for Efficient Transformers

5 code implementations8 Nov 2020 Yi Tay, Mostafa Dehghani, Samira Abnar, Yikang Shen, Dara Bahri, Philip Pham, Jinfeng Rao, Liu Yang, Sebastian Ruder, Donald Metzler

In recent months, a wide spectrum of efficient, fast Transformers has been proposed to tackle this problem, more often than not claiming superior or comparable model quality to vanilla Transformer models.

Benchmarking Long-range modeling

AdapterHub: A Framework for Adapting Transformers

5 code implementations EMNLP 2020 Jonas Pfeiffer, Andreas Rücklé, Clifton Poth, Aishwarya Kamath, Ivan Vulić, Sebastian Ruder, Kyunghyun Cho, Iryna Gurevych

We propose AdapterHub, a framework that allows dynamic "stitching-in" of pre-trained adapters for different tasks and languages.

XLM-R
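
The adapters that AdapterHub "stitches in" are typically small bottleneck modules placed inside each layer of a frozen pretrained transformer. The PyTorch module below is a conceptual sketch of such an adapter, not the library's API:

```python
import torch.nn as nn

class BottleneckAdapter(nn.Module):
    """Small bottleneck module added inside a frozen transformer layer:
    down-project, non-linearity, up-project, residual connection."""
    def __init__(self, hidden_dim, bottleneck_dim=64):
        super().__init__()
        self.down = nn.Linear(hidden_dim, bottleneck_dim)
        self.up = nn.Linear(bottleneck_dim, hidden_dim)
        self.act = nn.ReLU()

    def forward(self, hidden_states):
        return hidden_states + self.up(self.act(self.down(hidden_states)))
```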

A Call for More Rigor in Unsupervised Cross-lingual Learning

no code implementations ACL 2020 Mikel Artetxe, Sebastian Ruder, Dani Yogatama, Gorka Labaka, Eneko Agirre

We review motivations, definition, approaches, and methodology for unsupervised cross-lingual learning and call for a more rigorous position in each of them.

Cross-Lingual Word Embeddings Translation +2

MAD-X: An Adapter-Based Framework for Multi-Task Cross-Lingual Transfer

3 code implementations EMNLP 2020 Jonas Pfeiffer, Ivan Vulić, Iryna Gurevych, Sebastian Ruder

The main goal behind state-of-the-art pre-trained multilingual models such as multilingual BERT and XLM-R is enabling and bootstrapping NLP applications in low-resource languages through zero-shot or few-shot cross-lingual transfer.

Ranked #4 on Cross-Lingual Transfer on XCOPA (using extra training data)

Cross-Lingual Transfer named-entity-recognition +4

Are All Good Word Vector Spaces Isomorphic?

1 code implementation EMNLP 2020 Ivan Vulić, Sebastian Ruder, Anders Søgaard

Existing algorithms for aligning cross-lingual word vector spaces assume that vector spaces are approximately isomorphic.

XTREME: A Massively Multilingual Multi-task Benchmark for Evaluating Cross-lingual Generalization

3 code implementations24 Mar 2020 Junjie Hu, Sebastian Ruder, Aditya Siddhant, Graham Neubig, Orhan Firat, Melvin Johnson

However, these broad-coverage benchmarks have been mostly limited to English, and despite an increasing interest in multilingual models, a benchmark that enables the comprehensive evaluation of such methods on a diverse range of languages and tasks is still missing.

Cross-Lingual Transfer Retrieval

On the Cross-lingual Transferability of Monolingual Representations

6 code implementations ACL 2020 Mikel Artetxe, Sebastian Ruder, Dani Yogatama

This generalization ability has been attributed to the use of a shared subword vocabulary and joint training across multiple languages giving rise to deep multilingual abstractions.

Cross-Lingual Question Answering Language Modelling +1

What do Deep Networks Like to Read?

no code implementations10 Sep 2019 Jonas Pfeiffer, Aishwarya Kamath, Iryna Gurevych, Sebastian Ruder

Recent research towards understanding neural networks probes models in a top-down manner, but is only able to identify model tendencies that are known a priori.

Unsupervised Cross-Lingual Representation Learning

no code implementations ACL 2019 Sebastian Ruder, Anders Søgaard, Ivan Vulić

In this tutorial, we provide a comprehensive survey of the exciting recent work on cutting-edge weakly-supervised and unsupervised cross-lingual word representations.

Representation Learning Structured Prediction

Episodic Memory in Lifelong Language Learning

2 code implementations NeurIPS 2019 Cyprien de Masson d'Autume, Sebastian Ruder, Lingpeng Kong, Dani Yogatama

We introduce a lifelong language learning setup where a model needs to learn from a stream of text examples without any dataset identifier.

Continual Learning General Classification +3
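
The model in this paper augments a classifier with an episodic memory that is written to during training and sparsely replayed. The class below is a simplified, generic sketch of such a memory (a reservoir-sampled buffer with random replay); the paper's version is a key-value memory that also supports local adaptation at inference:

```python
import random

class EpisodicMemory:
    """Reservoir-style episodic memory: store examples from a stream and
    replay a small random batch at fixed intervals (conceptual sketch)."""
    def __init__(self, capacity=10000):
        self.capacity, self.buffer, self.seen = capacity, [], 0

    def write(self, example):
        self.seen += 1
        if len(self.buffer) < self.capacity:
            self.buffer.append(example)
        else:  # reservoir sampling keeps a uniform sample of the stream
            idx = random.randrange(self.seen)
            if idx < self.capacity:
                self.buffer[idx] = example

    def sample(self, k=32):
        return random.sample(self.buffer, min(k, len(self.buffer)))
```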

Transfer Learning in Natural Language Processing

no code implementations NAACL 2019 Sebastian Ruder, Matthew E. Peters, Swabha Swayamdipta, Thomas Wolf

The classic supervised machine learning paradigm is based on learning in isolation, a single predictive model for a task using a single dataset.

Transfer Learning Word Embeddings

To Tune or Not to Tune? Adapting Pretrained Representations to Diverse Tasks

no code implementations WS 2019 Matthew E. Peters, Sebastian Ruder, Noah A. Smith

While most previous work has focused on different pretraining objectives and architectures for transfer learning, we ask how to best adapt the pretrained model to a given target task.

Transfer Learning

A Hierarchical Multi-task Approach for Learning Embeddings from Semantic Tasks

1 code implementation14 Nov 2018 Victor Sanh, Thomas Wolf, Sebastian Ruder

The model is trained in a hierarchical fashion to introduce an inductive bias by supervising a set of low level tasks at the bottom layers of the model and more complex tasks at the top layers of the model.

Ranked #10 on Relation Extraction on ACE 2005 (using extra training data)

Inductive Bias Multi-Task Learning +4

Off-the-Shelf Unsupervised NMT

no code implementations6 Nov 2018 Chris Hokamp, Sebastian Ruder, John Glover

We frame unsupervised machine translation (MT) in the context of multi-task learning (MTL), combining insights from both directions.

Multi-Task Learning NMT +2

Generalizing Procrustes Analysis for Better Bilingual Dictionary Induction

1 code implementation CONLL 2018 Yova Kementchedjhieva, Sebastian Ruder, Ryan Cotterell, Anders Søgaard

Most recent approaches to bilingual dictionary induction find a linear alignment between the word vector spaces of two languages.
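
The linear alignment the abstract refers to is usually obtained with orthogonal Procrustes analysis over a seed dictionary of aligned word pairs; the paper proposes a generalization of this procedure. A small sketch of the standard pairwise solution via SVD:

```python
import numpy as np

def orthogonal_procrustes(X, Y):
    """Find the orthogonal map W minimising ||XW - Y||_F, where X and Y are
    (num_pairs x dim) matrices of source and target word vectors for a seed dictionary."""
    U, _, Vt = np.linalg.svd(X.T @ Y)
    return U @ Vt

# usage sketch: W = orthogonal_procrustes(src_vecs, tgt_vecs); aligned = src_vecs @ W
```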

360° Stance Detection

no code implementations NAACL 2018 Sebastian Ruder, John Glover, Afshin Mehrabani, Parsa Ghaffari

To ameliorate this, we propose 360° Stance Detection, a tool that aggregates news with multiple perspectives on a topic.

Stance Detection

On the Limitations of Unsupervised Bilingual Dictionary Induction

no code implementations ACL 2018 Anders Søgaard, Sebastian Ruder, Ivan Vulić

Unsupervised machine translation (i.e., not assuming any cross-lingual supervision signal, whether a dictionary, translations, or comparable corpora) seems impossible, but Lample et al. (2018) nevertheless recently proposed a fully unsupervised machine translation (MT) model.

Graph Similarity Translation +1

Strong Baselines for Neural Semi-supervised Learning under Domain Shift

2 code implementations ACL 2018 Sebastian Ruder, Barbara Plank

In this paper, we re-evaluate classic general-purpose bootstrapping approaches in the context of neural networks under domain shifts vs. recent neural approaches and propose a novel multi-task tri-training method that reduces the time and space complexity of classic tri-training.

Domain Adaptation Multi-Task Learning +2
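
Classic tri-training, which the paper re-evaluates and extends with a multi-task variant, pseudo-labels an unlabeled example for one model when the other two models agree on it. A minimal sketch of one such round, assuming three fitted scikit-learn-style classifiers and a NumPy array of unlabeled inputs:

```python
import numpy as np

def tri_training_step(models, X_unlabeled):
    """One round of classic tri-training: for each model, collect the unlabeled
    examples on which the other two models agree, to be added as pseudo-labelled data."""
    preds = [m.predict(X_unlabeled) for m in models]
    new_data = []
    for i in range(3):
        j, k = [idx for idx in range(3) if idx != i]
        agree = preds[j] == preds[k]
        new_data.append((X_unlabeled[agree], preds[j][agree]))
    return new_data  # list of (X_pseudo, y_pseudo) pairs, one per model
```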

360° Stance Detection

no code implementations3 Apr 2018 Sebastian Ruder, John Glover, Afshin Mehrabani, Parsa Ghaffari

To ameliorate this, we propose 360° Stance Detection, a tool that aggregates news with multiple perspectives on a topic.

Stance Detection

Multi-task Learning of Pairwise Sequence Classification Tasks Over Disparate Label Spaces

1 code implementation NAACL 2018 Isabelle Augenstein, Sebastian Ruder, Anders Søgaard

We combine multi-task learning and semi-supervised learning by inducing a joint embedding space between disparate label spaces and learning transfer functions between label embeddings, enabling us to jointly leverage unlabelled data and auxiliary, annotated datasets.

General Classification Multi-Task Learning +1

Universal Language Model Fine-tuning for Text Classification

65 code implementations ACL 2018 Jeremy Howard, Sebastian Ruder

Inductive transfer learning has greatly impacted computer vision, but existing approaches in NLP still require task-specific modifications and training from scratch.

General Classification Language Modelling +3
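
One ingredient of ULMFiT's fine-tuning recipe is the slanted triangular learning rate schedule: a short linear warm-up followed by a long linear decay. A small sketch of that schedule, following the formula given in the paper:

```python
def slanted_triangular_lr(t, T, lr_max=0.01, cut_frac=0.1, ratio=32):
    """Learning rate at step t of T total steps: linear increase for the first
    cut_frac of steps, then linear decay down to lr_max / ratio."""
    cut = int(T * cut_frac)
    if t < cut:
        p = t / cut
    else:
        p = 1 - (t - cut) / (cut * (1 / cut_frac - 1))
    return lr_max * (1 + p * (ratio - 1)) / ratio

# usage sketch: lrs = [slanted_triangular_lr(t, T=1000) for t in range(1000)]
```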

Learning to select data for transfer learning with Bayesian Optimization

1 code implementation EMNLP 2017 Sebastian Ruder, Barbara Plank

Domain similarity measures can be used to gauge adaptability and select suitable data for transfer learning, but existing approaches define ad hoc measures that are deemed suitable for respective tasks.

Part-Of-Speech Tagging Sentiment Analysis +1
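
Rather than fixing one ad hoc measure, the paper learns with Bayesian optimization how to weight and combine domain similarity measures for selecting transfer data. One commonly used measure of this kind is the Jensen-Shannon divergence between the term distribution of a candidate example (or domain) and that of the target domain; a small sketch:

```python
import numpy as np

def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two term distributions."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    p, q = p / p.sum(), q / q.sum()
    m = 0.5 * (p + q)

    def kl(a, b):
        mask = a > 0
        return np.sum(a[mask] * np.log2(a[mask] / b[mask]))

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)
```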

A Survey Of Cross-lingual Word Embedding Models

no code implementations15 Jun 2017 Sebastian Ruder, Ivan Vulić, Anders Søgaard

Cross-lingual representations of words enable us to reason about word meaning in multilingual contexts and are a key facilitator of cross-lingual transfer when developing natural language processing models for low-resource languages.

Cross-Lingual Transfer Cross-Lingual Word Embeddings +1

An Overview of Multi-Task Learning in Deep Neural Networks

4 code implementations15 Jun 2017 Sebastian Ruder

Multi-task learning (MTL) has led to successes in many applications of machine learning, from natural language processing and speech recognition to computer vision and drug discovery.

BIG-bench Machine Learning Drug Discovery +3
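
The most basic setting the overview covers is hard parameter sharing: a single encoder shared across tasks with one output head per task. A minimal PyTorch sketch of that architecture (layer sizes are illustrative):

```python
import torch.nn as nn

class HardSharingMTL(nn.Module):
    """Hard parameter sharing: one shared encoder, one output head per task."""
    def __init__(self, in_dim, hidden, task_out_dims):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU(),
                                     nn.Linear(hidden, hidden), nn.ReLU())
        self.heads = nn.ModuleDict({t: nn.Linear(hidden, d) for t, d in task_out_dims.items()})

    def forward(self, x, task):
        return self.heads[task](self.encoder(x))
```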

Latent Multi-task Architecture Learning

2 code implementations23 May 2017 Sebastian Ruder, Joachim Bingel, Isabelle Augenstein, Anders Søgaard

In practice, however, MTL involves searching an enormous space of possible parameter sharing architectures to find (a) the layers or subspaces that benefit from sharing, (b) the appropriate amount of sharing, and (c) the appropriate relative weights of the different task losses.

Multi-Task Learning

Data Selection Strategies for Multi-Domain Sentiment Analysis

1 code implementation8 Feb 2017 Sebastian Ruder, Parsa Ghaffari, John G. Breslin

However, the selection of appropriate training data is as important as the choice of algorithm.

Domain Adaptation Sentiment Analysis

Knowledge Adaptation: Teaching to Adapt

no code implementations7 Feb 2017 Sebastian Ruder, Parsa Ghaffari, John G. Breslin

Domain adaptation is crucial in many real-world applications where the distribution of the training data differs from the distribution of the test data.

Knowledge Distillation Sentiment Analysis +2

Character-level and Multi-channel Convolutional Neural Networks for Large-scale Authorship Attribution

3 code implementations21 Sep 2016 Sebastian Ruder, Parsa Ghaffari, John G. Breslin

Convolutional neural networks (CNNs) have demonstrated superior capability for extracting information from raw signals in computer vision.

Sentence Classification

An overview of gradient descent optimization algorithms

21 code implementations15 Sep 2016 Sebastian Ruder

Gradient descent optimization algorithms, while increasingly popular, are often used as black-box optimizers, as practical explanations of their strengths and weaknesses are hard to come by.
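
Two of the update rules the overview covers, written out as plain NumPy steps (a simplified sketch; learning rates and other hyperparameters are illustrative):

```python
import numpy as np

def sgd_momentum(w, grad, v, lr=0.01, gamma=0.9):
    """One SGD-with-momentum step: v <- gamma*v + lr*grad; w <- w - v."""
    v = gamma * v + lr * grad
    return w - v, v

def adam(w, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam step (t is the 1-indexed step count), with bias-corrected
    first and second moment estimates."""
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad ** 2
    m_hat = m / (1 - beta1 ** t)
    v_hat = v / (1 - beta2 ** t)
    return w - lr * m_hat / (np.sqrt(v_hat) + eps), m, v
```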
