2 code implementations • ICML 2020 • Junjie Hu, Sebastian Ruder, Aditya Siddhant, Graham Neubig, Orhan Firat, Melvin Johnson
However, these broad-coverage benchmarks have been mostly limited to English, and despite an increasing interest in multilingual models, a benchmark that enables the comprehensive evaluation of such methods on a diverse range of languages and tasks is still missing.
Ranked #1 on Zero-Shot Cross-Lingual Transfer on XTREME (AVG metric)
1 code implementation • LREC 2022 • Daan van Esch, Tamar Lucassen, Sebastian Ruder, Isaac Caswell, Clara Rivera
We describe an open-source dataset providing metadata for about 2,800 language varieties used in the world today.
1 code implementation • EMNLP (ACL) 2021 • Sebastian Ruder, Avi Sil
Question answering (QA) is one of the most challenging and impactful tasks in natural language processing.
no code implementations • Findings (EMNLP) 2021 • Alan Ansell, Edoardo Maria Ponti, Jonas Pfeiffer, Sebastian Ruder, Goran Glavaš, Ivan Vulić, Anna Korhonen
While offering (1) improved fine-tuning efficiency (by a factor of around 50 in our experiments), (2) a smaller parameter budget, and (3) increased language coverage, MAD-G remains competitive with more expensive methods for language-specific adapter training across the board.
no code implementations • 22 Feb 2023 • Jonas Pfeiffer, Sebastian Ruder, Ivan Vulić, Edoardo Maria Ponti
Modular deep learning has emerged as a promising solution to these challenges.
2 code implementations • 17 Feb 2023 • Shamsuddeen Hassan Muhammad, Idris Abdulmumin, Abinew Ali Ayele, Nedjma Ousidhoum, David Ifeoluwa Adelani, Seid Muhie Yimam, Ibrahim Sa'id Ahmad, Meriem Beloucif, Saif Mohammad, Sebastian Ruder, Oumaima Hourrane, Pavel Brazdil, Felermino Dário Mário António Ali, Davis Davis, Salomey Osei, Bello Shehu Bello, Falalu Ibrahim, Tajuddeen Gwadabe, Samuel Rutunda, Tadesse Belay, Wendimu Baye Messelle, Hailu Beshada Balcha, Sisay Adugna Chala, Hagos Tesfahun Gebremichael, Bernard Opoku, Steven Arthur
Yet, there is little NLP research conducted on African languages.
1 code implementation • 19 Dec 2022 • Samuel Cahyawijaya, Holy Lovenia, Alham Fikri Aji, Genta Indra Winata, Bryan Wilie, Rahmad Mahendra, Christian Wibisono, Ade Romadhony, Karissa Vincentio, Fajri Koto, JENNIFER SANTOSO, David Moeljadi, Cahya Wirawan, Frederikus Hudi, Ivan Halim Parmonangan, Ika Alfina, Muhammad Satrio Wicaksono, Ilham Firdausi Putra, Samsul Rahmadani, Yulianti Oenang, Ali Akbar Septiandri, James Jaya, Kaustubh D. Dhole, Arie Ardiyanti Suryani, Rifki Afina Putri, Dan Su, Keith Stevens, Made Nindyatama Nityasya, Muhammad Farid Adilazuarda, Ryan Ignatius, Ryandito Diandaru, Tiezheng Yu, Vito Ghifari, Wenliang Dai, Yan Xu, Dyah Damapuspita, Cuk Tho, Ichwanul Muslim Karo Karo, Tirana Noor Fatyanosa, Ziwei Ji, Pascale Fung, Graham Neubig, Timothy Baldwin, Sebastian Ruder, Herry Sujaini, Sakriani Sakti, Ayu Purwarianti
We present NusaCrowd, a collaborative initiative to collect and unite existing resources for Indonesian languages, including opening access to previously non-public resources.
Automatic Speech Recognition (ASR)
no code implementations • 15 Nov 2022 • Priyanka Agrawal, Chris Alberti, Fantine Huot, Joshua Maynez, Ji Ma, Sebastian Ruder, Kuzman Ganchev, Dipanjan Das, Mirella Lapata
The availability of large, high-quality datasets has been one of the main drivers of recent progress in question answering (QA).
1 code implementation • 31 Oct 2022 • Sebastian Gehrmann, Sebastian Ruder, Vitaly Nikolaev, Jan A. Botha, Michael Chavinda, Ankur Parikh, Clara Rivera
To address this lack of data, we create Table-to-Text in African languages (TaTa), the first large multilingual table-to-text dataset with a focus on African languages.
no code implementations • 22 Oct 2022 • David Ifeoluwa Adelani, Graham Neubig, Sebastian Ruder, Shruti Rijhwani, Michael Beukman, Chester Palen-Michel, Constantine Lignos, Jesujoba O. Alabi, Shamsuddeen H. Muhammad, Peter Nabende, Cheikh M. Bamba Dione, Andiswa Bukula, Rooweither Mabuya, Bonaventure F. P. Dossou, Blessing Sibanda, Happy Buzaaba, Jonathan Mukiibi, Godson Kalipe, Derguene Mbaye, Amelia Taylor, Fatoumata Kabore, Chris Chinenye Emezue, Anuoluwapo Aremu, Perez Ogayo, Catherine Gitau, Edwin Munkoh-Buabeng, Victoire M. Koagne, Allahsera Auguste Tapo, Tebogo Macucwa, Vukosi Marivate, Elvis Mboning, Tajuddeen Gwadabe, Tosin Adewumi, Orevaoghene Ahia, Joyce Nakatumba-Nabende, Neo L. Mokono, Ignatius Ezeani, Chiamaka Chukwuneke, Mofetoluwa Adeyemi, Gilles Q. Hacheme, Idris Abdulmumin, Odunayo Ogundepo, Oreen Yousuf, Tatiana Moteu Ngoli, Dietrich Klakow
African languages are spoken by over a billion people, but are underrepresented in NLP research and development.
1 code implementation • 6 Oct 2022 • Freda Shi, Mirac Suzgun, Markus Freitag, Xuezhi Wang, Suraj Srivats, Soroush Vosoughi, Hyung Won Chung, Yi Tay, Sebastian Ruder, Denny Zhou, Dipanjan Das, Jason Wei
Finally, we show that the multilingual reasoning abilities of language models extend to other tasks such as commonsense reasoning and word-in-context semantic judgment.
1 code implementation • Findings (ACL) 2022 • Sebastian Ruder, Ivan Vulić, Anders Søgaard
Most work targeting multilinguality, for example, considers only accuracy; most work on fairness or interpretability considers only English; and so on.
2 code implementations • 31 May 2022 • Genta Indra Winata, Alham Fikri Aji, Samuel Cahyawijaya, Rahmad Mahendra, Fajri Koto, Ade Romadhony, Kemal Kurniawan, David Moeljadi, Radityo Eko Prasojo, Pascale Fung, Timothy Baldwin, Jey Han Lau, Rico Sennrich, Sebastian Ruder
In this work, we focus on developing resources for languages in Indonesia.
no code implementations • 25 May 2022 • Simran Khanuja, Sebastian Ruder, Partha Talukdar
In order for NLP technology to be widely applicable, fair, and useful, it needs to serve a diverse set of speakers across the world's languages, be equitable, i.e., not unduly biased towards any particular language, and be inclusive of all users, particularly in low-resource settings where compute constraints are common.
1 code implementation • 24 May 2022 • Ahmet Üstün, Arianna Bisazza, Gosse Bouma, Gertjan van Noord, Sebastian Ruder
Massively multilingual models are promising for transfer learning across tasks and languages.
no code implementations • ACL 2022 • Alham Fikri Aji, Genta Indra Winata, Fajri Koto, Samuel Cahyawijaya, Ade Romadhony, Rahmad Mahendra, Kemal Kurniawan, David Moeljadi, Radityo Eko Prasojo, Timothy Baldwin, Jey Han Lau, Sebastian Ruder
NLP research is impeded by a lack of resources and awareness of the challenges presented by underrepresented languages and dialects.
no code implementations • 21 Mar 2022 • Alexis Conneau, Ankur Bapna, Yu Zhang, Min Ma, Patrick von Platen, Anton Lozhkov, Colin Cherry, Ye Jia, Clara Rivera, Mihir Kale, Daan van Esch, Vera Axelrod, Simran Khanuja, Jonathan H. Clark, Orhan Firat, Michael Auli, Sebastian Ruder, Jason Riesa, Melvin Johnson
Covering 102 languages from 10+ language families, 3 different domains and 4 task families, XTREME-S aims to simplify multilingual speech representation evaluation, as well as catalyze research in "universal" speech representation learning.
1 code implementation • ACL 2022 • Xinyi Wang, Sebastian Ruder, Graham Neubig
The performance of multilingual pretrained models is highly dependent on the availability of monolingual or parallel text present in a target language.
2 code implementations • LREC 2022 • Shamsuddeen Hassan Muhammad, David Ifeoluwa Adelani, Sebastian Ruder, Ibrahim Said Ahmad, Idris Abdulmumin, Bello Shehu Bello, Monojit Choudhury, Chris Chinenye Emezue, Saheed Salahudeen Abdullahi, Anuoluwapo Aremu, Alipio Jeorge, Pavel Brazdil
We introduce the first large-scale human-annotated Twitter sentiment dataset for the four most widely spoken languages in Nigeria (Hausa, Igbo, Nigerian-Pidgin, and Yorùbá), consisting of around 30,000 annotated tweets per language (and 14,000 for Nigerian-Pidgin), including a significant fraction of code-mixed tweets.
2 code implementations • 6 Dec 2021 • Kaustubh D. Dhole, Varun Gangal, Sebastian Gehrmann, Aadesh Gupta, Zhenhao Li, Saad Mahamood, Abinaya Mahendiran, Simon Mille, Ashish Shrivastava, Samson Tan, Tongshuang Wu, Jascha Sohl-Dickstein, Jinho D. Choi, Eduard Hovy, Ondrej Dusek, Sebastian Ruder, Sajant Anand, Nagender Aneja, Rabin Banjade, Lisa Barthe, Hanna Behnke, Ian Berlot-Attwell, Connor Boyle, Caroline Brun, Marco Antonio Sobrevilla Cabezudo, Samuel Cahyawijaya, Emile Chapuis, Wanxiang Che, Mukund Choudhary, Christian Clauss, Pierre Colombo, Filip Cornell, Gautier Dagan, Mayukh Das, Tanay Dixit, Thomas Dopierre, Paul-Alexis Dray, Suchitra Dubey, Tatiana Ekeinhor, Marco Di Giovanni, Tanya Goyal, Rishabh Gupta, Louanes Hamla, Sang Han, Fabrice Harel-Canada, Antoine Honore, Ishan Jindal, Przemyslaw K. Joniak, Denis Kleyko, Venelin Kovatchev, Kalpesh Krishna, Ashutosh Kumar, Stefan Langer, Seungjae Ryan Lee, Corey James Levinson, Hualou Liang, Kaizhao Liang, Zhexiong Liu, Andrey Lukyanenko, Vukosi Marivate, Gerard de Melo, Simon Meoni, Maxime Meyer, Afnan Mir, Nafise Sadat Moosavi, Niklas Muennighoff, Timothy Sum Hon Mun, Kenton Murray, Marcin Namysl, Maria Obedkova, Priti Oli, Nivranshu Pasricha, Jan Pfister, Richard Plant, Vinay Prabhu, Vasile Pais, Libo Qin, Shahab Raji, Pawan Kumar Rajpoot, Vikas Raunak, Roy Rinberg, Nicolas Roberts, Juan Diego Rodriguez, Claude Roux, Vasconcellos P. H. S., Ananya B. Sai, Robin M. Schmidt, Thomas Scialom, Tshephisho Sefara, Saqib N. Shamsi, Xudong Shen, Haoyue Shi, Yiwen Shi, Anna Shvets, Nick Siegel, Damien Sileo, Jamie Simon, Chandan Singh, Roman Sitelew, Priyank Soni, Taylor Sorensen, William Soto, Aman Srivastava, KV Aditya Srivatsa, Tony Sun, Mukund Varma T, A Tabassum, Fiona Anting Tan, Ryan Teehan, Mo Tiwari, Marie Tolkiehn, Athena Wang, Zijian Wang, Gloria Wang, Zijie J. Wang, Fuxuan Wei, Bryan Wilie, Genta Indra Winata, Xinyi Wu, Witold Wydmański, Tianbao Xie, Usama Yaseen, Michael A. Yee, Jing Zhang, Yue Zhang
Data augmentation is an important component in the robustness evaluation of models in natural language processing (NLP) and in enhancing the diversity of the data they are trained on.
2 code implementations • ICLR 2022 • Vamsi Aribandi, Yi Tay, Tal Schuster, Jinfeng Rao, Huaixiu Steven Zheng, Sanket Vaibhav Mehta, Honglei Zhuang, Vinh Q. Tran, Dara Bahri, Jianmo Ni, Jai Gupta, Kai Hui, Sebastian Ruder, Donald Metzler
Despite the recent success of multi-task learning and transfer learning for natural language processing (NLP), few works have systematically studied the effect of scaling up the number of tasks during pre-training.
no code implementations • 12 Oct 2021 • Paul Michel, Sebastian Ruder, Dani Yogatama
When training and evaluating machine learning models on a large number of tasks, it is important to not only look at average task accuracy -- which may be biased by easy or redundant tasks -- but also worst-case accuracy (i.e., the performance on the task with the lowest accuracy).
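A minimal sketch of the two aggregates contrasted above, average versus worst-case task accuracy (the task names and scores below are invented for illustration):

```python
# Average vs. worst-case accuracy over a set of tasks (scores are made up).
task_accuracies = {"xnli": 0.78, "ner": 0.62, "qa": 0.55, "pos": 0.91}

average_accuracy = sum(task_accuracies.values()) / len(task_accuracies)
worst_case_accuracy = min(task_accuracies.values())  # accuracy on the weakest task

print(f"average: {average_accuracy:.3f}, worst-case: {worst_case_accuracy:.3f}")
```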
1 code implementation • ACL 2022 • Yanan Zheng, Jing Zhou, Yujie Qian, Ming Ding, Chonghua Liao, Jian Li, Ruslan Salakhutdinov, Jie Tang, Sebastian Ruder, Zhilin Yang
The few-shot natural language understanding (NLU) task has attracted much recent attention.
1 code implementation • Findings (EMNLP) 2021 • Xinyi Wang, Yulia Tsvetkov, Sebastian Ruder, Graham Neubig
Adapters are light-weight modules that allow parameter-efficient fine-tuning of pretrained models.
2 code implementations • ICLR 2022 • Yi Tay, Vinh Q. Tran, Sebastian Ruder, Jai Gupta, Hyung Won Chung, Dara Bahri, Zhen Qin, Simon Baumgartner, Cong Yu, Donald Metzler
In this paper, we propose a new model inductive bias that learns a subword tokenization end-to-end as part of the model.
Ranked #3 on Paraphrase Identification on Quora Question Pairs
2 code implementations • NeurIPS 2021 • Rabeeh Karimi Mahabadi, James Henderson, Sebastian Ruder
In this work, we propose Compacter, a method for fine-tuning large-scale language models with a better trade-off between task performance and the number of trainable parameters than prior work.
1 code implementation • ACL 2021 • Rabeeh Karimi Mahabadi, Sebastian Ruder, Mostafa Dehghani, James Henderson
State-of-the-art parameter-efficient fine-tuning methods rely on introducing adapter modules between the layers of a pretrained language model.
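For context, a generic bottleneck adapter in the Houlsby et al. (2019) style, inserted after a Transformer sub-layer, looks roughly like the sketch below. This illustrates the adapter-based setup the entry refers to, not the specific method proposed in the paper; dimensions are placeholders.

```python
import torch
import torch.nn as nn

class BottleneckAdapter(nn.Module):
    """Generic bottleneck adapter: down-project, non-linearity, up-project, residual."""

    def __init__(self, hidden_size: int, bottleneck_size: int = 64):
        super().__init__()
        self.down = nn.Linear(hidden_size, bottleneck_size)
        self.up = nn.Linear(bottleneck_size, hidden_size)
        self.activation = nn.GELU()

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        # Only the small down/up projections are trained; the backbone stays frozen.
        return hidden_states + self.up(self.activation(self.down(hidden_states)))

adapter = BottleneckAdapter(hidden_size=768)
out = adapter(torch.randn(2, 10, 768))  # e.g. applied after a Transformer feed-forward block
```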
1 code implementation • ACL 2022 • Michael Tänzer, Sebastian Ruder, Marek Rei
State-of-the-art pre-trained language models have been shown to memorise facts and perform well with limited amounts of training data.
2 code implementations • EMNLP 2021 • Samuel Cahyawijaya, Genta Indra Winata, Bryan Wilie, Karissa Vincentio, Xiaohong Li, Adhiguna Kuncoro, Sebastian Ruder, Zhi Yuan Lim, Syafri Bahar, Masayu Leylia Khodra, Ayu Purwarianti, Pascale Fung
Natural language generation (NLG) benchmarks provide an important avenue to measure progress and develop better NLG systems.
1 code implementation • EMNLP 2021 • Sebastian Ruder, Noah Constant, Jan Botha, Aditya Siddhant, Orhan Firat, Jinlan Fu, PengFei Liu, Junjie Hu, Dan Garrette, Graham Neubig, Melvin Johnson
While a sizeable gap to human-level performance remains, improvements have been easier to achieve in some tasks than in others.
1 code implementation • 22 Mar 2021 • David Ifeoluwa Adelani, Jade Abbott, Graham Neubig, Daniel D'souza, Julia Kreutzer, Constantine Lignos, Chester Palen-Michel, Happy Buzaaba, Shruti Rijhwani, Sebastian Ruder, Stephen Mayhew, Israel Abebe Azime, Shamsuddeen Muhammad, Chris Chinenye Emezue, Joyce Nakatumba-Nabende, Perez Ogayo, Anuoluwapo Aremu, Catherine Gitau, Derguene Mbaye, Jesujoba Alabi, Seid Muhie Yimam, Tajuddeen Gwadabe, Ignatius Ezeani, Rubungo Andre Niyongabo, Jonathan Mukiibi, Verrah Otiende, Iroro Orife, Davis David, Samba Ngom, Tosin Adewumi, Paul Rayson, Mofetoluwa Adeyemi, Gerald Muriuki, Emmanuel Anebi, Chiamaka Chukwuneke, Nkiruka Odu, Eric Peter Wairagala, Samuel Oyerinde, Clemencia Siro, Tobius Saul Bateesa, Temilola Oloyede, Yvonne Wambui, Victor Akinode, Deborah Nabagereka, Maurice Katusiime, Ayodele Awokoya, Mouhamadane MBOUP, Dibora Gebreyohannes, Henok Tilaye, Kelechi Nwaike, Degaga Wolde, Abdoulaye Faye, Blessing Sibanda, Orevaoghene Ahia, Bonaventure F. P. Dossou, Kelechi Ogueji, Thierno Ibrahima DIOP, Abdoulaye Diallo, Adewale Akinfaderin, Tendai Marengereke, Salomey Osei
We take a step towards addressing the under-representation of the African continent in NLP research by creating the first large publicly available high-quality dataset for named entity recognition (NER) in ten African languages, bringing together a variety of stakeholders.
1 code implementation • NAACL 2021 • Xinyi Wang, Sebastian Ruder, Graham Neubig
Multilingual pretrained representations generally rely on subword segmentation algorithms to create a shared multilingual vocabulary.
1 code implementation • NeurIPS 2021 • Angeliki Lazaridou, Adhiguna Kuncoro, Elena Gribovskaya, Devang Agrawal, Adam Liska, Tayfun Terzi, Mai Gimenez, Cyprien de Masson d'Autume, Tomas Kocisky, Sebastian Ruder, Dani Yogatama, Kris Cao, Susannah Young, Phil Blunsom
Hence, given the compilation of ever-larger language modelling datasets, combined with the growing list of language-model-based NLP applications that require up-to-date factual knowledge about the world, we argue that now is the right time to rethink the static way in which we currently train and evaluate our language models, and develop adaptive language models that can remain up-to-date with respect to our ever-changing and non-stationary world.
no code implementations • ICLR 2021 • Yi Tay, Mostafa Dehghani, Samira Abnar, Yikang Shen, Dara Bahri, Philip Pham, Jinfeng Rao, Liu Yang, Sebastian Ruder, Donald Metzler
Transformers do not scale very well to long sequence lengths largely because of quadratic self-attention complexity.
1 code implementation • ACL 2021 • Phillip Rust, Jonas Pfeiffer, Ivan Vulić, Sebastian Ruder, Iryna Gurevych
In this work, we provide a systematic and comprehensive empirical comparison of pretrained multilingual language models versus their monolingual counterparts with regard to their monolingual task performance.
2 code implementations • EMNLP 2021 • Jonas Pfeiffer, Ivan Vulić, Iryna Gurevych, Sebastian Ruder
The ultimate challenge is dealing with under-resourced languages not covered at all by the models and written in scripts unseen during pretraining.
no code implementations • COLING 2020 • Paula Czarnowska, Sebastian Ruder, Ryan Cotterell, Ann Copestake
We propose a novel morphologically aware probability model for bilingual lexicon induction, which jointly models lexeme translation and inflectional morphology in a structured way.
5 code implementations • 8 Nov 2020 • Yi Tay, Mostafa Dehghani, Samira Abnar, Yikang Shen, Dara Bahri, Philip Pham, Jinfeng Rao, Liu Yang, Sebastian Ruder, Donald Metzler
In recent months, a wide spectrum of efficient, fast Transformers has been proposed to tackle this problem, more often than not claiming superior or comparable model quality to vanilla Transformer models.
Ranked #14 on Long-range modeling on LRA
2 code implementations • ICLR 2021 • Hyung Won Chung, Thibault Févry, Henry Tsai, Melvin Johnson, Sebastian Ruder
We re-evaluate the standard practice of sharing weights between input and output embeddings in state-of-the-art pre-trained language models.
Ranked #1 on Cross-Lingual NER on NER
Cross-Lingual Natural Language Inference
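The input-output embedding sharing re-evaluated in the entry above amounts to tying two parameter matrices; a minimal PyTorch sketch of the tied (baseline) setting, with placeholder sizes:

```python
import torch.nn as nn

vocab_size, hidden_size = 30000, 512  # placeholder sizes

embedding = nn.Embedding(vocab_size, hidden_size)                   # input embeddings
output_projection = nn.Linear(hidden_size, vocab_size, bias=False)  # output (softmax) layer

# Tied setting: the input and output layers share a single weight matrix.
output_projection.weight = embedding.weight

# The decoupled alternative studied in the paper simply keeps two separate matrices.
```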
5 code implementations • EMNLP 2020 • Jonas Pfeiffer, Andreas Rücklé, Clifton Poth, Aishwarya Kamath, Ivan Vulić, Sebastian Ruder, Kyunghyun Cho, Iryna Gurevych
We propose AdapterHub, a framework that allows dynamic "stitching-in" of pre-trained adapters for different tasks and languages.
no code implementations • ACL 2020 • Mikel Artetxe, Sebastian Ruder, Dani Yogatama, Gorka Labaka, Eneko Agirre
We review motivations, definition, approaches, and methodology for unsupervised cross-lingual learning and call for a more rigorous position in each of them.
3 code implementations • EMNLP 2020 • Jonas Pfeiffer, Ivan Vulić, Iryna Gurevych, Sebastian Ruder
The main goal behind state-of-the-art pre-trained multilingual models such as multilingual BERT and XLM-R is enabling and bootstrapping NLP applications in low-resource languages through zero-shot or few-shot cross-lingual transfer.
Ranked #4 on Cross-Lingual Transfer on XCOPA (using extra training data)
1 code implementation • EMNLP 2020 • Marcin Kardas, Piotr Czapla, Pontus Stenetorp, Sebastian Ruder, Sebastian Riedel, Ross Taylor, Robert Stojnic
Tracking progress in machine learning has become increasingly difficult with the recent explosion in the number of papers.
1 code implementation • EMNLP 2020 • Ivan Vulić, Sebastian Ruder, Anders Søgaard
Existing algorithms for aligning cross-lingual word vector spaces assume that vector spaces are approximately isomorphic.
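Such alignment methods typically solve an orthogonal Procrustes problem over a seed dictionary, which behaves well only when the two spaces are near-isomorphic; a small NumPy sketch with synthetic, exactly isomorphic embeddings (all data here is made up):

```python
import numpy as np

def procrustes_align(X_src: np.ndarray, Y_tgt: np.ndarray) -> np.ndarray:
    """Orthogonal map W minimising ||X W - Y|| over aligned word pairs (rows)."""
    U, _, Vt = np.linalg.svd(X_src.T @ Y_tgt)
    return U @ Vt

# Toy example: 100 aligned word pairs with 50-dimensional embeddings.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 50))
Y = X @ np.linalg.qr(rng.normal(size=(50, 50)))[0]  # an isomorphic (rotated) copy of X
W = procrustes_align(X, Y)
print(np.allclose(X @ W, Y))  # True only because the isomorphism assumption holds exactly
```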
3 code implementations • 24 Mar 2020 • Junjie Hu, Sebastian Ruder, Aditya Siddhant, Graham Neubig, Orhan Firat, Melvin Johnson
However, these broad-coverage benchmarks have been mostly limited to English, and despite an increasing interest in multilingual models, a benchmark that enables the comprehensive evaluation of such methods on a diverse range of languages and tasks is still missing.
6 code implementations • ACL 2020 • Mikel Artetxe, Sebastian Ruder, Dani Yogatama
This generalization ability has been attributed to the use of a shared subword vocabulary and joint training across multiple languages giving rise to deep multilingual abstractions.
no code implementations • 10 Sep 2019 • Jonas Pfeiffer, Aishwarya Kamath, Iryna Gurevych, Sebastian Ruder
Recent research towards understanding neural networks probes models in a top-down manner, but is only able to identify model tendencies that are known a priori.
4 code implementations • IJCNLP 2019 • Julian Martin Eisenschlos, Sebastian Ruder, Piotr Czapla, Marcin Kardas, Sylvain Gugger, Jeremy Howard
Pretrained language models are promising particularly for low-resource languages as they only require unlabelled data.
Ranked #1 on Zero-shot Cross-Lingual Document Classification on Cross-Lingual Sentiment (CLS) - English to German - DVD
Cross-Lingual Document Classification
Document Classification
no code implementations • IJCNLP 2019 • Paula Czarnowska, Sebastian Ruder, Edouard Grave, Ryan Cotterell, Ann Copestake
Human translators routinely have to translate rare inflections of words due to the Zipfian distribution of words in a language.
no code implementations • ACL 2019 • Sebastian Ruder, Anders Søgaard, Ivan Vulić
In this tutorial, we provide a comprehensive survey of the exciting recent work on cutting-edge weakly-supervised and unsupervised cross-lingual word representations.
2 code implementations • NeurIPS 2019 • Cyprien de Masson d'Autume, Sebastian Ruder, Lingpeng Kong, Dani Yogatama
We introduce a lifelong language learning setup where a model needs to learn from a stream of text examples without any dataset identifier.
no code implementations • NAACL 2019 • Sebastian Ruder, Matthew E. Peters, Swabha Swayamdipta, Thomas Wolf
The classic supervised machine learning paradigm is based on learning in isolation, a single predictive model for a task using a single dataset.
no code implementations • WS 2019 • Matthew E. Peters, Sebastian Ruder, Noah A. Smith
While most previous work has focused on different pretraining objectives and architectures for transfer learning, we ask how to best adapt the pretrained model to a given target task.
1 code implementation • ACL 2019 • Goran Glavaš, Robert Litschko, Sebastian Ruder, Ivan Vulić
In this work, we make the first step towards a comprehensive evaluation of cross-lingual word embeddings.
1 code implementation • 14 Nov 2018 • Victor Sanh, Thomas Wolf, Sebastian Ruder
The model is trained in a hierarchical fashion to introduce an inductive bias, supervising a set of low-level tasks at the bottom layers of the model and more complex tasks at its top layers.
Ranked #10 on Relation Extraction on ACE 2005 (using extra training data)
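The hierarchical supervision described in the entry above can be sketched as a shared encoder with a low-level head (e.g. NER) attached to a lower layer and a more complex head (e.g. relation extraction) on top; the layer choices and sizes below are illustrative, not the paper's exact architecture:

```python
import torch
import torch.nn as nn

class HierarchicalMultiTaskModel(nn.Module):
    def __init__(self, vocab_size=10000, hidden=128, n_ner_tags=9, n_relations=6):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden)
        self.lower = nn.LSTM(hidden, hidden, batch_first=True)  # shared lower layer
        self.upper = nn.LSTM(hidden, hidden, batch_first=True)  # shared upper layer
        self.ner_head = nn.Linear(hidden, n_ner_tags)        # low-level task, supervised low
        self.relation_head = nn.Linear(hidden, n_relations)  # complex task, supervised on top

    def forward(self, token_ids):
        h_low, _ = self.lower(self.embed(token_ids))
        h_high, _ = self.upper(h_low)
        # Per-token scores only; a real relation head would score entity pairs.
        return self.ner_head(h_low), self.relation_head(h_high)

model = HierarchicalMultiTaskModel()
ner_logits, rel_logits = model(torch.randint(0, 10000, (4, 20)))
```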
no code implementations • 6 Nov 2018 • Chris Hokamp, Sebastian Ruder, John Glover
We frame unsupervised machine translation (MT) in the context of multi-task learning (MTL), combining insights from both directions.
1 code implementation • CONLL 2018 • Yova Kementchedjhieva, Sebastian Ruder, Ryan Cotterell, Anders Søgaard
Most recent approaches to bilingual dictionary induction find a linear alignment between the word vector spaces of two languages.
1 code implementation • EMNLP 2018 • Sebastian Ruder, Ryan Cotterell, Yova Kementchedjhieva, Anders Søgaard
We introduce a novel discriminative latent-variable model for the task of bilingual lexicon induction.
no code implementations • NAACL 2018 • Sebastian Ruder, John Glover, Afshin Mehrabani, Parsa Ghaffari
To ameliorate this, we propose 360° Stance Detection, a tool that aggregates news with multiple perspectives on a topic.
no code implementations • ACL 2018 • Anders Søgaard, Sebastian Ruder, Ivan Vulić
Unsupervised machine translation, i.e., not assuming any cross-lingual supervision signal, whether a dictionary, translations, or comparable corpora, seems impossible; nevertheless, Lample et al. (2018) recently proposed a fully unsupervised machine translation (MT) model.
2 code implementations • ACL 2018 • Sebastian Ruder, Barbara Plank
In this paper, we re-evaluate classic general-purpose bootstrapping approaches for neural networks under domain shift against recent neural approaches, and propose a novel multi-task tri-training method that reduces the time and space complexity of classic tri-training.
Ranked #3 on Sentiment Analysis on Multi-Domain Sentiment Dataset
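Classic tri-training, the general-purpose bootstrapping baseline re-evaluated above, pseudo-labels an unlabelled example for one model whenever the other two models agree on it. The sketch below is a simplified round with scikit-learn classifiers; it omits the bootstrap initialisation and error-rate checks of the full algorithm, and all data is synthetic:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def tri_training_round(models, X_labeled, y_labeled, X_unlabeled):
    """One simplified round: retrain each model on the labelled data plus the
    unlabelled examples on which the other two models agree."""
    preds = [m.predict(X_unlabeled) for m in models]
    new_models = []
    for i in range(3):
        j, k = [idx for idx in range(3) if idx != i]
        agree = preds[j] == preds[k]
        X_aug = np.vstack([X_labeled, X_unlabeled[agree]])
        y_aug = np.concatenate([y_labeled, preds[j][agree]])
        new_models.append(LogisticRegression(max_iter=1000).fit(X_aug, y_aug))
    return new_models

rng = np.random.default_rng(0)
X_l, y_l = rng.normal(size=(60, 5)), rng.integers(0, 2, 60)
X_u = rng.normal(size=(200, 5))
models = [LogisticRegression(max_iter=1000).fit(X_l, y_l) for _ in range(3)]
models = tri_training_round(models, X_l, y_l, X_u)
```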
no code implementations • 3 Apr 2018 • Sebastian Ruder, John Glover, Afshin Mehrabani, Parsa Ghaffari
To ameliorate this, we propose 360° Stance Detection, a tool that aggregates news with multiple perspectives on a topic.
1 code implementation • NAACL 2018 • Isabelle Augenstein, Sebastian Ruder, Anders Søgaard
We combine multi-task learning and semi-supervised learning by inducing a joint embedding space between disparate label spaces and learning transfer functions between label embeddings, enabling us to jointly leverage unlabelled data and auxiliary, annotated datasets.
65 code implementations • ACL 2018 • Jeremy Howard, Sebastian Ruder
Inductive transfer learning has greatly impacted computer vision, but existing approaches in NLP still require task-specific modifications and training from scratch.
Ranked #4 on Text Classification on AG News
1 code implementation • EMNLP 2017 • Sebastian Ruder, Barbara Plank
Domain similarity measures can be used to gauge adaptability and select suitable data for transfer learning, but existing approaches define ad hoc measures that are deemed suitable for respective tasks.
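One common instance of such a similarity measure is the Jensen-Shannon divergence between the term distributions of a source and a target domain; the sketch below is a generic illustration rather than the specific measures evaluated in the paper:

```python
import numpy as np
from collections import Counter

def js_divergence(p: np.ndarray, q: np.ndarray) -> float:
    """Jensen-Shannon divergence between two discrete distributions."""
    m = 0.5 * (p + q)
    kl = lambda a, b: float(np.sum(a * np.log(a / b)))
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def term_distribution(texts, vocab):
    counts = Counter(word for text in texts for word in text.split())
    dist = np.array([counts[w] for w in vocab], dtype=float) + 1e-12  # smoothing avoids zeros
    return dist / dist.sum()

source = ["the plot was gripping", "a dull film overall"]
target = ["battery life is great", "the screen is dull"]
vocab = sorted({w for t in source + target for w in t.split()})
print(js_divergence(term_distribution(source, vocab), term_distribution(target, vocab)))
```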
no code implementations • 15 Jun 2017 • Sebastian Ruder, Ivan Vulić, Anders Søgaard
Cross-lingual representations of words enable us to reason about word meaning in multilingual contexts and are a key facilitator of cross-lingual transfer when developing natural language processing models for low-resource languages.
4 code implementations • 15 Jun 2017 • Sebastian Ruder
Multi-task learning (MTL) has led to successes in many applications of machine learning, from natural language processing and speech recognition to computer vision and drug discovery.
2 code implementations • 23 May 2017 • Sebastian Ruder, Joachim Bingel, Isabelle Augenstein, Anders Søgaard
In practice, however, MTL involves searching an enormous space of possible parameter sharing architectures to find (a) the layers or subspaces that benefit from sharing, (b) the appropriate amount of sharing, and (c) the appropriate relative weights of the different task losses.
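One way to make the amount of sharing itself learnable, in the spirit of cross-stitch/sluice-style units, is a learned linear mixing of the two tasks' hidden states, with the relative task-loss weights tuned alongside; this is a simplified sketch, not the paper's full architecture:

```python
import torch
import torch.nn as nn

class SharingUnit(nn.Module):
    """Learned linear mixing of two tasks' hidden states (cross-stitch/sluice style)."""

    def __init__(self):
        super().__init__()
        # alpha[i, j]: how much task i's new state draws on task j's previous state.
        self.alpha = nn.Parameter(torch.eye(2))

    def forward(self, h_task_a, h_task_b):
        mixed_a = self.alpha[0, 0] * h_task_a + self.alpha[0, 1] * h_task_b
        mixed_b = self.alpha[1, 0] * h_task_a + self.alpha[1, 1] * h_task_b
        return mixed_a, mixed_b

unit = SharingUnit()
a, b = unit(torch.randn(4, 128), torch.randn(4, 128))
# A joint loss would then combine the task losses with (possibly learned) weights,
# e.g. loss = w_a * loss_a + w_b * loss_b.
```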
1 code implementation • 8 Feb 2017 • Sebastian Ruder, Parsa Ghaffari, John G. Breslin
However, the selection of appropriate training data is as important as the choice of algorithm.
no code implementations • 7 Feb 2017 • Sebastian Ruder, Parsa Ghaffari, John G. Breslin
Domain adaptation is crucial in many real-world applications where the distribution of the training data differs from the distribution of the test data.
no code implementations • WS 2016 • Sebastian Ruder, Parsa Ghaffari, John G. Breslin
Humans continuously adapt their style and language to a variety of domains.
3 code implementations • 21 Sep 2016 • Sebastian Ruder, Parsa Ghaffari, John G. Breslin
Convolutional neural networks (CNNs) have demonstrated superior capability for extracting information from raw signals in computer vision.
21 code implementations • 15 Sep 2016 • Sebastian Ruder
Gradient descent optimization algorithms, while increasingly popular, are often used as black-box optimizers, as practical explanations of their strengths and weaknesses are hard to come by.
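As one concrete example of what such an overview unpacks, the widely used Adam update combines a momentum-style first-moment estimate with an RMSProp-style second-moment estimate; a minimal NumPy sketch of the standard update rule on a toy quadratic:

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update: bias-corrected moment estimates, then a scaled step."""
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad ** 2
    m_hat = m / (1 - beta1 ** t)  # first-moment (momentum-style) estimate
    v_hat = v / (1 - beta2 ** t)  # second-moment (RMSProp-style) estimate
    return theta - lr * m_hat / (np.sqrt(v_hat) + eps), m, v

# Minimise f(theta) = theta**2 starting from theta = 5.0.
theta, m, v = 5.0, 0.0, 0.0
for t in range(1, 501):
    theta, m, v = adam_step(theta, 2 * theta, m, v, t)
print(theta)  # ends close to the minimum at 0
```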
no code implementations • SEMEVAL 2016 • Sebastian Ruder, Parsa Ghaffari, John G. Breslin
This paper describes our deep learning-based approach to multilingual aspect-based sentiment analysis as part of SemEval 2016 Task 5.
no code implementations • SEMEVAL 2016 • Sebastian Ruder, Parsa Ghaffari, John G. Breslin
This paper describes our deep learning-based approach to sentiment analysis in Twitter as part of SemEval-2016 Task 4.
no code implementations • EMNLP 2016 • Sebastian Ruder, Parsa Ghaffari, John G. Breslin
Opinion mining from customer reviews has become pervasive in recent years.
Aspect-Based Sentiment Analysis (ABSA)
General Classification