
XLNet is an autoregressive Transformer that combines the strengths of autoregressive language modeling and autoencoding while avoiding their respective limitations. Instead of using a fixed forward or backward factorization order, as in conventional autoregressive models, XLNet maximizes the expected log-likelihood of a sequence with respect to all possible permutations of the factorization order. Thanks to this permutation operation, the context for each position can consist of tokens from both left and right; in expectation, each position learns to use contextual information from all positions, i.e., it captures bidirectional context.
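The permutation objective can be illustrated with a small sketch (illustrative code with a hypothetical helper, not from the paper): for a sampled factorization order, each token is conditioned on the tokens that precede it in that order, so its context may include positions to its right in the original sequence.

```python
def factorization_contexts(perm):
    """For a factorization order `perm` (a permutation of positions),
    return, for each original position, the set of positions it is
    conditioned on: its predecessors in the permuted order."""
    return {pos: set(perm[:t]) for t, pos in enumerate(perm)}

# Identity order: ordinary left-to-right autoregressive context.
print(factorization_contexts([0, 1, 2, 3])[1])  # {0}

# A permuted order: position 0 is now conditioned on positions 2 and 3,
# i.e., on tokens to its right in the original sequence.
print(factorization_contexts([2, 3, 0, 1])[0])  # {2, 3}
```

Note that XLNet permutes only the factorization order via attention masks (its two-stream attention, omitted here); the token sequence itself keeps its natural order.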

Additionally, inspired by the latest advances in autoregressive language modeling, XLNet integrates the segment-recurrence mechanism and relative positional encoding scheme of Transformer-XL into pretraining, which empirically improves performance, especially on tasks involving longer text sequences.
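The two Transformer-XL components mentioned above can be sketched as follows (hypothetical helper names; the real mechanism operates on hidden-state tensors and learned biases): cached states from the previous segment extend the keys and values the current segment attends to, and the positional bias depends only on the query-key distance, never on absolute indices.

```python
def attention_span(mem_len, seg_len):
    """Segment recurrence: keys/values cover the cached hidden states of
    the previous segment plus the current segment, so the effective
    context grows beyond a single segment."""
    return mem_len + seg_len

def relative_offsets(mem_len, seg_len):
    """Relative encoding with a causal mask: query i (in the current
    segment) attends to every key j up to its own absolute position over
    [memory + segment]; the positional term depends only on the
    distance q_abs - j."""
    offsets = []
    for i in range(seg_len):
        q_abs = mem_len + i  # absolute index of the query token
        offsets.append([q_abs - j for j in range(q_abs + 1)])
    return offsets

# Memory of 2 cached states, current segment of 3 tokens:
print(attention_span(2, 3))       # 5
print(relative_offsets(2, 3)[0])  # [2, 1, 0]
```

Because scores depend only on these offsets, the same parameters apply regardless of which segment a token falls in, which is what makes the recurrence over cached segments coherent.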

Source: XLNet: Generalized Autoregressive Pretraining for Language Understanding

Latest Papers

Transformers: "The End of History" for NLP?
Anton Chernyavskiy, Dmitry Ilvovsky, Preslav Nakov
2021-04-09
Exploring Transformers in Emotion Recognition: a comparison of BERT, DistillBERT, RoBERTa, XLNet and ELECTRA
Diogo Cortiz
2021-04-05
K-XLNet: A General Method for Combining Explicit Knowledge with Language Model Pretraining
Ruiqing Yan, Lanchang Sun, Fang Wang, XiaoMing Zhang
2021-03-25
Comparing the Performance of NLP Toolkits and Evaluation measures in Legal Tech
Muhammad Zohaib Khan
2021-03-12
Orthogonal Attention: A Cloze-Style Approach to Negation Scope Resolution
Aditya Khandelwal, Vahida Attar
2021-03-07
NLP-CUET@LT-EDI-EACL2021: Multilingual Code-Mixed Hope Speech Detection using Cross-lingual Representation Learner
Eftekhar Hossain, Omar Sharif, Mohammed Moshiul Hoque
2021-02-28
NLP-CUET@DravidianLangTech-EACL2021: Investigating Visual and Textual Features to Identify Trolls from Multimodal Social Media Memes
Eftekhar Hossain, Omar Sharif, Mohammed Moshiul Hoque
2021-02-28
Few Shot Learning for Information Verification
Usama Khalid, Mirza Omer Beg
2021-02-22
Exploring Transformers in Natural Language Generation: GPT, BERT, and XLNet
M. Onat Topal, Anil Bas, Imke van Heerden
2021-02-16
Fake News Detection System using XLNet model with Topic Distributions: CONSTRAINT@AAAI2021 Shared Task
Akansha Gautam, Venktesh V, Sarah Masud
2021-01-12
Cisco at AAAI-CAD21 shared task: Predicting Emphasis in Presentation Slides using Contextualized Embeddings
Sreyan Ghosh, Sonal Kumar, Harsh Jalan, Hemant Yadav, Rajiv Ratn Shah
2021-01-10
Transformer-based approach towards music emotion recognition from lyrics
Yudhik Agrawal, Ramaguru Guru Ravi Shanker, Vinoo Alluri
2021-01-06
Syntactic Relevance XLNet Word Embedding Generation in Low-Resource Machine Translation
Anonymous
2021-01-01
EarlyBERT: Efficient BERT Training via Early-bird Lottery Tickets
Xiaohan Chen, Yu Cheng, Shuohang Wang, Zhe Gan, Zhangyang Wang, Jingjing Liu
2020-12-31
DialogXL: All-in-One XLNet for Multi-Party Conversation Emotion Recognition
Weizhou Shen, Junqing Chen, Xiaojun Quan, Zhixian Xie
2020-12-16
Discriminative Pre-training for Low Resource Title Compression in Conversational Grocery
Snehasish Mukherjee, Phaniram Sayapaneni, Shankar Subramanya
2020-12-13
Yelp Review Rating Prediction: Machine Learning and Deep Learning Models
Zefang Liu
2020-12-12
Neural language models for text classification in evidence-based medicine
Andres Carvallo, Denis Parra, Gabriel Rada, Daniel Perez, Juan Ignacio Vasquez, Camilo Vergara
2020-12-01
Multitask Learning of Negation and Speculation using Transformers
Aditya Khandelwal, Benita Kathleen Britto
2020-11-20
BioNerFlair: biomedical named entity recognition using flair embedding and sequence tagger
Harsh Patel
2020-11-03
CHIME: Cross-passage Hierarchical Memory Network for Generative Review Question Answering
Junru Lu, Gabriele Pergola, Lin Gui, Binyang Li, Yulan He
2020-11-01
KERMIT: Complementing Transformer Architectures with Encoders of Explicit Syntactic Interpretations
Fabio Massimo Zanzotto, Andrea Santilli, Leonardo Ranaldi, Dario Onorati, Pierfrancesco Tommasino, Francesca Fallucchi
2020-11-01
ERNIE-Gram: Pre-Training with Explicitly N-Gram Masked Language Modeling for Natural Language Understanding
Dongling Xiao, Yu-Kun Li, Han Zhang, Yu Sun, Hao Tian, Hua Wu, Haifeng Wang
2020-10-23
AutoMeTS: The Autocomplete for Medical Text Simplification
Hoang Van, David Kauchak, Gondy Leroy
2020-10-20
Performance of Transfer Learning Model vs. Traditional Neural Network in Low System Resource Environment
William Hui
2020-10-20
NUIG-Shubhanker@Dravidian-CodeMix-FIRE2020: Sentiment Analysis of Code-Mixed Dravidian text using XLNet
Shubhanker Banerjee, Arun Jayapal, Sajeetha Thavareesan
2020-10-15
Interpreting Attention Models with Human Visual Attention in Machine Reading Comprehension
Ekta Sood, Simon Tannert, Diego Frassinelli, Andreas Bulling, Ngoc Thang Vu
2020-10-13
Aspect-based Document Similarity for Research Papers
Malte Ostendorff, Terry Ruas, Till Blume, Bela Gipp, Georg Rehm
2020-10-13
Automated Concatenation of Embeddings for Structured Prediction
Xinyu Wang, Yong Jiang, Nguyen Bach, Tao Wang, Zhongqiang Huang, Fei Huang, Kewei Tu
2020-10-10
Analyzing Individual Neurons in Pre-trained Language Models
Nadir Durrani, Hassan Sajjad, Fahim Dalvi, Yonatan Belinkov
2020-10-06
PUM at SemEval-2020 Task 12: Aggregation of Transformer-based models' features for offensive language recognition
Piotr Janiszewski, Mateusz Skiba, Urszula Walińska
2020-10-05
How Effective is Task-Agnostic Data Augmentation for Pretrained Transformers?
Shayne Longpre, Yu Wang, Christopher DuBois
2020-10-05
Examining the rhetorical capacities of neural language models
Zining Zhu, Chuer Pan, Mohamed Abdalla, Frank Rudzicz
2020-10-01
Accelerating Multi-Model Inference by Merging DNNs of Different Weights
Joo Seong Jeong, Soojeong Kim, Gyeong-In Yu, Yunseong Lee, Byung-Gon Chun
2020-09-28
Weird AI Yankovic: Generating Parody Lyrics
Mark Riedl
2020-09-25
BET: A Backtranslation Approach for Easy Data Augmentation in Transformer-based Paraphrase Identification Context
Jean-Philippe Corbeil, Hadi Abdi Ghadivel
2020-09-25
Fine-tuning Pre-trained Contextual Embeddings for Citation Content Analysis in Scholarly Publication
Haihua Chen, Huyen Nguyen
2020-09-12
UPB at SemEval-2020 Task 6: Pretrained Language Models for Definition Extraction
Andrei-Marius Avram, Dumitru-Clementin Cercel, Costin-Gabriel Chiru
2020-09-11
Compressed Deep Networks: Goodbye SVD, Hello Robust Low-Rank Approximation
Murad Tukan, Alaa Maalouf, Matan Weksler, Dan Feldman
2020-09-11
QiaoNing at SemEval-2020 Task 4: Commonsense Validation and Explanation system based on ensemble of language model
Pai Liu
2020-09-06
EdinburghNLP at WNUT-2020 Task 2: Leveraging Transformers with Generalized Augmentation for Identifying Informativeness in COVID-19 Tweets
Nickil Maveli
2020-09-06
A Multitask Deep Learning Approach for User Depression Detection on Sina Weibo
Yiding Wang, Zhenyi Wang, Chenghao Li, Yilin Zhang, Haizhou Wang
2020-08-26
KR-BERT: A Small-Scale Korean-Specific Language Model
Sangah Lee, Hansol Jang, Yunmee Baik, Suzi Park, Hyopil Shin
2020-08-10
Multi-node Bert-pretraining: Cost-efficient Approach
Jiahuang Lin, Xin Li, Gennady Pekhimenko
2020-08-01
Neural Machine Translation with Error Correction
Kaitao Song, Xu Tan, Jianfeng Lu
2020-07-21
Detecting Sarcasm in Conversation Context Using Transformer-Based Models
Adithya Avvaru, Sanath Vobilisetty, Radhika Mamidi
2020-07-01
Metaphor Detection Using Contextual Word Embeddings From Transformers
Jerry Liu, Nathan O'Hara, Alex Rubiner, Rachel Draelos, Cynthia Rudin
2020-07-01
A Transformer Approach to Contextual Sarcasm Detection in Twitter
Hunter Gregory, Steven Li, Pouya Mohammadi, Natalie Tarn, Rachel Draelos, Cynthia Rudin
2020-07-01
Transferring Monolingual Model to Low-Resource Language: The Case of Tigrinya
Abrhalei Tela, Abraham Woubie, Ville Hautamaki
2020-06-13
Using Large Pretrained Language Models for Answering User Queries from Product Specifications
Kalyani Roy, Smit Shah, Nithish Pai, Jaidam Ramtej, Prajit Prashant Nadkarn, Jyotirmoy Banerjee, Pawan Goyal, Surender Kumar
2020-05-29
A Comparative Study of Lexical Substitution Approaches based on Neural Language Models
Nikolay Arefyev, Boris Sheludko, Alexander Podolskiy, Alexander Panchenko
2020-05-29
ImpactCite: An XLNet-based method for Citation Impact Analysis
Dominique Mercier, Syed Tahseen Raza Rizvi, Vikas Rajashekar, Andreas Dengel, Sheraz Ahmed
2020-05-05
DeFormer: Decomposing Pre-trained Transformers for Faster Question Answering
Qingqing Cao, Harsh Trivedi, Aruna Balasubramanian, Niranjan Balasubramanian
2020-05-02
Cross-lingual Information Retrieval with BERT
Zhuolin Jiang, Amro El-Jaroudi, William Hartmann, Damianos Karakos, Lingjun Zhao
2020-04-24
StereoSet: Measuring stereotypical bias in pretrained language models
Moin Nadeem, Anna Bethke, Siva Reddy
2020-04-20
MPNet: Masked and Permuted Pre-training for Language Understanding
Kaitao Song, Xu Tan, Tao Qin, Jianfeng Lu, Tie-Yan Liu
2020-04-20
On the Effect of Dropping Layers of Pre-trained Transformer Models
Hassan Sajjad, Fahim Dalvi, Nadir Durrani, Preslav Nakov
2020-04-08
Analyzing Redundancy in Pretrained Transformer Models
Fahim Dalvi, Hassan Sajjad, Nadir Durrani, Yonatan Belinkov
2020-04-08
ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators
Kevin Clark, Minh-Thang Luong, Quoc V. Le, Christopher D. Manning
2020-03-23
Pairwise Multi-Class Document Classification for Semantic Relations between Wikipedia Articles
Malte Ostendorff, Terry Ruas, Moritz Schubotz, Georg Rehm, Bela Gipp
2020-03-22
Stress Test Evaluation of Transformer-based Models in Natural Language Understanding Tasks
Carlos Aspillaga, Andrés Carvallo, Vladimir Araujo
2020-02-14
Resolving the Scope of Speculation and Negation using Transformer-Based Architectures
Benita Kathleen Britto, Aditya Khandelwal
2020-01-09
BERT-AL: BERT for Arbitrarily Long Document Understanding
Ruixuan Zhang, Zhuoyu Wei, Yu Shi, Yining Chen
2020-01-01
Clinical XLNet: Modeling Sequential Clinical Notes and Predicting Prolonged Mechanical Ventilation
Kexin Huang, Abhishek Singh, Sitong Chen, Edward T. Moseley, Chih-ying Deng, Naomi George, Charlotta Lindvall
2019-12-27
WaLDORf: Wasteless Language-model Distillation On Reading-comprehension
James Yi Tian, Alexander P. Kreuzer, Pai-Hung Chen, Hans-Martin Will
2019-12-13
An Exploration of Data Augmentation and Sampling Techniques for Domain-Agnostic Question Answering
Shayne Longpre, Yi Lu, Zhucheng Tu, Chris DuBois
2019-12-04
Evaluating Commonsense in Pre-trained Language Models
Xuhui Zhou, Yue Zhang, Leyang Cui, Dandan Huang
2019-11-27
Low Rank Factorization for Compact Multi-Head Self-Attention
Sneha Mehta, Huzefa Rangwala, Naren Ramakrishnan
2019-11-26
Attending to Entities for Better Text Understanding
Pengxiang Cheng, Katrin Erk
2019-11-11
IIT-KGP at COIN 2019: Using pre-trained Language Models for modeling Machine Comprehension
Prakhar Sharma, Sumegh Roychowdhury
2019-11-01
Pingan Smart Health and SJTU at COIN - Shared Task: utilizing Pre-trained Language Models and Common-sense Knowledge in Machine Reading Tasks
Xiepeng Li, Zhexi Zhang, Wei Zhu, Zheng Li, Yuan Ni, Peng Gao, Junchi Yan, Guotong Xie
2019-11-01
FASPell: A Fast, Adaptable, Simple, Powerful Chinese Spell Checker Based On DAE-Decoder Paradigm
Yuzhong Hong, Xianguo Yu, Neng He, Nan Liu, Junhui Liu
2019-11-01
Generalizing Question Answering System with Pre-trained Language Model Fine-tuning
Dan Su, Yan Xu, Genta Indra Winata, Peng Xu, Hyeondey Kim, Zihan Liu, Pascale Fung
2019-11-01
Transfer Learning from Transformers to Fake News Challenge Stance Detection (FNC-1) Task
Valeriya Slovikovskaya
2019-10-31
Modeling Inter-Speaker Relationship in XLNet for Contextual Spoken Language Understanding
Jonggu Kim, Jong-Hyeok Lee
2019-10-28
Speech-XLNet: Unsupervised Acoustic Model Pretraining For Self-Attention Networks
Xingchen Song, Guangsen Wang, Zhiyong Wu, Yiheng Huang, Dan Su, Dong Yu, Helen Meng
2019-10-23
XL-Editor: Post-editing Sentences with XLNet
Yong-Siang Shih, Wei-Cheng Chang, Yiming Yang
2019-10-19
Multilingual Question Answering from Formatted Text applied to Conversational Agents
Wissam Siblini, Charlotte Pasqual, Axel Lavielle, Mohamed Challal, Cyril Cauchois
2019-10-10
Extremely Small BERT Models from Mixed-Vocabulary Training
Sanqiang Zhao, Raghav Gupta, Yang Song, Denny Zhou
2019-09-25
Language models and Automated Essay Scoring
Pedro Uria Rodriguez, Amir Jafari, Christopher M. Ormerod
2019-09-18
Frustratingly Easy Natural Question Answering
Lin Pan, Rishav Chakravarti, Anthony Ferritto, Michael Glass, Alfio Gliozzo, Salim Roukos, Radu Florian, Avirup Sil
2019-09-11
Reasoning Over Semantic-Level Graph for Fact Checking
Wanjun Zhong, Jingjing Xu, Duyu Tang, Zenan Xu, Nan Duan, Ming Zhou, Jiahai Wang, Jian Yin
2019-09-09
Transfer Learning Robustness in Multi-Class Categorization by Fine-Tuning Pre-Trained Contextualized Language Models
Xinyi Liu, Artit Wangperawong
2019-09-08
Integrating Multimodal Information in Large Pretrained Transformers
Wasifur Rahman, Md. Kamrul Hasan, Sangwu Lee, Amir Zadeh, Chengfeng Mao, Louis-Philippe Morency, Ehsan Hoque
2019-08-15
Scalable Attentive Sentence-Pair Modeling via Distilled Sentence Embedding
Oren Barkan, Noam Razin, Itzik Malkiel, Ori Katz, Avi Caciularu, Noam Koenigstein
2019-08-14
BioFLAIR: Pretrained Pooled Contextualized Embeddings for Biomedical Sequence Labeling Tasks
Shreyas Sharma, Ron Daniel Jr
2019-08-13
ERNIE 2.0: A Continual Pre-training Framework for Language Understanding
Yu Sun, Shuohuan Wang, Yukun Li, Shikun Feng, Hao Tian, Hua Wu, Haifeng Wang
2019-07-29
XLNet: Generalized Autoregressive Pretraining for Language Understanding
Zhilin Yang, Zihang Dai, Yiming Yang, Jaime Carbonell, Ruslan Salakhutdinov, Quoc V. Le
2019-06-19
