GPT is a Transformer-based architecture and training procedure for natural language processing tasks. Training proceeds in two stages: first, a language modeling objective is used on unlabeled data to learn the initial parameters of a neural network model; these parameters are then adapted to a target task using the corresponding supervised objective.

Source: Improving Language Understanding by Generative Pre-Training
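
For concreteness, the snippet below sketches that two-stage procedure in PyTorch. It is a minimal illustration, not the paper's released implementation: the model dimensions, the random stand-in "corpus", and the two-way classification head are all assumptions made for the example.

```python
# Minimal sketch of GPT's two-stage training (hypothetical sizes and data).
import torch
import torch.nn as nn
import torch.nn.functional as F

VOCAB, MAXLEN = 100, 32  # toy values, not the paper's

class TinyGPT(nn.Module):
    def __init__(self, d_model=64, n_heads=4, n_layers=2):
        super().__init__()
        self.tok = nn.Embedding(VOCAB, d_model)
        self.pos = nn.Embedding(MAXLEN, d_model)
        layer = nn.TransformerEncoderLayer(
            d_model, n_heads, dim_feedforward=4 * d_model, batch_first=True)
        self.blocks = nn.TransformerEncoder(layer, n_layers)
        self.lm_head = nn.Linear(d_model, VOCAB)  # stage-1 head
        self.clf_head = nn.Linear(d_model, 2)     # stage-2 head (e.g. a 2-way task)

    def forward(self, ids):
        t = ids.size(1)
        x = self.tok(ids) + self.pos(torch.arange(t, device=ids.device))
        causal = nn.Transformer.generate_square_subsequent_mask(t)
        return self.blocks(x, mask=causal)  # decoder-style causal self-attention

model = TinyGPT()
opt = torch.optim.Adam(model.parameters(), lr=3e-4)

# Stage 1: unsupervised pre-training with a language modeling objective
# (predict token t+1 from tokens <= t) on unlabeled text.
unlabeled = torch.randint(0, VOCAB, (8, MAXLEN))  # stand-in corpus
logits = model.lm_head(model(unlabeled[:, :-1]))
lm_loss = F.cross_entropy(logits.reshape(-1, VOCAB), unlabeled[:, 1:].reshape(-1))
lm_loss.backward()
opt.step()
opt.zero_grad()

# Stage 2: adapt the same parameters to a target task with its supervised
# objective; the final hidden state feeds a small task-specific head.
inputs = torch.randint(0, VOCAB, (8, MAXLEN))
labels = torch.randint(0, 2, (8,))
clf_loss = F.cross_entropy(model.clf_head(model(inputs)[:, -1]), labels)
# The paper also keeps language modeling as an auxiliary fine-tuning
# objective: total = clf_loss + lambda * lm_loss.
clf_loss.backward()
opt.step()
opt.zero_grad()
```

The paper's fine-tuning stage additionally retains the language modeling loss as a weighted auxiliary term, which improves generalization; the comment in stage 2 above marks where that term would enter.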

Latest Papers

Hierarchical GPT with Congruent Transformers for Multi-Sentence Language Models
Jihyeon Roh, Huiseong Gim, Soo-Young Lee
2020-09-18
Comparative Evaluation of Pretrained Transfer Learning Models on Automatic Short Answer Grading
Sasi Kiran Gaddipati, Deebul Nair, Paul G. Plöger
2020-09-02
Knowledge Efficient Deep Learning for Natural Language Processing
Hai Wang
2020-08-28
On-The-Fly Information Retrieval Augmentation for Language Models
Hai Wang, David McAllester
2020-07-03
Roles and Utilization of Attention Heads in Transformer-based Neural Language Models
Jae-young Jo, Sung-Hyon Myaeng
2020-07-01
Emergence of Separable Manifolds in Deep Language Representations
Jonathan Mamou, Hang Le, Miguel Del Rio, Cory Stephenson, Hanlin Tang, Yoon Kim, SueYeon Chung
2020-06-01
On the Generation of Medical Dialogues for COVID-19
Wenmian Yang, Guangtao Zeng, Bowen Tan, Zeqian Ju, Subrato Chakravorty, Xuehai He, Shu Chen, Xingyi Yang, Qingyang Wu, Zhou Yu, Eric Xing, Pengtao Xie
2020-05-11
Spying on your neighbors: Fine-grained probing of contextual embeddings for information about surrounding words
Josef Klafka, Allyson Ettinger
2020-05-04
Multilingual Corpus Creation for Multilingual Semantic Similarity Task
Mahtab Ahmed, Chahna Dixit, Robert E. Mercer, Atif Khan, Muhammad Rifayat Samee, Felipe Urra
2020-05-01
ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators
Kevin Clark, Minh-Thang Luong, Quoc V. Le, Christopher D. Manning
2020-03-23
Training Large Neural Networks with Constant Memory using a New Execution Algorithm
Bharadwaj Pudipeddi, Maral Mesmakhosroshahi, Jinwen Xi, Sujeeth Bharadwaj
2020-02-13
Joint Contextual Modeling for ASR Correction and Language Understanding
Yue Weng, Sai Sumanth Miryala, Chandra Khatri, Runze Wang, Huaixiu Zheng, Piero Molino, Mahdi Namazifar, Alexandros Papangelis, Hugh Williams, Franziska Bell, Gokhan Tur
2020-01-28
A Comparative Study of Pretrained Language Models on Thai Social Text Categorization
Thanapapas Horsuwan, Kasidis Kanwatchara, Peerapon Vateekul, Boonserm Kijsirikul
2019-12-03
Evaluating Commonsense in Pre-trained Language Models
Xuhui Zhou, Yue Zhang, Leyang Cui, Dandan Huang
2019-11-27
Zero-Shot Paraphrase Generation with Multilingual Language Models
Yinpeng Guo, Yi Liao, Xin Jiang, Qing Zhang, Yibo Zhang, Qun Liu
2019-11-09
Inspecting Unification of Encoding and Matching with Transformer: A Case Study of Machine Reading Comprehension
Hangbo Bao, Li Dong, Furu Wei, Wenhui Wang, Nan Yang, Lei Cui, Songhao Piao, Ming Zhou
2019-11-01
Natural Language Generation for Effective Knowledge Distillation
Raphael Tang, Yao Lu, Jimmy Lin
2019-11-01
An Empirical Study of Efficient ASR Rescoring with Transformers
Hongzhao Huang, Fuchun Peng
2019-10-24
Evolution of transfer learning in natural language processing
Aditya Malte, Pratik Ratadiya
2019-10-16
Q8BERT: Quantized 8Bit BERT
Ofir Zafrir, Guy Boudoukh, Peter Izsak, Moshe Wasserblat
2019-10-14
ZeRO: Memory Optimizations Toward Training Trillion Parameter Models
Samyam Rajbhandari, Jeff Rasley, Olatunji Ruwase, Yuxiong He
2019-10-04
Extreme Language Model Compression with Optimal Subwords and Shared Projections
Sanqiang Zhao, Raghav Gupta, Yang Song, Denny Zhou
2019-09-25
How Additional Knowledge can Improve Natural Language Commonsense Question Answering?
Arindam Mitra, Pratyay Banerjee, Kuntal Kumar Pal, Swaroop Mishra, Chitta Baral
2019-09-19
Reasoning Over Semantic-Level Graph for Fact Checking
Wanjun Zhong, Jingjing Xu, Duyu Tang, Zenan Xu, Nan Duan, Ming Zhou, Jiahai Wang, Jian Yin
2019-09-09
Semantics-aware BERT for Language Understanding
Zhuosheng Zhang, Yuwei Wu, Hai Zhao, Zuchao Li, Shuailiang Zhang, Xi Zhou, Xiang Zhou
2019-09-05
Effective Use of Transformer Networks for Entity Tracking
Aditya Gupta, Greg Durrett
2019-09-05
Quantity doesn't buy quality syntax with neural language models
Marten van Schijndel, Aaron Mueller, Tal Linzen
2019-08-31
Adversarial Learning with Contextual Embeddings for Zero-resource Cross-lingual Classification and NER
Phillip Keung, Yichao Lu, Vikas Bhardwaj
2019-08-31
BioFLAIR: Pretrained Pooled Contextualized Embeddings for Biomedical Sequence Labeling Tasks
Shreyas Sharma, Ron Daniel Jr
2019-08-13
GPT-based Generation for Classical Chinese Poetry
Yi Liao, Yasheng Wang, Qun Liu, Xin Jiang
2019-06-29
Fine-tuning Pre-Trained Transformer Language Models to Distantly Supervised Relation Extraction
Christoph Alt, Marc Hübner, Leonhard Hennig
2019-06-19
CODAH: An Adversarially-Authored Question Answering Dataset for Common Sense
Michael Chen, Mike D'Arcy, Alisa Liu, Jared Fernandez, Doug Downey
2019-06-01
Figure Eight at SemEval-2019 Task 3: Ensemble of Transfer Learning Methods for Contextual Emotion Detection
Joan Xiao
2019-06-01
Story Ending Prediction by Transferable BERT
Zhongyang Li, Xiao Ding, Ting Liu
2019-05-17
Language Models with Transformers
Chenguang Wang, Mu Li, Alexander J. Smola
2019-04-20
NLPR@SRPOL at SemEval-2019 Task 6 and Task 5: Linguistically enhanced deep learning offensive sentence classifier
Alessandro Seganti, Helena Sobol, Iryna Orlova, Hannam Kim, Jakub Staniszewski, Tymoteusz Krumholc, Krystian Koziel
2019-04-10
Distilling Task-Specific Knowledge from BERT into Simple Neural Networks
Raphael Tang, Yao Lu, Linqing Liu, Lili Mou, Olga Vechtomova, Jimmy Lin
2019-03-28
Passage Re-ranking with BERT
Rodrigo Nogueira, Kyunghyun Cho
2019-01-13
Linguistic Analysis of Pretrained Sentence Encoders with Acceptability Judgments
Alex Warstadt, Samuel R. Bowman
2019-01-11
Improving Language Understanding by Generative Pre-Training
Alec Radford, Karthik Narasimhan, Tim Salimans, Ilya Sutskever
2018-06-11
