Gaussian Error Linear Units

Introduced by Hendrycks et al. in Gaussian Error Linear Units (GELUs)

The Gaussian Error Linear Unit, or GELU, is an activation function defined as $x\Phi(x)$, where $\Phi(x)$ is the standard Gaussian cumulative distribution function. The GELU nonlinearity weights inputs by their percentile, rather than gating inputs by their sign as ReLUs do ($x\mathbf{1}_{x>0}$). Consequently, the GELU can be thought of as a smoother ReLU.

$$\text{GELU}\left(x\right) = x{P}\left(X\leq{x}\right) = x\Phi\left(x\right) = x \cdot \frac{1}{2}\left[1 + \text{erf}(x/\sqrt{2})\right],$$ if $X\sim \mathcal{N}(0,1)$.
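As a minimal sketch of the definition above, the exact GELU can be evaluated with the error function from Python's standard library. The function names `gelu` and `relu` are illustrative choices, not taken from any particular library.

```python
import math


def gelu(x: float) -> float:
    """Exact GELU: x * Phi(x), where Phi is the standard normal CDF."""
    # Phi(x) = 0.5 * (1 + erf(x / sqrt(2)))
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def relu(x: float) -> float:
    """ReLU for comparison: gates the input by its sign."""
    return max(0.0, x)


if __name__ == "__main__":
    # Print both activations side by side on a few sample inputs.
    for x in (-3.0, -1.0, 0.0, 1.0, 3.0):
        print(f"x={x:+.1f}  gelu={gelu(x):+.4f}  relu={relu(x):+.4f}")
```

Unlike ReLU, the GELU returns small negative values for negative inputs and approaches the identity for large positive inputs, which is the sense in which it acts as a smoother ReLU.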

One can approximate the GELU with $0.5x\left(1+\tanh\left[\sqrt{2/\pi}\left(x + 0.044715x^{3}\right)\right]\right)$ or $x\sigma\left(1.702x\right)$, but PyTorch's exact implementation is fast enough that these approximations may be unnecessary. (See also the SiLU, $x\sigma(x)$, which was also coined in the paper that introduced the GELU.)
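To get a feel for how close the two approximations are, the sketch below compares each against the exact form over a grid of inputs. The function names, the grid range of $[-5, 5]$, and the step of 0.01 are assumptions made for illustration only.

```python
import math


def gelu_exact(x: float) -> float:
    """Exact GELU via the error function."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def gelu_tanh(x: float) -> float:
    """Tanh-based approximation: 0.5x(1 + tanh[sqrt(2/pi)(x + 0.044715x^3)])."""
    inner = math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)
    return 0.5 * x * (1.0 + math.tanh(inner))


def gelu_sigmoid(x: float) -> float:
    """Sigmoid-based approximation: x * sigmoid(1.702x)."""
    return x / (1.0 + math.exp(-1.702 * x))


# Illustrative grid: 1001 evenly spaced points in [-5, 5].
xs = [i / 100.0 for i in range(-500, 501)]
max_err_tanh = max(abs(gelu_tanh(x) - gelu_exact(x)) for x in xs)
max_err_sigmoid = max(abs(gelu_sigmoid(x) - gelu_exact(x)) for x in xs)

if __name__ == "__main__":
    print(f"max |tanh approx - exact|    = {max_err_tanh:.6f}")
    print(f"max |sigmoid approx - exact| = {max_err_sigmoid:.6f}")
```

On this grid the tanh form tracks the exact function more closely than the sigmoid form, while the sigmoid form is the cheaper of the two to evaluate.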

GELUs are used in GPT-3, BERT, and many other Transformer models.

Source: Gaussian Error Linear Units (GELUs)

Latest Papers

PAPER DATE
Aligning Subtitles in Sign Language Videos
Hannah BullTriantafyllos AfourasGül VarolSamuel AlbanieLiliane MomeniAndrew Zisserman
2021-05-06
Adapting Monolingual Models: Data can be Scarce when Language Similarity is High
Wietse de VriesMartijn BarteldsMalvina NissimMartijn Wieling
2021-05-06
Introducing Information Retrieval for Biomedical Informatics Students
| Sanya B. TanejaRichard D. BoyceWilliam T. ReynoldsDenis Newman-Griffis
2021-05-06
Bird's Eye: Probing for Linguistic Graph Structures with a Simple Information-Theoretic Approach
Yifan HouMrinmaya Sachan
2021-05-06
TABBIE: Pretrained Representations of Tabular Data
Hiroshi IidaDung ThaiVarun ManjunathaMohit Iyyer
2021-05-06
One Model to Rule them All: Towards Zero-Shot Learning for Databases
Benjamin HilprechtCarsten Binnig
2021-05-03
Goldilocks: Just-Right Tuning of BERT for Technology-Assisted Review
Eugene YangSean MacAvaneyDavid D. LewisOphir Frieder
2021-05-03
SmoothI: Smooth Rank Indicators for Differentiable IR Metrics
Thibaut ThonetYagmur Gizem CinarEric GaussierMinghan LiJean-Michel Renders
2021-05-03
Unreasonable Effectiveness of Rule-Based Heuristics in Solving Russian SuperGLUE Tasks
Tatyana IazykovaDenis KapelyushnikOlga BystrovaAndrey Kutuzov
2021-05-03
MathBERT: A Pre-Trained Model for Mathematical Formula Understanding
Shuai PengKe YuanLiangcai GaoZhi Tang
2021-05-02
MRCBert: A Machine Reading ComprehensionApproach for Unsupervised Summarization
| Saurabh JainGuokai TangLim Sze Chi
2021-05-01
When to Fold'em: How to answer Unanswerable questions
| Marshall HoZhipeng ZhouJudith He
2021-05-01
BERT Meets Relational DB: Contextual Representations of Relational Databases
Siddhant AroraVinayak GuptaGarima GaurSrikanta Bedathur
2021-04-30
Mitigating Political Bias in Language Models Through Reinforced Calibration
Ruibo LiuChenyan JiaJason WeiGuangxuan XuLili WangSoroush Vosoughi
2021-04-30
Word Sense Disambiguation with Transformer Models
Pierre-Yves VandenbusscheTony ScerriRon Daniel Jr.
2021-04-30
Let's Play Mono-Poly: BERT Can Reveal Words' Polysemy Level and Partitionability into Senses
Aina Garí SolerMarianna Apidianaki
2021-04-29
Entailment as Few-Shot Learner
Sinong WangHan FangMadian KhabsaHanzi MaoHao Ma
2021-04-29
Societal Biases in Retrieved Contents: Measurement Framework and Adversarial Mitigation for BERT Rankers
| Navid RekabsazSimone KopeinikMarkus Schedl
2021-04-28
MelBERT: Metaphor Detection via Contextualized Late Interaction using Metaphorical Identification Theories
| Minjin ChoiSunkyung LeeEunseong ChoiHeesoo ParkJunhyuk LeeDongwon LeeJongwuk Lee
2021-04-28
Improving BERT Model Using Contrastive Learning for Biomedical Relation Extraction
| Peng SuYifan PengK. Vijay-Shanker
2021-04-28
UoT-UWF-PartAI at SemEval-2021 Task 5: Self Attention Based Bi-GRU with Multi-Embedding Representation for Toxicity Highlighter
Hamed Babaei GiglouTaher RahgooyMostafa RahgouyJafar Razmara
2021-04-27
Extractive and Abstractive Explanations for Fact-Checking and Evaluation of News
Ashkan KazemiZehua LiVerónica Pérez-RosasRada Mihalcea
2021-04-27
Semi-supervised Interactive Intent Labeling
Saurav SahayEda OkurNagib HakimLama Nachman
2021-04-27
Multi-class Text Classification using BERT-based Active Learning
Sumanth PrabhuMoosa MohamedHemant Misra
2021-04-27
Easy and Efficient Transformer : Scalable Inference Solution For large NLP mode
| Gongzheng liYadong XiJingzhen DingDuan WangBai LiuChangjie FanXiaoxi MaoZeng Zhao
2021-04-26
Phrase break prediction with bidirectional encoder representations in Japanese text-to-speech synthesis
Kosuke FutamataByeongseon ParkRyuichi YamamotoKentaro Tachibana
2021-04-26
Focused Attention Improves Document-Grounded Generation
| Shrimai PrabhumoyeKazuma HashimotoYingbo ZhouAlan W BlackRuslan Salakhutdinov
2021-04-26
PanGu-$α$: Large-scale Autoregressive Pretrained Chinese Language Models with Auto-parallel Computation
Wei ZengXiaozhe RenTeng SuHui WangYi LiaoZhiwei WangXin JiangZhenZhang YangKaisheng WangXiaoda ZhangChen LiZiyan GongYifan YaoXinjing HuangJun WangJianfeng YuQi GuoYue YuYan ZhangJin WangHengtao TaoDasen YanZexuan YiFang PengFangqing JiangHan ZhangLingfeng DengYehong ZhangZhe LinChao ZhangShaojie ZhangMingyue GuoShanzhi GuGaojun FanYaoWei WangXuefeng JinQun LiuYonghong Tian
2021-04-26
Accounting for Agreement Phenomena in Sentence Comprehension with Transformer Language Models: Effects of Similarity-based Interference on Surprisal and Attention
Soo Hyun RyuRichard L. Lewis
2021-04-26
Extract then Distill: Efficient and Effective Task-Agnostic BERT Distillation
Cheng ChenYichun YinLifeng ShangZhi WangXin JiangXiao ChenQun Liu
2021-04-24
Learning Passage Impacts for Inverted Indexes
Antonio MalliaOmar KhattabNicola TonellottoTorsten Suel
2021-04-24
Optimizing small BERTs trained for German NER
Jochen ZöllnerKonrad SperfeldChristoph WickRoger Labahn
2021-04-23
Multimodal Fusion with BERT and Attention Mechanism for Fake News Detection
Nguyen Manh Duc TuanPham Quang Nhat Minh
2021-04-23
BERT-CoQAC: BERT-based Conversational Question Answering in Context
Munazza ZaibDai Hoang TranSubhash SagarAdnan MahmoodWei E. ZhangQuan Z. Sheng
2021-04-23
Towards Trustworthy Deception Detection: Benchmarking Model Robustness across Domains, Modalities, and Languages
Maria GlenskiEllyn AytonRobin CosbeyDustin ArendtSvitlana Volkova
2021-04-23
On Geodesic Distances and Contextual Embedding Compression for Text Classification
| Rishi JhaKai Mihata
2021-04-22
Discriminative Self-training for Punctuation Prediction
Qian ChenWen WangMengzhe ChenQinglin Zhang
2021-04-21
Disfluency Detection with Unlabeled Data and Small BERT Models
Johann C. RochollVicky ZayatsDaniel D. WalkerNoah B. MuradAaron SchneiderDaniel J. Liebling
2021-04-21
Efficient pre-training objectives for Transformers
Luca Di LielloMatteo GabburoAlessandro Moschitti
2021-04-20
UIT-ISE-NLP at SemEval-2021 Task 5: Toxic Spans Detection with BiLSTM-CRF and Toxic Bert Comment Classification
Son T. LuuNgan Luu-Thuy Nguyen
2021-04-20
Measuring Shifts in Attitudes Towards COVID-19 Measures in Belgium Using Multilingual BERT
| Kristen ScottPieter DelobelleBettina Berendt
2021-04-20
B-PROP: Bootstrapped Pre-training with Representative Words Prediction for Ad-hoc Retrieval
Xinyu MaJiafeng GuoRuqing ZhangYixing FanYingyan LiXueqi Cheng
2021-04-20
WASSA@IITK at WASSA 2021: Multi-task Learning and Transformer Finetuning for Emotion Classification and Empathy Prediction
Jay MundraRohan GuptaSagnik Mukherjee
2021-04-20
CATE meets ML -- The Conditional Average Treatment Effect and Machine Learning
Daniel Jacob
2021-04-20
Analyzing COVID-19 Tweets with Transformer-based Language Models
Philip FeldmanSim TiwariCharissa S. L. CheahJames R. FouldsSHimei Pan
2021-04-20
Subsentence Extraction from Text Using Coverage-Based Deep Learning Language Models
| JongYoon LimInkyu SaHo Seok AhnNorina GasteigerSanghyub John LeeBruce MacDonald
2021-04-20
BigGreen at SemEval-2021 Task 1: Lexical Complexity Prediction with Assembly Models
Aadil IslamWeicheng MaSoroush Vosoughi
2021-04-19
Probing for Bridging Inference in Transformer Language Models
Onkar PanditYufang Hou
2021-04-19
Sentiment Classification in Swahili Language Using Multilingual BERT
Gati L. MartinMedard E. MswahiliYoung-Seob Jeong
2021-04-19
TeamUNCC@LT-EDI-EACL2021: Hope Speech Detection using Transfer Learning with Transformers
| Khyati MahajanErfan Al-HossamiSamira Shaikh
2021-04-19
OCTIS: Comparing and Optimizing Topic models is Simple!
| Silvia TerragniElisabetta FersiniBruno Giovanni GaluzziPietro TropeanoAntonio Candelieri
2021-04-19
Neural Language Models with Distant Supervision to Identify Major Depressive Disorder from Clinical Notes
Bhavani Singh Agnikula KshatriyaNicolas A NunezManuel Gardea- ResendezEuijung RyuBrandon J CoombesSunyang FuMark A FryeJoanna M BiernackaYanshan Wang
2021-04-19
Operationalizing a National Digital Library: The Case for a Norwegian Transformer Model
Per E KummervoldJavier de la RosaFreddy WetjenSvein Arne Brygfjeld
2021-04-19
Modeling "Newsworthiness" for Lead-Generation Across Corpora
Alexander SpangherNanyun PengJonathan MayEmilio Ferrara
2021-04-19
ELECTRAMed: a new pre-trained language representation model for biomedical NLP
| Giacomo MioloGiulio MantoanCarlotta Orsenigo
2021-04-19
GPT3Mix: Leveraging Large-scale Language Models for Text Augmentation
Kang Min YooDongju ParkJaewook KangSang-Woo LeeWoomyeong Park
2021-04-18
Fantastically Ordered Prompts and Where to Find Them: Overcoming Few-Shot Prompt Order Sensitivity
Yao LuMax BartoloAlastair MooreSebastian RiedelPontus Stenetorp
2021-04-18
Natural Instructions: Benchmarking Generalization to New Tasks from Natural Language Instructions
Swaroop MishraDaniel KhashabiChitta BaralHannaneh Hajishirzi
2021-04-18
A Token-level Reference-free Hallucination Detection Benchmark for Free-form Text Generation
Tianyu LiuYizhe ZhangChris BrockettYi MaoZhifang SuiWeizhu ChenBill Dolan
2021-04-18
Rethinking Network Pruning -- under the Pre-train and Fine-tune Paradigm
Dongkuan XuIan E. H. YenJinxi ZhaoZhibin Xiao
2021-04-18
Dual-View Distilled BERT for Sentence Embedding
Xingyi Cheng
2021-04-18
Language in a (Search) Box: Grounding Language Learning in Real-World Human-Machine Interaction
Federico BianchiCiro GrecoJacopo Tagliabue
2021-04-18
Zero-shot Cross-lingual Transfer of Neural Machine Translation with Multilingual Pretrained Encoders
Guanhua ChenShuming MaYun ChenLi DongDongdong ZhangJia PanWenping WangFuru Wei
2021-04-18
CEAR: Cross-Entity Aware Reranker for Knowledge Base Completion
Keshav KolluruMayank Singh ChauhanYatin NandwaniParag SinglaMausam
2021-04-18
The Power of Scale for Parameter-Efficient Prompt Tuning
| Brian LesterRami Al-RfouNoah Constant
2021-04-18
mT6: Multilingual Pretrained Text-to-Text Transformer with Translation Pairs
Zewen ChiLi DongShuming MaShaohan Huang Xian-Ling MaoHeyan HuangFuru Wei
2021-04-18
Identifying the Limits of Cross-Domain Knowledge Transfer for Pretrained Models
| Zhengxuan WuNelson F. LiuChristopher Potts
2021-04-17
Multi-source Neural Topic Modeling in Multi-view Embedding Spaces
| Pankaj GuptaYatin ChaudharyHinrich Schütze
2021-04-17
Improving Zero-Shot Cross-Lingual Transfer Learning via Robust Training
Kuan-Hao HuangWasi Uddin AhmadNanyun PengKai-Wei Chang
2021-04-17
UPB at SemEval-2021 Task 5: Virtual Adversarial Training for Toxic Spans Detection
Andrei ParaschivDumitru-Clementin CercelMihai Dascalu
2021-04-17
The Topic Confusion Task: A Novel Scenario for Authorship Attribution
Malik H. AltakroriJackie Chi Kit CheungBenjamin C. M. Fung
2021-04-17
Frequency-based Distortions in Contextualized Word Embeddings
Kaitlyn ZhouKawin EthayarajhDan Jurafsky
2021-04-17
A multilabel approach to morphosyntactic probing
Naomi Tachikawa ShapiroAmandalynne PaulladaShane Steinert-Threlkeld
2021-04-17
Hierarchical Transformer Networks for Longitudinal Clinical Document Classification
Yuqi SiKirk Roberts
2021-04-17
ASBERT: Siamese and Triplet network embedding for open question answering
Olabanji Shonibare
2021-04-17
Co-BERT: A Context-Aware BERT Retrieval Model Incorporating Local and Query-specific Context
Xiaoyang ChenKai HuiBen HeXianpei HanLe SunZheng Ye
2021-04-17
Editing Factual Knowledge in Language Models
| Nicola De CaoWilker AzizIvan Titov
2021-04-16
Fast, Effective and Self-Supervised: Transforming Masked LanguageModels into Universal Lexical and Sentence Encoders
Fangyu LiuIvan VulićAnna KorhonenNigel Collier
2021-04-16
An Adversarially-Learned Turing Test for Dialog Generation Models
| Xiang GaoYizhe ZhangMichel GalleyBill Dolan
2021-04-16
Towards Variable-Length Textual Adversarial Attacks
Junliang GuoZhirui ZhangLinlin ZhangLinli XuBoxing ChenEnhong ChenWeihua Luo
2021-04-16
Temporal Adaptation of BERT and Performance on Downstream Document Classification: Insights from Social Media
| Paul RöttgerJanet B. Pierrehumbert
2021-04-16
Probing Across Time: What Does RoBERTa Know and When?
Leo Z. LiuYizhong WangJungo KasaiHannaneh HajishirziNoah A. Smith
2021-04-16
Text2App: A Framework for Creating Android Apps from Text Descriptions
| Masum HasanKazi Sajeed MehrabWasi Uddin AhmadRifat Shahriyar
2021-04-16
An Analysis of a BERT Deep Learning Strategy on a Technology Assisted Review Task
Alexandros Ioannidis
2021-04-16
Surface Form Competition: Why the Highest Probability Answer Isn't Always Right
| Ari HoltzmanPeter WestVered SchwartzYejin ChoiLuke Zettlemoyer
2021-04-16
Membership Inference Attack Susceptibility of Clinical Language Models
Abhyuday JagannathaBhanu Pratap Singh RawatHong Yu
2021-04-16
BERT memorisation and pitfalls in low-resource scenarios
Michael TänzerSebastian RuderMarek Rei
2021-04-16
A Sample-Based Training Method for Distantly Supervised Relation Extraction with Pre-Trained Transformers
Mehrdad NasserMohamad Bagher SajadiBehrouz Minaei-Bidgoli
2021-04-15
Emotion Dynamics Modeling via BERT
Haiqin YangJianping Shen
2021-04-15
Text Guide: Improving the quality of long text classification by a text selection method based on feature importance
Krzysztof FiokWaldemar KarwowskiEdgar GutierrezMohammad Reza DavahliMaciej WilamowskiTareq AhramAwad Al-JuaidJozef Zurada
2021-04-15
Are Multilingual BERT models robust? A Case Study on Adversarial Attacks for Multilingual Question Answering
Sara RosenthalMihaela BorneaAvirup Sil
2021-04-15
SINA-BERT: A pre-trained Language Model for Analysis of Medical Texts in Persian
Nasrin TaghizadehEhsan DoostmohammadiElham SeifossadatHamid R. RabieeMaedeh S. Tahaei
2021-04-15
Privacy-Adaptive BERT for Natural Language Understanding
Chen QuWeize KongLiu YangMingyang ZhangMichael BenderskyMarc Najork
2021-04-15
UHD-BERT: Bucketed Ultra-High Dimensional Sparse Representations for Full Ranking
Kyoung-Rok JangJunmo KangGiwon HongSung-Hyon MyaengJoohee ParkTaewon YoonHeecheol Seo
2021-04-15
ExplaGraphs: An Explanation Graph Generation Task for Structured Commonsense Reasoning
| Swarnadeep SahaPrateek YadavLisa BauerMohit Bansal
2021-04-15
UIT-E10dot3 at SemEval-2021 Task 5: Toxic Spans Detection with Named Entity Recognition and Question-Answering Approaches
Phu Gia HoangLuan Thanh NguyenKiet Van Nguyen
2021-04-15
BERT based Transformers lead the way in Extraction of Health Information from Social Media
Sidharth RAbhiraj TiwariParthivi ChoubeySaisha KashyapSahil KhoseKumud LakaraNishesh SinghUjjwal Verma
2021-04-15
NT5?! Training T5 to Perform Numerical Reasoning
| Peng-Jian YangYing Ting ChenYuechan ChenDaniel Cer
2021-04-15
TorontoCL at CMCL 2021 Shared Task: RoBERTa with Multi-Stage Fine-Tuning for Eye-Tracking Prediction
| Bai LiFrank Rudzicz
2021-04-15
Does BERT Pretrained on Clinical Notes Reveal Sensitive Data?
| Eric LehmanSarthak JainKarl PichottaYoav GoldbergByron C. Wallace
2021-04-15
How to Train BERT with an Academic Budget
| Peter IzsakMoshe BerchanskyOmer Levy
2021-04-15
A Survey of Recent Abstract Summarization Techniques
Diyah Puspitaningrum
2021-04-15
NAREOR: The Narrative Reordering Problem
Varun GangalSteven Y. FengEduard HovyTeruko Mitamura
2021-04-14
Dependency Parsing based Semantic Representation Learning with Graph Neural Network for Enhancing Expressiveness of Text-to-Speech
Yixuan ZhouChanghe SongJingbei LiZhiyong WuHelen Meng
2021-04-14
On the Robustness of Goal Oriented Dialogue Systems to Real-world Noise
Jason KroneSailik SenguptaSaab Mansoor
2021-04-14
Disentangling Representations of Text by Masking Transformers
Xiongyi ZhangJan-Willem van de MeentByron C. Wallace
2021-04-14
An Interpretability Illusion for BERT
Tolga BolukbasiAdam PearceAnn YuanAndy CoenenEmily ReifFernanda ViégasMartin Wattenberg
2021-04-14
Static Embeddings as Efficient Knowledge Bases?
| Philipp DufterNora KassnerHinrich Schütze
2021-04-14
Demystifying BERT: Implications for Accelerator Design
Suchita PatiShaizeen AgaNuwan JayasenaMatthew D. Sinclair
2021-04-14
Semantic maps and metrics for science Semantic maps and metrics for science using deep transformer encoders
Brendan ChambersJames Evans
2021-04-13
Understanding Transformers for Bot Detection in Twitter
| Andres Garcia-SilvaCristian BerrioJose Manuel Gomez-Perez
2021-04-13
1-bit LAMB: Communication Efficient Large-Scale Large-Batch Training with LAMB's Convergence Speed
| Conglong LiAmmar Ahmad AwanHanlin TangSamyam RajbhandariYuxiong He
2021-04-13
Mediators in Determining what Processing BERT Performs First
| Aviv SlobodkinLeshem ChoshenOmri Abend
2021-04-13
Discourse Probing of Pretrained Language Models
Fajri KotoJey Han LauTimothy Baldwin
2021-04-13
MS2: Multi-Document Summarization of Medical Studies
| Jay DeYoungIz BeltagyMadeleine van ZuylenBailey KuehlLucy Lu Wang
2021-04-13
Large-Scale Contextualised Language Modelling for Norwegian
| Andrey KutuzovJeremy BarnesErik VelldalLilja ØvrelidStephan Oepen
2021-04-13
WHOSe Heritage: Classification of UNESCO World Heritage "Outstanding Universal Value" Documents with Smoothed Labels
| Nan BaiRenqian LuoPirouz NourianAna Pereira Roders
2021-04-12
Fine-Tuning Transformers for Identifying Self-Reporting Potential Cases and Symptoms of COVID-19 in Tweets
| Max FlemingPriyanka DondetiCaitlin N. DreisbachAdam Poliak
2021-04-12
Multilingual Language Models Predict Human Reading Behavior
| Nora HollensteinFederico PirovanoCe ZhangLena JägerLisa Beinborn
2021-04-12
Learning to Remove: Towards Isotropic Pre-trained BERT Embedding
| Yuxin LiangRui CaoJie ZhengJie RenLing Gao
2021-04-12
Learning to Synthesize Data for Semantic Parsing
| Bailin WangWenpeng YinXi Victoria LinCaiming Xiong
2021-04-12
Fighting the COVID-19 Infodemic with a Holistic BERT Ensemble
| Giorgos TziafasKonstantinos KogkalidisTommaso Caselli
2021-04-12
BERT based freedom to operate patent analysis
Michael FreunekAndré Bodmer
2021-04-12
Does syntax matter? A strong baseline for Aspect-based Sentiment Analysis with RoBERTa
| Junqi DaiHang YanTianxiang SunPengFei LiuXipeng Qiu
2021-04-11
Innovative Bert-based Reranking Language Models for Speech Recognition
Shih-Hsuan ChiuBerlin Chen
2021-04-11
UniDrop: A Simple yet Effective Technique to Improve Transformer without Extra Cost
Zhen WuLijun WuQi MengYingce XiaShufang XieTao QinXinyu DaiTie-Yan Liu
2021-04-11
Fine-tuning Encoders for Improved Monolingual and Zero-shot Polylingual Neural Topic Modeling
| Aaron MuellerMark Dredze
2021-04-11
MIPT-NSU-UTMN at SemEval-2021 Task 5: Ensembling Learning with Pre-trained Language Models for Toxic Spans Detection
| Mikhail KotyushevAnna GlazkovaDmitry Morozov
2021-04-10
Meta-tuning Language Models to Answer Prompts Better
Ruiqi ZhongKristy LeeZheng ZhangDan Klein
2021-04-10
ZS-BERT: Towards Zero-Shot Relation Extraction with Attribute Representation Learning
| Chih-Yao ChenCheng-Te Li
2021-04-10
Non-autoregressive Transformer-based End-to-end ASR using BERT
Fu-Hao YuKuan-Yu Chen
2021-04-10
Know What and Know Where: An Object-and-Room Informed Sequential BERT for Indoor Vision-Language Navigation
Yuankai QiZizheng PanYicong HongMing-Hsuan YangAnton Van Den HengelQi Wu
2021-04-09
Knowledge-Aware Graph-Enhanced GPT-2 for Dialogue State Tracking
Weizhe LinBo-Hsian TsengBill Byrne
2021-04-09
Text2Chart: A Multi-Staged Chart Generator from Natural Language Text
| Md. Mahinur RashidHasin Kawsar JahanAnnysha HuzzatRiyasaat Ahmed RahulTamim Bin ZakirFarhana MeemMd. Saddam Hossain MuktaSwakkhar Shatabda
2021-04-09
KI-BERT: Infusing Knowledge Context for Better Language and Domain Understanding
Keyur FalduAmit ShethPrashant KikaniHemang Akabari
2021-04-09
Transformers: "The End of History" for NLP?
Anton ChernyavskiyDmitry IlvovskyPreslav Nakov
2021-04-09
Lone Pine at SemEval-2021 Task 5: Fine-Grained Detection of Hate Speech Using BERToxic
Yakoob KhanWeicheng MaSoroush Vosoughi
2021-04-08
Uppsala NLP at SemEval-2021 Task 2: Multilingual Language Models for Fine-tuning and Feature Extraction in Word-in-Context Disambiguation
Huiling YouXingran ZhuSara Stymne
2021-04-08
Probing BERT in Hyperbolic Spaces
| Boli ChenYao FuGuangwei XuPengjun XieChuanqi TanMosha ChenLiping Jing
2021-04-08
Layer Reduction: Accelerating Conformer-Based Self-Supervised Model via Layer Consistency
Jinchuan TianRongzhi GuHelin WangYuexian Zou
2021-04-08
Combining Pre-trained Word Embeddings and Linguistic Features for Sequential Metaphor Identification
Rui MaoChenghua LinFrank Guerin
2021-04-07
Better Neural Machine Translation by Extracting Linguistic Information from BERT
| Hassan S. ShavaraniAnoop Sarkar
2021-04-07
Interpreting Verbal Metaphors by Paraphrasing
Rui MaoChenghua LinFrank Guerin
2021-04-07
Speak or Chat with Me: End-to-End Spoken Language Understanding System with Flexible Inputs
| Sujeong ChaWangrui HouHyun JungMy PhungMichael PichenyHong-Kwang KuoSamuel ThomasEdmilson Morais
2021-04-07
MuSLCAT: Multi-Scale Multi-Level Convolutional Attention Transformer for Discriminative Music Modeling on Raw Waveforms
Kai MiddlebrookShyam SudhakaranDavid Guy Brizan
2021-04-06
hBert + BiasCorp -- Fighting Racism on the Web
Olawale OnabolaZhuang MaYang XieBenjamin AkeraAbdulrahman IbraheemJia XueDianbo LiuYoshua Bengio
2021-04-06
Attention Head Masking for Inference Time Content Selection in Abstractive Summarization
Shuyang CaoLu Wang
2021-04-06
Variable selection with missing data in both covariates and outcomes: Imputation and machine learning
| Liangyuan HuJung-Yi Joyce LinJiayi Ji
2021-04-06
CodeTrans: Towards Cracking the Language of Silicone's Code Through Self-Supervised Deep Learning and High Performance Computing
| Ahmed ElnaggarWei DingLlion JonesTom GibbsTamas FeherChristoph AngererSilvia SeveriniFlorian MatthesBurkhard Rost
2021-04-06
Efficient transfer learning for NLP with ELECTRA
| François Mercier
2021-04-06
Exploring Transformers in Emotion Recognition: a comparison of BERT, DistillBERT, RoBERTa, XLNet and ELECTRA
Diogo Cortiz
2021-04-05
What's the best place for an AI conference, Vancouver or ______: Why completing comparative questions is difficult
Avishai ZagouryEinat MinkovIdan SzpektorWilliam W. Cohen
2021-04-05
Semantic Distance: A New Metric for ASR Performance Analysis Towards Spoken Language Understanding
Suyoun KimAbhinav AroraDuc LeChing-Feng YehChristian FuegenOzlem KalinliMichael L. Seltzer
2021-04-05
ReCAM@IITK at SemEval-2021 Task 4: BERT and ALBERT based Ensemble for Abstract Word Prediction
| Abhishek MittalAshutosh Modi
2021-04-04
Improving Pretrained Models for Zero-shot Multi-label Text Classification through Reinforced Label Hierarchy Reasoning
| Hui LiuDanqing ZhangBing YinXiaodan Zhu
2021-04-04
MCL@IITK at SemEval-2021 Task 2: Multilingual and Cross-lingual Word-in-Context Disambiguation using Augmented Data, Signals, and Transformers
Rohan GuptaJay MundraDeepak MahajanAshutosh Modi
2021-04-04
Unsupervised Domain Adaptation with Global and Local Graph Neural Networks in Limited Labeled Data Scenario: Application to Disaster Management
Samujjwal GhoshSubhadeep MajiMaunendra Sankar Desarkar
2021-04-03
Exploring the Role of BERT Token Representations to Explain Sentence Probing Results
Hosein MohebbiAli ModarressiMohammad Taher Pilehvar
2021-04-03
IITK@LCP at SemEval 2021 Task 1: Classification for Lexical Complexity Regression Task
| Neil Rajiv ShirudeSagnik MukherjeeTushar ShandhilyaAnanta MukherjeeAshutosh Modi
2021-04-02
The Coronavirus is a Bioweapon: Analysing Coronavirus Fact-Checked Stories
Lynnette Hui Xian NgKathleen M. Carley
2021-04-02
Using GPT-2 to Create Synthetic Data to Improve the Prediction Performance of NLP Machine Learning Classification Models
Dewayne Whitfield
2021-04-02
HLE-UPC at SemEval-2021 Task 5: Multi-Depth DistilBERT for Toxic Spans Detection
| Rafel Palliser-SansAlbert Rial-Farràs
2021-04-01
Kaleido-BERT: Vision-Language Pre-training on Fashion Domain
| Mingchen ZhugeDehong GaoDeng-Ping FanLinbo JinBen ChenHaoming ZhouMinghui QiuLing Shao
2021-03-30
Automatic Graph Partitioning for Very Large-scale Deep Learning
Masahiro TanakaKenjiro TauraToshihiro HanawaKentaro Torisawa
2021-03-30
Grounding Dialogue Systems via Knowledge Graph Aware Decoding with Pre-trained Transformers
| Debanjan ChaudhuriMd Rashad Al Hasan RonyJens Lehmann
2021-03-30
An In-depth Analysis of Passage-Level Label Transfer for Contextual Document Ranking
Koustav RudraZeon Trevor FernandoAvishek Anand
2021-03-30
Multi-Scale Vision Longformer: A New Vision Transformer for High-Resolution Image Encoding
| Pengchuan ZhangXiyang DaiJianwei YangBin XiaoLu YuanLei ZhangJianfeng Gao
2021-03-29
Retraining DistilBERT for a Voice Shopping Assistant by Using Universal Dependencies
Pratik JayaraoArpit Sharma
2021-03-29
Whitening Sentence Representations for Better Semantics and Faster Retrieval
| Jianlin SuJiarun CaoWeijie LiuYangyiwen Ou
2021-03-29
Contextual Text Embeddings for Twi
Paul AzunreSalomey OseiSalomey AddoLawrence Asamoah Adu-GyamfiStephen MooreBernard AdabankahBernard OpokuClara Asare-NyarkoSamuel NyarkoCynthia AmoabaEsther Dansoa AppiahFelix AkwerhRichard Nii Lante LawsonJoel BuduEmmanuel DebrahNana BoatengWisdom OforiEdwin Buabeng-MunkohFranklin AdjeiIsaac Kojo Essel AmpomahJoseph OtooReindorf BorkorStandylove Birago MensahLucien MensahMark Amoako MarcelAnokye Acheampong AmponsahJames Ben Hayfron-Acquah
2021-03-29
PnG BERT: Augmented BERT on Phonemes and Graphemes for Neural TTS
Ye JiaHeiga ZenJonathan ShenYu ZhangYonghui Wu
2021-03-28
Unsupervised Self-Training for Sentiment Analysis of Code-Switched Data
Akshat GuptaSargam MenghaniSai Krishna RallabandiAlan W Black
2021-03-27
Machine Learning Meets Natural Language Processing -- The story so far
N. -I. GalanisP. VafiadisK. -G. MirzaevG. A. Papakostas
2021-03-27
A Practical Survey on Faster and Lighter Transformers
Quentin FournierGaétan Marceau CaronDaniel Aloise
2021-03-26
BART based semantic correction for Mandarin automatic speech recognition system
Yun ZhaoXuerui YangJinchao WangYongyu GaoChao YanYuanfu Zhou
2021-03-26
Predicting Directionality in Causal Relations in Text
| Pedram HosseiniDavid A. BroniatowskiMona Diab
2021-03-25
Visual Grounding Strategies for Text-Only Natural Language Processing
Damien Sileo
2021-03-25
Bertinho: Galician BERT Representations
David VilaresMarcos GarciaCarlos Gómez-Rodríguez
2021-03-25
BERT4SO: Neural Sentence Ordering by Fine-tuning BERT
Yutao ZhuJian-Yun NieKun ZhouShengchao LiuYabo LingPan Du
2021-03-25
An Image is Worth 16x16 Words, What is a Video Worth?
| Gilad SharirAsaf NoyLihi Zelnik-Manor
2021-03-25
Swin Transformer: Hierarchical Vision Transformer using Shifted Windows
| Ze LiuYutong LinYue CaoHan HuYixuan WeiZheng ZhangStephen LinBaining Guo
2021-03-25
K-XLNet: A General Method for Combining Explicit Knowledge with Language Model Pretraining
Ruiqing YanLanchang SunFang WangXiaoMing Zhang
2021-03-25
Thinking Aloud: Dynamic Context Generation Improves Zero-Shot Reasoning Performance of GPT-2
Gregor BetzKyle RichardsonChristian Voigt
2021-03-24
Czert -- Czech BERT-like Model for Language Representation
| Jakub SidoOndřej PražákPavel PřibáňJan PašekMichal SejákMiloslav Konopík
2021-03-24
Are Neural Language Models Good Plagiarists? A Benchmark for Neural Paraphrase Detection
Jan Philip WahleTerry RuasNorman MeuschkeBela Gipp
2021-03-23
Detecting Hate Speech with GPT-3
| Ke-Li ChiuRohan Alexander
2021-03-23
TMR: Evaluating NER Recall on Tough Mentions
Jingxuan TuConstantine Lignos
2021-03-23
Repairing Pronouns in Translation with BERT-Based Post-Editing
Reid Pryzant
2021-03-23
Variable Name Recovery in Decompiled Binary Code using Constrained Masked Language Modeling
Pratyay BanerjeeKuntal Kumar PalFish WangChitta Baral
2021-03-23
The NLP Cookbook: Modern Recipes for Transformer based Deep Learning Architectures
Sushant SinghAusif Mahmood
2021-03-23
Identifying Machine-Paraphrased Plagiarism
| Jan Philip WahleTerry RuasTomáš FoltýnekNorman MeuschkeBela Gipp
2021-03-22
Open Domain Question Answering over Tables via Dense Retrieval
| Jonathan HerzigThomas MüllerSyrine KricheneJulian Martin Eisenschlos
2021-03-22
BERT: A Review of Applications in Natural Language Processing and Understanding
M. V. Koroteev
2021-03-22
Bridging the gap between supervised classification and unsupervised topic modelling for social-media assisted crisis management
Mikael BrunilaRosie ZhaoAndrei MirceaSam LumleyRenee Sieber
2021-03-22
Hybrid Model for Patent Classification using Augmented SBERT and KNN
| Hamid BekamiriDaniel S. HainRoman Jurowetzki
2021-03-22
ROSITA: Refined BERT cOmpreSsion with InTegrAted techniques
| Yuanxin LiuZheng LinFengcheng Yuan
2021-03-21
NameRec*: Highly Accurate and Fine-grained Person Name Recognition
Rui ZhangYimeng DaiShijie Liu
2021-03-21
Play the Shannon Game With Language Models: A Human-Free Approach to Summary Evaluation
Nicholas EganOleg VasilyevJohn Bohannon
2021-03-19
MuRIL: Multilingual Representations for Indian Languages
Simran KhanujaDiksha BansalSarvesh MehtaniSavya KhoslaAtreyee DeyBalaji GopalanDilip Kumar MargamPooja AggarwalRajiv Teja NagipoguShachi DaveShruti GuptaSubhash Chandra Bose GaliVish SubramanianPartha Talukdar
2021-03-19
Cost-effective Deployment of BERT Models in Serverless Environment
Katarína BenešováAndrej ŠvecMarek Šuppa
2021-03-19
Let Your Heart Speak in its Mother Tongue: Multilingual Captioning of Cardiac Signals
| Dani KiyassehTingting ZhuDavid Clifton
2021-03-19
GPT Understands, Too
| Xiao LiuYanan ZhengZhengxiao DuMing DingYujie QianZhilin YangJie Tang
2021-03-18
All NLP Tasks Are Generation Tasks: A General Pretraining Framework
| Zhengxiao DuYujie QianXiao LiuMing DingJiezhong QiuZhilin YangJie Tang
2021-03-18
Contextual Biasing of Language Models for Speech Recognition in Goal-Oriented Conversational Agents
Ashish ShenoySravan BodapatiKatrin Kirchhoff
2021-03-18
Model Extraction and Adversarial Transferability, Your BERT is Vulnerable!
Xuanli HeLingjuan LyuQiongkai XuLichao Sun
2021-03-18
On the Role of Images for Analyzing Claims in Social Media
| Gullal S. CheemaSherzod HakimovEric Müller-BudackRalph Ewerth
2021-03-17
UniParma at SemEval-2021 Task 5: Toxic Spans Detection Using CharacterBERT and Bag-of-Words Model
Akbar KarimiLeonardo RossiAndrea Prati
2021-03-17
Code Word Detection in Fraud Investigations using a Deep-Learning Approach
Youri van der ZeeJan C. ScholtesMarcel WesterhoudJulien Rossi
2021-03-17
KGSynNet: A Novel Entity Synonyms Discovery Framework with Knowledge Graph
Yiying YangXi YinHaiqin YangXingjian FeiHao PengKaijie ZhouKunfeng LaiJianping Shen
2021-03-16
Robustly Optimized and Distilled Training for Natural Language Understanding
Haytham ElFadeelStan Peshterliev
2021-03-16
Text Mining of Stocktwits Data for Predicting Stock Prices
Mukul JaggiPriyanka MandalShreya NarangUsman NaseemMatloob Khushi
2021-03-13
Is BERT a Cross-Disciplinary Knowledge Learner? A Surprising Finding of Pre-trained Models' Transferability
Wei-Tsung KaoHung-Yi Lee
2021-03-12
Explaining and Improving BERT Performance on Lexical Semantic Change Detection
Severin LaicherSinan KurtyigitDominik SchlechtwegJonas KuhnSabine Schulte im Walde
2021-03-12
Comparing the Performance of NLP Toolkits and Evaluation measures in Legal Tech
Muhammad Zohaib Khan
2021-03-12
Evaluation of Morphological Embeddings for the Russian Language
Vitaly RomanovAlbina Khusainova
2021-03-11
Improving Bi-encoder Document Ranking Models with Two Rankers and Multi-teacher Distillation
Jaekeol ChoiEuna JungJangwon SuhWonjong Rhee
2021-03-11
Composite Re-Ranking for Efficient Document Search with BERT
Yingrui YangYifan QiaoJinjin ShaoMayuresh AnandXifeng YanTao Yang
2021-03-11
Towards Multi-Sense Cross-Lingual Alignment of Contextual Embeddings
Linlin LiuThien Hai NguyenShafiq JotyLidong BingLuo Si
2021-03-11
LightMBERT: A Simple Yet Effective Method for Multilingual BERT Distillation
Xiaoqi JiaoYichun YinLifeng ShangXin JiangXiao ChenLinlin LiFang WangQun Liu
2021-03-11
FairFil: Contrastive Neural Debiasing Method for Pretrained Text Encoders
Pengyu ChengWeituo HaoSiyang YuanShijing SiLawrence Carin
2021-03-11
Self-supervised Text-to-SQL Learning with Header Alignment Training
Donggyu KimSeanie Lee
2021-03-11
Majority Voting with Bidirectional Pre-translation For Bitext Retrieval
Alex JonesDerry Tanti Wijaya
2021-03-10
CUAD: An Expert-Annotated NLP Dataset for Legal Contract Review
Dan HendrycksCollin BurnsAnya ChenSpencer Ball
2021-03-10
CEQE: Contextualized Embeddings for Query Expansion
Shahrzad NaseriJeffrey DaltonAndrew YatesJames Allan
2021-03-09
Language Models have a Moral Dimension
Patrick SchramowskiCigdem TuranNico AndersenConstantin RothkopfKristian Kersting
2021-03-08
Syntax-BERT: Improving Pre-trained Transformers with Syntax Trees
Jiangang BaiYujing WangYiren ChenYaming YangJing BaiJing YuYunhai Tong
2021-03-07
Orthogonal Attention: A Cloze-Style Approach to Negation Scope Resolution
Aditya KhandelwalVahida Attar
2021-03-07
MalBERT: Using Transformers for Cybersecurity and Malicious Software Detection
Abir RahaliMoulay A. Akhloufi
2021-03-05
Fine-tuning Pretrained Multilingual BERT Model for Indonesian Aspect-based Sentiment Analysis
Annisa Nurul AzharMasayu Leylia Khodra
2021-03-05
Non-invasive Self-attention for Side Information Fusion in Sequential Recommendation
Chang LiuXiaoguang LiGuohao CaiZhenhua DongHong ZhuLifeng Shang
2021-03-05
Measuring Mathematical Problem Solving With the MATH Dataset
Dan HendrycksCollin BurnsSaurav KadavathAkul AroraSteven BasartEric TangDawn SongJacob Steinhardt
2021-03-05
Hardware Acceleration of Fully Quantized BERT for Efficient Natural Language Processing
Zejian LiuGang LiJian Cheng
2021-03-04
Few-shot Learning for Slot Tagging with Attentive Relational Network
Cennet OguzNgoc Thang Vu
2021-03-03
Hate Towards the Political Opponent: A Twitter Corpus Study of the 2020 US Elections on the Basis of Offensive Speech and Stance Detection
Lara GrimmingerRoman Klinger
2021-03-02
BERT-based knowledge extraction method of unstructured domain text
Wang ZijiaLi YeZhu Zhongkai
2021-03-01
Combat COVID-19 Infodemic Using Explainable Natural Language Processing Models
Jackie AyoubX. Jessie YangFeng Zhou
2021-03-01
Long Document Summarization in a Low Resource Setting using Pretrained Language Models
Ahsaas BajajPavitra DangatiKalpesh KrishnaPradhiksha Ashok KumarRheeya UppaalBradford WindsorEliot BrennerDominic DotterrerRajarshi DasAndrew McCallum
2021-03-01
BERT based patent novelty search by training claims to their own description
Michael FreunekAndré Bodmer
2021-03-01
NLP-CUET@DravidianLangTech-EACL2021: Investigating Visual and Textual Features to Identify Trolls from Multimodal Social Media Memes
Eftekhar HossainOmar SharifMohammed Moshiul Hoque
2021-02-28
NLP-CUET@LT-EDI-EACL2021: Multilingual Code-Mixed Hope Speech Detection using Cross-lingual Representation Learner
Eftekhar HossainOmar SharifMohammed Moshiul Hoque
2021-02-28
NLP-CUET@DravidianLangTech-EACL2021: Offensive Language Detection from Multilingual Code-Mixed Text using Transformers
Omar SharifEftekhar HossainMohammed Moshiul Hoque
2021-02-28
Transformers with Competitive Ensembles of Independent Mechanisms
Alex LambDi HeAnirudh GoyalGuolin KeChien-Feng LiaoMirco RavanelliYoshua Bengio
2021-02-27
COVID-19 Tweets Analysis through Transformer Language Models
Abdul Hameed AzeemiAdeel Waheed
2021-02-27
Multi-task transfer learning for finding actionable information from crisis-related messages on social media
Congcong WangDavid Lillis
2021-02-26
Sentiment Analysis of Persian-English Code-mixed Texts
Nazanin SabriAli EdalatBehnam Bahrak
2021-02-25
Emotion-Aware, Emotion-Agnostic, or Automatic: Corpus Creation Strategies to Obtain Cognitive Event Appraisal Annotations
Jan HofmannEnrica TroianoRoman Klinger
2021-02-25
PharmKE: Knowledge Extraction Platform for Pharmaceutical Texts using Transfer Learning
Nasi JofcheKostadin MishevRiste StojanovMilos JovanovikDimitar Trajanov
2021-02-25
BERT-based Acronym Disambiguation with Multiple Training Strategies
Chunguang PanBingyan SongShengguang WangZhipeng Luo
2021-02-25
Task-Specific Pre-Training and Cross Lingual Transfer for Code-Switched Data
Akshat GuptaSai Krishna RallabandiAlan Black
2021-02-24
LRG at SemEval-2021 Task 4: Improving Reading Comprehension with Abstract Words using Augmentation, Linguistic Features and Voting
Abheesht SharmaHarshit PandeyGunjan ChhablaniYash BhartiaTirtharaj Dash
2021-02-24
NLRG at SemEval-2021 Task 5: Toxic Spans Detection Leveraging BERT-based Token Classification and Span Prediction Techniques
Gunjan ChhablaniYash BhartiaAbheesht SharmaHarshit PandeyShan Suthaharan
2021-02-24
PADA: A Prompt-based Autoregressive Approach for Adaptation to Unseen Domains
Eyal Ben-DavidNadav OvedRoi Reichart
2021-02-24
From Universal Language Model to Downstream Task: Improving RoBERTa-Based Vietnamese Hate Speech Detection
Quang Huu PhamViet Anh NguyenLinh Bao DoanNgoc N. TranTa Minh Thanh
2021-02-24
Hopeful_Men@LT-EDI-EACL2021: Hope Speech Detection Using Indic Transliteration and Transformers
Ishan Sanjeev UpadhyayNikhil EAnshul WadhawanRadhika Mamidi
2021-02-24
Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions
Wenhai WangEnze XieXiang LiDeng-Ping FanKaitao SongDing LiangTong LuPing LuoLing Shao
2021-02-24
Robust and Transferable Anomaly Detection in Log Data using Pre-Trained Language Models
Harold OttJasmin BogatinovskiAlexander AckerSasho NedelkoskiOdej Kao
2021-02-23
Minimally-Supervised Structure-Rich Text Categorization via Learning on Text-Rich Networks
Xinyang ZhangChenwei ZhangLuna Xin DongJingbo ShangJiawei Han
2021-02-23
VisualCheXbert: Addressing the Discrepancy Between Radiology Report Labels and Image Labels
Saahil JainAkshay SmitSteven QH TruongChanh DT NguyenMinh-Thanh HuynhMudit JainVictoria A. YoungAndrew Y. NgMatthew P. LungrenPranav Rajpurkar
2021-02-23
Using Prior Knowledge to Guide BERT's Attention in Semantic Textual Matching Tasks
Tingyu XiaYue WangYuan TianYi Chang
2021-02-22
Evaluating Contextualized Language Models for Hungarian
Judit ÁcsDániel LévaiDávid Márk NemeskeyAndrás Kornai
2021-02-22
Generating Human Readable Transcript for Automatic Speech Recognition with Pre-trained Language Model
Junwei LiaoYu ShiMing GongLinjun ShouSefik EskimezLiyang LuHong QuMichael Zeng
2021-02-22
Few Shot Learning for Information Verification
Usama KhalidMirza Omer Beg
2021-02-22
MixUp Training Leads to Reduced Overfitting and Improved Calibration for the Transformer Architecture
Wancong ZhangIeshan Vaidya
2021-02-22
RUBERT: A Bilingual Roman Urdu BERT Using Cross Lingual Transfer Learning
Usama KhalidMirza Omer BegMuhammad Umair Arshad
2021-02-22
Parallelizing Legendre Memory Unit Training
Narsimha ChilkuriChris Eliasmith
2021-02-22
Pre-Training BERT on Arabic Tweets: Practical Considerations
Ahmed AbdelaliSabit HassanHamdy MubarakKareem DarwishYounes Samih
2021-02-21
Web-based Application for Detecting Indonesian Clickbait Headlines using IndoBERT
Muhammad Noor FakhruzzamanSie Wildan Gunawan
2021-02-21
Learning Dynamic BERT via Trainable Gate Variables and a Bi-modal Regularizer
Seohyeong JeongNojun Kwak
2021-02-19
Towards Emotion Recognition in Hindi-English Code-Mixed Data: A Transformer Based Approach
Anshul WadhawanAkshita Aggarwal
2021-02-19
Using Transformer based Ensemble Learning to classify Scientific Articles
Sohom GhoshAnkush Chopra
2021-02-19
Calibrate Before Use: Improving Few-Shot Performance of Language Models
Tony Z. ZhaoEric WallaceShi FengDan KleinSameer Singh
2021-02-19
Quiz-Style Question Generation for News Stories
Adam D. LelkesVinh Q. TranCong Yu
2021-02-18
UnibucKernel: Geolocating Swiss German Jodels Using Ensemble Learning
Mihaela GamanSebastian CojocariuRadu Tudor Ionescu
2021-02-18
Training Large-Scale News Recommenders with Pretrained Language Models in the Loop
Shitao XiaoZheng LiuYingxia ShaoTao DiXing Xie
2021-02-18
SciDr at SDU-2020: IDEAS -- Identifying and Disambiguating Everyday Acronyms for Scientific Domain
Aadarsh SinghPriyanshu Kumar
2021-02-17
Leveraging Query Resolution and Reading Comprehension for Conversational Passage Retrieval
Svitlana VakulenkoNikos VoskaridesZhucheng TuShayne Longpre
2021-02-17
THEaiTRE 1.0: Interactive generation of theatre play scripts
Rudolf RosaTomáš MusilOndřej DušekDominik JurkoPatrícia SchmidtováDavid MarečekOndřej BojarTom KocmiDaniel HrbekDavid KošťákMartina KinskáMarie NovákováJosef DoležalKlára VoseckáTomáš StudeníkPetr Žabka
2021-02-17
TCN: Table Convolutional Network for Web Table Interpretation
Daheng WangPrashant ShiralkarColin LockardBinxuan HuangXin Luna DongMeng Jiang
2021-02-17
Non-Autoregressive Text Generation with Pre-trained Language Models
Yixuan SuDeng CaiYan WangDavid VandykeSimon BakerPiji LiNigel Collier
2021-02-16
Exploring Transformers in Natural Language Generation: GPT, BERT, and XLNet
M. Onat TopalAnil BasImke van Heerden
2021-02-16
TeraPipe: Token-Level Pipeline Parallelism for Training Large-Scale Language Models
Zhuohan LiSiyuan ZhuangShiyuan GuoDanyang ZhuoHao ZhangDawn SongIon Stoica
2021-02-16
Have Attention Heads in BERT Learned Constituency Grammar?
Ziyang Luo
2021-02-16
The corruptive force of AI-generated advice
Margarita LeibNils C. KöbisRainer Michael RilkeMarloes HagensBernd Irlenbusch
2021-02-15
Prompt Programming for Large Language Models: Beyond the Few-Shot Paradigm
Laria ReynoldsKyle McDonell
2021-02-15
Fast End-to-End Speech Recognition via a Non-Autoregressive Model and Cross-Modal Knowledge Transferring from BERT
Ye BaiJiangyan YiJianHua TaoZhengkun TianZhengqi WenShuai Zhang
2021-02-15
DOBF: A Deobfuscation Pre-Training Objective for Programming Languages
Baptiste RoziereMarie-Anne LachauxMarc SzafraniecGuillaume Lample
2021-02-15
Improved Customer Transaction Classification using Semi-Supervised Knowledge Distillation
Rohan Sukumaran
2021-02-15
Within-Document Event Coreference with BERT-Based Contextualized Representations
Shafiuddin Rehan AhmedJames H. Martin
2021-02-15
indicnlp@kgp at DravidianLangTech-EACL2021: Offensive Language Identification in Dravidian Languages
Kushal KediaAbhilash Nandy
2021-02-14
Multiversal views on language models
Laria ReynoldsKyle McDonell
2021-02-12
Optimizing Inference Performance of Transformers on CPUs
Dave DiceAlex Kogan
2021-02-12
Dancing along Battery: Enabling Transformer with Run-time Reconfigurability on Mobile Devices
Yuhong SongWeiwen JiangBingbing LiPanjie QiQingfeng ZhugeEdwin Hsing-Mean ShaSakyasingha DasguptaYiyu ShiCaiwen Ding
2021-02-12
Dynamic Precision Analog Computing for Neural Networks
Sahaj GargJoe LouAnirudh JainMitchell Nahmias
2021-02-12
Exploring Classic and Neural Lexical Translation Models for Information Retrieval: Interpretability, Effectiveness, and Efficiency Benefits
Leonid BoytsovZico Kolter
2021-02-12
Characterizing English Variation across Social Media Communities with BERT
Li LucyDavid Bamman
2021-02-12
High-Performance Large-Scale Image Recognition Without Normalization
Andrew BrockSoham DeSamuel L. SmithKaren Simonyan
2021-02-11
NewsBERT: Distilling Pre-trained Language Model for Intelligent News Application
Chuhan WuFangzhao WuYang YuTao QiYongfeng HuangQi Liu
2021-02-09
AuGPT: Dialogue with Pre-trained Language Models and Data Augmentation
Jonáš KulhánekVojtěch HudečekTomáš NekvindaOndřej Dušek
2021-02-09
Transfer Learning Approach for Arabic Offensive Language Detection System -- BERT-Based Model
Fatemah HusainOzlem Uzuner
2021-02-09
Generating Fake Cyber Threat Intelligence Using Transformer-Based Models
Priyanka RanadeAritran PiplaiSudip MittalAnupam JoshiTim Finin
2021-02-08
How True is GPT-2? An Empirical Analysis of Intersectional Occupational Biases
Hannah KirkYennie JunHaider IqbalElias BenussiFilippo VolpinFrederic A. DreyerAleksandar ShtedritskiYuki M. Asano
2021-02-08
A Hybrid Task-Oriented Dialog System with Domain and Task Adaptive Pretraining
Boliang ZhangYing LyuNing DingTianhao ShenZhaoyang JiaKun HanKevin Knight
2021-02-08
Spoiler Alert: Using Natural Language Processing to Detect Spoilers in Book Reviews
Allen BaoMarshall HoSaarthak Sangamnerkar
2021-02-07
Neural Data-to-Text Generation with LM-based Text Augmentation
Ernie ChangXiaoyu ShenDawei ZhuVera DembergHui Su
2021-02-06
Jointly Improving Language Understanding and Generation with Quality-Weighted Weak Supervision of Automatic Labeling
Ernie ChangVera DembergAlex Marin
2021-02-06
PipeTransformer: Automated Elastic Pipelining for Distributed Training of Transformers
Chaoyang HeShen LiMahdi SoltanolkotabiSalman Avestimehr
2021-02-05
RpBERT: A Text-image Relation Propagation-based BERT Model for Multimodal NER
Lin SunJiquan WangKai ZhangYindu SuFangsheng Weng
2021-02-05
Understanding Emails and Drafting Responses -- An Approach Using GPT-3
Jonas ThiergartStefan HuberThomas Übellacker
2021-02-05
Understanding the Capabilities, Limitations, and Societal Impact of Large Language Models
Alex TamkinMiles BrundageJack ClarkDeep Ganguli
2021-02-04
1-bit Adam: Communication Efficient Large-Scale Training with Adam's Convergence Speed
Hanlin TangShaoduo GanAmmar Ahmad AwanSamyam RajbhandariConglong LiXiangru LianJi LiuCe ZhangYuxiong He
2021-02-04
Hierarchical Multi-head Attentive Network for Evidence-aware Fake News Detection
Nguyen VoKyumin Lee
2021-02-04
Bootstrapping Multilingual AMR with Contextual Word Alignments
Janaki ShethYoung-suk LeeRamon Fernandez AstudilloTahira NaseemRadu FlorianSalim RoukosTodd Ward
2021-02-03
HeBERT & HebEMO: a Hebrew BERT Model and a Tool for Polarity Analysis and Emotion Recognition
Avihay ChriquiInbal Yahav
2021-02-03
Neural Transfer Learning with Transformers for Social Science Text Analysis
Sandra Wankmüller
2021-02-03
Clickbait Headline Detection in Indonesian News Sites using Multilingual Bidirectional Encoder Representations from Transformers (M-BERT)
Muhammad N. FakhruzzamanSaidah Z. JannahRatih A. NingrumIndah Fahmiyah
2021-02-02
AutoFreeze: Automatically Freezing Model Blocks to Accelerate Fine-tuning
YuHan LiuSaurabh AgarwalShivaram Venkataraman
2021-02-02
Scaling Federated Learning for Fine-tuning of Large Language Models
Agrin HilmkilSebastian CallhMatteo BarbieriLeon René SütfeldEdvin Listo ZecOlof Mogren
2021-02-01
Polyphone Disambiguition in Mandarin Chinese with Semi-Supervised Learning
Yi ShiCongyi WangYu ChenBin Wang
2021-02-01
SJ_AJ@DravidianLangTech-EACL2021: Task-Adaptive Pre-Training of Multilingual BERT models for Offensive Language Identification
Sai Muralidhar JayanthiAkshat Gupta
2021-02-01
Text-to-hashtag Generation using Seq2seq Learning
Augusto CamargoWesley CarvalhoFelipe Peressim
2021-02-01
Improving Distantly-Supervised Relation Extraction through BERT-based Label & Instance Embeddings
Despina ChristouGrigorios Tsoumakas
2021-02-01
"Is depression related to cannabis?": A knowledge-infused model for Entity and Relation Extraction with Limited Supervision
Kaushik RoyUsha LokalaVedant KhandelwalAmit Sheth
2021-02-01
EmpathBERT: A BERT-based Framework for Demographic-aware Empathy Prediction
Bhanu Prakash Reddy GudaAparna GarimellaNiyati Chhaya
2021-01-30
ShufText: A Simple Black Box Approach to Evaluate the Fragility of Text Classification Models
Rutuja TawareShraddha VaratGaurav SalunkeChaitanya GawandeGeetanjali KaleRahul KhengareRaviraj Joshi
2021-01-30
Learning From How Human Correct
Tong Guo
2021-01-30
Speech Recognition by Simply Fine-tuning BERT
Wen-Chin HuangChia-Hua WuShang-Bao LuoKuan-Yu ChenHsin-Min WangTomoki Toda
2021-01-30
Adversarially learning disentangled speech representations for robust multi-factor voice conversion
Jie WangJingbei LiXintao ZhaoZhiyong WuHelen Meng
2021-01-30
Synthesizing Monolingual Data for Neural Machine Translation
Benjamin MarieAtsushi Fujita
2021-01-29
Fine-tuning BERT-based models for Plant Health Bulletin Classification
Shufan JiangRafael AngaritaStephane CormierFrancis Rousseaux
2021-01-29
BERTaú: Itaú BERT for digital customer service
Paulo FinardiJosé Dié ViegasGustavo T. FerreiraAlex F. MansanoVinicius F. Caridá
2021-01-28
A Graph-based Relevance Matching Model for Ad-hoc Retrieval
Yufeng ZhangJinghao ZhangZeyu CuiShu WuLiang Wang
2021-01-28
Tokens-to-Token ViT: Training Vision Transformers from Scratch on ImageNet
Li YuanYunpeng ChenTao WangWeihao YuYujun ShiZihang JiangFrancis EH TayJiashi FengShuicheng Yan
2021-01-28
Consequence of Enterprise Resource Planning in the Environs of Pedagogical Organization.
Dr. MALLIKARJUNA B E; Mr. VASUDENDRA H K; Mr. VINEETH R
2021-01-28
On the Evolution of Syntactic Information Encoded by BERT's Contextualized Representations
Laura Pérez-MayosRoberto CarliniMiguel BallesterosLeo Wanner
2021-01-27
KoreALBERT: Pretraining a Lite BERT Model for Korean Language Understanding
Hyunjae LeeJaewoong YoonBonggyu HwangSeongho JoeSeungjai MinYoungjune Gwon
2021-01-27
Attention Can Reflect Syntactic Structure (If You Let It)
Vinit RavishankarArtur KulmizevMostafa AbdouAnders SøgaardJoakim Nivre
2021-01-26
Regulatory Compliance through Doc2Doc Information Retrieval: A case study in EU/UK legislation where text similarity has limitations
Ilias ChalkidisManos FergadiotisNikolaos ManginasEva KatakalouProdromos Malakasiotis
2021-01-26
Evaluation of BERT and ALBERT Sentence Embedding Performance on Downstream NLP Tasks
Hyunjin ChoiJudong KimSeongho JoeYoungjune Gwon
2021-01-26
Analyzing Zero-shot Cross-lingual Transfer in Supervised NLP Tasks
Hyunjin ChoiJudong KimSeongho JoeSeungjai MinYoungjune Gwon
2021-01-26
CLiMP: A Benchmark for Chinese Language Model Evaluation
Beilei XiangChangbing YangYu LiAlex WarstadtKatharina Kann
2021-01-26
Named Entity Recognition in the Style of Object Detection
Bing Li
2021-01-26
First Align, then Predict: Understanding the Cross-Lingual Ability of Multilingual BERT
Benjamin MullerYanai ElazarBenoît SagotDjamé Seddah
2021-01-26
Deep Subjecthood: Higher-Order Grammatical Features in Multilingual BERT
Isabel PapadimitriouEthan A. ChiRichard FutrellKyle Mahowald
2021-01-26
A Hybrid Approach to Measure Semantic Relatedness in Biomedical Concepts
Katikapalli Subramanyam KalyanSivanesan Sangeetha
2021-01-25
Stereotype and Skew: Quantifying Gender Bias in Pre-trained and Fine-tuned Language Models
Daniel de Vassimon ManelaDavid ErringtonThomas FisherBoris van BreugelPasquale Minervini
2021-01-24
Does Dialog Length matter for Next Response Selection task? An Empirical Study
Jatin GanhotraSachindra Joshi
2021-01-24
RomeBERT: Robust Training of Multi-Exit BERT
Shijie GengPeng GaoZuohui FuYongfeng Zhang
2021-01-24
Training Multilingual Pre-trained Language Model with Byte-level Subwords
Junqiu WeiQun LiuYinpeng GuoXin Jiang
2021-01-23
Extracting Lifestyle Factors for Alzheimer's Disease from Clinical Notes Using Deep Learning with Weak Supervision
Zitao ShenYoonkwon YiAnusha BompelliFang YuYanshan WangRui Zhang
2021-01-22
HASOCOne@FIRE-HASOC2020: Using BERT and Multilingual BERT models for Hate Speech Detection
Suman DowlagarRadhika Mamidi
2021-01-22
Multilingual Pre-Trained Transformers and Convolutional NN Classification Models for Technical Domain Identification
Suman DowlagarRadhika Mamidi
2021-01-22
A multi-perspective combined recall and rank framework for Chinese procedure terminology normalization
Ming LiangKui XueTong Ruan
2021-01-22
The heads hypothesis: A unifying statistical approach towards understanding multi-headed attention in BERT
Madhura PandeAakriti BudhrajaPreksha NemaPratyush KumarMitesh M. Khapra
2021-01-22
Drug and Disease Interpretation Learning with Biomedical Entity Representation Transformer
Zulfat MiftahutdinovArtur KadurinRoman KudrinElena Tutubalina
2021-01-22
BERT Transformer model for Detecting Arabic GPT2 Auto-Generated Tweets
Fouzi HarragMaria DebbahKareem DarwishAhmed Abdelali
2021-01-22
Evaluating Multilingual Text Encoders for Unsupervised Cross-Lingual Retrieval
Robert LitschkoIvan VulićSimone Paolo PonzettoGoran Glavaš
2021-01-21
Classifying Scientific Publications with BERT -- Is Self-Attention a Feature Selection Method?
Andres Garcia-SilvaJose Manuel Gomez-Perez
2021-01-20
Divide and Conquer: An Ensemble Approach for Hostile Post Detection in Hindi
Varad BhatnagarPrince KumarSairam MoghiliPushpak Bhattacharyya
2021-01-20
Learning to Augment for Data-Scarce Domain BERT Knowledge Distillation
Lingyun FengMinghui QiuYaliang LiHai-Tao ZhengYing Shen
2021-01-20
Situation and Behavior Understanding by Trope Detection on Films
Chen-Hsi ChangHung-Ting SuJui-heng HsuYu-Siang WangYu-Cheng ChangZhe Yu LiuYa-Liang ChangWen-Feng ChengKe-Jyun WangWinston H. Hsu
2021-01-19
Towards Facilitating Empathic Conversations in Online Mental Health Support: A Reinforcement Learning Approach
ASHISH SHARMAInna W. LinAdam S. MinerDavid C. AtkinsTim Althoff
2021-01-19
Can a Fruit Fly Learn Word Embeddings?
Yuchen LiangChaitanya K. RyaliBenjamin HooverLeopold GrinbergSaket NavlakhaMohammed J. ZakiDmitry Krotov
2021-01-18
Inference for BART with Multinomial Outcomes
Yizhen XuJoseph W. HoganMichael J. DanielsRami KantorAnn Mwangi
2021-01-18
Automatic punctuation restoration with BERT models
Attila NagyBence BialJudit Ács
2021-01-18
Transformer-Based Models for Question Answering on COVID19
Hillary NgaiYoona ParkJohn ChenMahboobeh Parsapoor
2021-01-16
Hostility Detection and Covid-19 Fake News Detection in Social Media
Ayush GuptaRohan SukumaranKevin JohnSundeep Teki
2021-01-15
KDLSQ-BERT: A Quantized Bert Combining Knowledge Distillation with Learned Step Size Quantization
Jing JinCai LiangTiancheng WuLiqin ZouZhiliang Gan
2021-01-15
Grid Search Hyperparameter Benchmarking of BERT, ALBERT, and LongFormer on DuoRC
Alex John QuijanoSam NguyenJuanita Ordonez
2021-01-15
Persistent Anti-Muslim Bias in Large Language Models
Abubakar AbidMaheen FarooqiJames Zou
2021-01-14
WER-BERT: Automatic WER Estimation with BERT in a Balanced Ordinal Classification Paradigm
Akshay Krishna SheshadriAnvesh Rao VijjiniSukhdeep Kharbanda
2021-01-14
ECOL: Early Detection of COVID Lies Using Content, Prior Knowledge and Source Information
Ipek BarisZeyd Boukhers
2021-01-14
Transformer-based Language Model Fine-tuning Methods for COVID-19 Fake News Detection
Ben ChenBin ChenDehong GaoQijin ChenChengfu HuoXiaonan MengWeijun RenYang Zhou
2021-01-14
Heterogeneous Network Embedding for Deep Semantic Relevance Match in E-commerce Search
Ziyang LiuZhaomeng ChengYunjiang JiangYue ShangWei XiongSulong XuBo LongDi Jin
2021-01-13
Experimental Evaluation of Deep Learning models for Marathi Text Classification
Atharva KulkarniMeet MandhaneManali LikhitkarGayatri KshirsagarJayashree JagdaleRaviraj Joshi
2021-01-13
LaDiff ULMFiT: A Layer Differentiated training approach for ULMFiT
Mohammed AzhanMohammad Ahmad
2021-01-13
Of Non-Linearity and Commutativity in BERT
Sumu ZhaoDamian PascualGino BrunnerRoger Wattenhofer
2021-01-12
Neural Contract Element Extraction Revisited: Letters from Sesame Street
Ilias ChalkidisManos FergadiotisProdromos MalakasiotisIon Androutsopoulos
2021-01-12
Fake News Detection System using XLNet model with Topic Distributions: CONSTRAINT@AAAI2021 Shared Task
Akansha GautamVenktesh VSarah Masud
2021-01-12
AT-BERT: Adversarial Training BERT for Acronym Identification Winning Solution for SDU@AAAI-21
Danqing ZhuWangli LinYang ZhangQiwei ZhongGuanxiong ZengWeilin WuJiayu Tang
2021-01-11
Evaluation of Deep Learning Models for Hostility Detection in Hindi Text
Ramchandra JoshiRushabh KarnavatKaustubh JirapureRaviraj Joshi
2021-01-11
A More Efficient Chinese Named Entity Recognition base on BERT and Syntactic Analysis
Xiao FuGuijun Zhang
2021-01-11
BERT & Family Eat Word Salad: Experiments with Text Understanding
Ashim GuptaGiorgi KvernadzeVivek Srikumar
2021-01-10
Cisco at AAAI-CAD21 shared task: Predicting Emphasis in Presentation Slides using Contextualized Embeddings
Sreyan GhoshSonal KumarHarsh JalanHemant YadavRajiv Ratn Shah
2021-01-10
Learning Better Sentence Representation with Syntax Information
Chen Yang
2021-01-09
Misspelling Correction with Pre-trained Contextual Language Model
Yifei HuXiaonan JingYoulim KoJulia Taylor Rayz
2021-01-08
Contextual Non-Local Alignment over Full-Scale Representation for Text-Based Person Search
Chenyang GaoGuanyu CaiXinyang JiangFeng ZhengJun ZhangYifei GongPai PengXiaowei GuoXing Sun
2021-01-08
Exploring Text-transformers in AAAI 2021 Shared Task: COVID-19 Fake News Detection in English
Xiangyang LiYu XiaXiang LongZheng LiSujian Li
2021-01-07
Applying Transfer Learning for Improving Domain-Specific Search Experience Using Query to Question Similarity
Ankush ChopraShruti AgrawalSohom Ghosh
2021-01-07
Homonym Identification using BERT -- Using a Clustering Approach
Rohan Saha
2021-01-07
Transformer-based approach towards music emotion recognition from lyrics
Yudhik AgrawalRamaguru Guru Ravi ShankerVinoo Alluri
2021-01-06
I-BERT: Integer-only BERT Quantization
Sehoon KimAmir GholamiZhewei YaoMichael W. MahoneyKurt Keutzer
2021-01-05
COVID-19: Comparative Analysis of Methods for Identifying Articles Related to Therapeutics and Vaccines without Using Labeled Data
Mihir ParmarAshwin Karthik AmbalavananHong GuanRishab BanerjeeJitesh PablaMurthy Devarakonda
2021-01-05
Improving reference mining in patents with BERT
Ken VoskuilSuzan Verberne
2021-01-04
KM-BART: Knowledge Enhanced Multimodal BART for Visual Commonsense Generation
Yiran XingZai ShiZhao MengYunpu MaRoger Wattenhofer
2021-01-02
Improving Sequence-to-Sequence Pre-training via Sequence Span Rewriting
Wangchunshu ZhouTao GeKe XuFuru Wei
2021-01-02
Cross-Document Language Modeling
Avi CaciularuArman CohanIz BeltagyMatthew E. PetersArie CattanIdo Dagan
2021-01-02
Superbizarre Is Not Superb: Derivational Morphology Improves BERT's Interpretation of Complex Words
Valentin HofmannJanet B. PierrehumbertHinrich Schütze
2021-01-02
Lex-BERT: Enhancing BERT based NER with lexicons
Wei ZhuDaniel Cheung
2021-01-02
What all do audio transformer models hear? Probing Acoustic Representations for Language Delivery and its Structure
Jui ShahYaman Kumar SinglaChangyou ChenRajiv Ratn Shah
2021-01-02
A Robust and Domain-Adaptive Approach for Low-Resource Named Entity Recognition
Houjin YuXian-Ling MaoZewen ChiWei WeiHeyan Huang
2021-01-02
Polyjuice: Automated, General-purpose Counterfactual Generation
Tongshuang WuMarco Tulio RibeiroJeffrey HeerDaniel S. Weld
2021-01-01
On Explaining Your Explanations of BERT: An Empirical Study with Sequence Classification
Zhengxuan WuDesmond C. Ong
2021-01-01
Prefix-Tuning: Optimizing Continuous Prompts for Generation
Xiang Lisa LiPercy Liang
2021-01-01
Transformer based Automatic COVID-19 Fake News Detection System
Sunil GundapuRadhika Mamidi
2021-01-01
Syntactic Relevance XLNet Word Embedding Generation in Low-Resource Machine Translation
Anonymous
2021-01-01
Cluster-Former: Clustering-based Sparse Transformer for Question Answering
Anonymous
2021-01-01
DACT-BERT: Increasing the efficiency and interpretability of BERT by using adaptive computation time.
Anonymous
2021-01-01
Evaluating Gender Bias in Natural Language Inference
Anonymous
2021-01-01
Deep Learning Proteins using a Triplet-BERT network
Anonymous
2021-01-01
Modelling Drug-Target Binding Affinity using a BERT based Graph Neural network
Anonymous
2021-01-01
SkillBERT: “Skilling” the BERT to classify skills!
Anonymous
2021-01-01
UserBERT: Self-supervised User Representation Learning
Anonymous
2021-01-01
Speeding up Deep Learning Training by Sharing Weights and Then Unsharing
Anonymous
2021-01-01
Erasure for Advancing: Dynamic Self-Supervised Learning for Commonsense Reasoning
Anonymous
2021-01-01
Pre-training Text-to-Text Transformers to Write and Reason with Concepts
Anonymous
2021-01-01
Pretrain Knowledge-Aware Language Models
Anonymous
2021-01-01
Adding Recurrence to Pretrained Transformers
Anonymous
2021-01-01
How Multipurpose Are Language Models?
Anonymous
2021-01-01
AutoLRS: Automatic Learning-Rate Schedule by Bayesian Optimization on the Fly
Anonymous
2021-01-01
Task-Agnostic and Adaptive-Size BERT Compression
Anonymous
2021-01-01
Data-aware Low-Rank Compression for Large NLP Models
Anonymous
2021-01-01
Cluster & Tune: Enhance BERT Performance in Low Resource Text Classification
Anonymous
2021-01-01
Domain-slot Relationship Modeling using a Pre-trained Language Encoder for Multi-Domain Dialogue State Tracking
Anonymous
2021-01-01
EXPLORING VULNERABILITIES OF BERT-BASED APIS
Anonymous
2021-01-01
KETG: A Knowledge Enhanced Text Generation Framework
Anonymous
2021-01-01
MULTI-SPAN QUESTION ANSWERING USING SPAN-IMAGE NETWORK
Anonymous
2021-01-01
Towards Practical Second Order Optimization for Deep Learning
Anonymous
2021-01-01
Cross-Probe BERT for Efficient and Effective Cross-Modal Search
Anonymous
2021-01-01
BROS: A Pre-trained Language Model for Understanding Texts in Document
Anonymous
2021-01-01
Post-Training Weighted Quantization of Neural Networks for Language Models
Anonymous
2021-01-01
Taking Notes on the Fly Helps Language Pre-Training
Anonymous
2021-01-01
Isotropy in the Contextual Embedding Space: Clusters and Manifolds
Anonymous
2021-01-01
The Pile: An 800GB Dataset of Diverse Text for Language Modeling
Leo GaoStella BidermanSid BlackLaurence GoldingTravis HoppeCharles FosterJason PhangHorace HeAnish ThiteNoa NabeshimaShawn PresserConnor Leahy
2020-12-31
A Multi-modal Deep Learning Model for Video Thumbnail Selection
Zhifeng YuNanchun Shi
2020-12-31
EarlyBERT: Efficient BERT Training via Early-bird Lottery Tickets
Xiaohan ChenYu ChengShuohang WangZhe GanZhangyang WangJingjing Liu
2020-12-31
KART: Privacy Leakage Framework of Language Models Pre-trained with Clinical Records
Yuta NakamuraShouhei HanaokaYukihiro NomuraNaoto HayashiOsamu AbeShuntaro YadaShoko WakamiyaEiji Aramaki
2020-12-31
Studying Strategically: Learning to Mask for Closed-book QA
Qinyuan YeBelinda Z. LiSinong WangBenjamin BolteHao MaWen-tau YihXiang RenMadian Khabsa
2020-12-31
Making Pre-trained Language Models Better Few-shot Learners
Tianyu GaoAdam FischDanqi Chen
2020-12-31
Shortformer: Better Language Modeling using Shorter Inputs
Ofir PressNoah A. SmithMike Lewis
2020-12-31
Conditional Generation of Temporally-ordered Event Sequences
Shih-ting LinNathanael ChambersGreg Durrett
2020-12-31
Directed Beam Search: Plug-and-Play Lexically Constrained Language Generation
Damian PascualBeni EgressyFlorian BolliRoger Wattenhofer
2020-12-31
UNKs Everywhere: Adapting Multilingual Language Models to New Scripts
Jonas PfeifferIvan VulićIryna GurevychSebastian Ruder
2020-12-31
CoCoLM: COmplex COmmonsense Enhanced Language Model
Changlong YuHongming ZhangYangqiu SongWilfred Ng
2020-12-31
Better Robustness by More Coverage: Adversarial Training with Mixup Augmentation for Robust Fine-tuning
Chenglei SiZhengyan ZhangFanchao QiZhiyuan LiuYasheng WangQun LiuMaosong Sun
2020-12-31
BinaryBERT: Pushing the Limit of BERT Quantization
Haoli Bai, Wei Zhang, Lu Hou, Lifeng Shang, Jing Jin, Xin Jiang, Qun Liu, Michael Lyu, Irwin King
2020-12-31
Unified Mandarin TTS Front-end Based on Distilled BERT Model
Yang Zhang, Liqun Deng, Yasheng Wang
2020-12-31
Unnatural Language Inference
Koustuv Sinha, Prasanna Parthasarathi, Joelle Pineau, Adina Williams
2020-12-30
DEER: A Data Efficient Language Model for Event Temporal Reasoning
Rujun Han, Xiang Ren, Nanyun Peng
2020-12-30
Out of Order: How important is the sequential order of words in a sentence in Natural Language Understanding tasks?
Thang M. Pham, Trung Bui, Long Mai, Anh Nguyen
2020-12-30
SemGloVe: Semantic Co-occurrences for GloVe from BERT
Leilei Gan, Zhiyang Teng, Yue Zhang, Linchao Zhu, Fei Wu, Yi Yang
2020-12-30
Improving BERT with Syntax-aware Local Attention
Zhongli Li, Qingyu Zhou, Chao Li, Ke Xu, Yunbo Cao
2020-12-30
Deriving Contextualised Semantic Features from BERT (and Other Transformer Model) Embeddings
Jacob Turton, David Vinson, Robert Elliott Smith
2020-12-30
Optimizing Deeper Transformers on Small Datasets: An Application on Text-to-SQL Semantic Parsing
Peng Xu, Wei Yang, Wenjie Zi, Keyi Tang, Chengyang Huang, Jackie Chi Kit Cheung, Yanshuai Cao
2020-12-30
Robust Dialogue Utterance Rewriting as Sequence Tagging
Jie Hao, Linfeng Song, LiWei Wang, Kun Xu, Zhaopeng Tu, Dong Yu
2020-12-29
BURT: BERT-inspired Universal Representation from Learning Meaningful Segment
Yian Li, Hai Zhao
2020-12-28
Syntax-Enhanced Pre-trained Model
Zenan Xu, Daya Guo, Duyu Tang, Qinliang Su, Linjun Shou, Ming Gong, Wanjun Zhong, Xiaojun Quan, Nan Duan, Daxin Jiang
2020-12-28
A Paragraph-level Multi-task Learning Model for Scientific Fact-Verification
Xiangci Li, Gully Burns, Nanyun Peng
2020-12-28
Inserting Information Bottlenecks for Attribution in Transformers
Zhiying Jiang, Raphael Tang, Ji Xin, Jimmy Lin
2020-12-27
ALP-KD: Attention-Based Layer Projection for Knowledge Distillation
Peyman Passban, Yimeng Wu, Mehdi Rezagholizadeh, Qun Liu
2020-12-27
An Embarrassingly Simple Model for Dialogue Relation Extraction
Fuzhao Xue, Aixin Sun, Hao Zhang, Eng Siong Chng
2020-12-27
To what extent do human explanations of model behavior align with actual model behavior?
Grusha Prasad, Yixin Nie, Mohit Bansal, Robin Jia, Douwe Kiela, Adina Williams
2020-12-24
Sentence-Based Model Agnostic NLP Interpretability
Yves Rychener, Xavier Renard, Djamé Seddah, Pascal Frossard, Marcin Detyniecki
2020-12-24
Bridging Textual and Tabular Data for Cross-Domain Text-to-SQL Semantic Parsing
Xi Victoria Lin, Richard Socher, Caiming Xiong
2020-12-23
Detecting Hate Speech in Memes Using Multimodal Deep Learning Approaches: Prize-winning solution to Hateful Memes Challenge
Riza Velioglu, Jewgeni Rose
2020-12-23
Applying Wav2vec2.0 to Speech Recognition in Various Low-resource Languages
Cheng Yi, Jianzhong Wang, Ning Cheng, Shiyu Zhou, Bo Xu
2020-12-22
Uncertainty and Surprisal Jointly Deliver the Punchline: Exploiting Incongruity-Based Features for Humor Recognition
Yubo Xie, Junze Li, Pearl Pu
2020-12-22
Learning to Retrieve Entity-Aware Knowledge and Generate Responses with Copy Mechanism for Task-Oriented Dialogue Systems
Chao-Hong Tan, Xiaoyu Yang, Zi'ou Zheng, Tianda Li, Yufei Feng, Jia-Chen Gu, Quan Liu, Dan Liu, Zhen-Hua Ling, Xiaodan Zhu
2020-12-22
Improved Biomedical Word Embeddings in the Transformer Era
Jiho Noh, Ramakanth Kavuluru
2020-12-22
Recognizing Emotion Cause in Conversations
Soujanya Poria, Navonil Majumder, Devamanyu Hazarika, Deepanway Ghosal, Rishabh Bhardwaj, Samson Yu Bai Jian, Romila Ghosh, Niyati Chhaya, Alexander Gelbukh, Rada Mihalcea
2020-12-22
Intrinsic Dimensionality Explains the Effectiveness of Language Model Fine-Tuning
Armen Aghajanyan, Luke Zettlemoyer, Sonal Gupta
2020-12-22
Domain specific BERT representation for Named Entity Recognition of lab protocol
Tejas Vaidhya, Ayush Kaushal
2020-12-21
A Graph Reasoning Network for Multi-turn Response Selection via Customized Pre-training
Yongkang Liu, Shi Feng, Daling Wang, Kaisong Song, Feiliang Ren, Yifei Zhang
2020-12-21
Towards Incorporating Entity-specific Knowledge Graph Information in Predicting Drug-Drug Interactions
Ishani Mondal
2020-12-21
Cross-domain Retrieval in the Legal and Patent Domains: a Reproducibility Study
Sophia Althammer, Sebastian Hofstätter, Allan Hanbury
2020-12-21
Leveraging ParsBERT and Pretrained mT5 for Persian Abstractive Text Summarization
Mehrdad Farahani, Mohammad Gharachorloo, Mohammad Manthouri
2020-12-21
Breaking Writer's Block: Low-cost Fine-tuning of Natural Language Generation Models
Alexandre Duval, Thomas Lamson, Gael de Leseleuc de Kerouara, Matthias Gallé
2020-12-19
An Empirical Study of Using Pre-trained BERT Models for Vietnamese Relation Extraction Task at VLSP 2020
Pham Quang Nhat Minh
2020-12-18
On Modality Bias in the TVQA Dataset
Thomas Winterbottom, Sarah Xiao, Alistair McLean, Noura Al Moubayed
2020-12-18
HateXplain: A Benchmark Dataset for Explainable Hate Speech Detection
Binny Mathew, Punyajoy Saha, Seid Muhie Yimam, Chris Biemann, Pawan Goyal, Animesh Mukherjee
2020-12-18
ShineOn: Illuminating Design Choices for Practical Video-based Virtual Clothing Try-on
Gaurav Kuppa, Andrew Jong, Vera Liu, Ziwei Liu, Teng-Sheng Moh
2020-12-18
MASKER: Masked Keyword Regularization for Reliable Text Classification
Seung Jun Moon, Sangwoo Mo, Kimin Lee, Jaeho Lee, Jinwoo Shin
2020-12-17
MIX : a Multi-task Learning Approach to Solve Open-Domain Question Answering
Sofian Chaybouti, Achraf Saghe, Aymen Shabou
2020-12-17
A White Box Analysis of ColBERT
Thibault Formal, Benjamin Piwowarski, Stéphane Clinchant
2020-12-17
Literature Retrieval for Precision Medicine with Neural Matching and Faceted Summarization
Jiho Noh, Ramakanth Kavuluru
2020-12-17
SceneFormer: Indoor Scene Generation with Transformers
Xinpeng Wang, Chandan Yeshwanth, Matthias Nießner
2020-12-17
DialogXL: All-in-One XLNet for Multi-Party Conversation Emotion Recognition
Weizhou Shen, Junqing Chen, Xiaojun Quan, Zhixian Xie
2020-12-16
Focusing More on Conflicts with Mis-Predictions Helps Language Pre-Training
Chen Xing, Wencong Xiao, Yong Li, Wei Lin
2020-12-16
A Lightweight Neural Model for Biomedical Entity Linking
Lihu Chen, Gaël Varoquaux, Fabian M. Suchanek
2020-12-16
R$^2$-Net: Relation of Relation Learning Network for Sentence Semantic Matching
Kun Zhang, Le Wu, Guangyi Lv, Meng Wang, Enhong Chen, Shulan Ruan
2020-12-16
Query expansion with artificially generated texts
Vincent Claveau
2020-12-16
Revisiting Linformer with a modified self-attention with linear complexity
Madhusudan Verma
2020-12-16
Pre-Training Transformers as Energy-Based Cloze Models
Kevin Clark, Minh-Thang Luong, Quoc V. Le, Christopher D. Manning
2020-12-15
RecipeNLG: A Cooking Recipes Dataset for Semi-Structured Text Generation
Michał Bień, Michał Gilski, Martyna Maciejewska, Wojciech Taisner, Dawid Wiśniewski, Agnieszka Ławrynowicz
2020-12-15
LRC-BERT: Latent-representation Contrastive Knowledge Distillation for Natural Language Understanding
Hao Fu, Shaojun Zhou, Qihong Yang, Junjie Tang, Guiquan Liu, Kaikui Liu, Xiaolong Li
2020-12-14
Vartani Spellcheck -- Automatic Context-Sensitive Spelling Correction of OCR-generated Hindi Text Using BERT and Levenshtein Distance
Aditya Pal, Abhijit Mustafi
2020-12-14
Extracting Training Data from Large Language Models
Nicholas Carlini, Florian Tramer, Eric Wallace, Matthew Jagielski, Ariel Herbert-Voss, Katherine Lee, Adam Roberts, Tom Brown, Dawn Song, Ulfar Erlingsson, Alina Oprea, Colin Raffel
2020-12-14
KVL-BERT: Knowledge Enhanced Visual-and-Linguistic BERT for Visual Commonsense Reasoning
Dandan Song, Siyi Ma, Zhanchen Sun, Sicheng Yang, Lejian Liao
2020-12-13

Components

COMPONENT TYPE