Byte Pair Encoding

Introduced by Sennrich et al. in Neural Machine Translation of Rare Words with Subword Units

Byte Pair Encoding, or BPE, is a subword segmentation algorithm that encodes rare and unknown words as sequences of subword units. Starting from individual characters, it repeatedly merges the most frequent pair of adjacent symbols into a new symbol, so frequent words end up as single units while rare words are split into smaller pieces. The intuition is that various word classes can be translated via units smaller than words: names (via character copying or transliteration), compounds (via compositional translation), and cognates and loanwords (via phonological and morphological transformations).

Lei Mao has a detailed blog post that explains how this works.
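
As a rough illustration of how BPE learns its subword units, the sketch below starts from characters and repeatedly merges the most frequent adjacent symbol pair. It is a minimal sketch modeled loosely on the procedure described by Sennrich et al., not their released implementation. The function names (`get_pair_counts`, `merge_pair`, `learn_bpe`), the `</w>` end-of-word marker, and the toy word frequencies are assumptions chosen for illustration.

```python
import collections
import re


def get_pair_counts(vocab):
    """Count adjacent symbol pairs, weighted by word frequency."""
    pairs = collections.Counter()
    for word, freq in vocab.items():
        symbols = word.split()
        for left, right in zip(symbols, symbols[1:]):
            pairs[(left, right)] += freq
    return pairs


def merge_pair(pair, vocab):
    """Replace every occurrence of the symbol pair with its concatenation."""
    # Match the pair only at whitespace-delimited symbol boundaries.
    pattern = re.compile(r"(?<!\S)" + re.escape(" ".join(pair)) + r"(?!\S)")
    merged = "".join(pair)
    # A function replacement avoids re interpreting backslashes in `merged`.
    return {pattern.sub(lambda m: merged, word): freq for word, freq in vocab.items()}


def learn_bpe(word_freqs, num_merges):
    """Learn up to `num_merges` merge operations from a word-frequency dict."""
    # Each word becomes space-separated characters plus an end-of-word marker
    # (the "</w>" marker is an assumed convention for this sketch).
    vocab = {" ".join(word) + " </w>": freq for word, freq in word_freqs.items()}
    merges = []
    for _ in range(num_merges):
        pairs = get_pair_counts(vocab)
        if not pairs:
            break  # every word is already a single symbol
        best = max(pairs, key=pairs.get)  # ties go to the first pair seen
        vocab = merge_pair(best, vocab)
        merges.append(best)
    return merges, vocab


if __name__ == "__main__":
    # Hypothetical toy corpus: word -> frequency.
    toy_corpus = {"low": 5, "lower": 2, "newest": 6, "widest": 3}
    merges, vocab = learn_bpe(toy_corpus, num_merges=5)
    print(merges)
    print(vocab)
```

The number of merges is the main hyperparameter: the final symbol vocabulary is roughly the initial character set plus one new symbol per merge. To segment a new word, the learned merges would be replayed in the same order on its character sequence.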

Source: Neural Machine Translation of Rare Words with Subword Units

Latest Papers

PAPER DATE
Aligning Subtitles in Sign Language Videos
Hannah BullTriantafyllos AfourasGül VarolSamuel AlbanieLiliane MomeniAndrew Zisserman
2021-05-06
Adapting Monolingual Models: Data can be Scarce when Language Similarity is High
Wietse de VriesMartijn BarteldsMalvina NissimMartijn Wieling
2021-05-06
Sequential Encryption of Sparse Neural Networks Toward Optimum Representation of Irregular Sparsity
Baeseong ParkSe Jung KwonDongsoo LeeDaehwan OhByeongwook KimYongkweon JeonYeonju Ro
2021-05-05
Visual Composite Set Detection Using Part-and-Sum Transformers
Qi DongZhuowen TuHaofu LiaoYuting ZhangVijay MahadevanStefano Soatto
2021-05-05
Attention for Image Registration (AiR): an unsupervised Transformer approach
ZiHao WangHervé Delingette
2021-05-05
MLP-Mixer: An all-MLP Architecture for Vision
Ilya TolstikhinNeil HoulsbyAlexander KolesnikovLucas BeyerXiaohua ZhaiThomas UnterthinerJessica YungDaniel KeysersJakob UszkoreitMario LucicAlexey Dosovitskiy
2021-05-04
Moving Towards Centers: Re-ranking with Attention and Memory for Re-identification
Yunhao ZhouYi WangLap-Pui Chau
2021-05-04
Retrieving Complex Tables with Multi-Granular Graph Representation Learning
Fei WangKexuan SunMuhao ChenJay PujaraPedro Szekely
2021-05-04
ISTR: End-to-End Instance Segmentation with Transformers
Jie HuLiujuan CaoYao LuShengchuan ZhangYan WangKe LiFeiyue HuangLing ShaoRongrong Ji
2021-05-03
One Model to Rule them All: Towards Zero-Shot Learning for Databases
Benjamin HilprechtCarsten Binnig
2021-05-03
Unreasonable Effectiveness of Rule-Based Heuristics in Solving Russian SuperGLUE Tasks
Tatyana IazykovaDenis KapelyushnikOlga BystrovaAndrey Kutuzov
2021-05-03
Anatomy-Guided Parallel Bottleneck Transformer Network for Automated Evaluation of Root Canal Therapy
Yunxiang LiGuodong ZengYifan ZhangJun WangQianni ZhangQun JinLingling SunQisi LianNeng XiaRuizi PengKai TangYaqi WangShuai Wang
2021-05-02
Incorporating Transformer and LSTM to Kalman Filter with EM algorithm for state estimation
Zhuangwei Shi
2021-05-01
Audio Transformers: Transformer Architectures For Large Scale Audio Understanding. Adieu Convolutions
Prateek VermaJonathan Berger
2021-05-01
SVT-Net: A Super Light-Weight Network for Large Scale Place Recognition using Sparse Voxel Transformers
Zhaoxin FanZhenbo SongHongyan LiuJun HeXiaoyong Du
2021-05-01
Mitigating Political Bias in Language Models Through Reinforced Calibration
Ruibo LiuChenyan JiaJason WeiGuangxuan XuLili WangSoroush Vosoughi
2021-04-30
CAT: Cross-Attention Transformer for One-Shot Object Detection
Weidong LinYuyan DengYang GaoNing WangJinghao ZhouLingqiao LiuLei ZhangPeng Wang
2021-04-30
GTN-ED: Event Detection Using Graph Transformer Networks
Sanghamitra DuttaLiang MaTanay Kumar SahaDi LuJoel TetreaultAlejandro Jaimes
2021-04-30
Chop Chop BERT: Visual Question Answering by Chopping VisualBERT's Heads
Chenyu GaoQi ZhuPeng WangQi Wu
2021-04-30
CoSformer: Detecting Co-Salient Object with Transformers
Lv Tang
2021-04-30
CTLR@WiC-TSV: Target Sense Verification using Marked Inputs and Pre-trained Models
José G. MorenoElvys Linhares PontesGaël Dias
2021-04-30
GasHis-Transformer: A Multi-scale Visual Transformer Approach for Gastric Histopathology Image Classification
HaoYuan ChenChen LiXiaoyan LiWeiming HuYixin LiWanli LiuChanghao SunYuDong YaoMarcin Grzegorzek
2021-04-29
Emerging Properties in Self-Supervised Vision Transformers
Mathilde CaronHugo TouvronIshan MisraHervé JégouJulien MairalPiotr BojanowskiArmand Joulin
2021-04-29
Using Adaptive Gradient for Texture Learning in Single-View 3D Reconstruction
Luoyang LinDihong Tian
2021-04-29
Entailment as Few-Shot Learner
Sinong WangHan FangMadian KhabsaHanzi MaoHao Ma
2021-04-29
AMR Parsing with Action-Pointer Transformer
Jiawei ZhouTahira NaseemRamón Fernandez AstudilloRadu Florian
2021-04-29
Pyramid Medical Transformer for Medical Image Segmentation
Zhuangzhuang ZhangBaozhou SunWeixiong Zhang
2021-04-29
HandsFormer: Keypoint Transformer for Monocular 3D Pose Estimation of Hands and Object in Interaction
Shreyas HampaliSayan Deb SarkarMahdi RadVincent Lepetit
2021-04-29
Medical Transformer: Universal Brain Encoder for 3D MRI Analysis
Eunji JunSeungwoo JeongDa-Woon HeoHeung-Il Suk
2021-04-28
Point Cloud Learning with Transformer
Xian-Feng HanYu-Jia KuangGuo-Qiang Xiao
2021-04-28
Inpainting Transformer for Anomaly Detection
Jonathan PirnayKeng Chai
2021-04-28
Dual Transformer for Point Cloud Analysis
Xian-Feng HanYi-Fei JinHui-Xian ChengGuo-Qiang Xiao
2021-04-27
Generating Lead Sheets with Affect: A Novel Conditional seq2seq Framework
Dimos MakrisKat R. AgresDorien Herremans
2021-04-27
UoT-UWF-PartAI at SemEval-2021 Task 5: Self Attention Based Bi-GRU with Multi-Embedding Representation for Toxicity Highlighter
Hamed Babaei GiglouTaher RahgooyMostafa RahgouyJafar Razmara
2021-04-27
Extractive and Abstractive Explanations for Fact-Checking and Evaluation of News
Ashkan KazemiZehua LiVerónica Pérez-RosasRada Mihalcea
2021-04-27
Easy and Efficient Transformer : Scalable Inference Solution For large NLP model
Gongzheng liYadong XiJingzhen DingDuan WangBai LiuChangjie FanXiaoxi MaoZeng Zhao
2021-04-26
Rich Semantics Improve Few-shot Learning
Mohamed AfhamSalman KhanMuhammad Haris KhanMuzammal NaseerFahad Shahbaz Khan
2021-04-26
Visformer: The Vision-friendly Transformer
Zhengsu ChenLingxi XieJianwei NiuXuefeng LiuLonghui WeiQi Tian
2021-04-26
Focused Attention Improves Document-Grounded Generation
Shrimai PrabhumoyeKazuma HashimotoYingbo ZhouAlan W BlackRuslan Salakhutdinov
2021-04-26
PanGu-α: Large-scale Autoregressive Pretrained Chinese Language Models with Auto-parallel Computation
Wei ZengXiaozhe RenTeng SuHui WangYi LiaoZhiwei WangXin JiangZhenZhang YangKaisheng WangXiaoda ZhangChen LiZiyan GongYifan YaoXinjing HuangJun WangJianfeng YuQi GuoYue YuYan ZhangJin WangHengtao TaoDasen YanZexuan YiFang PengFangqing JiangHan ZhangLingfeng DengYehong ZhangZhe LinChao ZhangShaojie ZhangMingyue GuoShanzhi GuGaojun FanYaoWei WangXuefeng JinQun LiuYonghong Tian
2021-04-26
Accounting for Agreement Phenomena in Sentence Comprehension with Transformer Language Models: Effects of Similarity-based Interference on Surprisal and Attention
Soo Hyun RyuRichard L. Lewis
2021-04-26
Head-synchronous Decoding for Transformer-based Streaming ASR
Mohan LiCatalin ZorilaRama Doddipatla
2021-04-26
Transformer Meets DCFAM: A Novel Semantic Segmentation Scheme for Fine-Resolution Remote Sensing Images
Libo WangRui LiChenxi DuanShenghui Fang
2021-04-25
Visual Saliency Transformer
Nian LiuNi ZhangKaiyuan WanJunwei HanLing Shao
2021-04-25
baller2vec++: A Look-Ahead Multi-Entity Transformer For Modeling Coordinated Agents
Michael A. AlcornAnh Nguyen
2021-04-24
VidTr: Video Transformer Without Convolutions
Xinyu LiYanyi ZhangChunhui LiuBing ShuaiYi ZhuBiagio BrattoliHao ChenIvan MarsicJoseph Tighe
2021-04-23
Learning to Cluster Faces via Transformer
Jinxing YeXioajiang PengBaigui SunKai WangXiuyu SunHao LiHanqing Wu
2021-04-23
VATT: Transformers for Multimodal Self-Supervised Learning from Raw Video, Audio and Text
Hassan AkbariLinagzhe YuanRui QianWei-Hong ChuangShih-Fu ChangYin CuiBoqing Gong
2021-04-22
Carbon Emissions and Large Neural Network Training
David PattersonJoseph GonzalezQuoc LeChen LiangLluis-Miquel MunguiaDaniel RothchildDavid SoMaud TexierJeff Dean
2021-04-21
Label-Synchronous Speech-to-Text Alignment for ASR Using Forward and Backward Transformers
Yusuke KidaTatsuya KomatsuMasahito Togami
2021-04-21
Efficient pre-training objectives for Transformers
Luca Di LielloMatteo GabburoAlessandro Moschitti
2021-04-20
CATE meets ML -- The Conditional Average Treatment Effect and Machine Learning
Daniel Jacob
2021-04-20
Analyzing COVID-19 Tweets with Transformer-based Language Models
Philip FeldmanSim TiwariCharissa S. L. CheahJames R. FouldsSHimei Pan
2021-04-20
Modeling Event Plausibility with Consistent Conceptual Abstraction
Ian PoradaKaheer SulemanAdam TrischlerJackie Chi Kit Cheung
2021-04-20
TeamUNCC@LT-EDI-EACL2021: Hope Speech Detection using Transfer Learning with Transformers
Khyati MahajanErfan Al-HossamiSamira Shaikh
2021-04-19
Improving Transformer-Kernel Ranking Model Using Conformer and Query Term Independence
Bhaskar MitraSebastian HofstatterHamed ZamaniNick Craswell
2021-04-19
Multi-Modal Fusion Transformer for End-to-End Autonomous Driving
Aditya PrakashKashyap ChittaAndreas Geiger
2021-04-19
A novel Time-frequency Transformer and its Application in Fault Diagnosis of Rolling Bearings
Yifei DingMinping JiaQiuhua MiaoYudong Cao
2021-04-19
TransCrowd: Weakly-Supervised Crowd Counting with Transformer
Dingkang LiangXiwu ChenWei XuYu ZhouXiang Bai
2021-04-19
Advanced Long-context End-to-end Speech Recognition Using Context-expanded Transformers
Takaaki HoriNiko MoritzChiori HoriJonathan Le Roux
2021-04-19
Code Structure Guided Transformer for Source Code Summarization
Shuzheng GaoCuiyun GaoYulan HeJichuan ZengLun Yiu NieXin Xia
2021-04-19
Extracting Temporal Event Relation with Syntactic-Guided Temporal Graph Transformer
Shuaicheng ZhangLifu HuangQiang Ning
2021-04-19
GPT3Mix: Leveraging Large-scale Language Models for Text Augmentation
Kang Min YooDongju ParkJaewook KangSang-Woo LeeWoomyeong Park
2021-04-18
FedNLP: A Research Platform for Federated Learning in Natural Language Processing
Bill Yuchen LinChaoyang HeZihang ZengHulin WangYufen HuangMahdi SoltanolkotabiXiang RenSalman Avestimehr
2021-04-18
Fantastically Ordered Prompts and Where to Find Them: Overcoming Few-Shot Prompt Order Sensitivity
Yao LuMax BartoloAlastair MooreSebastian RiedelPontus Stenetorp
2021-04-18
Natural Instructions: Benchmarking Generalization to New Tasks from Natural Language Instructions
Swaroop MishraDaniel KhashabiChitta BaralHannaneh Hajishirzi
2021-04-18
A Token-level Reference-free Hallucination Detection Benchmark for Free-form Text Generation
Tianyu LiuYizhe ZhangChris BrockettYi MaoZhifang SuiWeizhu ChenBill Dolan
2021-04-18
Demystifying the Better Performance of Position Encoding Variants for Transformer
Pu-Chin ChenHenry TsaiSrinadh BhojanapalliHyung Won ChungYin-Wen ChangChun-Sung Ferng
2021-04-18
On the Strengths of Cross-Attention in Pretrained Transformers for Machine Translation
Mozhdeh GheiniXiang RenJonathan May
2021-04-18
mT6: Multilingual Pretrained Text-to-Text Transformer with Translation Pairs
Zewen ChiLi DongShuming MaShaohan Huang Xian-Ling MaoHeyan HuangFuru Wei
2021-04-18
The Power of Scale for Parameter-Efficient Prompt Tuning
Brian LesterRami Al-RfouNoah Constant
2021-04-18
When Does Pretraining Help? Assessing Self-Supervised Learning for Law and the CaseHOLD Dataset
Lucia ZhengNeel GuhaBrandon R. AndersonPeter HendersonDaniel E. Ho
2021-04-18
Higher Order Recurrent Space-Time Transformer
Tsung-Ming TaiGiuseppe FiameniCheng-Kuang LeeOswald Lanz
2021-04-17
Visual Transformer Pruning
Mingjian ZhuKai HanYehui TangYunhe Wang
2021-04-17
Hierarchical Transformer Networks for Longitudinal Clinical Document Classification
Yuqi SiKirk Roberts
2021-04-17
Editing Factual Knowledge in Language Models
Nicola De CaoWilker AzizIvan Titov
2021-04-16
Serial or Parallel? Plug-able Adapter for multilingual machine translation
Yaoming ZhuJiangtao FengChengqi ZhaoMingxuan WangLei LI
2021-04-16
Is Your Language Model Ready for Dense Representation Fine-tuning?
Luyu GaoJamie Callan
2021-04-16
An Adversarially-Learned Turing Test for Dialog Generation Models
Xiang GaoYizhe ZhangMichel GalleyBill Dolan
2021-04-16
Comparison of Grammatical Error Correction Using Back-Translation Models
Aomi KoyamaKengo HotateMasahiro KanekoMamoru Komachi
2021-04-16
Text2App: A Framework for Creating Android Apps from Text Descriptions
Masum HasanKazi Sajeed MehrabWasi Uddin AhmadRifat Shahriyar
2021-04-16
Surface Form Competition: Why the Highest Probability Answer Isn't Always Right
Ari HoltzmanPeter WestVered SchwartzYejin ChoiLuke Zettlemoyer
2021-04-16
ExplaGraphs: An Explanation Graph Generation Task for Structured Commonsense Reasoning
Swarnadeep SahaPrateek YadavLisa BauerMohit Bansal
2021-04-15
Demystify Optimization Challenges in Multilingual Transformers
Xian LiHongyu Gong
2021-04-15
Self-supervised Video Object Segmentation by Motion Grouping
Charig YangHala LamdouarErika LuAndrew ZissermanWeidi Xie
2021-04-15
Vision Transformer using Low-level Chest X-ray Feature Corpus for COVID-19 Diagnosis and Severity Quantification
Sangjoon ParkGwanghyun KimYujin OhJoon Beom SeoSang Min LeeJin Hwan KimSungjun MoonJae-Kwang LimJong Chul Ye
2021-04-15
Cross-domain Speech Recognition with Unsupervised Character-level Distribution Matching
Wenxin HouJindong WangXu TanTao QinTakahiro Shinozaki
2021-04-15
NT5?! Training T5 to Perform Numerical Reasoning
Peng-Jian YangYing Ting ChenYuechan ChenDaniel Cer
2021-04-15
TorontoCL at CMCL 2021 Shared Task: RoBERTa with Multi-Stage Fine-Tuning for Eye-Tracking Prediction
Bai LiFrank Rudzicz
2021-04-15
Points as Queries: Weakly Semi-supervised Object Detection by Points
Liangyu ChenTong YangXiangyu ZhangWei zhangJian Sun
2021-04-15
Rethinking Text Line Recognition Models
Daniel Hernandez DiazSiyang QinReeve IngleYasuhisa FujiiAlessandro Bissacco
2021-04-15
Shoulder Implant X-Ray Manufacturer Classification: Exploring with Vision Transformer
Meng ZhouShanglin Mo
2021-04-15
Syntax-Aware Graph-to-Graph Transformer for Semantic Role Labelling
Alireza MohammadshahiJames Henderson
2021-04-15
A Survey of Recent Abstract Summarization Techniques
Diyah Puspitaningrum
2021-04-15
An Introduction of mini-AlphaStar
Ruo-Ze LiuWenhai WangYanjie ShenZhiqi LiYang YuTong Lu
2021-04-14
NAREOR: The Narrative Reordering Problem
Varun GangalSteven Y. FengEduard HovyTeruko Mitamura
2021-04-14
Decoupled Spatial-Temporal Transformer for Video Inpainting
Rui LiuHanming DengYangyi HuangXiaoyu ShiLewei LuWenxiu SunXiaogang WangJifeng DaiHongsheng Li
2021-04-14
Sparse Attention with Linear Units
Biao ZhangIvan TitovRico Sennrich
2021-04-14
Knowledge-driven Answer Generation for Conversational Search
Mariana LeiteRafael FerreiraDavid SemedoJoão Magalhães
2021-04-14
Non-autoregressive sequence-to-sequence voice conversion
Tomoki HayashiWen-Chin HuangKazuhiro KobayashiTomoki Toda
2021-04-14
TWEAC: Transformer with Extendable QA Agent Classifiers
Gregor GeigleNils ReimersAndreas RückléIryna Gurevych
2021-04-14
QA-GNN: Reasoning with Language Models and Knowledge Graphs for Question Answering
Michihiro YasunagaHongyu RenAntoine BosselutPercy LiangJure Leskovec
2021-04-13
UPB at SemEval-2021 Task 7: Adversarial Multi-Task Learning for Detecting and Rating Humor and Offense
Răzvan-Alexandru SmăduDumitru-Clementin CercelMihai Dascalu
2021-04-13
Semantic maps and metrics for science using deep transformer encoders
Brendan ChambersJames Evans
2021-04-13
Understanding Transformers for Bot Detection in Twitter
Andres Garcia-SilvaCristian BerrioJose Manuel Gomez-Perez
2021-04-13
Transformer-based Methods for Recognizing Ultra Fine-grained Entities (RUFES)
Emanuela BorosAntoine Doucet
2021-04-13
Discourse Probing of Pretrained Language Models
Fajri KotoJey Han LauTimothy Baldwin
2021-04-13
MS2: Multi-Document Summarization of Medical Studies
Jay DeYoungIz BeltagyMadeleine van ZuylenBailey KuehlLucy Lu Wang
2021-04-13
ViT-V-Net: Vision Transformer for Unsupervised Volumetric Medical Image Registration
Junyu ChenYufan HeEric C. FreyYe LiYong Du
2021-04-13
Can a Transformer Pass the Wug Test? Tuning Copying Bias in Neural Morphological Inflection Models
Ling LiuMans Hulden
2021-04-13
Updater-Extractor Architecture for Inductive World State Representations
Arseny MoskvichevJames A. Liu
2021-04-12
Learning dynamic and hierarchical traffic spatiotemporal features with Transformer
Haoyang YanXiaolei Ma
2021-04-12
Cloth Interactive Transformer for Virtual Try-On
Bin RenHao TangFanyang MengRunwei DingLing ShaoPhilip H. S. TorrNicu Sebe
2021-04-12
Multilingual Language Models Predict Human Reading Behavior
Nora HollensteinFederico PirovanoCe ZhangLena JägerLisa Beinborn
2021-04-12
Family of Origin and Family of Choice: Massively Parallel Lexiconized Iterative Pretraining for Severely Low Resource Machine Translation
Zhong ZhouAlex Waibel
2021-04-12
On Representation Learning for Scientific News Articles Using Heterogeneous Knowledge Graphs
Angelika RomanouPanayiotis SmerosKarl Aberer
2021-04-12
Learning to Synthesize Data for Semantic Parsing
Bailin WangWenpeng YinXi Victoria LinCaiming Xiong
2021-04-12
Paragraph-level Simplification of Medical Texts
Ashwin DevarajIain J. MarshallByron C. WallaceJunyi Jessy Li
2021-04-12
UniDrop: A Simple yet Effective Technique to Improve Transformer without Extra Cost
Zhen WuLijun WuQi MengYingce XiaShufang XieTao QinXinyu DaiTie-Yan Liu
2021-04-11
Meta-tuning Language Models to Answer Prompts Better
Ruiqi ZhongKristy LeeZheng ZhangDan Klein
2021-04-10
Knowledge-Aware Graph-Enhanced GPT-2 for Dialogue State Tracking
Weizhe LinBo-Hsian TsengBill Byrne
2021-04-09
Deep Transformer Networks for Time Series Classification: The NPP Safety Case
Bing ZhaAlessandro VanniYassin HassanTunc AldemirAlper Yilmaz
2021-04-09
KI-BERT: Infusing Knowledge Context for Better Language and Domain Understanding
Keyur FalduAmit ShethPrashant KikaniHemang Akabari
2021-04-09
Transformers: "The End of History" for NLP?
Anton ChernyavskiyDmitry IlvovskyPreslav Nakov
2021-04-09
Revisiting Simple Neural Probabilistic Language Models
Simeng SunMohit Iyyer
2021-04-08
Layer Reduction: Accelerating Conformer-Based Self-Supervised Model via Layer Consistency
Jinchuan TianRongzhi GuHelin WangYuexian Zou
2021-04-08
Facial Attribute Transformers for Precise and Robust Makeup Transfer
Zhaoyi WanHaoran ChenJielei ZhangWentao JiangCong YaoJiebo Luo
2021-04-07
LI-Net: Large-Pose Identity-Preserving Face Reenactment Network
Jin LiuPeng ChenTao LiangZhaoxing LiCai YuShuqiao ZouJiao DaiJizhong Han
2021-04-07
Seeing Out of tHe bOx: End-to-End Pre-training for Vision-Language Representation Learning
Zhicheng HuangZhaoyang ZengYupan HuangBei LiuDongmei FuJianlong Fu
2021-04-07
Interpreting A Pre-trained Model Is A Key For Model Architecture Optimization: A Case Study On Wav2Vec 2.0
Liu ChenMeysam Asgari
2021-04-07
Attention Head Masking for Inference Time Content Selection in Abstractive Summarization
Shuyang CaoLu Wang
2021-04-06
Fourier Image Transformer
Tim-Oliver BuchholzFlorian Jug
2021-04-06
Variational Transformer Networks for Layout Generation
Diego Martin ArroyoJanis PostelsFederico Tombari
2021-04-06
LT-LM: a novel non-autoregressive language model for single-shot lattice rescoring
Anton MitrofanovMariya KorenevskayaIvan PodluzhnyYuri KhokhlovAleksandr LaptevAndrei AndrusenkoAleksei IlinMaxim KorenevskyIvan MedennikovAleksei Romanenko
2021-04-06
MuSLCAT: Multi-Scale Multi-Level Convolutional Attention Transformer for Discriminative Music Modeling on Raw Waveforms
Kai MiddlebrookShyam SudhakaranDavid Guy Brizan
2021-04-06
ODE Transformer: An Ordinary Differential Equation-Inspired Model for Neural Machine Translation
Bei LiQuan DuTao ZhouShuhan ZhouXin ZengTong XiaoJingbo Zhu
2021-04-06
Variable selection with missing data in both covariates and outcomes: Imputation and machine learning
Liangyuan HuJung-Yi Joyce LinJiayi Ji
2021-04-06
CodeTrans: Towards Cracking the Language of Silicone's Code Through Self-Supervised Deep Learning and High Performance Computing
Ahmed ElnaggarWei DingLlion JonesTom GibbsTamas FeherChristoph AngererSilvia SeveriniFlorian MatthesBurkhard Rost
2021-04-06
AST: Audio Spectrogram Transformer
Yuan GongYu-An ChungJames Glass
2021-04-05
Exploring Transformers in Emotion Recognition: a comparison of BERT, DistillBERT, RoBERTa, XLNet and ELECTRA
Diogo Cortiz
2021-04-05
TransfoRNN: Capturing the Sequential Information in Self-Attention Representations for Language Modeling
Tze Yuang ChongXuyang WangLin YangJunjie Wang
2021-04-04
IITK@Detox at SemEval-2021 Task 5: Semi-Supervised Learning and Dice Loss for Toxic Spans Detection
Archit BansalAbhay KaushikAshutosh Modi
2021-04-04
IndT5: A Text-to-Text Transformer for 10 Indigenous Languages
El Moatez Billah NagoudiWei-Rui ChenMuhammad Abdul-MageedHasan Cavusogl
2021-04-04
Deepfake Detection Scheme Based on Vision Transformer and Distillation
Young-Jin HeoYoung-Ju ChoiYoung-Woon LeeByung-Gyu Kim
2021-04-03
Efficient DETR: Improving End-to-End Object Detector with Dense Prior
Zhuyu YaoJiangbo AiBoxun LiChi Zhang
2021-04-03
Language-based Video Editing via Multi-Modal Multi-Level Transformer
Tsu-Jui FuXin Eric WangScott T. GraftonMiguel P. EcksteinWilliam Yang Wang
2021-04-02
AAformer: Auto-Aligned Transformer for Person Re-Identification
Kuan ZhuHaiyun GuoShiliang ZhangYaoWei WangGaopan HuangHonglin QiaoJing LiuJinqiao WangMing Tang
2021-04-02
Effect of depth order on iterative nested named entity recognition models
Perceval WajsburtYoann TailléXavier Tannier
2021-04-02
Using GPT-2 to Create Synthetic Data to Improve the Prediction Performance of NLP Machine Learning Classification Models
Dewayne Whitfield
2021-04-02
Putting NeRF on a Diet: Semantically Consistent Few-Shot View Synthesis
Ajay JainMatthew TancikPieter Abbeel
2021-04-01
WakaVT: A Sequential Variational Transformer for Waka Generation
Yuka TakeishiMingxuan NiuJing LuoZhong JinXinyu Yang
2021-04-01
LoFTR: Detector-Free Local Feature Matching with Transformers
Jiaming SunZehong ShenYuang WangHujun BaoXiaowei Zhou
2021-04-01
TransMOT: Spatial-Temporal Graph Transformer for Multiple Object Tracking
Peng ChuJiang WangQuanzeng YouHaibin LingZicheng Liu
2021-04-01
Next Generation Multitarget Trackers: Random Finite Set Methods vs Transformer-based Deep Learning
Juliano PintoGeorg HessWilliam LjungberghYuxuan XiaLennart SvenssonHenk Wymeersch
2021-04-01
Adversarial Attacks and Defenses for Speech Recognition Systems
Piotr ŻelaskoSonal JoshiYiwen ShaoJesus VillalbaJan TrmalNajim DehakSanjeev Khudanpur
2021-03-31
Learning Spatio-Temporal Transformer for Visual Tracking
Bin YanHouwen PengJianlong FuDong WangHuchuan Lu
2021-03-31
Spatiotemporal Transformer for Video-based Person Re-identification
Tianyu ZhangLonghui WeiLingxi XieZijie ZhuangYongfei ZhangBo LiQi Tian
2021-03-30
Read and Attend: Temporal Localisation in Sign Language Videos
Gül VarolLiliane MomeniSamuel AlbanieTriantafyllos AfourasAndrew Zisserman
2021-03-30
Rethinking Spatial Dimensions of Vision Transformers
Byeongho HeoSangdoo YunDongyoon HanSanghyuk ChunJunsuk ChoeSeong Joon Oh
2021-03-30
Automatic Graph Partitioning for Very Large-scale Deep Learning
Masahiro TanakaKenjiro TauraToshihiro HanawaKentaro Torisawa
2021-03-30
CvT: Introducing Convolutions to Vision Transformers
Haiping WuBin XiaoNoel CodellaMengchen LiuXiyang DaiLu YuanLei Zhang
2021-03-29
Multi-Scale Vision Longformer: A New Vision Transformer for High-Resolution Image Encoding
Pengchuan ZhangXiyang DaiJianwei YangBin XiaoLu YuanLei ZhangJianfeng Gao
2021-03-29
Transformer Tracking
Xin ChenBin YanJiawen ZhuDong WangXiaoyun YangHuchuan Lu
2021-03-29
PENELOPIE: Enabling Open Information Extraction for the Greek Language through Machine Translation
Dimitris PapadopoulosNikolaos PapadakisNikolaos Matsatsinis
2021-03-28
HiT: Hierarchical Transformer with Momentum Contrast for Video-Text Retrieval
Song LiuHaoqi FanShengsheng QianYiru ChenWenkui DingZhongyuan Wang
2021-03-28
Face Transformer for Recognition
Yaoyao ZhongWeihong Deng
2021-03-27
Automated radiology report generation using conditioned transformers
Omar AlfarghalyRana KhaledAbeer ElkoranyMaha HelalAly Fahmy
2021-03-26
Leveraging neural representations for facilitating access to untranscribed speech from endangered languages
Nay SanMartijn BarteldsMitchell BrowneLily CliffordFiona GibsonJohn MansfieldDavid NashJane SimpsonMyfany TurpinMaria VollmerSasha WilmothDan Jurafsky
2021-03-26
A Practical Survey on Faster and Lighter Transformers
Quentin FournierGaétan Marceau CaronDaniel Aloise
2021-03-26
Understanding Robustness of Transformers for Image Classification
Srinadh BhojanapalliAyan ChakrabartiDaniel GlasnerDaliang LiThomas UnterthinerAndreas Veit
2021-03-26
Gated Transformer Networks for Multivariate Time Series Classification
Minghao LiuShengqi RenSiyuan MaJiahui JiaoYizhou ChenZhiguang WangWei Song
2021-03-26
Lifting Transformer for 3D Human Pose Estimation in Video
Wenhao LiHong LiuRunwei DingMengyuan LiuPichao Wang
2021-03-26
BART based semantic correction for Mandarin automatic speech recognition system
Yun ZhaoXuerui YangJinchao WangYongyu GaoChao YanYuanfu Zhou
2021-03-26
Swin Transformer: Hierarchical Vision Transformer using Shifted Windows
Ze LiuYutong LinYue CaoHan HuYixuan WeiZheng ZhangStephen LinBaining Guo
2021-03-25
AgentFormer: Agent-Aware Transformers for Socio-Temporal Multi-Agent Forecasting
Ye YuanXinshuo WengYanglan OuKris Kitani
2021-03-25
Mask Attention Networks: Rethinking and Strengthen Transformer
Zhihao FanYeyun GongDayiheng LiuZhongyu WeiSiyuan WangJian JiaoNan DuanRuofei ZhangXuanjing Huang
2021-03-25
K-XLNet: A General Method for Combining Explicit Knowledge with Language Model Pretraining
Ruiqing YanLanchang SunFang WangXiaoMing Zhang
2021-03-25
Thinking Aloud: Dynamic Context Generation Improves Zero-Shot Reasoning Performance of GPT-2
Gregor BetzKyle RichardsonChristian Voigt
2021-03-24
Multi-view 3D Reconstruction with Transformer
Dan WangXinrui CuiXun ChenZhengxia ZouTianyang ShiSeptimiu SalcudeanZ. Jane WangRabab Ward
2021-03-24
Revamping Cross-Modal Recipe Retrieval with Hierarchical Transformers and Self-supervised Learning
Amaia SalvadorErhan GundogduLoris BazzaniMichael Donoser
2021-03-24
Detecting Hate Speech with GPT-3
Ke-Li ChiuRohan Alexander
2021-03-23
Are Neural Language Models Good Plagiarists? A Benchmark for Neural Paraphrase Detection
Jan Philip WahleTerry RuasNorman MeuschkeBela Gipp
2021-03-23
The NLP Cookbook: Modern Recipes for Transformer based Deep Learning Architectures
Sushant SinghAusif Mahmood
2021-03-23
Hybrid Model for Patent Classification using Augmented SBERT and KNN
Hamid BekamiriDaniel S. HainRoman Jurowetzki
2021-03-22
Incorporating Convolution Designs into Visual Transformers
Kun YuanShaopeng GuoZiwei LiuAojun ZhouFengwei YuWei Wu
2021-03-22
Tiny Transformers for Environmental Sound Classification at the Edge
David ElliottCarlos E. OteroSteven WyattEvan Martino
2021-03-22
End-to-End Trainable Multi-Instance Pose Estimation with Transformers
Lucas StofflMaxime VidalAlexander Mathis
2021-03-22
Paying Attention to Activation Maps in Camera Pose Regression
Yoli ShavitRon FerensYosi Keller
2021-03-21
Non-Autoregressive Translation by Learning Target Categorical Codes
Yu BaoShuJian HuangTong XiaoDongqi WangXinyu DaiJiajun Chen
2021-03-21
MaAST: Map Attention with Semantic Transformers for Efficient Visual Navigation
Zachary SeymourKowshik ThopalliNiluthpol MithunHan-Pang ChiuSupun SamarasekeraRakesh Kumar
2021-03-21
An Unsupervised Sampling Approach for Image-Sentence Matching Using Document-Level Structural Information
Zejun LiZhongyu WeiZhihao FanHaijun ShanXuanjing Huang
2021-03-21
The Effectiveness of Morphology-aware Segmentation in Low-Resource Neural Machine Translation
Jonne SäleväConstantine Lignos
2021-03-20
Paying Attention to Multiscale Feature Maps in Multimodal Image Matching
Aviad MoreshetYosi Keller
2021-03-20
Play the Shannon Game With Language Models: A Human-Free Approach to Summary Evaluation
Nicholas EganOleg VasilyevJohn Bohannon
2021-03-19
Hopper: Multi-hop Transformer for Spatiotemporal Reasoning
Honglu ZhouAsim KadavFarley LaiAlexandru Niculescu-MizilMartin Renqiang MinMubbasir KapadiaHans Peter Graf
2021-03-19
Transferable Model for Shape Optimization subject to Physical Constraints
Lukas HarschJohannes BurgbacherStefan Riedelbauch
2021-03-19
API2Com: On the Improvement of Automatically Generated Code Comments Using API Documentations
Ramin ShahbaziRishab SharmaFatemeh H. Fard
2021-03-19
GPT Understands, Too
Xiao LiuYanan ZhengZhengxiao DuMing DingYujie QianZhilin YangJie Tang
2021-03-18
Enhancing Transformer for Video Understanding Using Gated Multi-Level Attention and Temporal Adversarial Training
Saurabh SahuPalash Goyal
2021-03-18
Trans-SVNet: Accurate Phase Recognition from Surgical Videos via Hybrid Embedding Aggregation Transformer
Xiaojie GaoYueming JinYonghao LongQi DouPheng-Ann Heng
2021-03-17
You Only Look One-level Feature
Qiang ChenYingming WangTong YangXiangyu ZhangJian ChengJian Sun
2021-03-17
Dense Interaction Learning for Video-based Person Re-identification
Tianyu HeXin JinXu ShenJianqiang HuangZhibo ChenXian-Sheng Hua
2021-03-16
LightningDOT: Pre-training Visual-Semantic Embeddings for Real-Time Image-Text Retrieval
Siqi SunYen-Chun ChenLinjie LiShuohang WangYuwei FangJingjing Liu
2021-03-16
Knowledge driven Description Synthesis for Floor Plan Interpretation
Shreya GoyalChiranjoy ChattopadhyayGaurav Bhatnagar
2021-03-15
SemVLP: Vision-Language Pre-training by Aligning Semantics at Multiple Levels
Chenliang LiMing YanHaiyang XuFuli LuoWei WangBin BiSongfang Huang
2021-03-14
Improving Code Summarization with Block-wise Abstract Syntax Tree Splitting
Chen LinZhichao OuyangJunqing ZhuangJianqiang ChenHui LiRongxin Wu
2021-03-14
Embedding Calibration for Music Semantic Similarity using Auto-regressive Transformer
Xinran ZhangMaosong SunJiafeng LiuXiaobing Li
2021-03-13
Bilingual Dictionary-based Language Model Pretraining for Neural Machine Translation
Yusen LinJiayong LinShuaicheng ZhangHaoying Dai
2021-03-12
Vision Transformer for COVID-19 CXR Diagnosis using Chest X-ray Feature Corpus
Sangjoon ParkGwanghyun KimYujin OhJoon Beom SeoSang Min LeeJin Hwan KimSungjun MoonJae-Kwang LimJong Chul Ye
2021-03-12
Severity Quantification and Lesion Localization of COVID-19 on CXR using Vision Transformer
Gwanghyun KimSangjoon ParkYujin OhJoon Beom SeoSang Min LeeJin Hwan KimSungjun MoonJae-Kwang LimJong Chul Ye
2021-03-12
Sequential Random Network for Fine-grained Image Classification
Chaorong LiMalu ZhangWei HuangFengqing QinAnping ZengYuanyuan Huang
2021-03-12
Predicting the Behavior of Dealers in Over-The-Counter Corporate Bond Markets
Yusen LinJinming XueLouiqa Raschid
2021-03-12
Comparing the Performance of NLP Toolkits and Evaluation measures in Legal Tech
Muhammad Zohaib Khan
2021-03-12
Unknown Object Segmentation from Stereo Images
Maximilian DurnerWout BoerdijkMartin SundermeyerWerner FriedlZoltan-Csaba MartonRudolph Triebel
2021-03-11
LightMBERT: A Simple Yet Effective Method for Multilingual BERT Distillation
Xiaoqi JiaoYichun YinLifeng ShangXin JiangXiao ChenLinlin LiFang WangQun Liu
2021-03-11
On Improving Deep Learning Trace Analysis with System Call Arguments
Quentin FournierDaniel AloiseSeyed Vahid AzhariFrançois Tetreault
2021-03-11
Continuous 3D Multi-Channel Sign Language Production via Progressive Transformers and Mixture Density Networks
Ben SaundersNecati Cihan CamgozRichard Bowden
2021-03-11
CUAD: An Expert-Annotated NLP Dataset for Legal Contract Review
Dan HendrycksCollin BurnsAnya ChenSpencer Ball
2021-03-10
Majority Voting with Bidirectional Pre-translation For Bitext Retrieval
Alex JonesDerry Tanti Wijaya
2021-03-10
Pretrained Transformers as Universal Computation Engines
Kevin LuAditya GroverPieter AbbeelIgor Mordatch
2021-03-09
TransBTS: Multimodal Brain Tumor Segmentation Using Transformer
Wenxuan Wang, Chen Chen, Meng Ding, Jiangyun Li, Hong Yu, Sen Zha
2021-03-07
Syntax-BERT: Improving Pre-trained Transformers with Syntax Trees
Jiangang Bai, Yujing Wang, Yiren Chen, Yaming Yang, Jing Bai, Jing Yu, Yunhai Tong
2021-03-07
MTLHealth: A Deep Learning System for Detecting Disturbing Content in Student Essays
Joseph Valencia, Erin Yao
2021-03-07
Orthogonal Attention: A Cloze-Style Approach to Negation Scope Resolution
Aditya Khandelwal, Vahida Attar
2021-03-07
Measuring Mathematical Problem Solving With the MATH Dataset
Dan Hendrycks, Collin Burns, Saurav Kadavath, Akul Arora, Steven Basart, Eric Tang, Dawn Song, Jacob Steinhardt
2021-03-05
SpecTr: Spectral Transformer for Hyperspectral Pathology Image Segmentation
Boxiang Yun, Yan Wang, Jieneng Chen, Huiyu Wang, Wei Shen, Qingli Li
2021-03-05
Hierarchical Transformer for Multilingual Machine Translation
Albina Khusainova, Adil Khan, Adín Ramírez Rivera, Vitaly Romanov
2021-03-05
IOT: Instance-wise Layer Reordering for Transformer Structures
Jinhua Zhu, Lijun Wu, Yingce Xia, Shufang Xie, Tao Qin, Wengang Zhou, Houqiang Li, Tie-Yan Liu
2021-03-05
CoTr: Efficiently Bridging CNN and Transformer for 3D Medical Image Segmentation
Yutong Xie, Jianpeng Zhang, Chunhua Shen, Yong Xia
2021-03-04
The Transformer Network for the Traveling Salesman Problem
Xavier Bresson, Thomas Laurent
2021-03-04
End-to-end acoustic modelling for phone recognition of young readers
Lucile Gelin, Morgane Daniel, Julien Pinquier, Thomas Pellegrini
2021-03-04
University of Copenhagen Participation in TREC Health Misinformation Track 2020
Lucas Chaves Lima, Dustin Brandon Wright, Isabelle Augenstein, Maria Maistro
2021-03-03
Dual Reinforcement-Based Specification Generation for Image De-Rendering
Ramakanth Pasunuru, David Rosenberg, Gideon Mann, Mohit Bansal
2021-03-02
Probing Product Description Generation via Posterior Distillation
Haolan Zhan, Hainan Zhang, Hongshen Chen, Lei Shen, Zhuoye Ding, Yongjun Bao, Weipeng Yan, Yanyan Lan
2021-03-02
A HINT from Arithmetic: On Systematic Generalization of Perception, Syntax, and Semantics
Qing Li, Siyuan Huang, Yining Hong, Yixin Zhu, Ying Nian Wu, Song-Chun Zhu
2021-03-02
Long Document Summarization in a Low Resource Setting using Pretrained Language Models
Ahsaas Bajaj, Pavitra Dangati, Kalpesh Krishna, Pradhiksha Ashok Kumar, Rheeya Uppaal, Bradford Windsor, Eliot Brenner, Dominic Dotterrer, Rajarshi Das, Andrew McCallum
2021-03-01
CrossMap Transformer: A Crossmodal Masked Path Transformer Using Double Back-Translation for Vision-and-Language Navigation
Aly Magassouba, Komei Sugiura, Hisashi Kawai
2021-03-01
NLP-CUET@DravidianLangTech-EACL2021: Investigating Visual and Textual Features to Identify Trolls from Multimodal Social Media Memes
Eftekhar Hossain, Omar Sharif, Mohammed Moshiul Hoque
2021-02-28
Transformers with Competitive Ensembles of Independent Mechanisms
Alex Lamb, Di He, Anirudh Goyal, Guolin Ke, Chien-Feng Liao, Mirco Ravanelli, Yoshua Bengio
2021-02-27
Generative chemical transformer: attention makes neural machine learn molecular geometric structures via text
Hyunseung Kim, Jonggeol Na, Won Bo Lee
2021-02-27
Multi-task transfer learning for finding actionable information from crisis-related messages on social media
Congcong Wang, David Lillis
2021-02-26
MixSpeech: Data Augmentation for Low-resource Automatic Speech Recognition
Linghui Meng, Jin Xu, Xu Tan, Jindong Wang, Tao Qin, Bo Xu
2021-02-25
LET: Linguistic Knowledge Enhanced Graph Transformer for Chinese Short Text Matching
Boer Lyu, Lu Chen, Su Zhu, Kai Yu
2021-02-25
LazyFormer: Self Attention with Lazy Update
Chengxuan Ying, Guolin Ke, Di He, Tie-Yan Liu
2021-02-25
When Attention Meets Fast Recurrence: Training Language Models with Reduced Compute
Tao Lei
2021-02-24
From Universal Language Model to Downstream Task: Improving RoBERTa-Based Vietnamese Hate Speech Detection
Quang Huu Pham, Viet Anh Nguyen, Linh Bao Doan, Ngoc N. Tran, Ta Minh Thanh
2021-02-24
Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions
Wenhai Wang, Enze Xie, Xiang Li, Deng-Ping Fan, Kaitao Song, Ding Liang, Tong Lu, Ping Luo, Ling Shao
2021-02-24
PADA: A Prompt-based Autoregressive Approach for Adaptation to Unseen Domains
Eyal Ben-David, Nadav Oved, Roi Reichart
2021-02-24
Accurate Learning of Graph Representations with Graph Multiset Pooling
Jinheon Baek, Minki Kang, Sung Ju Hwang
2021-02-23
Robust and Transferable Anomaly Detection in Log Data using Pre-Trained Language Models
Harold Ott, Jasmin Bogatinovski, Alexander Acker, Sasho Nedelkoski, Odej Kao
2021-02-23
Deep Deformation Detail Synthesis for Thin Shell Models
Lan Chen, Lin Gao, Jie Yang, Shibiao Xu, Juntao Ye, Xiaopeng Zhang, Yu-Kun Lai
2021-02-23
Do Transformer Modifications Transfer Across Implementations and Applications?
Sharan Narang, Hyung Won Chung, Yi Tay, William Fedus, Thibault Fevry, Michael Matena, Karishma Malkan, Noah Fiedel, Noam Shazeer, Zhenzhong Lan, Yanqi Zhou, Wei Li, Nan Ding, Jake Marcus, Adam Roberts, Colin Raffel
2021-02-23
Deepfake Video Detection Using Convolutional Vision Transformer
Deressa Wodajo, Solomon Atnafu
2021-02-22
Position Information in Transformers: An Overview
Philipp Dufter, Martin Schmitt, Hinrich Schütze
2021-02-22
Determination of Fault Location in Transmission Lines with Image Processing and Artificial Neural Networks
Serkan Budak, Bahadir Akbal
2021-02-22
Conditional Positional Encodings for Vision Transformers
Xiangxiang Chu, Zhi Tian, Bo Zhang, Xinlong Wang, Xiaolin Wei, Huaxia Xia, Chunhua Shen
2021-02-22
UniT: Multimodal Multitask Learning with a Unified Transformer
Ronghang Hu, Amanpreet Singh
2021-02-22
Few Shot Learning for Information Verification
Usama Khalid, Mirza Omer Beg
2021-02-22
Medical Transformer: Gated Axial-Attention for Medical Image Segmentation
Jeya Maria Jose Valanarasu, Poojan Oza, Ilker Hacihaliloglu, Vishal M. Patel
2021-02-21
Towards Accurate and Compact Architectures via Neural Architecture Transformer
Yong Guo, Yin Zheng, Mingkui Tan, Qi Chen, Zhipeng Li, Jian Chen, Peilin Zhao, Junzhou Huang
2021-02-20
Multilingual Answer Sentence Reranking via Automatically Translated Data
Thuy Vu, Alessandro Moschitti
2021-02-20
Calibrate Before Use: Improving Few-Shot Performance of Language Models
Tony Z. Zhao, Eric Wallace, Shi Feng, Dan Klein, Sameer Singh
2021-02-19
Dialect Identification in Nuanced Arabic Tweets Using Farasa Segmentation and AraBERT
Anshul Wadhawan
2021-02-19
Latent Variable Nested Set Transformers & AutoBots
Roger Girgis, Florian Golemo, Felipe Codevilla, Jim Aldon D'Souza, Samira Ebrahimi Kahou, Felix Heide, Christopher Pal
2021-02-19
Going Full-TILT Boogie on Document Understanding with Text-Image-Layout Transformer
Rafał Powalski, Łukasz Borchmann, Dawid Jurkiewicz, Tomasz Dwojak, Michał Pietruszka, Gabriela Pałka
2021-02-18
Quiz-Style Question Generation for News Stories
Adam D. Lelkes, Vinh Q. Tran, Cong Yu
2021-02-18
THEaiTRE 1.0: Interactive generation of theatre play scripts
Rudolf Rosa, Tomáš Musil, Ondřej Dušek, Dominik Jurko, Patrícia Schmidtová, David Mareček, Ondřej Bojar, Tom Kocmi, Daniel Hrbek, David Košťák, Martina Kinská, Marie Nováková, Josef Doležal, Klára Vosecká, Tomáš Studeník, Petr Žabka
2021-02-17
Beyond Fully-Connected Layers with Quaternions: Parameterization of Hypercomplex Multiplications with $1/n$ Parameters
Aston Zhang, Yi Tay, Shuai Zhang, Alvin Chan, Anh Tuan Luu, Siu Cheung Hui, Jie Fu
2021-02-17
TeraPipe: Token-Level Pipeline Parallelism for Training Large-Scale Language Models
Zhuohan Li, Siyuan Zhuang, Shiyuan Guo, Danyang Zhuo, Hao Zhang, Dawn Song, Ion Stoica
2021-02-16
Revisiting Language Encoding in Learning Multilingual Representations
Shengjie Luo, Kaiyuan Gao, Shuxin Zheng, Guolin Ke, Di He, LiWei Wang, Tie-Yan Liu
2021-02-16
GradInit: Learning to Initialize Neural Networks for Stable and Efficient Training
Chen Zhu, Renkun Ni, Zheng Xu, Kezhi Kong, W. Ronny Huang, Tom Goldstein
2021-02-16
Exploring Transformers in Natural Language Generation: GPT, BERT, and XLNet
M. Onat Topal, Anil Bas, Imke van Heerden
2021-02-16
The corruptive force of AI-generated advice
Margarita Leib, Nils C. Köbis, Rainer Michael Rilke, Marloes Hagens, Bernd Irlenbusch
2021-02-15
Prompt Programming for Large Language Models: Beyond the Few-Shot Paradigm
Laria Reynolds, Kyle McDonell
2021-02-15
Translational Equivariance in Kernelizable Attention
Max Horn, Kumar Shridhar, Elrich Groenewald, Philipp F. M. Baumann
2021-02-15
Multiversal views on language models
Laria Reynolds, Kyle McDonell
2021-02-12
Optimizing Inference Performance of Transformers on CPUs
Dave Dice, Alex Kogan
2021-02-12
Improving Zero-shot Neural Machine Translation on Language-specific Encoders-Decoders
Junwei Liao, Yu Shi, Ming Gong, Linjun Shou, Hong Qu, Michael Zeng
2021-02-12
Dancing along Battery: Enabling Transformer with Run-time Reconfigurability on Mobile Devices
Yuhong Song, Weiwen Jiang, Bingbing Li, Panjie Qi, Qingfeng Zhuge, Edwin Hsing-Mean Sha, Sakyasingha Dasgupta, Yiyu Shi, Caiwen Ding
2021-02-12
Transformer Language Models with LSTM-based Cross-utterance Information Representation
G. Sun, C. Zhang, P. C. Woodland
2021-02-12
Proof Artifact Co-training for Theorem Proving with Language Models
Jesse Michael Han, Jason Rute, Yuhuai Wu, Edward W. Ayers, Stanislas Polu
2021-02-11
Text Compression-aided Transformer Encoding
Zuchao Li, Zhuosheng Zhang, Hai Zhao, Rui Wang, Kehai Chen, Masao Utiyama, Eiichiro Sumita
2021-02-11
NAST: Non-Autoregressive Spatial-Temporal Transformer for Time Series Forecasting
Kai Chen, Guang Chen, Dan Xu, Lijun Zhang, Yuyao Huang, Alois Knoll
2021-02-10
Joint Intent Detection and Slot Filling with Wheel-Graph Attention Networks
Pengfei Wei, Bi Zeng, Wenxiong Liao
2021-02-09
Conversational Query Rewriting with Self-supervised Learning
Hang Liu, Meng Chen, Youzheng Wu, Xiaodong He, BoWen Zhou
2021-02-09
Bayesian Transformer Language Models for Speech Recognition
Boyang Xue, Jianwei Yu, Junhao Xu, Shansong Liu, Shoukang Hu, Zi Ye, Mengzhe Geng, Xunying Liu, Helen Meng
2021-02-09
AuGPT: Dialogue with Pre-trained Language Models and Data Augmentation
Jonáš Kulhánek, Vojtěch Hudeček, Tomáš Nekvinda, Ondřej Dušek
2021-02-09
Point Cloud Transformers applied to Collider Physics
Vinicius Mikuni, Florencia Canelli
2021-02-09
Generating Fake Cyber Threat Intelligence Using Transformer-Based Models
Priyanka Ranade, Aritran Piplai, Sudip Mittal, Anupam Joshi, Tim Finin
2021-02-08
How True is GPT-2? An Empirical Analysis of Intersectional Occupational Biases
Hannah Kirk, Yennie Jun, Haider Iqbal, Elias Benussi, Filippo Volpin, Frederic A. Dreyer, Aleksandar Shtedritski, Yuki M. Asano
2021-02-08
Colorization Transformer
Manoj Kumar, Dirk Weissenborn, Nal Kalchbrenner
2021-02-08
TransReID: Transformer-based Object Re-Identification
Shuting He, Hao Luo, Pichao Wang, Fan Wang, Hao Li, Wei Jiang
2021-02-08
TransUNet: Transformers Make Strong Encoders for Medical Image Segmentation
Jieneng Chen, Yongyi Lu, Qihang Yu, Xiangde Luo, Ehsan Adeli, Yan Wang, Le Lu, Alan L. Yuille, Yuyin Zhou
2021-02-08
A Hybrid Task-Oriented Dialog System with Domain and Task Adaptive Pretraining
Boliang Zhang, Ying Lyu, Ning Ding, Tianhao Shen, Zhaoyang Jia, Kun Han, Kevin Knight
2021-02-08
Wake Word Detection with Streaming Transformers
Yiming Wang, Hang Lv, Daniel Povey, Lei Xie, Sanjeev Khudanpur
2021-02-08
Nyströmformer: A Nyström-Based Algorithm for Approximating Self-Attention
Yunyang Xiong, Zhanpeng Zeng, Rudrasis Chakraborty, Mingxing Tan, Glenn Fung, Yin Li, Vikas Singh
2021-02-07
Neural Data-to-Text Generation with LM-based Text Augmentation
Ernie Chang, Xiaoyu Shen, Dawei Zhu, Vera Demberg, Hui Su
2021-02-06
Jointly Improving Language Understanding and Generation with Quality-Weighted Weak Supervision of Automatic Labeling
Ernie Chang, Vera Demberg, Alex Marin
2021-02-06
ViLT: Vision-and-Language Transformer Without Convolution or Region Supervision
Wonjae Kim, Bokyung Son, Ildoo Kim
2021-02-05
PipeTransformer: Automated Elastic Pipelining for Distributed Training of Transformers
Chaoyang He, Shen Li, Mahdi Soltanolkotabi, Salman Avestimehr
2021-02-05
Understanding Emails and Drafting Responses -- An Approach Using GPT-3
Jonas Thiergart, Stefan Huber, Thomas Übellacker
2021-02-05
Understanding the Capabilities, Limitations, and Societal Impact of Large Language Models
Alex Tamkin, Miles Brundage, Jack Clark, Deep Ganguli
2021-02-04
Pitfalls of Static Language Modelling
Angeliki Lazaridou, Adhiguna Kuncoro, Elena Gribovskaya, Devang Agrawal, Adam Liska, Tayfun Terzi, Mai Gimenez, Cyprien de Masson d'Autume, Sebastian Ruder, Dani Yogatama, Kris Cao, Tomas Kocisky, Susannah Young, Phil Blunsom
2021-02-03
Neural Transfer Learning with Transformers for Social Science Text Analysis
Sandra Wankmüller
2021-02-03
Relaxed Transformer Decoders for Direct Action Proposal Generation
Jing Tan, Jiaqi Tang, LiMin Wang, Gangshan Wu
2021-02-03
Towards Natural and Controllable Cross-Lingual Voice Conversion Based on Neural TTS Model and Phonetic Posteriorgram
Shengkui Zhao, Hao Wang, Trung Hieu Nguyen, Bin Ma
2021-02-03
MUFASA: Multimodal Fusion Architecture Search for Electronic Health Records
Zhen Xu, David R. So, Andrew M. Dai
2021-02-03
Automated Query Reformulation for Efficient Search based on Query Logs From Stack Overflow
Kaibo Cao, Chunyang Chen, Sebastian Baltes, Christoph Treude, Xiang Chen
2021-02-01
GTAE: Graph-Transformer based Auto-Encoders for Linguistic-Constrained Text Style Transfer
Yukai Shi, Sen Zhang, Chenxing Zhou, Xiaodan Liang, Xiaojun Yang, Liang Lin
2021-02-01
"Is depression related to cannabis?": A knowledge-infused model for Entity and Relation Extraction with Limited Supervision
Kaushik Roy, Usha Lokala, Vedant Khandelwal, Amit Sheth
2021-02-01
Computational Performance Predictions for Deep Neural Network Training: A Runtime-Based Approach
Geoffrey X. Yu, Yubo Gao, Pavel Golikov, Gennady Pekhimenko
2021-01-31
Short Text Clustering with Transformers
Leonid Pugachev, Mikhail Burtsev
2021-01-31
Transition based Graph Decoder for Neural Machine Translation
Leshem Choshen, Omri Abend
2021-01-29
Synthesizing Monolingual Data for Neural Machine Translation
Benjamin Marie, Atsushi Fujita
2021-01-29
Tokens-to-Token ViT: Training Vision Transformers from Scratch on ImageNet
Li Yuan, Yunpeng Chen, Tao Wang, Weihao Yu, Yujun Shi, Zihang Jiang, Francis EH Tay, Jiashi Feng, Shuicheng Yan
2021-01-28
LSTM-SAKT: LSTM-Encoded SAKT-like Transformer for Knowledge Tracing
Takashi Oya, Shigeo Morishima
2021-01-28
Spatial-Channel Transformer Network for Trajectory Prediction on the Traffic Scenes
Jingwen Zhao, Xuanpeng Li, Qifan Xue, Weigong Zhang
2021-01-27
An explainable Transformer-based deep learning model for the prediction of incident heart failure
Shishir Rao, Yikuan Li, Rema Ramakrishnan, Abdelaali Hassaine, Dexter Canoy, John Cleland, Thomas Lukasiewicz, Gholamreza Salimi-Khorshidi, Kazem Rahimi
2021-01-27
Exploring multi-task multi-lingual learning of transformer models for hate speech and offensive speech identification in social media
Sudhanshu Mishra, Shivangi Prasad, Shubhanshu Mishra
2021-01-27
Attention Can Reflect Syntactic Structure (If You Let It)
Vinit Ravishankar, Artur Kulmizev, Mostafa Abdou, Anders Søgaard, Joakim Nivre
2021-01-26
CPTR: Full Transformer Network for Image Captioning
Wei Liu, Sihan Chen, Longteng Guo, Xinxin Zhu, Jing Liu
2021-01-26
Analyzing Zero-shot Cross-lingual Transfer in Supervised NLP Tasks
Hyunjin Choi, Judong Kim, Seongho Joe, Seungjai Min, Youngjune Gwon
2021-01-26
Randomized Deep Structured Prediction for Discourse-Level Processing
Manuel Widmoser, Maria Leonor Pacheco, Jean Honorio, Dan Goldwasser
2021-01-25
Multi-Task Time Series Forecasting With Shared Attention
Zekai Chen, Jiaze E, Xiao Zhang, Hao Sheng, Xiuzheng Cheng
2021-01-24
WangchanBERTa: Pretraining transformer-based Thai Language Models
Lalita Lowphansirikul, Charin Polpanumas, Nawat Jantrakulchai, Sarana Nutanong
2021-01-24
Training Multilingual Pre-trained Language Model with Byte-level Subwords
Junqiu Wei, Qun Liu, Yinpeng Guo, Xin Jiang
2021-01-23
Enriching Non-Autoregressive Transformer with Syntactic and Semantic Structures for Neural Machine Translation
Ye Liu, Yao Wan, Jian-Guo Zhang, Wenting Zhao, Philip S. Yu
2021-01-22
BERT Transformer model for Detecting Arabic GPT2 Auto-Generated Tweets
Fouzi Harrag, Maria Debbah, Kareem Darwish, Ahmed Abdelali
2021-01-22
DAF:re: A Challenging, Crowd-Sourced, Large-Scale, Long-Tailed Dataset For Anime Character Recognition
Edwin Arkel Rios, Wen-Huang Cheng, Bo-Cheng Lai
2021-01-21
Activity Graph Transformer for Temporal Action Localization
Megha Nawhal, Greg Mori
2021-01-21
Evaluating Multilingual Text Encoders for Unsupervised Cross-Lingual Retrieval
Robert Litschko, Ivan Vulić, Simone Paolo Ponzetto, Goran Glavaš
2021-01-21
PGT: Pseudo Relevance Feedback Using a Graph-Based Transformer
HongChien Yu, Zhuyun Dai, Jamie Callan
2021-01-20
UPDeT: Universal Multi-agent Reinforcement Learning via Policy Decoupling with Transformers
Siyi Hu, Fengda Zhu, Xiaojun Chang, Xiaodan Liang
2021-01-20
Active Learning for Sequence Tagging with Deep Pre-trained Models and Bayesian Uncertainty Estimates
Artem Shelmanov, Dmitri Puzyrev, Lyubov Kupriyanova, Denis Belyakov, Daniil Larionov, Nikita Khromov, Olga Kozlova, Ekaterina Artemova, Dmitry V. Dylov, Alexander Panchenko
2021-01-20
Open-Domain Conversational Search Assistant with Transformers
Rafael Ferreira, Mariana Leite, David Semedo, Joao Magalhaes
2021-01-20
Towards Facilitating Empathic Conversations in Online Mental Health Support: A Reinforcement Learning Approach
ASHISH SHARMA, Inna W. Lin, Adam S. Miner, David C. Atkins, Tim Althoff
2021-01-19
Fast Convergence of DETR with Spatially Modulated Co-Attention
Peng Gao, Minghang Zheng, Xiaogang Wang, Jifeng Dai, Hongsheng Li
2021-01-19
Inference for BART with Multinomial Outcomes
Yizhen Xu, Joseph W. Hogan, Michael J. Daniels, Rami Kantor, Ann Mwangi
2021-01-18
Dual-Level Collaborative Transformer for Image Captioning
Yunpeng Luo, Jiayi Ji, Xiaoshuai Sun, Liujuan Cao, Yongjian Wu, Feiyue Huang, Chia-Wen Lin, Rongrong Ji
2021-01-16
Match-Ignition: Plugging PageRank into Transformer for Long-form Text Matching
Liang Pang, Yanyan Lan, Xueqi Cheng
2021-01-16
Transformer-Based Models for Question Answering on COVID19
Hillary Ngai, Yoona Park, John Chen, Mahboobeh Parsapoor
2021-01-16
Persistent Anti-Muslim Bias in Large Language Models
Abubakar Abid, Maheen Farooqi, James Zou
2021-01-14
Exploration of Visual Features and their weighted-additive fusion for Video Captioning
Praveen S V, Akhilesh Bharadwaj, Harsh Raj, Janhavi Dadhania, Ganesh Samarth C. A, Nikhil Pareek, S R M Prasanna
2021-01-14
Training Data Leakage Analysis in Language Models
Huseyin A. Inan, Osman Ramadan, Lukas Wutschitz, Daniel Jones, Victor Rühle, James Withers, Robert Sim
2021-01-14
Coarse and Fine-Grained Hostility Detection in Hindi Posts using Fine Tuned Multilingual Embeddings
Arkadipta De, Venkatesh E, Kaushal Kumar Maurya, Maunendra Sankar Desarkar
2021-01-13
Neural News Recommendation with Negative Feedback
Chuhan Wu, Fangzhao Wu, Yongfeng Huang, Xing Xie
2021-01-12
Fake News Detection System using XLNet model with Topic Distributions: CONSTRAINT@AAAI2021 Shared Task
Akansha Gautam, Venktesh V, Sarah Masud
2021-01-12
Spherical Transformer: Adapting Spherical Signal to CNNs
Haikuan Du, Hui Cao, Shen Cai, Junchi Yan, Siyu Zhang
2021-01-11
Revisiting Mahalanobis Distance for Transformer-Based Out-of-Domain Detection
Alexander Podolskiy, Dmitry Lipin, Andrey Bout, Ekaterina Artemova, Irina Piontkovskaya
2021-01-11
Investigating the Vision Transformer Model for Image Retrieval Tasks
Socratis Gkelios, Yiannis Boutalis, Savvas A. Chatzichristofis
2021-01-11
BERT-GT: Cross-sentence n-ary relation extraction with BERT and Graph Transformer
Po-Ting Lai, Zhiyong Lu
2021-01-11
Channel Boosting Feature Ensemble for Radar-based Object Detection
Shoaib Azam, Farzeen Munir, Moongu Jeon
2021-01-10
Cisco at AAAI-CAD21 shared task: Predicting Emphasis in Presentation Slides using Contextualized Embeddings
Sreyan Ghosh, Sonal Kumar, Harsh Jalan, Hemant Yadav, Rajiv Ratn Shah
2021-01-10
Trankit: A Light-Weight Transformer-based Toolkit for Multilingual Natural Language Processing
Minh Van Nguyen, Viet Lai, Amir Pouran Ben Veyseh, Thien Huu Nguyen
2021-01-09
Leveraging Multilingual Transformers for Hate Speech Detection
Sayar Ghosh Roy, Ujwal Narayan, Tathagata Raha, Zubair Abid, Vasudeva Varma
2021-01-08
TrackFormer: Multi-Object Tracking with Transformers
Tim Meinhardt, Alexander Kirillov, Laura Leal-Taixe, Christoph Feichtenhofer
2021-01-07
Compound Word Transformer: Learning to Compose Full-Song Music over Dynamic Directed Hypergraphs
Wen-Yi Hsiao, Jen-Yu Liu, Yin-Cheng Yeh, Yi-Hsuan Yang
2021-01-07
Transformer-based approach towards music emotion recognition from lyrics
Yudhik Agrawal, Ramaguru Guru Ravi Shanker, Vinoo Alluri
2021-01-06
I-BERT: Integer-only BERT Quantization
Sehoon Kim, Amir Gholami, Zhewei Yao, Michael W. Mahoney, Kurt Keutzer
2021-01-05
AutoDropout: Learning Dropout Patterns to Regularize Deep Networks
Hieu Pham, Quoc V. Le
2021-01-05
Transformers in Vision: A Survey
Salman Khan, Muzammal Naseer, Munawar Hayat, Syed Waqas Zamir, Fahad Shahbaz Khan, Mubarak Shah
2021-01-04
Transformers and Transfer Learning for Improving Portuguese Semantic Role Labeling
Sofia Oliveira, Daniel Loureiro, Alípio Jorge
2021-01-04
An Efficient Transformer Decoder with Compressed Sub-layers
Yanyang Li, Ye Lin, Tong Xiao, Jingbo Zhu
2021-01-03
Improving Sequence-to-Sequence Pre-training via Sequence Span Rewriting
Wangchunshu Zhou, Tao Ge, Ke Xu, Furu Wei
2021-01-02
KM-BART: Knowledge Enhanced Multimodal BART for Visual Commonsense Generation
Yiran Xing, Zai Shi, Zhao Meng, Yunpu Ma, Roger Wattenhofer
2021-01-02
Polyjuice: Automated, General-purpose Counterfactual Generation
Tongshuang Wu, Marco Tulio Ribeiro, Jeffrey Heer, Daniel S. Weld
2021-01-01
Subformer: Exploring Weight Sharing for Parameter Efficiency in Generative Transformers
Machel Reid, Edison Marrese-Taylor, Yutaka Matsuo
2021-01-01
Representation and Bias in Multilingual NLP: Insights from Controlled Experiments on Conditional Language Modeling
Anonymous
2021-01-01
Transformer protein language models are unsupervised structure learners
Anonymous
2021-01-01
Syntactic Relevance XLNet Word Embedding Generation in Low-Resource Machine Translation
Anonymous
2021-01-01
Single Layers of Attention Suffice to Predict Protein Contacts
Anonymous
2021-01-01
Share or Not? Learning to Schedule Language-Specific Capacity for Multilingual Translation
Anonymous
2021-01-01
Information-theoretic Vocabularization via Optimal Transport
Anonymous
2021-01-01
Non-iterative Parallel Text Generation via Glancing Transformer
Anonymous
2021-01-01
Transformers are Deep Infinite-Dimensional Non-Mercer Binary Kernel Machines
Anonymous
2021-01-01
Memory Representation in Transformer
Anonymous
2021-01-01
Deep Representational Re-tuning using Contrastive Tension
Anonymous
2021-01-01
Cluster-Former: Clustering-based Sparse Transformer for Question Answering
Anonymous
2021-01-01
You Only Sample (Almost) Once: Linear Cost Self-Attention Via Bernoulli Sampling
Anonymous
2021-01-01
An Attention Free Transformer
Anonymous
2021-01-01
Self-supervised and Supervised Joint Training for Resource-rich Machine Translation
Anonymous
2021-01-01
Representational correlates of hierarchical phrase structure in deep language models
Anonymous
2021-01-01
Pre-training Text-to-Text Transformers to Write and Reason with Concepts
Anonymous
2021-01-01
Subformer: A Parameter Reduced Transformer
Anonymous
2021-01-01
Pretrain Knowledge-Aware Language Models
Anonymous
2021-01-01
Adding Recurrence to Pretrained Transformers
Anonymous
2021-01-01
Trans-Caps: Transformer Capsule Networks with Self-attention Routing
Anonymous
2021-01-01
Improving Generalizability of Protein Sequence Models via Data Augmentations
Anonymous
2021-01-01
AutoLRS: Automatic Learning-Rate Schedule by Bayesian Optimization on the Fly
Anonymous
2021-01-01
Synthesizer: Rethinking Self-Attention for Transformer Models
Anonymous
2021-01-01
Block Skim Transformer for Efficient Question Answering
Anonymous
2021-01-01
Generalizing Tree Models for Improving Prediction Accuracy
Anonymous
2021-01-01
HyperGrid Transformers: Towards A Single Model for Multiple Tasks
Anonymous
2021-01-01
Long Range Arena: A Benchmark for Efficient Transformers
Anonymous
2021-01-01
AriEL: Volume Coding for Sentence Generation Comparisons
Anonymous
2021-01-01
PhraseTransformer: Self-Attention using Local Context for Semantic Parsing
Anonymous
2021-01-01
KETG: A Knowledge Enhanced Text Generation Framework
Anonymous
2021-01-01
Analogical Reasoning for Visually Grounded Compositional Generalization
Anonymous
2021-01-01
Exploring Routing Strategies for Multilingual Mixture-of-Experts Models
Anonymous
2021-01-01
Parameterization of Hypercomplex Multiplications
Anonymous
2021-01-01
Do Transformers Understand Polynomial Simplification?
Anonymous
2021-01-01
Scalable Transformers for Neural Machine Translation
Anonymous
2021-01-01
Transformer-QL: A Step Towards Making Transformer Network Quadratically Large
Anonymous
2021-01-01
VTNet: Visual Transformer Network for Object Goal Navigation
Anonymous
2021-01-01
On Position Embeddings in BERT
Anonymous
2021-01-01
Post-Training Weighted Quantization of Neural Networks for Language Models
Anonymous
2021-01-01
Predictive Attention Transformer: Improving Transformer with Attention Map Prediction
Anonymous
2021-01-01
Improving Machine Translation by Searching Skip Connections Efficiently
Anonymous
2021-01-01
UPDeT: Universal Multi-agent RL via Policy Decoupling with Transformers
Anonymous
2021-01-01
Transformers satisfy
Anonymous
2021-01-01
Transforming Recurrent Neural Networks with Attention and Fixed-point Equations
Anonymous
2021-01-01
How Multipurpose Are Language Models?
Anonymous
2021-01-01
Prefix-Tuning: Optimizing Continuous Prompts for Generation
Xiang Lisa Li, Percy Liang
2021-01-01
A Multi-modal Deep Learning Model for Video Thumbnail Selection
Zhifeng Yu, Nanchun Shi
2020-12-31
EarlyBERT: Efficient BERT Training via Early-bird Lottery Tickets
Xiaohan Chen, Yu Cheng, Shuohang Wang, Zhe Gan, Zhangyang Wang, Jingjing Liu
2020-12-31
Studying Strategically: Learning to Mask for Closed-book QA
Qinyuan Ye, Belinda Z. Li, Sinong Wang, Benjamin Bolte, Hao Ma, Wen-tau Yih, Xiang Ren, Madian Khabsa
2020-12-31
Conditional Generation of Temporally-ordered Event Sequences
Shih-ting Lin, Nathanael Chambers, Greg Durrett
2020-12-31
Directed Beam Search: Plug-and-Play Lexically Constrained Language Generation
Damian Pascual, Beni Egressy, Florian Bolli, Roger Wattenhofer
2020-12-31
Making Pre-trained Language Models Better Few-shot Learners
Tianyu Gao, Adam Fisch, Danqi Chen
2020-12-31
Fully Non-autoregressive Neural Machine Translation: Tricks of the Trade
Jiatao Gu, Xiang Kong
2020-12-31
MiniLMv2: Multi-Head Self-Attention Relation Distillation for Compressing Pretrained Transformers
Wenhui Wang, Hangbo Bao, Shaohan Huang, Li Dong, Furu Wei
2020-12-31
Verb Knowledge Injection for Multilingual Event Processing
Olga Majewska, Ivan Vulić, Goran Glavaš, Edoardo M. Ponti, Anna Korhonen
2020-12-31
XLM-T: Scaling up Multilingual Machine Translation with Pretrained Cross-lingual Transformer Encoders
Shuming Ma, Jian Yang, Haoyang Huang, Zewen Chi, Li Dong, Dongdong Zhang, Hany Hassan Awadalla, Alexandre Muzio, Akiko Eriguchi, Saksham Singhal, Xia Song, Arul Menezes, Furu Wei
2020-12-31
Revisiting Robust Neural Machine Translation: A Transformer Case Study
Peyman Passban, Puneeth S. M. Saladi, Qun Liu
2020-12-31
TransTrack: Multiple Object Tracking with Transformer
Peize Sun, Jinkun Cao, Yi Jiang, Rufeng Zhang, Enze Xie, Zehuan Yuan, Changhu Wang, Ping Luo
2020-12-31
The Pile: An 800GB Dataset of Diverse Text for Language Modeling
Leo Gao, Stella Biderman, Sid Black, Laurence Golding, Travis Hoppe, Charles Foster, Jason Phang, Horace He, Anish Thite, Noa Nabeshima, Shawn Presser, Connor Leahy
2020-12-31
Transformer for Image Quality Assessment
Junyong You, Jari Korhonen
2020-12-30
Optimizing Deeper Transformers on Small Datasets: An Application on Text-to-SQL Semantic Parsing
Peng Xu, Wei Yang, Wenjie Zi, Keyi Tang, Chengyang Huang, Jackie Chi Kit Cheung, Yanshuai Cao
2020-12-30
Unnatural Language Inference
Koustuv Sinha, Prasanna Parthasarathi, Joelle Pineau, Adina Williams
2020-12-30
Kaleidoscope: An Efficient, Learnable Representation For All Structured Linear Maps
Tri Dao, Nimit S. Sohoni, Albert Gu, Matthew Eichhorn, Amit Blonder, Megan Leszczynski, Atri Rudra, Christopher Ré
2020-12-29
Robust Dialogue Utterance Rewriting as Sequence Tagging
Jie Hao, Linfeng Song, LiWei Wang, Kun Xu, Zhaopeng Tu, Dong Yu
2020-12-29
SIT3: Code Summarization with Structure-Induced Transformer
Hongqiu Wu, Hai Zhao, Min Zhang
2020-12-29
LayoutLMv2: Multi-modal Pre-training for Visually-Rich Document Understanding
Yang Xu, Yiheng Xu, Tengchao Lv, Lei Cui, Furu Wei, Guoxin Wang, Yijuan Lu, Dinei Florencio, Cha Zhang, Wanxiang Che, Min Zhang, Lidong Zhou
2020-12-29
A Hierarchical Transformer with Speaker Modeling for Emotion Recognition in Conversation
Jiangnan Li, Zheng Lin, Peng Fu, Qingyi Si, Weiping Wang
2020-12-29
Lattice-Free MMI Adaptation Of Self-Supervised Pretrained Acoustic Models
Apoorv Vyas, Srikanth Madikeri, Hervé Bourlard
2020-12-28
TransPose: Towards Explainable Human Pose Estimation by Transformer
Sen Yang, Zhibin Quan, Mu Nie, Wankou Yang
2020-12-28
Red Dragon AI at TextGraphs 2020 Shared Task: LIT: LSTM-Interleaved Transformer for Multi-Hop Explanation Ranking
Yew Ken Chia, Sam Witteveen, Martin Andrews
2020-12-28
Syntax-Enhanced Pre-trained Model
Zenan Xu, Daya Guo, Duyu Tang, Qinliang Su, Linjun Shou, Ming Gong, Wanjun Zhong, Xiaojun Quan, Nan Duan, Daxin Jiang
2020-12-28
Portfolio Optimization with 2D Relative-Attentional Gated Transformer
Tae Wan Kim, Matloob Khushi
2020-12-27
SG-Net: Syntax Guided Transformer for Language Representation
Zhuosheng Zhang, Yuwei Wu, Junru Zhou, Sufeng Duan, Hai Zhao, Rui Wang
2020-12-27
Learning Light-Weight Translation Models from Deep Transformer
Bei Li, Ziyang Wang, Hui Liu, Quan Du, Tong Xiao, Chunliang Zhang, Jingbo Zhu
2020-12-27
I like fish, especially dolphins: Addressing Contradictions in Dialogue Modeling
Yixin Nie, Mary Williamson, Mohit Bansal, Douwe Kiela, Jason Weston
2020-12-24
Detecting Hateful Memes Using a Multimodal Deep Ensemble
Vlad Sandulescu
2020-12-24
Future-Guided Incremental Transformer for Simultaneous Translation
Shaolei Zhang, Yang Feng, Liangyou Li
2020-12-23
Domain Adaptation of NMT models for English-Hindi Machine Translation Task at AdapMT ICON 2020
Ramchandra Joshi, Rushabh Karnavat, Kaustubh Jirapure, Raviraj Joshi
2020-12-22
Uncertainty and Surprisal Jointly Deliver the Punchline: Exploiting Incongruity-Based Features for Humor Recognition
Yubo Xie, Junze Li, Pearl Pu
2020-12-22
Molecular CT: Unifying Geometry and Representation Learning for Molecules at Different Scales
Jun Zhang, Yaqiang Zhou, Yao-Kun Lei, Yi Isaac Yang, Yi Qin Gao
2020-12-22
Multi-Head Self-Attention with Role-Guided Masks
Dongsheng Wang, Casper Hansen, Lucas Chaves Lima, Christian Hansen, Maria Maistro, Jakob Grue Simonsen, Christina Lioma
2020-12-22
Sub-Linear Memory: How to Make Performers SLiM
Valerii Likhosherstov, Krzysztof Choromanski, Jared Davis, Xingyou Song, Adrian Weller
2020-12-21
3D Object Detection with Pointformer
Xuran Pan, Zhuofan Xia, Shiji Song, Li Erran Li, Gao Huang
2020-12-21
Encoding Syntactic Knowledge in Transformer Encoder for Intent Detection and Slot Filling
Jixuan Wang, Kai Wei, Martin Radfar, Weiwei Zhang, Clement Chung
2020-12-21
RealFormer: Transformer Likes Residual Attention
Ruining He, Anirudh Ravula, Bhargav Kanagal, Joshua Ainslie
2020-12-21
Adaptive Bi-directional Attention: Exploring Multi-Granularity Representations for Machine Reading Comprehension
Nuo Chen, Fenglin Liu, Chenyu You, Peilin Zhou, Yuexian Zou
2020-12-20
Breaking Writer's Block: Low-cost Fine-tuning of Natural Language Generation Models
Alexandre Duval, Thomas Lamson, Gael de Leseleuc de Kerouara, Matthias Gallé
2020-12-19
NeurST: Neural Speech Translation Toolkit
Chengqi Zhao, Mingxuan Wang, Lei LI
2020-12-18
Transformer Interpretability Beyond Attention Visualization
Hila Chefer, Shir Gur, Lior Wolf
2020-12-17
End-to-end Deep Object Tracking with Circular Loss Function for Rotated Bounding Box
Vladislav Belyaev, Aleksandra Malysheva, Aleksei Shpilman
2020-12-17
Pct: Point cloud transformer
Meng-Hao Guo, Jun-Xiong Cai, Zheng-Ning Liu, Tai-Jiang Mu, Ralph R. Martin, Shi-Min Hu
2020-12-17
A Generalization of Transformer Networks to Graphs
Vijay Prakash Dwivedi, Xavier Bresson
2020-12-17
Toward Transformer-Based Object Detection
Josh Beal, Eric Kim, Eric Tzeng, Dong Huk Park, Andrew Zhai, Dmitry Kislyuk
2020-12-17
Taming Transformers for High-Resolution Image Synthesis
Patrick Esser, Robin Rombach, Björn Ommer
2020-12-17
Point Transformer
Hengshuang Zhao, Li Jiang, Jiaya Jia, Philip Torr, Vladlen Koltun
2020-12-16
DialogXL: All-in-One XLNet for Multi-Party Conversation Emotion Recognition
Weizhou Shen, Junqing Chen, Xiaojun Quan, Zhixian Xie
2020-12-16
Focusing More on Conflicts with Mis-Predictions Helps Language Pre-Training
Chen Xing, Wencong Xiao, Yong Li, Wei Lin
2020-12-16
Query expansion with artificially generated texts
Vincent Claveau
2020-12-16
Revisiting Linformer with a modified self-attention with linear complexity
Madhusudan Verma
2020-12-16
High throughput screening with machine learning
Oleksandr Gurbych, Maksym Druchok, Dzvenymyra Yarish, Sofiya Garkot
2020-12-15
Traditional IR rivals neural models on the MS MARCO Document Ranking Leaderboard
Leonid Boytsov
2020-12-15
RecipeNLG: A Cooking Recipes Dataset for Semi-Structured Text Generation
Michał Bień, Michał Gilski, Martyna Maciejewska, Wojciech Taisner, Dawid Wiśniewski, Agnieszka Ławrynowicz
2020-12-15
Informer: Beyond Efficient Transformer for Long Sequence Time-Series Forecasting
Haoyi Zhou, Shanghang Zhang, Jieqi Peng, Shuai Zhang, JianXin Li, Hui Xiong, Wancai Zhang
2020-12-14
Extracting Training Data from Large Language Models
Nicholas Carlini, Florian Tramer, Eric Wallace, Matthew Jagielski, Ariel Herbert-Voss, Katherine Lee, Adam Roberts, Tom Brown, Dawn Song, Ulfar Erlingsson, Alina Oprea, Colin Raffel
2020-12-14
Contrastive Learning with Adversarial Perturbations for Conditional Text Generation
Seanie Lee, Dong Bok Lee, Sung Ju Hwang
2020-12-14
Reasoning in Dialog: Improving Response Generation by Context Reading Comprehension
Xiuying Chen, Zhi Cui, Jiayi Zhang, Chen Wei, Jianwei Cui, Bin Wang, Dongyan Zhao, Rui Yan
2020-12-14
Improving Image Captioning by Leveraging Intra- and Inter-layer Global Representation in Transformer Network
Jiayi Ji, Yunpeng Luo, Xiaoshuai Sun, Fuhai Chen, Gen Luo, Yongjian Wu, Yue Gao, Rongrong Ji
2020-12-13
KVL-BERT: Knowledge Enhanced Visual-and-Linguistic BERT for Visual Commonsense Reasoning
Dandan Song, Siyi Ma, Zhanchen Sun, Sicheng Yang, Lejian Liao
2020-12-13
Discriminative Pre-training for Low Resource Title Compression in Conversational Grocery
Snehasish Mukherjee, Phaniram Sayapaneni, Shankar Subramanya
2020-12-13
DETR for Crowd Pedestrian Detection
Matthieu Lin, Chuming Li, Xingyuan Bu, Ming Sun, Chen Lin, Junjie Yan, Wanli Ouyang, Zhidong Deng
2020-12-12
Yelp Review Rating Prediction: Machine Learning and Deep Learning Models
Zefang Liu
2020-12-12
Spatial Temporal Transformer Network for Skeleton-based Action Recognition
Chiara Plizzari, Marco Cannici, Matteo Matteucci
2020-12-11
Hardware Beyond Backpropagation: a Photonic Co-Processor for Direct Feedback Alignment
Julien Launay, Iacopo Poli, Kilian Müller, Gustave Pariente, Igor Carron, Laurent Daudet, Florent Krzakala, Sylvain Gigan
2020-12-11
TabTransformer: Tabular Data Modeling Using Contextual Embeddings
Xin Huang, Ashish Khetan, Milan Cvitkovic, Zohar Karnin
2020-12-11
As good as new. How to successfully recycle English GPT-2 to make models for other languages
Wietse de Vries, Malvina Nissim
2020-12-10
Towards Neural Programming Interfaces
|