Transformer

Introduced by Vaswani et al. in Attention Is All You Need

A Transformer is a model architecture that eschews recurrence and instead relies entirely on an attention mechanism to draw global dependencies between input and output. Before the Transformer, the dominant sequence transduction models were complex recurrent or convolutional neural networks built around an encoder and a decoder. The Transformer also uses an encoder and a decoder, but because attention relates all positions of a sequence at once rather than step by step, it allows significantly more parallelization than recurrent models and connects distant positions more directly than convolutional ones.

Source: Attention Is All You Need
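
As a rough illustration of the attention mechanism described above, here is a minimal sketch of single-head scaled dot-product attention, softmax(QK^T / sqrt(d_k))V, in plain Python. It assumes no learned projections, no masking, and no positional encoding, and the function names are illustrative rather than taken from any library. Each query is handled independently of the others, which is what makes the computation easy to parallelize.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def scaled_dot_product_attention(queries, keys, values):
    """Return one output vector per query: a softmax-weighted average of `values`.

    queries: list of n_q vectors of length d_k
    keys:    list of n_k vectors of length d_k
    values:  list of n_k vectors of length d_v
    """
    d_k = len(keys[0])
    d_v = len(values[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        weights = softmax(scores)
        # Weighted sum of the value vectors.
        out = [sum(w * v[j] for w, v in zip(weights, values)) for j in range(d_v)]
        outputs.append(out)
    return outputs

# Toy self-attention: the same three 2-d token vectors act as queries, keys, and values.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
attended = scaled_dot_product_attention(tokens, tokens, tokens)
```

In a full Transformer the queries, keys, and values would come from learned linear projections of the token representations, and several such heads would run side by side.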

Latest Papers

Paper title, authors, date
Deep Multi-View Spatiotemporal Virtual Graph Neural Network for Significant Citywide Ride-hailing Demand Prediction
Guangyin Jin, Zhexu Xi, Hengyu Sha, Yanghe Feng, Jincai Huang
2020-07-30
Interpretable Contextual Team-aware Item Recommendation: Application in Multiplayer Online Battle Arena Games
Andrés Villa, Vladimir Araujo, Francisca Cattan, Denis Parra
2020-07-30
FiSSA at SemEval-2020 Task 9: Fine-tuned For Feelings
Bertelt Braaksma, Richard Scholtens, Stan van Suijlekom, Remy Wang, Ahmet Üstün
2020-07-24
DeepSVG: A Hierarchical Generative Network for Vector Graphics Animation
Alexandre Carlier, Martin Danelljan, Alexandre Alahi, Radu Timofte
2020-07-22
CrossTransformers: spatially-aware few-shot transfer
Carl Doersch, Ankush Gupta, Andrew Zisserman
2020-07-22
Analogical Reasoning for Visually Grounded Language Acquisition
Bo Wu, Haoyu Qin, Alireza Zareian, Carl Vondrick, Shih-Fu Chang
2020-07-22
SliceOut: Training Transformers and CNNs faster while using less memory
Pascal Notin, Aidan N. Gomez, Joanna Yoo, Yarin Gal
2020-07-21
Neural Machine Translation with Error Correction
Kaitao Song, Xu Tan, Jianfeng Lu
2020-07-21
Learning Joint Spatial-Temporal Transformations for Video Inpainting
Yanhong Zeng, Jianlong Fu, Hongyang Chao
2020-07-20
Conformer-Kernel with Query Term Independence for Document Retrieval
Bhaskar Mitra, Sebastian Hofstatter, Hamed Zamani, Nick Craswell
2020-07-20
Temporal Pointwise Convolutional Networks for Length of Stay Prediction in the Intensive Care Unit
Emma Rocheteau, Pietro Liò, Stephanie Hyland
2020-07-18
Feature Pyramid Transformer
Dong Zhang, Hanwang Zhang, Jinhui Tang, Meng Wang, Xiansheng Hua, Qianru Sun
2020-07-18
Deep Learning Based Traffic Surveillance System For Missing and Suspicious Car Detection
K. V. Kadambari, Vishnu Vardhan Nimmalapudi
2020-07-17
CTC-Segmentation of Large Corpora for German End-to-end Speech Recognition
Ludwig Kürzinger, Dominik Winkelbauer, Lujun Li, Tobias Watzel, Gerhard Rigoll
2020-07-17
The Monte Carlo Transformer: a stochastic self-attention model for sequence prediction
Alice Martin, Charles Ollion, Florian Strub, Sylvain Le Corff, Olivier Pietquin
2020-07-15
Deep Transformer based Data Augmentation with Subword Units for Morphologically Rich Online ASR
Balázs Tarján, György Szaszák, Tibor Fegyó, Péter Mihajlik
2020-07-14
Contextualized Code Representation Learning for Commit Message Generation
Lun Yiu Nie, Cuiyun Gao, Zhicong Zhong, Wai Lam, Yang Liu, Zenglin Xu
2020-07-14
Emoji Prediction: Extensions and Benchmarking
Weicheng Ma, Ruibo Liu, Lili Wang, Soroush Vosoughi
2020-07-14
Paranoid Transformer: Reading Narrative of Madness as Computational Approach to Creativity
Yana Agafonova, Alexey Tikhonov, Ivan P. Yamshchikov
2020-07-13
Transformer with Depth-Wise LSTM
Hongfei Xu, Qiuhui Liu, Deyi Xiong, Josef van Genabith
2020-07-13
Sparse Graph to Sequence Learning for Vision Conditioned Long Textual Sequence Generation
Aditya Mogadala, Marius Mosbach, Dietrich Klakow
2020-07-12
TERA: Self-Supervised Learning of Transformer Encoder Representation for Speech
Andy T. Liu, Shang-Wen Li, Hung-yi Lee
2020-07-12
Sequence Generation with Mixed Representations
Lijun Wu, Shufang Xie, Yingce Xia, Fan Yang, Tao Qin, Jianhuang Lai, Tie-Yan Liu
2020-07-11
BISON:BM25-weighted Self-Attention Framework for Multi-Fields Document Search
Xuan Shan, Chuanjie Liu, Yiqian Xia, Qi Chen, Yusi Zhang, Angen Luo, Yuxiang Luo
2020-07-10
DeepSinger: Singing Voice Synthesis with Data Mined From the Web
Yi Ren, Xu Tan, Tao Qin, Jian Luan, Zhou Zhao, Tie-Yan Liu
2020-07-09
Advances of Transformer-Based Models for News Headline Generation
Alexey Bukhtiyarov, Ilya Gusev
2020-07-09
The Go Transformer: Natural Language Modeling for Game Play
Matthew Ciolino, David Noever, Josh Kalin
2020-07-07
Do Transformers Need Deep Long-Range Memory
Jack W. Rae, Ali Razavi
2020-07-07
Relevance Transformer: Generating Concise Code Snippets with Relevance Feedback
Carlos Gemmell, Federico Rossetto, Jeffrey Dalton
2020-07-06
Learning to Segment Anatomical Structures Accurately from One Exemplar
Yuhang Lu, Weijian Li, Kang Zheng, Yirui Wang, Adam P. Harrison, Chihung Lin, Song Wang, Jing Xiao, Le Lu, Chang-Fu Kuo, Shun Miao
2020-07-06
Abstractive and mixed summarization for long-single documents
Roger Barrull, Jugal Kalita
2020-07-03
Self-Attention Guided Copy Mechanism for Abstractive Summarization
Song Xu, Haoran Li, Peng Yuan, Youzheng Wu, Xiaodong He, Bowen Zhou
2020-07-01
Multimodal Transformer for Multimodal Machine Translation
Shaowei Yao, Xiaojun Wan
2020-07-01
Paraphrase Generation by Learning How to Edit from Samples
Amirhossein Kazemnejad, Mohammadreza Salehi, Mahdieh Soleymani Baghshah
2020-07-01
Dependency Graph Enhanced Dual-transformer Structure for Aspect-based Sentiment Classification
Hao Tang, Donghong Ji, Chenliang Li, Qiji Zhou
2020-07-01
In Neural Machine Translation, What Does Transfer Learning Transfer?
Alham Fikri Aji, Nikolay Bogoychev, Kenneth Heafield, Rico Sennrich
2020-07-01
Feature Projection for Improved Text Classification
Qi Qin, Wenpeng Hu, Bing Liu
2020-07-01
Addressing Posterior Collapse with Mutual Information for Improved Variational Neural Machine Translation
Arya D. McCarthy, Xian Li, Jiatao Gu, Ning Dong
2020-07-01
DIALOGPT : Large-Scale Generative Pre-training for Conversational Response Generation
Yizhe Zhang, Siqi Sun, Michel Galley, Yen-Chun Chen, Chris Brockett, Xiang Gao, Jianfeng Gao, Jingjing Liu, Bill Dolan
2020-07-01
Combining Subword Representations into Word-level Representations in the Transformer Architecture
Noe Casas, Marta R. Costa-jussà, José A. R. Fonollosa
2020-07-01
Robust Neural Machine Translation with ASR Errors
Haiyang Xue, Yang Feng, Shuhao Gu, Wei Chen
2020-07-01
An empirical investigation of neural methods for content scoring of science explanations
Brian Riordan, Sarah Bichler, Allison Bradford, Jennifer King Chen, Korah Wiley, Libby Gerard, Marcia C. Linn
2020-07-01
Neural Transduction of Letter Position Dyslexia using an Anagram Matrix Representation
Avi Bleiweiss
2020-07-01
Character aware models with similarity learning for metaphor detection
Tarun Kumar, Yashvardhan Sharma
2020-07-01
A Transformer Approach to Contextual Sarcasm Detection in Twitter
Hunter Gregory, Steven Li, Pouya Mohammadi, Natalie Tarn, Rachel Draelos, Cynthia Rudin
2020-07-01
Adaptation of Multilingual Transformer Encoder for Robust Enhanced Universal Dependency Parsing
Han He, Jinho D. Choi
2020-07-01
KIT's IWSLT 2020 SLT Translation System
Ngoc-Quan Pham, Felix Schneider, Tuan-Nam Nguyen, Thanh-Le Ha, Thai Son Nguyen, Maximilian Awiszus, Sebastian Stüker, Alexander Waibel
2020-07-01
End-to-End Simultaneous Translation System for IWSLT2020 Using Modality Agnostic Meta-Learning
Hou Jeung Han, Mohd Abbas Zaidi, Sathish Reddy Indurthi, Nikhil Kumar Lakumarapu, Beomseok Lee, Sangha Kim
2020-07-01
End-to-End Offline Speech Translation System for IWSLT 2020 using Modality Agnostic Meta-Learning
Nikhil Kumar Lakumarapu, Beomseok Lee, Sathish Reddy Indurthi, Hou Jeung Han, Mohd Abbas Zaidi, Sangha Kim
2020-07-01
SRPOL's System for the IWSLT 2020 End-to-End Speech Translation Task
Tomasz Potapczyk, Pawel Przybysz
2020-07-01
The AFRL IWSLT 2020 Systems: Work-From-Home Edition
Brian Ore, Eric Hansen, Tim Anderson, Jeremy Gwinnup
2020-07-01
OPPO's Machine Translation System for the IWSLT 2020 Open Domain Translation Task
Qian Zhang, Xiaopu Li, Dawei Dang, Tingxun Shi, Di Ai, Zhengshan Xue, Jie Hao
2020-07-01
CASIA's System for IWSLT 2020 Open Domain Translation
Qian Wang, Yuchen Liu, Cong Ma, Yu Lu, Yining Wang, Long Zhou, Yang Zhao, Jiajun Zhang, Chengqing Zong
2020-07-01
Deep Blue Sonics' Submission to IWSLT 2020 Open Domain Translation Task
Enmin Su, Yi Ren
2020-07-01
University of Tsukuba's Machine Translation System for IWSLT20 Open Domain Translation Task
Hongyi Cui, Yizhen Wei, Shohei Iida, Takehito Utsuro, Masaaki Nagata
2020-07-01
Xiaomi's Submissions for IWSLT 2020 Open Domain Translation Task
Yuhui Sun, Mengxue Guo, Xiang Li, Jianwei Cui, Bin Wang
2020-07-01
The HW-TSC Video Speech Translation System at IWSLT 2020
Minghan Wang, Hao Yang, Yao Deng, Ying Qin, Lizhi Lei, Daimeng Wei, Hengchao Shang, Ning Xie, Xiaochun Li, Jiaxian Guo
2020-07-01
Towards Stream Translation: Adaptive Computation Time for Simultaneous Machine Translation
Felix Schneider, Alexander Waibel
2020-07-01
Compressing Neural Machine Translation Models with 4-bit Precision
Alham Fikri Aji, Kenneth Heafield
2020-07-01
Training and Inference Methods for High-Coverage Neural Machine Translation
Michael Yang, Yixin Liu, Rahul Mayuranath
2020-07-01
Expand and Filter: CUNI and LMU Systems for the WNGT 2020 Duolingo Shared Task
Jindřich Libovický, Zdeněk Kasner, Jindřich Helcl, Ondřej Dušek
2020-07-01
The NiuTrans System for WNGT 2020 Efficiency Task
Chi Hu, Bei Li, Yinqiao Li, Ye Lin, Yanyang Li, Chenglong Wang, Tong Xiao, Jingbo Zhu
2020-07-01
Efficient and High-Quality Neural Machine Translation with OpenNMT
Guillaume Klein, Dakun Zhang, Clément Chouteau, Josep Crego, Jean Senellart
2020-07-01
Improving Document-Level Neural Machine Translation with Domain Adaptation
Sami Ul Haq, Sadaf Abdul Rauf, Arslan Shoukat, Noor-e-Hira
2020-07-01
CopyBERT: A Unified Approach to Question Generation with Self-Attention
Stalin Varanasi, Saadullah Amin, Guenter Neumann
2020-07-01
How to Tame Your Data: Data Augmentation for Dialog State Tracking
Adam Summerville, Jordan Hashemi, James Ryan, William Ferguson
2020-07-01
Methods for Extracting Information from Messages from Primary Care Providers to Specialists
Xiyu Ding, Michael Barnett, Ateev Mehrotra, Timothy Miller
2020-07-01
Generating Medical Reports from Patient-Doctor Conversations Using Sequence-to-Sequence Models
Seppo Enarvi, Marilisa Amoia, Miguel Del-Agua Teba, Brian Delaney, Frank Diehl, Stefan Hahn, Kristina Harris, Liam McGrath, Yue Pan, Joel Pinto, Luca Rubini, Miguel Ruiz, Gagandeep Singh, Fabian Stemmer, Weiyi Sun, Paul Vozila, Thomas Lin, Ranjani Ramamurthy
2020-07-01
Enhancing Transformer with Sememe Knowledge
Yuhui Zhang, Chenghao Yang, Zhengping Zhou, Zhiyuan Liu
2020-07-01
Grapheme-to-Phoneme Conversion with a Multilingual Transformer Model
Omnia ElSaadany, Benjamin Suter
2020-07-01
Frustratingly Easy Multilingual Grapheme-to-Phoneme Conversion
Nikhil Prabhu, Katharina Kann
2020-07-01
Leveraging Principal Parts for Morphological Inflection
Ling Liu, Mans Hulden
2020-07-01
Data Augmentation for Transformer-based G2P
Zach Ryan, Mans Hulden
2020-07-01
HausaMT v1.0: Towards English-Hausa Neural Machine Translation
Adewale Akinfaderin
2020-07-01
Image-level Harmonization of Multi-Site Data using Image-and-Spatial Transformer Networks
R. Robinson, Q. Dou, D. C. Castro, K. Kamnitsas, M. de Groot, R. M. Summers, D. Rueckert, B. Glocker
2020-06-30
GShard: Scaling Giant Models with Conditional Computation and Automatic Sharding
Dmitry Lepikhin, HyoukJoong Lee, Yuanzhong Xu, Dehao Chen, Orhan Firat, Yanping Huang, Maxim Krikun, Noam Shazeer, Zhifeng Chen
2020-06-30
Correction of Faulty Background Knowledge based on Condition Aware and Revise Transformer for Question Answering
Xinyan Zhao, Xiao Feng, Haoming Zhong, Jun Yao, Huanhuan Chen
2020-06-30
BERTERS: Multimodal Representation Learning for Expert Recommendation System with Transformer
N. Nikzad-Khasmakhi, M. A. Balafar, M. Reza Feizi-Derakhshi, Cina Motamed
2020-06-30
Simplifying Models with Unlabeled Output Data
Sang Michael Xie, Tengyu Ma, Percy Liang
2020-06-29
Predicting Length of Stay in the Intensive Care Unit with Temporal Pointwise Convolutional Networks
Emma Rocheteau, Pietro Liò, Stephanie Hyland
2020-06-29
A Transformer-based joint-encoding for Emotion Recognition and Sentiment Analysis
Jean-Benoit Delbrouck, Noé Tits, Mathilde Brousmiche, Stéphane Dupont
2020-06-29
Multi-Head Attention: Collaborate Instead of Concatenate
Jean-Baptiste Cordonnier, Andreas Loukas, Martin Jaggi
2020-06-29
Interpreting Hierarchical Linguistic Interactions in DNNs
Die Zhang, Huilin Zhou, Xiaoyi Bao, Da Huo, Ruizhao Chen, Xu Cheng, Hao Zhang, Mengyue Wu, Quanshi Zhang
2020-06-29
Rethinking Positional Encoding in Language Pre-training
Guolin Ke, Di He, Tie-Yan Liu
2020-06-28
Self-Attention Networks for Intent Detection
Sevinj Yolchuyeva, Géza Németh, Bálint Gyires-Tóth
2020-06-28
Bottom-Up Human Pose Estimation by Ranking Heatmap-Guided Adaptive Keypoint Estimates
Ke Sun, Zigang Geng, Depu Meng, Bin Xiao, Dong Liu, Zhaoxiang Zhang, Jingdong Wang
2020-06-28
Mind The Facts: Knowledge-Boosted Coherent Abstractive Text Summarization
Beliz Gunel, Chenguang Zhu, Michael Zeng, Xuedong Huang
2020-06-27
What they do when in doubt: a study of inductive biases in seq2seq learners
Eugene Kharitonov, Rahma Chaabouni
2020-06-26
TURL: Table Understanding through Representation Learning
Xiang Deng, Huan Sun, Alyssa Lees, You Wu, Cong Yu
2020-06-26
BERTology Meets Biology: Interpreting Attention in Protein Language Models
Jesse Vig, Ali Madani, Lav R. Varshney, Caiming Xiong, Richard Socher, Nazneen Fatema Rajani
2020-06-26
Conditional Set Generation with Transformers
Adam R Kosiorek, Hyunjik Kim, Danilo J Rezende
2020-06-26
Learning Source Phrase Representations for Neural Machine Translation
Hongfei Xu, Josef van Genabith, Deyi Xiong, Qiuhui Liu, Jingyi Zhang
2020-06-25
Self-Segregating and Coordinated-Segregating Transformer for Focused Deep Multi-Modular Network for Visual Question Answering
Chiranjib Sur
2020-06-25
SACT: Self-Aware Multi-Space Feature Composition Transformer for Multinomial Attention for Video Captioning
Chiranjib Sur
2020-06-25
Differentiable Window for Dynamic Local Attention
Thanh-Tung Nguyen, Xuan-Phi Nguyen, Shafiq Joty, Xiaoli Li
2020-06-24
Hybrid Spatio-Temporal Graph Convolutional Network: Improving Traffic Prediction with Navigation Data
Rui Dai, Shenkun Xu, Qian Gu, Chenguang Ji, Kaikui Liu
2020-06-23
Bach or Mock? A Grading Function for Chorales in the Style of J.S. Bach
Alexander Fang, Alisa Liu, Prem Seetharaman, Bryan Pardo
2020-06-23
Self-supervised edge features for improved Graph Neural Network training
Arijit Sehanobish, Neal G. Ravindra, David van Dijk
2020-06-23
A Self-Attention Network based Node Embedding Model
Dai Quoc Nguyen, Tu Dinh Nguyen, Dinh Phung
2020-06-22
Exploring Software Naturalness through Neural Language Models
Luca Buratti, Saurabh Pujar, Mihaela Bornea, Scott McCarley, Yunhui Zheng, Gaetano Rossiello, Alessandro Morari, Jim Laredo, Veronika Thost, Yufan Zhuang, Giacomo Domeniconi
2020-06-22
AdvAug: Robust Adversarial Augmentation for Neural Machine Translation
Yong Cheng, Lu Jiang, Wolfgang Macherey, Jacob Eisenstein
2020-06-21
The NYU-CUBoulder Systems for SIGMORPHON 2020 Task 0 and Task 2
Assaf Singer, Katharina Kann
2020-06-21
Off-Policy Self-Critical Training for Transformer in Visual Paragraph Generation
Shiyang Yan, Yang Hua, Neil M. Robertson
2020-06-21
A Universal Representation Transformer Layer for Few-Shot Image Classification
Lu Liu, William Hamilton, Guodong Long, Jing Jiang, Hugo Larochelle
2020-06-21
Memory Transformer
Mikhail S. Burtsev, Grigory V. Sapunov
2020-06-20
End-to-end deep metamodeling to calibrate and optimize energy loads
Max Cohen, Maurice Charbit, Sylvain Le Corff, Marius Preda, Gilles Nozière
2020-06-19
Boosting Objective Scores of Speech Enhancement Model through MetricGAN Post-Processing
Szu-Wei Fu, Chien-Feng Liao, Tsun-An Hsieh, Kuo-Hsuan Hung, Syu-Siang Wang, Cheng Yu, Heng-Cheng Kuo, Ryandhimas E. Zezario, You-Jin Li, Shang-Yi Chuang, Yen-Ju Lu, Yu Tsao
2020-06-18
Multi-branch Attentive Transformer
Yang Fan, Shufang Xie, Yingce Xia, Lijun Wu, Tao Qin, Xiang-Yang Li, Tie-Yan Liu
2020-06-18
I-BERT: Inductive Generalization of Transformer to Arbitrary Context Lengths
Hyoungwook Nam, Seung Byum Seo, Vikram Sharma Mailthody, Noor Michael, Lan Li
2020-06-18
SEAL: Segment-wise Extractive-Abstractive Long-form Text Summarization
Yao Zhao, Mohammad Saleh, Peter J. Liu
2020-06-18
Sparse GPU Kernels for Deep Learning
Trevor Gale, Matei Zaharia, Cliff Young, Erich Elsen
2020-06-18
SenWave: Monitoring the Global Sentiments under the COVID-19 Pandemic
Qiang Yang, Hind Alamro, Somayah Albaradei, Adil Salhi, Xiaoting Lv, Changsheng Ma, Manal Alshehri, Inji Jaber, Faroug Tifratene, Wei Wang, Takashi Gojobori, Carlos M. Duarte, Xin Gao, Xiangliang Zhang
2020-06-18
Intelligent Protection & Classification of Transients in Two-Core Symmetric Phase Angle Regulating Transformers
Pallav Kumar Bera, Can Isik
2020-06-17
Automatically Ranked Russian Paraphrase Corpus for Text Generation
Vadim Gudkov, Olga Mitrofanova, Elizaveta Filippskikh
2020-06-17
Learning Visual Commonsense for Robust Scene Graph Generation
Alireza Zareian, Zhecan Wang, Haoxuan You, Shih-Fu Chang
2020-06-17
Modeling Graph Structure via Relative Position for Better Text Generation from Knowledge Graphs
Martin Schmitt, Leonardo F. R. Ribeiro, Philipp Dufter, Iryna Gurevych, Hinrich Schütze
2020-06-16
Fine-grained Human Evaluation of Transformer and Recurrent Approaches to Neural Machine Translation for English-to-Chinese
Yuying Ye, Antonio Toral
2020-06-15
On the Multi-Property Extraction and Beyond
Tomasz Dwojak, Michał Pietruszka, Łukasz Borchmann, Filip Graliński, Jakub Chłędowski
2020-06-15
Exploration of End-to-End ASR for OpenSTT -- Russian Open Speech-to-Text Dataset
Andrei Andrusenko, Aleksandr Laptev, Ivan Medennikov
2020-06-15
Differentiable Neural Architecture Transformation for Reproducible Architecture Improvement
Do-Guk Kim, Heung-Chang Lee
2020-06-15
Multi-Image Summarization: Textual Summary from a Set of Cohesive Images
Nicholas Trieu, Sebastian Goodman, Pradyumna Narayana, Kazoo Sone, Radu Soricut
2020-06-15
Transferring Monolingual Model to Low-Resource Language: The Case of Tigrinya
Abrhalei Tela, Abraham Woubie, Ville Hautamaki
2020-06-13
Guided Transformer: Leveraging Multiple External Sources for Representation Learning in Conversational Search
Helia Hashemi, Hamed Zamani, W. Bruce Croft
2020-06-13
Temporal Fusion Network for Temporal Action Localization:Submission to ActivityNet Challenge 2020 (Task E)
Zhiwu Qing, Xiang Wang, Yongpeng Sang, Changxin Gao, Shiwei Zhang, Nong Sang
2020-06-13
Modelling High-Level Mathematical Reasoning in Mechanised Declarative Proofs
Wenda Li, Lei Yu, Yuhuai Wu, Lawrence C. Paulson
2020-06-13
Comparing Natural Language Processing Techniques for Alzheimer's Dementia Prediction in Spontaneous Speech
Thomas Searle, Zina Ibrahim, Richard Dobson
2020-06-12
Unmasking the Inductive Biases of Unsupervised Object Representations for Video Sequences
Marissa A. Weis, Kashyap Chitta, Yash Sharma, Wieland Brendel, Matthias Bethge, Andreas Geiger, Alexander S. Ecker
2020-06-12
Dance Revolution: Long Sequence Dance Generation with Music via Curriculum Learning
Ruozi Huang, Huang Hu, Wei Wu, Kei Sawada, Mi Zhang
2020-06-11
FastPitch: Parallel Text-to-speech with Pitch Prediction
Adrian Łańcucki
2020-06-11
Extrapolation for Large-batch Training in Deep Learning
Tao Lin, Lingjing Kong, Sebastian U. Stich, Martin Jaggi
2020-06-10
Graph-Aware Transformer: Is Attention All Graphs Need?
Sanghyun Yoo, Young-Seok Kim, Kang Hyun Lee, Kuhwan Jeong, Junhwi Choi, Hoshik Lee, Young Sang Choi
2020-06-09
HausaMT v1.0: Towards English-Hausa Neural Machine Translation
Adewale Akinfaderin
2020-06-09
Linformer: Self-Attention with Linear Complexity
Sinong Wang, Belinda Z. Li, Madian Khabsa, Han Fang, Hao Ma
2020-06-08
Modeling Discourse Structure for Document-level Neural Machine Translation
Junxuan Chen, Xiang Li, Jiarui Zhang, Chulun Zhou, Jianwei Cui, Bin Wang, Jinsong Su
2020-06-08
MultiSpeech: Multi-Speaker Text to Speech with Transformer
Mingjian Chen, Xu Tan, Yi Ren, Jin Xu, Hao Sun, Sheng Zhao, Tao Qin
2020-06-08
Learning to Count Words in Fluent Speech enables Online Speech Recognition
George Sterpu, Christian Saam, Naomi Harte
2020-06-08
Wat zei je? Detecting Out-of-Distribution Translations with Variational Transformers
Tim Z. Xiao, Aidan N. Gomez, Yarin Gal
2020-06-08
Learning Texture Transformer Network for Image Super-Resolution
Fuzhi Yang, Huan Yang, Jianlong Fu, Hongtao Lu, Baining Guo
2020-06-07
Challenges and Thrills of Legal Arguments
Anurag Pallaprolu, Radha Vaidya, Aditya Swaroop Attawar
2020-06-06
Masked Language Modeling for Proteins via Linearly Scalable Long-Context Transformers
Krzysztof Choromanski, Valerii Likhosherstov, David Dohan, Xingyou Song, Jared Davis, Tamas Sarlos, David Belanger, Lucy Colwell, Adrian Weller
2020-06-05
GMAT: Global Memory Augmentation for Transformers
Ankit Gupta, Jonathan Berant
2020-06-05
Funnel-Transformer: Filtering out Sequential Redundancy for Efficient Language Processing
Zihang Dai, Guokun Lai, Yiming Yang, Quoc V. Le
2020-06-05
An Overview of Neural Network Compression
James O' Neill
2020-06-05
End-to-End Speech-Translation with Knowledge Distillation: FBK@IWSLT2020
Marco Gaido, Mattia Antonino Di Gangi, Matteo Negri, Marco Turchi
2020-06-04
On the Predictive Power of Neural Language Models for Human Real-Time Comprehension Behavior
Ethan Gotlieb Wilcox, Jon Gauthier, Jennifer Hu, Peng Qian, Roger Levy
2020-06-02
Subjective Question Answering: Deciphering the inner workings of Transformers in the realm of subjectivity
Lukas Muttenthaler
2020-06-02
Online Versus Offline NMT Quality: An In-depth Analysis on English-German and German-English
Maha Elbayad, Michael Ustaszewski, Emmanuelle Esperança-Rodier, Francis Brunet Manquat, Laurent Besacier
2020-06-01
Context-based Transformer Models for Answer Sentence Selection
Ivano Lauriola, Alessandro Moschitti
2020-06-01
Unsupervised Sparse-view Backprojection via Convolutional and Spatial Transformer Networks
Xueqing Liu, Paul Sajda
2020-06-01
Image Search With Text Feedback by Visiolinguistic Attention Learning
Yanbei Chen, Shaogang Gong, Loris Bazzani
2020-06-01
Few-Shot Learning of Part-Specific Probability Space for 3D Shape Segmentation
Lingjing Wang, Xiang Li, Yi Fang
2020-06-01
RDCFace: Radial Distortion Correction for Face Recognition
He Zhao, Xianghua Ying, Yongjie Shi, Xin Tong, Jingsi Wen, Hongbin Zha
2020-06-01
ActBERT: Learning Global-Local Video-Text Representations
Linchao Zhu, Yi Yang
2020-06-01
ADAHESSIAN: An Adaptive Second Order Optimizer for Machine Learning
Zhewei Yao, Amir Gholami, Sheng Shen, Kurt Keutzer, Michael W. Mahoney
2020-06-01
BPGC at SemEval-2020 Task 11: Propaganda Detection in News Articles with Multi-Granularity Knowledge Sharing and Linguistic Features based Ensemble Learning
Rajaswa Patil, Somesh Singh, Swati Agarwal
2020-05-31
CNRL at SemEval-2020 Task 5: Modelling Causal Reasoning in Language with Multi-Head Self-Attention Weights based Counterfactual Detection
Rajaswa Patil, Veeky Baths
2020-05-31
HAT: Hardware-Aware Transformers for Efficient Natural Language Processing
Hanrui Wang, Zhanghao Wu, Zhijian Liu, Han Cai, Ligeng Zhu, Chuang Gan, Song Han
2020-05-28
Variational Neural Machine Translation with Normalizing Flows
Hendra Setiawan, Matthias Sperber, Udhay Nallasamy, Matthias Paulik
2020-05-28
Empirical Evaluation of Pretraining Strategies for Supervised Entity Linking
Thibault Févry, Nicholas FitzGerald, Livio Baldini Soares, Tom Kwiatkowski
2020-05-28
General-Purpose User Embeddings based on Mobile App Usage
Junqi Zhang, Bing Bai, Ye Lin, Jian Liang, Kun Bai, Fei Wang
2020-05-27
Permutation Matters: Anisotropic Convolutional Layer for Learning on Point Clouds
Zhongpai Gao, Guangtao Zhai, Junchi Yan, Xiaokang Yang
2020-05-27
Insertion-Based Modeling for End-to-End Automatic Speech Recognition
Yuya Fujita, Shinji Watanabe, Motoi Omachi, Xuankai Chan
2020-05-27
End-to-End Object Detection with Transformers
Nicolas Carion, Francisco Massa, Gabriel Synnaeve, Nicolas Usunier, Alexander Kirillov, Sergey Zagoruyko
2020-05-26
GECToR -- Grammatical Error Correction: Tag, Not Rewrite
Kostiantyn Omelianchuk, Vitaliy Atrasevych, Artem Chernodub, Oleksandr Skurzhanskyi
2020-05-26
Guiding Symbolic Natural Language Grammar Induction via Transformer-Based Sequence Probabilities
Ben Goertzel, Andres Suarez Madrigal, Gino Yu
2020-05-26
Pay Attention to What You Read: Non-recurrent Handwritten Text-Line Recognition
Lei Kang, Pau Riba, Marçal Rusiñol, Alicia Fornés, Mauricio Villegas
2020-05-26
Deep Learning Models for Automatic Summarization
Pirmin Lemberger
2020-05-25
The Unreasonable Volatility of Neural Machine Translation Models
Marzieh Fadaee, Christof Monz
2020-05-25
Adversarial NLI for Factual Correctness in Text Summarisation Models
Mario Barrantes, Benedikt Herudek, Richard Wang
2020-05-24
Devising Malware Characterstics using Transformers
Simra Shahid, Tanmay Singh, Yash Sharma, Kapil Sharma
2020-05-23
Character-level Transformer-based Neural Machine Translation
Nikolay Banar, Walter Daelemans, Mike Kestemont
2020-05-22
A Generative Approach to Titling and Clustering Wikipedia Sections
Anjalie Field, Sascha Rothe, Simon Baumgartner, Cong Yu, Abe Ittycheriah
2020-05-22
Low-Latency Sequence-to-Sequence Speech Recognition and Translation by Partial Hypothesis Selection
Danni Liu, Gerasimos Spanakis, Jan Niehues
2020-05-22
Transformer-based Context-aware Sarcasm Detection in Conversation Threads from Social Media
Xiangjue Dong, Changmao Li, Jinho D. Choi
2020-05-22
Simplified Self-Attention for Transformer-based End-to-End Speech Recognition
Haoneng Luo, Shiliang Zhang, Ming Lei, Lei Xie
2020-05-21
Leveraging Text Data Using Hybrid Transformer-LSTM Based End-to-End ASR in Transfer Learning
Zhiping Zeng, Van Tung Pham, Haihua Xu, Yerbolat Khassanov, Eng Siong Chng, Chongjia Ni, Bin Ma
2020-05-21
Applying the Transformer to Character-level Transduction
Shijie Wu, Ryan Cotterell, Mans Hulden
2020-05-20
Relative Positional Encoding for Speech Recognition and Direct Translation
Ngoc-Quan Pham, Thanh-Le Ha, Tuan-Nam Nguyen, Thai-Son Nguyen, Elizabeth Salesky, Sebastian Stueker, Jan Niehues, Alexander Waibel
2020-05-20
A Further Study of Unsupervised Pre-training for Transformer Based Speech Recognition
Dongwei Jiang, Wubo Li, Ruixiong Zhang, Miao Cao, Ne Luo, Yang Han, Wei Zou, Xiangang Li
2020-05-20
Comparing Transformers and RNNs on predicting human sentence processing data
Danny Merkx, Stefan L. Frank
2020-05-19
Should we hard-code the recurrence concept or learn it instead ? Exploring the Transformer architecture for Audio-Visual Speech Recognition
George Sterpu, Christian Saam, Naomi Harte
2020-05-19
Sketch-BERT: Learning Sketch Bidirectional Encoder Representation from Transformers by Self-supervised Learning of Sketch Gestalt
Hangyu Lin, Yanwei Fu, Yu-Gang Jiang, Xiangyang Xue
2020-05-19
Exploring Transformers for Large-Scale Speech Recognition
Liang Lu, Changliang Liu, Jinyu Li, Yifan Gong
2020-05-19
A Transformer-based Embedding Model for Personalized Product Search
Keping Bi, Qingyao Ai, W. Bruce Croft
2020-05-18
Efficient Wait-k Models for Simultaneous Machine Translation
Maha Elbayad, Laurent Besacier, Jakob Verbeek
2020-05-18
Spatio-Temporal Graph Transformer Networks for Pedestrian Trajectory Prediction
Cunjun Yu, Xiao Ma, Jiawei Ren, Haiyu Zhao, Shuai Yi
2020-05-18
Many-to-Many Voice Transformer Network
Hirokazu Kameoka, Wen-Chin Huang, Kou Tanaka, Takuhiro Kaneko, Nobukatsu Hojo, Tomoki Toda
2020-05-18
GPT-too: A language-model-first approach for AMR-to-text generation
Manuel Mager, Ramon Fernandez Astudillo, Tahira Naseem, Md Arafat Sultan, Young-Suk Lee, Radu Florian, Salim Roukos
2020-05-18
Weak-Attention Suppression For Transformer Based Speech Recognition
Yangyang Shi, Yongqiang Wang, Chunyang Wu, Christian Fuegen, Frank Zhang, Duc Le, Ching-Feng Yeh, Michael L. Seltzer
2020-05-18
Mask CTC: Non-Autoregressive End-to-End ASR with CTC and Mask Predict
Yosuke Higuchi, Shinji Watanabe, Nanxin Chen, Tetsuji Ogawa, Tetsunori Kobayashi
2020-05-18
A Better Use of Audio-Visual Cues: Dense Video Captioning with Bi-modal Transformer
Vladimir Iashin, Esa Rahtu
2020-05-17
Building a Hebrew Semantic Role Labeling Lexical Resource from Parallel Movie Subtitles
Ben Eyal, Michael Elhadad
2020-05-17
Conformer: Convolution-augmented Transformer for Speech Recognition
Anmol Gulati, James Qin, Chung-Cheng Chiu, Niki Parmar, Yu Zhang, Jiahui Yu, Wei Han, Shibo Wang, Zhengdong Zhang, Yonghui Wu, Ruoming Pang
2020-05-16
Recurrent Chunking Mechanisms for Long-Text Machine Reading Comprehension
Hongyu Gong, Yelong Shen, Dian Yu, Jianshu Chen, Dong Yu
2020-05-16
Streaming Transformer-based Acoustic Models Using Self-attention with Augmented Memory
Chunyang Wu, Yongqiang Wang, Yangyang Shi, Ching-Feng Yeh, Frank Zhang
2020-05-16
IntelliCode Compose: Code Generation Using Transformer
Alexey Svyatkovskiy, Shao Kun Deng, Shengyu Fu, Neel Sundaresan
2020-05-16
Spike-Triggered Non-Autoregressive Transformer for End-to-End Speech Recognition
Zhengkun Tian, Jiangyan Yi, Jianhua Tao, Ye Bai, Shuai Zhang, Zhengqi Wen
2020-05-16
COVID-Twitter-BERT: A Natural Language Processing Model to Analyse COVID-19 Content on Twitter
Martin Müller, Marcel Salathé, Per E Kummervold
2020-05-15
Neural Entity Linking on Technical Service Tickets
Nadja Kurz, Felix Hamann, Adrian Ulges
2020-05-15
Finding Experts in Transformer Models
Xavier Suau, Luca Zappella, Nicholas Apostoloff
2020-05-15
JDI-T: Jointly trained Duration Informed Transformer for Text-To-Speech without Explicit Alignment
Dan Lim, Won Jang, Gyeonghwan O, Hyeyeong Park, Bongwan Kim, Jesam Yoon
2020-05-15
The Unstoppable Rise of Computational Linguistics in Deep Learning
James Henderson
2020-05-13
Discriminative Multi-modality Speech Recognition
Bo Xu, Cheng Lu, Yandong Guo, Jacob Wang
2020-05-12
Simultaneous paraphrasing and translation by fine-tuning Transformer models
Rakesh Chada
2020-05-12
SOLOIST: Few-shot Task-Oriented Dialog with A Single Pre-trained Auto-regressive Model
Baolin Peng, Chunyuan Li, Jinchao Li, Shahin Shayandeh, Lars Liden, Jianfeng Gao
2020-05-11
Hierarchical Attention Transformer Architecture For Syntactic Spell Correction
Abhishek Niranjan, M Ali Basha Shaik, Kushal Verma
2020-05-11
Listen Attentively, and Spell Once: Whole Sentence Generation via a Non-Autoregressive Architecture for Low-Latency Speech Recognition
Ye Bai, Jiangyan Yi, Jianhua Tao, Zhengkun Tian, Zhengqi Wen, Shuai Zhang
2020-05-11
MART: Memory-Augmented Recurrent Transformer for Coherent Video Paragraph Captioning
Jie Lei, Liwei Wang, Yelong Shen, Dong Yu, Tamara L. Berg, Mohit Bansal
2020-05-11
On the Generation of Medical Dialogues for COVID-19
Wenmian Yang, Guangtao Zeng, Bowen Tan, Zeqian Ju, Subrato Chakravorty, Xuehai He, Shu Chen, Xingyi Yang, Qingyang Wu, Zhou Yu, Eric Xing, Pengtao Xie
2020-05-11
Epipolar Transformers
Yihui He, Rui Yan, Katerina Fragkiadaki, Shoou-I Yu
2020-05-10
Transformer Based Language Models for Similar Text Retrieval and Ranking
Javed Qadrud-Din, Ashraf Bah Rabiou, Ryan Walker, Ravi Soni, Martin Gajek, Gabriel Pack, Akhil Rangaraj
2020-05-10
SocialTrans: A Deep Sequential Model with Social Information for Web-Scale Recommendation Systems
Qiaoan Chen, Hao Gu, Lingling Yi, Yishi Lin, Peng He, Chuan Chen, Yangqiu Song
2020-05-09
It's Morphin' Time! Combating Linguistic Discrimination with Inflectional Perturbations
Samson Tan, Shafiq Joty, Min-Yen Kan, Richard Socher
2020-05-09
schuBERT: Optimizing Elements of BERT
Ashish Khetan, Zohar Karnin
2020-05-09
Character Matters: Video Story Understanding with Character-Aware Relations
Shijie Geng, Ji Zhang, Zuohui Fu, Peng Gao, Hang Zhang, Gerard de Melo
2020-05-09
Mapping Natural Language Instructions to Mobile UI Action Sequences
Yang Li, Jiacong He, Xin Zhou, Yuan Zhang, Jason Baldridge
2020-05-07
A Systematic Assessment of Syntactic Generalization in Neural Language Models
Jennifer Hu, Jon Gauthier, Peng Qian, Ethan Wilcox, Roger P. Levy
2020-05-07
An Empirical Study of Multi-Task Learning on BERT for Biomedical Text Mining
Yifan Peng, Qingyu Chen, Zhiyong Lu
2020-05-06
The Cascade Transformer: an Application for Efficient Answer Sentence Selection
Luca Soldaini, Alessandro Moschitti
2020-05-05
Dynamically Adjusting Transformer Batch Size by Monitoring Gradient Direction Change
Hongfei Xu, Josef van Genabith, Deyi Xiong, Qiuhui Liu
2020-05-05
OpinionDigest: A Simple Framework for Opinion Summarization
Yoshihiko Suhara, Xiaolan Wang, Stefanos Angelidis, Wang-Chiew Tan
2020-05-05
Successfully Applying the Stabilized Lottery Ticket Hypothesis to the Transformer Architecture
Christopher Brix, Parnia Bahar, Hermann Ney
2020-05-04
An Accurate Model for Predicting the (Graded) Effect of Context in Word Similarity Based on Bert
Wei Bao, Hongshu Che, Jiandong Zhang
2020-05-03
Towards Faithful Neural Table-to-Text Generation with Content-Matching Constraints
Zhenyi Wang, Xiaoyang Wang, Bang An, Dong Yu, Changyou Chen
2020-05-03
Dynamic Programming Encoding for Subword Segmentation in Neural Machine Translation
| Xuanli HeGholamreza HaffariMohammad Norouzi
2020-05-03
Quantifying Attention Flow in Transformers
| Samira AbnarWillem Zuidema
2020-05-02
Measuring and Reducing Non-Multifact Reasoning in Multi-hop Question Answering
Harsh TrivediNiranjan BalasubramanianTushar KhotAshish Sabharwal
2020-05-02
Synthesizer: Rethinking Self-Attention in Transformer Models
Yi TayDara BahriDonald MetzlerDa-Cheng JuanZhe ZhaoChe Zheng
2020-05-02
Hard-Coded Gaussian Attention for Neural Machine Translation
Weiqiu YouSimeng SunMohit Iyyer
2020-05-02
Contrastive Self-Supervised Learning for Commonsense Reasoning
| Tassilo KleinMoin Nabi
2020-05-02
The AVA-Kinetics Localized Human Actions Video Dataset
Ang LiMeghana ThotakuriDavid A. RossJoão CarreiraAlexander VostrikovAndrew Zisserman
2020-05-01
HERO: Hierarchical Encoder for Video+Language Omni-representation Pre-training
Linjie LiYen-Chun ChenYu ChengZhe GanLicheng YuJingjing Liu
2020-05-01
A Transformer-based Approach for Source Code Summarization
| Wasi Uddin AhmadSaikat ChakrabortyBaishakhi RayKai-Wei Chang
2020-05-01
Multi-scale Transformer Language Models
Sandeep SubramanianRonan CollobertMarc'Aurelio RanzatoY-Lan Boureau
2020-05-01
Event Clustering within News Articles
Faik Kerem ÖrsSüveyda YeniterziReyyan Yeniterzi
2020-05-01
Detecting Direct Speech in Multilingual Collection of 19th-century Novels
Joanna ByszukMichał WoźniakMike KestemontAlbert LeśniakWojciech ŁukasikArtjoms ŠeļaMaciej Eder
2020-05-01
ASU_OPTO at OSACT4 - Offensive Language Detection for Arabic text
Amr KelegSamhaa R. El-BeltagyMahmoud Khalil
2020-05-01
Scaling Language Data Import/Export with a Data Transformer Interface
Nicholas BuckeridgeBen Foley
2020-05-01
Aggression Identification in Social Media: a Transfer Learning Based Approach
Faneva RamiandrisoaJosiane Mothe
2020-05-01
IRIT at TRAC 2020
Faneva RamiandrisoaJosiane Mothe
2020-05-01
Multilingual Joint Fine-tuning of Transformer models for identifying Trolling, Aggression and Cyberbullying at TRAC 2020
| Sudhanshu MishraShivangi PrasadShubhanshu Mishra
2020-05-01
On the Influence of Coreference Resolution on Word Embeddings in Lexical-semantic Evaluation Tasks
Alexander HenleinAlexander Mehler
2020-05-01
Chinese Discourse Parsing: Model and Evaluation
Lin Chuan-AnShyh-Shiun HungHen-Hsen HuangHsin-Hsi Chen
2020-05-01
DecOp: A Multilingual and Multi-domain Corpus For Detecting Deception In Typed Text
Pasquale CapuozzoIvano LauriolaCarlo StrapparavaFabio AiolliGiuseppe Sartori
2020-05-01
Paraphrase Generation and Evaluation on Colloquial-Style Sentences
Eetu SjöblomMathias CreutzYves Scherrer
2020-05-01
Building a Task-oriented Dialog System for Languages with no Training Data: the Case for Basque
Maddalen López de LacalleXabier SaralegiIñaki San Vicente
2020-05-01
Linguistically Informed Hindi-English Neural Machine Translation
Vikrant GoyalPruthwik MishraDipti Misra Sharma
2020-05-01
Corpora for Document-Level Neural Machine Translation
Siyou LiuXiaojun Zhang
2020-05-01
Exploring Transformer Text Generation for Medical Dataset Augmentation
Ali Amin-NejadJulia IveSumithra Velupillai
2020-05-01
Much Ado About Nothing -- Identification of Zero Copulas in Hungarian Using an NMT Model
Andrea DömötörZijian Győző YangAttila Novák
2020-05-01
ParlVote: A Corpus for Sentiment Analysis of Political Debates
Gavin AbercrombieRiza Batista-Navarro
2020-05-01
Cross-lingual and Cross-domain Evaluation of Machine Reading Comprehension with Squad and CALOR-Quest Corpora
Delphine CharletGeraldine DamnatiFrederic BechetGabriel MarzinottoJohannes Heinecke
2020-05-01
Contextualized Embeddings based Transformer Encoder for Sentence Similarity Modeling in Answer Selection Task
| Md Tahmid Rahman LaskarJimmy Xiangji HuangEnamul Hoque
2020-05-01
Evaluation of Dataset Selection for Pre-Training and Fine-Tuning Transformer Language Models for Clinical Question Answering
Sarvesh SoniKirk Roberts
2020-05-01
Minority Positive Sampling for Switching Points - an Anecdote for the Code-Mixing Language Modeling
Arindam ChatterjereVineeth GupthaParul ChopraAmitava Das
2020-05-01
Seq2SeqPy: A Lightweight and Customizable Toolkit for Neural Sequence-to-Sequence Modeling
Raheel QaderFrançois PortetCyril Labbe
2020-05-01
SegaBERT: Pre-training of Segment-aware BERT for Language Understanding
He BaiPeng ShiJimmy LinLuchen TanKun XiongWen GaoMing Li
2020-04-30
Addressing Zero-Resource Domains Using Document-Level Context in Neural Machine Translation
Dario StojanovskiAlexander Fraser
2020-04-30
Progressive Transformers for End-to-End Sign Language Production
| Ben SaundersNecati Cihan CamgozRichard Bowden
2020-04-30
Accurate Word Alignment Induction from Neural Machine Translation
Yun ChenYang LiuGuanhua ChenXin JiangQun Liu
2020-04-30
Character-Level Translation with Self-attention
Yingqiang GaoNikola I. NikolovYuhuang HuRichard H. R. Hahnloser
2020-04-30
Semantic Triple Encoder for Fast Open-Set Link Prediction
Bo WangTao ShenGuodong LongTianyi ZhouYi Chang
2020-04-30
Self-Supervised and Controlled Multi-Document Opinion Summarization
Hady ElsaharMaximin CoavouxMatthias GalléJos Rozen
2020-04-30
End-to-End Neural Word Alignment Outperforms GIZA++
Thomas ZenkelJoern WuebkerJohn DeNero
2020-04-30
Capsule-Transformer for Neural Machine Translation
Sufeng DuanJuncheng CaoHai Zhao
2020-04-30
Breaking (Global) Barriers in Parallel Stochastic Optimization with Wait-Avoiding Group Averaging
Shigang LiTal Ben-NunGiorgi NadiradzeSalvatore Di GirolamoNikoli DrydenDan AlistarhTorsten Hoefler
2020-04-30
Towards Character-Level Transformer NMT by Finetuning Subword Systems
Jindřich LibovickýAlexander Fraser
2020-04-29
Efficient Document Re-Ranking for Transformers by Precomputing Term Representations
| Sean MacAvaneyFranco Maria NardiniRaffaele PeregoNicola TonellottoNazli GoharianOphir Frieder
2020-04-29
Image Captioning through Image Transformer
Sen HeWentong LiaoHamed R. TavakoliMichael YangBodo RosenhahnNicolas Pugeault
2020-04-29
Pre-training Is (Almost) All You Need: An Application to Commonsense Reasoning
Alexandre TamborrinoNicola PellicanoBaptiste PannierPascal VoitotLouise Naudin
2020-04-29
Image Morphing with Perceptual Constraints and STN Alignment
Noa FishRichard ZhangLilach PerryDaniel Cohen-OrEli ShechtmanConnelly Barnes
2020-04-29
Multiresolution and Multimodal Speech Recognition with Transformers
Georgios ParaskevopoulosSrinivas ParthasarathyAparna KhareShiva Sundaram
2020-04-29
EARL: Speedup Transformer-based Rankers with Pre-computed Representation
Luyu GaoZhuyun DaiJamie Callan
2020-04-28
VD-BERT: A Unified Vision and Dialog Transformer with BERT
Yue WangShafiq JotyMichael R. LyuIrwin KingCaiming XiongSteven C. H. Hoi
2020-04-28
DeeBERT: Dynamic Early Exiting for Accelerating BERT Inference
| Ji XinRaphael TangJaejun LeeYaoliang YuJimmy Lin
2020-04-27
Augmenting Transformers with KNN-Based Composite Memory for Dialogue
Angela FanClaire GardentChloe BraudAntoine Bordes
2020-04-27
Lexically Constrained Neural Machine Translation with Levenshtein Transformer
| Raymond Hendy SusantoShamil ChollampattLiling Tan
2020-04-27
Explicitly Modeling Adaptive Depths for Transformer
Yijin LiuFandong MengJie ZhouYufeng ChenJinan Xu
2020-04-27
Experiments with LVT and FRE for Transformer model
Ilshat GibadullinAidar Valeev
2020-04-26
Causal Mediation Analysis for Interpreting Neural NLP: The Case of Gender Bias
Jesse VigSebastian GehrmannYonatan BelinkovSharon QianDaniel NevoYaron SingerStuart Shieber
2020-04-26
Research on Modeling Units of Transformer Transducer for Mandarin Speech Recognition
Li FuXiaoxiao LiLibo Zi
2020-04-26
Choppy: Cut Transformer For Ranked List Truncation
Dara BahriYi TayChe ZhengDonald MetzlerAndrew Tomkins
2020-04-26
Combining Word Embeddings and N-grams for Unsupervised Document Summarization
Zhuolin JiangManaj SrivastavaSanjay KrishnaDavid AkodesRichard Schwartz
2020-04-25
All Word Embeddings from One Embedding
| Sho TakaseSosuke Kobayashi
2020-04-25
Lite Transformer with Long-Short Range Attention
| Zhanghao WuZhijian LiuJi LinYujun LinSong Han
2020-04-24
On Sparsifying Encoder Outputs in Sequence-to-Sequence Models
Biao ZhangIvan TitovRico Sennrich
2020-04-24
FLAT: Chinese NER Using Flat-Lattice Transformer
Xiaonan LiHang YanXipeng QiuXuanjing Huang
2020-04-24
Understanding when spatial transformer networks do not support invariance, and what to do about it
Lukas FinnvedenYlva JanssonTony Lindeberg
2020-04-24
UHH-LT at SemEval-2020 Task 12: Fine-Tuning of Pre-Trained Transformer Networks for Offensive Language Detection
Gregor WiedemannSeid Muhie YimamChris Biemann
2020-04-23
MolTrans: Molecular Interaction Transformer for Drug Target Interaction Prediction
Kexin HuangCao XiaoLucas GlassJimeng Sun
2020-04-23
Self-Attention Attribution: Interpreting Information Interactions Inside Transformer
Yaru HaoLi DongFuru WeiKe Xu
2020-04-23
Towards a Competitive End-to-End Speech Recognition for CHiME-6 Dinner Party Transcription
Andrei AndrusenkoAleksandr LaptevIvan Medennikov
2020-04-22
Logical Natural Language Generation from Open-Domain Tables
| Wenhu ChenJianshu ChenYu SuZhiyu ChenWilliam Yang Wang
2020-04-22
Vector Quantized Contrastive Predictive Coding for Template-based Music Generation
| Gaëtan HadjeresLéopold Crestel
2020-04-21
Joint Cross-Modality Super Resolution
Guy ShachtSharon FogelDov DanonDaniel Cohen-OrIlya Leizerson
2020-04-21
DIET: Lightweight Language Understanding for Dialogue Systems
| Tanja BunkDaksh VarshneyaVladimir VlasovAlan Nichol
2020-04-21
Contextual Neural Machine Translation Improves Translation of Cataphoric Pronouns
KayYen WongSameen MarufGholamreza Haffari
2020-04-21
Keyphrase Generation with Cross-Document Attention
| Shizhe DiaoYan SongTong Zhang
2020-04-21
Learning Local Neighboring Structure for Robust 3D Shape Representation
| Zhongpai GaoGuangtao ZhaiJuyong ZhangJunchi YanYiyan YangXiaokang Yang
2020-04-21
A Review-based Transformer Model for Personalized Product Search
Keping BiQingyao AiW. Bruce Croft
2020-04-20
WHALETRANS: E2E WHisper to nAturaL spEech conversion using modified TRANSformer network
Abhishek NiranjanMukesh SharmaSai Bharath Chandra GuthaM Ali Basha Shaik
2020-04-20
Transformer Reasoning Network for Image-Text Matching and Retrieval
| Nicola MessinaFabrizio FalchiAndrea EsuliGiuseppe Amato
2020-04-20
Deep-COVID: Predicting COVID-19 From Chest X-Ray Images Using Deep Transfer Learning
| Shervin MinaeeRahele KafiehMilan SonkaShakib YazdaniGhazaleh Jamalipour Soufi
2020-04-20
Motion Segmentation using Frequency Domain Transformer Networks
Hafez FaraziSven Behnke
2020-04-18
Understanding the Difficulty of Training Transformers
| Liyuan LiuXiaodong LiuJianfeng GaoWeizhu ChenJiawei Han
2020-04-17
Highway Transformer: Self-Gating Enhanced Self-Attentive Networks
Yekun ChaiJin ShuoXinwen Hou
2020-04-17
Transform and Tell: Entity-Aware News Image Captioning
| Alasdair TranAlexander MathewsLexing Xie
2020-04-17
Enriching the Transformer with Linguistic and Semantic Factors for Low-Resource Machine Translation
Jordi Armengol-EstapéMarta R. Costa-jussàCarlos Escolano
2020-04-17
ETC: Encoding Long and Structured Data in Transformers
Joshua AinslieSantiago OntanonChris AlbertiPhilip PhamAnirudh RavulaSumit Sanghai
2020-04-17
Non-Autoregressive Machine Translation with Latent Alignments
| Chitwan SahariaWilliam ChanSaurabh SaxenaMohammad Norouzi
2020-04-16
Towards Instance-Level Parser Selection for Cross-Lingual Transfer of Dependency Parsers
Robert LitschkoIvan VulićŽeljko AgićGoran Glavaš
2020-04-16
Entities as Experts: Sparse Memory Access with Entity Supervision
Thibault FévryLivio Baldini SoaresNicholas FitzGeraldEunsol ChoiTom Kwiatkowski
2020-04-15
SPECTER: Document-level Representation Learning using Citation-informed Transformers
| Arman CohanSergey FeldmanIz BeltagyDoug DowneyDaniel S. Weld
2020-04-15
Training with Quantization Noise for Extreme Model Compression
| Angela FanPierre StockBenjamin GrahamEdouard GraveRemi GribonvalHerve JegouArmand Joulin
2020-04-15
Transformer based Grapheme-to-Phoneme Conversion
Sevinj YolchuyevaGéza NémethBálint Gyires-Tóth
2020-04-14
ProFormer: Towards On-Device LSH Projection Based Transformers
Chinnadhurai SankarSujith RaviZornitsa Kozareva
2020-04-13
Relation Transformer Network
Rajat KonerPoulami SinhamahapatraVolker Tresp
2020-04-13
Relational Learning between Multiple Pulmonary Nodules via Deep Set Attention Transformers
Jiancheng YangHaoran DengXiaoyang HuangBingbing NiYi Xu
2020-04-12
Stacked Convolutional Deep Encoding Network for Video-Text Retrieval
Rui ZhaoKecheng ZhengZheng-jun Zha
2020-04-10
Telling BERT's full story: from Local Attention to Global Aggregation
Damian PascualGino BrunnerRoger Wattenhofer
2020-04-10
Cortical surface registration using unsupervised learning
| Jieyu ChengAdrian V. DalcaBruce FischlLilla Zollei
2020-04-09
On Optimal Transformer Depth for Low-Resource Language Translation
Elan van BiljonArnu PretoriusJulia Kreutzer
2020-04-09
Diverse, Controllable, and Keyphrase-Aware: A Corpus and Method for News Multi-Headline Generation
Dayiheng LiuYeyun GongJie FuWei LiuYu YanBo ShaoDaxin JiangJiancheng LvNan Duan
2020-04-08
Poor Man's BERT: Smaller and Faster Transformer Models
| Hassan SajjadFahim DalviNadir DurraniPreslav Nakov
2020-04-08
Adaptive Transformers in RL
| Shakti KumarJerrod ParkerPanteha Naderian
2020-04-08
Transformers to Learn Hierarchical Contexts in Multiparty Dialogue for Span-based Question Answering
| Changmao LiJinho D. Choi
2020-04-07
Byte Pair Encoding is Suboptimal for Language Model Pretraining
Kaj BostromGreg Durrett
2020-04-07
Probabilistic Spatial Transformers for Bayesian Data Augmentation
Pola SchwöbelFrederik WarburgMartin JørgensenKristoffer H. MadsenSøren Hauberg
2020-04-07
AutoToon: Automatic Geometric Warping for Face Cartoon Generation
Julia GongYannick Hold-GeoffroyJingwan Lu
2020-04-06
A Systematic Analysis of Morphological Content in BERT Models for Multiple Languages
| Daniel Edmiston
2020-04-06
Syntax-driven Iterative Expansion Language Models for Controllable Text Generation
Noe CasasJosé A. R. FonollosaMarta R. Costa-jussà
2020-04-05
Conversational Question Reformulation via Sequence-to-Sequence Architectures and Pretrained Language Models
Sheng-Chieh LinJheng-Hong YangRodrigo NogueiraMing-Feng TsaiChuan-Ju WangJimmy Lin
2020-04-04
STEP: Sequence-to-Sequence Transformer Pre-training for Document Summarization
Yanyan ZouXingxing ZhangWei LuFuru WeiMing Zhou
2020-04-04
LiDAR-based Online 3D Video Object Detection with Graph-based Message Passing and Spatiotemporal Transformer Attention
| Junbo YinJianbing ShenChenye GuanDingfu ZhouRuigang Yang
2020-04-03
Testing pre-trained Transformer models for Lithuanian news clustering
Lukas StankevičiusMantas Lukoševičius
2020-04-03
The RWTH ASR System for TED-LIUM Release 2: Improving Hybrid HMM with SpecAugment
Wei ZhouWilfried MichelKazuki IrieMarkus KitzaRalf SchlüterHermann Ney
2020-04-02
Sign Language Translation with Transformers
| Kayo Yin
2020-04-01
DSTC8-AVSD: Multimodal Semantic Transformer Network with Retrieval Style Word Generator
Hwanhee LeeSeunghyun YoonFranck DernoncourtDoo Soon KimTrung BuiKyomin Jung
2020-04-01
Graph Enhanced Representation Learning for News Recommendation
Suyu GeChuhan WuFangzhao WuTao QiYongfeng Huang
2020-03-31
X-Linear Attention Networks for Image Captioning
| Yingwei PanTing YaoYehao LiTao Mei
2020-03-31
A Swiss German Dictionary: Variation in Speech and Writing
Larissa SchmidtLucy LinderSandra DjambazovskaAlexandros LazaridisTanja SamardžićClaudiu Musat
2020-03-31
DeepSumm -- Deep Code Summaries using Neural Transformer Architecture
Vivek Gupta
2020-03-31
AriEL: volume coding for sentence generation
Luca CelottiSimon BrodeurJean Rouat
2020-03-30
Learning Contextualized Sentence Representations for Document-Level Neural Machine Translation
Pei ZhangXu ZhangWei ChenJian YuYanfeng WangDeyi Xiong
2020-03-30
A Hierarchical Transformer for Unsupervised Parsing
Ashok Thillaisundaram
2020-03-30
Sign Language Transformers: Joint End-to-end Sign Language Recognition and Translation
| Necati Cihan CamgozOscar KollerSimon HadfieldRichard Bowden
2020-03-30
Code Prediction by Feeding Trees to Transformers
| Seohyun KimJinman ZhaoYuchi TianSatish Chandra
2020-03-30
Recursive Non-Autoregressive Graph-to-Graph Transformer for Dependency Parsing with Iterative Refinement
Alireza MohammadshahiJames Henderson
2020-03-29
Abstractive Text Summarization based on Language Model Conditioning and Locality Modeling
Dmitrii AksenovJulián Moreno-SchneiderPeter BourgonjeRobert SchwarzenbergLeonhard HennigGeorg Rehm
2020-03-29
Variational Transformers for Diverse Response Generation
| Zhaojiang LinGenta Indra WinataPeng XuZihan LiuPascale Fung
2020-03-28
Actor-Transformers for Group Activity Recognition
Kirill GavrilyukRyan SanfordMehrsan JavanCees G. M. Snoek
2020-03-28
TLDR: Token Loss Dynamic Reweighting for Reducing Repetitive Utterance Generation
| Shaojie JiangThomas WolfChristof MonzMaarten de Rijke
2020-03-26
StrokeCoder: Path-Based Image Generation from Single Examples using Transformers
Sabine WieluchFriedhelm Schwenker
2020-03-26
Generalizing Spatial Transformers to Projective Geometry with Applications to 2D/3D Registration
| Cong GaoXingtong LiuWenhao GuBenjamin KilleenMehran ArmandRussell TaylorMathias Unberath
2020-03-24
Analyzing Word Translation of Transformer Layers
Hongfei XuJosef van GenabithDeyi XiongQiuhui Liu
2020-03-21
TNT-KID: Transformer-based Neural Tagger for Keyword Identification
Matej MartincBlaž ŠkrljSenja Pollak
2020-03-20
Normalized and Geometry-Aware Self-Attention Network for Image Captioning
Longteng GuoJing LiuXinxin ZhuPeng YaoShichen LuHanqing Lu
2020-03-19
Temporal Embeddings and Transformer Models for Narrative Text Understanding
Vani KSimone MellaceAlessandro Antonucci
2020-03-19
Detecting Lane and Road Markings at A Distance with Perspective Transformer Layers
Zhuoping YuXiaozhou RenYuyao HuangWei TianJunqiao Zhao
2020-03-19
Transformer Networks for Trajectory Forecasting
| Francesco GiuliariIrtiza HasanMarco CristaniFabio Galasso
2020-03-18
Scene Text Recognition via Transformer
Xinjie FengHongxun YaoYuankai QiJun ZhangShengping Zhang
2020-03-18
PowerNorm: Rethinking Batch Normalization in Transformers
| Sheng ShenZhewei YaoAmir GholamiMichael W. MahoneyKurt Keutzer
2020-03-17
Multi-modal Dense Video Captioning
| Vladimir IashinEsa Rahtu
2020-03-17
TRANS-BLSTM: Transformer with Bidirectional LSTM for Language Understanding
Zhiheng HuangPeng XuDavis LiangAjay MishraBing Xiang
2020-03-16
Document Ranking with a Pretrained Sequence-to-Sequence Model
Rodrigo NogueiraZhiying JiangJimmy Lin
2020-03-14
Learning to Encode Position for Transformer with Continuous Dynamical Model
| Xuanqing LiuHsiang-Fu YuInderjit DhillonCho-Jui Hsieh
2020-03-13
Efficient Content-Based Sparse Attention with Routing Transformers
| Aurko RoyMohammad SaffarAshish VaswaniDavid Grangier
2020-03-12
Keyword-Attentive Deep Semantic Matching
| Changyu MiaoZhen CaoYik-Cheung Tam
2020-03-11
ReZero is All You Need: Fast Convergence at Large Depth
| Thomas BachlechnerBodhisattwa Prasad MajumderHuanru Henry MaoGarrison W. CottrellJulian McAuley
2020-03-10
Hybrid Attention-Based Transformer Block Model for Distant Supervision Relation Extraction
Yan XiaoYaochu JinRan ChengKuangrong Hao
2020-03-10
Capacity of Continuous Channels with Memory via Directed Information Neural Estimator
Ziv AharoniDor TsurZiv GoldfeldHaim Henry Permuter
2020-03-09
TTPP: Temporal Transformer with Progressive Prediction for Efficient Action Anticipation
Wen WangXiaojiang PengYanzhou SuYu QiaoJian Cheng
2020-03-07
Cross-modal Learning for Multi-modal Video Categorization
Palash GoyalSaurabh SahuShalini GhoshChul Lee
2020-03-07
Transformers Generalize to the Semantics of Logics
| Christopher HahnFrederik SchmittJens U. KreberMarkus N. RabeBernd Finkbeiner
2020-03-06
EmpTransfo: A Multi-head Transformer Architecture for Creating Empathetic Dialog Systems
| Rohola ZandieMohammad H. Mahoor
2020-03-05
Data Augmentation using Pre-trained Transformer Models
| Varun KumarAshutosh ChoudharyEunah Cho
2020-03-04
AlignTTS: Efficient Feed-Forward Text-to-Speech System without Explicit Alignment
Zhen ZengJianzong WangNing ChengTian XiaJing Xiao
2020-03-04
Meta-Embeddings Based On Self-Attention
Qichen LiYuanqing LinLuofeng ZhouJian Li
2020-03-03
Heterogeneous Graph Transformer
| Ziniu HuYuxiao DongKuansan WangYizhou Sun
2020-03-03
Controllable Time-Delay Transformer for Real-Time Punctuation Prediction and Disfluency Detection
Qian ChenMengzhe ChenBo LiWen Wang
2020-03-03
Transfer Learning for Context-Aware Spoken Language Understanding
Qian ChenZhu ZhuoWen WangQiuyun Xu
2020-03-03
Transformer++
Prakhar ThapakProdip Hore
2020-03-02
Exploring and Distilling Cross-Modal Information for Image Captioning
Fenglin LiuXuancheng RenYuanxin LiuKai LeiXu Sun
2020-02-28
Provable, Scalable and Automatic Perturbation Analysis on General Computational Graphs
| Kaidi XuZhouxing ShiHuan ZhangYihan WangKai-Wei ChangMinlie HuangBhavya KailkhuraXue LinCho-Jui Hsieh
2020-02-28
Compressing Large-Scale Transformer-Based Models: A Case Study on BERT
Prakhar GaneshYao ChenXin LouMohammad Ali KhanYin YangDeming ChenMarianne WinslettHassan SajjadPreslav Nakov
2020-02-27
Marathi To English Neural Machine Translation With Near Perfect Corpus And Transformers
Swapnil Ashok Jadhav
2020-02-26
Sparse Sinkhorn Attention
| Yi TayDara BahriLiu YangDonald MetzlerDa-Cheng Juan
2020-02-26
Train Large, Then Compress: Rethinking Model Size for Efficient Training and Inference of Transformers
Zhuohan LiEric WallaceSheng ShenKevin LinKurt KeutzerDan KleinJoseph E. Gonzalez
2020-02-26
Multi-task Learning with Multi-head Attention for Multi-choice Reading Comprehension
Hui Wan
2020-02-26
MiniLM: Deep Self-Attention Distillation for Task-Agnostic Compression of Pre-Trained Transformers
| Wenhui WangFuru WeiLi DongHangbo BaoNan YangMing Zhou
2020-02-25
Exploring BERT Parameter Efficiency on the Stanford Question Answering Dataset v2.0
Eric Hulburd
2020-02-25
Fixed Encoder Self-Attention Patterns in Transformer-Based Machine Translation
Alessandro RaganatoYves ScherrerJörg Tiedemann
2020-02-24
GRET: Global Representation Enhanced Transformer
Rongxiang WengHaoran WeiShujian HuangHeng YuLidong BingWeihua LuoJiajun Chen
2020-02-24
Accessing Higher-level Representations in Sequential Transformers with Feedback Memory
Angela FanThibaut LavrilEdouard GraveArmand JoulinSainbayar Sukhbaatar
2020-02-21
Transformer Hawkes Process
| Simiao ZuoHaoming JiangZichong LiTuo ZhaoHongyuan Zha
2020-02-21
Learning Dynamic Belief Graphs to Generalize on Text-Based Games
Ashutosh AdhikariXingdi YuanMarc-Alexandre CôtéMikuláš ZelinkaMarc-Antoine RondeauRomain LarochePascal PoupartJian TangAdam TrischlerWilliam L. Hamilton
2020-02-21
Molecule Attention Transformer
| Łukasz MaziarkaTomasz DanelSławomir MuchaKrzysztof RatajJacek TaborStanisław Jastrzębski
2020-02-19
LAMBERT: Layout-Aware (Language) Modeling using BERT for information extraction
Łukasz GarncarekRafał PowalskiTomasz StanisławekBartosz TopolskiPiotr HalamaFilip Graliński
2020-02-19
Tree-structured Attention with Hierarchical Accumulation
Xuan-Phi NguyenShafiq JotySteven C. H. HoiRichard Socher
2020-02-19
Toward Making the Most of Context in Neural Machine Translation
Zaixiang ZhengXiang YueShujian HuangJiajun ChenAlexandra Birch
2020-02-19
Gradient-Based Adversarial Training on Transformer Networks for Detecting Check-Worthy Factual Claims
Kevin MengDamian JimenezFatma ArslanJacob Daniel DevasierDaniel ObembeChengkai Li
2020-02-18
Uncertainty in Structured Prediction
Andrey MalininMark Gales
2020-02-18
Hierarchical Transformer Network for Utterance-level Emotion Recognition
QingBiao LiChunHua WuKangFeng ZhengZhe Wang
2020-02-18
Sequential Latent Knowledge Selection for Knowledge-Grounded Dialogue
| Byeongchang KimJaewoo AhnGunhee Kim
2020-02-18
Conditional Self-Attention for Query-based Summarization
Yujia XieTianyi ZhouYi MaoWeizhu Chen
2020-02-18
Controlling Computation versus Quality for Neural Sequence Models
Ankur BapnaNaveen ArivazhaganOrhan Firat
2020-02-17
Low-Rank Bottleneck in Multi-head Attention Models
Srinadh BhojanapalliChulhee YunAnkit Singh RawatSashank J. ReddiSanjiv Kumar
2020-02-17
A Financial Service Chatbot based on Deep Bidirectional Transformers
Shi YuYuxin ChenHussain Zaidi
2020-02-17
Multi-layer Representation Fusion for Neural Machine Translation
Qiang WangFuxue LiTong XiaoYanyang LiYinqiao LiJingbo Zhu
2020-02-16
Neural Machine Translation with Joint Representation
| Yanyang LiQiang WangTong XiaoTongran LiuJingbo Zhu
2020-02-16
UniViLM: A Unified Video and Language Pre-Training Model for Multimodal Understanding and Generation
Huaishao LuoLei JiBotian ShiHaoyang HuangNan DuanTianrui LiXilin ChenMing Zhou
2020-02-15
Small energy masking for improved neural network training for end-to-end speech recognition
Chanwoo KimKwangyoun KimSathish Reddy Indurthi
2020-02-15
Transformer on a Diet
| Chenguang WangZihao YeAston ZhangZheng ZhangAlexander J. Smola
2020-02-14
Towards an Appropriate Query, Key, and Value Computation for Knowledge Tracing
Youngduck ChoiYoungnam LeeJunghyun ChoJineon BaekByungsoo KimYeongmin ChaDongmin ShinChan BaeJaewe Heo
2020-02-14
Stress Test Evaluation of Transformer-based Models in Natural Language Understanding Tasks
Carlos AspillagaAndrés CarvalloVladimir Araujo
2020-02-14
Deep Attentive Study Session Dropout Prediction in Mobile Learning Environment
Youngnam LeeDongmin ShinHyunBin LohJaemin LeePiljae ChaeJunghyun ChoSeoyon ParkJinhwan LeeJineon BaekByungsoo KimYoungduck Choi
2020-02-14
Sparse and Structured Visual Attention
Pedro Henrique MartinsVlad NiculaeZita MarinhoAndré Martins
2020-02-13
Attentional Speech Recognition Models Misbehave on Out-of-domain Utterances
Phillip KeungWei NiuYichao LuJulian SalazarVikas Bhardwaj
2020-02-12
End-to-End Face Parsing via Interlinked Convolutional Neural Networks
| Zi YinValentin YiuXiaolin HuLiang Tang
2020-02-12
On Layer Normalization in the Transformer Architecture
Ruibin XiongYunchang YangDi HeKai ZhengShuxin ZhengChen XingHuishuai ZhangYanyan LanLiwei WangTie-Yan Liu
2020-02-12
GLU Variants Improve Transformer
| Noam Shazeer
2020-02-12
Training with Streaming Annotation
Tongtao ZhangHeng JiShih-Fu ChangMarjorie Freedman
2020-02-11
Superbloom: Bloom filter meets Transformer
John AndersonQingqing HuangWalid KricheneSteffen RendleLi Zhang
2020-02-11
Pre-training Tasks for Embedding-based Large-scale Retrieval
Wei-Cheng ChangFelix X. YuYin-Wen ChangYiming YangSanjiv Kumar
2020-02-10
End-to-End Multi-speaker Speech Recognition with Transformer
Xuankai ChangWangyou ZhangYanmin QianJonathan Le RouxShinji Watanabe
2020-02-10
Deep Representation Learning for Dynamical Systems Modeling
Anna ShalovaIvan Oseledets
2020-02-10
StickyPillars: Robust and Efficient Feature Matching on Point Clouds using Graph Neural Networks
Martin SimonKai FischerStefan MilzChristian WittFlorian OelsnerPatrick MaederHorst-Michael Gross
2020-02-10
On the distance between two neural networks and the stability of learning
| Jeremy BernsteinArash VahdatYisong YueMing-Yu Liu
2020-02-09
Blank Language Models
Tianxiao ShenVictor QuachRegina BarzilayTommi Jaakkola
2020-02-08
Multimodal Matching Transformer for Live Commenting
Chaoqun DuanLei CuiShuming MaFuru WeiConghui ZhuTiejun Zhao
2020-02-07
Transformer Transducer: A Streamable Speech Recognition Model with Transformer Encoders and RNN-T Loss
| Qian ZhangHan LuHasim SakAnshuman TripathiErik McDermottStephen KooShankar Kumar
2020-02-07
Transformer-Capsule Model for Intent Detection
Aleksander ObuchowskiMichał Lew
2020-02-07
perm2vec: Graph Permutation Selection for Decoding of Error Correction Codes using Self-Attention
Nir RavivAvi CaciularuTomer RavivJacob GoldbergerYair Be'ery
2020-02-06
Few-Shot Learning as Domain Adaptation: Algorithm and Analysis
Jiechao GuanZhiwu LuTao XiangJi-Rong Wen
2020-02-06
Aligning the Pretraining and Finetuning Objectives of Language Models
Nuo Wang PierseJingwen Lu
2020-02-05
Vocoder-free End-to-End Voice Conversion with Transformer Network
June-Woo KimHo-Young JungMinho Lee
2020-02-05
Learning Long- and Short-Term User Literal-Preference with Multimodal Hierarchical Transformer Network for Personalized Image Caption
Wei ZhangYue YingPan LuHongyuan Zha
2020-02-04
Multistage Model for Robust Face Alignment Using Deep Neural Networks
Huabin WangRui ChengJian ZhouLiang TaoHon Keung Kwan
2020-02-04
Interpretable & Time-Budget-Constrained Contextualization for Re-Ranking
| Sebastian HofstätterMarkus ZlabingerAllan Hanbury
2020-02-04
IART: Intent-aware Response Ranking with Transformers in Information-seeking Conversation Systems
| Liu YangMinghui QiuChen QuCen ChenJiafeng GuoYongfeng ZhangW. Bruce CroftHaiqing Chen
2020-02-03
Pop Music Transformer: Generating Music with Rhythm and Harmony
| Yu-Siang HuangYi-Hsuan Yang
2020-02-01
Bridging Text and Video: A Universal Multimodal Transformer for Video-Audio Scene-Aware Dialog
Zekang LiZongjia LiJinchao ZhangYang FengCheng NiuJie Zhou
2020-02-01
Pretrained Transformers for Simple Question Answering over Knowledge Graphs
D. LukovnikovA. FischerJ. Lehmann
2020-01-31
Interpretable Rumor Detection in Microblogs by Attending to User Interactions
| Ling Min Serena KhooHai Leong ChieuZhong QianJing Jiang
2020-01-29
A Study of Pyramid Structure for Code Correction
Shan HuangXiao ZhouSang Chin
2020-01-28
Applying Recent Innovations from NLP to MOOC Student Course Trajectory Modeling
Clarence ChenZachary Pardos
2020-01-23
Recommending Themes for Ad Creative Design via Visual-Linguistic Representations
| Yichao ZhouShaunak MishraManisha VermaNarayan BhamidipatiWei Wang
2020-01-20
Multi-level Head-wise Match and Aggregation in Transformer for Textual Sequence Matching
Shuohang WangYunshi LanYi TayJing JiangJingjing Liu
2020-01-20
Deep Learning for Hindi Text Classification: A Comparison
Ramchandra JoshiPurvi GoelRaviraj Joshi
2020-01-19
A multimodal deep learning approach for named entity recognition from social media
Meysam Asgari-ChenaghluM. Reza Feizi-DerakhshiLeili FarzinvashM. A. BalafarCina Motamed
2020-01-19
Shifted and Squeezed 8-bit Floating Point format for Low-Precision Training of Deep Neural Networks
Léopold CambierAnahita BhiwandiwallaTing GongMehran NekuiiOguz H ElibolHanlin Tang
2020-01-16
Non-Autoregressive Machine Translation with Disentangled Context Transformer
| Jungo KasaiJames CrossMarjan GhazvininejadJiatao Gu
2020-01-15
Insertion-Deletion Transformer
Laura RuisMitchell SternJulia ProskurniaWilliam Chan
2020-01-15
Transformer-based Online CTC/attention End-to-End Speech Recognition Architecture
Haoran MiaoGaofeng ChengChangfeng GaoPengyuan ZhangYonghong Yan
2020-01-15
The problems with using STNs to align CNN feature maps
Lukas FinnvedenYlva JanssonTony Lindeberg
2020-01-14
Auto Completion of User Interface Layout Design Using Transformer-Based Tree Decoders
Yang LiJulien AmelotXin ZhouSamy BengioSi Si
2020-01-14
Reformer: The Efficient Transformer
| Nikita KitaevŁukasz KaiserAnselm Levskaya
2020-01-13
Urdu-English Machine Transliteration using Neural Networks
Usman Mohy ud Din
2020-01-12
Spatial-Temporal Transformer Networks for Traffic Flow Forecasting
Mingxing XuWenrui DaiChunmiao LiuXing GaoWeiyao LinGuo-Jun QiHongkai Xiong
2020-01-09
Streaming automatic speech recognition with the transformer model
Niko MoritzTakaaki HoriJonathan Le Roux
2020-01-08
RECAST: Interactive Auditing of Automatic Toxicity Detection Models
Austin P. WrightOmar ShaikhHaekyu ParkWill EppersonMuhammed AhmedStephane PinelDiyi YangDuen Horng Chau
2020-01-07
FDFtNet: Facing Off Fake Images using Fake Detection Fine-tuning Network
Hyeonseong JeonYoungoh BangSimon S. Woo
2020-01-05
Learning Accurate Integer Transformer Machine-Translation Models
Ephrem Wu
2020-01-03
Two-Level Transformer and Auxiliary Coherence Modeling for Improved Text Segmentation
Goran GlavašSwapna Somasundaran
2020-01-03
Representing Unordered Data Using Complex-Weighted Multiset Automata
Justin DeBenedettoDavid Chiang
2020-01-02
Improved Training Techniques for Online Neural Machine Translation
Anonymous
2020-01-01
Efficient Transformer for Mobile Applications
Anonymous
2020-01-01
Resolving Lexical Ambiguity in English–Japanese Neural Machine Translation
Anonymous
2020-01-01
Kaleidoscope: An Efficient, Learnable Representation For All Structured Linear Maps
Anonymous
2020-01-01
Faster and Just As Accurate: A Simple Decomposition for Transformer Models
Anonymous
2020-01-01
Lossless Data Compression with Transformer
Anonymous
2020-01-01
Sparse Transformer: Concentrated Attention Through Explicit Selection
Anonymous
2020-01-01
Forecasting Deep Learning Dynamics with Applications to Hyperparameter Tuning
Anonymous
2020-01-01
Compressive Transformers for Long-Range Sequence Modelling
Anonymous
2020-01-01
MUSE: Multi-Scale Attention Model for Sequence to Sequence Learning
Anonymous
2020-01-01
DeFINE: Deep Factorized Input Word Embeddings for Neural Sequence Modeling
Anonymous
2020-01-01
Concise Multi-head Attention Models
Anonymous
2020-01-01
Logic and the 2-Simplicial Transformer
Anonymous
2020-01-01
DeepEnFM: Deep neural networks with Encoder enhanced Factorization Machine
Anonymous
2020-01-01
GOING BEYOND TOKEN-LEVEL PRE-TRAINING FOR EMBEDDING-BASED LARGE-SCALE RETRIEVAL
Anonymous
2020-01-01
CGT: Clustered Graph Transformer for Urban Spatio-temporal Prediction
Anonymous
2020-01-01
Fully Quantized Transformer for Improved Translation
Anonymous
2020-01-01
Augmenting Transformers with KNN-Based Composite Memory
Anonymous
2020-01-01
BERT-AL: BERT for Arbitrarily Long Document Understanding
Ruixuan ZhangZhuoyu WeiYu ShiYining Chen
2020-01-01
NEURAL EXECUTION ENGINES
Yujun YanKevin SwerskyDanai KoutraParthasarathy RanganathanMilad Hashemi
2020-01-01
Poly-encoders: Architectures and Pre-training Strategies for Fast and Accurate Multi-sentence Scoring
Samuel HumeauKurt ShusterMarie-Anne LachauxJason Weston
2020-01-01
Attention over Phrases
Anonymous
2020-01-01
Group-Transformer: Towards A Lightweight Character-level Language Model
Anonymous
2020-01-01
Global Relational Models of Source Code
Anonymous
2020-01-01
Putting Machine Translation in Context with the Noisy Channel Model
Anonymous
2020-01-01
Deep Attentive Ranking Networks for Learning to Order Sentences
Pawan KumarDhanajit BrahmaHarish KarnickPiyush Rai
2019-12-31
EEG based Continuous Speech Recognition using Transformers
Gautam KrishnaCo TranMason CarnahanAhmed H Tewfik
2019-12-31
AraNet: A Deep Learning Toolkit for Arabic Social Media
Muhammad Abdul-MageedChiyu ZhangAzadeh HashemiEl Moatez Billah Nagoudi
2019-12-30
All-in-One Image-Grounded Conversational Agents
Da JuKurt ShusterY-Lan BoureauJason Weston
2019-12-28
Is Attention All What You Need? -- An Empirical Investigation on Convolution-Based Active Memory and Self-Attention
Thomas DowdellHongyu Zhang
2019-12-27
Encoding word order in complex embeddings
Benyou WangDonghao ZhaoChristina LiomaQiuchi LiPeng ZhangJakob Grue Simonsen
2019-12-27
Explicit Sparse Transformer: Concentrated Attention Through Explicit Selection
Guangxiang ZhaoJunyang LinZhiyuan ZhangXuancheng RenQi SuXu Sun
2019-12-25
Multi-Graph Transformer for Free-Hand Sketch Recognition
Peng XuChaitanya K. JoshiXavier Bresson
2019-12-24
Improving Abstractive Text Summarization with History Aggregation
Pengcheng LiaoChuang ZhangXiaojun ChenXiaofei Zhou
2019-12-24
end-to-end training of a large vocabulary end-to-end speech recognition system
Chanwoo KimSungsoo KimKwangyoun KimMehul KumarJiyeon KimKyungmin LeeChangwoo HanAbhinav GargEunhyang KimMinkyoo ShinShatrughan SinghLarry HeckDhananjaya Gowda
2019-12-22
Learning and Evaluating Contextual Embedding of Source Code
Aditya KanadePetros ManiatisGogul BalakrishnanKensen Shi
2019-12-21
Are Transformers universal approximators of sequence-to-sequence functions?
Chulhee YunSrinadh BhojanapalliAnkit Singh RawatSashank J. ReddiSanjiv Kumar
2019-12-20
ET-USB: Transformer-Based Sequential Behavior Modeling for Inbound Customer Service
Ta-Chun SuGuan-Ying Chen
2019-12-20
Axial Attention in Multidimensional Transformers
Jonathan HoNal KalchbrennerDirk WeissenbornTim Salimans
2019-12-20
Shareable Representations for Search Query Understanding
Mukul KumarYouna HuWill HeaddenRahul GoutamHeran LinBing Yin
2019-12-20
Temporal Fusion Transformers for Interpretable Multi-horizon Time Series Forecasting
Bryan LimSercan O. ArikNicolas LoeffTomas Pfister
2019-12-19
Meshed-Memory Transformer for Image Captioning
Marcella CorniaMatteo StefaniniLorenzo BaraldiRita Cucchiara
2019-12-17
Voice Transformer Network: Sequence-to-Sequence Voice Conversion Using Transformer with Text-to-Speech Pretraining
Wen-Chin HuangTomoki HayashiYi-Chiao WuHirokazu KameokaTomoki Toda
2019-12-14
BERTQA -- Attention on Steroids
Ankit ChadhaRewa Sood
2019-12-14
WaLDORf: Wasteless Language-model Distillation On Reading-comprehension
James Yi TianAlexander P. KreuzerPai-Hung ChenHans-Martin Will
2019-12-13
Diffeomorphic Temporal Alignment Nets
Ron Shapira WeberMatan EyalNicki SkafteOren ShrikiOren Freifeld
2019-12-10
Encoding Musical Style with Transformer Autoencoders
Kristy ChoiCurtis HawthorneIan SimonMonica DinculescuJesse Engel
2019-12-10
Transformer Based Reinforcement Learning For Games
Uddeshya UpadhyayNikunj ShahSucheta RavikantiMayanka Medhe
2019-12-09
Learning a Layout Transfer Network for Context Aware Object Detection
Tao WangXuming HeYuanzheng CaiGuobao Xiao
2019-12-09
Bidirectional Scene Text Recognition with a Single Decoder
Maurits BleekerMaarten de Rijke
2019-12-08
Personalized Patent Claim Generation and Measurement
Jieh-Sheng Lee
2019-12-07
Weak Supervision helps Emergence of Word-Object Alignment and improves Vision-Language Tasks
Corentin KervadecGrigory AntipovMoez BaccoucheChristian Wolf
2019-12-06
Synchronous Transformers for End-to-End Speech Recognition
Zhengkun TianJiangyan YiYe BaiJianhua TaoShuai ZhangZhengqi Wen
2019-12-06
Self-Supervised Contextual Language Representation of Radiology Reports to Improve the Identification of Communication Urgency
Xing MengCraig H. GanoeRyan T. SiebergYvonne Y. CheungSaeed Hassanpour
2019-12-05
AMUSED: A Multi-Stream Vector Representation Method for Use in Natural Dialogue
Gaurav KumarRishabh JoshiJaspreet SinghPromod Yenigalla
2019-12-04
TU Wien @ TREC Deep Learning '19 -- Simple Contextualization for Re-ranking
Sebastian HofstätterMarkus ZlabingerAllan Hanbury
2019-12-03
Multi-Scale Self-Attention for Text Classification
Qipeng GuoXipeng QiuPengfei LiuXiangyang XueZheng Zhang
2019-12-02
BLiMP: The Benchmark of Linguistic Minimal Pairs for English
Alex WarstadtAlicia ParrishHaokun LiuAnhad MohananeyWei PengSheng-Fu WangSamuel R. Bowman
2019-12-02
Solving Arithmetic Word Problems Automatically Using Transformer and Unambiguous Representations
Kaden GriffithJugal Kalita
2019-12-02
Long Distance Relationships without Time Travel: Boosting the Performance of a Sparse Predictive Autoencoder in Sequence Modeling
Jeremy GordonDavid RawlinsonSubutai Ahmad
2019-12-02
Neural Academic Paper Generation
Samet DemirUras MutluÖzgur Özdemir
2019-12-02
Audiovisual Transformer Architectures for Large-Scale Classification and Synchronization of Weakly Labeled Audio Events
Wim BoesHugo Van hamme
2019-12-02
Hybrid 8-bit Floating Point (HFP8) Training and Inference for Deep Neural Networks
Xiao SunJungwook ChoiChia-Yu ChenNaigang WangSwagath VenkataramaniVijayalakshmi (Viji) SrinivasanXiaodong CuiWei ZhangKailash Gopalakrishnan
2019-12-01
Minimum Bayes Risk Training of RNN-Transducer for End-to-End Speech Recognition
Chao WengChengzhu YuJia CuiChunlei ZhangDong Yu
2019-11-28
Do Attention Heads in BERT Track Syntactic Dependencies?
Phu Mon HtutJason PhangShikha BordiaSamuel R. Bowman
2019-11-27
Taking a Stance on Fake News: Towards Automatic Disinformation Assessment via Deep Bidirectional Transformer Language Models for Stance Detection
Chris DulhantyJason L. DeglintIbrahim Ben DayaAlexander Wong
2019-11-27
SimpleBooks: Long-term dependency book dataset with simplified English vocabulary for word-level language modeling
Huyen Nguyen
2019-11-27
DeFINE: DEep Factorized INput Token Embeddings for Neural Sequence Modeling
Sachin MehtaRik Koncel-KedziorskiMohammad RastegariHannaneh Hajishirzi
2019-11-27
Password-conditioned Anonymization and Deanonymization with Face Identity Transformers
Xiuye GuWeixin LuoMichael S. RyooYong Jae Lee
2019-11-26
Relevance-Promoting Language Model for Short-Text Conversation
Xin LiPiji LiWei BiXiaojiang LiuWai Lam
2019-11-26
Efficient Attention Mechanism for Visual Dialog that can Handle All the Interactions between Multiple Inputs
Van-Quang NguyenMasanori SuganumaTakayuki Okatani
2019-11-26
Autoencoding Undirected Molecular Graphs With Neural Networks
Jeppe Johan Waarkjær OlsenPeter Ebert ChristensenMartin Hangaard HansenAlexander Rosenberg Johansen
2019-11-26
Who did They Respond to? Conversation Structure Modeling using Masked Hierarchical Transformer
Henghui ZhuFeng NanZhiguo WangRamesh NallapatiBing Xiang
2019-11-25
Learning to Reuse Translations: Guiding Neural Machine Translation with Examples
Qian CaoShaohui KuangDeyi Xiong
2019-11-25
Spectral Graph Transformer Networks for Brain Surface Parcellation
Ran HeKarthik GopinathChristian DesrosiersHerve Lombaert
2019-11-22
Neuron Interaction Based Representation Composition for Neural Machine Translation
Jian LiXing WangBaosong YangShuming ShiMichael R. LyuZhaopeng Tu
2019-11-22
Factorized Multimodal Transformer for Multimodal Sequential Learning
Amir ZadehChengfeng MaoKelly ShiYiwei ZhangPaul Pu LiangSoujanya PoriaLouis-Philippe Morency
2019-11-22
Improving N-gram Language Models with Pre-trained Deep Transformer
Yiren WangHongzhao HuangZhe LiuYutong PangYongqiang WangChengXiang ZhaiFuchun Peng
2019-11-22
WildMix Dataset and Spectro-Temporal Transformer Model for Monoaural Audio Source Separation
Amir ZadehTianjun MaSoujanya PoriaLouis-Philippe Morency
2019-11-21
MarioNETte: Few-shot Face Reenactment Preserving Identity of Unseen Targets
Sungjoo HaMartin KersnerBeomsu KimSeokjun SeoDongyoung Kim
2019-11-19
Graph Transformer for Graph-to-Sequence Learning
Deng CaiWai Lam
2019-11-18
MUSE: Parallel Multi-Scale Attention for Sequence to Sequence Learning
Guangxiang ZhaoXu SunJingjing XuZhiyuan ZhangLiangchen Luo
2019-11-17
Music theme recognition using CNN and self-attention
Manoj SukhavasiSainath Adapa
2019-11-16
Sequential Recommendation with Relation-Aware Kernelized Self-Attention
Mingi JiWeonyoung JooKyungwoo SongYoon-Yeong KimIl-Chul Moon
2019-11-15
Evaluating robustness of language models for chief complaint extraction from patient-generated text
Ilya ValmianskiCaleb GoodwinIan M. FinnNaqi KhanDaniel S. Zisook
2019-11-15
Selection-based Question Answering of an MOOC
Atul SahaySmita GholkarKavi Arya
2019-11-15
Iterative Answer Prediction with Pointer-Augmented Multimodal Transformers for TextVQA
Ronghang HuAmanpreet SinghTrevor DarrellMarcus Rohrbach
2019-11-14
Attention on Abstract Visual Reasoning
Lukas HahneTimo LüddeckeFlorentin WörgötterDavid Kappel
2019-11-14
Compressive Transformers for Long-Range Sequence Modelling
Jack W. RaeAnna PotapenkoSiddhant M. JayakumarTimothy P. Lillicrap
2019-11-13
Character-based NMT with Transformer
Rohit GuptaLaurent BesacierMarc DymetmanMatthias Gallé
2019-11-12
SMILES Transformer: Pre-trained Molecular Fingerprint for Low Data Drug Discovery
Shion HondaShoi ShiHiroki R. Ueda
2019-11-12
Disentangle, align and fuse for multimodal and zero-shot image segmentation
Agisilaos ChartsiasGiorgos PapanastasiouChengjia WangScott SempleDavid E. NewbyRohan DharmakumarSotirios A. Tsaftaris
2019-11-11
Attending to Entities for Better Text Understanding
Pengxiang ChengKatrin Erk
2019-11-11
TANDA: Transfer and Adapt Pre-Trained Transformer Models for Answer Sentence Selection
Siddhant GargThuy VuAlessandro Moschitti
2019-11-11
BP-Transformer: Modelling Long-Range Context via Binary Partitioning
Zihao YeQipeng GuoQuan GanXipeng QiuZheng Zhang
2019-11-11
Long-span language modeling for speech recognition
Sarangarajan ParthasarathyWilliam GaleXie ChenGeorge PolovetsShuangyu Chang
2019-11-11
Two-Headed Monster And Crossed Co-Attention Networks
Yaoyiran LiJing Jiang
2019-11-10
Improving Transformer Models by Reordering their Sublayers
Ofir PressNoah A. SmithOmer Levy
2019-11-10
Learning to Few-Shot Learn Across Diverse Natural Language Classification Tasks
Trapit BansalRishikesh JhaAndrew McCallum
2019-11-10
Distilling Knowledge Learned in BERT for Text Generation
Yen-Chun ChenZhe GanYu ChengJingzhou LiuJingjing Liu
2019-11-10
TENER: Adapting Transformer Encoder for Named Entity Recognition
Hang YanBocao DengXiaonan LiXipeng Qiu
2019-11-10
Listen and Fill in the Missing Letters: Non-Autoregressive Transformer for Speech Recognition
Nanxin ChenShinji WatanabeJesús VillalbaNajim Dehak
2019-11-10
Syntax-Infused Transformer and BERT models for Machine Translation and Natural Language Understanding
Dhanasekar SundararamanVivek SubramanianGuoyin WangShijing SiDinghan ShenDong WangLawrence Carin
2019-11-10
A Reinforced Generation of Adversarial Examples for Neural Machine Translation
Wei ZouShujian HuangJun XieXinyu DaiJiajun Chen
2019-11-09
Question Generation from Paragraphs: A Tale of Two Hierarchical Models
Vishwajeet KumarRaktim ChakiSai Teja TalluriGanesh RamakrishnanYuan-Fang LiGholamreza Haffari
2019-11-08
Lipschitz Constrained Parameter Initialization for Deep Transformers
Hongfei XuQiuhui LiuJosef van GenabithDeyi XiongJingyi Zhang
2019-11-08
Resurrecting Submodularity for Neural Text Generation
Simeng HanXiang LinShafiq Joty
2019-11-08
Graph-to-Graph Transformer for Transition-based Dependency Parsing
Alireza MohammadshahiJames Henderson
2019-11-08
Towards Hierarchical Importance Attribution: Explaining Compositional Semantics for Neural Sequence Models
Xisen JinZhongyu WeiJunyi DuXiangyang XueXiang Ren
2019-11-08
Probing Contextualized Sentence Representations with Visual Awareness
Zhuosheng ZhangRui WangKehai ChenMasao UtiyamaEiichiro SumitaHai Zhao
2019-11-07
Porous Lattice-based Transformer Encoder for Chinese NER
Xue MenggeYu BowenLiu TingwenWang BinMeng ErliLi Quangang
2019-11-07
Microsoft Research Asia's Systems for WMT19
Yingce XiaXu TanFei TianFei GaoWeicong ChenYang FanLinyuan GongYichong LengRenqian LuoYiren WangLijun WuJinhua ZhuTao QinTie-Yan Liu
2019-11-07
Learning to Answer by Learning to Ask: Getting the Best of GPT-2 and BERT Worlds
Tassilo KleinMoin Nabi
2019-11-06
Enriching Conversation Context in Retrieval-based Chatbots
Amir Vakili TahamiAzadeh Shakery
2019-11-06
CoKE: Contextualized Knowledge Graph Embedding
Quan WangPingping HuangHaifeng WangSongtai DaiWenbin JiangJing LiuYajuan LyuYong ZhuHua Wu
2019-11-06
Fast Transformer Decoding: One Write-Head is All You Need
Noam Shazeer
2019-11-06
An End-to-end Approach for Lexical Stress Detection based on Transformer
Yong RuanXiangdong WangHong LiuZhigang OuYun GaoJianfeng ChengYueliang Qian
2019-11-06
Graph Transformer Networks
Seongjun YunMinbyul JeongRaehyun KimJaewoo KangHyunwoo J. Kim
2019-11-06
Improving Bidirectional Decoding with Dynamic Target Semantics in Neural Machine Translation
Yong ShanYang FengJinchao ZhangFandong MengWen Zhang
2019-11-05
An Algorithm for Routing Capsules in All Domains
Franz A. Heinsen
2019-11-02
Machine Translation Evaluation using Bi-directional Entailment
Rakesh KhobragadeHeaven PatelAnand NamdevAnish MishraPushpak Bhattacharyya
2019-11-02
English to Hindi Multi-modal Neural Machine Translation and Hindi Image Captioning
Sahinur Rahman LaskarRohit Pratap SinghPartha PakraySivaji Bandyopadhyay
2019-11-01
CVIT's submissions to WAT-2019
Jerin PhilipShashank SiripragadaUpendra KumarVinay NamboodiriC V Jawahar
2019-11-01
LTRC-MT Simple & Effective Hindi-English Neural Machine Translation Systems at WAT 2019
Vikrant GoyalDipti Misra Sharma
2019-11-01
Long Warm-up and Self-Training: Training Strategies of NICT-2 NMT System at WAT-2019
Kenji ImamuraEiichiro Sumita
2019-11-01
Supervised neural machine translation based on data augmentation and improved training & inference process
Yixuan TongLiang LiangBoyan LiuShanshan JiangBin Dong
2019-11-01
Sarah's Participation in WAT 2019
Raymond Hendy SusantoOhnmar HtunLiling Tan
2019-11-01
Our Neural Machine Translation Systems for WAT 2019
Wei YangJun Ogata
2019-11-01
Idiap NMT System for WAT 2019 Multimodal Translation Task
Shantipriya ParidaOndřej BojarPetr Motlicek
2019-11-01
SYSTRAN @ WAT 2019: Russian-Japanese News Commentary task
Jitao XuTuAnh NguyenMinhQuang PhamJosep CregoJean Senellart
2019-11-01
Dialect Text Normalization to Normative Standard Finnish
Niko PartanenMika HämäläinenKhalid Alnajjar
2019-11-01
Recycling a Pre-trained BERT Encoder for Neural Machine Translation
Kenji ImamuraEiichiro Sumita
2019-11-01
Transformer-based Model for Single Documents Neural Summarization
Elozino EgonmwanYllias Chali
2019-11-01
Enhanced Transformer Model for Data-to-Text Generation
Li GongJosep CregoJean Senellart
2019-11-01
Mixed Multi-Head Self-Attention for Neural Machine Translation
Hongyi CuiShohei IidaPo-Hsuan HungTakehito UtsuroMasaaki Nagata
2019-11-01
Transformer and seq2seq model for Paraphrase Generation
Elozino EgonmwanYllias Chali
2019-11-01
SYSTRAN @ WNGT 2019: DGT Task
Li GongJosep CregoJean Senellart
2019-11-01
Inspecting Unification of Encoding and Matching with Transformer: A Case Study of Machine Reading Comprehension
Hangbo BaoLi DongFuru WeiWenhui WangNan YangLei CuiSonghao PiaoMing Zhou
2019-11-01
The Concordia NLG Surface Realizer at SRST 2019
Farhood FarahnakLaya RafieeLeila KosseimThomas Fevens
2019-11-01
Improving Natural Language Understanding by Reverse Mapping Bytepair Encoding
Chaodong TongHuailiang PengQiong DaiLei JiangJianghua Huang
2019-11-01
Automatically Extracting Challenge Sets for Non-Local Phenomena in Neural Machine Translation
Leshem ChoshenOmri Abend
2019-11-01
On the Relation between Position Information and Sentence Length in Neural Machine Translation
Masato NeishiNaoki Yoshinaga
2019-11-01
Self-Adaptive Scaling for Learnable Residual Structure
Fenglin LiuMeng GaoYuanxin LiuKai Lei
2019-11-01
Recurrent Positional Embedding for Neural Machine Translation
Kehai ChenRui WangMasao UtiyamaEiichiro Sumita
2019-11-01
"Transforming" Delete, Retrieve, Generate Approach for Controlled Text Style Transfer
Akhilesh SudhakarBhargav UpadhyayArjun Maheswaran
2019-11-01
Combining Global Sparse Gradients with Local Gradients in Distributed Neural Network Training
Alham Fikri AjiKenneth HeafieldNikolay Bogoychev
2019-11-01
Transformer Dissection: An Unified Understanding for Transformer's Attention via the Lens of Kernel
Yao-Hung Hubert TsaiShaojie BaiMakoto YamadaLouis-Philippe MorencyRuslan Salakhutdinov
2019-11-01
Improving Answer Selection and Answer Triggering using Hard Negatives
Sawan KumarShweta GargKartik MehtaNikhil Rasiwasia
2019-11-01
Aggregating Bidirectional Encoder Representations Using MatchLSTM for Sequence Matching
Bo ShaoYeyun GongWeizhen QiNan DuanXiaola Lin
2019-11-01
Improving Generalization of Transformer for Speech Recognition with Parallel Schedule Sampling and Relative Positional Embedding
Pan ZhouRuchao FanWei ChenJia Jia
2019-11-01
DialoGPT: Large-Scale Generative Pre-training for Conversational Response Generation
Yizhe ZhangSiqi SunMichel GalleyYen-Chun ChenChris BrockettXiang GaoJianfeng GaoJingjing LiuBill Dolan
2019-11-01
Attention Is All You Need for Chinese Word Segmentation
Sufeng DuanHai Zhao
2019-10-31
Document-level Neural Machine Translation with Inter-Sentence Attention
Shu JiangRui WangZuchao LiMasao UtiyamaKehai ChenEiichiro SumitaHai ZhaoBao-liang Lu
2019-10-31
NAT: Neural Architecture Transformer for Accurate and Compact Architectures
Yong GuoYin ZhengMingkui TanQi ChenJian ChenPeilin ZhaoJunzhou Huang
2019-10-31