Label Smoothing

Label Smoothing is a regularization technique that introduces noise into the labels. It accounts for the fact that datasets may contain labelling mistakes, in which case maximizing the log-likelihood $\log{p}\left(y\mid{x}\right)$ directly can be harmful. Assume that, for a small constant $\epsilon$, the training set label $y$ is correct with probability $1-\epsilon$ and incorrect otherwise. Label Smoothing regularizes a model based on a softmax with $k$ output values by replacing the hard $0$ and $1$ classification targets with targets of $\frac{\epsilon}{k-1}$ and $1-\epsilon$, respectively.

Source: Deep Learning, Goodfellow et al.

Image Source: When Does Label Smoothing Help?
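
The target replacement described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not code from the cited sources: the function names `smooth_labels` and `cross_entropy`, the default `epsilon=0.1`, and the toy labels are all hypothetical choices made for this example.

```python
import numpy as np


def smooth_labels(labels, num_classes, epsilon=0.1):
    """Build smoothed targets from integer class labels.

    The labelled class gets 1 - epsilon and every other class gets
    epsilon / (num_classes - 1), so each row still sums to 1.
    """
    labels = np.asarray(labels)
    n = labels.shape[0]
    # Start with the "incorrect class" value everywhere.
    targets = np.full((n, num_classes), epsilon / (num_classes - 1))
    # Overwrite the labelled class with the "correct class" value.
    targets[np.arange(n), labels] = 1.0 - epsilon
    return targets


def cross_entropy(logits, targets):
    """Mean cross-entropy between softmax(logits) and soft targets."""
    # Subtract the row maximum for a numerically stable log-softmax.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return float(-(targets * log_probs).sum(axis=1).mean())


# Hypothetical toy batch: two examples, k = 4 classes.
targets = smooth_labels([0, 2], num_classes=4, epsilon=0.1)
print(targets)
print(cross_entropy(np.zeros((2, 4)), targets))
```

Training then minimizes the cross-entropy against these soft targets instead of one-hot vectors, which keeps the model from being pushed toward arbitrarily confident predictions on labels that may be wrong.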

Latest Papers

PAPER DATE
Deep Multi-View Spatiotemporal Virtual Graph Neural Network for Significant Citywide Ride-hailing Demand Prediction
Guangyin JinZhexu XiHengyu ShaYanghe FengJincai Huang
2020-07-30
Interpretable Contextual Team-aware Item Recommendation: Application in Multiplayer Online Battle Arena Games
Andrés VillaVladimir AraujoFrancisca CattanDenis Parra
2020-07-30
FiSSA at SemEval-2020 Task 9: Fine-tuned For Feelings
Bertelt BraaksmaRichard ScholtensStan van SuijlekomRemy WangAhmet Üstün
2020-07-24
PP-YOLO: An Effective and Efficient Implementation of Object Detector
Xiang LongKaipeng DengGuanzhong WangYang ZhangQingqing DangYuan GaoHui ShenJianguo RenShumin HanErrui DingShilei Wen
2020-07-23
CrossTransformers: spatially-aware few-shot transfer
Carl DoerschAnkush GuptaAndrew Zisserman
2020-07-22
Analogical Reasoning for Visually Grounded Language Acquisition
Bo WuHaoyu QinAlireza ZareianCarl VondrickShih-Fu Chang
2020-07-22
SliceOut: Training Transformers and CNNs faster while using less memory
Pascal NotinAidan N. GomezJoanna YooYarin Gal
2020-07-21
Neural Machine Translation with Error Correction
Kaitao SongXu TanJianfeng Lu
2020-07-21
Unified Multisensory Perception: Weakly-Supervised Audio-Visual Video Parsing
Yapeng TianDingzeyu LiChenliang Xu
2020-07-21
Learning Joint Spatial-Temporal Transformations for Video Inpainting
Yanhong ZengJianlong FuHongyang Chao
2020-07-20
Conformer-Kernel with Query Term Independence for Document Retrieval
Bhaskar MitraSebastian HofstatterHamed ZamaniNick Craswell
2020-07-20
Temporal Pointwise Convolutional Networks for Length of Stay Prediction in the Intensive Care Unit
Emma RocheteauPietro LiòStephanie Hyland
2020-07-18
Feature Pyramid Transformer
Dong ZhangHanwang ZhangJinhui TangMeng WangXiansheng HuaQianru Sun
2020-07-18
Deep Learning Based Traffic Surveillance System For Missing and Suspicious Car Detection
K. V. KadambariVishnu Vardhan Nimmalapudi
2020-07-17
CTC-Segmentation of Large Corpora for German End-to-end Speech Recognition
Ludwig KürzingerDominik WinkelbauerLujun LiTobias WatzelGerhard Rigoll
2020-07-17
The Monte Carlo Transformer: a stochastic self-attention model for sequence prediction
Alice MartinCharles OllionFlorian StrubSylvain Le CorffOlivier Pietquin
2020-07-15
Deep Transformer based Data Augmentation with Subword Units for Morphologically Rich Online ASR
Balázs TarjánGyörgy SzaszákTibor FegyóPéter Mihajlik
2020-07-14
Contextualized Code Representation Learning for Commit Message Generation
Lun Yiu NieCuiyun GaoZhicong ZhongWai LamYang LiuZenglin Xu
2020-07-14
Emoji Prediction: Extensions and Benchmarking
Weicheng MaRuibo LiuLili WangSoroush Vosoughi
2020-07-14
Learning and Exploiting Interclass Visual Correlations for Medical Image Classification
Dong WeiShilei CaoKai MaYefeng Zheng
2020-07-13
Paranoid Transformer: Reading Narrative of Madness as Computational Approach to Creativity
Yana AgafonovaAlexey TikhonovIvan P. Yamshchikov
2020-07-13
Transformer with Depth-Wise LSTM
Hongfei XuQiuhui LiuDeyi XiongJosef van Genabith
2020-07-13
Sparse Graph to Sequence Learning for Vision Conditioned Long Textual Sequence Generation
Aditya MogadalaMarius MosbachDietrich Klakow
2020-07-12
TERA: Self-Supervised Learning of Transformer Encoder Representation for Speech
Andy T. LiuShang-Wen LiHung-yi Lee
2020-07-12
Sequence Generation with Mixed Representations
Lijun Wu Shufang Xie Yingce Xia Fan Yang Tao Qin Jianhuang Lai Tie-Yan Liu
2020-07-11
BISON:BM25-weighted Self-Attention Framework for Multi-Fields Document Search
Xuan ShanChuanjie LiuYiqian XiaQi ChenYusi ZhangAngen LuoYuxiang Luo
2020-07-10
DeepSinger: Singing Voice Synthesis with Data Mined From the Web
Yi RenXu TanTao QinJian LuanZhou ZhaoTie-Yan Liu
2020-07-09
Single architecture and multiple task deep neural network for altered fingerprint analysis
Oliver GiudiceMattia LitricoSebastiano Battiato
2020-07-09
Advances of Transformer-Based Models for News Headline Generation
Alexey BukhtiyarovIlya Gusev
2020-07-09
The Go Transformer: Natural Language Modeling for Game Play
Matthew CiolinoDavid NoeverJosh Kalin
2020-07-07
Learning and Reasoning with the Graph Structure Representation in Robotic Surgery
Mobarakol IslamLalithkumar SeenivasanLim Chwee MingHongliang Ren
2020-07-07
Do Transformers Need Deep Long-Range Memory
Jack W. RaeAli Razavi
2020-07-07
Relevance Transformer: Generating Concise Code Snippets with Relevance Feedback
Carlos GemmellFederico RossettoJeffrey Dalton
2020-07-06
Learning to Segment Anatomical Structures Accurately from One Exemplar
Yuhang LuWeijian LiKang ZhengYirui WangAdam P. HarrisonChihung LinSong WangJing XiaoLe LuChang-Fu KuoShun Miao
2020-07-06
Abstractive and mixed summarization for long-single documents
Roger BarrullJugal Kalita
2020-07-03
Self-Attention Guided Copy Mechanism for Abstractive Summarization
Song XuHaoran LiPeng YuanYouzheng WuXiaodong HeBowen Zhou
2020-07-01
Multimodal Transformer for Multimodal Machine Translation
Shaowei YaoXiaojun Wan
2020-07-01
Paraphrase Generation by Learning How to Edit from Samples
Amirhossein KazemnejadMohammadreza SalehiMahdieh Soleymani Baghshah
2020-07-01
Dependency Graph Enhanced Dual-transformer Structure for Aspect-based Sentiment Classification
Hao TangDonghong JiChenliang LiQiji Zhou
2020-07-01
In Neural Machine Translation, What Does Transfer Learning Transfer?
Alham Fikri AjiNikolay BogoychevKenneth HeafieldRico Sennrich
2020-07-01
Feature Projection for Improved Text Classification
Qi QinWenpeng HuBing Liu
2020-07-01
Addressing Posterior Collapse with Mutual Information for Improved Variational Neural Machine Translation
Arya D. McCarthyXian LiJiatao GuNing Dong
2020-07-01
DIALOGPT : Large-Scale Generative Pre-training for Conversational Response Generation
Yizhe ZhangSiqi SunMichel GalleyYen-Chun ChenChris BrockettXiang GaoJianfeng GaoJingjing LiuBill Dolan
2020-07-01
Combining Subword Representations into Word-level Representations in the Transformer Architecture
Noe CasasMarta R. Costa-jussàJosé A. R. Fonollosa
2020-07-01
Robust Neural Machine Translation with ASR Errors
Haiyang XueYang FengShuhao GuWei Chen
2020-07-01
An empirical investigation of neural methods for content scoring of science explanations
Brian RiordanSarah BichlerAllison BradfordJennifer King ChenKorah WileyLibby GerardMarcia C. Linn
2020-07-01
Neural Transduction of Letter Position Dyslexia using an Anagram Matrix Representation
Avi Bleiweiss
2020-07-01
Character aware models with similarity learning for metaphor detection
Tarun KumarYashvardhan Sharma
2020-07-01
A Transformer Approach to Contextual Sarcasm Detection in Twitter
Hunter GregorySteven LiPouya MohammadiNatalie TarnRachel DraelosCynthia Rudin
2020-07-01
Adaptation of Multilingual Transformer Encoder for Robust Enhanced Universal Dependency Parsing
Han HeJinho D. Choi
2020-07-01
KIT's IWSLT 2020 SLT Translation System
Ngoc-Quan PhamFelix SchneiderTuan-Nam NguyenThanh-Le HaThai Son NguyenMaximilian AwiszusSebastian StükerAlex Waibeler
2020-07-01
End-to-End Simultaneous Translation System for IWSLT2020 Using Modality Agnostic Meta-Learning
Hou Jeung HanMohd Abbas ZaidiSathish Reddy IndurthiNikhil Kumar LakumarapuBeomseok LeeSangha Kim
2020-07-01
End-to-End Offline Speech Translation System for IWSLT 2020 using Modality Agnostic Meta-Learning
Nikhil Kumar LakumarapuBeomseok LeeSathish Reddy IndurthiHou Jeung HanMohd Abbas ZaidiSangha Kim
2020-07-01
SRPOL's System for the IWSLT 2020 End-to-End Speech Translation Task
Tomasz PotapczykPawel Przybysz
2020-07-01
The AFRL IWSLT 2020 Systems: Work-From-Home Edition
Brian OreEric HansenTim AndersonJeremy Gwinnup
2020-07-01
OPPO's Machine Translation System for the IWSLT 2020 Open Domain Translation Task
Qian ZhangXiaopu LiDawei DangTingxun ShiDi AiZhengshan XueJie Hao
2020-07-01
CASIA's System for IWSLT 2020 Open Domain Translation
Qian WangYuchen LiuCong MaYu LuYining WangLong ZhouYang ZhaoJiajun ZhangChengqing Zong
2020-07-01
Deep Blue Sonics' Submission to IWSLT 2020 Open Domain Translation Task
Enmin SuYi Ren
2020-07-01
University of Tsukuba's Machine Translation System for IWSLT20 Open Domain Translation Task
Hongyi CuiYizhen WeiShohei IidaTakehito UtsuroMasaaki Nagata
2020-07-01
Xiaomi's Submissions for IWSLT 2020 Open Domain Translation Task
Yuhui SunMengxue GuoXiang LiJianwei CuiBin Wang
2020-07-01
The HW-TSC Video Speech Translation System at IWSLT 2020
Minghan WangHao YangYao DengYing QinLizhi LeiDaimeng WeiHengchao ShangNing XieXiaochun LiJiaxian Guo
2020-07-01
Towards Stream Translation: Adaptive Computation Time for Simultaneous Machine Translation
Felix SchneiderAlex Waibeler
2020-07-01
Compressing Neural Machine Translation Models with 4-bit Precision
Alham Fikri AjiKenneth Heafield
2020-07-01
Training and Inference Methods for High-Coverage Neural Machine Translation
Michael YangYixin LiuRahul Mayuranath
2020-07-01
Expand and Filter: CUNI and LMU Systems for the WNGT 2020 Duolingo Shared Task
Jindřich LibovickýZdeněk KasnerJindřich HelclOndřej Dušek
2020-07-01
The NiuTrans System for WNGT 2020 Efficiency Task
Chi HuBei LiYinqiao LiYe LinYanyang LiChenglong WangTong XiaoJingbo Zhu
2020-07-01
Efficient and High-Quality Neural Machine Translation with OpenNMT
Guillaume KleinDakun ZhangClément ChouteauJosep CregoJean Senellart
2020-07-01
Improving Document-Level Neural Machine Translation with Domain Adaptation
Sami Ul HaqSadaf Abdul RaufArslan ShoukatNoor-e- Hira
2020-07-01
CopyBERT: A Unified Approach to Question Generation with Self-Attention
Stalin VaranasiSaadullah AminGuenter Neumann
2020-07-01
How to Tame Your Data: Data Augmentation for Dialog State Tracking
Adam SummervilleJordan HashemiJames RyanWilliam Ferguson
2020-07-01
Methods for Extracting Information from Messages from Primary Care Providers to Specialists
Xiyu DingMichael BarnettAteev MehrotraTimothy Miller
2020-07-01
Generating Medical Reports from Patient-Doctor Conversations Using Sequence-to-Sequence Models
Seppo EnarviMarilisa AmoiaMiguel Del-Agua TebaBrian DelaneyFrank DiehlStefan HahnKristina HarrisLiam McGrathYue PanJoel PintoLuca RubiniMiguel RuizGag SingheepFabian StemmerWeiyi SunPaul VozilaThomas LinRanjani Ramamurthy
2020-07-01
Enhancing Transformer with Sememe Knowledge
Yuhui ZhangChenghao YangZhengping ZhouZhiyuan Liu
2020-07-01
Grapheme-to-Phoneme Conversion with a Multilingual Transformer Model
Omnia ElSaadanyBenjamin Suter
2020-07-01
Frustratingly Easy Multilingual Grapheme-to-Phoneme Conversion
Nikhil PrabhuKatharina Kann
2020-07-01
Leveraging Principal Parts for Morphological Inflection
Ling LiuMans Hulden
2020-07-01
Data Augmentation for Transformer-based G2P
Zach RyanMans Hulden
2020-07-01
HausaMT v1.0: Towards English--Hausa Neural Machine Translation
Adewale Akinfaderin
2020-07-01
Image-level Harmonization of Multi-Site Data using Image-and-Spatial Transformer Networks
R. RobinsonQ. DouD. C. CastroK. KamnitsasM. de GrootR. M. SummersD. RueckertB. Glocker
2020-06-30
GShard: Scaling Giant Models with Conditional Computation and Automatic Sharding
Dmitry LepikhinHyoukJoong LeeYuanzhong XuDehao ChenOrhan FiratYanping HuangMaxim KrikunNoam ShazeerZhifeng Chen
2020-06-30
Correction of Faulty Background Knowledge based on Condition Aware and Revise Transformer for Question Answering
Xinyan ZhaoXiao FengHaoming ZhongJun YaoHuanhuan Chen
2020-06-30
BERTERS: Multimodal Representation Learning for Expert Recommendation System with Transformer
N. Nikzad-KhasmakhiM. A. BalafarM. Reza Feizi-DerakhshiCina Motamed
2020-06-30
Simplifying Models with Unlabeled Output Data
Sang Michael XieTengyu MaPercy Liang
2020-06-29
Predicting Length of Stay in the Intensive Care Unit with Temporal Pointwise Convolutional Networks
Emma RocheteauPietro LiòStephanie Hyland
2020-06-29
A Transformer-based joint-encoding for Emotion Recognition and Sentiment Analysis
Jean-Benoit DelbrouckNoé TitsMathilde BrousmicheStéphane Dupont
2020-06-29
Multi-level colonoscopy malignant tissue detection with adversarial CAC-UNet
Chuang ZhuKe MeiTing PengYihao LuoJun LiuYing WangMulan Jin
2020-06-29
Improving Uncertainty Estimates through the Relationship with Adversarial Robustness
Yao QinXuezhi WangAlex BeutelEd H. Chi
2020-06-29
Multi-Head Attention: Collaborate Instead of Concatenate
Jean-Baptiste CordonnierAndreas LoukasMartin Jaggi
2020-06-29
Interpreting Hierarchical Linguistic Interactions in DNNs
Die ZhangHuilin ZhouXiaoyi BaoDa HuoRuizhao ChenXu ChengHao ZhangMengyue WuQuanshi Zhang
2020-06-29
Rethinking Positional Encoding in Language Pre-training
Guolin KeDi HeTie-Yan Liu
2020-06-28
Self-Attention Networks for Intent Detection
Sevinj YolchuyevaGéza NémethBálint Gyires-Tóth
2020-06-28
Bottom-Up Human Pose Estimation by Ranking Heatmap-Guided Adaptive Keypoint Estimates
Ke SunZigang GengDepu MengBin XiaoDong LiuZhaoxiang ZhangJingdong Wang
2020-06-28
Causal Explanations of Image Misclassifications
Yan MinMiles Bennett
2020-06-28
Mind The Facts: Knowledge-Boosted Coherent Abstractive Text Summarization
Beliz GunelChenguang ZhuMichael ZengXuedong Huang
2020-06-27
What they do when in doubt: a study of inductive biases in seq2seq learners
Eugene KharitonovRahma Chaabouni
2020-06-26
TURL: Table Understanding through Representation Learning
Xiang DengHuan SunAlyssa LeesYou WuCong Yu
2020-06-26
BERTology Meets Biology: Interpreting Attention in Protein Language Models
Jesse VigAli MadaniLav R. VarshneyCaiming XiongRichard SocherNazneen Fatema Rajani
2020-06-26
Conditional Set Generation with Transformers
Adam R KosiorekHyunjik KimDanilo J Rezende
2020-06-26
Learning Source Phrase Representations for Neural Machine Translation
Hongfei XuJosef van GenabithDeyi XiongQiuhui LiuJingyi Zhang
2020-06-25
Self-Segregating and Coordinated-Segregating Transformer for Focused Deep Multi-Modular Network for Visual Question Answering
Chiranjib Sur
2020-06-25
SACT: Self-Aware Multi-Space Feature Composition Transformer for Multinomial Attention for Video Captioning
Chiranjib Sur
2020-06-25
Differentiable Window for Dynamic Local Attention
Thanh-Tung NguyenXuan-Phi NguyenShafiq JotyXiaoli Li
2020-06-24
Imbalanced Gradients: A New Cause of Overestimated Adversarial Robustness
Linxi JiangXingjun MaZejia WengJames BaileyYu-Gang Jiang
2020-06-24
Class-Similarity Based Label Smoothing for Generalized Confidence Calibration
Chihuang LiuJoseph JaJa
2020-06-24
A Novel and Reliable Deep Learning Web-Based Tool to Detect COVID-19 Infection from Chest CT-Scan
Abdolkarim SaeediMaryam SaeediArash Maghsoudi
2020-06-24
Hybrid Spatio-Temporal Graph Convolutional Network: Improving Traffic Prediction with Navigation Data
Rui DaiShenkun XuQian GuChenguang JiKaikui Liu
2020-06-23
Bach or Mock? A Grading Function for Chorales in the Style of J.S. Bach
Alexander FangAlisa LiuPrem SeetharamanBryan Pardo
2020-06-23
Self-supervised edge features for improved Graph Neural Network training
Arijit SehanobishNeal G. RavindraDavid van Dijk
2020-06-23
A Self-Attention Network based Node Embedding Model
Dai Quoc NguyenTu Dinh NguyenDinh Phung
2020-06-22
Exploring Software Naturalness through Neural Language Models
Luca BurattiSaurabh PujarMihaela BorneaScott McCarleyYunhui ZhengGaetano RossielloAlessandro MorariJim LaredoVeronika ThostYufan ZhuangGiacomo Domeniconi
2020-06-22
AdvAug: Robust Adversarial Augmentation for Neural Machine Translation
Yong ChengLu JiangWolfgang MachereyJacob Eisenstein
2020-06-21
The NYU-CUBoulder Systems for SIGMORPHON 2020 Task 0 and Task 2
Assaf SingerKatharina Kann
2020-06-21
Off-Policy Self-Critical Training for Transformer in Visual Paragraph Generation
Shiyang YanYang HuaNeil M. Robertson
2020-06-21
A Universal Representation Transformer Layer for Few-Shot Image Classification
Lu LiuWilliam HamiltonGuodong LongJing JiangHugo Larochelle
2020-06-21
Towards Understanding Label Smoothing
Yi XuYuanhong XuQi QianHao LiRong Jin
2020-06-20
Memory Transformer
Mikhail S. BurtsevGrigory V. Sapunov
2020-06-20
Unsupervised Vehicle Re-identification with Progressive Adaptation
Jinjia PengYang WangHuibing WangZhao ZhangXianping FuMeng Wang
2020-06-20
End-to-end deep metamodeling to calibrate and optimize energy loads
Max CohenMaurice CharbitSylvain Le CorffMarius PredaGilles Nozière
2020-06-19
Boosting Objective Scores of Speech Enhancement Model through MetricGAN Post-Processing
Szu-Wei FuChien-Feng LiaoTsun-An HsiehKuo-Hsuan HungSyu-Siang WangCheng YuHeng-Cheng KuoRyandhimas E. ZezarioYou-Jin LiShang-Yi ChuangYen-Ju LuYu Tsao
2020-06-18
Multi-branch Attentive Transformer
Yang FanShufang XieYingce XiaLijun WuTao QinXiang-Yang LiTie-Yan Liu
2020-06-18
I-BERT: Inductive Generalization of Transformer to Arbitrary Context Lengths
Hyoungwook NamSeung Byum SeoVikram Sharma MailthodyNoor MichaelLan Li
2020-06-18
SEAL: Segment-wise Extractive-Abstractive Long-form Text Summarization
Yao ZhaoMohammad SalehPeter J. Liu
2020-06-18
Sparse GPU Kernels for Deep Learning
Trevor GaleMatei ZahariaCliff YoungErich Elsen
2020-06-18
SenWave: Monitoring the Global Sentiments under the COVID-19 Pandemic
Qiang YangHind AlamroSomayah AlbaradeiAdil SalhiXiaoting LvChangsheng MaManal AlshehriInji JaberFaroug TifrateneWei WangTakashi GojoboriCarlos M. DuarteXin GaoXiangliang Zhang
2020-06-18
Intelligent Protection & Classification of Transients in Two-Core Symmetric Phase Angle Regulating Transformers
Pallav Kumar BeraCan Isik
2020-06-17
Automatically Ranked Russian Paraphrase Corpus for Text Generation
Vadim GudkovOlga MitrofanovaElizaveta Filippskikh
2020-06-17
Learning Visual Commonsense for Robust Scene Graph Generation
Alireza ZareianZhecan WangHaoxuan YouShih-Fu Chang
2020-06-17
Modeling Graph Structure via Relative Position for Better Text Generation from Knowledge Graphs
Martin SchmittLeonardo F. R. RibeiroPhilipp DufterIryna GurevychHinrich Schütze
2020-06-16
COVID-CXNet: Detecting COVID-19 in Frontal Chest X-ray Images using Deep Learning
Arman HaghanifarMahdiyar Molahasani MajdabadiYounhee ChoiS. DeivalakshmiSeokbum Ko
2020-06-16
Fine-grained Human Evaluation of Transformer and Recurrent Approaches to Neural Machine Translation for English-to-Chinese
Yuying YeAntonio Toral
2020-06-15
On the Multi-Property Extraction and Beyond
Tomasz DwojakMichał PietruszkaŁukasz BorchmannFilip GralińskiJakub Chłędowski
2020-06-15
Exploration of End-to-End ASR for OpenSTT -- Russian Open Speech-to-Text Dataset
Andrei AndrusenkoAleksandr LaptevIvan Medennikov
2020-06-15
Differentiable Neural Architecture Transformation for Reproducible Architecture Improvement
Do-Guk KimHeung-Chang Lee
2020-06-15
Multi-Image Summarization: Textual Summary from a Set of Cohesive Images
Nicholas TrieuSebastian GoodmanPradyumna NarayanaKazoo SoneRadu Soricut
2020-06-15
Transferring Monolingual Model to Low-Resource Language: The Case of Tigrinya
Abrhalei TelaAbraham WoubieVille Hautamaki
2020-06-13
Guided Transformer: Leveraging Multiple External Sources for Representation Learning in Conversational Search
Helia HashemiHamed ZamaniW. Bruce Croft
2020-06-13
Temporal Fusion Network for Temporal Action Localization:Submission to ActivityNet Challenge 2020 (Task E)
Zhiwu QingXiang WangYongpeng SangChangxin GaoShiwei ZhangNong Sang
2020-06-13
Modelling High-Level Mathematical Reasoning in Mechanised Declarative Proofs
Wenda LiLei YuYuhuai WuLawrence C. Paulson
2020-06-13
Comparing Natural Language Processing Techniques for Alzheimer's Dementia Prediction in Spontaneous Speech
Thomas SearleZina IbrahimRichard Dobson
2020-06-12
Unmasking the Inductive Biases of Unsupervised Object Representations for Video Sequences
Marissa A. WeisKashyap ChittaYash SharmaWieland BrendelMatthias BethgeAndreas GeigerAlexander S. Ecker
2020-06-12
Dance Revolution: Long Sequence Dance Generation with Music via Curriculum Learning
Ruozi HuangHuang HuWei WuKei SawadaMi Zhang
2020-06-11
FastPitch: Parallel Text-to-speech with Pitch Prediction
Adrian Łańcucki
2020-06-11
Extrapolation for Large-batch Training in Deep Learning
Tao LinLingjing KongSebastian U. StichMartin Jaggi
2020-06-10
On Mixup Regularization
Luigi CarratinoMoustapha CisséRodolphe JenattonJean-Philippe Vert
2020-06-10
Graph-Aware Transformer: Is Attention All Graphs Need?
Sanghyun YooYoung-Seok KimKang Hyun LeeKuhwan JeongJunhwi ChoiHoshik LeeYoung Sang Choi
2020-06-09
Self-Distillation as Instance-Specific Label Smoothing
Zhilu ZhangMert R. Sabuncu
2020-06-09
HausaMT v1.0: Towards English-Hausa Neural Machine Translation
Adewale Akinfaderin
2020-06-09
Linformer: Self-Attention with Linear Complexity
Sinong WangBelinda Z. LiMadian KhabsaHan FangHao Ma
2020-06-08
Modeling Discourse Structure for Document-level Neural Machine Translation
Junxuan ChenXiang LiJiarui ZhangChulun ZhouJianwei CuiBin WangJinsong Su
2020-06-08
MultiSpeech: Multi-Speaker Text to Speech with Transformer
Mingjian ChenXu TanYi RenJin XuHao SunSheng ZhaoTao Qin
2020-06-08
Learning to Count Words in Fluent Speech enables Online Speech Recognition
George SterpuChristian SaamNaomi Harte
2020-06-08
Wat zei je? Detecting Out-of-Distribution Translations with Variational Transformers
Tim Z. XiaoAidan N. GomezYarin Gal
2020-06-08
Learning Texture Transformer Network for Image Super-Resolution
Fuzhi YangHuan YangJianlong FuHongtao LuBaining Guo
2020-06-07
Challenges and Thrills of Legal Arguments
Anurag PallaproluRadha VaidyaAditya Swaroop Attawar
2020-06-06
Masked Language Modeling for Proteins via Linearly Scalable Long-Context Transformers
Krzysztof ChoromanskiValerii LikhosherstovDavid DohanXingyou SongJared DavisTamas SarlosDavid BelangerLucy ColwellAdrian Weller
2020-06-05
GMAT: Global Memory Augmentation for Transformers
Ankit GuptaJonathan Berant
2020-06-05
Funnel-Transformer: Filtering out Sequential Redundancy for Efficient Language Processing
Zihang DaiGuokun LaiYiming YangQuoc V. Le
2020-06-05
An Overview of Neural Network Compression
James O' Neill
2020-06-05
End-to-End Speech-Translation with Knowledge Distillation: [email protected]
Marco GaidoMattia Antonino Di GangiMatteo NegriMarco Turchi
2020-06-04
On the Predictive Power of Neural Language Models for Human Real-Time Comprehension Behavior
Ethan Gotlieb WilcoxJon GauthierJennifer HuPeng QianRoger Levy
2020-06-02
Subjective Question Answering: Deciphering the inner workings of Transformers in the realm of subjectivity
Lukas Muttenthaler
2020-06-02
Online Versus Offline NMT Quality: An In-depth Analysis on English-German and German-English
Maha ElbayadMichael UstaszewskiEmmanuelle Esperança-RodierFrancis Brunet ManquatLaurent Besacier
2020-06-01
Context-based Transformer Models for Answer Sentence Selection
Ivano LauriolaAlessandro Moschitti
2020-06-01
Unsupervised Sparse-view Backprojection via Convolutional and Spatial Transformer Networks
Xueqing LiuPaul Sajda
2020-06-01
Image Search With Text Feedback by Visiolinguistic Attention Learning
Yanbei Chen Shaogang Gong Loris Bazzani
2020-06-01
Revisiting Knowledge Distillation via Label Smoothing Regularization
Li Yuan Francis EH Tay Guilin Li Tao Wang Jiashi Feng
2020-06-01
Few-Shot Learning of Part-Specific Probability Space for 3D Shape Segmentation
Lingjing Wang Xiang Li Yi Fang
2020-06-01
RDCFace: Radial Distortion Correction for Face Recognition
He Zhao Xianghua Ying Yongjie Shi Xin Tong Jingsi Wen Hongbin Zha
2020-06-01
ActBERT: Learning Global-Local Video-Text Representations
Linchao Zhu Yi Yang
2020-06-01
BWCNN: Blink to Word, a Real-Time Convolutional Neural Network Approach
Albara Ah RamliRex LiuRahul KrishnamoorthyVishal I BXiaoxiao WangIlias TagkopoulosXin Liu
2020-06-01
ADAHESSIAN: An Adaptive Second Order Optimizer for Machine Learning
Zhewei YaoAmir GholamiSheng ShenKurt KeutzerMichael W. Mahoney
2020-06-01
BPGC at SemEval-2020 Task 11: Propaganda Detection in News Articles with Multi-Granularity Knowledge Sharing and Linguistic Features based Ensemble Learning
Rajaswa PatilSomesh SinghSwati Agarwal
2020-05-31
CNRL at SemEval-2020 Task 5: Modelling Causal Reasoning in Language with Multi-Head Self-Attention Weights based Counterfactual Detection
Rajaswa PatilVeeky Baths
2020-05-31
HAT: Hardware-Aware Transformers for Efficient Natural Language Processing
Hanrui WangZhanghao WuZhijian LiuHan CaiLigeng ZhuChuang GanSong Han
2020-05-28
Variational Neural Machine Translation with Normalizing Flows
Hendra SetiawanMatthias SperberUdhay NallasamyMatthias Paulik
2020-05-28
Empirical Evaluation of Pretraining Strategies for Supervised Entity Linking
Thibault FévryNicholas FitzGeraldLivio Baldini SoaresTom Kwiatkowski
2020-05-28
General-Purpose User Embeddings based on Mobile App Usage
Junqi ZhangBing BaiYe LinJian LiangKun BaiFei Wang
2020-05-27
Permutation Matters: Anisotropic Convolutional Layer for Learning on Point Clouds
Zhongpai GaoGuangtao ZhaiJunchi YanXiaokang Yang
2020-05-27
Insertion-Based Modeling for End-to-End Automatic Speech Recognition
Yuya FujitaShinji WatanabeMotoi OmachiXuankai Chan
2020-05-27
End-to-End Object Detection with Transformers
Nicolas CarionFrancisco MassaGabriel SynnaeveNicolas UsunierAlexander KirillovSergey Zagoruyko
2020-05-26
GECToR -- Grammatical Error Correction: Tag, Not Rewrite
Kostiantyn OmelianchukVitaliy AtrasevychArtem ChernodubOleksandr Skurzhanskyi
2020-05-26
Guiding Symbolic Natural Language Grammar Induction via Transformer-Based Sequence Probabilities
Ben GoertzelAndres Suarez MadrigalGino Yu
2020-05-26
Pay Attention to What You Read: Non-recurrent Handwritten Text-Line Recognition
Lei KangPau RibaMarçal RusiñolAlicia FornésMauricio Villegas
2020-05-26
Deep Learning Models for Automatic Summarization
Pirmin Lemberger
2020-05-25
The Unreasonable Volatility of Neural Machine Translation Models
Marzieh FadaeeChristof Monz
2020-05-25
Adversarial NLI for Factual Correctness in Text Summarisation Models
Mario BarrantesBenedikt HerudekRichard Wang
2020-05-24
Devising Malware Characterstics using Transformers
Simra ShahidTanmay SinghYash SharmaKapil Sharma
2020-05-23
Coronavirus: Comparing COVID-19, SARS and MERS in the eyes of AI
Anas TahirYazan QiblaweyAmith KhandakarTawsifur RahmanUzair KhurshidFarayi MusharavatiM. T. IslamSerkan KiranyazMuhammad E. H. Chowdhury
2020-05-23
Character-level Transformer-based Neural Machine Translation
Nikolay BanarWalter DaelemansMike Kestemont
2020-05-22
A Generative Approach to Titling and Clustering Wikipedia Sections
Anjalie FieldSascha RotheSimon BaumgartnerCong YuAbe Ittycheriah
2020-05-22
Low-Latency Sequence-to-Sequence Speech Recognition and Translation by Partial Hypothesis Selection
Danni LiuGerasimos SpanakisJan Niehues
2020-05-22
Transformer-based Context-aware Sarcasm Detection in Conversation Threads from Social Media
Xiangjue DongChangmao LiJinho D. Choi
2020-05-22
Simplified Self-Attention for Transformer-based End-to-End Speech Recognition
Haoneng LuoShiliang ZhangMing LeiLei Xie
2020-05-21
Leveraging Text Data Using Hybrid Transformer-LSTM Based End-to-End ASR in Transfer Learning
Zhiping ZengVan Tung PhamHaihua XuYerbolat KhassanovEng Siong ChngChongjia NiBin Ma
2020-05-21
Applying the Transformer to Character-level Transduction
Shijie WuRyan CotterellMans Hulden
2020-05-20
Relative Positional Encoding for Speech Recognition and Direct Translation
Ngoc-Quan PhamThanh-Le HaTuan-Nam NguyenThai-Son NguyenElizabeth SaleskySebastian StuekerJan NiehuesAlexander Waibel
2020-05-20
A Further Study of Unsupervised Pre-training for Transformer Based Speech Recognition
Dongwei JiangWubo LiRuixiong ZhangMiao CaoNe LuoYang HanWei ZouXiangang Li
2020-05-20
Comparing Transformers and RNNs on predicting human sentence processing data
Danny MerkxStefan L. Frank
2020-05-19
Learning from a Lightweight Teacher for Efficient Knowledge Distillation
Yuang LiuWei ZhangJun Wang
2020-05-19
Should we hard-code the recurrence concept or learn it instead ? Exploring the Transformer architecture for Audio-Visual Speech Recognition
George SterpuChristian SaamNaomi Harte
2020-05-19
Sketch-BERT: Learning Sketch Bidirectional Encoder Representation from Transformers by Self-supervised Learning of Sketch Gestalt
Hangyu LinYanwei FuYu-Gang JiangXiangyang Xue
2020-05-19
Exploring Transformers for Large-Scale Speech Recognition
Liang LuChangliang LiuJinyu LiYifan Gong
2020-05-19
A Transformer-based Embedding Model for Personalized Product Search
Keping BiQingyao AiW. Bruce Croft
2020-05-18
Efficient Wait-k Models for Simultaneous Machine Translation
Maha ElbayadLaurent BesacierJakob Verbeek
2020-05-18
Spatio-Temporal Graph Transformer Networks for Pedestrian Trajectory Prediction
Cunjun YuXiao MaJiawei RenHaiyu ZhaoShuai Yi
2020-05-18
Many-to-Many Voice Transformer Network
Hirokazu KameokaWen-Chin HuangKou TanakaTakuhiro KanekoNobukatsu HojoTomoki Toda
2020-05-18
GPT-too: A language-model-first approach for AMR-to-text generation
Manuel MagerRamon Fernandez AstudilloTahira NaseemMd Arafat SultanYoung-Suk LeeRadu FlorianSalim Roukos
2020-05-18
Weak-Attention Suppression For Transformer Based Speech Recognition
Yangyang ShiYongqiang WangChunyang WuChristian FuegenFrank ZhangDuc LeChing-Feng YehMichael L. Seltzer
2020-05-18
Mask CTC: Non-Autoregressive End-to-End ASR with CTC and Mask Predict
Yosuke HiguchiShinji WatanabeNanxin ChenTetsuji OgawaTetsunori Kobayashi
2020-05-18
A Better Use of Audio-Visual Cues: Dense Video Captioning with Bi-modal Transformer
Vladimir IashinEsa Rahtu
2020-05-17
Building a Hebrew Semantic Role Labeling Lexical Resource from Parallel Movie Subtitles
Ben EyalMichael Elhadad
2020-05-17
Conformer: Convolution-augmented Transformer for Speech Recognition
Anmol GulatiJames QinChung-Cheng ChiuNiki ParmarYu ZhangJiahui YuWei HanShibo WangZhengdong ZhangYonghui WuRuoming Pang
2020-05-16
Recurrent Chunking Mechanisms for Long-Text Machine Reading Comprehension
Hongyu GongYelong ShenDian YuJianshu ChenDong Yu
2020-05-16
Streaming Transformer-based Acoustic Models Using Self-attention with Augmented Memory
Chunyang WuYongqiang WangYangyang ShiChing-Feng YehFrank Zhang
2020-05-16
IntelliCode Compose: Code Generation Using Transformer
Alexey SvyatkovskiyShao Kun DengShengyu FuNeel Sundaresan
2020-05-16
Spike-Triggered Non-Autoregressive Transformer for End-to-End Speech Recognition
Zhengkun TianJiangyan YiJianhua TaoYe BaiShuai ZhangZhengqi Wen
2020-05-16
COVID-Twitter-BERT: A Natural Language Processing Model to Analyse COVID-19 Content on Twitter
Martin MüllerMarcel SalathéPer E Kummervold
2020-05-15
Neural Entity Linking on Technical Service Tickets
Nadja KurzFelix HamannAdrian Ulges
2020-05-15
Finding Experts in Transformer Models
Xavier SuauLuca ZappellaNicholas Apostoloff
2020-05-15
JDI-T: Jointly trained Duration Informed Transformer for Text-To-Speech without Explicit Alignment
Dan LimWon JangGyeonghwan OHyeyeong ParkBongwan KimJesam Yoon
2020-05-15
The Unstoppable Rise of Computational Linguistics in Deep Learning
James Henderson
2020-05-13
Discriminative Multi-modality Speech Recognition
Bo XuCheng LuYandong GuoJacob Wang
2020-05-12
Simultaneous paraphrasing and translation by fine-tuning Transformer models
Rakesh Chada
2020-05-12
SOLOIST: Few-shot Task-Oriented Dialog with A Single Pre-trained Auto-regressive Model
Baolin PengChunyuan LiJinchao LiShahin ShayandehLars LidenJianfeng Gao
2020-05-11
Hierarchical Attention Transformer Architecture For Syntactic Spell Correction
Abhishek NiranjanM Ali Basha ShaikKushal Verma
2020-05-11
Listen Attentively, and Spell Once: Whole Sentence Generation via a Non-Autoregressive Architecture for Low-Latency Speech Recognition
Ye BaiJiangyan YiJianhua TaoZhengkun TianZhengqi WenShuai Zhang
2020-05-11
MART: Memory-Augmented Recurrent Transformer for Coherent Video Paragraph Captioning
| Jie LeiLiwei WangYelong ShenDong YuTamara L. BergMohit Bansal
2020-05-11
On the Generation of Medical Dialogues for COVID-19
| Wenmian YangGuangtao ZengBowen TanZeqian JuSubrato ChakravortyXuehai HeShu ChenXingyi YangQingyang WuZhou YuEric XingPengtao Xie
2020-05-11
Epipolar Transformers
| Yihui HeRui YanKaterina FragkiadakiShoou-I Yu
2020-05-10
Transformer Based Language Models for Similar Text Retrieval and Ranking
Javed Qadrud-DinAshraf Bah RabiouRyan WalkerRavi SoniMartin GajekGabriel PackAkhil Rangaraj
2020-05-10
SocialTrans: A Deep Sequential Model with Social Information for Web-Scale Recommendation Systems
Qiaoan ChenHao GuLingling YiYishi LinPeng HeChuan ChenYangqiu Song
2020-05-09
It's Morphin' Time! Combating Linguistic Discrimination with Inflectional Perturbations
| Samson TanShafiq JotyMin-Yen KanRichard Socher
2020-05-09
schuBERT: Optimizing Elements of BERT
Ashish KhetanZohar Karnin
2020-05-09
Character Matters: Video Story Understanding with Character-Aware Relations
Shijie GengJi ZhangZuohui FuPeng GaoHang ZhangGerard de Melo
2020-05-09
ProSelfLC: Progressive Self Label Correction for Training Robust Deep Neural Networks
Xinshao WangYang HuaElyor KodirovNeil M. Robertson
2020-05-07
Mapping Natural Language Instructions to Mobile UI Action Sequences
| Yang LiJiacong HeXin ZhouYuan ZhangJason Baldridge
2020-05-07
A Systematic Assessment of Syntactic Generalization in Neural Language Models
Jennifer HuJon GauthierPeng QianEthan WilcoxRoger P. Levy
2020-05-07
Comparison and Benchmarking of AI Models and Frameworks on Mobile Devices
Chunjie LuoXiwen HeJianfeng ZhanLei WangWanling GaoJiahui Dai
2020-05-07
An Empirical Study of Multi-Task Learning on BERT for Biomedical Text Mining
Yifan PengQingyu ChenZhiyong Lu
2020-05-06
The Cascade Transformer: an Application for Efficient Answer Sentence Selection
| Luca SoldainiAlessandro Moschitti
2020-05-05
Dynamically Adjusting Transformer Batch Size by Monitoring Gradient Direction Change
Hongfei XuJosef van GenabithDeyi XiongQiuhui Liu
2020-05-05
OpinionDigest: A Simple Framework for Opinion Summarization
Yoshihiko SuharaXiaolan WangStefanos AngelidisWang-Chiew Tan
2020-05-05
Successfully Applying the Stabilized Lottery Ticket Hypothesis to the Transformer Architecture
Christopher BrixParnia BaharHermann Ney
2020-05-04
3D Printed Brain-Controlled Robot-Arm Prosthetic via Embedded Deep Learning from sEMG Sensors
David LonsdaleLi ZhangRichard Jiang
2020-05-04
On the Inference Calibration of Neural Machine Translation
Shuo WangZhaopeng TuShuming ShiYang Liu
2020-05-03
An Accurate Model for Predicting the (Graded) Effect of Context in Word Similarity Based on Bert
Wei BaoHongshu CheJiandong Zhang
2020-05-03
Towards Faithful Neural Table-to-Text Generation with Content-Matching Constraints
Zhenyi WangXiaoyang WangBang AnDong YuChangyou Chen
2020-05-03
Dynamic Programming Encoding for Subword Segmentation in Neural Machine Translation
| Xuanli HeGholamreza HaffariMohammad Norouzi
2020-05-03
Generalized Entropy Regularization or: There's Nothing Special about Label Smoothing
Clara MeisterElizabeth SaleskyRyan Cotterell
2020-05-02
Quantifying Attention Flow in Transformers
| Samira AbnarWillem Zuidema
2020-05-02
Measuring and Reducing Non-Multifact Reasoning in Multi-hop Question Answering
Harsh TrivediNiranjan BalasubramanianTushar KhotAshish Sabharwal
2020-05-02
Synthesizer: Rethinking Self-Attention in Transformer Models
Yi TayDara BahriDonald MetzlerDa-Cheng JuanZhe ZhaoChe Zheng
2020-05-02
Hard-Coded Gaussian Attention for Neural Machine Translation
Weiqiu YouSimeng SunMohit Iyyer
2020-05-02
Contrastive Self-Supervised Learning for Commonsense Reasoning
| Tassilo KleinMoin Nabi
2020-05-02
The AVA-Kinetics Localized Human Actions Video Dataset
Ang LiMeghana ThotakuriDavid A. RossJoão CarreiraAlexander VostrikovAndrew Zisserman
2020-05-01
HERO: Hierarchical Encoder for Video+Language Omni-representation Pre-training
Linjie LiYen-Chun ChenYu ChengZhe GanLicheng YuJingjing Liu
2020-05-01
A Transformer-based Approach for Source Code Summarization
| Wasi Uddin AhmadSaikat ChakrabortyBaishakhi RayKai-Wei Chang
2020-05-01
Multi-scale Transformer Language Models
Sandeep SubramanianRonan CollobertMarc'Aurelio RanzatoY-Lan Boureau
2020-05-01
Event Clustering within News Articles
Faik Kerem ÖrsSüveyda YeniterziReyyan Yeniterzi
2020-05-01
Detecting Direct Speech in Multilingual Collection of 19th-century Novels
Joanna ByszukMichał WoźniakMike KestemontAlbert LeśniakWojciech ŁukasikArtjoms ŠeļaMaciej Eder
2020-05-01
ASU_OPTO at OSACT4 - Offensive Language Detection for Arabic text
Amr KelegSamhaa R. El-BeltagyMahmoud Khalil
2020-05-01
Scaling Language Data Import/Export with a Data Transformer Interface
Nicholas BuckeridgeBen Foley
2020-05-01
Aggression Identification in Social Media: a Transfer Learning Based Approach
Faneva RamiandrisoaJosiane Mothe
2020-05-01
IRIT at TRAC 2020
Faneva RamiandrisoaJosiane Mothe
2020-05-01
Multilingual Joint Fine-tuning of Transformer models for identifying Trolling, Aggression and Cyberbullying at TRAC 2020
| Sudhanshu MishraShivangi PrasadShubhanshu Mishra
2020-05-01
On the Influence of Coreference Resolution on Word Embeddings in Lexical-semantic Evaluation Tasks
Alexander HenleinAlexander Mehler
2020-05-01
Chinese Discourse Parsing: Model and Evaluation
Lin Chuan-AnShyh-Shiun HungHen-Hsen HuangHsin-Hsi Chen
2020-05-01
DecOp: A Multilingual and Multi-domain Corpus For Detecting Deception In Typed Text
Pasquale CapuozzoIvano LauriolaCarlo StrapparavaFabio AiolliGiuseppe Sartori
2020-05-01
Paraphrase Generation and Evaluation on Colloquial-Style Sentences
Eetu SjöblomMathias CreutzYves Scherrer
2020-05-01
Building a Task-oriented Dialog System for Languages with no Training Data: the Case for Basque
Maddalen López de LacalleXabier SaralegiIñaki San Vicente
2020-05-01
Linguistically Informed Hindi-English Neural Machine Translation
Vikrant GoyalPruthwik MishraDipti Misra Sharma
2020-05-01
Corpora for Document-Level Neural Machine Translation
Siyou LiuXiaojun Zhang
2020-05-01
Exploring Transformer Text Generation for Medical Dataset Augmentation
Ali Amin-NejadJulia IveSumithra Velupillai
2020-05-01
Much Ado About Nothing -- Identification of Zero Copulas in Hungarian Using an NMT Model
Andrea DömötörZijian Győző YangAttila Novák
2020-05-01
ParlVote: A Corpus for Sentiment Analysis of Political Debates
Gavin AbercrombieRiza Batista-Navarro
2020-05-01
Cross-lingual and Cross-domain Evaluation of Machine Reading Comprehension with Squad and CALOR-Quest Corpora
Delphine CharletGeraldine DamnatiFrederic BechetGabriel MarzinottoJohannes Heinecke
2020-05-01
Contextualized Embeddings based Transformer Encoder for Sentence Similarity Modeling in Answer Selection Task
| Md Tahmid Rahman LaskarJimmy Xiangji HuangEnamul Hoque
2020-05-01
Evaluation of Dataset Selection for Pre-Training and Fine-Tuning Transformer Language Models for Clinical Question Answering
Sarvesh SoniKirk Roberts
2020-05-01
Minority Positive Sampling for Switching Points - an Anecdote for the Code-Mixing Language Modeling
Arindam ChatterjeeVineeth GupthaParul ChopraAmitava Das
2020-05-01
Seq2SeqPy: A Lightweight and Customizable Toolkit for Neural Sequence-to-Sequence Modeling
Raheel QaderFrançois PortetCyril Labbe
2020-05-01
SegaBERT: Pre-training of Segment-aware BERT for Language Understanding
He BaiPeng ShiJimmy LinLuchen TanKun XiongWen GaoMing Li
2020-04-30
Addressing Zero-Resource Domains Using Document-Level Context in Neural Machine Translation
Dario StojanovskiAlexander Fraser
2020-04-30
Progressive Transformers for End-to-End Sign Language Production
| Ben SaundersNecati Cihan CamgozRichard Bowden
2020-04-30
Accurate Word Alignment Induction from Neural Machine Translation
Yun ChenYang LiuGuanhua ChenXin JiangQun Liu
2020-04-30
Character-Level Translation with Self-attention
Yingqiang GaoNikola I. NikolovYuhuang HuRichard H. R. Hahnloser
2020-04-30
Semantic Triple Encoder for Fast Open-Set Link Prediction
Bo WangTao ShenGuodong LongTianyi ZhouYi Chang
2020-04-30
Self-Supervised and Controlled Multi-Document Opinion Summarization
Hady ElsaharMaximin CoavouxMatthias GalléJos Rozen
2020-04-30
End-to-End Neural Word Alignment Outperforms GIZA++
Thomas ZenkelJoern WuebkerJohn DeNero
2020-04-30
Capsule-Transformer for Neural Machine Translation
Sufeng DuanJuncheng CaoHai Zhao
2020-04-30
Breaking (Global) Barriers in Parallel Stochastic Optimization with Wait-Avoiding Group Averaging
Shigang LiTal Ben-NunGiorgi NadiradzeSalvatore Di GirolamoNikoli DrydenDan AlistarhTorsten Hoefler
2020-04-30
Towards Character-Level Transformer NMT by Finetuning Subword Systems
Jindřich LibovickýAlexander Fraser
2020-04-29
Efficient Document Re-Ranking for Transformers by Precomputing Term Representations
| Sean MacAvaneyFranco Maria NardiniRaffaele PeregoNicola TonellottoNazli GoharianOphir Frieder
2020-04-29
Image Captioning through Image Transformer
Sen HeWentong LiaoHamed R. TavakoliMichael YangBodo RosenhahnNicolas Pugeault
2020-04-29
Pre-training Is (Almost) All You Need: An Application to Commonsense Reasoning
Alexandre TamborrinoNicola PellicanoBaptiste PannierPascal VoitotLouise Naudin
2020-04-29
Image Morphing with Perceptual Constraints and STN Alignment
Noa FishRichard ZhangLilach PerryDaniel Cohen-OrEli ShechtmanConnelly Barnes
2020-04-29
Multiresolution and Multimodal Speech Recognition with Transformers
Georgios ParaskevopoulosSrinivas ParthasarathyAparna KhareShiva Sundaram
2020-04-29
EARL: Speedup Transformer-based Rankers with Pre-computed Representation
Luyu GaoZhuyun DaiJamie Callan
2020-04-28
VD-BERT: A Unified Vision and Dialog Transformer with BERT
Yue WangShafiq JotyMichael R. LyuIrwin KingCaiming XiongSteven C. H. Hoi
2020-04-28
DeeBERT: Dynamic Early Exiting for Accelerating BERT Inference
| Ji XinRaphael TangJaejun LeeYaoliang YuJimmy Lin
2020-04-27
Augmenting Transformers with KNN-Based Composite Memory for Dialogue
Angela FanClaire GardentChloe BraudAntoine Bordes
2020-04-27
Lexically Constrained Neural Machine Translation with Levenshtein Transformer
| Raymond Hendy SusantoShamil ChollampattLiling Tan
2020-04-27
Explicitly Modeling Adaptive Depths for Transformer
Yijin LiuFandong MengJie ZhouYufeng ChenJinan Xu
2020-04-27
Experiments with LVT and FRE for Transformer model
Ilshat GibadullinAidar Valeev
2020-04-26
Causal Mediation Analysis for Interpreting Neural NLP: The Case of Gender Bias
Jesse VigSebastian GehrmannYonatan BelinkovSharon QianDaniel NevoYaron SingerStuart Shieber
2020-04-26
Research on Modeling Units of Transformer Transducer for Mandarin Speech Recognition
Li FuXiaoxiao LiLibo Zi
2020-04-26
Choppy: Cut Transformer For Ranked List Truncation
Dara BahriYi TayChe ZhengDonald MetzlerAndrew Tomkins
2020-04-26
Combining Word Embeddings and N-grams for Unsupervised Document Summarization
Zhuolin JiangManaj SrivastavaSanjay KrishnaDavid AkodesRichard Schwartz
2020-04-25
All Word Embeddings from One Embedding
| Sho TakaseSosuke Kobayashi
2020-04-25
Lite Transformer with Long-Short Range Attention
| Zhanghao WuZhijian LiuJi LinYujun LinSong Han
2020-04-24
On Sparsifying Encoder Outputs in Sequence-to-Sequence Models
Biao ZhangIvan TitovRico Sennrich
2020-04-24
FLAT: Chinese NER Using Flat-Lattice Transformer
Xiaonan LiHang YanXipeng QiuXuanjing Huang
2020-04-24
Understanding when spatial transformer networks do not support invariance, and what to do about it
Lukas FinnvedenYlva JanssonTony Lindeberg
2020-04-24
UHH-LT at SemEval-2020 Task 12: Fine-Tuning of Pre-Trained Transformer Networks for Offensive Language Detection
Gregor WiedemannSeid Muhie YimamChris Biemann
2020-04-23
MolTrans: Molecular Interaction Transformer for Drug Target Interaction Prediction
Kexin HuangCao XiaoLucas GlassJimeng Sun
2020-04-23
Self-Attention Attribution: Interpreting Information Interactions Inside Transformer
Yaru HaoLi DongFuru WeiKe Xu
2020-04-23
YOLOv4: Optimal Speed and Accuracy of Object Detection
| Alexey BochkovskiyChien-Yao WangHong-Yuan Mark Liao
2020-04-23
Automated diagnosis of COVID-19 with limited posteroanterior chest X-ray images using fine-tuned deep neural networks
Narinder Singh PunnSonali Agarwal
2020-04-23
Towards a Competitive End-to-End Speech Recognition for CHiME-6 Dinner Party Transcription
Andrei AndrusenkoAleksandr LaptevIvan Medennikov
2020-04-22
Logical Natural Language Generation from Open-Domain Tables
| Wenhu ChenJianshu ChenYu SuZhiyu ChenWilliam Yang Wang
2020-04-22
Vector Quantized Contrastive Predictive Coding for Template-based Music Generation
| Gaëtan HadjeresLéopold Crestel
2020-04-21
Joint Cross-Modality Super Resolution
Guy ShachtSharon FogelDov DanonDaniel Cohen-OrIlya Leizerson
2020-04-21
DIET: Lightweight Language Understanding for Dialogue Systems
| Tanja BunkDaksh VarshneyaVladimir VlasovAlan Nichol
2020-04-21
Contextual Neural Machine Translation Improves Translation of Cataphoric Pronouns
KayYen WongSameen MarufGholamreza Haffari
2020-04-21
Keyphrase Generation with Cross-Document Attention
| Shizhe DiaoYan SongTong Zhang
2020-04-21
Learning Local Neighboring Structure for Robust 3D Shape Representation
| Zhongpai GaoGuangtao ZhaiJuyong ZhangJunchi YanYiyan YangXiaokang Yang
2020-04-21
A Review-based Transformer Model for Personalized Product Search
Keping BiQingyao AiW. Bruce Croft
2020-04-20
WHALETRANS: E2E WHisper to nAturaL spEech conversion using modified TRANSformer network
Abhishek NiranjanMukesh SharmaSai Bharath Chandra GuthaM Ali Basha Shaik
2020-04-20
Transformer Reasoning Network for Image-Text Matching and Retrieval
| Nicola MessinaFabrizio FalchiAndrea EsuliGiuseppe Amato
2020-04-20
Deep-COVID: Predicting COVID-19 From Chest X-Ray Images Using Deep Transfer Learning
| Shervin MinaeeRahele KafiehMilan SonkaShakib YazdaniGhazaleh Jamalipour Soufi
2020-04-20
ResNeSt: Split-Attention Networks
| Hang ZhangChongruo WuZhongyue ZhangYi ZhuZhi ZhangHaibin LinYue SunTong HeJonas MuellerR. ManmathaMu LiAlexander Smola
2020-04-19
Motion Segmentation using Frequency Domain Transformer Networks
Hafez FaraziSven Behnke
2020-04-18
Understanding the Difficulty of Training Transformers
| Liyuan LiuXiaodong LiuJianfeng GaoWeizhu ChenJiawei Han
2020-04-17
Highway Transformer: Self-Gating Enhanced Self-Attentive Networks
Yekun ChaiJin ShuoXinwen Hou
2020-04-17
Transform and Tell: Entity-Aware News Image Captioning
| Alasdair TranAlexander MathewsLexing Xie
2020-04-17
Enriching the Transformer with Linguistic and Semantic Factors for Low-Resource Machine Translation
Jordi Armengol-EstapéMarta R. Costa-jussàCarlos Escolano
2020-04-17
ETC: Encoding Long and Structured Data in Transformers
Joshua AinslieSantiago OntanonChris AlbertiPhilip PhamAnirudh RavulaSumit Sanghai
2020-04-17
Knowledge Distillation for Action Anticipation via Label Smoothing
Guglielmo CamporesePasquale CosciaAntonino FurnariGiovanni Maria FarinellaLamberto Ballan
2020-04-16
Non-Autoregressive Machine Translation with Latent Alignments
| Chitwan SahariaWilliam ChanSaurabh SaxenaMohammad Norouzi
2020-04-16
Towards Instance-Level Parser Selection for Cross-Lingual Transfer of Dependency Parsers
Robert LitschkoIvan VulićŽeljko AgićGoran Glavaš
2020-04-16
Entities as Experts: Sparse Memory Access with Entity Supervision
Thibault FévryLivio Baldini SoaresNicholas FitzGeraldEunsol ChoiTom Kwiatkowski
2020-04-15
SPECTER: Document-level Representation Learning using Citation-informed Transformers
| Arman CohanSergey FeldmanIz BeltagyDoug DowneyDaniel S. Weld
2020-04-15
Training with Quantization Noise for Extreme Model Compression
| Angela FanPierre StockBenjamin GrahamEdouard GraveRemi GribonvalHerve JegouArmand Joulin
2020-04-15
Transformer based Grapheme-to-Phoneme Conversion
Sevinj YolchuyevaGéza NémethBálint Gyires-Tóth
2020-04-14
ProFormer: Towards On-Device LSH Projection Based Transformers
Chinnadhurai SankarSujith RaviZornitsa Kozareva
2020-04-13
Relation Transformer Network
Rajat KonerPoulami SinhamahapatraVolker Tresp
2020-04-13
Relational Learning between Multiple Pulmonary Nodules via Deep Set Attention Transformers
Jiancheng YangHaoran DengXiaoyang HuangBingbing NiYi Xu
2020-04-12
Stacked Convolutional Deep Encoding Network for Video-Text Retrieval
Rui ZhaoKecheng ZhengZheng-jun Zha
2020-04-10
Telling BERT's full story: from Local Attention to Global Aggregation
Damian PascualGino BrunnerRoger Wattenhofer
2020-04-10
Cortical surface registration using unsupervised learning
| Jieyu ChengAdrian V. DalcaBruce FischlLilla Zollei
2020-04-09
On Optimal Transformer Depth for Low-Resource Language Translation
Elan van BiljonArnu PretoriusJulia Kreutzer
2020-04-09
GSA-DenseNet121-COVID-19: a Hybrid Deep Learning Architecture for the Diagnosis of COVID-19 Disease based on Gravitational Search Optimization Algorithm
Dalia EzzatAboul ell HassanienHassan Aboul Ella
2020-04-09
Diverse, Controllable, and Keyphrase-Aware: A Corpus and Method for News Multi-Headline Generation
Dayiheng LiuYeyun GongJie FuWei LiuYu YanBo ShaoDaxin JiangJiancheng LvNan Duan
2020-04-08
Poor Man's BERT: Smaller and Faster Transformer Models
| Hassan SajjadFahim DalviNadir DurraniPreslav Nakov
2020-04-08
Adaptive Transformers in RL
| Shakti KumarJerrod ParkerPanteha Naderian
2020-04-08
A Deep Learning Approach for Determining Effects of Tuta Absoluta in Tomato Plants
Denis P. RubangaLoyani K. LoyaniMgaya RichardSawahiko Shimada
2020-04-08
Homophone-based Label Smoothing in End-to-End Automatic Speech Recognition
Yi ZhengXianjie YangXuyong Dang
2020-04-07
Transformers to Learn Hierarchical Contexts in Multiparty Dialogue for Span-based Question Answering
| Changmao LiJinho D. Choi
2020-04-07
Byte Pair Encoding is Suboptimal for Language Model Pretraining
Kaj BostromGreg Durrett
2020-04-07
Probabilistic Spatial Transformers for Bayesian Data Augmentation
Pola SchwöbelFrederik WarburgMartin JørgensenKristoffer H. MadsenSøren Hauberg
2020-04-07
AutoToon: Automatic Geometric Warping for Face Cartoon Generation
Julia GongYannick Hold-GeoffroyJingwan Lu
2020-04-06
A Systematic Analysis of Morphological Content in BERT Models for Multiple Languages
| Daniel Edmiston
2020-04-06
Syntax-driven Iterative Expansion Language Models for Controllable Text Generation
Noe CasasJosé A. R. FonollosaMarta R. Costa-jussà
2020-04-05
Conversational Question Reformulation via Sequence-to-Sequence Architectures and Pretrained Language Models
Sheng-Chieh LinJheng-Hong YangRodrigo NogueiraMing-Feng TsaiChuan-Ju WangJimmy Lin
2020-04-04
STEP: Sequence-to-Sequence Transformer Pre-training for Document Summarization
Yanyan ZouXingxing ZhangWei LuFuru WeiMing Zhou
2020-04-04
LiDAR-based Online 3D Video Object Detection with Graph-based Message Passing and Spatiotemporal Transformer Attention
| Junbo YinJianbing ShenChenye GuanDingfu ZhouRuigang Yang
2020-04-03
Testing pre-trained Transformer models for Lithuanian news clustering
Lukas StankevičiusMantas Lukoševičius
2020-04-03
The RWTH ASR System for TED-LIUM Release 2: Improving Hybrid HMM with SpecAugment
Wei ZhouWilfried MichelKazuki IrieMarkus KitzaRalf SchlüterHermann Ney
2020-04-02
Sign Language Translation with Transformers
| Kayo Yin
2020-04-01
DSTC8-AVSD: Multimodal Semantic Transformer Network with Retrieval Style Word Generator
Hwanhee LeeSeunghyun YoonFranck DernoncourtDoo Soon KimTrung BuiKyomin Jung
2020-04-01
Graph Enhanced Representation Learning for News Recommendation
Suyu GeChuhan WuFangzhao WuTao QiYongfeng Huang
2020-03-31
X-Linear Attention Networks for Image Captioning
| Yingwei PanTing YaoYehao LiTao Mei
2020-03-31
A Swiss German Dictionary: Variation in Speech and Writing
Larissa SchmidtLucy LinderSandra DjambazovskaAlexandros LazaridisTanja SamardžićClaudiu Musat
2020-03-31
DeepSumm -- Deep Code Summaries using Neural Transformer Architecture
Vivek Gupta
2020-03-31
AriEL: volume coding for sentence generation
Luca CelottiSimon BrodeurJean Rouat
2020-03-30
Learning Contextualized Sentence Representations for Document-Level Neural Machine Translation
Pei ZhangXu ZhangWei ChenJian YuYanfeng WangDeyi Xiong
2020-03-30
A Hierarchical Transformer for Unsupervised Parsing
Ashok Thillaisundaram
2020-03-30
Sign Language Transformers: Joint End-to-end Sign Language Recognition and Translation
| Necati Cihan CamgozOscar KollerSimon HadfieldRichard Bowden
2020-03-30
TResNet: High Performance GPU-Dedicated Architecture
| Tal RidnikHussam LawenAsaf NoyItamar FriedmanEmanuel Ben BaruchGilad Sharir
2020-03-30
Code Prediction by Feeding Trees to Transformers
| Seohyun KimJinman ZhaoYuchi TianSatish Chandra
2020-03-30
Recursive Non-Autoregressive Graph-to-Graph Transformer for Dependency Parsing with Iterative Refinement
Alireza MohammadshahiJames Henderson
2020-03-29
Abstractive Text Summarization based on Language Model Conditioning and Locality Modeling
Dmitrii AksenovJulián Moreno-SchneiderPeter BourgonjeRobert SchwarzenbergLeonhard HennigGeorg Rehm
2020-03-29
Variational Transformers for Diverse Response Generation
| Zhaojiang LinGenta Indra WinataPeng XuZihan LiuPascale Fung
2020-03-28
Actor-Transformers for Group Activity Recognition
Kirill GavrilyukRyan SanfordMehrsan JavanCees G. M. Snoek
2020-03-28
TLDR: Token Loss Dynamic Reweighting for Reducing Repetitive Utterance Generation
| Shaojie JiangThomas WolfChristof MonzMaarten de Rijke
2020-03-26
StrokeCoder: Path-Based Image Generation from Single Examples using Transformers
Sabine WieluchFriedhelm Schwenker
2020-03-26
Generalizing Spatial Transformers to Projective Geometry with Applications to 2D/3D Registration
| Cong GaoXingtong LiuWenhao GuBenjamin KilleenMehran ArmandRussell TaylorMathias Unberath
2020-03-24
Analyzing Word Translation of Transformer Layers
Hongfei XuJosef van GenabithDeyi XiongQiuhui Liu
2020-03-21
TNT-KID: Transformer-based Neural Tagger for Keyword Identification
Matej MartincBlaž ŠkrljSenja Pollak
2020-03-20
Normalized and Geometry-Aware Self-Attention Network for Image Captioning
Longteng GuoJing LiuXinxin ZhuPeng YaoShichen LuHanqing Lu
2020-03-19
Temporal Embeddings and Transformer Models for Narrative Text Understanding
Vani KSimone MellaceAlessandro Antonucci
2020-03-19
Detecting Lane and Road Markings at A Distance with Perspective Transformer Layers
Zhuoping YuXiaozhou RenYuyao HuangWei TianJunqiao Zhao
2020-03-19
Transformer Networks for Trajectory Forecasting
| Francesco GiuliariIrtiza HasanMarco CristaniFabio Galasso
2020-03-18
Scene Text Recognition via Transformer
Xinjie FengHongxun YaoYuankai QiJun ZhangShengping Zhang
2020-03-18
Fixing the train-test resolution discrepancy: FixEfficientNet
| Hugo TouvronAndrea VedaldiMatthijs DouzeHervé Jégou
2020-03-18
Calibration of Pre-trained Transformers
Shrey DesaiGreg Durrett
2020-03-17
PowerNorm: Rethinking Batch Normalization in Transformers
| Sheng ShenZhewei YaoAmir GholamiMichael W. MahoneyKurt Keutzer
2020-03-17
Multi-modal Dense Video Captioning
| Vladimir IashinEsa Rahtu
2020-03-17
TRANS-BLSTM: Transformer with Bidirectional LSTM for Language Understanding
Zhiheng HuangPeng XuDavis LiangAjay MishraBing Xiang
2020-03-16
Document Ranking with a Pretrained Sequence-to-Sequence Model
Rodrigo NogueiraZhiying JiangJimmy Lin
2020-03-14
Identifying Individual Dogs in Social Media Images
Djordje BaticDubravko Culibrk
2020-03-14
Learning to Encode Position for Transformer with Continuous Dynamical Model
| Xuanqing LiuHsiang-Fu YuInderjit DhillonCho-Jui Hsieh
2020-03-13
Advanced Deep Learning Methodologies for Skin Cancer Classification in Prodromal Stages
Muhammad Ali FarooqAsma KhatoonViktor VarkarakisPeter Corcoran
2020-03-13
SynCGAN: Using learnable class specific priors to generate synthetic data for improving classifier performance on cytological images
Soumyajyoti DeySoham DasSwarnendu GhoshShyamali MitraSukanta ChakrabartyNibaran Das
2020-03-12
Efficient Content-Based Sparse Attention with Routing Transformers
| Aurko RoyMohammad SaffarAshish VaswaniDavid Grangier
2020-03-12
Keyword-Attentive Deep Semantic Matching
| Changyu MiaoZhen CaoYik-Cheung Tam
2020-03-11
ReZero is All You Need: Fast Convergence at Large Depth
| Thomas BachlechnerBodhisattwa Prasad MajumderHuanru Henry MaoGarrison W. CottrellJulian McAuley
2020-03-10
Hybrid Attention-Based Transformer Block Model for Distant Supervision Relation Extraction
Yan XiaoYaochu JinRan ChengKuangrong Hao
2020-03-10
Capacity of Continuous Channels with Memory via Directed Information Neural Estimator
Ziv AharoniDor TsurZiv GoldfeldHaim Henry Permuter
2020-03-09
TTPP: Temporal Transformer with Progressive Prediction for Efficient Action Anticipation
Wen WangXiaojiang PengYanzhou SuYu QiaoJian Cheng
2020-03-07
Cross-modal Learning for Multi-modal Video Categorization
Palash GoyalSaurabh SahuShalini GhoshChul Lee
2020-03-07
Transformers Generalize to the Semantics of Logics
| Christopher HahnFrederik SchmittJens U. KreberMarkus N. RabeBernd Finkbeiner
2020-03-06
Does label smoothing mitigate label noise?
Michal LukasikSrinadh BhojanapalliAditya Krishna MenonSanjiv Kumar
2020-03-05
EmpTransfo: A Multi-head Transformer Architecture for Creating Empathetic Dialog Systems
| Rohola ZandieMohammad H. Mahoor
2020-03-05
Data Augmentation using Pre-trained Transformer Models
| Varun KumarAshutosh ChoudharyEunah Cho
2020-03-04
AlignTTS: Efficient Feed-Forward Text-to-Speech System without Explicit Alignment
Zhen ZengJianzong WangNing ChengTian XiaJing Xiao
2020-03-04
Meta-Embeddings Based On Self-Attention
Qichen LiYuanqing LinLuofeng ZhouJian Li
2020-03-03
Heterogeneous Graph Transformer
| Ziniu HuYuxiao DongKuansan WangYizhou Sun
2020-03-03
Controllable Time-Delay Transformer for Real-Time Punctuation Prediction and Disfluency Detection
Qian ChenMengzhe ChenBo LiWen Wang
2020-03-03
Transfer Learning for Context-Aware Spoken Language Understanding
Qian ChenZhu ZhuoWen WangQiuyun Xu
2020-03-03
Transformer++
Prakhar ThapakProdip Hore
2020-03-02
Exploring and Distilling Cross-Modal Information for Image Captioning
Fenglin LiuXuancheng RenYuanxin LiuKai LeiXu Sun
2020-02-28
Provable, Scalable and Automatic Perturbation Analysis on General Computational Graphs
| Kaidi XuZhouxing ShiHuan ZhangYihan WangKai-Wei ChangMinlie HuangBhavya KailkhuraXue LinCho-Jui Hsieh
2020-02-28
Compressing Large-Scale Transformer-Based Models: A Case Study on BERT
Prakhar GaneshYao ChenXin LouMohammad Ali KhanYin YangDeming ChenMarianne WinslettHassan SajjadPreslav Nakov
2020-02-27
Marathi To English Neural Machine Translation With Near Perfect Corpus And Transformers
Swapnil Ashok Jadhav
2020-02-26
Sparse Sinkhorn Attention
| Yi TayDara BahriLiu YangDonald MetzlerDa-Cheng Juan
2020-02-26
Train Large, Then Compress: Rethinking Model Size for Efficient Training and Inference of Transformers
Zhuohan LiEric WallaceSheng ShenKevin LinKurt KeutzerDan KleinJoseph E. Gonzalez
2020-02-26
Multi-task Learning with Multi-head Attention for Multi-choice Reading Comprehension
Hui Wan
2020-02-26
MiniLM: Deep Self-Attention Distillation for Task-Agnostic Compression of Pre-Trained Transformers
| Wenhui WangFuru WeiLi DongHangbo BaoNan YangMing Zhou
2020-02-25
Exploring BERT Parameter Efficiency on the Stanford Question Answering Dataset v2.0
Eric Hulburd
2020-02-25
Fixed Encoder Self-Attention Patterns in Transformer-Based Machine Translation
Alessandro RaganatoYves ScherrerJörg Tiedemann
2020-02-24
GRET: Global Representation Enhanced Transformer
Rongxiang WengHaoran WeiShujian HuangHeng YuLidong BingWeihua LuoJiajun Chen
2020-02-24
Neuron Shapley: Discovering the Responsible Neurons
Amirata GhorbaniJames Zou
2020-02-23
Accessing Higher-level Representations in Sequential Transformers with Feedback Memory
Angela FanThibaut LavrilEdouard GraveArmand JoulinSainbayar Sukhbaatar
2020-02-21
Transformer Hawkes Process
| Simiao ZuoHaoming JiangZichong LiTuo ZhaoHongyuan Zha
2020-02-21
Introducing Fuzzy Layers for Deep Learning
Stanton R. PriceSteven R. PriceDerek T. Anderson
2020-02-21
Learning Dynamic Belief Graphs to Generalize on Text-Based Games
Ashutosh AdhikariXingdi YuanMarc-Alexandre CôtéMikuláš ZelinkaMarc-Antoine RondeauRomain LarochePascal PoupartJian TangAdam TrischlerWilliam L. Hamilton
2020-02-21
iSEGAN: Improved Speech Enhancement Generative Adversarial Networks
Deepak Baby
2020-02-20
Molecule Attention Transformer
| Łukasz MaziarkaTomasz DanelSławomir MuchaKrzysztof RatajJacek TaborStanisław Jastrzębski
2020-02-19
LAMBERT: Layout-Aware (Language) Modeling using BERT for information extraction
Łukasz GarncarekRafał PowalskiTomasz StanisławekBartosz TopolskiPiotr HalamaFilip Graliński
2020-02-19
Tree-structured Attention with Hierarchical Accumulation
Xuan-Phi NguyenShafiq JotySteven C. H. HoiRichard Socher
2020-02-19
Toward Making the Most of Context in Neural Machine Translation
Zaixiang ZhengXiang YueShujian HuangJiajun ChenAlexandra Birch
2020-02-19
Gradient-Based Adversarial Training on Transformer Networks for Detecting Check-Worthy Factual Claims
Kevin MengDamian JimenezFatma ArslanJacob Daniel DevasierDaniel ObembeChengkai Li
2020-02-18
Uncertainty in Structured Prediction
Andrey MalininMark Gales
2020-02-18
Hierarchical Transformer Network for Utterance-level Emotion Recognition
QingBiao LiChunHua WuKangFeng ZhengZhe Wang
2020-02-18
Sequential Latent Knowledge Selection for Knowledge-Grounded Dialogue
| Byeongchang KimJaewoo AhnGunhee Kim
2020-02-18
Conditional Self-Attention for Query-based Summarization
Yujia XieTianyi ZhouYi MaoWeizhu Chen
2020-02-18
Controlling Computation versus Quality for Neural Sequence Models
Ankur BapnaNaveen ArivazhaganOrhan Firat
2020-02-17
Low-Rank Bottleneck in Multi-head Attention Models
Srinadh BhojanapalliChulhee YunAnkit Singh RawatSashank J. ReddiSanjiv Kumar
2020-02-17
A Financial Service Chatbot based on Deep Bidirectional Transformers
Shi YuYuxin ChenHussain Zaidi
2020-02-17
Multi-layer Representation Fusion for Neural Machine Translation
Qiang WangFuxue LiTong XiaoYanyang LiYinqiao LiJingbo Zhu
2020-02-16
Neural Machine Translation with Joint Representation
| Yanyang LiQiang WangTong XiaoTongran LiuJingbo Zhu
2020-02-16
UniViLM: A Unified Video and Language Pre-Training Model for Multimodal Understanding and Generation
Huaishao LuoLei JiBotian ShiHaoyang HuangNan DuanTianrui LiXilin ChenMing Zhou
2020-02-15
Small energy masking for improved neural network training for end-to-end speech recognition
Chanwoo KimKwangyoun KimSathish Reddy Indurthi
2020-02-15
Transformer on a Diet
| Chenguang WangZihao YeAston ZhangZheng ZhangAlexander J. Smola
2020-02-14
Towards an Appropriate Query, Key, and Value Computation for Knowledge Tracing
Youngduck ChoiYoungnam LeeJunghyun ChoJineon BaekByungsoo KimYeongmin ChaDongmin ShinChan BaeJaewe Heo
2020-02-14
Stress Test Evaluation of Transformer-based Models in Natural Language Understanding Tasks
Carlos AspillagaAndrés CarvalloVladimir Araujo
2020-02-14
Deep Attentive Study Session Dropout Prediction in Mobile Learning Environment
Youngnam LeeDongmin ShinHyunBin LohJaemin LeePiljae ChaeJunghyun ChoSeoyon ParkJinhwan LeeJineon BaekByungsoo KimYoungduck Choi
2020-02-14
Sparse and Structured Visual Attention
Pedro Henrique MartinsVlad NiculaeZita MarinhoAndré Martins
2020-02-13
Attentional Speech Recognition Models Misbehave on Out-of-domain Utterances
Phillip KeungWei NiuYichao LuJulian SalazarVikas Bhardwaj
2020-02-12
End-to-End Face Parsing via Interlinked Convolutional Neural Networks
| Zi YinValentin YiuXiaolin HuLiang Tang
2020-02-12
On Layer Normalization in the Transformer Architecture
Ruibin XiongYunchang YangDi HeKai ZhengShuxin ZhengChen XingHuishuai ZhangYanyan LanLiwei WangTie-Yan Liu
2020-02-12
GLU Variants Improve Transformer
Noam Shazeer
2020-02-12
Training with Streaming Annotation
Tongtao Zhang, Heng Ji, Shih-Fu Chang, Marjorie Freedman
2020-02-11
Superbloom: Bloom filter meets Transformer
John Anderson, Qingqing Huang, Walid Krichene, Steffen Rendle, Li Zhang
2020-02-11
Pre-training Tasks for Embedding-based Large-scale Retrieval
Wei-Cheng Chang, Felix X. Yu, Yin-Wen Chang, Yiming Yang, Sanjiv Kumar
2020-02-10
End-to-End Multi-speaker Speech Recognition with Transformer
Xuankai Chang, Wangyou Zhang, Yanmin Qian, Jonathan Le Roux, Shinji Watanabe
2020-02-10
Deep Representation Learning for Dynamical Systems Modeling
Anna Shalova, Ivan Oseledets
2020-02-10
StickyPillars: Robust and Efficient Feature Matching on Point Clouds using Graph Neural Networks
Martin Simon, Kai Fischer, Stefan Milz, Christian Witt, Florian Oelsner, Patrick Maeder, Horst-Michael Gross
2020-02-10
On the distance between two neural networks and the stability of learning
Jeremy Bernstein, Arash Vahdat, Yisong Yue, Ming-Yu Liu
2020-02-09
Blank Language Models
Tianxiao Shen, Victor Quach, Regina Barzilay, Tommi Jaakkola
2020-02-08
Multimodal Matching Transformer for Live Commenting
Chaoqun Duan, Lei Cui, Shuming Ma, Furu Wei, Conghui Zhu, Tiejun Zhao
2020-02-07
Transformer Transducer: A Streamable Speech Recognition Model with Transformer Encoders and RNN-T Loss
Qian Zhang, Han Lu, Hasim Sak, Anshuman Tripathi, Erik McDermott, Stephen Koo, Shankar Kumar
2020-02-07
Transformer-Capsule Model for Intent Detection
Aleksander Obuchowski, Michał Lew
2020-02-07
perm2vec: Graph Permutation Selection for Decoding of Error Correction Codes using Self-Attention
Nir Raviv, Avi Caciularu, Tomer Raviv, Jacob Goldberger, Yair Be'ery
2020-02-06
Few-Shot Learning as Domain Adaptation: Algorithm and Analysis
Jiechao Guan, Zhiwu Lu, Tao Xiang, Ji-Rong Wen
2020-02-06
Aligning the Pretraining and Finetuning Objectives of Language Models
Nuo Wang Pierse, Jingwen Lu
2020-02-05
Vocoder-free End-to-End Voice Conversion with Transformer Network
June-Woo Kim, Ho-Young Jung, Minho Lee
2020-02-05
Learning Long- and Short-Term User Literal-Preference with Multimodal Hierarchical Transformer Network for Personalized Image Caption
Wei Zhang, Yue Ying, Pan Lu, Hongyuan Zha
2020-02-04
Multistage Model for Robust Face Alignment Using Deep Neural Networks
Huabin Wang, Rui Cheng, Jian Zhou, Liang Tao, Hon Keung Kwan
2020-02-04
Interpretable & Time-Budget-Constrained Contextualization for Re-Ranking
Sebastian Hofstätter, Markus Zlabinger, Allan Hanbury
2020-02-04
IART: Intent-aware Response Ranking with Transformers in Information-seeking Conversation Systems
Liu Yang, Minghui Qiu, Chen Qu, Cen Chen, Jiafeng Guo, Yongfeng Zhang, W. Bruce Croft, Haiqing Chen
2020-02-03
Exponential discretization of weights of neural network connections in pre-trained neural networks
Magomed Yu. Malsagov, Emil M. Khayrov, Maria M. Pushkareva, Iakov M. Karandashev
2020-02-03
Pop Music Transformer: Generating Music with Rhythm and Harmony
Yu-Siang Huang, Yi-Hsuan Yang
2020-02-01
Bridging Text and Video: A Universal Multimodal Transformer for Video-Audio Scene-Aware Dialog
Zekang Li, Zongjia Li, Jinchao Zhang, Yang Feng, Cheng Niu, Jie Zhou
2020-02-01
Pretrained Transformers for Simple Question Answering over Knowledge Graphs
D. Lukovnikov, A. Fischer, J. Lehmann
2020-01-31
Interpretable Rumor Detection in Microblogs by Attending to User Interactions
Ling Min Serena Khoo, Hai Leong Chieu, Zhong Qian, Jing Jiang
2020-01-29
A Study of Pyramid Structure for Code Correction
Shan Huang, Xiao Zhou, Sang Chin
2020-01-28
Applying Recent Innovations from NLP to MOOC Student Course Trajectory Modeling
Clarence Chen, Zachary Pardos
2020-01-23
Recommending Themes for Ad Creative Design via Visual-Linguistic Representations
Yichao Zhou, Shaunak Mishra, Manisha Verma, Narayan Bhamidipati, Wei Wang
2020-01-20
Multi-level Head-wise Match and Aggregation in Transformer for Textual Sequence Matching
Shuohang Wang, Yunshi Lan, Yi Tay, Jing Jiang, Jingjing Liu
2020-01-20
Deep Learning for Hindi Text Classification: A Comparison
Ramchandra Joshi, Purvi Goel, Raviraj Joshi
2020-01-19
A multimodal deep learning approach for named entity recognition from social media
Meysam Asgari-Chenaghlu, M. Reza Feizi-Derakhshi, Leili Farzinvash, M. A. Balafar, Cina Motamed
2020-01-19
Compounding the Performance Improvements of Assembled Techniques in a Convolutional Neural Network
Jungkyu Lee, Taeryun Won, Tae Kwan Lee, Hyemin Lee, Geonmo Gu, Kiho Hong
2020-01-17
Shifted and Squeezed 8-bit Floating Point format for Low-Precision Training of Deep Neural Networks
Léopold Cambier, Anahita Bhiwandiwalla, Ting Gong, Mehran Nekuii, Oguz H Elibol, Hanlin Tang
2020-01-16
Non-Autoregressive Machine Translation with Disentangled Context Transformer
Jungo Kasai, James Cross, Marjan Ghazvininejad, Jiatao Gu
2020-01-15
Insertion-Deletion Transformer
Laura Ruis, Mitchell Stern, Julia Proskurnia, William Chan
2020-01-15
Transformer-based Online CTC/attention End-to-End Speech Recognition Architecture
Haoran Miao, Gaofeng Cheng, Changfeng Gao, Pengyuan Zhang, Yonghong Yan
2020-01-15
The problems with using STNs to align CNN feature maps
Lukas Finnveden, Ylva Jansson, Tony Lindeberg
2020-01-14
Auto Completion of User Interface Layout Design Using Transformer-Based Tree Decoders
Yang Li, Julien Amelot, Xin Zhou, Samy Bengio, Si Si
2020-01-14
Reformer: The Efficient Transformer
Nikita Kitaev, Łukasz Kaiser, Anselm Levskaya
2020-01-13
Attribute-guided Feature Learning Network for Vehicle Re-identification
Huibing Wang, Jinjia Peng, Dongyan Chen, Guangqi Jiang, Tongtong Zhao, Xianping Fu
2020-01-12
Urdu-English Machine Transliteration using Neural Networks
Usman Mohy ud Din
2020-01-12
Spatial-Temporal Transformer Networks for Traffic Flow Forecasting
Mingxing Xu, Wenrui Dai, Chunmiao Liu, Xing Gao, Weiyao Lin, Guo-Jun Qi, Hongkai Xiong
2020-01-09
Streaming automatic speech recognition with the transformer model
Niko Moritz, Takaaki Hori, Jonathan Le Roux
2020-01-08
Regularization via Structural Label Smoothing
Weizhi Li, Gautam Dasarathy, Visar Berisha
2020-01-07
RECAST: Interactive Auditing of Automatic Toxicity Detection Models
Austin P. Wright, Omar Shaikh, Haekyu Park, Will Epperson, Muhammed Ahmed, Stephane Pinel, Diyi Yang, Duen Horng Chau
2020-01-07
FDFtNet: Facing Off Fake Images using Fake Detection Fine-tuning Network
Hyeonseong Jeon, Youngoh Bang, Simon S. Woo
2020-01-05
Learning Accurate Integer Transformer Machine-Translation Models
Ephrem Wu
2020-01-03
Two-Level Transformer and Auxiliary Coherence Modeling for Improved Text Segmentation
Goran Glavaš, Swapna Somasundaran
2020-01-03
Representing Unordered Data Using Complex-Weighted Multiset Automata
Justin DeBenedetto, David Chiang
2020-01-02
SPROUT: Self-Progressing Robust Training
Anonymous
2020-01-01
Attacking Lifelong Learning Models with Gradient Reversion
Yunhui Guo, Mingrui Liu, Yandong Li, Liqiang Wang, Tianbao Yang, Tajana Rosing
2020-01-01
Improved Training Techniques for Online Neural Machine Translation
Anonymous
2020-01-01
Efficient Transformer for Mobile Applications
Anonymous
2020-01-01
Resolving Lexical Ambiguity in English–Japanese Neural Machine Translation
Anonymous
2020-01-01
Kaleidoscope: An Efficient, Learnable Representation For All Structured Linear Maps
Anonymous
2020-01-01
Faster and Just As Accurate: A Simple Decomposition for Transformer Models
Anonymous
2020-01-01
Lossless Data Compression with Transformer
Anonymous
2020-01-01
Sparse Transformer: Concentrated Attention Through Explicit Selection
Anonymous
2020-01-01
Forecasting Deep Learning Dynamics with Applications to Hyperparameter Tuning
Anonymous
2020-01-01
Compressive Transformers for Long-Range Sequence Modelling
Anonymous
2020-01-01
MUSE: Multi-Scale Attention Model for Sequence to Sequence Learning
Anonymous
2020-01-01
DeFINE: Deep Factorized Input Word Embeddings for Neural Sequence Modeling
Anonymous
2020-01-01
Concise Multi-head Attention Models
Anonymous
2020-01-01
Logic and the 2-Simplicial Transformer
Anonymous
2020-01-01
DeepEnFM: Deep neural networks with Encoder enhanced Factorization Machine
Anonymous
2020-01-01
GOING BEYOND TOKEN-LEVEL PRE-TRAINING FOR EMBEDDING-BASED LARGE-SCALE RETRIEVAL
Anonymous
2020-01-01
CGT: Clustered Graph Transformer for Urban Spatio-temporal Prediction
Anonymous
2020-01-01
Fully Quantized Transformer for Improved Translation
Anonymous
2020-01-01
Augmenting Transformers with KNN-Based Composite Memory
Anonymous
2020-01-01
BERT-AL: BERT for Arbitrarily Long Document Understanding
Ruixuan Zhang, Zhuoyu Wei, Yu Shi, Yining Chen
2020-01-01
NEURAL EXECUTION ENGINES
Yujun Yan, Kevin Swersky, Danai Koutra, Parthasarathy Ranganathan, Milad Hashemi
2020-01-01
Poly-encoders: Architectures and Pre-training Strategies for Fast and Accurate Multi-sentence Scoring
Samuel Humeau, Kurt Shuster, Marie-Anne Lachaux, Jason Weston
2020-01-01
Attention over Phrases
Anonymous
2020-01-01
Group-Transformer: Towards A Lightweight Character-level Language Model
Anonymous
2020-01-01
Global Relational Models of Source Code
Anonymous
2020-01-01
ZeroQ: A Novel Zero Shot Quantization Framework
Yaohui Cai, Zhewei Yao, Zhen Dong, Amir Gholami, Michael W. Mahoney, Kurt Keutzer
2020-01-01
Putting Machine Translation in Context with the Noisy Channel Model
Anonymous
2020-01-01
Deep Attentive Ranking Networks for Learning to Order Sentences
Pawan Kumar, Dhanajit Brahma, Harish Karnick, Piyush Rai
2019-12-31
EEG based Continuous Speech Recognition using Transformers
Gautam Krishna, Co Tran, Mason Carnahan, Ahmed H Tewfik
2019-12-31
AraNet: A Deep Learning Toolkit for Arabic Social Media
Muhammad Abdul-Mageed, Chiyu Zhang, Azadeh Hashemi, El Moatez Billah Nagoudi
2019-12-30
All-in-One Image-Grounded Conversational Agents
Da Ju, Kurt Shuster, Y-Lan Boureau, Jason Weston
2019-12-28
Is Attention All What You Need? -- An Empirical Investigation on Convolution-Based Active Memory and Self-Attention
Thomas Dowdell, Hongyu Zhang
2019-12-27
Encoding word order in complex embeddings
Benyou Wang, Donghao Zhao, Christina Lioma, Qiuchi Li, Peng Zhang, Jakob Grue Simonsen
2019-12-27
Explicit Sparse Transformer: Concentrated Attention Through Explicit Selection
Guangxiang Zhao, Junyang Lin, Zhiyuan Zhang, Xuancheng Ren, Qi Su, Xu Sun
2019-12-25
Multi-Graph Transformer for Free-Hand Sketch Recognition
Peng Xu, Chaitanya K. Joshi, Xavier Bresson
2019-12-24
Improving Abstractive Text Summarization with History Aggregation
Pengcheng Liao, Chuang Zhang, Xiaojun Chen, Xiaofei Zhou
2019-12-24
end-to-end training of a large vocabulary end-to-end speech recognition system
Chanwoo Kim, Sungsoo Kim, Kwangyoun Kim, Mehul Kumar, Jiyeon Kim, Kyungmin Lee, Changwoo Han, Abhinav Garg, Eunhyang Kim, Minkyoo Shin, Shatrughan Singh, Larry Heck, Dhananjaya Gowda
2019-12-22
Learning and Evaluating Contextual Embedding of Source Code
Aditya Kanade, Petros Maniatis, Gogul Balakrishnan, Kensen Shi
2019-12-21
Are Transformers universal approximators of sequence-to-sequence functions?
Chulhee Yun, Srinadh Bhojanapalli, Ankit Singh Rawat, Sashank J. Reddi, Sanjiv Kumar
2019-12-20
ET-USB: Transformer-Based Sequential Behavior Modeling for Inbound Customer Service
Ta-Chun Su, Guan-Ying Chen
2019-12-20
Axial Attention in Multidimensional Transformers
Jonathan Ho, Nal Kalchbrenner, Dirk Weissenborn, Tim Salimans
2019-12-20
Shareable Representations for Search Query Understanding
Mukul Kumar, Youna Hu, Will Headden, Rahul Goutam, Heran Lin, Bing Yin
2019-12-20
Temporal Fusion Transformers for Interpretable Multi-horizon Time Series Forecasting
Bryan Lim, Sercan O. Arik, Nicolas Loeff, Tomas Pfister
2019-12-19
Meshed-Memory Transformer for Image Captioning
Marcella Cornia, Matteo Stefanini, Lorenzo Baraldi, Rita Cucchiara
2019-12-17
Voice Transformer Network: Sequence-to-Sequence Voice Conversion Using Transformer with Text-to-Speech Pretraining
Wen-Chin Huang, Tomoki Hayashi, Yi-Chiao Wu, Hirokazu Kameoka, Tomoki Toda
2019-12-14
BERTQA -- Attention on Steroids
Ankit Chadha, Rewa Sood
2019-12-14
WaLDORf: Wasteless Language-model Distillation On Reading-comprehension
James Yi Tian, Alexander P. Kreuzer, Pai-Hung Chen, Hans-Martin Will
2019-12-13
Linear Mode Connectivity and the Lottery Ticket Hypothesis
Jonathan Frankle, Gintare Karolina Dziugaite, Daniel M. Roy, Michael Carbin
2019-12-11
Diffeomorphic Temporal Alignment Nets
Ron Shapira Weber, Matan Eyal, Nicki Skafte, Oren Shriki, Oren Freifeld
2019-12-10
Encoding Musical Style with Transformer Autoencoders
Kristy Choi, Curtis Hawthorne, Ian Simon, Monica Dinculescu, Jesse Engel
2019-12-10
Transformer Based Reinforcement Learning For Games
Uddeshya Upadhyay, Nikunj Shah, Sucheta Ravikanti, Mayanka Medhe
2019-12-09
Learning a Layout Transfer Network for Context Aware Object Detection
Tao Wang, Xuming He, Yuanzheng Cai, Guobao Xiao
2019-12-09
Bidirectional Scene Text Recognition with a Single Decoder
Maurits Bleeker, Maarten de Rijke
2019-12-08
Personalized Patent Claim Generation and Measurement
Jieh-Sheng Lee
2019-12-07
Weak Supervision helps Emergence of Word-Object Alignment and improves Vision-Language Tasks
Corentin Kervadec, Grigory Antipov, Moez Baccouche, Christian Wolf
2019-12-06
Synchronous Transformers for End-to-End Speech Recognition
Zhengkun Tian, Jiangyan Yi, Ye Bai, Jianhua Tao, Shuai Zhang, Zhengqi Wen
2019-12-06
Self-Supervised Contextual Language Representation of Radiology Reports to Improve the Identification of Communication Urgency
Xing Meng, Craig H. Ganoe, Ryan T. Sieberg, Yvonne Y. Cheung, Saeed Hassanpour
2019-12-05
Scratch that! An Evolution-based Adversarial Attack against Neural Networks
Malhar Jere, Loris Rossi, Briland Hitaj, Gabriela Ciocarlie, Giacomo Boracchi, Farinaz Koushanfar
2019-12-05
AMUSED: A Multi-Stream Vector Representation Method for Use in Natural Dialogue
Gaurav Kumar, Rishabh Joshi, Jaspreet Singh, Promod Yenigalla
2019-12-04
Viewpoint-Aware Loss with Angular Regularization for Person Re-Identification
Zhihui Zhu, Xinyang Jiang, Feng Zheng, Xiaowei Guo, Feiyue Huang, Weishi Zheng, Xing Sun
2019-12-03
TU Wien @ TREC Deep Learning '19 -- Simple Contextualization for Re-ranking
Sebastian Hofstätter, Markus Zlabinger, Allan Hanbury
2019-12-03
Multi-Scale Self-Attention for Text Classification
Qipeng Guo, Xipeng Qiu, Pengfei Liu, Xiangyang Xue, Zheng Zhang
2019-12-02
BLiMP: The Benchmark of Linguistic Minimal Pairs for English
Alex Warstadt, Alicia Parrish, Haokun Liu, Anhad Mohananey, Wei Peng, Sheng-Fu Wang, Samuel R. Bowman
2019-12-02
Solving Arithmetic Word Problems Automatically Using Transformer and Unambiguous Representations
Kaden Griffith, Jugal Kalita
2019-12-02
Long Distance Relationships without Time Travel: Boosting the Performance of a Sparse Predictive Autoencoder in Sequence Modeling
Jeremy Gordon, David Rawlinson, Subutai Ahmad
2019-12-02
Neural Academic Paper Generation
Samet Demir, Uras Mutlu, Özgur Özdemir
2019-12-02
Audiovisual Transformer Architectures for Large-Scale Classification and Synchronization of Weakly Labeled Audio Events
Wim Boes, Hugo Van hamme
2019-12-02
Hybrid 8-bit Floating Point (HFP8) Training and Inference for Deep Neural Networks
Xiao Sun, Jungwook Choi, Chia-Yu Chen, Naigang Wang, Swagath Venkataramani, Vijayalakshmi (Viji) Srinivasan, Xiaodong Cui, Wei Zhang, Kailash Gopalakrishnan
2019-12-01
Minimum Bayes Risk Training of RNN-Transducer for End-to-End Speech Recognition
Chao Weng, Chengzhu Yu, Jia Cui, Chunlei Zhang, Dong Yu
2019-11-28
Do Attention Heads in BERT Track Syntactic Dependencies?
Phu Mon Htut, Jason Phang, Shikha Bordia, Samuel R. Bowman
2019-11-27
Taking a Stance on Fake News: Towards Automatic Disinformation Assessment via Deep Bidirectional Transformer Language Models for Stance Detection
Chris Dulhanty, Jason L. Deglint, Ibrahim Ben Daya, Alexander Wong
2019-11-27
SimpleBooks: Long-term dependency book dataset with simplified English vocabulary for word-level language modeling
Huyen Nguyen
2019-11-27
DeFINE: DEep Factorized INput Token Embeddings for Neural Sequence Modeling
Sachin Mehta, Rik Koncel-Kedziorski, Mohammad Rastegari, Hannaneh Hajishirzi
2019-11-27
Password-conditioned Anonymization and Deanonymization with Face Identity Transformers
Xiuye Gu, Weixin Luo, Michael S. Ryoo, Yong Jae Lee
2019-11-26
Relevance-Promoting Language Model for Short-Text Conversation
Xin Li, Piji Li, Wei Bi, Xiaojiang Liu, Wai Lam
2019-11-26
Efficient Attention Mechanism for Visual Dialog that can Handle All the Interactions between Multiple Inputs
Van-Quang Nguyen, Masanori Suganuma, Takayuki Okatani
2019-11-26
Autoencoding Undirected Molecular Graphs With Neural Networks
Jeppe Johan Waarkjær Olsen, Peter Ebert Christensen, Martin Hangaard Hansen, Alexander Rosenberg Johansen
2019-11-26
Who did They Respond to? Conversation Structure Modeling using Masked Hierarchical Transformer
Henghui Zhu, Feng Nan, Zhiguo Wang, Ramesh Nallapati, Bing Xiang
2019-11-25
Learning to Reuse Translations: Guiding Neural Machine Translation with Examples
Qian Cao, Shaohui Kuang, Deyi Xiong
2019-11-25
Spectral Graph Transformer Networks for Brain Surface Parcellation
Ran He, Karthik Gopinath, Christian Desrosiers, Herve Lombaert
2019-11-22
Neuron Interaction Based Representation Composition for Neural Machine Translation
Jian Li, Xing Wang, Baosong Yang, Shuming Shi, Michael R. Lyu, Zhaopeng Tu
2019-11-22
Factorized Multimodal Transformer for Multimodal Sequential Learning
Amir Zadeh, Chengfeng Mao, Kelly Shi, Yiwei Zhang, Paul Pu Liang, Soujanya Poria, Louis-Philippe Morency
2019-11-22
Improving N-gram Language Models with Pre-trained Deep Transformer
Yiren Wang, Hongzhao Huang, Zhe Liu, Yutong Pang, Yongqiang Wang, ChengXiang Zhai, Fuchun Peng
2019-11-22
WildMix Dataset and Spectro-Temporal Transformer Model for Monoaural Audio Source Separation
Amir Zadeh, Tianjun Ma, Soujanya Poria, Louis-Philippe Morency
2019-11-21
Filter Response Normalization Layer: Eliminating Batch Dependence in the Training of Deep Neural Networks
Saurabh Singh, Shankar Krishnan
2019-11-21
MarioNETte: Few-shot Face Reenactment Preserving Identity of Unseen Targets
Sungjoo Ha, Martin Kersner, Beomsu Kim, Seokjun Seo, Dongyoung Kim
2019-11-19
Graph Transformer for Graph-to-Sequence Learning
Deng Cai, Wai Lam
2019-11-18
MUSE: Parallel Multi-Scale Attention for Sequence to Sequence Learning
Guangxiang Zhao, Xu Sun, Jingjing Xu, Zhiyuan Zhang, Liangchen Luo
2019-11-17
Music theme recognition using CNN and self-attention
Manoj Sukhavasi, Sainath Adapa
2019-11-16
Interpreting chest X-rays via CNNs that exploit hierarchical disease dependencies and uncertainty labels
Hieu H. Pham, Tung T. Le, Dat Q. Tran, Dat T. Ngo, Ha Q. Nguyen
2019-11-15
Sequential Recommendation with Relation-Aware Kernelized Self-Attention
Mingi Ji, Weonyoung Joo, Kyungwoo Song, Yoon-Yeong Kim, Il-Chul Moon
2019-11-15
Evaluating robustness of language models for chief complaint extraction from patient-generated text
Ilya Valmianski, Caleb Goodwin, Ian M. Finn, Naqi Khan, Daniel S. Zisook
2019-11-15
Selection-based Question Answering of an MOOC
Atul Sahay, Smita Gholkar, Kavi Arya
2019-11-15
Iterative Answer Prediction with Pointer-Augmented Multimodal Transformers for TextVQA
Ronghang Hu, Amanpreet Singh, Trevor Darrell, Marcus Rohrbach
2019-11-14
Attention on Abstract Visual Reasoning
Lukas Hahne, Timo Lüddecke, Florentin Wörgötter, David Kappel
2019-11-14
Compressive Transformers for Long-Range Sequence Modelling
Jack W. Rae, Anna Potapenko, Siddhant M. Jayakumar, Timothy P. Lillicrap
2019-11-13
Character-based NMT with Transformer
Rohit Gupta, Laurent Besacier, Marc Dymetman, Matthias Gallé
2019-11-12
SMILES Transformer: Pre-trained Molecular Fingerprint for Low Data Drug Discovery
Shion Honda, Shoi Shi, Hiroki R. Ueda
2019-11-12
Disentangle, align and fuse for multimodal and zero-shot image segmentation
Agisilaos Chartsias, Giorgos Papanastasiou, Chengjia Wang, Scott Semple, David E. Newby, Rohan Dharmakumar, Sotirios A. Tsaftaris
2019-11-11
Attending to Entities for Better Text Understanding
Pengxiang Cheng, Katrin Erk
2019-11-11
TANDA: Transfer and Adapt Pre-Trained Transformer Models for Answer Sentence Selection
Siddhant Garg, Thuy Vu, Alessandro Moschitti
2019-11-11
BP-Transformer: Modelling Long-Range Context via Binary Partitioning
Zihao Ye, Qipeng Guo, Quan Gan, Xipeng Qiu, Zheng Zhang
2019-11-11
Long-span language modeling for speech recognition
Sarangarajan Parthasarathy, William Gale, Xie Chen, George Polovets, Shuangyu Chang
2019-11-11
Two-Headed Monster And Crossed Co-Attention Networks
Yaoyiran Li, Jing Jiang
2019-11-10
Improving Transformer Models by Reordering their Sublayers
Ofir Press, Noah A. Smith, Omer Levy
2019-11-10
Learning to Few-Shot Learn Across Diverse Natural Language Classification Tasks
Trapit Bansal, Rishikesh Jha, Andrew McCallum
2019-11-10
Distilling Knowledge Learned in BERT for Text Generation
Yen-Chun Chen, Zhe Gan, Yu Cheng, Jingzhou Liu, Jingjing Liu
2019-11-10
TENER: Adapting Transformer Encoder for Named Entity Recognition
Hang Yan, Bocao Deng, Xiaonan Li, Xipeng Qiu
2019-11-10
Listen and Fill in the Missing Letters: Non-Autoregressive Transformer for Speech Recognition
Nanxin Chen, Shinji Watanabe, Jesús Villalba, Najim Dehak
2019-11-10
Syntax-Infused Transformer and BERT models for Machine Translation and Natural Language Understanding
Dhanasekar Sundararaman, Vivek Subramanian, Guoyin Wang, Shijing Si, Dinghan Shen, Dong Wang, Lawrence Carin
2019-11-10
A Reinforced Generation of Adversarial Examples for Neural Machine Translation
Wei Zou, Shujian Huang, Jun Xie, Xinyu Dai, Jiajun Chen
2019-11-09
Question Generation from Paragraphs: A Tale of Two Hierarchical Models
Vishwajeet Kumar, Raktim Chaki, Sai Teja Talluri, Ganesh Ramakrishnan, Yuan-Fang Li, Gholamreza Haffari
2019-11-08
Lipschitz Constrained Parameter Initialization for Deep Transformers
Hongfei Xu, Qiuhui Liu, Josef van Genabith, Deyi Xiong, Jingyi Zhang
2019-11-08
Resurrecting Submodularity for Neural Text Generation
Simeng Han, Xiang Lin, Shafiq Joty
2019-11-08
Graph-to-Graph Transformer for Transition-based Dependency Parsing
Alireza Mohammadshahi, James Henderson
2019-11-08
Towards Hierarchical Importance Attribution: Explaining Compositional Semantics for Neural Sequence Models
Xisen Jin, Zhongyu Wei, Junyi Du, Xiangyang Xue, Xiang Ren
2019-11-08
Probing Contextualized Sentence Representations with Visual Awareness