
Scaled Dot-Product Attention

Introduced by Vaswani et al. in Attention Is All You Need

Scaled dot-product attention is an attention mechanism in which the dot products of queries and keys are scaled down by $\sqrt{d_k}$, where $d_k$ is the dimension of the keys. Formally, given a query matrix $Q$, a key matrix $K$ and a value matrix $V$, the attention is calculated as:

$$ \text{Attention}(Q, K, V) = \text{softmax}\left(\frac{QK^{T}}{\sqrt{d_k}}\right)V $$
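The formula above can be sketched in plain Python. This is a minimal illustration rather than a reference implementation: it assumes $Q$, $K$ and $V$ are given as lists of row vectors, applies the softmax independently to each query's row of scores, and omits masking, batching and multiple heads. The function names `softmax` and `scaled_dot_product_attention` are chosen here for illustration only.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(Q, K, V):
    """Compute softmax(Q K^T / sqrt(d_k)) V for lists of row vectors.

    Q: n_q rows of length d_k, K: n_k rows of length d_k,
    V: n_k rows of length d_v. Returns n_q rows of length d_v.
    """
    d_k = len(K[0])
    d_v = len(V[0])
    scale = math.sqrt(d_k)
    output = []
    for q in Q:
        # Scaled dot product of this query with every key.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / scale for k in K]
        # Attention weights over the keys sum to 1.
        weights = softmax(scores)
        # Weighted average of the value rows.
        output.append(
            [sum(w * v[j] for w, v in zip(weights, V)) for j in range(d_v)]
        )
    return output


# Toy example: two queries, two keys, two values (illustrative numbers).
Q = [[1.0, 0.0], [0.0, 1.0]]
K = [[1.0, 0.0], [0.0, 1.0]]
V = [[1.0, 2.0], [3.0, 4.0]]
out = scaled_dot_product_attention(Q, K, V)
```

Each output row is a convex combination of the rows of $V$, weighted towards the values whose keys align best with the query. In the toy example the first query matches the first key, so `out[0]` lies closer to `V[0]` than to `V[1]`.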

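If we assume that $q$ and $k$ are $d_k$-dimensional vectors whose components are independent random variables with mean $0$ and variance $1$, then each product $q_i k_i$ has mean $0$ and variance $1$, so their dot product, $q \cdot k = \sum_{i=1}^{d_k} q_i k_i$, has mean $0$ and variance $d_k$. Large-magnitude dot products push the softmax into regions where its gradients are extremely small, so we would prefer these values to have variance $1$; to achieve this, we divide by $\sqrt{d_k}$.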
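The variance argument can be checked empirically. The sketch below assumes standard normal components, which is one distribution satisfying the mean-$0$, variance-$1$ assumption. The helper name `dot_product_variance`, the sample count and the seed are illustrative choices, not part of the original description.

```python
import math
import random
import statistics


def dot_product_variance(d_k, n_samples=5000, seed=0):
    """Estimate the variance of q . k before and after scaling by sqrt(d_k).

    Components of q and k are drawn independently from N(0, 1).
    Returns (variance of raw dot products, variance of scaled dot products).
    """
    rng = random.Random(seed)
    raw = []
    for _ in range(n_samples):
        q = [rng.gauss(0.0, 1.0) for _ in range(d_k)]
        k = [rng.gauss(0.0, 1.0) for _ in range(d_k)]
        raw.append(sum(qi * ki for qi, ki in zip(q, k)))
    scaled = [x / math.sqrt(d_k) for x in raw]
    return statistics.pvariance(raw), statistics.pvariance(scaled)


# With d_k = 64, the raw variance should be near 64 and the scaled one near 1.
raw_var, scaled_var = dot_product_variance(d_k=64)
```

Because these are sample estimates, the two numbers fluctuate around $d_k$ and $1$ rather than matching them exactly. Increasing `n_samples` tightens the estimate.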

Source: Attention Is All You Need

Latest Papers

PAPER DATE
Do You Even Need Attention? A Stack of Feed-Forward Layers Does Surprisingly Well on ImageNet
| Luke Melas-Kyriazi
2021-05-06
Aligning Subtitles in Sign Language Videos
Hannah BullTriantafyllos AfourasGül VarolSamuel AlbanieLiliane MomeniAndrew Zisserman
2021-05-06
Adapting Monolingual Models: Data can be Scarce when Language Similarity is High
Wietse de VriesMartijn BarteldsMalvina NissimMartijn Wieling
2021-05-06
Introducing Information Retrieval for Biomedical Informatics Students
| Sanya B. TanejaRichard D. BoyceWilliam T. ReynoldsDenis Newman-Griffis
2021-05-06
Bird's Eye: Probing for Linguistic Graph Structures with a Simple Information-Theoretic Approach
Yifan HouMrinmaya Sachan
2021-05-06
TABBIE: Pretrained Representations of Tabular Data
Hiroshi IidaDung ThaiVarun ManjunathaMohit Iyyer
2021-05-06
Sequential Encryption of Sparse Neural Networks Toward Optimum Representation of Irregular Sparsity
Baeseong ParkSe Jung KwonDongsoo LeeDaehwan OhByeongwook KimYongkweon JeonYeonju Ro
2021-05-05
TransHash: Transformer-based Hamming Hashing for Efficient Image Retrieval
Yongbiao ChenSheng ZhangFangxin LiuZhigang ChangMang YeZhengwei Qi
2021-05-05
Visual Composite Set Detection Using Part-and-Sum Transformers
Qi DongZhuowen TuHaofu LiaoYuting ZhangVijay MahadevanStefano Soatto
2021-05-05
Attention for Image Registration (AiR): an unsupervised Transformer approach
ZiHao WangHervé Delingette
2021-05-05
MLP-Mixer: An all-MLP Architecture for Vision
| Ilya TolstikhinNeil HoulsbyAlexander KolesnikovLucas BeyerXiaohua ZhaiThomas UnterthinerJessica YungDaniel KeysersJakob UszkoreitMario LucicAlexey Dosovitskiy
2021-05-04
Moving Towards Centers: Re-ranking with Attention and Memory for Re-identification
Yunhao ZhouYi WangLap-Pui Chau
2021-05-04
Retrieving Complex Tables with Multi-Granular Graph Representation Learning
| Fei WangKexuan SunMuhao ChenJay PujaraPedro Szekely
2021-05-04
ISTR: End-to-End Instance Segmentation with Transformers
| Jie HuLiujuan CaoYao LuShengchuan ZhangYan WangKe LiFeiyue HuangLing ShaoRongrong Ji
2021-05-03
One Model to Rule them All: Towards Zero-Shot Learning for Databases
Benjamin HilprechtCarsten Binnig
2021-05-03
Goldilocks: Just-Right Tuning of BERT for Technology-Assisted Review
Eugene YangSean MacAvaneyDavid D. LewisOphir Frieder
2021-05-03
SmoothI: Smooth Rank Indicators for Differentiable IR Metrics
Thibaut ThonetYagmur Gizem CinarEric GaussierMinghan LiJean-Michel Renders
2021-05-03
Unreasonable Effectiveness of Rule-Based Heuristics in Solving Russian SuperGLUE Tasks
Tatyana IazykovaDenis KapelyushnikOlga BystrovaAndrey Kutuzov
2021-05-03
MathBERT: A Pre-Trained Model for Mathematical Formula Understanding
Shuai PengKe YuanLiangcai GaoZhi Tang
2021-05-02
Anatomy-Guided Parallel Bottleneck Transformer Network for Automated Evaluation of Root Canal Therapy
Yunxiang LiGuodong ZengYifan ZhangJun WangQianni ZhangQun JinLingling SunQisi LianNeng XiaRuizi PengKai TangYaqi WangShuai Wang
2021-05-02
MRCBert: A Machine Reading ComprehensionApproach for Unsupervised Summarization
| Saurabh JainGuokai TangLim Sze Chi
2021-05-01
Incorporating Transformer and LSTM to Kalman Filter with EM algorithm for state estimation
| Zhuangwei Shi
2021-05-01
Audio Transformers:Transformer Architectures For Large Scale Audio Understanding. Adieu Convolutions
Prateek VermaJonathan Berger
2021-05-01
When to Fold'em: How to answer Unanswerable questions
| Marshall HoZhipeng ZhouJudith He
2021-05-01
SVT-Net: A Super Light-Weight Network for Large Scale Place Recognition using Sparse Voxel Transformers
Zhaoxin FanZhenbo SongHongyan LiuJun HeXiaoyong Du
2021-05-01
Mitigating Political Bias in Language Models Through Reinforced Calibration
Ruibo LiuChenyan JiaJason WeiGuangxuan XuLili WangSoroush Vosoughi
2021-04-30
CAT: Cross-Attention Transformer for One-Shot Object Detection
Weidong LinYuyan DengYang GaoNing WangJinghao ZhouLingqiao LiuLei ZhangPeng Wang
2021-04-30
BERT Meets Relational DB: Contextual Representations of Relational Databases
Siddhant AroraVinayak GuptaGarima GaurSrikanta Bedathur
2021-04-30
GTN-ED: Event Detection Using Graph Transformer Networks
Sanghamitra DuttaLiang MaTanay Kumar SahaDi LuJoel TetreaultAlejandro Jaimes
2021-04-30
Chop Chop BERT: Visual Question Answering by Chopping VisualBERT's Heads
Chenyu GaoQi ZhuPeng WangQi Wu
2021-04-30
CoSformer: Detecting Co-Salient Object with Transformers
Lv Tang
2021-04-30
CTLR@WiC-TSV: Target Sense Verification using Marked Inputs andPre-trained Models
José G. MorenoElvys Linhares PontesGaël Dias
2021-04-30
Word Sense Disambiguation with Transformer Models
Pierre-Yves VandenbusscheTony ScerriRon Daniel Jr.
2021-04-30
GasHis-Transformer: A Multi-scale Visual Transformer Approach for Gastric Histopathology Image Classification
HaoYuan ChenChen LiXiaoyan LiWeiming HuYixin LiWanli LiuChanghao SunYuDong YaoMarcin Grzegorzek
2021-04-29
Emerging Properties in Self-Supervised Vision Transformers
| Mathilde CaronHugo TouvronIshan MisraHervé JégouJulien MairalPiotr BojanowskiArmand Joulin
2021-04-29
Using Adaptive Gradient for Texture Learning in Single-View 3D Reconstruction
Luoyang LinDihong Tian
2021-04-29
Entailment as Few-Shot Learner
Sinong WangHan FangMadian KhabsaHanzi MaoHao Ma
2021-04-29
Let's Play Mono-Poly: BERT Can Reveal Words' Polysemy Level and Partitionability into Senses
Aina Garí SolerMarianna Apidianaki
2021-04-29
AMR Parsing with Action-Pointer Transformer
| Jiawei ZhouTahira NaseemRamón Fernandez AstudilloRadu Florian
2021-04-29
Pyramid Medical Transformer for Medical Image Segmentation
Zhuangzhuang ZhangBaozhou SunWeixiong Zhang
2021-04-29
HandsFormer: Keypoint Transformer for Monocular 3D Pose Estimation ofHands and Object in Interaction
Shreyas HampaliSayan Deb SarkarMahdi RadVincent Lepetit
2021-04-29
Medical Transformer: Universal Brain Encoder for 3D MRI Analysis
Eunji JunSeungwoo JeongDa-Woon HeoHeung-Il Suk
2021-04-28
Societal Biases in Retrieved Contents: Measurement Framework and Adversarial Mitigation for BERT Rankers
| Navid RekabsazSimone KopeinikMarkus Schedl
2021-04-28
Twins: Revisiting Spatial Attention Design in Vision Transformers
| Xiangxiang ChuZhi TianYuqing WangBo ZhangHaibing RenXiaolin WeiHuaxia XiaChunhua Shen
2021-04-28
MelBERT: Metaphor Detection via Contextualized Late Interaction using Metaphorical Identification Theories
| Minjin ChoiSunkyung LeeEunseong ChoiHeesoo ParkJunhyuk LeeDongwon LeeJongwuk Lee
2021-04-28
Improving BERT Model Using Contrastive Learning for Biomedical Relation Extraction
| Peng SuYifan PengK. Vijay-Shanker
2021-04-28
Point Cloud Learning with Transformer
Xian-Feng HanYu-Jia KuangGuo-Qiang Xiao
2021-04-28
Inpainting Transformer for Anomaly Detection
Jonathan PirnayKeng Chai
2021-04-28
Dual Transformer for Point Cloud Analysis
Xian-Feng HanYi-Fei JinHui-Xian ChengGuo-Qiang Xiao
2021-04-27
Generating Lead Sheets with Affect: A Novel Conditional seq2seq Framework
| Dimos MakrisKat R. AgresDorien Herremans
2021-04-27
UoT-UWF-PartAI at SemEval-2021 Task 5: Self Attention Based Bi-GRU with Multi-Embedding Representation for Toxicity Highlighter
Hamed Babaei GiglouTaher RahgooyMostafa RahgouyJafar Razmara
2021-04-27
Extractive and Abstractive Explanations for Fact-Checking and Evaluation of News
Ashkan KazemiZehua LiVerónica Pérez-RosasRada Mihalcea
2021-04-27
Semi-supervised Interactive Intent Labeling
Saurav SahayEda OkurNagib HakimLama Nachman
2021-04-27
Multi-class Text Classification using BERT-based Active Learning
Sumanth PrabhuMoosa MohamedHemant Misra
2021-04-27
Easy and Efficient Transformer : Scalable Inference Solution For large NLP mode
| Gongzheng liYadong XiJingzhen DingDuan WangBai LiuChangjie FanXiaoxi MaoZeng Zhao
2021-04-26
Rich Semantics Improve Few-shot Learning
| Mohamed AfhamSalman KhanMuhammad Haris KhanMuzammal NaseerFahad Shahbaz Khan
2021-04-26
Visformer: The Vision-friendly Transformer
| Zhengsu ChenLingxi XieJianwei NiuXuefeng LiuLonghui WeiQi Tian
2021-04-26
Focused Attention Improves Document-Grounded Generation
| Shrimai PrabhumoyeKazuma HashimotoYingbo ZhouAlan W BlackRuslan Salakhutdinov
2021-04-26
Phrase break prediction with bidirectional encoder representations in Japanese text-to-speech synthesis
Kosuke FutamataByeongseon ParkRyuichi YamamotoKentaro Tachibana
2021-04-26
PanGu-$α$: Large-scale Autoregressive Pretrained Chinese Language Models with Auto-parallel Computation
Wei ZengXiaozhe RenTeng SuHui WangYi LiaoZhiwei WangXin JiangZhenZhang YangKaisheng WangXiaoda ZhangChen LiZiyan GongYifan YaoXinjing HuangJun WangJianfeng YuQi GuoYue YuYan ZhangJin WangHengtao TaoDasen YanZexuan YiFang PengFangqing JiangHan ZhangLingfeng DengYehong ZhangZhe LinChao ZhangShaojie ZhangMingyue GuoShanzhi GuGaojun FanYaoWei WangXuefeng JinQun LiuYonghong Tian
2021-04-26
Accounting for Agreement Phenomena in Sentence Comprehension with Transformer Language Models: Effects of Similarity-based Interference on Surprisal and Attention
Soo Hyun RyuRichard L. Lewis
2021-04-26
Head-synchronous Decoding for Transformer-based Streaming ASR
Mohan LiCatalin ZorilaRama Doddipatla
2021-04-26
Transformer Meets DCFAM: A Novel Semantic Segmentation Scheme for Fine-Resolution Remote Sensing Images
Libo WangRui LiChenxi DuanShenghui Fang
2021-04-25
Visual Saliency Transformer
Nian LiuNi ZhangKaiyuan WanJunwei HanLing Shao
2021-04-25
baller2vec++: A Look-Ahead Multi-Entity Transformer For Modeling Coordinated Agents
| Michael A. AlcornAnh Nguyen
2021-04-24
Extract then Distill: Efficient and Effective Task-Agnostic BERT Distillation
Cheng ChenYichun YinLifeng ShangZhi WangXin JiangXiao ChenQun Liu
2021-04-24
Learning Passage Impacts for Inverted Indexes
Antonio MalliaOmar KhattabNicola TonellottoTorsten Suel
2021-04-24
Optimizing small BERTs trained for German NER
Jochen ZöllnerKonrad SperfeldChristoph WickRoger Labahn
2021-04-23
VidTr: Video Transformer Without Convolutions
Xinyu LiYanyi ZhangChunhui LiuBing ShuaiYi ZhuBiagio BrattoliHao ChenIvan MarsicJoseph Tighe
2021-04-23
Learning to Cluster Faces via Transformer
Jinxing YeXioajiang PengBaigui SunKai WangXiuyu SunHao LiHanqing Wu
2021-04-23
Multimodal Fusion with BERT and Attention Mechanism for Fake News Detection
Nguyen Manh Duc TuanPham Quang Nhat Minh
2021-04-23
BERT-CoQAC: BERT-based Conversational Question Answering in Context
Munazza ZaibDai Hoang TranSubhash SagarAdnan MahmoodWei E. ZhangQuan Z. Sheng
2021-04-23
Towards Trustworthy Deception Detection: Benchmarking Model Robustness across Domains, Modalities, and Languages
Maria GlenskiEllyn AytonRobin CosbeyDustin ArendtSvitlana Volkova
2021-04-23
So-ViT: Mind Visual Tokens for Vision Transformer
| Jiangtao XieRuiren ZengQilong WangZiqi ZhouPeihua Li
2021-04-22
VATT: Transformers for Multimodal Self-Supervised Learning from Raw Video, Audio and Text
| Hassan AkbariLinagzhe YuanRui QianWei-Hong ChuangShih-Fu ChangYin CuiBoqing Gong
2021-04-22
Token Labeling: Training a 85.4% Top-1 Accuracy Vision Transformer with 56M Parameters on ImageNet
| Zihang JiangQibin HouLi YuanDaquan ZhouXiaojie JinAnran WangJiashi Feng
2021-04-22
On Geodesic Distances and Contextual Embedding Compression for Text Classification
| Rishi JhaKai Mihata
2021-04-22
Carbon Emissions and Large Neural Network Training
David PattersonJoseph GonzalezQuoc LeChen LiangLluis-Miquel MunguiaDaniel RothchildDavid SoMaud TexierJeff Dean
2021-04-21
Label-Synchronous Speech-to-Text Alignment for ASR Using Forward and Backward Transformers
Yusuke KidaTatsuya KomatsuMasahito Togami
2021-04-21
Discriminative Self-training for Punctuation Prediction
Qian ChenWen WangMengzhe ChenQinglin Zhang
2021-04-21
"What's The Context?" : Long Context NLM Adaptation for ASR Rescoring in Conversational Agents
Ashish ShenoySravan BodapatiMonica SunkaraSrikanth RonankiKatrin Kirchhoff
2021-04-21
Disfluency Detection with Unlabeled Data and Small BERT Models
Johann C. RochollVicky ZayatsDaniel D. WalkerNoah B. MuradAaron SchneiderDaniel J. Liebling
2021-04-21
Efficient pre-training objectives for Transformers
Luca Di LielloMatteo GabburoAlessandro Moschitti
2021-04-20
UIT-ISE-NLP at SemEval-2021 Task 5: Toxic Spans Detection with BiLSTM-CRF and Toxic Bert Comment Classification
Son T. LuuNgan Luu-Thuy Nguyen
2021-04-20
Measuring Shifts in Attitudes Towards COVID-19 Measures in Belgium Using Multilingual BERT
| Kristen ScottPieter DelobelleBettina Berendt
2021-04-20
WASSA@IITK at WASSA 2021: Multi-task Learning and Transformer Finetuning for Emotion Classification and Empathy Prediction
Jay MundraRohan GuptaSagnik Mukherjee
2021-04-20
B-PROP: Bootstrapped Pre-training with Representative Words Prediction for Ad-hoc Retrieval
Xinyu MaJiafeng GuoRuqing ZhangYixing FanYingyan LiXueqi Cheng
2021-04-20
CATE meets ML -- The Conditional Average Treatment Effect and Machine Learning
Daniel Jacob
2021-04-20
Analyzing COVID-19 Tweets with Transformer-based Language Models
Philip FeldmanSim TiwariCharissa S. L. CheahJames R. FouldsSHimei Pan
2021-04-20
Modeling Event Plausibility with Consistent Conceptual Abstraction
Ian PoradaKaheer SulemanAdam TrischlerJackie Chi Kit Cheung
2021-04-20
Subsentence Extraction from Text Using Coverage-Based Deep Learning Language Models
| JongYoon LimInkyu SaHo Seok AhnNorina GasteigerSanghyub John LeeBruce MacDonald
2021-04-20
Improving Transformer-Kernel Ranking Model Using Conformer and Query Term Independence
Bhaskar MitraSebastian HofstatterHamed ZamaniNick Craswell
2021-04-19
Multi-Modal Fusion Transformer for End-to-End Autonomous Driving
| Aditya PrakashKashyap ChittaAndreas Geiger
2021-04-19
A novel Time-frequency Transformer and its Application in Fault Diagnosis of Rolling Bearings
Yifei DingMinping JiaQiuhua MiaoYudong Cao
2021-04-19
BigGreen at SemEval-2021 Task 1: Lexical Complexity Prediction with Assembly Models
Aadil IslamWeicheng MaSoroush Vosoughi
2021-04-19
TransCrowd: Weakly-Supervised Crowd Counting with Transformer
| Dingkang LiangXiwu ChenWei XuYu ZhouXiang Bai
2021-04-19
Advanced Long-context End-to-end Speech Recognition Using Context-expanded Transformers
Takaaki HoriNiko MoritzChiori HoriJonathan Le Roux
2021-04-19
Probing for Bridging Inference in Transformer Language Models
Onkar PanditYufang Hou
2021-04-19
Code Structure Guided Transformer for Source Code Summarization
Shuzheng GaoCuiyun GaoYulan HeJichuan ZengLun Yiu NieXin Xia
2021-04-19
Sentiment Classification in Swahili Language Using Multilingual BERT
Gati L. MartinMedard E. MswahiliYoung-Seob Jeong
2021-04-19
TeamUNCC@LT-EDI-EACL2021: Hope Speech Detection using Transfer Learning with Transformers
| Khyati MahajanErfan Al-HossamiSamira Shaikh
2021-04-19
OCTIS: Comparing and Optimizing Topic models is Simple!
| Silvia TerragniElisabetta FersiniBruno Giovanni GaluzziPietro TropeanoAntonio Candelieri
2021-04-19
ELECTRAMed: a new pre-trained language representation model for biomedical NLP
| Giacomo MioloGiulio MantoanCarlotta Orsenigo
2021-04-19
Neural Language Models with Distant Supervision to Identify Major Depressive Disorder from Clinical Notes
Bhavani Singh Agnikula KshatriyaNicolas A NunezManuel Gardea- ResendezEuijung RyuBrandon J CoombesSunyang FuMark A FryeJoanna M BiernackaYanshan Wang
2021-04-19
Extracting Temporal Event Relation with Syntactic-Guided Temporal Graph Transformer
Shuaicheng ZhangLifu HuangQiang Ning
2021-04-19
Operationalizing a National Digital Library: The Case for a Norwegian Transformer Model
Per E KummervoldJavier de la RosaFreddy WetjenSvein Arne Brygfjeld
2021-04-19
Modeling "Newsworthiness" for Lead-Generation Across Corpora
Alexander SpangherNanyun PengJonathan MayEmilio Ferrara
2021-04-19
GPT3Mix: Leveraging Large-scale Language Models for Text Augmentation
Kang Min YooDongju ParkJaewook KangSang-Woo LeeWoomyeong Park
2021-04-18
FedNLP: A Research Platform for Federated Learning in Natural Language Processing
| Bill Yuchen LinChaoyang HeZihang ZengHulin WangYufen HuangMahdi SoltanolkotabiXiang RenSalman Avestimehr
2021-04-18
Fantastically Ordered Prompts and Where to Find Them: Overcoming Few-Shot Prompt Order Sensitivity
Yao LuMax BartoloAlastair MooreSebastian RiedelPontus Stenetorp
2021-04-18
Natural Instructions: Benchmarking Generalization to New Tasks from Natural Language Instructions
Swaroop MishraDaniel KhashabiChitta BaralHannaneh Hajishirzi
2021-04-18
A Token-level Reference-free Hallucination Detection Benchmark for Free-form Text Generation
Tianyu LiuYizhe ZhangChris BrockettYi MaoZhifang SuiWeizhu ChenBill Dolan
2021-04-18
Rethinking Network Pruning -- under the Pre-train and Fine-tune Paradigm
Dongkuan XuIan E. H. YenJinxi ZhaoZhibin Xiao
2021-04-18
Dual-View Distilled BERT for Sentence Embedding
Xingyi Cheng
2021-04-18
Demystifying the Better Performance of Position Encoding Variants for Transformer
Pu-Chin ChenHenry TsaiSrinadh BhojanapalliHyung Won ChungYin-Wen ChangChun-Sung Ferng
2021-04-18
Language in a (Search) Box: Grounding Language Learning in Real-World Human-Machine Interaction
Federico BianchiCiro GrecoJacopo Tagliabue
2021-04-18
On the Strengths of Cross-Attention in Pretrained Transformers for Machine Translation
Mozhdeh GheiniXiang RenJonathan May
2021-04-18
Zero-shot Cross-lingual Transfer of Neural Machine Translation with Multilingual Pretrained Encoders
Guanhua ChenShuming MaYun ChenLi DongDongdong ZhangJia PanWenping WangFuru Wei
2021-04-18
CEAR: Cross-Entity Aware Reranker for Knowledge Base Completion
Keshav KolluruMayank Singh ChauhanYatin NandwaniParag SinglaMausam
2021-04-18
mT6: Multilingual Pretrained Text-to-Text Transformer with Translation Pairs
Zewen ChiLi DongShuming MaShaohan Huang Xian-Ling MaoHeyan HuangFuru Wei
2021-04-18
The Power of Scale for Parameter-Efficient Prompt Tuning
| Brian LesterRami Al-RfouNoah Constant
2021-04-18
When Does Pretraining Help? Assessing Self-Supervised Learning for Law and the CaseHOLD Dataset
| Lucia ZhengNeel GuhaBrandon R. AndersonPeter HendersonDaniel E. Ho
2021-04-18
Identifying the Limits of Cross-Domain Knowledge Transfer for Pretrained Models
| Zhengxuan WuNelson F. LiuChristopher Potts
2021-04-17
Multi-source Neural Topic Modeling in Multi-view Embedding Spaces
| Pankaj GuptaYatin ChaudharyHinrich Schütze
2021-04-17
Higher Order Recurrent Space-Time Transformer
Tsung-Ming TaiGiuseppe FiameniCheng-Kuang LeeOswald Lanz
2021-04-17
Visual Transformer Pruning
Mingjian ZhuKai HanYehui TangYunhe Wang
2021-04-17
Improving Zero-Shot Cross-Lingual Transfer Learning via Robust Training
Kuan-Hao HuangWasi Uddin AhmadNanyun PengKai-Wei Chang
2021-04-17
UPB at SemEval-2021 Task 5: Virtual Adversarial Training for Toxic Spans Detection
Andrei ParaschivDumitru-Clementin CercelMihai Dascalu
2021-04-17
The Topic Confusion Task: A Novel Scenario for Authorship Attribution
Malik H. AltakroriJackie Chi Kit CheungBenjamin C. M. Fung
2021-04-17
Frequency-based Distortions in Contextualized Word Embeddings
Kaitlyn ZhouKawin EthayarajhDan Jurafsky
2021-04-17
A multilabel approach to morphosyntactic probing
Naomi Tachikawa ShapiroAmandalynne PaulladaShane Steinert-Threlkeld
2021-04-17
Hierarchical Transformer Networks for Longitudinal Clinical Document Classification
Yuqi SiKirk Roberts
2021-04-17
ASBERT: Siamese and Triplet network embedding for open question answering
Olabanji Shonibare
2021-04-17
Co-BERT: A Context-Aware BERT Retrieval Model Incorporating Local and Query-specific Context
Xiaoyang ChenKai HuiBen HeXianpei HanLe SunZheng Ye
2021-04-17
Editing Factual Knowledge in Language Models
| Nicola De CaoWilker AzizIvan Titov
2021-04-16
Serial or Parallel? Plug-able Adapter for multilingual machine translation
Yaoming ZhuJiangtao FengChengqi ZhaoMingxuan WangLei LI
2021-04-16
Fast, Effective and Self-Supervised: Transforming Masked LanguageModels into Universal Lexical and Sentence Encoders
Fangyu LiuIvan VulićAnna KorhonenNigel Collier
2021-04-16
Is Your Language Model Ready for Dense Representation Fine-tuning?
| Luyu GaoJamie Callan
2021-04-16
An Adversarially-Learned Turing Test for Dialog Generation Models
| Xiang GaoYizhe ZhangMichel GalleyBill Dolan
2021-04-16
Towards Variable-Length Textual Adversarial Attacks
Junliang GuoZhirui ZhangLinlin ZhangLinli XuBoxing ChenEnhong ChenWeihua Luo
2021-04-16
Temporal Adaptation of BERT and Performance on Downstream Document Classification: Insights from Social Media
| Paul RöttgerJanet B. Pierrehumbert
2021-04-16
Comparison of Grammatical Error Correction Using Back-Translation Models
Aomi KoyamaKengo HotateMasahiro KanekoMamoru Komachi
2021-04-16
Probing Across Time: What Does RoBERTa Know and When?
Leo Z. LiuYizhong WangJungo KasaiHannaneh HajishirziNoah A. Smith
2021-04-16
Text2App: A Framework for Creating Android Apps from Text Descriptions
| Masum HasanKazi Sajeed MehrabWasi Uddin AhmadRifat Shahriyar
2021-04-16
An Analysis of a BERT Deep Learning Strategy on a Technology Assisted Review Task
Alexandros Ioannidis
2021-04-16
Surface Form Competition: Why the Highest Probability Answer Isn't Always Right
| Ari HoltzmanPeter WestVered SchwartzYejin ChoiLuke Zettlemoyer
2021-04-16
Membership Inference Attack Susceptibility of Clinical Language Models
Abhyuday JagannathaBhanu Pratap Singh RawatHong Yu
2021-04-16
BERT memorisation and pitfalls in low-resource scenarios
Michael TänzerSebastian RuderMarek Rei
2021-04-16
ExplaGraphs: An Explanation Graph Generation Task for Structured Commonsense Reasoning
| Swarnadeep SahaPrateek YadavLisa BauerMohit Bansal
2021-04-15
Demystify Optimization Challenges in Multilingual Transformers
Xian LiHongyu Gong
2021-04-15
A Sample-Based Training Method for Distantly Supervised Relation Extraction with Pre-Trained Transformers
Mehrdad NasserMohamad Bagher SajadiBehrouz Minaei-Bidgoli
2021-04-15
Emotion Dynamics Modeling via BERT
Haiqin YangJianping Shen
2021-04-15
Text Guide: Improving the quality of long text classification by a text selection method based on feature importance
Krzysztof FiokWaldemar KarwowskiEdgar GutierrezMohammad Reza DavahliMaciej WilamowskiTareq AhramAwad Al-JuaidJozef Zurada
2021-04-15
Self-supervised Video Object Segmentation by Motion Grouping
Charig YangHala LamdouarErika LuAndrew ZissermanWeidi Xie
2021-04-15
Vision Transformer using Low-level Chest X-ray Feature Corpus for COVID-19 Diagnosis and Severity Quantification
Sangjoon ParkGwanghyun KimYujin OhJoon Beom SeoSang Min LeeJin Hwan KimSungjun MoonJae-Kwang LimJong Chul Ye
2021-04-15
Cross-domain Speech Recognition with Unsupervised Character-level Distribution Matching
| Wenxin HouJindong WangXu TanTao QinTakahiro Shinozaki
2021-04-15
Are Multilingual BERT models robust? A Case Study on Adversarial Attacks for Multilingual Question Answering
Sara RosenthalMihaela BorneaAvirup Sil
2021-04-15
SINA-BERT: A pre-trained Language Model for Analysis of Medical Texts in Persian
Nasrin TaghizadehEhsan DoostmohammadiElham SeifossadatHamid R. RabieeMaedeh S. Tahaei
2021-04-15
Privacy-Adaptive BERT for Natural Language Understanding
Chen QuWeize KongLiu YangMingyang ZhangMichael BenderskyMarc Najork
2021-04-15
NT5?! Training T5 to Perform Numerical Reasoning
| Peng-Jian YangYing Ting ChenYuechan ChenDaniel Cer
2021-04-15
TorontoCL at CMCL 2021 Shared Task: RoBERTa with Multi-Stage Fine-Tuning for Eye-Tracking Prediction
| Bai LiFrank Rudzicz
2021-04-15
UHD-BERT: Bucketed Ultra-High Dimensional Sparse Representations for Full Ranking
Kyoung-Rok JangJunmo KangGiwon HongSung-Hyon MyaengJoohee ParkTaewon YoonHeecheol Seo
2021-04-15
Points as Queries: Weakly Semi-supervised Object Detection by Points
Liangyu ChenTong YangXiangyu ZhangWei zhangJian Sun
2021-04-15
UIT-E10dot3 at SemEval-2021 Task 5: Toxic Spans Detection with Named Entity Recognition and Question-Answering Approaches
Phu Gia HoangLuan Thanh NguyenKiet Van Nguyen
2021-04-15
BERT based Transformers lead the way in Extraction of Health Information from Social Media
Sidharth RAbhiraj TiwariParthivi ChoubeySaisha KashyapSahil KhoseKumud LakaraNishesh SinghUjjwal Verma
2021-04-15
Does BERT Pretrained on Clinical Notes Reveal Sensitive Data?
| Eric LehmanSarthak JainKarl PichottaYoav GoldbergByron C. Wallace
2021-04-15
How to Train BERT with an Academic Budget
| Peter IzsakMoshe BerchanskyOmer Levy
2021-04-15
Rethinking Text Line Recognition Models
Daniel Hernandez DiazSiyang QinReeve IngleYasuhisa FujiiAlessandro Bissacco
2021-04-15
Shoulder Implant X-Ray Manufacturer Classification: Exploring with Vision Transformer
| Meng ZhouShanglin Mo
2021-04-15
Syntax-Aware Graph-to-Graph Transformer for Semantic Role Labelling
Alireza MohammadshahiJames Henderson
2021-04-15
A Survey of Recent Abstract Summarization Techniques
Diyah Puspitaningrum
2021-04-15
NAREOR: The Narrative Reordering Problem
Varun GangalSteven Y. FengEduard HovyTeruko Mitamura
2021-04-14
Dependency Parsing based Semantic Representation Learning with Graph Neural Network for Enhancing Expressiveness of Text-to-Speech
Yixuan ZhouChanghe SongJingbei LiZhiyong WuHelen Meng
2021-04-14
An Introduction of mini-AlphaStar
| Ruo-Ze LiuWenhai WangYanjie ShenZhiqi LiYang YuTong Lu
2021-04-14
Decoupled Spatial-Temporal Transformer for Video Inpainting
| Rui LiuHanming DengYangyi HuangXiaoyu ShiLewei LuWenxiu SunXiaogang WangJifeng DaiHongsheng Li
2021-04-14
Sparse Attention with Linear Units
| Biao ZhangIvan TitovRico Sennrich
2021-04-14
Knowledge-driven Answer Generation for Conversational Search
Mariana LeiteRafael FerreiraDavid SemedoJoão Magalhães
2021-04-14
Non-autoregressive sequence-to-sequence voice conversion
Tomoki HayashiWen-Chin HuangKazuhiro KobayashiTomoki Toda
2021-04-14
VTGAN: Semi-supervised Retinal Image Synthesis and Disease Prediction using Vision Transformers
| Sharif Amit KamranKhondker Fariha HossainAlireza TavakkoliStewart Lee ZuckerbrodKenton M. SandersSalah A. Baker
2021-04-14
On the Robustness of Goal Oriented Dialogue Systems to Real-world Noise
Jason KroneSailik SenguptaSaab Mansoor
2021-04-14
Disentangling Representations of Text by Masking Transformers
Xiongyi ZhangJan-Willem van de MeentByron C. Wallace
2021-04-14
An Interpretability Illusion for BERT
Tolga BolukbasiAdam PearceAnn YuanAndy CoenenEmily ReifFernanda ViégasMartin Wattenberg
2021-04-14
Static Embeddings as Efficient Knowledge Bases?
| Philipp DufterNora KassnerHinrich Schütze
2021-04-14
TWEAC: Transformer with Extendable QA Agent Classifiers
| Gregor GeigleNils ReimersAndreas RückléIryna Gurevych
2021-04-14
Demystifying BERT: Implications for Accelerator Design
Suchita PatiShaizeen AgaNuwan JayasenaMatthew D. Sinclair
2021-04-14
QA-GNN: Reasoning with Language Models and Knowledge Graphs for Question Answering
| Michihiro YasunagaHongyu RenAntoine BosselutPercy LiangJure Leskovec
2021-04-13
Semantic maps and metrics for science Semantic maps and metrics for science using deep transformer encoders
Brendan ChambersJames Evans
2021-04-13
Understanding Transformers for Bot Detection in Twitter
| Andres Garcia-SilvaCristian BerrioJose Manuel Gomez-Perez
2021-04-13
1-bit LAMB: Communication Efficient Large-Scale Large-Batch Training with LAMB's Convergence Speed
| Conglong LiAmmar Ahmad AwanHanlin TangSamyam RajbhandariYuxiong He
2021-04-13
Mediators in Determining what Processing BERT Performs First
| Aviv SlobodkinLeshem ChoshenOmri Abend
2021-04-13
Discourse Probing of Pretrained Language Models
Fajri KotoJey Han LauTimothy Baldwin
2021-04-13
UPB at SemEval-2021 Task 7: Adversarial Multi-Task Learning for Detecting and Rating Humor and Offense
Răzvan-Alexandru SmăduDumitru-Clementin CercelMihai Dascalu
2021-04-13
Transformer-based Methods for Recognizing Ultra Fine-grained Entities (RUFES)
Emanuela BorosAntoine Doucet
2021-04-13
MS2: Multi-Document Summarization of Medical Studies
| Jay DeYoungIz BeltagyMadeleine van ZuylenBailey KuehlLucy Lu Wang
2021-04-13
ViT-V-Net: Vision Transformer for Unsupervised Volumetric Medical Image Registration
| Junyu ChenYufan HeEric C. FreyYe LiYong Du
2021-04-13
Large-Scale Contextualised Language Modelling for Norwegian
| Andrey KutuzovJeremy BarnesErik VelldalLilja ØvrelidStephan Oepen
2021-04-13
Can a Transformer Pass the Wug Test? Tuning Copying Bias in Neural Morphological Inflection Models
Ling LiuMans Hulden
2021-04-13
WHOSe Heritage: Classification of UNESCO World Heritage "Outstanding Universal Value" Documents with Smoothed Labels
| Nan BaiRenqian LuoPirouz NourianAna Pereira Roders
2021-04-12
Fine-Tuning Transformers for Identifying Self-Reporting Potential Cases and Symptoms of COVID-19 in Tweets
| Max FlemingPriyanka DondetiCaitlin N. DreisbachAdam Poliak
2021-04-12
Multilingual Language Models Predict Human Reading Behavior
| Nora HollensteinFederico PirovanoCe ZhangLena JägerLisa Beinborn
2021-04-12
Learning to Remove: Towards Isotropic Pre-trained BERT Embedding
| Yuxin LiangRui CaoJie ZhengJie RenLing Gao
2021-04-12
Updater-Extractor Architecture for Inductive World State Representations
Arseny MoskvichevJames A. Liu
2021-04-12
Learning dynamic and hierarchical traffic spatiotemporal features with Transformer
Haoyang YanXiaolei Ma
2021-04-12
Escaping the Big Data Paradigm with Compact Transformers
| Ali HassaniSteven WaltonNikhil ShahAbulikemu AbuduweiliJiachen LiHumphrey Shi
2021-04-12
Cloth Interactive Transformer for Virtual Try-On
| Bin RenHao TangFanyang MengRunwei DingLing ShaoPhilip H. S. TorrNicu Sebe
2021-04-12
Learning to Synthesize Data for Semantic Parsing
| Bailin WangWenpeng YinXi Victoria LinCaiming Xiong
2021-04-12
Fighting the COVID-19 Infodemic with a Holistic BERT Ensemble
| Giorgos TziafasKonstantinos KogkalidisTommaso Caselli
2021-04-12
Family of Origin and Family of Choice: Massively Parallel Lexiconized Iterative Pretraining for Severely Low Resource Machine Translation
Zhong ZhouAlex Waibel
2021-04-12
On Representation Learning for Scientific News Articles Using Heterogeneous Knowledge Graphs
Angelika RomanouPanayiotis SmerosKarl Aberer
2021-04-12
Paragraph-level Simplification of Medical Texts
Ashwin DevarajIain J. MarshallByron C. WallaceJunyi Jessy Li
2021-04-12
BERT based freedom to operate patent analysis
Michael FreunekAndré Bodmer
2021-04-12
Does syntax matter? A strong baseline for Aspect-based Sentiment Analysis with RoBERTa
| Junqi DaiHang YanTianxiang SunPengFei LiuXipeng Qiu
2021-04-11
Innovative Bert-based Reranking Language Models for Speech Recognition
Shih-Hsuan ChiuBerlin Chen
2021-04-11
UniDrop: A Simple yet Effective Technique to Improve Transformer without Extra Cost
Zhen WuLijun WuQi MengYingce XiaShufang XieTao QinXinyu DaiTie-Yan Liu
2021-04-11
Fine-tuning Encoders for Improved Monolingual and Zero-shot Polylingual Neural Topic Modeling
| Aaron MuellerMark Dredze
2021-04-11
MIPT-NSU-UTMN at SemEval-2021 Task 5: Ensembling Learning with Pre-trained Language Models for Toxic Spans Detection
| Mikhail KotyushevAnna GlazkovaDmitry Morozov
2021-04-10
Meta-tuning Language Models to Answer Prompts Better
Ruiqi ZhongKristy LeeZheng ZhangDan Klein
2021-04-10
ZS-BERT: Towards Zero-Shot Relation Extraction with Attribute Representation Learning
| Chih-Yao ChenCheng-Te Li
2021-04-10
Non-autoregressive Transformer-based End-to-end ASR using BERT
Fu-Hao YuKuan-Yu Chen
2021-04-10
Know What and Know Where: An Object-and-Room Informed Sequential BERT for Indoor Vision-Language Navigation
Yuankai QiZizheng PanYicong HongMing-Hsuan YangAnton Van Den HengelQi Wu
2021-04-09
Knowledge-Aware Graph-Enhanced GPT-2 for Dialogue State Tracking
Weizhe Lin, Bo-Hsian Tseng, Bill Byrne
2021-04-09
Text2Chart: A Multi-Staged Chart Generator from Natural Language Text
Md. Mahinur Rashid, Hasin Kawsar Jahan, Annysha Huzzat, Riyasaat Ahmed Rahul, Tamim Bin Zakir, Farhana Meem, Md. Saddam Hossain Mukta, Swakkhar Shatabda
2021-04-09
Deep Transformer Networks for Time Series Classification: The NPP Safety Case
Bing Zha, Alessandro Vanni, Yassin Hassan, Tunc Aldemir, Alper Yilmaz
2021-04-09
KI-BERT: Infusing Knowledge Context for Better Language and Domain Understanding
Keyur Faldu, Amit Sheth, Prashant Kikani, Hemang Akabari
2021-04-09
Transformers: "The End of History" for NLP?
Anton Chernyavskiy, Dmitry Ilvovsky, Preslav Nakov
2021-04-09
Lone Pine at SemEval-2021 Task 5: Fine-Grained Detection of Hate Speech Using BERToxic
Yakoob Khan, Weicheng Ma, Soroush Vosoughi
2021-04-08
Uppsala NLP at SemEval-2021 Task 2: Multilingual Language Models for Fine-tuning and Feature Extraction in Word-in-Context Disambiguation
Huiling You, Xingran Zhu, Sara Stymne
2021-04-08
Probing BERT in Hyperbolic Spaces
Boli Chen, Yao Fu, Guangwei Xu, Pengjun Xie, Chuanqi Tan, Mosha Chen, Liping Jing
2021-04-08
Revisiting Simple Neural Probabilistic Language Models
Simeng Sun, Mohit Iyyer
2021-04-08
Layer Reduction: Accelerating Conformer-Based Self-Supervised Model via Layer Consistency
Jinchuan Tian, Rongzhi Gu, Helin Wang, Yuexian Zou
2021-04-08
Combining Pre-trained Word Embeddings and Linguistic Features for Sequential Metaphor Identification
Rui Mao, Chenghua Lin, Frank Guerin
2021-04-07
Better Neural Machine Translation by Extracting Linguistic Information from BERT
Hassan S. Shavarani, Anoop Sarkar
2021-04-07
Facial Attribute Transformers for Precise and Robust Makeup Transfer
Zhaoyi Wan, Haoran Chen, Jielei Zhang, Wentao Jiang, Cong Yao, Jiebo Luo
2021-04-07
LI-Net: Large-Pose Identity-Preserving Face Reenactment Network
Jin Liu, Peng Chen, Tao Liang, Zhaoxing Li, Cai Yu, Shuqiao Zou, Jiao Dai, Jizhong Han
2021-04-07
Seeing Out of tHe bOx: End-to-End Pre-training for Vision-Language Representation Learning
Zhicheng Huang, Zhaoyang Zeng, Yupan Huang, Bei Liu, Dongmei Fu, Jianlong Fu
2021-04-07
Interpreting A Pre-trained Model Is A Key For Model Architecture Optimization: A Case Study On Wav2Vec 2.0
Liu Chen, Meysam Asgari
2021-04-07
Interpreting Verbal Metaphors by Paraphrasing
Rui Mao, Chenghua Lin, Frank Guerin
2021-04-07
Speak or Chat with Me: End-to-End Spoken Language Understanding System with Flexible Inputs
Sujeong Cha, Wangrui Hou, Hyun Jung, My Phung, Michael Picheny, Hong-Kwang Kuo, Samuel Thomas, Edmilson Morais
2021-04-07
MuSLCAT: Multi-Scale Multi-Level Convolutional Attention Transformer for Discriminative Music Modeling on Raw Waveforms
Kai Middlebrook, Shyam Sudhakaran, David Guy Brizan
2021-04-06
hBert + BiasCorp -- Fighting Racism on the Web
Olawale Onabola, Zhuang Ma, Yang Xie, Benjamin Akera, Abdulrahman Ibraheem, Jia Xue, Dianbo Liu, Yoshua Bengio
2021-04-06
Attention Head Masking for Inference Time Content Selection in Abstractive Summarization
Shuyang Cao, Lu Wang
2021-04-06
Fourier Image Transformer
Tim-Oliver Buchholz, Florian Jug
2021-04-06
Variational Transformer Networks for Layout Generation
Diego Martin Arroyo, Janis Postels, Federico Tombari
2021-04-06
LT-LM: a novel non-autoregressive language model for single-shot lattice rescoring
Anton Mitrofanov, Mariya Korenevskaya, Ivan Podluzhny, Yuri Khokhlov, Aleksandr Laptev, Andrei Andrusenko, Aleksei Ilin, Maxim Korenevsky, Ivan Medennikov, Aleksei Romanenko
2021-04-06
ODE Transformer: An Ordinary Differential Equation-Inspired Model for Neural Machine Translation
Bei Li, Quan Du, Tao Zhou, Shuhan Zhou, Xin Zeng, Tong Xiao, Jingbo Zhu
2021-04-06
Variable selection with missing data in both covariates and outcomes: Imputation and machine learning
Liangyuan Hu, Jung-Yi Joyce Lin, Jiayi Ji
2021-04-06
CodeTrans: Towards Cracking the Language of Silicone's Code Through Self-Supervised Deep Learning and High Performance Computing
Ahmed Elnaggar, Wei Ding, Llion Jones, Tom Gibbs, Tamas Feher, Christoph Angerer, Silvia Severini, Florian Matthes, Burkhard Rost
2021-04-06
Efficient transfer learning for NLP with ELECTRA
François Mercier
2021-04-06
Exploring Transformers in Emotion Recognition: a comparison of BERT, DistillBERT, RoBERTa, XLNet and ELECTRA
Diogo Cortiz
2021-04-05
What's the best place for an AI conference, Vancouver or ______: Why completing comparative questions is difficult
Avishai Zagoury, Einat Minkov, Idan Szpektor, William W. Cohen
2021-04-05
AST: Audio Spectrogram Transformer
Yuan Gong, Yu-An Chung, James Glass
2021-04-05
Semantic Distance: A New Metric for ASR Performance Analysis Towards Spoken Language Understanding
Suyoun Kim, Abhinav Arora, Duc Le, Ching-Feng Yeh, Christian Fuegen, Ozlem Kalinli, Michael L. Seltzer
2021-04-05
ReCAM@IITK at SemEval-2021 Task 4: BERT and ALBERT based Ensemble for Abstract Word Prediction
Abhishek Mittal, Ashutosh Modi
2021-04-04
Improving Pretrained Models for Zero-shot Multi-label Text Classification through Reinforced Label Hierarchy Reasoning
Hui Liu, Danqing Zhang, Bing Yin, Xiaodan Zhu
2021-04-04
TransfoRNN: Capturing the Sequential Information in Self-Attention Representations for Language Modeling
Tze Yuang Chong, Xuyang Wang, Lin Yang, Junjie Wang
2021-04-04
MCL@IITK at SemEval-2021 Task 2: Multilingual and Cross-lingual Word-in-Context Disambiguation using Augmented Data, Signals, and Transformers
Rohan Gupta, Jay Mundra, Deepak Mahajan, Ashutosh Modi
2021-04-04
IITK@Detox at SemEval-2021 Task 5: Semi-Supervised Learning and Dice Loss for Toxic Spans Detection
Archit Bansal, Abhay Kaushik, Ashutosh Modi
2021-04-04
IndT5: A Text-to-Text Transformer for 10 Indigenous Languages
El Moatez Billah Nagoudi, Wei-Rui Chen, Muhammad Abdul-Mageed, Hasan Cavusogl
2021-04-04
Unsupervised Domain Adaptation with Global and Local Graph Neural Networks in Limited Labeled Data Scenario: Application to Disaster Management
Samujjwal Ghosh, Subhadeep Maji, Maunendra Sankar Desarkar
2021-04-03
Exploring the Role of BERT Token Representations to Explain Sentence Probing Results
Hosein Mohebbi, Ali Modarressi, Mohammad Taher Pilehvar
2021-04-03
Deepfake Detection Scheme Based on Vision Transformer and Distillation
Young-Jin Heo, Young-Ju Choi, Young-Woon Lee, Byung-Gyu Kim
2021-04-03
Efficient DETR: Improving End-to-End Object Detector with Dense Prior
Zhuyu Yao, Jiangbo Ai, Boxun Li, Chi Zhang
2021-04-03
IITK@LCP at SemEval 2021 Task 1: Classification for Lexical Complexity Regression Task
Neil Rajiv Shirude, Sagnik Mukherjee, Tushar Shandhilya, Ananta Mukherjee, Ashutosh Modi
2021-04-02
Language-based Video Editing via Multi-Modal Multi-Level Transformer
Tsu-Jui Fu, Xin Eric Wang, Scott T. Grafton, Miguel P. Eckstein, William Yang Wang
2021-04-02
AAformer: Auto-Aligned Transformer for Person Re-Identification
Kuan Zhu, Haiyun Guo, Shiliang Zhang, YaoWei Wang, Gaopan Huang, Honglin Qiao, Jing Liu, Jinqiao Wang, Ming Tang
2021-04-02
Effect of depth order on iterative nested named entity recognition models
Perceval Wajsburt, Yoann Taillé, Xavier Tannier
2021-04-02
The Coronavirus is a Bioweapon: Analysing Coronavirus Fact-Checked Stories
Lynnette Hui Xian Ng, Kathleen M. Carley
2021-04-02
Using GPT-2 to Create Synthetic Data to Improve the Prediction Performance of NLP Machine Learning Classification Models
Dewayne Whitfield
2021-04-02
Putting NeRF on a Diet: Semantically Consistent Few-Shot View Synthesis
Ajay Jain, Matthew Tancik, Pieter Abbeel
2021-04-01
HLE-UPC at SemEval-2021 Task 5: Multi-Depth DistilBERT for Toxic Spans Detection
Rafel Palliser-Sans, Albert Rial-Farràs
2021-04-01
WakaVT: A Sequential Variational Transformer for Waka Generation
Yuka Takeishi, Mingxuan Niu, Jing Luo, Zhong Jin, Xinyu Yang
2021-04-01
LoFTR: Detector-Free Local Feature Matching with Transformers
Jiaming Sun, Zehong Shen, Yuang Wang, Hujun Bao, Xiaowei Zhou
2021-04-01
TransMOT: Spatial-Temporal Graph Transformer for Multiple Object Tracking
Peng Chu, Jiang Wang, Quanzeng You, Haibin Ling, Zicheng Liu
2021-04-01
Keyword Transformer: A Self-Attention Model for Keyword Spotting
Axel Berg, Mark O'Connor, Miguel Tairum Cruz
2021-04-01
Next Generation Multitarget Trackers: Random Finite Set Methods vs Transformer-based Deep Learning
Juliano Pinto, Georg Hess, William Ljungbergh, Yuxuan Xia, Lennart Svensson, Henk Wymeersch
2021-04-01
Going deeper with Image Transformers
Hugo Touvron, Matthieu Cord, Alexandre Sablayrolles, Gabriel Synnaeve, Hervé Jégou
2021-03-31
Adversarial Attacks and Defenses for Speech Recognition Systems
Piotr Żelasko, Sonal Joshi, Yiwen Shao, Jesus Villalba, Jan Trmal, Najim Dehak, Sanjeev Khudanpur
2021-03-31
Learning Spatio-Temporal Transformer for Visual Tracking
Bin Yan, Houwen Peng, Jianlong Fu, Dong Wang, Huchuan Lu
2021-03-31
Spatiotemporal Transformer for Video-based Person Re-identification
Tianyu Zhang, Longhui Wei, Lingxi Xie, Zijie Zhuang, Yongfei Zhang, Bo Li, Qi Tian
2021-03-30
Automatic Graph Partitioning for Very Large-scale Deep Learning
Masahiro Tanaka, Kenjiro Taura, Toshihiro Hanawa, Kentaro Torisawa
2021-03-30
Read and Attend: Temporal Localisation in Sign Language Videos
Gül Varol, Liliane Momeni, Samuel Albanie, Triantafyllos Afouras, Andrew Zisserman
2021-03-30
Rethinking Spatial Dimensions of Vision Transformers
Byeongho Heo, Sangdoo Yun, Dongyoon Han, Sanghyuk Chun, Junsuk Choe, Seong Joon Oh
2021-03-30
Kaleido-BERT: Vision-Language Pre-training on Fashion Domain
Mingchen Zhuge, Dehong Gao, Deng-Ping Fan, Linbo Jin, Ben Chen, Haoming Zhou, Minghui Qiu, Ling Shao
2021-03-30
Grounding Dialogue Systems via Knowledge Graph Aware Decoding with Pre-trained Transformers
Debanjan Chaudhuri, Md Rashad Al Hasan Rony, Jens Lehmann
2021-03-30
An In-depth Analysis of Passage-Level Label Transfer for Contextual Document Ranking
Koustav Rudra, Zeon Trevor Fernando, Avishek Anand
2021-03-30
Multi-Scale Vision Longformer: A New Vision Transformer for High-Resolution Image Encoding
Pengchuan Zhang, Xiyang Dai, Jianwei Yang, Bin Xiao, Lu Yuan, Lei Zhang, Jianfeng Gao
2021-03-29
CvT: Introducing Convolutions to Vision Transformers
Haiping Wu, Bin Xiao, Noel Codella, Mengchen Liu, Xiyang Dai, Lu Yuan, Lei Zhang
2021-03-29
Transformer Tracking
Xin Chen, Bin Yan, Jiawen Zhu, Dong Wang, Xiaoyun Yang, Huchuan Lu
2021-03-29
Retraining DistilBERT for a Voice Shopping Assistant by Using Universal Dependencies
Pratik Jayarao, Arpit Sharma
2021-03-29
Whitening Sentence Representations for Better Semantics and Faster Retrieval
Jianlin Su, Jiarun Cao, Weijie Liu, Yangyiwen Ou
2021-03-29
Contextual Text Embeddings for Twi
Paul Azunre, Salomey Osei, Salomey Addo, Lawrence Asamoah Adu-Gyamfi, Stephen Moore, Bernard Adabankah, Bernard Opoku, Clara Asare-Nyarko, Samuel Nyarko, Cynthia Amoaba, Esther Dansoa Appiah, Felix Akwerh, Richard Nii Lante Lawson, Joel Budu, Emmanuel Debrah, Nana Boateng, Wisdom Ofori, Edwin Buabeng-Munkoh, Franklin Adjei, Isaac Kojo Essel Ampomah, Joseph Otoo, Reindorf Borkor, Standylove Birago Mensah, Lucien Mensah, Mark Amoako Marcel, Anokye Acheampong Amponsah, James Ben Hayfron-Acquah
2021-03-29
PENELOPIE: Enabling Open Information Extraction for the Greek Language through Machine Translation
Dimitris Papadopoulos, Nikolaos Papadakis, Nikolaos Matsatsinis
2021-03-28
PnG BERT: Augmented BERT on Phonemes and Graphemes for Neural TTS
Ye Jia, Heiga Zen, Jonathan Shen, Yu Zhang, Yonghui Wu
2021-03-28
HiT: Hierarchical Transformer with Momentum Contrast for Video-Text Retrieval
Song Liu, Haoqi Fan, Shengsheng Qian, Yiru Chen, Wenkui Ding, Zhongyuan Wang
2021-03-28
CrossViT: Cross-Attention Multi-Scale Vision Transformer for Image Classification
Chun-Fu Chen, Quanfu Fan, Rameswar Panda
2021-03-27
Face Transformer for Recognition
Yaoyao Zhong, Weihong Deng
2021-03-27
Unsupervised Self-Training for Sentiment Analysis of Code-Switched Data
Akshat Gupta, Sargam Menghani, Sai Krishna Rallabandi, Alan W Black
2021-03-27
Machine Learning Meets Natural Language Processing -- The story so far
N. -I. Galanis, P. Vafiadis, K. -G. Mirzaev, G. A. Papakostas
2021-03-27
Automated radiology report generation using conditioned transformers
Omar Alfarghaly, Rana Khaled, Abeer Elkorany, Maha Helal, Aly Fahmy
2021-03-26
Leveraging neural representations for facilitating access to untranscribed speech from endangered languages
Nay San, Martijn Bartelds, Mitchell Browne, Lily Clifford, Fiona Gibson, John Mansfield, David Nash, Jane Simpson, Myfany Turpin, Maria Vollmer, Sasha Wilmoth, Dan Jurafsky
2021-03-26
A Practical Survey on Faster and Lighter Transformers
Quentin Fournier, Gaétan Marceau Caron, Daniel Aloise
2021-03-26
Understanding Robustness of Transformers for Image Classification
Srinadh Bhojanapalli, Ayan Chakrabarti, Daniel Glasner, Daliang Li, Thomas Unterthiner, Andreas Veit
2021-03-26
Gated Transformer Networks for Multivariate Time Series Classification
Minghao Liu, Shengqi Ren, Siyuan Ma, Jiahui Jiao, Yizhou Chen, Zhiguang Wang, Wei Song
2021-03-26
Lifting Transformer for 3D Human Pose Estimation in Video
Wenhao Li, Hong Liu, Runwei Ding, Mengyuan Liu, Pichao Wang
2021-03-26
BART based semantic correction for Mandarin automatic speech recognition system
Yun Zhao, Xuerui Yang, Jinchao Wang, Yongyu Gao, Chao Yan, Yuanfu Zhou
2021-03-26
Predicting Directionality in Causal Relations in Text
Pedram Hosseini, David A. Broniatowski, Mona Diab
2021-03-25
Swin Transformer: Hierarchical Vision Transformer using Shifted Windows
Ze Liu, Yutong Lin, Yue Cao, Han Hu, Yixuan Wei, Zheng Zhang, Stephen Lin, Baining Guo
2021-03-25
AgentFormer: Agent-Aware Transformers for Socio-Temporal Multi-Agent Forecasting
Ye Yuan, Xinshuo Weng, Yanglan Ou, Kris Kitani
2021-03-25
Visual Grounding Strategies for Text-Only Natural Language Processing
Damien Sileo
2021-03-25
Bertinho: Galician BERT Representations
David Vilares, Marcos Garcia, Carlos Gómez-Rodríguez
2021-03-25
Mask Attention Networks: Rethinking and Strengthen Transformer
Zhihao Fan, Yeyun Gong, Dayiheng Liu, Zhongyu Wei, Siyuan Wang, Jian Jiao, Nan Duan, Ruofei Zhang, Xuanjing Huang
2021-03-25
BERT4SO: Neural Sentence Ordering by Fine-tuning BERT
Yutao Zhu, Jian-Yun Nie, Kun Zhou, Shengchao Liu, Yabo Ling, Pan Du
2021-03-25
K-XLNet: A General Method for Combining Explicit Knowledge with Language Model Pretraining
Ruiqing Yan, Lanchang Sun, Fang Wang, XiaoMing Zhang
2021-03-25
Czert -- Czech BERT-like Model for Language Representation
Jakub Sido, Ondřej Pražák, Pavel Přibáň, Jan Pašek, Michal Seják, Miloslav Konopík
2021-03-24
Thinking Aloud: Dynamic Context Generation Improves Zero-Shot Reasoning Performance of GPT-2
Gregor Betz, Kyle Richardson, Christian Voigt
2021-03-24
FastMoE: A Fast Mixture-of-Expert Training System
Jiaao He, Jiezhong Qiu, Aohan Zeng, Zhilin Yang, Jidong Zhai, Jie Tang
2021-03-24
Multi-view 3D Reconstruction with Transformer
Dan Wang, Xinrui Cui, Xun Chen, Zhengxia Zou, Tianyang Shi, Septimiu Salcudean, Z. Jane Wang, Rabab Ward
2021-03-24
Revamping Cross-Modal Recipe Retrieval with Hierarchical Transformers and Self-supervised Learning
Amaia Salvador, Erhan Gundogdu, Loris Bazzani, Michael Donoser
2021-03-24
Vision Transformers for Dense Prediction
René Ranftl, Alexey Bochkovskiy, Vladlen Koltun
2021-03-24
Are Neural Language Models Good Plagiarists? A Benchmark for Neural Paraphrase Detection
Jan Philip Wahle, Terry Ruas, Norman Meuschke, Bela Gipp
2021-03-23
Detecting Hate Speech with GPT-3
Ke-Li Chiu, Rohan Alexander
2021-03-23
TMR: Evaluating NER Recall on Tough Mentions
Jingxuan Tu, Constantine Lignos
2021-03-23
Repairing Pronouns in Translation with BERT-Based Post-Editing
Reid Pryzant
2021-03-23
Variable Name Recovery in Decompiled Binary Code using Constrained Masked Language Modeling
Pratyay Banerjee, Kuntal Kumar Pal, Fish Wang, Chitta Baral
2021-03-23
The NLP Cookbook: Modern Recipes for Transformer based Deep Learning Architectures
Sushant Singh, Ausif Mahmood
2021-03-23
Open Domain Question Answering over Tables via Dense Retrieval
Jonathan Herzig, Thomas Müller, Syrine Krichene, Julian Martin Eisenschlos
2021-03-22
BERT: A Review of Applications in Natural Language Processing and Understanding
M. V. Koroteev
2021-03-22
Hybrid Model for Patent Classification using Augmented SBERT and KNN
Hamid Bekamiri, Daniel S. Hain, Roman Jurowetzki
2021-03-22
Identifying Machine-Paraphrased Plagiarism
Jan Philip Wahle, Terry Ruas, Tomáš Foltýnek, Norman Meuschke, Bela Gipp
2021-03-22
Bridging the gap between supervised classification and unsupervised topic modelling for social-media assisted crisis management
Mikael Brunila, Rosie Zhao, Andrei Mircea, Sam Lumley, Renee Sieber
2021-03-22
Incorporating Convolution Designs into Visual Transformers
Kun Yuan, Shaopeng Guo, Ziwei Liu, Aojun Zhou, Fengwei Yu, Wei Wu
2021-03-22
Tiny Transformers for Environmental Sound Classification at the Edge
David Elliott, Carlos E. Otero, Steven Wyatt, Evan Martino
2021-03-22
End-to-End Trainable Multi-Instance Pose Estimation with Transformers
Lucas Stoffl, Maxime Vidal, Alexander Mathis
2021-03-22
Paying Attention to Activation Maps in Camera Pose Regression
Yoli Shavit, Ron Ferens, Yosi Keller
2021-03-21
Non-Autoregressive Translation by Learning Target Categorical Codes
Yu Bao, ShuJian Huang, Tong Xiao, Dongqi Wang, Xinyu Dai, Jiajun Chen
2021-03-21
MaAST: Map Attention with Semantic Transformers for Efficient Visual Navigation
Zachary Seymour, Kowshik Thopalli, Niluthpol Mithun, Han-Pang Chiu, Supun Samarasekera, Rakesh Kumar
2021-03-21
ROSITA: Refined BERT cOmpreSsion with InTegrAted techniques
Yuanxin Liu, Zheng Lin, Fengcheng Yuan
2021-03-21
NameRec*: Highly Accurate and Fine-grained Person Name Recognition
Rui Zhang, Yimeng Dai, Shijie Liu
2021-03-21
An Unsupervised Sampling Approach for Image-Sentence Matching Using Document-Level Structural Information
Zejun Li, Zhongyu Wei, Zhihao Fan, Haijun Shan, Xuanjing Huang
2021-03-21
Paying Attention to Multiscale Feature Maps in Multimodal Image Matching
Aviad Moreshet, Yosi Keller
2021-03-20
Play the Shannon Game With Language Models: A Human-Free Approach to Summary Evaluation
Nicholas Egan, Oleg Vasilyev, John Bohannon
2021-03-19
MuRIL: Multilingual Representations for Indian Languages
Simran Khanuja, Diksha Bansal, Sarvesh Mehtani, Savya Khosla, Atreyee Dey, Balaji Gopalan, Dilip Kumar Margam, Pooja Aggarwal, Rajiv Teja Nagipogu, Shachi Dave, Shruti Gupta, Subhash Chandra Bose Gali, Vish Subramanian, Partha Talukdar
2021-03-19
ConViT: Improving Vision Transformers with Soft Convolutional Inductive Biases
Stéphane d'Ascoli, Hugo Touvron, Matthew Leavitt, Ari Morcos, Giulio Biroli, Levent Sagun
2021-03-19
Cost-effective Deployment of BERT Models in Serverless Environment
Katarína Benešová, Andrej Švec, Marek Šuppa
2021-03-19
Scalable Visual Transformers with Hierarchical Pooling
Zizheng Pan, Bohan Zhuang, Jing Liu, Haoyu He, Jianfei Cai
2021-03-19
Hopper: Multi-hop Transformer for Spatiotemporal Reasoning
Honglu Zhou, Asim Kadav, Farley Lai, Alexandru Niculescu-Mizil, Martin Renqiang Min, Mubbasir Kapadia, Hans Peter Graf
2021-03-19
Transferable Model for Shape Optimization subject to Physical Constraints
Lukas Harsch, Johannes Burgbacher, Stefan Riedelbauch
2021-03-19
API2Com: On the Improvement of Automatically Generated Code Comments Using API Documentations
Ramin Shahbazi, Rishab Sharma, Fatemeh H. Fard
2021-03-19
Let Your Heart Speak in its Mother Tongue: Multilingual Captioning of Cardiac Signals
Dani Kiyasseh, Tingting Zhu, David Clifton
2021-03-19
GPT Understands, Too
Xiao Liu, Yanan Zheng, Zhengxiao Du, Ming Ding, Yujie Qian, Zhilin Yang, Jie Tang
2021-03-18
All NLP Tasks Are Generation Tasks: A General Pretraining Framework
Zhengxiao Du, Yujie Qian, Xiao Liu, Ming Ding, Jiezhong Qiu, Zhilin Yang, Jie Tang
2021-03-18
Contextual Biasing of Language Models for Speech Recognition in Goal-Oriented Conversational Agents
Ashish Shenoy, Sravan Bodapati, Katrin Kirchhoff
2021-03-18
Enhancing Transformer for Video Understanding Using Gated Multi-Level Attention and Temporal Adversarial Training
Saurabh Sahu, Palash Goyal
2021-03-18
Model Extraction and Adversarial Transferability, Your BERT is Vulnerable!
Xuanli He, Lingjuan Lyu, Qiongkai Xu, Lichao Sun
2021-03-18
Danish Fungi 2020 -- Not Just Another Image Recognition Dataset
Lukáš Picek, Milan Šulc, Jiří Matas, Jacob Heilmann-Clausen, Thomas S. Jeppesen, Thomas Læssøe, Tobias Frøslev
2021-03-18
On the Role of Images for Analyzing Claims in Social Media
Gullal S. Cheema, Sherzod Hakimov, Eric Müller-Budack, Ralph Ewerth
2021-03-17
Trans-SVNet: Accurate Phase Recognition from Surgical Videos via Hybrid Embedding Aggregation Transformer
Xiaojie Gao, Yueming Jin, Yonghao Long, Qi Dou, Pheng-Ann Heng
2021-03-17
UniParma at SemEval-2021 Task 5: Toxic Spans Detection Using CharacterBERT and Bag-of-Words Model
Akbar Karimi, Leonardo Rossi, Andrea Prati
2021-03-17
Code Word Detection in Fraud Investigations using a Deep-Learning Approach
Youri van der Zee, Jan C. Scholtes, Marcel Westerhoud, Julien Rossi
2021-03-17
You Only Look One-level Feature
Qiang Chen, Yingming Wang, Tong Yang, Xiangyu Zhang, Jian Cheng, Jian Sun
2021-03-17
Robustly Optimized and Distilled Training for Natural Language Understanding
Haytham ElFadeel, Stan Peshterliev
2021-03-16
Dense Interaction Learning for Video-based Person Re-identification
Tianyu He, Xin Jin, Xu Shen, Jianqiang Huang, Zhibo Chen, Xian-Sheng Hua
2021-03-16
KGSynNet: A Novel Entity Synonyms Discovery Framework with Knowledge Graph
Yiying Yang, Xi Yin, Haiqin Yang, Xingjian Fei, Hao Peng, Kaijie Zhou, Kunfeng Lai, Jianping Shen
2021-03-16
LightningDOT: Pre-training Visual-Semantic Embeddings for Real-Time Image-Text Retrieval
Siqi Sun, Yen-Chun Chen, Linjie Li, Shuohang Wang, Yuwei Fang, Jingjing Liu
2021-03-16
Knowledge driven Description Synthesis for Floor Plan Interpretation
Shreya Goyal, Chiranjoy Chattopadhyay, Gaurav Bhatnagar
2021-03-15
SemVLP: Vision-Language Pre-training by Aligning Semantics at Multiple Levels
Chenliang Li, Ming Yan, Haiyang Xu, Fuli Luo, Wei Wang, Bin Bi, Songfang Huang
2021-03-14
Improving Code Summarization with Block-wise Abstract Syntax Tree Splitting
Chen Lin, Zhichao Ouyang, Junqing Zhuang, Jianqiang Chen, Hui Li, Rongxin Wu
2021-03-14
TransFG: A Transformer Architecture for Fine-grained Recognition
Ju He, Jie-Neng Chen, Shuai Liu, Adam Kortylewski, Cheng Yang, Yutong Bai, Changhu Wang, Alan Yuille
2021-03-14
Embedding Calibration for Music Semantic Similarity using Auto-regressive Transformer
Xinran Zhang, Maosong Sun, Jiafeng Liu, Xiaobing Li
2021-03-13
Text Mining of Stocktwits Data for Predicting Stock Prices
Mukul Jaggi, Priyanka Mandal, Shreya Narang, Usman Naseem, Matloob Khushi
2021-03-13
Bilingual Dictionary-based Language Model Pretraining for Neural Machine Translation
Yusen Lin, Jiayong Lin, Shuaicheng Zhang, Haoying Dai
2021-03-12
Is BERT a Cross-Disciplinary Knowledge Learner? A Surprising Finding of Pre-trained Models' Transferability
Wei-Tsung Kao, Hung-Yi Lee
2021-03-12
Explaining and Improving BERT Performance on Lexical Semantic Change Detection
Severin Laicher, Sinan Kurtyigit, Dominik Schlechtweg, Jonas Kuhn, Sabine Schulte im Walde
2021-03-12
Vision Transformer for COVID-19 CXR Diagnosis using Chest X-ray Feature Corpus
Sangjoon Park, Gwanghyun Kim, Yujin Oh, Joon Beom Seo, Sang Min Lee, Jin Hwan Kim, Sungjun Moon, Jae-Kwang Lim, Jong Chul Ye
2021-03-12
Severity Quantification and Lesion Localization of COVID-19 on CXR using Vision Transformer
Gwanghyun Kim, Sangjoon Park, Yujin Oh, Joon Beom Seo, Sang Min Lee, Jin Hwan Kim, Sungjun Moon, Jae-Kwang Lim, Jong Chul Ye
2021-03-12
Sequential Random Network for Fine-grained Image Classification
Chaorong Li, Malu Zhang, Wei Huang, Fengqing Qin, Anping Zeng, Yuanyuan Huang
2021-03-12
Predicting the Behavior of Dealers in Over-The-Counter Corporate Bond Markets
Yusen Lin, Jinming Xue, Louiqa Raschid
2021-03-12
Comparing the Performance of NLP Toolkits and Evaluation measures in Legal Tech
Muhammad Zohaib Khan
2021-03-12
Evaluation of Morphological Embeddings for the Russian Language
Vitaly Romanov, Albina Khusainova
2021-03-11
Improving Bi-encoder Document Ranking Models with Two Rankers and Multi-teacher Distillation
Jaekeol Choi, Euna Jung, Jangwon Suh, Wonjong Rhee
2021-03-11
Composite Re-Ranking for Efficient Document Search with BERT
Yingrui Yang, Yifan Qiao, Jinjin Shao, Mayuresh Anand, Xifeng Yan, Tao Yang
2021-03-11
Towards Multi-Sense Cross-Lingual Alignment of Contextual Embeddings
Linlin Liu, Thien Hai Nguyen, Shafiq Joty, Lidong Bing, Luo Si
2021-03-11
LightMBERT: A Simple Yet Effective Method for Multilingual BERT Distillation
Xiaoqi Jiao, Yichun Yin, Lifeng Shang, Xin Jiang, Xiao Chen, Linlin Li, Fang Wang, Qun Liu
2021-03-11
FairFil: Contrastive Neural Debiasing Method for Pretrained Text Encoders
Pengyu Cheng, Weituo Hao, Siyang Yuan, Shijing Si, Lawrence Carin
2021-03-11
Self-supervised Text-to-SQL Learning with Header Alignment Training
Donggyu Kim, Seanie Lee
2021-03-11
Unknown Object Segmentation from Stereo Images
Maximilian Durner, Wout Boerdijk, Martin Sundermeyer, Werner Friedl, Zoltan-Csaba Marton, Rudolph Triebel
2021-03-11
On Improving Deep Learning Trace Analysis with System Call Arguments
Quentin Fournier, Daniel Aloise, Seyed Vahid Azhari, François Tetreault
2021-03-11
Continuous 3D Multi-Channel Sign Language Production via Progressive Transformers and Mixture Density Networks
Ben Saunders, Necati Cihan Camgoz, Richard Bowden
2021-03-11
CUAD: An Expert-Annotated NLP Dataset for Legal Contract Review
Dan Hendrycks, Collin Burns, Anya Chen, Spencer Ball
2021-03-10
Majority Voting with Bidirectional Pre-translation For Bitext Retrieval
Alex Jones, Derry Tanti Wijaya
2021-03-10
Hurdles to Progress in Long-form Question Answering
Kalpesh Krishna, Aurko Roy, Mohit Iyyer
2021-03-10
CEQE: Contextualized Embeddings for Query Expansion
Shahrzad Naseri, Jeffrey Dalton, Andrew Yates, James Allan
2021-03-09
Pretrained Transformers as Universal Computation Engines
Kevin Lu, Aditya Grover, Pieter Abbeel, Igor Mordatch
2021-03-09
Language Models have a Moral Dimension
Patrick Schramowski, Cigdem Turan, Nico Andersen, Constantin Rothkopf, Kristian Kersting
2021-03-08
Syntax-BERT: Improving Pre-trained Transformers with Syntax Trees
Jiangang Bai, Yujing Wang, Yiren Chen, Yaming Yang, Jing Bai, Jing Yu, Yunhai Tong
2021-03-07
TransBTS: Multimodal Brain Tumor Segmentation Using Transformer
Wenxuan Wang, Chen Chen, Meng Ding, Jiangyun Li, Hong Yu, Sen Zha
2021-03-07
Orthogonal Attention: A Cloze-Style Approach to Negation Scope Resolution
Aditya Khandelwal, Vahida Attar
2021-03-07
MTLHealth: A Deep Learning System for Detecting Disturbing Content in Student Essays
Joseph Valencia, Erin Yao
2021-03-07
MalBERT: Using Transformers for Cybersecurity and Malicious Software Detection
Abir Rahali, Moulay A. Akhloufi
2021-03-05
Fine-tuning Pretrained Multilingual BERT Model for Indonesian Aspect-based Sentiment Analysis
Annisa Nurul Azhar, Masayu Leylia Khodra
2021-03-05
Non-invasive Self-attention for Side Information Fusion in Sequential Recommendation
Chang Liu, Xiaoguang Li, Guohao Cai, Zhenhua Dong, Hong Zhu, Lifeng Shang
2021-03-05
Measuring Mathematical Problem Solving With the MATH Dataset
Dan Hendrycks, Collin Burns, Saurav Kadavath, Akul Arora, Steven Basart, Eric Tang, Dawn Song, Jacob Steinhardt
2021-03-05
SpecTr: Spectral Transformer for Hyperspectral Pathology Image Segmentation
Boxiang Yun, Yan Wang, Jieneng Chen, Huiyu Wang, Wei Shen, Qingli Li
2021-03-05
Hierarchical Transformer for Multilingual Machine Translation
Albina Khusainova, Adil Khan, Adín Ramírez Rivera, Vitaly Romanov
2021-03-05
IOT: Instance-wise Layer Reordering for Transformer Structures
Jinhua Zhu, Lijun Wu, Yingce Xia, Shufang Xie, Tao Qin, Wengang Zhou, Houqiang Li, Tie-Yan Liu
2021-03-05
CoTr: Efficiently Bridging CNN and Transformer for 3D Medical Image Segmentation
Yutong Xie, Jianpeng Zhang, Chunhua Shen, Yong Xia
2021-03-04
The Transformer Network for the Traveling Salesman Problem
Xavier Bresson, Thomas Laurent
2021-03-04
Hardware Acceleration of Fully Quantized BERT for Efficient Natural Language Processing
Zejian Liu, Gang Li, Jian Cheng
2021-03-04
End-to-end acoustic modelling for phone recognition of young readers
Lucile Gelin, Morgane Daniel, Julien Pinquier, Thomas Pellegrini
2021-03-04
University of Copenhagen Participation in TREC Health Misinformation Track 2020
Lucas Chaves Lima, Dustin Brandon Wright, Isabelle Augenstein, Maria Maistro
2021-03-03
Few-shot Learning for Slot Tagging with Attentive Relational Network
Cennet Oguz, Ngoc Thang Vu
2021-03-03
Dual Reinforcement-Based Specification Generation for Image De-Rendering
Ramakanth Pasunuru, David Rosenberg, Gideon Mann, Mohit Bansal
2021-03-02
Hate Towards the Political Opponent: A Twitter Corpus Study of the 2020 US Elections on the Basis of Offensive Speech and Stance Detection
Lara Grimminger, Roman Klinger
2021-03-02
Probing Product Description Generation via Posterior Distillation
Haolan Zhan, Hainan Zhang, Hongshen Chen, Lei Shen, Zhuoye Ding, Yongjun Bao, Weipeng Yan, Yanyan Lan
2021-03-02
A HINT from Arithmetic: On Systematic Generalization of Perception, Syntax, and Semantics
Qing Li, Siyuan Huang, Yining Hong, Yixin Zhu, Ying Nian Wu, Song-Chun Zhu
2021-03-02
BERT-based knowledge extraction method of unstructured domain text
Wang Zijia, Li Ye, Zhu Zhongkai
2021-03-01
Combat COVID-19 Infodemic Using Explainable Natural Language Processing Models
Jackie Ayoub, X. Jessie Yang, Feng Zhou
2021-03-01
Long Document Summarization in a Low Resource Setting using Pretrained Language Models
Ahsaas Bajaj, Pavitra Dangati, Kalpesh Krishna, Pradhiksha Ashok Kumar, Rheeya Uppaal, Bradford Windsor, Eliot Brenner, Dominic Dotterrer, Rajarshi Das, Andrew McCallum
2021-03-01
BERT based patent novelty search by training claims to their own description
Michael Freunek, André Bodmer
2021-03-01
CrossMap Transformer: A Crossmodal Masked Path Transformer Using Double Back-Translation for Vision-and-Language Navigation
Aly Magassouba, Komei Sugiura, Hisashi Kawai
2021-03-01
NLP-CUET@LT-EDI-EACL2021: Multilingual Code-Mixed Hope Speech Detection using Cross-lingual Representation Learner
Eftekhar Hossain, Omar Sharif, Mohammed Moshiul Hoque
2021-02-28
NLP-CUET@DravidianLangTech-EACL2021: Investigating Visual and Textual Features to Identify Trolls from Multimodal Social Media Memes
Eftekhar Hossain, Omar Sharif, Mohammed Moshiul Hoque
2021-02-28
NLP-CUET@DravidianLangTech-EACL2021: Offensive Language Detection from Multilingual Code-Mixed Text using Transformers
Omar Sharif, Eftekhar Hossain, Mohammed Moshiul Hoque
2021-02-28
Transformer in Transformer
Kai Han, An Xiao, Enhua Wu, Jianyuan Guo, Chunjing Xu, Yunhe Wang
2021-02-27
Transformers with Competitive Ensembles of Independent Mechanisms
Alex Lamb, Di He, Anirudh Goyal, Guolin Ke, Chien-Feng Liao, Mirco Ravanelli, Yoshua Bengio
2021-02-27
Generative chemical transformer: attention makes neural machine learn molecular geometric structures via text
Hyunseung Kim, Jonggeol Na, Won Bo Lee
2021-02-27
COVID-19 Tweets Analysis through Transformer Language Models
Abdul Hameed Azeemi, Adeel Waheed
2021-02-27
Multi-task transfer learning for finding actionable information from crisis-related messages on social media
Congcong Wang, David Lillis
2021-02-26
MixSpeech: Data Augmentation for Low-resource Automatic Speech Recognition
Linghui Meng, Jin Xu, Xu Tan, Jindong Wang, Tao Qin, Bo Xu
2021-02-25
LET: Linguistic Knowledge Enhanced Graph Transformer for Chinese Short Text Matching
Boer Lyu, Lu Chen, Su Zhu, Kai Yu
2021-02-25
Sentiment Analysis of Persian-English Code-mixed Texts
Nazanin Sabri, Ali Edalat, Behnam Bahrak
2021-02-25
LazyFormer: Self Attention with Lazy Update
Chengxuan Ying, Guolin Ke, Di He, Tie-Yan Liu
2021-02-25
Emotion-Aware, Emotion-Agnostic, or Automatic: Corpus Creation Strategies to Obtain Cognitive Event Appraisal Annotations
Jan Hofmann, Enrica Troiano, Roman Klinger
2021-02-25
PharmKE: Knowledge Extraction Platform for Pharmaceutical Texts using Transfer Learning
Nasi Jofche, Kostadin Mishev, Riste Stojanov, Milos Jovanovik, Dimitar Trajanov
2021-02-25
BERT-based Acronym Disambiguation with Multiple Training Strategies
Chunguang Pan, Bingyan Song, Shengguang Wang, Zhipeng Luo
2021-02-25
Task-Specific Pre-Training and Cross Lingual Transfer for Code-Switched Data
Akshat Gupta, Sai Krishna Rallabandi, Alan Black
2021-02-24
LRG at SemEval-2021 Task 4: Improving Reading Comprehension with Abstract Words using Augmentation, Linguistic Features and Voting
Abheesht Sharma, Harshit Pandey, Gunjan Chhablani, Yash Bhartia, Tirtharaj Dash
2021-02-24
NLRG at SemEval-2021 Task 5: Toxic Spans Detection Leveraging BERT-based Token Classification and Span Prediction Techniques
Gunjan Chhablani, Yash Bhartia, Abheesht Sharma, Harshit Pandey, Shan Suthaharan
2021-02-24
From Universal Language Model to Downstream Task: Improving RoBERTa-Based Vietnamese Hate Speech Detection
Quang Huu Pham, Viet Anh Nguyen, Linh Bao Doan, Ngoc N. Tran, Ta Minh Thanh
2021-02-24
When Attention Meets Fast Recurrence: Training Language Models with Reduced Compute
Tao Lei
2021-02-24
PADA: A Prompt-based Autoregressive Approach for Adaptation to Unseen Domains
Eyal Ben-David, Nadav Oved, Roi Reichart
2021-02-24
Hopeful_Men@LT-EDI-EACL2021: Hope Speech Detection Using Indic Transliteration and Transformers
Ishan Sanjeev Upadhyay, Nikhil E, Anshul Wadhawan, Radhika Mamidi
2021-02-24
Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions
Wenhai Wang, Enze Xie, Xiang Li, Deng-Ping Fan, Kaitao Song, Ding Liang, Tong Lu, Ping Luo, Ling Shao
2021-02-24
Accurate Learning of Graph Representations with Graph Multiset Pooling
Jinheon Baek, Minki Kang, Sung Ju Hwang
2021-02-23
Robust and Transferable Anomaly Detection in Log Data using Pre-Trained Language Models
Harold Ott, Jasmin Bogatinovski, Alexander Acker, Sasho Nedelkoski, Odej Kao
2021-02-23
Minimally-Supervised Structure-Rich Text Categorization via Learning on Text-Rich Networks
Xinyang Zhang, Chenwei Zhang, Luna Xin Dong, Jingbo Shang, Jiawei Han
2021-02-23
VisualCheXbert: Addressing the Discrepancy Between Radiology Report Labels and Image Labels
Saahil Jain, Akshay Smit, Steven QH Truong, Chanh DT Nguyen, Minh-Thanh Huynh, Mudit Jain, Victoria A. Young, Andrew Y. Ng, Matthew P. Lungren, Pranav Rajpurkar
2021-02-23
Deep Deformation Detail Synthesis for Thin Shell Models
Lan Chen, Lin Gao, Jie Yang, Shibiao Xu, Juntao Ye, Xiaopeng Zhang, Yu-Kun Lai
2021-02-23
Do Transformer Modifications Transfer Across Implementations and Applications?
Sharan Narang, Hyung Won Chung, Yi Tay, William Fedus, Thibault Fevry, Michael Matena, Karishma Malkan, Noah Fiedel, Noam Shazeer, Zhenzhong Lan, Yanqi Zhou, Wei Li, Nan Ding, Jake Marcus, Adam Roberts, Colin Raffel
2021-02-23
Using Prior Knowledge to Guide BERT's Attention in Semantic Textual Matching Tasks
Tingyu Xia, Yue Wang, Yuan Tian, Yi Chang
2021-02-22
Evaluating Contextualized Language Models for Hungarian
Judit Ács, Dániel Lévai, Dávid Márk Nemeskey, András Kornai
2021-02-22
Deepfake Video Detection Using Convolutional Vision Transformer
Deressa Wodajo, Solomon Atnafu
2021-02-22
Generating Human Readable Transcript for Automatic Speech Recognition with Pre-trained Language Model
Junwei Liao, Yu Shi, Ming Gong, Linjun Shou, Sefik Eskimez, Liyang Lu, Hong Qu, Michael Zeng
2021-02-22
Position Information in Transformers: An Overview
Philipp Dufter, Martin Schmitt, Hinrich Schütze
2021-02-22
Determination of Fault Location in Transmission Lines with Image Processing and Artificial Neural Networks
Serkan Budak, Bahadir Akbal
2021-02-22
Few Shot Learning for Information Verification
Usama Khalid, Mirza Omer Beg
2021-02-22
Conditional Positional Encodings for Vision Transformers
Xiangxiang Chu, Zhi Tian, Bo Zhang, Xinlong Wang, Xiaolin Wei, Huaxia Xia, Chunhua Shen
2021-02-22
UniT: Multimodal Multitask Learning with a Unified Transformer
Ronghang Hu, Amanpreet Singh
2021-02-22
MixUp Training Leads to Reduced Overfitting and Improved Calibration for the Transformer Architecture
Wancong Zhang, Ieshan Vaidya
2021-02-22
RUBERT: A Bilingual Roman Urdu BERT Using Cross Lingual Transfer Learning
Usama Khalid, Mirza Omer Beg, Muhammad Umair Arshad
2021-02-22
Parallelizing Legendre Memory Unit Training
Narsimha Chilkuri, Chris Eliasmith
2021-02-22
Pre-Training BERT on Arabic Tweets: Practical Considerations
Ahmed Abdelali, Sabit Hassan, Hamdy Mubarak, Kareem Darwish, Younes Samih
2021-02-21
Web-based Application for Detecting Indonesian Clickbait Headlines using IndoBERT
Muhammad Noor Fakhruzzaman, Sie Wildan Gunawan
2021-02-21
Medical Transformer: Gated Axial-Attention for Medical Image Segmentation
Jeya Maria Jose Valanarasu, Poojan Oza, Ilker Hacihaliloglu, Vishal M. Patel
2021-02-21
Towards Accurate and Compact Architectures via Neural Architecture Transformer
Yong Guo, Yin Zheng, Mingkui Tan, Qi Chen, Zhipeng Li, Jian Chen, Peilin Zhao, Junzhou Huang
2021-02-20
Multilingual Answer Sentence Reranking via Automatically Translated Data
Thuy Vu, Alessandro Moschitti
2021-02-20
Learning Dynamic BERT via Trainable Gate Variables and a Bi-modal Regularizer
Seohyeong Jeong, Nojun Kwak
2021-02-19
Towards Emotion Recognition in Hindi-English Code-Mixed Data: A Transformer Based Approach
Anshul Wadhawan, Akshita Aggarwal
2021-02-19
Using Transformer based Ensemble Learning to classify Scientific Articles
Sohom Ghosh, Ankush Chopra
2021-02-19
Calibrate Before Use: Improving Few-Shot Performance of Language Models
Tony Z. Zhao, Eric Wallace, Shi Feng, Dan Klein, Sameer Singh
2021-02-19
Dialect Identification in Nuanced Arabic Tweets Using Farasa Segmentation and AraBERT
Anshul Wadhawan
2021-02-19
Latent Variable Nested Set Transformers & AutoBots
Roger Girgis, Florian Golemo, Felipe Codevilla, Jim Aldon D'Souza, Samira Ebrahimi Kahou, Felix Heide, Christopher Pal
2021-02-19
Going Full-TILT Boogie on Document Understanding with Text-Image-Layout Transformer
Rafał Powalski, Łukasz Borchmann, Dawid Jurkiewicz, Tomasz Dwojak, Michał Pietruszka, Gabriela Pałka
2021-02-18
UnibucKernel: Geolocating Swiss German Jodels Using Ensemble Learning
Mihaela Gaman, Sebastian Cojocariu, Radu Tudor Ionescu
2021-02-18