For tasks related to distant domains with different class label sets, PLMs may memorize non-transferable knowledge for the target domain and suffer from negative transfer.
Based on continuous prompt embeddings, we propose TransPrompt, a transferable prompting framework for few-shot learning across similar tasks.
While deep learning has been widely used for video analytics, such as video classification and action detection, dense action detection with fast-moving subjects from sports videos is still challenging.
Extractive Question Answering (EQA) is one of the most important tasks in Machine Reading Comprehension (MRC), which can be solved by fine-tuning the span selecting heads of Pre-trained Language Models (PLMs).
The success of Pre-Trained Models (PTMs) has reshaped the development of Natural Language Processing (NLP).
Pre-trained Language Models (PLMs) have achieved remarkable performance for various language understanding tasks in IR systems, which require the fine-tuning process based on labeled training data.
Unified in contrastive learning, CAP enables the pruned model to learn from the pre-trained model for task-agnostic knowledge, and fine-tuned model for task-specific knowledge.
Knowledge-Enhanced Pre-trained Language Models (KEPLMs) are pre-trained models with relation triples injecting from knowledge graphs to improve language understanding abilities.
In speech enhancement, complex neural network has shown promising performance due to their effectiveness in processing complex-valued spectrum.
Our aim for the proposed coexistence network is to maximize the spectral efficiency while ensuring the strict NR-U delay requirement and the WiGig transmission performance in real time environments.
To this end, this paper designs a Multi-layered Semantic Representation Network (MSRN) which discovers both local and global semantics of labels through modeling label correlations and utilizes the label semantics to guide the semantic representations learning at multiple layers through an attention mechanism.
Ranked #5 on Multi-Label Classification on PASCAL VOC 2007
Therefore, we present CoCoNeT, with a DSL to express a program with both computation and communication.
This paper reviews the history and origin of the Abraham-Minkowski controversy and points out that it is a continuation of the controversy over the speed of light in medium.
We argue that training a teacher with transferable knowledge digested across domains can achieve better generalization capability to help knowledge distillation.
More specifically, we proposed a reinforced selector to extract useful PRF terms to enhance response candidates and a BERT based response ranker to rank the PRF-enhanced responses.
The literature has witnessed the success of leveraging Pre-trained Language Models (PLMs) and Transfer Learning (TL) algorithms to a wide range of Natural Language Processing (NLP) applications, yet it is not easy to build an easy-to-use and scalable TL toolkit for this purpose.
The intensive computation of Automatic Speech Recognition (ASR) models obstructs them from being deployed on mobile devices.
We present EasyASR, a distributed machine learning platform for training and serving large-scale Automatic Speech Recognition (ASR) models, as well as collecting and processing audio data at scale.
Existing learning based methods for text labeling task usually require a large amount of labeled examples to train a specific model for each type of document.
In this paper, we introduce a multi-target MRC task for the medical domain, whose goal is to predict answers to medical questions and the corresponding support sentences from medical information sources simultaneously, in order to ensure the high reliability of medical knowledge serving.
Building Automatic Speech Recognition (ASR) systems from scratch is significantly challenging, mostly due to the time-consuming and financially-expensive process of annotating a large amount of audio data with transcripts.
In this paper, we thus propose a novel global spatial attention mechanism in CNNs mainly for medical image classification.
In this paper, we propose an effective learning procedure named Meta Fine-Tuning (MFT), served as a meta-learner to solve a group of similar NLP tasks for neural language models.
We further combine a meta-learning process over the auxiliary task distribution and supervised learning to train the neural lexical relation classifier.
Motivated by the necessity and benefits of task-oriented BERT compression, we propose a novel compression method, AdaBERT, that leverages differentiable Neural Architecture Search to automatically compress BERT into task-adaptive small models for specific tasks.
1 code implementation • 7 Feb 2019 • Łukasz Kidziński, Carmichael Ong, Sharada Prasanna Mohanty, Jennifer Hicks, Sean F. Carroll, Bo Zhou, Hongsheng Zeng, Fan Wang, Rongzhong Lian, Hao Tian, Wojciech Jaśkowski, Garrett Andersen, Odd Rune Lykkebø, Nihat Engin Toklu, Pranav Shyam, Rupesh Kumar Srivastava, Sergey Kolesnikov, Oleksii Hrinchuk, Anton Pechenko, Mattias Ljungström, Zhen Wang, Xu Hu, Zehong Hu, Minghui Qiu, Jun Huang, Aleksei Shpilman, Ivan Sosin, Oleg Svidchenko, Aleksandra Malysheva, Daniel Kudenko, Lance Rane, Aditya Bhatt, Zhengfei Wang, Penghui Qi, Zeyang Yu, Peng Peng, Quan Yuan, Wenxin Li, Yunsheng Tian, Ruihan Yang, Pingchuan Ma, Shauharda Khadka, Somdeb Majumdar, Zach Dwiel, Yinyin Liu, Evren Tumer, Jeremy Watson, Marcel Salathé, Sergey Levine, Scott Delp
In the NeurIPS 2018 Artificial Intelligence for Prosthetics challenge, participants were tasked with building a controller for a musculoskeletal model with a goal of matching a given time-varying velocity vector.
Specifically, the data selector "acts" on the source domain data to find a subset for optimization of the TL model, and the performance of the TL model can provide "rewards" in turn to update the selector.
Product reviews, in the form of texts dominantly, significantly help consumers finalize their purchasing decisions.
Building multi-turn information-seeking conversation systems is an important and challenging research topic.
Our models and research findings provide new insights on how to utilize external knowledge with deep neural models for response selection and have implications for the design of the next generation of information-seeking conversation systems.
We present AliMe Assist, an intelligent assistant designed for creating an innovative online shopping experience in E-commerce.
In this paper, we study transfer learning for the PI and NLI problems, aiming to propose a general framework, which can effectively and efficiently adapt the shared knowledge learned from a resource-rich source domain to a resource- poor target domain.
We propose AliMe Chat, an open-domain chatbot engine that integrates the joint results of Information Retrieval (IR) and Sequence to Sequence (Seq2Seq) based generation models.