Knowledge distillation (KD) is a widely used strategy for transferring learned knowledge from one neural network model to another.
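The transfer described above is commonly implemented as a cross-entropy loss against the teacher's temperature-softened output distribution. A minimal sketch of that soft-target formulation follows; the function names and the temperature value are illustrative, not taken from any particular paper's code.

```python
import numpy as np

def softmax(z, T=1.0):
    # Temperature-scaled softmax; higher T yields a softer distribution.
    z = np.asarray(z, dtype=float) / T
    z = z - z.max()  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def distillation_loss(student_logits, teacher_logits, T=2.0):
    # Cross-entropy between the teacher's softened distribution and the
    # student's, scaled by T^2 so gradients stay comparable across T.
    p_t = softmax(teacher_logits, T)
    p_s = softmax(student_logits, T)
    return -float(np.sum(p_t * np.log(p_s + 1e-12))) * T * T
```

In practice this term is mixed with the ordinary hard-label loss via a weighting coefficient.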
In contrast to previous works, our model splits alignment into different levels to learn better correlations without requiring additional data or annotations.
Most existing continual learning approaches suffer from low accuracy and performance fluctuation, especially when the distributions of old and new data are significantly different.
Knowledge distillation, weight pruning, and quantization are known to be the main directions in model compression.
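Two of these directions can be illustrated in a few lines. The sketch below shows unstructured magnitude pruning and uniform affine quantization; the sparsity level and bit-width are illustrative choices, not recommendations from the text above.

```python
import numpy as np

def magnitude_prune(weights, sparsity=0.5):
    # Zero out the smallest-magnitude fraction of weights
    # (unstructured magnitude pruning).
    w = np.asarray(weights, dtype=float)
    k = int(sparsity * w.size)
    if k == 0:
        return w.copy()
    threshold = np.sort(np.abs(w), axis=None)[k - 1]
    return np.where(np.abs(w) <= threshold, 0.0, w)

def quantize_uniform(weights, bits=8):
    # Uniform affine quantization to 2**bits levels, then dequantize,
    # returning the values the compressed model would actually use.
    w = np.asarray(weights, dtype=float)
    lo, hi = w.min(), w.max()
    scale = (hi - lo) / (2 ** bits - 1) or 1.0  # guard against all-equal weights
    q = np.round((w - lo) / scale)
    return q * scale + lo
```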
We are the first to propose a method that performs well on both OOD detection and calibration, and under different types of distribution shift.
To this end, we theoretically derive two score functions for OOD detection, a covariate shift score and a concept shift score, based on a decomposition of the KL divergence, and propose a geometrically inspired method (Geometric ODIN) that improves OOD detection under both shifts using only in-distribution data.
DictFormer significantly reduces the redundancy in the transformer's parameters by replacing them with a compact shared dictionary, a few unshared coefficients, and indices.
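The dictionary-plus-coefficients idea can be illustrated with a hypothetical sketch: a weight row is rebuilt as a sparse linear combination of shared dictionary atoms selected by index. The function and array shapes below are our own illustration, not DictFormer's actual implementation.

```python
import numpy as np

def reconstruct_weight(dictionary, coeffs, idx):
    # dictionary: (n_atoms, d) shared atoms; coeffs: (t,) unshared
    # coefficients; idx: (t,) indices selecting which atoms to combine.
    # Storing only coeffs and idx per layer is much cheaper than a full
    # (d,)-sized weight row when t << n_atoms.
    return np.asarray(coeffs) @ np.asarray(dictionary)[np.asarray(idx)]
```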
In other words, the optimization objective of SVD is not aligned with the trained model's task accuracy.
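The mismatch arises because truncated SVD gives the best low-rank approximation in Frobenius norm (the Eckart–Young theorem), which minimizes reconstruction error rather than task loss. A minimal sketch of SVD-based compression of a weight matrix:

```python
import numpy as np

def low_rank_approx(W, rank):
    # Best rank-r approximation of W in Frobenius norm: keep only the
    # top `rank` singular triplets. This objective says nothing about
    # the downstream task's accuracy.
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    return U[:, :rank] @ np.diag(s[:rank]) @ Vt[:rank, :]
```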
This paper proposes to train a model with only IND data while supporting both IND intent classification and OOD detection.
Modern computer vision applications suffer from catastrophic forgetting when incrementally learning new concepts over time.
In this paper, we introduce a novel multi-step spoken language understanding system based on adversarial learning that can leverage multi-round user feedback to update slot values.
A cryptographic neural network inference service is an efficient way to allow two parties to execute neural network inference without revealing either party’s data or model.
We design a Dirichlet Prior RNN that models high-order uncertainty and degenerates to a standard softmax layer during RNN model training.
Existing open-domain dialogue generation models are usually trained to mimic the gold response in the training set using cross-entropy loss on the vocabulary.
Text-based interactive recommendation provides richer user feedback and has demonstrated advantages over traditional interactive recommender systems.
Third, we design a private location trace release framework that pipelines the detection of location exposure, policy graph repair, and private trajectory release with customizable and rigorous location privacy.
Deep neural networks have attained remarkable performance when applied to data that comes from the same distribution as that of the training set, but can significantly degrade otherwise.
Text-based interactive recommendation provides richer user preferences and has demonstrated advantages over traditional interactive recommender systems.
ProgModel consists of a novel context gate that transfers previously learned knowledge to a small expanded component, while enabling this new component to be quickly trained on new data.
Recurrent neural network (RNN) based joint intent classification and slot tagging models have achieved tremendous success in recent years for building spoken language understanding and dialog systems.
We present SkillBot, which takes the first step toward enabling end users to teach new skills to personal assistants (PAs).
In deep neural networks, a pre-defined vocabulary is required to vectorize text inputs.
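A minimal sketch of such vocabulary-based vectorization, where a fixed token-to-id map is built from training text and unseen tokens fall back to an out-of-vocabulary id (the token scheme here is illustrative):

```python
def build_vocab(texts, unk_token="<unk>"):
    # Map each distinct whitespace-separated token to an integer id;
    # id 0 is reserved for out-of-vocabulary tokens.
    vocab = {unk_token: 0}
    for text in texts:
        for tok in text.split():
            vocab.setdefault(tok, len(vocab))
    return vocab

def vectorize(text, vocab):
    # Replace each token with its id; unseen tokens map to the <unk> id.
    return [vocab.get(tok, 0) for tok in text.split()]
```

Because the map is fixed before training, any word outside it can only be represented as `<unk>`, which is one motivation for open-vocabulary and subword approaches.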
Many vision and language models suffer from poor visual grounding - often falling back on easy-to-learn language priors rather than basing their decisions on visual concepts in the image.
The most effective algorithms are based on sequence-to-sequence (or "encoder-decoder") models, and generate the intents and semantic tags either with separate models or with a joint model.
With the recent rapid development of deep learning, deep neural networks have been widely adopted in many real-life applications.
The results show that our approach leverages such simple user information to outperform state-of-the-art approaches by 0.25% for intent detection and 0.31% for slot filling using standard training data.
Learning intents and slot labels from user utterances is a fundamental step in all spoken language understanding (SLU) and dialog systems.
We present a system, CRUISE, that guides ordinary software developers to build a high quality natural language understanding (NLU) engine from scratch.
The learning process is interactive, with a human expert first providing input in the form of full demonstrations along with some subgoal states.