Current practices for applying temperature scaling assume either a fixed schedule or a manually crafted, dynamically changing one.
Temperature scaling has been widely used as an effective approach to control the smoothness of a distribution, which improves model performance on various tasks.
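As a minimal sketch of the mechanism named above (the function name and NumPy usage are illustrative, not from the source): temperature scaling divides the logits by a temperature T before the softmax, so higher T flattens the distribution and lower T sharpens it.

```python
import numpy as np

def softmax_with_temperature(logits, T=1.0):
    # Divide logits by T: T > 1 smooths the distribution, T < 1 sharpens it.
    z = np.asarray(logits, dtype=float) / T
    z -= z.max()              # subtract max for numerical stability
    p = np.exp(z)
    return p / p.sum()
```

For example, the same logits produce a flatter distribution at T=5 than at T=0.5, while always summing to 1.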
Despite recent success in neural task-oriented dialogue systems, developing such a real-world system involves accessing large-scale knowledge bases (KBs), which cannot simply be encoded by neural approaches such as memory network mechanisms.
In this work, we propose a new regularization technique, Remix, that relaxes Mixup's formulation and enables the mixing factors of features and labels to be disentangled.
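A minimal sketch of the disentangling idea, built on the standard Mixup formulation (the function names and the choice of both factors here are illustrative; how the label factor is actually selected is not specified in the sentence above): Mixup ties features and labels to one mixing factor, while the relaxed form lets them use independent factors.

```python
import numpy as np

def mixup(x_i, x_j, y_i, y_j, lam):
    # Standard Mixup: one factor lam mixes both features and labels.
    return lam * x_i + (1 - lam) * x_j, lam * y_i + (1 - lam) * y_j

def relaxed_mix(x_i, x_j, y_i, y_j, lam_x, lam_y):
    # Disentangled form: features use lam_x, labels use a separate lam_y.
    return lam_x * x_i + (1 - lam_x) * x_j, lam_y * y_i + (1 - lam_y) * y_j
```

With lam_x=0.5 and lam_y=0.9, the mixed feature sits halfway between the two inputs while the mixed label leans heavily toward the first class, which a single shared factor cannot express.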
In this paper, we propose a noise-agnostic method to achieve robust neural network performance against any noise setting.
Label hierarchies widely exist in many vision-related problems, ranging from the explicit label hierarchies found in image classification to the latent label hierarchies found in semantic segmentation.
Adversarial robustness has emerged as an important topic in deep learning, as carefully crafted attack samples can significantly degrade the performance of a model.
Although widely adopted, cross entropy as the primary objective exploits mostly the information from the ground-truth class to maximize the data likelihood, and largely ignores information from the complement (incorrect) classes.
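The claim about cross entropy can be illustrated concretely (a hypothetical NumPy sketch, not any paper's implementation): only the predicted probability of the ground-truth class enters the loss, so two distributions that agree on that class incur identical loss no matter how differently they spread mass over the incorrect classes.

```python
import numpy as np

def cross_entropy(probs, target):
    # Only probs[target] (the ground-truth class) affects the loss value;
    # the distribution over the complement classes is ignored entirely.
    return -np.log(probs[target])

# Same probability on the true class (index 0), very different complements.
p_a = np.array([0.5, 0.25, 0.25])
p_b = np.array([0.5, 0.49, 0.01])
```

Both `p_a` and `p_b` yield a loss of -log(0.5), even though they disagree sharply about the incorrect classes.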
Recent breakthroughs in Neural Architecture Search (NAS) have achieved state-of-the-art performance in many tasks such as image classification and language understanding.
Recent studies on neural architecture search have shown that automatically designed neural networks perform as well as expert-crafted architectures.