no code implementations • 1 Jan 2021 • Pei-Hsin Wang, Sheng-Iou Hsieh, Shih-Chieh Chang, Yu-Ting Chen, Da-Cheng Juan, Jia-Yu Pan, Wei Wei
Current practices to apply temperature scaling assume either a fixed, or a manually-crafted dynamically changing schedule.
no code implementations • 25 Dec 2020 • Pei-Hsin Wang, Sheng-Iou Hsieh, Shih-Chieh Chang, Yu-Ting Chen, Jia-Yu Pan, Wei Wei, Da-Chang Juan
Temperature scaling has been widely used as an effective approach to control the smoothness of a distribution, which helps the model performance in various tasks.
1 code implementation • Findings of the Association for Computational Linguistics 2020 • Chieh-Yang Chen, Pei-Hsin Wang, Shih-Chieh Chang, Da-Cheng Juan, Wei Wei, Jia-Yu Pan
Despite recent success in neural task-oriented dialogue systems, developing such a real-world system involves accessing large-scale knowledge bases (KBs), which cannot be simply encoded by neural approaches, such as memory network mechanisms.
1 code implementation • ICLR 2019 • Hao-Yun Chen, Pei-Hsin Wang, Chun-Hao Liu, Shih-Chieh Chang, Jia-Yu Pan, Yu-Ting Chen, Wei Wei, Da-Cheng Juan
Although being a widely-adopted approach, using cross entropy as the primary objective exploits mostly the information from the ground-truth class for maximizing data likelihood, and largely ignores information from the complement (incorrect) classes.