1 code implementation • 19 Apr 2024 • Peibo Li, Maarten de Rijke, Hao Xue, Shuang Ao, Yang Song, Flora D. Salim
Our results show that the proposed framework outperforms state-of-the-art models on all three datasets.
no code implementations • 21 Sep 2023 • Shuang Ao, Tianyi Zhou, Guodong Long, Xuan Song, Jing Jiang
Throughout their long history, natural species have learned to survive by evolving physical structures that adapt to environmental changes.
no code implementations • 6 Aug 2023 • Shuang Ao, Stefan Rueger, Advaith Siddharthan
We propose the Excess Area Under the Optimal RC Curve (E-AUoptRC), which measures the area over the coverage range from the optimal point to full coverage.
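For context, risk–coverage (RC) curves underlie this metric: predictions are ranked by confidence, and at each coverage level the risk is the error rate among the retained samples. The sketch below computes a standard RC curve and its area (AURC); the function names are illustrative, and it implements the textbook AURC rather than the paper's E-AUoptRC variant.

```python
import numpy as np

def risk_coverage_curve(confidence, correct):
    # Rank samples from most to least confident; at coverage k/n,
    # risk is the error rate over the k most confident samples.
    order = np.argsort(-confidence)
    errors = 1.0 - correct[order].astype(float)
    n = len(errors)
    coverage = np.arange(1, n + 1) / n
    risk = np.cumsum(errors) / np.arange(1, n + 1)
    return coverage, risk

def aurc(confidence, correct):
    # Area under the risk-coverage curve (trapezoidal rule).
    coverage, risk = risk_coverage_curve(confidence, correct)
    return np.trapz(risk, coverage)
```

With a perfectly confidence-ranked model, risk stays at zero until the misclassified samples enter, so AURC is small; a miscalibrated ranking inflates it.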
no code implementations • 6 Aug 2023 • Shuang Ao
Although AI systems have been applied in various fields and achieved impressive performance, their safety and reliability remain a major concern.
1 code implementation • 6 Aug 2023 • Shuang Ao, Stefan Rueger, Advaith Siddharthan
Then we utilize the class-wise miscalibration score as a proxy to design a calibration technique that can tackle both over and under-confidence.
1 code implementation • 29 Jan 2023 • Shuang Ao, Stefan Rueger, Advaith Siddharthan
In this paper, we integrate notions of model confidence and human confidence with label smoothing, respectively Model Confidence LS and Human Confidence LS, to achieve better model calibration and generalization.
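For reference, both variants build on standard label smoothing, which redistributes a fraction of the gold label's probability mass uniformly over all classes. The sketch below shows the textbook form only; it is not the paper's model- or human-confidence-weighted variant, and the function name is illustrative.

```python
import numpy as np

def smooth_labels(one_hot, epsilon=0.1):
    # Move epsilon of the probability mass from the one-hot target
    # to a uniform distribution over the k classes.
    k = one_hot.shape[-1]
    return one_hot * (1.0 - epsilon) + epsilon / k
```

Training against these softened targets penalizes over-confident predictions, which is why label smoothing is a common baseline for calibration work.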
1 code implementation • NeurIPS 2021 • Shuang Ao, Tianyi Zhou, Guodong Long, Qinghua Lu, Liming Zhu, Jing Jiang
Next, a bottom-up traversal of the tree trains the RL agent from easier sub-tasks with denser rewards on bottom layers to harder ones on top layers, and collects its cost on each sub-task to train the planner in the next episode.
no code implementations • 29 Sep 2021 • Shuang Ao, Tianyi Zhou, Jing Jiang, Guodong Long, Xuan Song, Chengqi Zhang
They are complementary in acquiring more informative feedback for RL: the planning policy provides dense rewards for finishing easier sub-tasks, while the environment policy modifies these sub-tasks to be adequately challenging and diverse so the RL agent can quickly adapt to different tasks/environments.
no code implementations • ICNLSP 2021 • Shuang Ao, Xeno Acharya
In this paper, we investigate well-calibrated models for ULMFiT and self-distillation (SD) in a medical dialogue system.