Our model achieves state-of-the-art performance on three datasets, demonstrating the effectiveness of our approach.
To mimic the easy-to-hard way in which humans learn a curriculum, we apply the idea of curriculum learning (CL) to the ERC task to progressively optimize the network parameters of ERNetCL.
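The easy-to-hard idea can be sketched as a staged training schedule that releases samples in order of difficulty. This is a minimal illustration of generic curriculum learning, not the authors' actual ERNetCL procedure; the per-sample `difficulties` scores and the staging policy here are assumptions.

```python
import numpy as np

def curriculum_schedule(samples, difficulties, num_stages):
    """Order training samples from easy to hard and release them in stages.

    `difficulties` is a hypothetical per-sample difficulty score
    (lower = easier); how ERNetCL scores difficulty is not specified here.
    Each yielded subset contains all samples released so far.
    """
    order = np.argsort(difficulties)           # easiest first
    stages = np.array_split(order, num_stages)
    released = []
    for stage in stages:
        released.extend(stage.tolist())
        yield [samples[i] for i in released]   # train on everything seen so far

# Example: 6 samples released over 3 stages
samples = ["s0", "s1", "s2", "s3", "s4", "s5"]
difficulties = [0.9, 0.1, 0.5, 0.3, 0.7, 0.2]
subsets = list(curriculum_schedule(samples, difficulties, 3))
# subsets[0] holds the two easiest samples; subsets[-1] holds all six
```

In a real training loop, each yielded subset would drive one training phase before the next, harder batch is added.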
RUME is applied to extract conversation-level contextual emotional cues while pulling together the data distributions between modalities; ACME is utilized to perform multimodal interaction centered on the textual modality; LESM is used to model emotion shift and capture the related information, thereby guiding the learning of the main task.
Ranked #3 on Emotion Recognition in Conversation on IEMOCAP
How to model the context in a conversation is a central aspect and a major challenge of ERC tasks.
This paper presents a translation-based knowledge graph completion method via efficient relation rotation (TransERR), a straightforward yet effective alternative to traditional translation-based knowledge graph completion models.
Ranked #14 on Link Property Prediction on ogbl-wikikg2
However, the limited availability of high-quality datasets has hindered the development of deep learning methods for safety-clothing and helmet detection.
Using a modern game engine, our approach renders crisp, clean images and their precise depth maps, from which high-quality hazy images can be synthesized for training dehazing models.
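Synthesizing haze from a clean image and its depth map typically follows the standard atmospheric scattering model, I = J·t + A·(1 − t) with transmission t = exp(−β·d). The sketch below assumes this classic model; the paper's exact synthesis pipeline and parameter ranges may differ, and `beta` and `airlight` here are illustrative values.

```python
import numpy as np

def synthesize_haze(clean, depth, beta=1.0, airlight=0.8):
    """Render a hazy image from a clean image J and per-pixel depth d via
    the atmospheric scattering model: I = J * t + A * (1 - t),
    where t = exp(-beta * d) is the transmission map.

    clean: (H, W, 3) float image in [0, 1]
    depth: (H, W) float depth map
    beta:  scattering coefficient (larger = denser haze) -- assumed value
    airlight: global atmospheric light A -- assumed value
    """
    t = np.exp(-beta * depth)[..., None]   # (H, W, 1) transmission
    return clean * t + airlight * (1.0 - t)

# Nearby pixels keep their color; distant pixels fade toward the airlight.
clean = np.full((2, 2, 3), 0.5)
depth = np.array([[0.0, 1.0], [2.0, 10.0]])
hazy = synthesize_haze(clean, depth)
```

Precise, noise-free depth from the game engine is what makes the transmission map (and hence the synthetic haze) physically consistent.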
Existing methods mostly handcraft the architecture or search for it within a large design space, owing to a lack of understanding of how the architectural choice corresponds to the image.
The ablation studies confirm that our modules can effectively model emotional inertia and contagion.
Ranked #4 on Emotion Recognition in Conversation on EmoryNLP
We encode the last utterance separately and fuse it with the entire dialogue through a multi-head-attention-based intention fusion module to capture the speaker's intention.
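A fusion step like this can be sketched as scaled dot-product multi-head attention where the last-utterance encoding provides the queries and the full-dialogue encoding provides the keys and values. This is a generic sketch, not the paper's module: the learned projection matrices are replaced with identity projections for brevity, and all shapes are assumed.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def intention_fusion(last_utt, dialogue, num_heads=2):
    """Fuse last-utterance features (queries) with dialogue features
    (keys/values) via multi-head scaled dot-product attention.

    last_utt: (Lq, d) encoding of the last utterance
    dialogue: (Lk, d) encoding of the entire dialogue
    d must be divisible by num_heads; projections are identity (a
    simplification -- a real module would learn Q/K/V projections).
    """
    Lq, d = last_utt.shape
    dh = d // num_heads
    q = last_utt.reshape(Lq, num_heads, dh).transpose(1, 0, 2)   # (h, Lq, dh)
    k = dialogue.reshape(-1, num_heads, dh).transpose(1, 0, 2)   # (h, Lk, dh)
    v = k
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(dh)              # (h, Lq, Lk)
    out = softmax(scores) @ v                                    # (h, Lq, dh)
    return out.transpose(1, 0, 2).reshape(Lq, d)                 # concat heads

rng = np.random.default_rng(0)
fused = intention_fusion(rng.normal(size=(1, 8)), rng.normal(size=(5, 8)))
```

The fused vector attends over the whole dialogue but is anchored on the last utterance, which is what lets the model read the speaker's current intention in context.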
In multimodal ERC, GNNs are capable of extracting both long-distance contextual information and inter-modal interactive information.
Ranked #11 on Emotion Recognition in Conversation on IEMOCAP
In this work, we present JIZHI, a Model-as-a-Service system that handles hundreds of millions of online inference requests per second to huge deep models with over a trillion sparse parameters, serving more than twenty real-time recommendation services at Baidu, Inc.
1 code implementation • 21 Apr 2021 • Ren Yang, Radu Timofte, Jing Liu, Yi Xu, Xinjian Zhang, Minyi Zhao, Shuigeng Zhou, Kelvin C. K. Chan, Shangchen Zhou, Xiangyu Xu, Chen Change Loy, Xin Li, Fanglong Liu, He Zheng, Lielin Jiang, Qi Zhang, Dongliang He, Fu Li, Qingqing Dang, Yibin Huang, Matteo Maggioni, Zhongqian Fu, Shuai Xiao, Cheng Li, Thomas Tanay, Fenglong Song, Wentao Chao, Qiang Guo, Yan Liu, Jiang Li, Xiaochao Qu, Dewang Hou, Jiayu Yang, Lyn Jiang, Di You, Zhenyu Zhang, Chong Mou, Iaroslav Koshelev, Pavel Ostyakov, Andrey Somov, Jia Hao, Xueyi Zou, Shijie Zhao, Xiaopeng Sun, Yiting Liao, Yuanzhi Zhang, Qing Wang, Gen Zhan, Mengxi Guo, Junlin Li, Ming Lu, Zhan Ma, Pablo Navarrete Michelini, Hai Wang, Yiyun Chen, Jingyu Guo, Liliang Zhang, Wenming Yang, Sijung Kim, Syehoon Oh, Yucong Wang, Minjie Cai, Wei Hao, Kangdi Shi, Liangyan Li, Jun Chen, Wei Gao, Wang Liu, XiaoYu Zhang, Linjie Zhou, Sixin Lin, Ru Wang
This paper reviews the first NTIRE challenge on quality enhancement of compressed video, with a focus on the proposed methods and results.
Theoretical analysis (see Appendix A) and accuracy evaluations on a variety of DNN models across different tasks show the effectiveness and generalization of AUSN.
We compared the performance of the new data representations with that of the original EEG features for engagement assessment.