Experiments on English-German and English-Romanian translation show that: (i) Epi-Curriculum improves both the model's robustness and its adaptability in seen and unseen domains; (ii) Our episodic training framework enhances the encoder's and decoder's robustness to domain shift.
1 code implementation • • Wei Cheng, Ruixiang Chen, Wanqi Yin, Siming Fan, Keyu Chen, Honglin He, Huiwen Luo, Zhongang Cai, Jingbo Wang, Yang Gao, Zhengming Yu, Zhengyu Lin, Daxuan Ren, Lei Yang, Ziwei Liu, Chen Change Loy, Chen Qian, Wayne Wu, Dahua Lin, Bo Dai, Kwan-Yee Lin
Realistic human-centric rendering plays a key role in both computer vision and computer graphics.
Based on existing efforts, this work makes two observations: (1) Highlight labeling is uncertain across different annotators, which leads to inaccurate and time-consuming annotations.
no code implementations • • Haoqian Wu, Keyu Chen, Haozhe Liu, Mingchen Zhuge, Bing Li, Ruizhi Qiao, Xiujun Shu, Bei Gan, Liangsheng Xu, Bo Ren, Mengmeng Xu, Wentian Zhang, Raghavendra Ramachandra, Chia-Wen Lin, Bernard Ghanem
Temporal video segmentation is the go-to first step in automatic video analysis, decomposing a long-form video into smaller components for downstream understanding tasks.
To explore the fine-grained alignment, we further propose two implicit semantic alignment paradigms: multi-level alignment (MLA) and bidirectional mask modeling (BMM).
For unbalanced Twitter data, XGBoost outperformed the LSTM, AutoGluon, and ULMFiT models on hate speech detection with an F1 score of 0.75, compared to 0.38, 0.37, and 0.38, respectively.
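The comparison above relies on the F1 score, which is the natural metric for unbalanced data such as the Twitter set described: on a skewed label distribution, plain accuracy can look high even when the rare (hate-speech) class is never caught. A minimal pure-Python sketch of the metric — the function name and toy data below are illustrative, not from the paper:

```python
def f1_score(y_true, y_pred, positive=1):
    """F1 for the positive class: harmonic mean of precision and recall."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Toy unbalanced set: 2 positives out of 10 samples.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
# Predicting all-negative gives 80% accuracy but F1 = 0.
print(f1_score(y_true, [0] * 10))                              # -> 0.0
# Catching one positive with one false alarm gives F1 = 0.5.
print(f1_score(y_true, [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]))        # -> 0.5
```

This is why the 0.75-vs-0.38 gap is meaningful: it reflects performance on the minority class rather than on the easy majority.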
Additionally, we propose a fairer and more reasonable benchmark to evaluate the performance of Video Scene Segmentation methods.
The results show that our two-stage training strategy effectively addresses the class-imbalance problem and significantly improves over existing works in terms of F1 score and AUC, achieving state-of-the-art performance.
Automatic affective recognition has been an important research topic in the human-computer interaction (HCI) area.
Facial expression analysis requires a compact, identity-agnostic expression representation.
Generating a high-fidelity talking-head video that fits the input audio sequence is a challenging problem that has received considerable attention recently.
The generalization capability of machine learning models, i.e., generalizing knowledge learned from one or more seen domains to an "unseen" domain, is of great importance for developing and deploying machine learning applications under real-world conditions.
Ranked #23 on Domain Generalization on VLCS
The problem of deforming an artist-drawn caricature according to a given normal face expression is of interest in applications such as social media, animation and entertainment.
Since skin lesion datasets are usually limited and statistically biased, an effective fusion approach must consider not only the performance of each classifier on the training/validation dataset, but also the relative discriminative power (e.g., confidence) of each classifier on an individual sample in the testing phase, which calls for an active fusion approach.
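One simple way to realize the per-sample idea described above is to weight each classifier's predicted distribution by its own confidence on that sample before averaging. This is a minimal sketch of such confidence-weighted fusion, not the paper's actual method; the function name, the use of the max class probability as "confidence", and the toy numbers are all assumptions:

```python
def confidence_weighted_fusion(prob_rows):
    """Fuse per-classifier class-probability vectors for ONE test sample,
    weighting each classifier by its per-sample confidence (max probability)."""
    n_classes = len(prob_rows[0])
    fused = [0.0] * n_classes
    total_conf = 0.0
    for probs in prob_rows:
        conf = max(probs)  # how sure this classifier is about this sample
        total_conf += conf
        for c, p in enumerate(probs):
            fused[c] += conf * p
    # Normalize so the fused vector is again a probability distribution.
    return [f / total_conf for f in fused]

# Classifier A is confident (0.9 on class 0); classifier B is uncertain.
fused = confidence_weighted_fusion([[0.9, 0.1], [0.4, 0.6]])
print(fused)  # class 0 wins because the confident classifier gets more weight
```

Unlike a fixed ensemble weight learned on the (possibly biased) training set, the weights here change per test sample, which is the "active" aspect the excerpt motivates.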
Hence, most of the mobile healthcare systems leverage the cloud computing infrastructure, where the data collected by the mobile and IoT devices would be transmitted to the cloud computing platforms for analysis.
We integrate four kinds of SOTA robust loss functions into our algorithm, and comprehensive experiments substantiate the general applicability and effectiveness of the proposed method in both accuracy and generalization performance, compared with the conventional hyperparameter tuning strategy, even when its hyperparameters are carefully tuned.
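The excerpt does not name the four robust losses, so as one representative example, here is a sketch of the Generalized Cross Entropy (GCE) loss, a well-known noise-robust loss that interpolates between cross entropy (q → 0) and a MAE-like loss (q = 1). Whether GCE is among the four used is an assumption:

```python
def generalized_cross_entropy(p_true, q=0.7):
    """GCE loss on the probability assigned to the true class.

    L_q(p) = (1 - p^q) / q
    q -> 0 recovers -log(p) (standard cross entropy);
    q = 1 gives 1 - p, whose bounded gradient is robust to label noise.
    """
    return (1.0 - p_true ** q) / q

# A perfectly confident correct prediction incurs zero loss.
print(generalized_cross_entropy(1.0))         # -> 0.0
# With q = 1 the loss is simply 1 - p.
print(generalized_cross_entropy(0.8, q=1.0))  # -> 0.2
```

The single hyperparameter q controls the robustness/fit trade-off, which is exactly the kind of knob the proposed method aims to handle without manual tuning.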
In this way, the discriminativeness of the face representations can be enhanced.
no code implementations • 17 Nov 2017 • Tiezheng Ge, Liqin Zhao, Guorui Zhou, Keyu Chen, Shuying Liu, Huimin Yi, Zelin Hu, Bochao Liu, Peng Sun, Haoyu Liu, Pengtao Yi, Sui Huang, Zhiqiang Zhang, Xiaoqiang Zhu, Yu Zhang, Kun Gai
We therefore propose to model user preference jointly with user behavior ID features and behavior images.
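The joint modeling above amounts to combining a sparse ID embedding with a dense image representation into one preference score. A minimal sketch of that idea, assuming a simple concatenate-then-linear scorer; the function name, dimensions, and weights are illustrative, not the paper's architecture:

```python
def joint_preference_score(id_embedding, image_embedding, weights):
    """Score a candidate item from a user's behavior ID embedding and the
    embedding of the behavior images, via a hypothetical linear scorer
    over their concatenation."""
    features = id_embedding + image_embedding  # list concatenation
    assert len(features) == len(weights)
    return sum(w * f for w, f in zip(weights, features))

# Toy example: 2-d ID embedding, 1-d image embedding.
score = joint_preference_score([1.0, 2.0], [3.0], weights=[0.5, 0.5, 1.0])
print(score)  # -> 4.5
```

The point of the joint formulation is that gradients flow into both the ID table and the image branch, so visual signals can help when ID features are sparse (e.g., cold-start items).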