Semi-supervised Gender Classification with Joint Textual and Social Modeling
In gender classification, labeled data is often limited while unlabeled data is plentiful. This motivates semi-supervised learning, which aims to improve classification performance by exploiting the knowledge in both labeled and unlabeled data. In this paper, we propose a semi-supervised approach to gender classification that leverages textual features together with a specific kind of indirect link among users, which we call "same-interest" links. Specifically, we propose a factor graph, the Textual and Social Factor Graph (TSFG), to model both the textual information and the "same-interest" link information. Empirical studies demonstrate the effectiveness of the proposed approach to semi-supervised gender classification.
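To make the idea of combining textual and link information in a factor graph concrete, the following is a minimal illustrative sketch, not the TSFG model itself. It assumes a pairwise graph over binary labels in which each unlabeled user has a textual factor (here, a probability from some hypothetical text classifier) and each "same-interest" link contributes a factor that rewards agreeing labels. The function name `toy_factor_graph_marginals`, the `link_weight` parameter, and the agreement assumption are all illustrative choices, and inference is done by brute-force enumeration, which is feasible only for very small graphs.

```python
import itertools
import math


def toy_factor_graph_marginals(text_prob, links, labeled, link_weight=1.0):
    """Exact marginals P(y_i = 1) for a tiny pairwise factor graph.

    text_prob   -- list of P(y_i = 1 | text), one per user (assumed classifier output)
    links       -- list of (i, j) pairs standing in for "same-interest" links
    labeled     -- dict mapping user index to a known label (0 or 1)
    link_weight -- assumed strength of the "linked users agree" factor
    """
    n = len(text_prob)
    free = [i for i in range(n) if i not in labeled]
    totals = [0.0] * n  # unnormalized mass of assignments where y_i = 1
    z = 0.0             # partition function over all assignments of free nodes

    for assignment in itertools.product((0, 1), repeat=len(free)):
        y = dict(labeled)
        y.update(zip(free, assignment))

        # Textual factors: log-probability of each unlabeled user's label.
        score = sum(
            math.log(text_prob[i] if y[i] == 1 else 1.0 - text_prob[i])
            for i in free
        )
        # Link factors: add link_weight whenever two linked users agree.
        score += sum(link_weight for i, j in links if y[i] == y[j])

        weight = math.exp(score)
        z += weight
        for i in range(n):
            if y[i] == 1:
                totals[i] += weight

    return [t / z for t in totals]


# User 0 is labeled 1; user 1 is unlabeled with an uninformative text score.
marginals = toy_factor_graph_marginals(
    text_prob=[0.9, 0.5], links=[(0, 1)], labeled={0: 1}, link_weight=1.0
)
```

In this toy setting, the unlabeled user's text score is neutral (0.5), so the single link to a labeled user is what moves the marginal above 0.5. This shows in miniature how link information can propagate label evidence to unlabeled users.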
COLING 2016