Open-World Social Event Classification

With the rapid development of the Internet and the expanding scale of social media, social event classification has attracted increasing attention. The key to social event classification is effectively leveraging visual and textual semantics. However, most existing approaches may suffer from the following limitations: (1) Most of them simply concatenate image features and text features to obtain multimodal features, ignoring the fine-grained semantic relationships between modalities. (2) The majority of them hold the closed-world assumption that all classes encountered at test time were already seen in training, an assumption that is easily broken in real-world applications. In practice, new events on the Internet may not belong to any existing (seen) class and therefore cannot be correctly identified by closed-world learning algorithms. To tackle these challenges, we propose an Open-World Social Event Classifier (OWSEC) model in this paper. First, we design a multimodal mask transformer network that captures cross-modal semantic relations and fuses fine-grained multimodal features of social events while masking redundant information. Second, we design an open-world classifier and propose a cross-modal event mixture mechanism with a novel open-world classification loss to capture the potential distribution space of the unseen class. Extensive experiments on two public datasets demonstrate the superiority of the proposed OWSEC model for open-world social event classification.
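
The abstract does not spell out the event mixture mechanism or the open-world loss, so the following is only a minimal sketch of one generic way to realize the idea, not the OWSEC implementation. It assumes that fused multimodal features are already available as vectors, that pseudo-unseen samples are synthesized by mixup-style interpolation between samples of different seen classes, and that the classifier reserves one extra output slot for the unseen class. The names `mix_cross_class` and `cross_entropy`, the Beta-distributed mixing weight, and the random stand-in features are all hypothetical.

```python
import numpy as np

def mix_cross_class(features, labels, num_seen, rng, alpha=2.0):
    """Interpolate pairs of samples from different seen classes.

    Assumption: such mixtures are treated as stand-ins for the unseen
    class and labeled with the extra index `num_seen`.
    """
    perm = rng.permutation(len(features))
    keep = labels != labels[perm]  # mix only pairs with different labels
    lam = rng.beta(alpha, alpha, size=(int(keep.sum()), 1))
    mixed = lam * features[keep] + (1.0 - lam) * features[perm][keep]
    mixed_labels = np.full(len(mixed), num_seen, dtype=labels.dtype)
    return mixed, mixed_labels

def cross_entropy(logits, targets):
    """Mean softmax cross-entropy over num_seen + 1 output slots."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return float(-log_probs[np.arange(len(targets)), targets].mean())

# Toy usage with random stand-ins for fused image-text features.
rng = np.random.default_rng(0)
num_seen, dim = 3, 8
features = rng.normal(size=(12, dim))
labels = np.repeat(np.arange(num_seen), 4)

mixed, mixed_labels = mix_cross_class(features, labels, num_seen, rng)
all_x = np.concatenate([features, mixed])
all_y = np.concatenate([labels, mixed_labels])

# Hypothetical linear classifier with one extra column for the unseen class.
weights = rng.normal(size=(dim, num_seen + 1)) * 0.1
loss = cross_entropy(all_x @ weights, all_y)
```

In this sketch a test sample would be flagged as a new event whenever the extra slot receives the highest score. The loss proposed in the paper is described as novel and may differ substantially from plain cross-entropy.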
