Chinese Synesthesia Detection: New Dataset and Models

In this paper, we introduce a new task called synesthesia detection, which aims to extract the sensory word of a sentence, and to predict the original and synesthetic sensory modalities of the corresponding sensory word. Synesthesia refers to the description of perceptions in one sensory modality through concepts from other modalities. It involves not only a linguistic phenomenon, but also a cognitive phenomenon structuring human thought and action, which makes it become a bridge between figurative linguistic phenomenon and abstract cognition, and thus be helpful to understand the deep semantics. To address this, we construct a large-scale human-annotated Chinese synesthesia dataset, which contains 7,217 annotated sentences accompanied by 187 sensory words. Based on this dataset, we propose a family of strong and representative baseline models. Upon these baselines, we further propose a radical-based neural network model to identify the boundary of the sensory word, and to jointly detect the original and synesthetic sensory modalities for the word. Through extensive experiments, we observe that the importance of the proposed task and dataset can be verified by the statistics and progressive performances. In addition, our proposed model achieves state-of-the-art results on the synesthesia dataset.

PDF Abstract
No code implementations yet. Submit your code now

Datasets


  Add Datasets introduced or used in this paper

Results from the Paper


  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.

Methods


No methods listed for this paper. Add relevant methods here