no code implementations • 22 May 2017 • Zhong Ji, Yunxin Sun, Yulong Yu, Jichang Guo, Yanwei Pang
However, the visual features and the class semantic descriptors locate in different structural spaces, a linear or bilinear model can not capture the semantic interactions between different modalities well.