Order-Embeddings of Images and Language

19 Nov 2015  ·  Ivan Vendrov, Ryan Kiros, Sanja Fidler, Raquel Urtasun

Hypernymy, textual entailment, and image captioning can be seen as special cases of a single visual-semantic hierarchy over words, sentences, and images. In this paper we advocate for explicitly modeling the partial order structure of this hierarchy. Towards this goal, we introduce a general method for learning ordered representations, and show how it can be applied to a variety of tasks involving images and language. We show that the resulting representations improve performance over current approaches for hypernym prediction and image-caption retrieval.
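To make the idea of an ordered representation concrete, here is a minimal sketch of a penalty that measures how badly a pair of vectors violates a partial order. The abstract does not spell out the formulation, so this rests on assumptions: it takes the coordinate-wise order on non-negative vectors in which more general items have smaller coordinates, and the function name `order_violation` and the toy vectors are illustrative only.

```python
def order_violation(x, y):
    """Return a non-negative penalty for the ordered pair (x, y).

    Assumption: x counts as "below" y (x is the more specific item, y the
    more general one) when every coordinate of x is >= the matching
    coordinate of y. The penalty is the squared length of the
    coordinate-wise violation max(0, y - x), so it is zero exactly when
    that ordering holds.
    """
    return sum(max(0.0, yi - xi) ** 2 for xi, yi in zip(x, y))


# Toy embeddings (hypothetical values): "dog" is more specific than "animal".
dog = [2.0, 3.0]
animal = [1.0, 1.0]

print(order_violation(dog, animal))  # ordering holds, so no penalty
print(order_violation(animal, dog))  # ordering is violated, so penalty is positive
```

Because the penalty is asymmetric, swapping the arguments changes the result. That asymmetry is what lets a model of this kind represent a hierarchy rather than a symmetric similarity.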

Results from the Paper


Task: Natural Language Inference
Dataset: SNLI
Model: 1024D GRU encoders w/ unsupervised 'skip-thoughts' pre-training

Metric Name        Metric Value   Global Rank
% Test Accuracy    81.4           # 89
% Train Accuracy   98.8           # 3
Parameters         15m            # 4
