Learning to Read Maps: Understanding Natural Language Instructions from Unseen Maps

ACL (splurobonlp) 2021 · Miltiadis Marios Katsakioris, Ioannis Konstas, Pierre Yves Mignotte, Helen Hastie

Robust situated dialog requires the ability to process instructions based on spatial information, which may or may not be available. We propose a model, based on LXMERT, that can extract spatial information from text instructions and attend to landmarks on OpenStreetMap (OSM) referred to in a natural language instruction. Whilst OSM is a valuable resource, as with any open-sourced data, there is noise and variation in the names referred to on the map, as well as variation in natural language instructions, hence the need for data-driven methods over rule-based systems. This paper demonstrates that the gold GPS location can be accurately predicted from the natural language instruction and metadata with 72% accuracy for previously seen maps and 64% for unseen maps.
