1 code implementation • 4 Apr 2021 • Hyounghun Kim, Abhay Zala, Graham Burri, Mohit Bansal
During the correctional-captioning task, models must generate descriptions of how to move from the current pose image to the target pose image, whereas in the retrieval task, models should select the correct target pose given the initial pose and the correctional description.
no code implementations • Findings of the Association for Computational Linguistics 2020 • Hyounghun Kim, Abhay Zala, Graham Burri, Hao Tan, Mohit Bansal
During this task, the agent (similar to a PokeMON GO player) is asked to find and collect different target objects one-by-one by navigating based on natural language instructions in a complex, realistic outdoor environment, but then also ARRAnge the collected objects part-by-part in an egocentric grid-layout environment.