The development of recommender systems that optimize multi-turn interaction with users, and model the interactions of different agents (e. g., users, content providers, vendors) in the recommender ecosystem have drawn increasing attention in recent years.
Vision-and-Language Navigation wayfinding agents can be enhanced by exploiting automatically generated navigation instructions.
Identifying a short segment in a long video that semantically matches a text query is a challenging task that has important application potentials in language-based video search, browsing, and navigation.
Recent progresses have leveraged the ideas of pre-training (from language modeling) and attention layers in Transformers to learn representation from datasets containing images aligned with linguistic expressions that describe the images.
To this end, we propose BabyWalk, a new VLN agent that is learned to navigate by decomposing long instructions into shorter ones (BabySteps) and completing them sequentially.
Recent research efforts enable study for natural language grounded navigation in photo-realistic environments, e. g., following natural language instructions or dialog.
VALAN is a lightweight and scalable software framework for deep reinforcement learning based on the SEED RL architecture.
We propose RecSim, a configurable platform for authoring simulation environments for recommender systems (RSs) that naturally supports sequential interaction with users.
Vision-and-Language Navigation (VLN) tasks such as Room-to-Room (R2R) require machine agents to interpret natural language instructions and learn to act in visually realistic environments to achieve navigation goals.
Ranked #75 on Vision and Language Navigation on VLN Challenge
We address fundamental flaws in previously used metrics and show how Dynamic Time Warping (DTW), a long known method of measuring similarity between two time series, can be used for evaluation of navigation agents.
Vision-and-Language Navigation (VLN) is a natural language grounding task where agents have to interpret natural language instructions in the context of visual scenes in a dynamic environment to achieve prescribed navigation goals.
We also show that the existing paths in the dataset are not ideal for evaluating instruction following because they are direct-to-goal shortest paths.
no code implementations • 29 May 2019 • Eugene Ie, Vihan Jain, Jing Wang, Sanmit Narvekar, Ritesh Agarwal, Rui Wu, Heng-Tze Cheng, Morgane Lustman, Vince Gatto, Paul Covington, Jim McFadden, Tushar Chandra, Craig Boutilier
(i) We develop SLATEQ, a decomposition of value-based temporal-difference and Q-learning that renders RL tractable with slates.
25 code implementations • 24 Jun 2016 • Heng-Tze Cheng, Levent Koc, Jeremiah Harmsen, Tal Shaked, Tushar Chandra, Hrishi Aradhye, Glen Anderson, Greg Corrado, Wei Chai, Mustafa Ispir, Rohan Anil, Zakaria Haque, Lichan Hong, Vihan Jain, Xiaobing Liu, Hemal Shah
Memorization of feature interactions through a wide set of cross-product feature transformations are effective and interpretable, while generalization requires more feature engineering effort.
Ranked #3 on Click-Through Rate Prediction on Bing News