Search Results for author: Ruta Desai

Found 12 papers, 3 papers with code

Human-Centered Planning

no code implementations • 8 Nov 2023 • Yuliang Li, Nitin Kamra, Ruta Desai, Alon Halevy

The vision of creating AI-powered personal assistants also involves producing structured outputs, such as a plan for one's day or for an overseas trip.

Adaptive Coordination in Social Embodied Rearrangement

no code implementations • 31 May 2023 • Andrew Szot, Unnat Jain, Dhruv Batra, Zsolt Kira, Ruta Desai, Akshara Rai

We present the task of "Social Rearrangement", consisting of cooperative everyday tasks like setting up the dinner table, tidying a house or unpacking groceries in a simulated multi-agent environment.
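As a rough intuition for the kind of coordination problem this task poses, here is a minimal toy sketch of a cooperative rearrangement episode; the environment, the greedy claim-an-object policies, and every name below are illustrative assumptions, not the paper's simulated multi-agent setup:

```python
# Toy two-agent rearrangement: each agent greedily claims an unfinished,
# unclaimed object and moves it to its goal location (hypothetical sketch).
from dataclasses import dataclass, field

@dataclass
class ToySocialRearrangeEnv:
    # object -> (current location, goal location)
    objects: dict = field(default_factory=lambda: {
        "plate": ("counter", "table"), "cup": ("sink", "table")})

    def done(self):
        return all(cur == goal for cur, goal in self.objects.values())

    def step(self, agent_id, action):
        kind, obj = action
        if kind == "move":  # teleport-style move to the goal, for brevity
            _, goal = self.objects[obj]
            self.objects[obj] = (goal, goal)

def greedy_policy(env, agent_id, claimed):
    # claim the first misplaced object no teammate has claimed this round
    for obj, (cur, goal) in env.objects.items():
        if cur != goal and obj not in claimed:
            claimed.add(obj)
            return ("move", obj)
    return ("wait", None)

env = ToySocialRearrangeEnv()
while not env.done():
    claimed = set()
    for agent in (0, 1):
        env.step(agent, greedy_policy(env, agent, claimed))
print("table set:", env.objects)
```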

Pretrained Language Models as Visual Planners for Human Assistance

1 code implementation • ICCV 2023 • Dhruvesh Patel, Hamid Eghbalzadeh, Nitin Kamra, Michael Louis Iuzzolino, Unnat Jain, Ruta Desai

Given a succinct natural language goal, e.g., "make a shelf", and a video of the user's progress so far, the aim of VPA is to devise a plan, i.e., a sequence of actions such as "sand shelf", "paint shelf", etc.

Action Segmentation • Language Modelling
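To make the VPA interface concrete (goal plus video in, next actions out), here is a minimal sketch; the `segment_video` stub and the lookup-table "language model" are hypothetical stand-ins for the paper's action-segmentation and fine-tuned LM components:

```python
# Hypothetical VPA-style planner: recover completed steps from the video,
# then autoregressively predict the next actions toward the goal.
NEXT_STEP = {  # toy stand-in for a language model's next-action prediction
    None: "sand shelf", "sand shelf": "paint shelf", "paint shelf": "mount shelf"}

def segment_video(video_frames):
    """Stand-in action segmenter: pretend it recovers completed steps."""
    return ["sand shelf"]  # e.g., the user has already sanded

def visual_planning_for_assistance(goal, video_frames, horizon=2):
    history = segment_video(video_frames)
    plan, last = [], (history[-1] if history else None)
    for _ in range(horizon):
        step = NEXT_STEP.get(last)
        if step is None:
            break
        plan.append(step)
        last = step
    return plan

print(visual_planning_for_assistance("make a shelf", video_frames=[]))
# -> ['paint shelf', 'mount shelf']
```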

EgoTV: Egocentric Task Verification from Natural Language Task Descriptions

1 code implementation • ICCV 2023 • Rishi Hazra, Brian Chen, Akshara Rai, Nitin Kamra, Ruta Desai

The goal in EgoTV is to verify the execution of tasks from egocentric videos based on the natural language description of these tasks.
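One simple way to picture this verification setup is as a check that every sub-goal implied by the task description has evidence in the video; the decomposition and the perception stub below are assumptions for exposition, not the paper's model:

```python
# Hypothetical EgoTV-style verifier: the task is verified only if every
# required sub-goal has supporting evidence detected in the video.
def detect_subgoals(video):
    """Stand-in perception: pretend it returns sub-goals seen in the video."""
    return {"pick(apple)", "heat(apple)", "place(apple, plate)"}

def verify(required_subgoals, video):
    observed = detect_subgoals(video)
    return required_subgoals <= observed  # set containment: all required seen

required = {"pick(apple)", "heat(apple)"}
print(verify(required, video=None))  # -> True
```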

Effective Baselines for Multiple Object Rearrangement Planning in Partially Observable Mapped Environments

no code implementations • 24 Jan 2023 • Engin Tekin, Elaheh Barati, Nitin Kamra, Ruta Desai

This requires efficient trade-offs between exploration of the environment and planning for rearrangement, which is challenging because of the long-horizon nature of the problem.

Object • Object Recognition
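The exploration/rearrangement trade-off can be illustrated with a toy heuristic that services known misplaced objects first and explores only when nothing actionable remains; this policy and all names are illustrative assumptions, not the paper's baselines:

```python
# Toy decision rule for partially observed rearrangement: rearrange what you
# already know is misplaced, otherwise spend the step exploring.
def plan_next(known_objects, goals, unexplored_rooms):
    for obj, loc in known_objects.items():
        if obj in goals and loc != goals[obj]:
            return ("rearrange", obj, goals[obj])
    if unexplored_rooms:
        return ("explore", unexplored_rooms[0])
    return ("done",)

known = {"book": "sofa"}
goals = {"book": "shelf", "mug": "sink"}  # the mug has not been found yet
print(plan_next(known, goals, ["kitchen", "bedroom"]))
# -> ('rearrange', 'book', 'shelf'): act on known objects before exploring
```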

Action Dynamics Task Graphs for Learning Plannable Representations of Procedural Tasks

no code implementations • 11 Jan 2023 • Weichao Mao, Ruta Desai, Michael Louis Iuzzolino, Nitin Kamra

Given video demonstrations and paired narrations of an at-home procedural task such as changing a tire, we present an approach to extract the underlying task structure -- relevant actions and their temporal dependencies -- via action-centric task graphs.

Task 2
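A task graph of this kind can be represented as a DAG whose nodes are actions and whose edges are temporal dependencies; any topological order is then a valid plan. A minimal sketch with stdlib tools, where the tire-changing steps and edges are illustrative rather than extracted by the paper's method:

```python
# Action-centric task graph as a DAG: action -> actions that must come first.
from graphlib import TopologicalSorter  # Python 3.9+

deps = {
    "jack up car": {"loosen lug nuts"},
    "remove wheel": {"jack up car"},
    "mount spare": {"remove wheel"},
    "lower car": {"mount spare"},
    "tighten lug nuts": {"lower car"},
}
plan = list(TopologicalSorter(deps).static_order())
print(plan)
# ['loosen lug nuts', 'jack up car', 'remove wheel', 'mount spare',
#  'lower car', 'tighten lug nuts'] -- one plannable ordering of the task
```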

Cross-Domain Transfer via Semantic Skill Imitation

no code implementations • 14 Dec 2022 • Karl Pertsch, Ruta Desai, Vikash Kumar, Franziska Meier, Joseph J. Lim, Dhruv Batra, Akshara Rai

We propose an approach for semantic imitation, which uses demonstrations from a source domain, e.g., human videos, to accelerate reinforcement learning (RL) in a different target domain, e.g., a robotic manipulator in a simulated kitchen.

Reinforcement Learning (RL) • Robot Manipulation
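One simple way to operationalize the idea of imitating a demonstration at the semantic-skill level, shown here as a hypothetical shaping bonus (the paper's actual objective may differ), is to reward the target-domain agent each time it completes the next skill in the demonstrated sequence:

```python
# Hypothetical semantic-imitation shaping: a demo reduced to semantic skills
# grants a bonus whenever the agent matches the next skill in order.
def shaped_reward(env_reward, demo_skills, progress, executed_skill, bonus=1.0):
    """progress: index of the next demonstrated skill still to be matched."""
    if progress < len(demo_skills) and executed_skill == demo_skills[progress]:
        return env_reward + bonus, progress + 1
    return env_reward, progress

demo = ["open drawer", "grasp mug", "place mug on shelf"]
r, p = shaped_reward(0.0, demo, 0, "open drawer")
print(r, p)  # -> 1.0 1 : bonus granted, advance to the next demo skill
```

Matching at the level of skill names, rather than raw states or actions, is what lets a human video guide a robot whose low-level dynamics look nothing like the demonstration.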

Learning a Visually Grounded Memory Assistant

no code implementations • 7 Oct 2022 • Meera Hahn, Kevin Carlberg, Ruta Desai, James Hillis

We introduce a novel interface for large-scale collection of human memory and assistance.

Episodic Memory Question Answering

no code implementations • CVPR 2022 • Samyak Datta, Sameer Dharur, Vincent Cartillier, Ruta Desai, Mukul Khanna, Dhruv Batra, Devi Parikh

Towards that end, we introduce (1) a new task - Episodic Memory Question Answering (EMQA) wherein an egocentric AI assistant is provided with a video sequence (the tour) and a question as an input and is asked to localize its answer to the question within the tour, (2) a dataset of grounded questions designed to probe the agent's spatio-temporal understanding of the tour, and (3) a model for the task that encodes the scene as an allocentric, top-down semantic feature map and grounds the question into the map to localize the answer.

Question Answering
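To make the "ground the question into a top-down map" step concrete, here is a toy sketch that assumes the tour is already encoded as an allocentric label grid; the encoding and localization rule are illustrative simplifications of the paper's learned model:

```python
# EMQA-style localization sketch: answer a "where is X?" question by finding
# the queried category's cells in a top-down semantic map (toy label grid).
import numpy as np

semantic_map = np.array([  # 0 = free space, 1 = sofa, 2 = plant
    [0, 0, 2],
    [1, 1, 0],
    [0, 0, 0],
])
LABELS = {"sofa": 1, "plant": 2}

def localize_answer(question_category, sem_map):
    ys, xs = np.where(sem_map == LABELS[question_category])
    return list(zip(ys.tolist(), xs.tolist()))  # map cells answering the query

print(localize_answer("sofa", semantic_map))  # -> [(1, 0), (1, 1)]
```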

How You Move Your Head Tells What You Do: Self-supervised Video Representation Learning with Egocentric Cameras and IMU Sensors

no code implementations • 4 Oct 2021 • Satoshi Tsutsui, Ruta Desai, Karl Ridgeway

We are particularly interested in learning egocentric video representations benefiting from the head-motion generated by users' daily activities, which can be easily obtained from IMU sensors embedded in AR/VR devices.

Representation Learning • Self-Supervised Learning
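A natural self-supervision signal here, assuming a contrastive setup (an assumption for illustration; the paper's exact objective may differ), treats a video clip and the IMU window recorded at the same time as a positive pair and misaligned windows as negatives:

```python
# Contrastive video-IMU pairing sketch: time-aligned pairs should score
# higher than misaligned ones. Encoders are stand-in arrays, not networks.
import numpy as np

rng = np.random.default_rng(0)
video_feats = rng.normal(size=(4, 8))                     # 4 clip embeddings
imu_feats = video_feats + 0.1 * rng.normal(size=(4, 8))   # matched IMU windows

def similarity_logits(v, m):
    v = v / np.linalg.norm(v, axis=1, keepdims=True)
    m = m / np.linalg.norm(m, axis=1, keepdims=True)
    return v @ m.T  # entry (i, j): similarity of clip i and IMU window j

logits = similarity_logits(video_feats, imu_feats)
# training would push the diagonal (aligned pairs) above the off-diagonal
print(np.argmax(logits, axis=1))  # -> [0 1 2 3] when alignment dominates
```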

Optimal Assistance for Object-Rearrangement Tasks in Augmented Reality

no code implementations • 14 Oct 2020 • Benjamin Newman, Kevin Carlberg, Ruta Desai

We introduce a novel framework for computing and displaying AR assistance that consists of (1) associating an optimal action sequence with the policy of an embodied agent and (2) presenting this sequence to the user as suggestions in the AR system's heads-up display.
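The two stages described above can be sketched as (1) rolling a greedy action sequence out of an agent policy and (2) surfacing it as heads-up-display suggestions; the tiny deterministic "policy" and renderer below are hypothetical stand-ins:

```python
# AR assistance sketch: extract the optimal action sequence from a policy,
# then present it as HUD suggestions (all names illustrative).
POLICY = {  # state -> (optimal action, resulting state)
    "start": ("pick up book", "holding book"),
    "holding book": ("walk to shelf", "at shelf"),
    "at shelf": ("place book", "done"),
}

def optimal_sequence(state):
    seq = []
    while state in POLICY:
        action, state = POLICY[state]
        seq.append(action)
    return seq

def render_hud(actions):
    for i, a in enumerate(actions, 1):
        print(f"[AR HUD] suggestion {i}: {a}")

render_hud(optimal_sequence("start"))
```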
