Search Results for author: Tianmin Shu

Found 40 papers, 13 papers with code

RealWebAssist: A Benchmark for Long-Horizon Web Assistance with Real-World Users

no code implementations 14 Apr 2025 Suyu Ye, Haojun Shi, Darren Shih, Hyokun Yun, Tanya Roosta, Tianmin Shu

For instance, real-world human instructions can be ambiguous, require different levels of AI assistance, and may evolve over time, reflecting changes in the user's mental state.

Instruction Following

AutoToM: Automated Bayesian Inverse Planning and Model Discovery for Open-ended Theory of Mind

2 code implementations 21 Feb 2025 Zhining Zhang, Chuanyang Jin, Mung Yao Jia, Tianmin Shu

Based on the uncertainty of the inference, it iteratively refines the model, by introducing additional mental variables and/or incorporating more timesteps in the context.
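The uncertainty-driven refinement loop described above can be sketched as follows. This is a minimal illustration, not AutoToM's implementation: the goal hypotheses, the likelihood function, and the entropy threshold are all hypothetical stand-ins, and refinement here only incorporates more timesteps (the paper can also introduce additional mental variables, such as beliefs).

```python
import math

def goal_posterior(goals, likelihood, observations):
    """Bayesian inverse planning: P(goal | obs) proportional to the product of P(a_t | goal)."""
    scores = {g: math.prod(likelihood(g, a) for a in observations) for g in goals}
    z = sum(scores.values())
    return {g: s / z for g, s in scores.items()}

def entropy(dist):
    return -sum(p * math.log(p) for p in dist.values() if p > 0)

def infer_with_refinement(goals, likelihood, trajectory, threshold=0.5):
    """Start from a minimal model (one timestep) and incorporate more
    timesteps until the posterior over goals is confident enough."""
    for t in range(1, len(trajectory) + 1):
        post = goal_posterior(goals, likelihood, trajectory[:t])
        if entropy(post) < threshold:
            break  # uncertainty is low enough: stop refining the model
    return post, t
```

With a toy likelihood that makes goal-consistent actions 9x more probable, a single observed step toward "apple" already drops the posterior entropy below the threshold, so the loop stops after one timestep.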

Model Discovery

GenEx: Generating an Explorable World

no code implementations 12 Dec 2024 Taiming Lu, Tianmin Shu, Junfei Xiao, Luoxin Ye, Jiahao Wang, Cheng Peng, Chen Wei, Daniel Khashabi, Rama Chellappa, Alan Yuille, Jieneng Chen

In this work, we take a step toward this goal by introducing GenEx, a system capable of planning complex embodied world exploration, guided by its generative imagination that forms priors (expectations) about the surrounding environments.

Generative World Explorer

no code implementations 18 Nov 2024 Taiming Lu, Tianmin Shu, Alan Yuille, Daniel Khashabi, Jieneng Chen

Our experimental results demonstrate that (1) Genex can generate high-quality and consistent observations during long-horizon exploration of a large virtual physical world and (2) the beliefs updated with the generated observations can inform an existing decision-making model (e.g., an LLM agent) to make better plans.

Few-Shot Task Learning through Inverse Generative Modeling

no code implementations 7 Nov 2024 Aviv Netanyahu, Yilun Du, Antonia Bronars, Jyothish Pari, Joshua Tenenbaum, Tianmin Shu, Pulkit Agrawal

Then, given a few demonstrations of a new concept (such as a new goal or a new action), our method learns the underlying concepts through backpropagation without updating the model weights, thanks to the invertibility of the generative model.
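The key trick, optimizing a concept representation through a frozen generative model, can be illustrated with a toy linear "generator". Everything below is synthetic: in the paper the generator is a learned invertible model and gradients flow through it by backpropagation, but the point shown here is just that the model weights stay fixed while the concept is inferred.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 2))      # frozen "generative model": concept -> demonstration
c_true = np.array([1.0, -2.0])   # the underlying concept behind the demo
demo = W @ c_true                # the observed few-shot demonstration

c = np.zeros(2)                  # learnable concept embedding
for _ in range(500):
    grad = 2 * W.T @ (W @ c - demo)  # gradient of ||W c - demo||^2 w.r.t. c only
    c -= 0.05 * grad                 # note: W itself is never updated
```

After a few hundred gradient steps on the concept alone, `c` recovers `c_true` even though no model weight was touched.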

Autonomous Driving Novel Concepts +1

Constrained Human-AI Cooperation: An Inclusive Embodied Social Intelligence Challenge

1 code implementation 4 Nov 2024 Weihua Du, Qiushi Lyu, Jiaming Shan, Zhenting Qi, Hongxin Zhang, Sunli Chen, Andi Peng, Tianmin Shu, Kwonjoon Lee, Behzad Dariush, Chuang Gan

In CHAIC, the goal is for an embodied agent equipped with egocentric observations to assist a human who may be operating under physical constraints -- e.g., unable to reach high places or confined to a wheelchair -- in performing common household or outdoor tasks as efficiently as possible.

SIFToM: Robust Spoken Instruction Following through Theory of Mind

no code implementations 17 Sep 2024 Lance Ying, Jason Xinyu Liu, Shivam Aarya, Yizirui Fang, Stefanie Tellex, Joshua B. Tenenbaum, Tianmin Shu

We present a cognitively inspired model, Speech Instruction Following through Theory of Mind (SIFToM), to enable robots to pragmatically follow human instructions under diverse speech conditions by inferring the human's goal and joint plan as prior for speech perception and understanding.
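The core pragmatic fusion, a plan-derived prior reweighting a noisy speech likelihood, reduces to a single Bayes step. A minimal sketch with invented instructions and probabilities:

```python
def interpret(speech_likelihood, plan_prior):
    """P(instruction | audio, goal) is proportional to
    P(audio | instruction) * P(instruction | inferred plan)."""
    joint = {u: speech_likelihood[u] * plan_prior.get(u, 1e-6)
             for u in speech_likelihood}
    z = sum(joint.values())
    return {u: p / z for u, p in joint.items()}

# Noisy audio leaves two candidate instructions nearly tied ...
speech = {"grab the jam": 0.5, "grab the ham": 0.5}
# ... but the human's inferred goal (making toast) strongly favors one.
plan = {"grab the jam": 0.9, "grab the ham": 0.1}
posterior = interpret(speech, plan)
```

The goal-derived prior breaks the acoustic tie, which is the sense in which goal and plan inference serve as a prior for speech understanding.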

Instruction Following Task Planning

MuMA-ToM: Multi-modal Multi-Agent Theory of Mind

2 code implementations 22 Aug 2024 Haojun Shi, Suyu Ye, Xinyu Fang, Chuanyang Jin, Leyla Isik, Yen-Ling Kuo, Tianmin Shu

To truly understand how and why people interact with one another, we must infer the underlying mental states that give rise to the social interactions, i.e., Theory of Mind reasoning in multi-agent interactions.

Pragmatic Feature Preferences: Learning Reward-Relevant Preferences from Human Input

no code implementations 23 May 2024 Andi Peng, Yuying Sun, Tianmin Shu, David Abel

We derive an approach for learning from these feature-level preferences, both for cases where users specify which features are reward-relevant, and when users do not.

COMBO: Compositional World Models for Embodied Multi-Agent Cooperation

no code implementations 16 Apr 2024 Hongxin Zhang, Zeyuan Wang, Qiushi Lyu, Zheyuan Zhang, Sunli Chen, Tianmin Shu, Behzad Dariush, Kwonjoon Lee, Yilun Du, Chuang Gan

To effectively plan in this setting, in contrast to learning world dynamics in a single-agent scenario, we must simulate world dynamics conditioned on an arbitrary number of agents' actions given only partial egocentric visual observations of the world.

GOMA: Proactive Embodied Cooperative Communication via Goal-Oriented Mental Alignment

no code implementations 17 Mar 2024 Lance Ying, Kunal Jha, Shivam Aarya, Joshua B. Tenenbaum, Antonio Torralba, Tianmin Shu

GOMA formulates verbal communication as a planning problem that minimizes the misalignment between the parts of agents' mental states that are relevant to the goals.
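As a toy version of that objective, an utterance can be scored by how much it would reduce the mismatch between the speaker's belief and a simulated listener's belief on goal-relevant facts. All names and beliefs below are illustrative, not GOMA's actual representation:

```python
def misalignment(speaker_belief, listener_belief, relevant):
    """Count goal-relevant facts on which the two mental states disagree."""
    return sum(speaker_belief[k] != listener_belief.get(k) for k in relevant)

def best_utterance(speaker_belief, listener_belief, relevant, utterances):
    """Choose the utterance (a single asserted fact) whose simulated effect
    on the listener minimizes goal-relevant misalignment."""
    def after(utterance):
        key, value = utterance
        return misalignment(speaker_belief, {**listener_belief, key: value}, relevant)
    return min(utterances, key=after)

speaker = {"knife": "drawer", "cup": "sink"}
listener = {"knife": "table", "cup": "sink"}          # wrong about the knife
utterance = best_utterance(speaker, listener,
                           relevant={"knife"},
                           utterances=[("knife", "drawer"), ("cup", "shelf")])
```

Only the goal-relevant part of the mental state matters: announcing the cup's location would change the listener's belief without reducing misalignment on what the goal needs, so it is never selected.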

MMToM-QA: Multimodal Theory of Mind Question Answering

1 code implementation 16 Jan 2024 Chuanyang Jin, Yutong Wu, Jing Cao, Jiannan Xiang, Yen-Ling Kuo, Zhiting Hu, Tomer Ullman, Antonio Torralba, Joshua B. Tenenbaum, Tianmin Shu

To engineer multimodal ToM capacity, we propose a novel method, BIP-ALM (Bayesian Inverse Planning Accelerated by Language Models).

Question Answering Theory of Mind Modeling

Language Models, Agent Models, and World Models: The LAW for Machine Reasoning and Planning

no code implementations 8 Dec 2023 Zhiting Hu, Tianmin Shu

Despite their tremendous success in many applications, large language models often fall short of consistent reasoning and planning in various (language, embodied, and social) scenarios, due to inherent limitations in their inference, learning, and modeling capabilities.

Neural Amortized Inference for Nested Multi-agent Reasoning

1 code implementation 21 Aug 2023 Kunal Jha, Tuan Anh Le, Chuanyang Jin, Yen-Ling Kuo, Joshua B. Tenenbaum, Tianmin Shu

Multi-agent interactions, such as communication, teaching, and bluffing, often rely on higher-order social inference, i.e., understanding how others infer oneself.
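Nested social inference can be sketched as a level-k recursion in which each agent best-responds to a level-(k-1) model of the other; the paper's contribution is to amortize this expensive recursion with a neural network, which the memoization cache below only loosely stands in for. The hide-and-seek game and its bias are invented for illustration:

```python
from functools import lru_cache

HIDER_BIAS = {"A": 0.7, "B": 0.3}   # a level-0 hider's known preference over spots

@lru_cache(maxsize=None)            # crude stand-in for amortizing repeated inference
def act(level):
    """Even levels hide, odd levels seek; each level models the other one level down."""
    if level == 0:
        return max(HIDER_BIAS, key=HIDER_BIAS.get)   # level-0 just follows its bias
    predicted = act(level - 1)                       # infer what the other would do
    if level % 2 == 1:                               # seeker: search the predicted spot
        return predicted
    return "B" if predicted == "A" else "A"          # hider: avoid the predicted search
```

A level-1 seeker searches the biased spot, and a level-2 hider, reasoning about that seeker, switches away from it; this is exactly the kind of nested reasoning whose cost grows with depth.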

Diagnosis, Feedback, Adaptation: A Human-in-the-Loop Framework for Test-Time Policy Adaptation

no code implementations 12 Jul 2023 Andi Peng, Aviv Netanyahu, Mark Ho, Tianmin Shu, Andreea Bobu, Julie Shah, Pulkit Agrawal

Policies often fail due to distribution shift -- changes in the state and reward that occur when a policy is deployed in new environments.

Continuous Control +2

Building Cooperative Embodied Agents Modularly with Large Language Models

1 code implementation 5 Jul 2023 Hongxin Zhang, Weihua Du, Jiaming Shan, Qinhong Zhou, Yilun Du, Joshua B. Tenenbaum, Tianmin Shu, Chuang Gan

In this work, we address challenging multi-agent cooperation problems with decentralized control, raw sensory observations, costly communication, and multi-objective tasks instantiated in various embodied environments.

Text Generation

Language Models Meet World Models: Embodied Experiences Enhance Language Models

1 code implementation NeurIPS 2023 Jiannan Xiang, Tianhua Tao, Yi Gu, Tianmin Shu, ZiRui Wang, Zichao Yang, Zhiting Hu

While large language models (LMs) have shown remarkable capabilities across numerous tasks, they often struggle with simple reasoning and planning in physical environments, such as understanding object permanence or planning household activities.

NOPA: Neurally-guided Online Probabilistic Assistance for Building Socially Intelligent Home Assistants

no code implementations 12 Jan 2023 Xavier Puig, Tianmin Shu, Joshua B. Tenenbaum, Antonio Torralba

Experiments show that our helper agent robustly updates its goal inference and adapts its helping plans to the changing level of uncertainty.
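The uncertainty-adaptive helping idea can be sketched as: commit to the most likely goal's plan when the inferred posterior is confident, and otherwise take a step shared by all plausible goals. The goals, plans, and confidence threshold below are hypothetical, not NOPA's actual machinery:

```python
def choose_help(goal_posterior, plans, confident=0.8):
    """Pick a helping action given a posterior over the human's goal."""
    top = max(goal_posterior, key=goal_posterior.get)
    if goal_posterior[top] >= confident:
        return plans[top][0]                 # commit to the likely goal's plan
    plausible = [g for g, p in goal_posterior.items() if p > 0.05]
    shared = set.intersection(*(set(plans[g]) for g in plausible))
    return sorted(shared)[0] if shared else "wait"  # hedge with a common step

plans = {"set_table": ["grab_plate", "goto_kitchen"],
         "cook":      ["goto_kitchen", "grab_pan"]}
uncertain_act = choose_help({"set_table": 0.5, "cook": 0.5}, plans)   # hedging
confident_act = choose_help({"set_table": 0.9, "cook": 0.1}, plans)  # committing
```

When the goal is ambiguous the helper takes the action useful under either hypothesis, and once the posterior sharpens it commits, mirroring the adaptation to changing uncertainty described above.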

Discovering Generalizable Spatial Goal Representations via Graph-based Active Reward Learning

no code implementations 24 Nov 2022 Aviv Netanyahu, Tianmin Shu, Joshua Tenenbaum, Pulkit Agrawal

To address this, we propose a reward learning approach, Graph-based Equivalence Mappings (GEM), that can discover spatial goal representations that are aligned with the intended goal specification, enabling successful generalization in unseen environments.

AI Agent Imitation Learning +1

Stateful Active Facilitator: Coordination and Environmental Heterogeneity in Cooperative Multi-Agent Reinforcement Learning

2 code implementations 4 Oct 2022 Dianbo Liu, Vedant Shah, Oussama Boussif, Cristian Meo, Anirudh Goyal, Tianmin Shu, Michael Mozer, Nicolas Heess, Yoshua Bengio

We formalize the notions of coordination level and heterogeneity level of an environment and present HECOGrid, a suite of multi-agent RL environments that facilitates empirical evaluation of different MARL approaches across different levels of coordination and environmental heterogeneity by providing a quantitative control over coordination and heterogeneity levels of the environment.
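The idea of exposing coordination and heterogeneity as quantitative knobs can be sketched as an environment config whose coordination level interpolates between individually-earnable and joint reward. The field names and mixing rule below are illustrative, not HECOGrid's actual API:

```python
from dataclasses import dataclass

@dataclass
class GridConfig:
    n_agents: int = 2
    coordination_level: float = 0.5   # fraction of reward requiring joint action
    heterogeneity_level: float = 0.0  # fraction of per-agent dynamics that differ

    def reward(self, solo_reward: float, joint_reward: float) -> float:
        """Mix independently earnable and coordination-dependent reward."""
        c = self.coordination_level
        return (1 - c) * solo_reward + c * joint_reward
```

At `coordination_level=0.0` agents can succeed alone, and at `1.0` all reward requires joint action, which is the kind of continuous control over task difficulty the benchmark argues for.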

Multi-agent Reinforcement Learning Reinforcement Learning +1

Show Me What You Can Do: Capability Calibration on Reachable Workspace for Human-Robot Collaboration

no code implementations 6 Mar 2021 Xiaofeng Gao, Luyao Yuan, Tianmin Shu, Hongjing Lu, Song-Chun Zhu

Our experiments with human participants demonstrate that a short calibration using REMP can effectively bridge the gap between what a non-expert user thinks a robot can reach and the ground truth.

Motion Planning

PHASE: PHysically-grounded Abstract Social Events for Machine Social Perception

no code implementations NeurIPS Workshop SVRHM 2020 Aviv Netanyahu, Tianmin Shu, Boris Katz, Andrei Barbu, Joshua B. Tenenbaum

The ability to perceive and reason about social interactions in the context of physical environments is core to human social intelligence and human-machine cooperation.

AGENT: A Benchmark for Core Psychological Reasoning

no code implementations 24 Feb 2021 Tianmin Shu, Abhishek Bhandwaldar, Chuang Gan, Kevin A. Smith, Shari Liu, Dan Gutfreund, Elizabeth Spelke, Joshua B. Tenenbaum, Tomer D. Ullman

For machine agents to successfully interact with humans in real-world settings, they will need to develop an understanding of human mental life.

Core Psychological Reasoning

Joint Mind Modeling for Explanation Generation in Complex Human-Robot Collaborative Tasks

no code implementations 24 Jul 2020 Xiaofeng Gao, Ran Gong, Yizhou Zhao, Shu Wang, Tianmin Shu, Song-Chun Zhu

Thus, in this paper, we propose a novel explainable AI (XAI) framework for achieving human-like communication in human-robot collaborations, where the robot builds a hierarchical mind model of the human user and generates explanations of its own mind as a form of communications based on its online Bayesian inference of the user's mental state.

Bayesian Inference Explainable Artificial Intelligence (XAI) +1

Active Visual Information Gathering for Vision-Language Navigation

1 code implementation ECCV 2020 Hanqing Wang, Wenguan Wang, Tianmin Shu, Wei Liang, Jianbing Shen

Vision-language navigation (VLN) is the task of directing an agent to carry out navigational instructions inside photo-realistic environments.

Vision-Language Navigation

VRKitchen: an Interactive 3D Virtual Environment for Task-oriented Learning

1 code implementation 13 Mar 2019 Xiaofeng Gao, Ran Gong, Tianmin Shu, Xu Xie, Shu Wang, Song-Chun Zhu

One of the main challenges of advancing task-oriented learning such as visual task planning and reinforcement learning is the lack of realistic and standardized environments for training and testing AI agents.

Reinforcement Learning +2

Interactive Agent Modeling by Learning to Probe

no code implementations 1 Oct 2018 Tianmin Shu, Caiming Xiong, Ying Nian Wu, Song-Chun Zhu

In particular, the probing agent (i.e., a learner) learns to interact with the environment and with a target agent (i.e., a demonstrator) to maximize the change in the observed behaviors of that agent.

Imitation Learning Reinforcement Learning

M^3RL: Mind-aware Multi-agent Management Reinforcement Learning

1 code implementation ICLR 2019 Tianmin Shu, Yuandong Tian

Most of the prior work on multi-agent reinforcement learning (MARL) achieves optimal collaboration by directly controlling the agents to maximize a common reward.

Management Multi-agent Reinforcement Learning +3

Hierarchical and Interpretable Skill Acquisition in Multi-task Reinforcement Learning

no code implementations ICLR 2018 Tianmin Shu, Caiming Xiong, Richard Socher

In order to help the agent learn the complex temporal dependencies necessary for the hierarchical policy, we provide it with a stochastic temporal grammar that modulates when to rely on previously learned skills and when to execute new skills.
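A stochastic temporal grammar over skills can be sketched as weighted productions expanded recursively, with terminals corresponding to previously learned skills. The grammar, skill names, and weights below are made up for illustration:

```python
import random

def rollout(grammar, skills, symbol, rng):
    """Expand a stochastic grammar: sample a production for each non-terminal;
    terminals are previously learned skills executed as-is."""
    if symbol in skills:
        return [symbol]
    productions, weights = zip(*grammar[symbol])
    chosen = rng.choices(productions, weights=weights)[0]
    return [s for sub in chosen for s in rollout(grammar, skills, sub, rng)]

skills = {"find_ore", "mine_block"}
grammar = {"GET_GOLD": [(("find_ore", "mine_block"), 0.7),   # usually scout first
                        (("mine_block",), 0.3)]}             # sometimes dig directly
plan = rollout(grammar, skills, "GET_GOLD", random.Random(0))
```

The sampled weights play the modulating role described above: they decide, at each expansion, whether to fall back on a previously learned skill sequence or take the shorter direct route.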

Minecraft Reinforcement Learning +2

CERN: Confidence-Energy Recurrent Network for Group Activity Recognition

no code implementations CVPR 2017 Tianmin Shu, Sinisa Todorovic, Song-Chun Zhu

This work is about recognizing human activities occurring in videos at distinct semantic levels, including individual actions, interactions, and group activities.

Group Activity Recognition

Learning Social Affordance Grammar from Videos: Transferring Human Interactions to Human-Robot Interactions

no code implementations 1 Mar 2017 Tianmin Shu, Xiaofeng Gao, Michael S. Ryoo, Song-Chun Zhu

In this paper, we present a general framework for learning social affordance grammar as a spatiotemporal AND-OR graph (ST-AOG) from RGB-D videos of human interactions, and transfer the grammar to humanoids to enable a real-time motion inference for human-robot interaction (HRI).

Modeling and Inferring Human Intents and Latent Functional Objects for Trajectory Prediction

no code implementations 24 Jun 2016 Dan Xie, Tianmin Shu, Sinisa Todorovic, Song-Chun Zhu

This paper is about detecting functional objects and inferring human intentions in surveillance videos of public spaces.

Clustering Trajectory Prediction

Learning Social Affordance for Human-Robot Interaction

no code implementations 13 Apr 2016 Tianmin Shu, M. S. Ryoo, Song-Chun Zhu

In this paper, we present an approach for robot learning of social affordance from human activity videos.

Weakly-supervised Learning

Joint Inference of Groups, Events and Human Roles in Aerial Videos

no code implementations CVPR 2015 Tianmin Shu, Dan Xie, Brandon Rothrock, Sinisa Todorovic, Song-Chun Zhu

This paper addresses a new problem of parsing low-resolution aerial videos of large spatial areas, in terms of 1) grouping, 2) recognizing events and 3) assigning roles to people engaged in events.
