no code implementations • 14 Apr 2025 • Suyu Ye, Haojun Shi, Darren Shih, Hyokun Yun, Tanya Roosta, Tianmin Shu
For instance, real-world human instructions can be ambiguous, require different levels of AI assistance, and may evolve over time, reflecting changes in the user's mental state.
2 code implementations • 21 Feb 2025 • Zhining Zhang, Chuanyang Jin, Mung Yao Jia, Tianmin Shu
Based on the uncertainty of the inference, it iteratively refines the model, by introducing additional mental variables and/or incorporating more timesteps in the context.
no code implementations • 12 Dec 2024 • Taiming Lu, Tianmin Shu, Junfei Xiao, Luoxin Ye, Jiahao Wang, Cheng Peng, Chen Wei, Daniel Khashabi, Rama Chellappa, Alan Yuille, Jieneng Chen
In this work, we take a step toward this goal by introducing GenEx, a system capable of planning complex embodied world exploration, guided by its generative imagination that forms priors (expectations) about the surrounding environments.
no code implementations • 18 Nov 2024 • Taiming Lu, Tianmin Shu, Alan Yuille, Daniel Khashabi, Jieneng Chen
Our experimental results demonstrate that (1) $\textit{Genex}$ can generate high-quality and consistent observations during long-horizon exploration of a large virtual physical world and (2) the beliefs updated with the generated observations can inform an existing decision-making model (e. g., an LLM agent) to make better plans.
no code implementations • 7 Nov 2024 • Aviv Netanyahu, Yilun Du, Antonia Bronars, Jyothish Pari, Joshua Tenenbaum, Tianmin Shu, Pulkit Agrawal
Then, given a few demonstrations of a new concept (such as a new goal or a new action), our method learns the underlying concepts through backpropagation without updating the model weights, thanks to the invertibility of the generative model.
1 code implementation • 4 Nov 2024 • Weihua Du, Qiushi Lyu, Jiaming Shan, Zhenting Qi, Hongxin Zhang, Sunli Chen, Andi Peng, Tianmin Shu, Kwonjoon Lee, Behzad Dariush, Chuang Gan
In CHAIC, the goal is for an embodied agent equipped with egocentric observations to assist a human who may be operating under physical constraints -- e. g., unable to reach high places or confined to a wheelchair -- in performing common household or outdoor tasks as efficiently as possible.
no code implementations • 17 Sep 2024 • Lance Ying, Jason Xinyu Liu, Shivam Aarya, Yizirui Fang, Stefanie Tellex, Joshua B. Tenenbaum, Tianmin Shu
We present a cognitively inspired model, Speech Instruction Following through Theory of Mind (SIFToM), to enable robots to pragmatically follow human instructions under diverse speech conditions by inferring the human's goal and joint plan as prior for speech perception and understanding.
2 code implementations • 22 Aug 2024 • Haojun Shi, Suyu Ye, Xinyu Fang, Chuanyang Jin, Leyla Isik, Yen-Ling Kuo, Tianmin Shu
To truly understand how and why people interact with one another, we must infer the underlying mental states that give rise to the social interactions, i. e., Theory of Mind reasoning in multi-agent interactions.
no code implementations • 23 May 2024 • Andi Peng, Yuying Sun, Tianmin Shu, David Abel
We derive an approach for learning from these feature-level preferences, both for cases where users specify which features are reward-relevant, and when users do not.
no code implementations • 16 Apr 2024 • Hongxin Zhang, Zeyuan Wang, Qiushi Lyu, Zheyuan Zhang, Sunli Chen, Tianmin Shu, Behzad Dariush, Kwonjoon Lee, Yilun Du, Chuang Gan
To effectively plan in this setting, in contrast to learning world dynamics in a single-agent scenario, we must simulate world dynamics conditioned on an arbitrary number of agents' actions given only partial egocentric visual observations of the world.
no code implementations • 17 Mar 2024 • Lance Ying, Kunal Jha, Shivam Aarya, Joshua B. Tenenbaum, Antonio Torralba, Tianmin Shu
GOMA formulates verbal communication as a planning problem that minimizes the misalignment between the parts of agents' mental states that are relevant to the goals.
1 code implementation • 16 Jan 2024 • Chuanyang Jin, Yutong Wu, Jing Cao, Jiannan Xiang, Yen-Ling Kuo, Zhiting Hu, Tomer Ullman, Antonio Torralba, Joshua B. Tenenbaum, Tianmin Shu
To engineer multimodal ToM capacity, we propose a novel method, BIP-ALM (Bayesian Inverse Planning Accelerated by Language Models).
no code implementations • 8 Dec 2023 • Zhiting Hu, Tianmin Shu
Despite their tremendous success in many applications, large language models often fall short of consistent reasoning and planning in various (language, embodied, and social) scenarios, due to inherent limitations in their inference, learning, and modeling capabilities.
no code implementations • 28 Aug 2023 • Chuanyang Jin, Songyang Zhang, Tianmin Shu, Zhihan Cui
In our research, we probed the cultural cognitive traits of ChatGPT.
1 code implementation • 21 Aug 2023 • Kunal Jha, Tuan Anh Le, Chuanyang Jin, Yen-Ling Kuo, Joshua B. Tenenbaum, Tianmin Shu
Multi-agent interactions, such as communication, teaching, and bluffing, often rely on higher-order social inference, i. e., understanding how others infer oneself.
no code implementations • 12 Jul 2023 • Andi Peng, Aviv Netanyahu, Mark Ho, Tianmin Shu, Andreea Bobu, Julie Shah, Pulkit Agrawal
Policies often fail due to distribution shift -- changes in the state and reward that occur when a policy is deployed in new environments.
1 code implementation • 5 Jul 2023 • Hongxin Zhang, Weihua Du, Jiaming Shan, Qinhong Zhou, Yilun Du, Joshua B. Tenenbaum, Tianmin Shu, Chuang Gan
In this work, we address challenging multi-agent cooperation problems with decentralized control, raw sensory observations, costly communication, and multi-objective tasks instantiated in various embodied environments.
1 code implementation • NeurIPS 2023 • Jiannan Xiang, Tianhua Tao, Yi Gu, Tianmin Shu, ZiRui Wang, Zichao Yang, Zhiting Hu
While large language models (LMs) have shown remarkable capabilities across numerous tasks, they often struggle with simple reasoning and planning in physical environments, such as understanding object permanence or planning household activities.
no code implementations • 12 Jan 2023 • Xavier Puig, Tianmin Shu, Joshua B. Tenenbaum, Antonio Torralba
Experiments show that our helper agent robustly updates its goal inference and adapts its helping plans to the changing level of uncertainty.
no code implementations • 24 Nov 2022 • Aviv Netanyahu, Tianmin Shu, Joshua Tenenbaum, Pulkit Agrawal
To address this, we propose a reward learning approach, Graph-based Equivalence Mappings (GEM), that can discover spatial goal representations that are aligned with the intended goal specification, enabling successful generalization in unseen environments.
2 code implementations • 4 Oct 2022 • Dianbo Liu, Vedant Shah, Oussama Boussif, Cristian Meo, Anirudh Goyal, Tianmin Shu, Michael Mozer, Nicolas Heess, Yoshua Bengio
We formalize the notions of coordination level and heterogeneity level of an environment and present HECOGrid, a suite of multi-agent RL environments that facilitates empirical evaluation of different MARL approaches across different levels of coordination and environmental heterogeneity by providing a quantitative control over coordination and heterogeneity levels of the environment.
Multi-agent Reinforcement Learning
reinforcement-learning
+1
1 code implementation • 25 May 2022 • Mingkai Deng, Jianyu Wang, Cheng-Ping Hsieh, Yihan Wang, Han Guo, Tianmin Shu, Meng Song, Eric P. Xing, Zhiting Hu
RLPrompt formulates a parameter-efficient policy network that generates the desired discrete prompt after training with reward.
no code implementations • 21 May 2022 • Dianbo Liu, Vedant Shah, Oussama Boussif, Cristian Meo, Anirudh Goyal, Tianmin Shu, Michael Mozer, Nicolas Heess, Yoshua Bengio
In Multi-Agent Reinforcement Learning (MARL), specialized channels are often introduced that allow agents to communicate directly with one another.
Intelligent Communication
Multi-agent Reinforcement Learning
+2
no code implementations • 6 Mar 2021 • Xiaofeng Gao, Luyao Yuan, Tianmin Shu, Hongjing Lu, Song-Chun Zhu
Our experiments with human participants demonstrate that a short calibration using REMP can effectively bridge the gap between what a non-expert user thinks a robot can reach and the ground truth.
no code implementations • NeurIPS Workshop SVRHM 2020 • Aviv Netanyahu, Tianmin Shu, Boris Katz, Andrei Barbu, Joshua B. Tenenbaum
The ability to perceive and reason about social interactions in the context of physical environments is core to human social intelligence and human-machine cooperation.
no code implementations • 24 Feb 2021 • Tianmin Shu, Abhishek Bhandwaldar, Chuang Gan, Kevin A. Smith, Shari Liu, Dan Gutfreund, Elizabeth Spelke, Joshua B. Tenenbaum, Tomer D. Ullman
For machine agents to successfully interact with humans in real-world settings, they will need to develop an understanding of human mental life.
Ranked #1 on
Core Psychological Reasoning
on AGENT
1 code implementation • ICLR 2021 • Xavier Puig, Tianmin Shu, Shuang Li, Zilin Wang, Yuan-Hong Liao, Joshua B. Tenenbaum, Sanja Fidler, Antonio Torralba
In this paper, we introduce Watch-And-Help (WAH), a challenge for testing social intelligence in agents.
no code implementations • 24 Jul 2020 • Xiaofeng Gao, Ran Gong, Yizhou Zhao, Shu Wang, Tianmin Shu, Song-Chun Zhu
Thus, in this paper, we propose a novel explainable AI (XAI) framework for achieving human-like communication in human-robot collaborations, where the robot builds a hierarchical mind model of the human user and generates explanations of its own mind as a form of communications based on its online Bayesian inference of the user's mental state.
Bayesian Inference
Explainable Artificial Intelligence (XAI)
+1
1 code implementation • ECCV 2020 • Hanqing Wang, Wenguan Wang, Tianmin Shu, Wei Liang, Jianbing Shen
Vision-language navigation (VLN) is the task of entailing an agent to carry out navigational instructions inside photo-realistic environments.
no code implementations • ICLR 2019 • Tianmin Shu, Yuandong Tian
Most of the prior work on multi-agent reinforcement learning (MARL) achieves optimal collaboration by directly controlling the agents to maximize a common reward.
1 code implementation • 13 Mar 2019 • Xiaofeng Gao, Ran Gong, Tianmin Shu, Xu Xie, Shu Wang, Song-Chun Zhu
One of the main challenges of advancing task-oriented learning such as visual task planning and reinforcement learning is the lack of realistic and standardized environments for training and testing AI agents.
no code implementations • 1 Oct 2018 • Tianmin Shu, Caiming Xiong, Ying Nian Wu, Song-Chun Zhu
In particular, the probing agent (i. e. a learner) learns to interact with the environment and with a target agent (i. e., a demonstrator) to maximize the change in the observed behaviors of that agent.
1 code implementation • ICLR 2019 • Tianmin Shu, Yuandong Tian
Most of the prior work on multi-agent reinforcement learning (MARL) achieves optimal collaboration by directly controlling the agents to maximize a common reward.
no code implementations • CVPR 2018 • Ping Wei, Yang Liu, Tianmin Shu, Nanning Zheng, Song-Chun Zhu
We built a new video dataset of tasks, intentions, and attention.
no code implementations • ICLR 2018 • Tianmin Shu, Caiming Xiong, Richard Socher
In order to help the agent learn the complex temporal dependencies necessary for the hierarchical policy, we provide it with a stochastic temporal grammar that modulates when to rely on previously learned skills and when to execute new skills.
no code implementations • CVPR 2017 • Tianmin Shu, Sinisa Todorovic, Song-Chun Zhu
This work is about recognizing human activities occurring in videos at distinct semantic levels, including individual actions, interactions, and group activities.
Ranked #12 on
Group Activity Recognition
on Volleyball
no code implementations • 1 Mar 2017 • Tianmin Shu, Xiaofeng Gao, Michael S. Ryoo, Song-Chun Zhu
In this paper, we present a general framework for learning social affordance grammar as a spatiotemporal AND-OR graph (ST-AOG) from RGB-D videos of human interactions, and transfer the grammar to humanoids to enable a real-time motion inference for human-robot interaction (HRI).
no code implementations • 24 Jun 2016 • Dan Xie, Tianmin Shu, Sinisa Todorovic, Song-Chun Zhu
This paper is about detecting functional objects and inferring human intentions in surveillance videos of public spaces.
no code implementations • 13 Apr 2016 • Tianmin Shu, M. S. Ryoo, Song-Chun Zhu
In this paper, we present an approach for robot learning of social affordance from human activity videos.
no code implementations • CVPR 2015 • Tianmin Shu, Dan Xie, Brandon Rothrock, Sinisa Todorovic, Song-Chun Zhu
This paper addresses a new problem of parsing low-resolution aerial videos of large spatial areas, in terms of 1) grouping, 2) recognizing events and 3) assigning roles to people engaged in events.