no code implementations • 5 Feb 2024 • Andi Peng, Andreea Bobu, Belinda Z. Li, Theodore R. Sumers, Ilia Sucholutsky, Nishanth Kumar, Thomas L. Griffiths, Julie A. Shah
We observe that how humans behave reveals how they see the world.
no code implementations • 18 Oct 2023 • Ilia Sucholutsky, Lukas Muttenthaler, Adrian Weller, Andi Peng, Andreea Bobu, Been Kim, Bradley C. Love, Erin Grant, Iris Groen, Jascha Achterberg, Joshua B. Tenenbaum, Katherine M. Collins, Katherine L. Hermann, Kerem Oktar, Klaus Greff, Martin N. Hebart, Nori Jacoby, Qiuyi Zhang, Raja Marjieh, Robert Geirhos, Sherol Chen, Simon Kornblith, Sunayana Rane, Talia Konkle, Thomas P. O'Connell, Thomas Unterthiner, Andrew K. Lampinen, Klaus-Robert Müller, Mariya Toneva, Thomas L. Griffiths
Finally, we lay out open problems in representational alignment where progress can benefit all three of these fields.
no code implementations • 12 Jul 2023 • Andi Peng, Aviv Netanyahu, Mark Ho, Tianmin Shu, Andreea Bobu, Julie Shah, Pulkit Agrawal
Policies often fail due to distribution shift -- changes in the state and reward that occur when a policy is deployed in new environments.
no code implementations • 3 Feb 2023 • Andreea Bobu, Andi Peng, Pulkit Agrawal, Julie Shah, Anca D. Dragan
To act in the world, robots rely on a representation of salient task aspects: for example, to carry a coffee mug, a robot may consider movement efficiency or mug orientation in its behavior.
no code implementations • 2 Jan 2023 • Andreea Bobu, Yi Liu, Rohin Shah, Daniel S. Brown, Anca D. Dragan
This, in turn, is what enables the robot to disambiguate between what needs to go into the representation and what is spurious, as well as which aspects of behavior can be compressed together and which cannot.
no code implementations • 30 Nov 2022 • David Zhang, Micah Carroll, Andreea Bobu, Anca Dragan
One of the most successful paradigms for reward learning uses human feedback in the form of comparisons.
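As a rough sketch of what comparison-based reward learning can look like (a generic Bradley-Terry formulation, not necessarily the exact model used in this paper), the snippet below fits a hypothetical linear reward over hand-picked trajectory features from a single "A preferred over B" label; the features, dimensions, and learning rate are all illustrative:

```python
import numpy as np

def preference_likelihood(feats_a, feats_b, w):
    """Bradley-Terry probability that a human prefers trajectory A over B,
    given a linear reward r(xi) = w . phi(xi) over trajectory features."""
    r_a, r_b = feats_a @ w, feats_b @ w
    return 1.0 / (1.0 + np.exp(r_b - r_a))  # sigmoid(r_a - r_b)

def grad_log_likelihood(feats_a, feats_b, w, a_preferred=True):
    """Gradient of the log-likelihood of one comparison with respect to w."""
    p_a = preference_likelihood(feats_a, feats_b, w)
    if a_preferred:
        return (1.0 - p_a) * (feats_a - feats_b)   # push w toward A's features
    return -p_a * (feats_a - feats_b)              # push w toward B's features

# Toy usage: two trajectories described by 3 hand-crafted features each.
w = np.zeros(3)
feats_a, feats_b = np.array([1.0, 0.2, 0.0]), np.array([0.3, 0.9, 0.5])
for _ in range(100):                               # fit w to one "A preferred" label
    w += 0.1 * grad_log_likelihood(feats_a, feats_b, w, a_preferred=True)
print(preference_likelihood(feats_a, feats_b, w))  # approaches 1 as w fits the label
```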
no code implementations • 15 May 2022 • Andreea Bobu, Andi Peng
As robots are increasingly deployed in real-world scenarios, a key question is how to best transfer knowledge learned in one environment to another, where shifting constraints and human preferences render adaptation challenging.
no code implementations • 4 Mar 2022 • Arjun Sripathy, Andreea Bobu, Zhongyu Li, Koushil Sreenath, Daniel S. Brown, Anca D. Dragan
As a result, 1) all user feedback can contribute to learning about every emotion; 2) the robot can generate trajectories for any emotion in the space instead of only a few predefined ones; and 3) the robot can respond emotively to user-generated natural language by mapping it to a target point in VAD (valence-arousal-dominance) space.
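A loose illustration of why a continuous VAD space is convenient, not the paper's learned language mapping: a few hypothetical anchor words carry hand-assigned VAD coordinates, and an arbitrary phrase is reduced to a target point by averaging the anchors it mentions, so the target need not coincide with any predefined emotion:

```python
import numpy as np

# Hypothetical anchor words with hand-assigned (valence, arousal, dominance) coordinates.
VAD_ANCHORS = {
    "happy": np.array([0.9, 0.6, 0.6]),
    "angry": np.array([-0.6, 0.8, 0.7]),
    "sad":   np.array([-0.7, 0.2, 0.2]),
    "calm":  np.array([0.6, 0.1, 0.5]),
}

def phrase_to_vad(phrase):
    """Map a phrase to a target VAD point by averaging any anchor words it contains."""
    hits = [v for word, v in VAD_ANCHORS.items() if word in phrase.lower()]
    return np.mean(hits, axis=0) if hits else np.zeros(3)  # neutral fallback

target = phrase_to_vad("a bit sad but mostly calm")
print(target)  # lies between the "sad" and "calm" anchors, not at any predefined emotion
```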
1 code implementation • 18 Jan 2022 • Andreea Bobu, Marius Wiggert, Claire Tomlin, Anca D. Dragan
To get around this issue, recent deep Inverse Reinforcement Learning (IRL) methods learn rewards directly from the raw state, but this is challenging because the robot has to simultaneously and implicitly learn which features are important and how to combine them.
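A minimal sketch of the distinction the excerpt draws, with made-up feature choices, state dimensions, and network sizes rather than the paper's architecture: a feature-based reward only has to learn weights on designer-specified features, whereas a deep IRL-style reward maps the raw state straight to a scalar and must discover the features and their combination at the same time:

```python
import numpy as np

rng = np.random.default_rng(0)

# Feature-based reward: the designer hand-specifies phi, the robot only learns w.
def reward_feature_based(state, w):
    phi = np.array([np.linalg.norm(state[:3]),  # e.g. distance from a reference point
                    state[2]])                  # e.g. height above the table
    return w @ phi

# Deep IRL-style reward: a small MLP maps the raw state directly to a scalar,
# so the features and their combination must be learned simultaneously.
W1, b1 = rng.normal(size=(16, 7)) * 0.1, np.zeros(16)
W2, b2 = rng.normal(size=16) * 0.1, 0.0

def reward_raw_state(state):
    h = np.tanh(W1 @ state + b1)
    return W2 @ h + b2

state = rng.normal(size=7)  # e.g. a 7-DoF arm configuration (illustrative dimension)
print(reward_feature_based(state, np.array([-1.0, 0.5])), reward_raw_state(state))
```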
no code implementations • 9 Nov 2021 • Andreea Bobu, Chris Paxton, Wei Yang, Balakumar Sundaralingam, Yu-Wei Chao, Maya Cakmak, Dieter Fox
Second, we treat this low-dimensional concept as an automatic labeler to synthesize a large-scale high-dimensional data set with the simulator.
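A toy sketch of the auto-labeling idea under invented assumptions (the "upright object" concept, the 4-D pose, and the 64-D stand-in for the simulator's observation are all hypothetical): because the simulator exposes the underlying state, the low-dimensional concept can label every synthesized high-dimensional sample for free:

```python
import numpy as np

rng = np.random.default_rng(1)

def low_dim_concept(object_pose):
    """Hypothetical low-dimensional concept: is the object upright (within ~0.2 rad)?"""
    tilt = object_pose[3]
    return float(abs(tilt) < 0.2)

def simulate_observation(object_pose):
    """Stand-in for a simulator render: a high-dimensional observation of the scene."""
    return np.concatenate([object_pose, rng.normal(scale=0.01, size=60)])

# The simulator knows the underlying pose, so the low-dim concept labels
# every synthesized high-dim sample automatically.
poses = rng.uniform(low=[-1, -1, 0, -0.5], high=[1, 1, 1, 0.5], size=(1000, 4))
dataset = [(simulate_observation(p), low_dim_concept(p)) for p in poses]
print(len(dataset), dataset[0][0].shape, dataset[0][1])
```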
no code implementations • 14 Apr 2021 • Matthew Zurek, Andreea Bobu, Daniel S. Brown, Anca D. Dragan
Shared autonomy enables robots to infer user intent and assist in accomplishing it.
no code implementations • 13 Mar 2021 • Arjun Sripathy, Andreea Bobu, Daniel S. Brown, Anca D. Dragan
As environments involving both robots and humans become increasingly common, so does the need to account for people during planning.
1 code implementation • 23 Jun 2020 • Andreea Bobu, Marius Wiggert, Claire Tomlin, Anca D. Dragan
When the correction cannot be explained by these features, recent work in deep Inverse Reinforcement Learning (IRL) suggests that the robot could ask for task demonstrations and recover a reward defined over the raw state space.
no code implementations • 3 Feb 2020 • Andreea Bobu, Andrea Bajcsy, Jaime F. Fisac, Sampada Deglurkar, Anca D. Dragan
Recent work focuses on how robots can use such input, like demonstrations or corrections, to learn intended objectives.
no code implementations • 13 Jan 2020 • Andreea Bobu, Dexter R. R. Scobee, Jaime F. Fisac, S. Shankar Sastry, Anca D. Dragan
A common model is the Boltzmann noisily-rational decision model, which assumes people approximately optimize a reward function and choose trajectories with probability proportional to their exponentiated reward.
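For concreteness, a minimal numerical sketch of the Boltzmann noisily-rational model over a discrete set of candidate trajectories; the rationality coefficient beta and the candidate rewards are illustrative, not taken from the paper:

```python
import numpy as np

def boltzmann_trajectory_probs(rewards, beta=1.0):
    """Boltzmann noisily-rational choice: P(xi_i) proportional to exp(beta * R(xi_i)).
    beta -> infinity approaches a perfectly rational chooser; beta = 0 is uniform."""
    logits = beta * np.asarray(rewards, dtype=float)
    logits -= logits.max()          # subtract the max to keep the softmax stable
    p = np.exp(logits)
    return p / p.sum()

# Three candidate trajectories with rewards 1.0, 0.5, and -1.0:
print(boltzmann_trajectory_probs([1.0, 0.5, -1.0], beta=2.0))
```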
1 code implementation • 11 Oct 2018 • Andreea Bobu, Andrea Bajcsy, Jaime F. Fisac, Anca D. Dragan
Learning robot objective functions from human input has become increasingly important, but state-of-the-art techniques assume that the human's desired objective lies within the robot's hypothesis space.