Search Results for author: Thomas Kollar

Found 14 papers, 9 papers with code

NeO 360: Neural Fields for Sparse View Synthesis of Outdoor Scenes

1 code implementation ICCV 2023 Muhammad Zubair Irshad, Sergey Zakharov, Katherine Liu, Vitor Guizilini, Thomas Kollar, Adrien Gaidon, Zsolt Kira, Rares Ambrus

NeO 360's representation allows us to learn from a large collection of unbounded 3D scenes while offering generalizability to new views and novel scenes from as few as a single image during inference.

Generalizable Novel View Synthesis Novel View Synthesis

Prismatic VLMs: Investigating the Design Space of Visually-Conditioned Language Models

2 code implementations 12 Feb 2024 Siddharth Karamcheti, Suraj Nair, Ashwin Balakrishna, Percy Liang, Thomas Kollar, Dorsa Sadigh

Visually-conditioned language models (VLMs) have seen growing adoption in applications such as visual dialogue, scene understanding, and robotic task planning; adoption that has fueled a wealth of new models such as LLaVa, InstructBLIP, and PaLI-3.

Hallucination Object Localization +3

Language-Driven Representation Learning for Robotics

2 code implementations 24 Feb 2023 Siddharth Karamcheti, Suraj Nair, Annie S. Chen, Thomas Kollar, Chelsea Finn, Dorsa Sadigh, Percy Liang

First, we demonstrate that existing representations yield inconsistent results across these tasks: masked autoencoding approaches pick up on low-level spatial features at the cost of high-level semantics, while contrastive learning approaches capture the opposite.

Contrastive Learning Imitation Learning +2

SimNet: Enabling Robust Unknown Object Manipulation from Pure Synthetic Data via Stereo

1 code implementation 30 Jun 2021 Thomas Kollar, Michael Laskey, Kevin Stone, Brijen Thananjeyan, Mark Tjersland

However, the RGB-D baseline only grasps 35% of the hard (e.g., transparent) objects, while SimNet grasps 95%, suggesting that SimNet can enable robust manipulation of unknown objects, including transparent objects, in unknown environments.

Keypoint Detection Object +5

A Critical Evaluation of AI Feedback for Aligning Large Language Models

1 code implementation 19 Feb 2024 Archit Sharma, Sedrick Keh, Eric Mitchell, Chelsea Finn, Kushal Arora, Thomas Kollar

RLAIF first performs supervised fine-tuning (SFT) using demonstrations from a teacher model and then further fine-tunes the model with reinforcement learning (RL), using feedback from a critic model.

Instruction Following reinforcement-learning +1

Jointly Learning to Parse and Perceive: Connecting Natural Language to the Physical World

no code implementations TACL 2013 Jayant Krishnamurthy, Thomas Kollar

LSP learns physical representations for both categorical ("blue," "mug") and relational ("on") language, and also learns to compose these representations to produce the referents of entire statements.

Language Acquisition Question Answering +1

A Mobile Manipulation System for One-Shot Teaching of Complex Tasks in Homes

no code implementations 30 Sep 2019 Max Bajracharya, James Borders, Dan Helmick, Thomas Kollar, Michael Laskey, John Leichty, Jeremy Ma, Umashankar Nagarajan, Akiyoshi Ochiai, Josh Petersen, Krishna Shankar, Kevin Stone, Yutaka Takaoka

We describe a mobile manipulation hardware and software system capable of autonomously performing complex human-level tasks in real homes, after being taught the task with a single demonstration from a person in virtual reality.

Position
