Search Results for author: Shuran Song

Found 46 papers, 21 papers with code

Learning Pneumatic Non-Prehensile Manipulation with a Mobile Blower

1 code implementation • 5 Apr 2022 • Jimmy Wu, Xingyuan Sun, Andy Zeng, Shuran Song, Szymon Rusinkiewicz, Thomas Funkhouser

We investigate pneumatic non-prehensile manipulation (i.e., blowing) as a means of efficiently moving scattered objects into a target receptacle.

Continuous Scene Representations for Embodied AI

no code implementations • CVPR 2022 • Samir Yitzhak Gadre, Kiana Ehsani, Shuran Song, Roozbeh Mottaghi

Our method captures feature relationships between objects, composes them into a graph structure on-the-fly, and situates an embodied agent within the representation.

CLIP on Wheels: Zero-Shot Object Navigation as Object Localization and Exploration

no code implementations • 20 Mar 2022 • Samir Yitzhak Gadre, Mitchell Wortsman, Gabriel Ilharco, Ludwig Schmidt, Shuran Song

Employing this philosophy, we design CLIP on Wheels (CoW) baselines for the task and evaluate each zero-shot model in both Habitat and RoboTHOR simulators.

Image Classification • Object Localization

TANDEM: Learning Joint Exploration and Decision Making with Tactile Sensors

no code implementations • 1 Mar 2022 • Jingxi Xu, Shuran Song, Matei Ciocarlie

Inspired by the human ability to perform complex manipulation in the complete absence of vision (like retrieving an object from a pocket), the robotic manipulation field is motivated to develop new methods for tactile-based object interaction.

Decision Making • Efficient Exploration +1

Leveraging SE(3) Equivariance for Self-Supervised Category-Level Object Pose Estimation

no code implementations • NeurIPS 2021 • Xiaolong Li, Yijia Weng, Li Yi, Leonidas Guibas, A. Lynn Abbott, Shuran Song, He Wang

Category-level object pose estimation aims to find 6D object poses of previously unseen object instances from known categories without access to object CAD models.

Pose Estimation • Self-Supervised Learning

Scene Editing as Teleoperation: A Case Study in 6DoF Kit Assembly

no code implementations • 9 Oct 2021 • Shubham Agrawal, Yulong Li, Jen-Shuo Liu, Steven K. Feiner, Shuran Song

To make teleoperation accessible to non-expert users, we propose the framework "Scene Editing as Teleoperation" (SEaT), where the key idea is to transform the traditional "robot-centric" interface into a "scene-centric" interface -- instead of controlling the robot, users focus on specifying the task's goal by manipulating digital twins of the real-world objects.

UMPNet: Universal Manipulation Policy Network for Articulated Objects

no code implementations • 13 Sep 2021 • Zhenjia Xu, Zhanpeng He, Shuran Song

We introduce the Universal Manipulation Policy Network (UMPNet) -- a single image-based policy network that infers closed-loop action sequences for manipulating arbitrary articulated objects.

Learning to See before Learning to Act: Visual Pre-training for Manipulation

no code implementations • 1 Jul 2021 • Lin Yen-Chen, Andy Zeng, Shuran Song, Phillip Isola, Tsung-Yi Lin

With just a small amount of robotic experience, we can further fine-tune the affordance model to achieve better results.

Transfer Learning

Leveraging SE(3) Equivariance for Self-supervised Category-Level Object Pose Estimation from Point Clouds

no code implementations • NeurIPS 2021 • Xiaolong Li, Yijia Weng, Li Yi, Leonidas Guibas, A. Lynn Abbott, Shuran Song, He Wang

To reduce the huge amount of pose annotations needed for category-level learning, we propose for the first time a self-supervised learning framework to estimate category-level 6D object pose from single 3D point clouds.

Pose Estimation • Self-Supervised Learning

Visual Perspective Taking for Opponent Behavior Modeling

no code implementations • 11 May 2021 • Boyuan Chen, Yuhang Hu, Robert Kwiatkowski, Shuran Song, Hod Lipson

We suggest that visual behavior modeling and perspective taking skills will play a critical role in the ability of physical robots to fully integrate into real-world multi-agent activities.

GarmentNets: Category-Level Pose Estimation for Garments via Canonical Space Shape Completion

no code implementations • ICCV 2021 • Cheng Chi, Shuran Song

By mapping the observed partial surface to the canonical space and completing it in this space, the output representation describes the garment's full configuration using a complete 3D mesh with the per-vertex canonical coordinate label.

3D Shape Representation • Pose Estimation

Spatial Intention Maps for Multi-Agent Mobile Manipulation

1 code implementation • 23 Mar 2021 • Jimmy Wu, Xingyuan Sun, Andy Zeng, Shuran Song, Szymon Rusinkiewicz, Thomas Funkhouser

The ability to communicate intention enables decentralized multi-agent robots to collaborate while performing physical tasks.


SSCNav: Confidence-Aware Semantic Scene Completion for Visual Semantic Navigation

1 code implementation • 8 Dec 2020 • Yiqing Liang, Boyuan Chen, Shuran Song

We introduce SSCNav, an algorithm that explicitly models scene priors using a confidence-aware semantic scene completion module to complete the scene and guide the agent's navigation planning.

Perspectives on Sim2Real Transfer for Robotics: A Summary of the R:SS 2020 Workshop

no code implementations • 7 Dec 2020 • Sebastian Höfer, Kostas Bekris, Ankur Handa, Juan Camilo Gamboa, Florian Golemo, Melissa Mozifian, Chris Atkeson, Dieter Fox, Ken Goldberg, John Leonard, C. Karen Liu, Jan Peters, Shuran Song, Peter Welinder, Martha White

This report presents the debates, posters, and discussions of the Sim2Real workshop held in conjunction with the 2020 edition of the "Robotics: Science and System" conference.

AdaGrasp: Learning an Adaptive Gripper-Aware Grasping Policy

1 code implementation • 28 Nov 2020 • Zhenjia Xu, Beichun Qi, Shubham Agrawal, Shuran Song

We propose AdaGrasp, a method to learn a single grasping policy that generalizes to novel grippers.


Fit2Form: 3D Generative Model for Robot Gripper Form Design

1 code implementation • 12 Nov 2020 • Huy Ha, Shubham Agrawal, Shuran Song

We propose Fit2Form, a 3D generative design framework that generates pairs of finger shapes to maximize design objectives (i.e., grasp success, stability, and robustness) for target grasp objects.

Learning a Decentralized Multi-arm Motion Planner

1 code implementation • 5 Nov 2020 • Huy Ha, Jingxi Xu, Shuran Song

In this paper, we tackle this problem with multi-agent reinforcement learning, where a decentralized policy is trained to control one robot arm in the multi-arm system to reach its target end-effector pose given observations of its workspace state and target end-effector pose.

Motion Planning • Multi-agent Reinforcement Learning +1

Learning 3D Dynamic Scene Representations for Robot Manipulation

2 code implementations • 3 Nov 2020 • Zhenjia Xu, Zhanpeng He, Jiajun Wu, Shuran Song

3D scene representation for robot manipulation should capture three key object properties: permanency -- objects that become occluded over time continue to exist; amodal completeness -- objects have 3D occupancy, even if only partial observations are available; spatiotemporal continuity -- the movement of each object is continuous over space and time.

Multitask Learning Strengthens Adversarial Robustness

1 code implementation • ECCV 2020 • Chengzhi Mao, Amogh Gupta, Vikram Nitin, Baishakhi Ray, Shuran Song, Junfeng Yang, Carl Vondrick

Although deep networks achieve strong accuracy on a range of computer vision benchmarks, they remain vulnerable to adversarial attacks, where imperceptible input perturbations fool the network.

Adversarial Defense • Adversarial Robustness +1

Spatial Action Maps for Mobile Manipulation

1 code implementation • 20 Apr 2020 • Jimmy Wu, Xingyuan Sun, Andy Zeng, Shuran Song, Johnny Lee, Szymon Rusinkiewicz, Thomas Funkhouser

Typical end-to-end formulations for learning robotic navigation involve predicting a small set of steering command actions (e.g., step forward, turn left, turn right, etc.).

Q-Learning • Value prediction

Category-Level Articulated Object Pose Estimation

2 code implementations • CVPR 2020 • Xiaolong Li, He Wang, Li Yi, Leonidas Guibas, A. Lynn Abbott, Shuran Song

We develop a deep network based on PointNet++ that predicts ANCSH from a single depth point cloud, including part segmentation, normalized coordinates, and joint parameters in the canonical object space.

Pose Estimation

Grasping in the Wild: Learning 6DoF Closed-Loop Grasping from Low-Cost Demonstrations

no code implementations • 9 Dec 2019 • Shuran Song, Andy Zeng, Johnny Lee, Thomas Funkhouser

A key aspect of our grasping model is that it uses "action-view" based rendering to simulate future states with respect to different possible actions.

Form2Fit: Learning Shape Priors for Generalizable Assembly from Disassembly

1 code implementation • 30 Oct 2019 • Kevin Zakka, Andy Zeng, Johnny Lee, Shuran Song

This formulation enables the model to acquire a broader understanding of how shapes and surfaces fit together for assembly -- allowing it to generalize to new objects and kits.

Pose Estimation

Visual Hide and Seek

no code implementations • 15 Oct 2019 • Boyuan Chen, Shuran Song, Hod Lipson, Carl Vondrick

We train embodied agents to play Visual Hide and Seek where a prey must navigate in a simulated environment in order to avoid capture from a predator.

ClearGrasp: 3D Shape Estimation of Transparent Objects for Manipulation

1 code implementation • 6 Oct 2019 • Shreeyak S. Sajjan, Matthew Moore, Mike Pan, Ganesh Nagaraja, Johnny Lee, Andy Zeng, Shuran Song

To address these challenges, we present ClearGrasp -- a deep learning approach for estimating accurate 3D geometry of transparent objects from a single RGB-D image for robotic manipulation.

Depth Completion • Monocular Depth Estimation +4

Neural Illumination: Lighting Prediction for Indoor Environments

no code implementations • CVPR 2019 • Shuran Song, Thomas Funkhouser

This paper addresses the task of estimating the light arriving from all directions to a 3D point observed at a selected pixel in an RGB image.

TossingBot: Learning to Throw Arbitrary Objects with Residual Physics

no code implementations • 27 Mar 2019 • Andy Zeng, Shuran Song, Johnny Lee, Alberto Rodriguez, Thomas Funkhouser

In this work, we propose an end-to-end formulation that jointly learns to infer control parameters for grasping and throwing motion primitives from visual observations (images of arbitrary objects in a bin) through trial and error.

Neural Graph Matching Networks for Fewshot 3D Action Recognition

no code implementations • ECCV 2018 • Michelle Guo, Edward Chou, De-An Huang, Shuran Song, Serena Yeung, Li Fei-Fei

We propose Neural Graph Matching (NGM) Networks, a novel framework that can learn to recognize a previously unseen 3D action class with only a few examples.

Few-Shot Learning • Graph Matching +1

Im2Pano3D: Extrapolating 360° Structure and Semantics Beyond the Field of View

no code implementations • CVPR 2018 • Shuran Song, Andy Zeng, Angel X. Chang, Manolis Savva, Silvio Savarese, Thomas Funkhouser

We present Im2Pano3D, a convolutional neural network that generates a dense prediction of 3D structure and a probability distribution of semantic labels for a full 360° panoramic view of an indoor scene when given only a partial observation (<= 50%) in the form of an RGB-D image.

Learning Synergies between Pushing and Grasping with Self-supervised Deep Reinforcement Learning

4 code implementations • 27 Mar 2018 • Andy Zeng, Shuran Song, Stefan Welker, Johnny Lee, Alberto Rodriguez, Thomas Funkhouser

Skilled robotic manipulation benefits from complex synergies between non-prehensile (e.g., pushing) and prehensile (e.g., grasping) actions: pushing can help rearrange cluttered objects to make space for arms and fingers; likewise, grasping can help displace objects to make pushing movements more precise and collision-free.

Q-Learning • reinforcement-learning

Im2Pano3D: Extrapolating 360 Structure and Semantics Beyond the Field of View

no code implementations • 12 Dec 2017 • Shuran Song, Andy Zeng, Angel X. Chang, Manolis Savva, Silvio Savarese, Thomas Funkhouser

We present Im2Pano3D, a convolutional neural network that generates a dense prediction of 3D structure and a probability distribution of semantic labels for a full 360 panoramic view of an indoor scene when given only a partial observation (<= 50%) in the form of an RGB-D image.

Physically-Based Rendering for Indoor Scene Understanding Using Convolutional Neural Networks

no code implementations • CVPR 2017 • Yinda Zhang, Shuran Song, Ersin Yumer, Manolis Savva, Joon-Young Lee, Hailin Jin, Thomas Funkhouser

One of the bottlenecks in training for better representations is the amount of available per-pixel ground truth data that is required for core scene understanding tasks such as semantic segmentation, normal prediction, and object edge detection.

Boundary Detection • Computer Vision +5

Semantic Scene Completion from a Single Depth Image

3 code implementations • CVPR 2017 • Shuran Song, Fisher Yu, Andy Zeng, Angel X. Chang, Manolis Savva, Thomas Funkhouser

This paper focuses on semantic scene completion, a task for producing a complete 3D voxel representation of volumetric occupancy and semantic labels for a scene from a single-view depth map observation.

3D Semantic Scene Completion

Multi-view Self-supervised Deep Learning for 6D Pose Estimation in the Amazon Picking Challenge

2 code implementations • 29 Sep 2016 • Andy Zeng, Kuan-Ting Yu, Shuran Song, Daniel Suo, Ed Walker Jr., Alberto Rodriguez, Jianxiong Xiao

The approach was part of the MIT-Princeton Team system that took 3rd and 4th place in the stowing and picking tasks, respectively, at APC 2016.

6D Pose Estimation • 6D Pose Estimation using RGBD

3DMatch: Learning Local Geometric Descriptors from RGB-D Reconstructions

2 code implementations • CVPR 2017 • Andy Zeng, Shuran Song, Matthias Nießner, Matthew Fisher, Jianxiong Xiao, Thomas Funkhouser

To amass training data for our model, we propose a self-supervised feature learning method that leverages the millions of correspondence labels found in existing RGB-D reconstructions.

3D Reconstruction • Point Cloud Registration

Deep Sliding Shapes for Amodal 3D Object Detection in RGB-D Images

no code implementations • CVPR 2016 • Shuran Song, Jianxiong Xiao

We focus on the task of amodal 3D object detection in RGB-D images, which aims to produce a 3D bounding box of an object in metric form at its full extent.

3D Object Detection • object-detection +2

Robot In a Room: Toward Perfect Object Recognition in Closed Environments

no code implementations • 9 Jul 2015 • Shuran Song, Linguang Zhang, Jianxiong Xiao

By constraining a robot to stay in a limited territory, we can ensure that the robot has seen most objects before and the speed of introducing a new object is slow.

Object Recognition

LSUN: Construction of a Large-scale Image Dataset using Deep Learning with Humans in the Loop

4 code implementations • 10 Jun 2015 • Fisher Yu, Ari Seff, Yinda Zhang, Shuran Song, Thomas Funkhouser, Jianxiong Xiao

While there has been remarkable progress in the performance of visual recognition algorithms, the state-of-the-art models tend to be exceptionally data-hungry.

SUN RGB-D: A RGB-D Scene Understanding Benchmark Suite

no code implementations • CVPR 2015 • Shuran Song, Samuel P. Lichtenberg, Jianxiong Xiao

Although RGB-D sensors have enabled major breakthroughs for several vision tasks, such as 3D reconstruction, we have not attained the same level of success in high-level scene understanding.

3D Reconstruction • Benchmark +1

3D ShapeNets: A Deep Representation for Volumetric Shapes

no code implementations • CVPR 2015 • Zhirong Wu, Shuran Song, Aditya Khosla, Fisher Yu, Linguang Zhang, Xiaoou Tang, Jianxiong Xiao

Our model, 3D ShapeNets, learns the distribution of complex 3D shapes across different object categories and arbitrary poses from raw CAD data, and discovers hierarchical compositional part representations automatically.

Ranked #23 on 3D Point Cloud Classification on ModelNet40 (Mean Accuracy metric)

3D Point Cloud Classification • 3D Shape Representation +2
