Search Results for author: Wentao Yuan

Found 11 papers, 7 papers with code

M2T2: Multi-Task Masked Transformer for Object-centric Pick and Place

no code implementations • 2 Nov 2023 • Wentao Yuan, Adithyavairavan Murali, Arsalan Mousavian, Dieter Fox

With the advent of large language models and large-scale robotic datasets, there has been tremendous progress in high-level decision-making for object manipulation.

Decision Making valid

Evaluating Robustness of Visual Representations for Object Assembly Task Requiring Spatio-Geometrical Reasoning

no code implementations • 15 Oct 2023 • Chahyon Ku, Carl Winge, Ryan Diaz, Wentao Yuan, Karthik Desingh

We present a novel task scenario designed to evaluate the progress in visuomotor policy learning, with a specific focus on improving the robustness of intricate assembly tasks that require both geometrical and spatial reasoning.

Benchmarking

Sobolev Training for Implicit Neural Representations with Approximated Image Derivatives

1 code implementation • 21 Jul 2022 • Wentao Yuan, Qingtian Zhu, Xiangyue Liu, Yikang Ding, Haotian Zhang, Chi Zhang

Recently, Implicit Neural Representations (INRs) parameterized by neural networks have emerged as a powerful and promising tool to represent different kinds of signals due to its continuous, differentiable properties, showing superiorities to classical discretized representations.

Inverse Rendering

KD-MVS: Knowledge Distillation Based Self-supervised Learning for Multi-view Stereo

1 code implementation • 21 Jul 2022 • Yikang Ding, Qingtian Zhu, Xiangyue Liu, Wentao Yuan, Haotian Zhang, Chi Zhang

Supervised multi-view stereo (MVS) methods have achieved remarkable progress in terms of reconstruction quality, but suffer from the challenge of collecting large-scale ground-truth depth.

Knowledge Distillation, Self-Supervised Learning

TransMVSNet: Global Context-aware Multi-view Stereo Network with Transformers

1 code implementation • CVPR 2022 • Yikang Ding, Wentao Yuan, Qingtian Zhu, Haotian Zhang, Xiangyue Liu, Yuanjiang Wang, Xiao Liu

We analogize MVS back to its nature of a feature matching task and therefore propose a powerful Feature Matching Transformer (FMT) to leverage intra- (self-) and inter- (cross-) attention to aggregate long-range context information within and across images.

3D Reconstruction, Feature Correlation

SORNet: Spatial Object-Centric Representations for Sequential Manipulation

1 code implementation • 8 Sep 2021 • Wentao Yuan, Chris Paxton, Karthik Desingh, Dieter Fox

Sequential manipulation tasks require a robot to perceive the state of an environment and plan a sequence of actions leading to a desired goal state.

Object Relation Classification +1

Self-Supervised Learning on 3D Point Clouds by Learning Discrete Generative Models

no code implementations • CVPR 2021 • Benjamin Eckart, Wentao Yuan, Chao Liu, Jan Kautz

In this work, we introduce a general method for 3D self-supervised representation learning that 1) remains agnostic to the underlying neural network architecture, and 2) specifically leverages the geometric nature of 3D point cloud data.

Point Cloud Segmentation Representation Learning +4

STaR: Self-supervised Tracking and Reconstruction of Rigid Objects in Motion with Neural Rendering

no code implementations • CVPR 2021 • Wentao Yuan, Zhaoyang Lv, Tanner Schmidt, Steven Lovegrove

We achieve this by jointly optimizing the parameters of two neural radiance fields and a set of rigid poses which align the two fields at each frame.

Neural Rendering Object

Iterative Transformer Network for 3D Point Cloud

1 code implementation • 27 Nov 2018 • Wentao Yuan, David Held, Christoph Mertz, Martial Hebert

Recently, neural networks operating on point clouds have shown superior performance on 3D understanding tasks such as shape classification and part segmentation.

General Classification Object +1

PCN: Point Completion Network

5 code implementations • 2 Aug 2018 • Wentao Yuan, Tejas Khot, David Held, Christoph Mertz, Martial Hebert

Shape completion, the problem of estimating the complete geometry of objects from partial observations, lies at the core of many vision and robotics applications.

Point Cloud Completion
