Search Results for author: Jiayuan Gu

Found 21 papers, 12 papers with code

SpatialVLA: Exploring Spatial Representations for Visual-Language-Action Model

no code implementations 27 Jan 2025 Delin Qu, Haoming Song, Qizhi Chen, Yuanqi Yao, Xinyi Ye, Yan Ding, Zhigang Wang, Jiayuan Gu, Bin Zhao, Dong Wang, Xuelong Li

Specifically, we introduce Ego3D Position Encoding to inject 3D information into the input observations of the visual-language-action model, and propose Adaptive Action Grids to represent spatial robot movement actions with adaptive discretized action grids, facilitating learning generalizable and transferable spatial action knowledge for cross-robot control.

Robot Manipulation

Point-SAM: Promptable 3D Segmentation Model for Point Clouds

1 code implementation 25 Jun 2024 Yuchen Zhou, Jiayuan Gu, Tung Yen Chiang, Fanbo Xiang, Hao Su

The development of 2D foundation models for image segmentation has been significantly advanced by the Segment Anything Model (SAM).

Image Segmentation Segmentation +1

Generic 3D Diffusion Adapter Using Controlled Multi-View Editing

1 code implementation 18 Mar 2024 Hansheng Chen, Ruoxi Shi, Yulin Liu, Bokui Shen, Jiayuan Gu, Gordon Wetzstein, Hao Su, Leonidas Guibas

Open-domain 3D object synthesis has been lagging behind image synthesis due to limited data and higher computational complexity.

3D Generation Image to 3D +2

One-2-3-45++: Fast Single Image to 3D Objects with Consistent Multi-View Generation and 3D Diffusion

no code implementations CVPR 2024 Minghua Liu, Ruoxi Shi, Linghao Chen, Zhuoyang Zhang, Chao Xu, Xinyue Wei, Hansheng Chen, Chong Zeng, Jiayuan Gu, Hao Su

Recent advancements in open-world 3D object generation have been remarkable, with image-to-3D methods offering superior fine-grained control over their text-to-3D counterparts.

Image Generation Image to 3D +1

Automatic Error Analysis for Document-level Information Extraction

1 code implementation ACL 2022 Aliva Das, Xinya Du, Barry Wang, Kejian Shi, Jiayuan Gu, Thomas Porter, Claire Cardie

Document-level information extraction (IE) tasks have recently begun to be revisited in earnest using the end-to-end neural network techniques that have been successful on their sentence-level IE counterparts.

Event Extraction Relation Extraction +1

Multi-skill Mobile Manipulation for Object Rearrangement

1 code implementation 6 Sep 2022 Jiayuan Gu, Devendra Singh Chaplot, Hao Su, Jitendra Malik

To tackle the entire task, prior work chains multiple stationary manipulation skills with a point-goal navigation skill, which are learned individually on subtasks.

Object

Deep Feedback Inverse Problem Solver

no code implementations ECCV 2020 Wei-Chiu Ma, Shenlong Wang, Jiayuan Gu, Sivabalan Manivasagam, Antonio Torralba, Raquel Urtasun

Specifically, at each iteration, the neural network takes the feedback as input and outputs an update on the current estimation.

Pose Estimation

Compositionally Generalizable 3D Structure Prediction

1 code implementation 4 Dec 2020 Songfang Han, Jiayuan Gu, Kaichun Mo, Li Yi, Siyu Hu, Xuejin Chen, Hao Su

However, there remains a much more difficult and under-explored issue on how to generalize the learned skills over unseen object categories that have very different shape geometry distributions.

3D Shape Reconstruction Object +2

Towards Scale-Invariant Graph-related Problem Solving by Iterative Homogeneous GNNs

no code implementations NeurIPS 2020 Hao Tang, Zhiao Huang, Jiayuan Gu, Bao-liang Lu, Hao Su

Current graph neural networks (GNNs) lack generalizability with respect to scales (graph sizes, graph diameters, edge weights, etc.) when solving many graph analysis problems.

Towards Scale-Invariant Graph-related Problem Solving by Iterative Homogeneous Graph Neural Networks

1 code implementation 26 Oct 2020 Hao Tang, Zhiao Huang, Jiayuan Gu, Bao-liang Lu, Hao Su

Current graph neural networks (GNNs) lack generalizability with respect to scales (graph sizes, graph diameters, edge weights, etc.) when solving many graph analysis problems.

Weakly-supervised 3D Shape Completion in the Wild

no code implementations ECCV 2020 Jiayuan Gu, Wei-Chiu Ma, Sivabalan Manivasagam, Wenyuan Zeng, ZiHao Wang, Yuwen Xiong, Hao Su, Raquel Urtasun

3D shape completion for real data is important but challenging, since partial point clouds acquired by real-world sensors are usually sparse, noisy and unaligned.

Point Cloud Registration Pose Estimation

Multi-view PointNet for 3D Scene Understanding

no code implementations 30 Sep 2019 Maximilian Jaritz, Jiayuan Gu, Hao Su

Fusion of 2D images and 3D point clouds is important because information from dense images can enhance sparse point clouds.

3D Instance Segmentation 3D Semantic Segmentation +2

Towards Understanding Learning Representations: To What Extent Do Different Neural Networks Learn the Same Representation

1 code implementation NeurIPS 2018 Liwei Wang, Lunjia Hu, Jiayuan Gu, Yue Wu, Zhiqiang Hu, Kun He, John Hopcroft

The theory gives a complete characterization of the structure of neuron activation subspace matches, where the core concepts are maximum match and simple match which describe the overall and the finest similarity between sets of neurons in two networks respectively.

Learning Region Features for Object Detection

no code implementations ECCV 2018 Jiayuan Gu, Han Hu, Li-Wei Wang, Yichen Wei, Jifeng Dai

While most steps in the modern object detection methods are learnable, the region feature extraction step remains largely hand-crafted, featured by RoI pooling methods.

Object object-detection +1

Relation Networks for Object Detection

6 code implementations CVPR 2018 Han Hu, Jiayuan Gu, Zheng Zhang, Jifeng Dai, Yichen Wei

Although it has long been believed that modeling relations between objects would help object recognition, there has been no evidence that the idea works in the deep learning era.

Object object-detection +3
