Search Results for author: Guyue Zhou

Found 44 papers, 29 papers with code

ADAPT: Action-aware Driving Caption Transformer

1 code implementation 1 Feb 2023 Bu Jin, Xinyu Liu, Yupeng Zheng, Pengfei Li, Hao Zhao, Tong Zhang, Yuhang Zheng, Guyue Zhou, Jingjing Liu

To bridge the gap, we propose an end-to-end transformer-based architecture, ADAPT (Action-aware Driving cAPtion Transformer), which provides user-friendly natural language narrations and reasoning for each decision making step of autonomous vehicular control and action.

Autonomous Driving Decision Making

SNAKE: Shape-aware Neural 3D Keypoint Field

1 code implementation 3 Jun 2022 Chengliang Zhong, Peixing You, Xiaoxue Chen, Hao Zhao, Fuchun Sun, Guyue Zhou, Xiaodong Mu, Chuang Gan, Wenbing Huang

Detecting 3D keypoints from point clouds is important for shape reconstruction, while this work investigates the dual question: can shape reconstruction benefit 3D keypoint detection?

Keypoint Detection

LODE: Locally Conditioned Eikonal Implicit Scene Completion from Sparse LiDAR

1 code implementation 27 Feb 2023 Pengfei Li, Ruowen Zhao, Yongliang Shi, Hao Zhao, Jirui Yuan, Guyue Zhou, Ya-Qin Zhang

In this paper, we propose a novel Eikonal formulation that conditions the implicit representation on localized shape priors which function as dense boundary value constraints, and demonstrate it works on SemanticKITTI and SemanticPOSS.

Autonomous Driving Representation Learning

VIBUS: Data-efficient 3D Scene Parsing with VIewpoint Bottleneck and Uncertainty-Spectrum Modeling

1 code implementation 20 Oct 2022 Beiwen Tian, Liyi Luo, Hao Zhao, Guyue Zhou

In the first stage, we perform self-supervised representation learning on unlabeled points with the proposed Viewpoint Bottleneck loss function.

Representation Learning Scene Parsing

TOIST: Task Oriented Instance Segmentation Transformer with Noun-Pronoun Distillation

1 code implementation 19 Oct 2022 Pengfei Li, Beiwen Tian, Yongliang Shi, Xiaoxue Chen, Hao Zhao, Guyue Zhou, Ya-Qin Zhang

As such, we study the challenging problem of task oriented detection, which aims to find objects that best afford an action indicated by verbs like sit comfortably on.

Instance Segmentation Referring Expression +2

Semi-supervised Implicit Scene Completion from Sparse LiDAR

1 code implementation 29 Nov 2021 Pengfei Li, Yongliang Shi, Tianyu Liu, Hao Zhao, Guyue Zhou, Ya-Qin Zhang

Recent advances show that semi-supervised implicit representation learning can be achieved through physical constraints like Eikonal equations.

Representation Learning

Delving into Shape-aware Zero-shot Semantic Segmentation

1 code implementation CVPR 2023 Xinyu Liu, Beiwen Tian, Zhen Wang, Rui Wang, Kehua Sheng, Bo Zhang, Hao Zhao, Guyue Zhou

Thanks to the impressive progress of large-scale vision-language pretraining, recent recognition models can classify arbitrary objects in a zero-shot and open-set manner, with a surprisingly high accuracy.

Image Segmentation Segmentation +2

3D Implicit Transporter for Temporally Consistent Keypoint Discovery

1 code implementation ICCV 2023 Chengliang Zhong, Yuhang Zheng, Yupeng Zheng, Hao Zhao, Li Yi, Xiaodong Mu, Ling Wang, Pengfei Li, Guyue Zhou, Chao Yang, Xinliang Zhang, Jian Zhao

To address this issue, the Transporter method was introduced for 2D data, which reconstructs the target frame from the source frame to incorporate both spatial and temporal information.

DQS3D: Densely-matched Quantization-aware Semi-supervised 3D Detection

1 code implementation ICCV 2023 Huan-ang Gao, Beiwen Tian, Pengfei Li, Hao Zhao, Guyue Zhou

While this paradigm is natural for image-level or pixel-level prediction, adapting it to the detection problem is challenged by the issue of proposal matching.

3D Object Detection object-detection +1

PQ-Transformer: Jointly Parsing 3D Objects and Layouts from Point Clouds

1 code implementation 12 Sep 2021 Xiaoxue Chen, Hao Zhao, Guyue Zhou, Ya-Qin Zhang

Such a scheme has two limitations: 1) Storing and running several networks for different tasks are expensive for typical robotic platforms.

object-detection Object Detection +2

PAD: A Dataset and Benchmark for Pose-agnostic Anomaly Detection

1 code implementation NeurIPS 2023 Qiang Zhou, Weize Li, Lihan Jiang, Guoliang Wang, Guyue Zhou, Shanghang Zhang, Hao Zhao

Furthermore, we provide an open-source benchmark library, including dataset and baseline methods that cover 8 anomaly detection paradigms, to facilitate future research and application in this domain.

4k Anomaly Detection

STRAP: Structured Object Affordance Segmentation with Point Supervision

1 code implementation 17 Apr 2023 Leiyao Cui, Xiaoxue Chen, Hao Zhao, Guyue Zhou, Yixin Zhu

By label affinity, we refer to affordance segmentation as a multi-label prediction problem: A plate can be both holdable and containable.

Object Scene Understanding

Cerberus Transformer: Joint Semantic, Affordance and Attribute Parsing

1 code implementation CVPR 2022 Xiaoxue Chen, Tianyu Liu, Hao Zhao, Guyue Zhou, Ya-Qin Zhang

Multi-task indoor scene understanding is widely considered as an intriguing formulation, as the affinity of different tasks may lead to improved performance.

Attribute Scene Understanding +2

When to Trust Your Simulator: Dynamics-Aware Hybrid Offline-and-Online Reinforcement Learning

1 code implementation 27 Jun 2022 Haoyi Niu, Shubham Sharma, Yiwen Qiu, Ming Li, Guyue Zhou, Jianming Hu, Xianyuan Zhan

This brings up a new question: is it possible to combine learning from limited real data in offline RL and unrestricted exploration through imperfect simulators in online RL to address the drawbacks of both approaches?

Offline RL reinforcement-learning +1

LATITUDE: Robotic Global Localization with Truncated Dynamic Low-pass Filter in City-scale NeRF

1 code implementation 18 Sep 2022 Zhenxin Zhu, Yuantao Chen, Zirui Wu, Chao Hou, Yongliang Shi, Chuxuan Li, Pengfei Li, Hao Zhao, Guyue Zhou

In this paper, we present LATITUDE: Global Localization with Truncated Dynamic Low-pass Filter, which introduces a two-stage localization mechanism in city-scale NeRF.

Pose Prediction

Language-guided Semantic Style Transfer of 3D Indoor Scenes

1 code implementation 16 Aug 2022 Bu Jin, Beiwen Tian, Hao Zhao, Guyue Zhou

We address the new problem of language-guided semantic style transfer of 3D indoor scenes.

Style Transfer

Distance-Aware Occlusion Detection with Focused Attention

1 code implementation 23 Aug 2022 Yang Li, Yucheng Tu, Xiaoxue Chen, Hao Zhao, Guyue Zhou

In this work, 1) we propose a novel three-decoder architecture as the infrastructure for focused attention; 2) we use the generalized intersection box prediction task to effectively guide our model to focus on occlusion-specific regions; 3) our model achieves a new state-of-the-art performance on distance-aware relationship detection.

Human-Object Interaction Detection Relationship Detection +1

DPF: Learning Dense Prediction Fields with Weak Supervision

1 code implementation CVPR 2023 Xiaoxue Chen, Yuhang Zheng, Yupeng Zheng, Qiang Zhou, Hao Zhao, Guyue Zhou, Ya-Qin Zhang

We showcase the effectiveness of DPFs using two substantially different tasks: high-level semantic parsing and low-level intrinsic image decomposition.

Intrinsic Image Decomposition Scene Understanding +1

Understanding Embodied Reference with Touch-Line Transformer

1 code implementation 11 Oct 2022 Yang Li, Xiaoxue Chen, Hao Zhao, Jiangtao Gong, Guyue Zhou, Federico Rossano, Yixin Zhu

Human studies have revealed that objects referred to or pointed to do not lie on the elbow-wrist line, a common misconception; instead, they lie on the so-called virtual touch line.

Car-Studio: Learning Car Radiance Fields from Single-View and Endless In-the-wild Images

1 code implementation 26 Jul 2023 Tianyu Liu, Hao Zhao, Yang Yu, Guyue Zhou, Ming Liu

However, previous studies learned within a sequence of autonomous driving datasets, resulting in unsatisfactory blurring when rotating the car in the simulator.

Autonomous Driving

Planning Assembly Sequence with Graph Transformer

1 code implementation 11 Oct 2022 Lin Ma, Jiangtao Gong, Hao Xu, Hao Chen, Hao Zhao, Wenbing Huang, Guyue Zhou

In this paper, we present a graph-transformer based framework for the ASP problem which is trained and demonstrated on a self-collected ASP database.

A Comprehensive Survey of Cross-Domain Policy Transfer for Embodied Agents

1 code implementation 7 Feb 2024 Haoyi Niu, Jianming Hu, Guyue Zhou, Xianyuan Zhan

Consequently, researchers often resort to data from easily accessible source domains, such as simulation and laboratory environments, for cost-effective data acquisition and rapid model iteration.

ECT: Fine-grained Edge Detection with Learned Cause Tokens

1 code implementation 6 Aug 2023 Shaocong Xu, Xiaoxue Chen, Yuhang Zheng, Guyue Zhou, Yurong Chen, Hongbin Zha, Hao Zhao

To address these three issues, we propose a two-stage transformer-based network sequentially predicting generic edges and fine-grained edges, which has a global receptive field thanks to the attention mechanism.

Edge Detection

Beyond SIFT using Binary features for Loop Closure Detection

no code implementations 18 Sep 2017 Lei Han, Guyue Zhou, Lan Xu, Lu Fang

The proposed system originates from our previous work Multi-Index hashing for Loop closure Detection (MILD), which employs Multi-Index Hashing (MIH) [greene1994multi] for Approximate Nearest Neighbor (ANN) search of binary features.

Loop Closure Detection

Utilizing High-level Visual Feature for Indoor Shopping Mall Navigation

no code implementations 6 Oct 2016 Ziwei Xu, Haitian Zheng, Minjian Pang, Yangchun Zhu, Xiongfei Su, Guyue Zhou, Lu Fang

Towards robust and convenient indoor shopping mall navigation, we propose a novel learning-based scheme to utilize the high-level visual information from the storefront images captured by personal devices of users.

Visual Navigation Vocal Bursts Intensity Prediction

FlyCap: Markerless Motion Capture Using Multiple Autonomous Flying Cameras

no code implementations 29 Oct 2016 Lan Xu, Lu Fang, Wei Cheng, Kaiwen Guo, Guyue Zhou, Qionghai Dai, Yebin Liu

We propose a novel non-rigid surface registration method to track and fuse the depth of the three flying cameras for surface motion tracking of the moving target, and simultaneously calculate the pose of each flying camera.

Markerless Motion Capture Visual Odometry

Discriminator-Guided Model-Based Offline Imitation Learning

no code implementations 1 Jul 2022 Wenjia Zhang, Haoran Xu, Haoyi Niu, Peng Cheng, Ming Li, Heming Zhang, Guyue Zhou, Xianyuan Zhan

In this paper, we propose the Discriminator-guided Model-based offline Imitation Learning (DMIL) framework, which introduces a discriminator to simultaneously distinguish the dynamics correctness and suboptimality of model rollout data against real expert demonstrations.

Imitation Learning

City-scale Incremental Neural Mapping with Three-layer Sampling and Panoptic Representation

no code implementations 28 Sep 2022 Yongliang Shi, Runyi Yang, Pengfei Li, Zirui Wu, Hao Zhao, Guyue Zhou

Neural implicit representations are drawing a lot of attention from the robotics community recently, as they are expressive, continuous and compact.

A High Fidelity Simulation Framework for Potential Safety Benefits Estimation of Cooperative Pedestrian Perception

no code implementations 17 Oct 2022 Longrui Chen, Yan Zhang, Wenjie Jiang, Jiangtao Gong, Jiahao Shen, Mengdi Chu, Chuxuan Li, Yifeng Pan, Yifeng Shi, Nairui Luo, Xu Gao, Jirui Yuan, Guyue Zhou, Yaqin Zhang

This paper proposes a high-fidelity simulation framework that can estimate the potential safety benefits of vehicle-to-infrastructure (V2I) pedestrian safety strategies.

Self-Aligning Depth-regularized Radiance Fields for Asynchronous RGB-D Sequences

no code implementations 14 Nov 2022 Yuxin Huang, Andong Yang, Zirui Wu, Yuantao Chen, Runyi Yang, Zhenxin Zhu, Chao Hou, Hao Zhao, Guyue Zhou

It has been shown that learning radiance fields with depth rendering and depth supervision can effectively promote the quality and convergence of view synthesis.

Autonomous Driving Benchmarking

H2O+: An Improved Framework for Hybrid Offline-and-Online RL with Dynamics Gaps

no code implementations 22 Sep 2023 Haoyi Niu, Tianying Ji, Bingqi Liu, Haocheng Zhao, Xiangyu Zhu, Jianying Zheng, Pengfei Huang, Guyue Zhou, Jianming Hu, Xianyuan Zhan

Solving real-world complex tasks using reinforcement learning (RL) without high-fidelity simulation environments or large amounts of offline data can be quite challenging.

Offline RL Reinforcement Learning (RL)

ASSIST: Interactive Scene Nodes for Scalable and Realistic Indoor Simulation

no code implementations 10 Nov 2023 Zhide Zhong, Jiakai Cao, Songen Gu, Sirui Xie, Weibo Gao, Liyi Luo, Zike Yan, Hao Zhao, Guyue Zhou

We present ASSIST, an object-wise neural radiance field as a panoptic representation for compositional and realistic simulation.

Panoptic Segmentation

Latency-aware Road Anomaly Segmentation in Videos: A Photorealistic Dataset and New Metrics

no code implementations 10 Jan 2024 Beiwen Tian, Huan-ang Gao, Leiyao Cui, Yupeng Zheng, Lan Luo, Baofeng Wang, Rong Zhi, Guyue Zhou, Hao Zhao

We believe the latter is valuable as it measures whether an anomaly segmentation algorithm can truly prevent a car from crashing in a temporally informed setting.

Autonomous Driving Benchmarking +2

Adaptive Surface Normal Constraint for Geometric Estimation from Monocular Images

no code implementations 8 Feb 2024 Xiaoxiao Long, Yuhang Zheng, Yupeng Zheng, Beiwen Tian, Cheng Lin, Lingjie Liu, Hao Zhao, Guyue Zhou, Wenping Wang

We introduce a novel approach to learn geometries such as depth and surface normal from images while incorporating geometric context.

Depth Estimation

More Than Routing: Joint GPS and Route Modeling for Refine Trajectory Representation Learning

no code implementations 25 Feb 2024 Zhipeng Ma, Zheyan Tu, Xinhai Chen, Yan Zhang, Deguo Xia, Guyue Zhou, Yilun Chen, Yu Zheng, Jiangtao Gong

The experimental results demonstrate that JGRM outperforms existing methods in both road segment representation and trajectory representation tasks.

Representation Learning

PreAfford: Universal Affordance-Based Pre-Grasping for Diverse Objects and Environments

no code implementations 4 Apr 2024 Kairui Ding, Boyuan Chen, Ruihai Wu, Yuyang Li, Zongzheng Zhang, Huan-ang Gao, Siqi Li, Yixin Zhu, Guyue Zhou, Hao Dong, Hao Zhao

Robotic manipulation of ungraspable objects with two-finger grippers presents significant challenges due to the paucity of graspable features, while traditional pre-grasping techniques, which rely on repositioning objects and leveraging external aids like table edges, lack the adaptability across object categories and scenes.

Object
