Search Results for author: Sanyuan Zhao

Found 10 papers, 4 papers with code

World knowledge-enhanced Reasoning Using Instruction-guided Interactor in Autonomous Driving

no code implementations 9 Dec 2024 Mingliang Zhai, Cheng Li, Zengyuan Guo, Ningrui Yang, Xiameng Qin, Sanyuan Zhao, Junyu Han, Ji Tao, Yuwei Wu, Yunde Jia

The Multi-modal Large Language Models (MLLMs) with extensive world knowledge have revitalized autonomous driving, particularly in reasoning tasks within perceivable regions.

Autonomous Driving, World Knowledge

RepVF: A Unified Vector Fields Representation for Multi-task 3D Perception

1 code implementation 15 Jul 2024 ChunLiang Li, Wencheng Han, Junbo Yin, Sanyuan Zhao, Jianbing Shen

Concurrent processing of multiple autonomous driving 3D perception tasks within the same spatiotemporal scene poses a significant challenge, in particular due to the computational inefficiencies and feature competition between tasks when using traditional multi-task learning approaches.

3D Lane Detection, 3D Object Detection, +3

TextFormer: A Query-based End-to-End Text Spotter with Mixed Supervision

no code implementations 6 Jun 2023 Yukun Zhai, Xiaoqiang Zhang, Xiameng Qin, Sanyuan Zhao, Xingping Dong, Jianbing Shen

End-to-end text spotting is a vital computer vision task that aims to integrate scene text detection and recognition into a unified framework.

Decoder, Scene Text Detection, +2

Generalized Few-Shot 3D Object Detection of LiDAR Point Cloud for Autonomous Driving

no code implementations 8 Feb 2023 Jiawei Liu, Xingping Dong, Sanyuan Zhao, Jianbing Shen

To achieve simultaneous detection for both common and rare objects, we propose a novel task, called generalized few-shot 3D object detection, where we have a large amount of training data for common (base) objects, but only a few data for rare (novel) classes.

3D Object Detection, Autonomous Driving, +1

Self-Learning with Rectification Strategy for Human Parsing

no code implementations CVPR 2020 Tao Li, Zhiyuan Liang, Sanyuan Zhao, Jiahao Gong, Jianbing Shen

For the global error, we first transform category-wise features into a high-level graph model with coarse-grained structural information, and then decouple the high-level graph to reconstruct the category features.

Human Parsing, Self-Learning

Pyramid Dilated Deeper ConvLSTM for Video Salient Object Detection

1 code implementation ECCV 2018 Hongmei Song, Wenguan Wang, Sanyuan Zhao, Jianbing Shen, Kin-Man Lam

This paper proposes a fast video salient object detection model, based on a novel recurrent network architecture, named Pyramid Dilated Bidirectional ConvLSTM (PDB-ConvLSTM).

Ranked #1 on Video Salient Object Detection on UVSD (using extra training data)

Object object-detection +5

Improved Face Detection and Alignment using Cascade Deep Convolutional Network

no code implementations 28 Jul 2017 Weilin Cong, Sanyuan Zhao, Hui Tian, Jianbing Shen

Real-world face detection and alignment demand an advanced discriminative model to address challenges arising from pose, lighting, and expression.

Face Detection
