Search Results for author: Penghao Wu

Found 12 papers, 9 papers with code

V*: Guided Visual Search as a Core Mechanism in Multimodal LLMs

1 code implementation • 21 Dec 2023 • Penghao Wu, Saining Xie

However, the lack of this visual search mechanism in current multimodal LLMs (MLLMs) hinders their ability to focus on important visual details, especially when handling high-resolution and visually crowded images.

Visual Question Answering, World Knowledge

End-to-end Autonomous Driving: Challenges and Frontiers

1 code implementation • 29 Jun 2023 • Li Chen, Penghao Wu, Kashyap Chitta, Bernhard Jaeger, Andreas Geiger, Hongyang Li

The autonomous driving community has witnessed a rapid growth in approaches that embrace an end-to-end algorithm framework, utilizing raw sensor input to generate vehicle motion plans, instead of concentrating on individual tasks such as detection and motion prediction.

Autonomous Driving, motion prediction

Policy Pre-training for Autonomous Driving via Self-supervised Geometric Modeling

1 code implementation • 3 Jan 2023 • Penghao Wu, Li Chen, Hongyang Li, Xiaosong Jia, Junchi Yan, Yu Qiao

Witnessing the impressive achievements of pre-training techniques on large-scale data in the field of computer vision and natural language processing, we wonder whether this idea could be adapted in a grab-and-go spirit, and mitigate the sample inefficiency problem for visuomotor driving.

Autonomous Driving, Decision Making

Inharmonious Region Localization via Recurrent Self-Reasoning

no code implementations • 5 Oct 2022 • Penghao Wu, Li Niu, Jing Liang, Liqing Zhang

Synthetic images created by image editing operations are prevalent, but the color or illumination inconsistency between the manipulated region and background may make it unrealistic.

Clustering, Image Harmonization

Inharmonious Region Localization with Auxiliary Style Feature

no code implementations • 5 Oct 2022 • Penghao Wu, Li Niu, Liqing Zhang

Based on the extracted style features, we also propose a novel style voting module to guide the localization of inharmonious region.

Inharmonious Region Localization by Magnifying Domain Discrepancy

1 code implementation • 30 Sep 2022 • Jing Liang, Li Niu, Penghao Wu, Fengjun Guo, Teng Long

Inharmonious region localization aims to localize the region in a synthetic image which is incompatible with surrounding background.

Image Harmonization

ST-P3: End-to-end Vision-based Autonomous Driving via Spatial-Temporal Feature Learning

1 code implementation • 15 Jul 2022 • Shengchao Hu, Li Chen, Penghao Wu, Hongyang Li, Junchi Yan, DaCheng Tao

In particular, we propose a spatial-temporal feature learning scheme towards a set of more representative features for perception, prediction and planning tasks simultaneously, which is called ST-P3.

Ranked #7 on Bird's-Eye View Semantic Segmentation on nuScenes (IoU ped - 224x480 - Vis filter. - 100x100 at 0.5 metric)

Autonomous Driving, Bird's-Eye View Semantic Segmentation, +1

Level 2 Autonomous Driving on a Single Device: Diving into the Devils of Openpilot

no code implementations • 16 Jun 2022 • Li Chen, Tutian Tang, Zhitian Cai, Yang Li, Penghao Wu, Hongyang Li, Jianping Shi, Junchi Yan, Yu Qiao

Equipped with a wide span of sensors, predominant autonomous driving solutions are becoming more modular-oriented for safe system design.

Autonomous Driving

HDGT: Heterogeneous Driving Graph Transformer for Multi-Agent Trajectory Prediction via Scene Encoding

1 code implementation • 30 Apr 2022 • Xiaosong Jia, Penghao Wu, Li Chen, Yu Liu, Hongyang Li, Junchi Yan

Based on these observations, we propose Heterogeneous Driving Graph Transformer (HDGT), a backbone modelling the driving scene as a heterogeneous graph with different types of nodes and edges.

Autonomous Driving, graph construction, +2
