Search Results for author: Peng Jia

Found 45 papers, 8 papers with code

PosePilot: Steering Camera Pose for Generative World Models with Self-supervised Depth

no code implementations • 3 May 2025 • Bu Jin, Weize Li, Baihan Yang, Zhenxin Zhu, Junpeng Jiang, Huan-ang Gao, Haiyang Sun, Kun Zhan, Hengtong Hu, Xueyang Zhang, Peng Jia, Hao Zhao

In this paper, we introduce PosePilot, a lightweight yet powerful framework that significantly enhances camera pose controllability in generative world models.

Autonomous Driving, Camera Pose Estimation, +3 more

Adaptive Detection of Fast Moving Celestial Objects Using a Mixture of Experts and Physical-Inspired Neural Network

no code implementations • 10 Apr 2025 • Peng Jia, Ge Li, Bafeng Cheng, Yushan Li, Rongyu Sun

However, the growing prevalence of space-based telescopes, along with their diverse observational modes, produces images with different properties, rendering conventional methods less effective.

Mixture-of-Experts, object-detection, +1 more

TokenFLEX: Unified VLM Training for Flexible Visual Tokens Inference

no code implementations • 4 Apr 2025 • Junshan Hu, Jialiang Mao, Zhikang Liu, Zhongpu Xia, Peng Jia, Xianpeng Lang

Conventional Vision-Language Models (VLMs) typically utilize a fixed number of vision tokens, regardless of task complexity.

Large Language Model

StyledStreets: Multi-style Street Simulator with Spatial and Temporal Consistency

no code implementations • 27 Mar 2025 • Yuyin Chen, Yida Wang, Xueyang Zhang, Kun Zhan, Peng Jia, Yifei Zhan, Xianpeng Lang

Urban scene reconstruction requires modeling both static infrastructure and dynamic elements while supporting diverse environmental conditions.

Finetuning Generative Trajectory Model with Reinforcement Learning from Human Feedback

no code implementations • 13 Mar 2025 • Derun Li, Jianwei Ren, Yue Wang, Xin Wen, Pengxiang Li, Leimeng Xu, Kun Zhan, Zhongpu Xia, Peng Jia, Xianpeng Lang, Ningyi Xu, Hang Zhao

To address this, we introduce TrajHF, a human feedback-driven finetuning framework for generative trajectory models, designed to align motion planning with diverse driving preferences.

Imitation Learning, Motion Planning, +1 more

UniPLV: Towards Label-Efficient Open-World 3D Scene Understanding by Regional Visual Language Supervision

no code implementations • 24 Dec 2024 • Yuru Wang, Songtao Wang, Zehan Zhang, Xinyan Lu, Changwei Cai, Hao Li, Fu Liu, Peng Jia, Xianpeng Lang

We present UniPLV, a powerful framework that unifies point clouds, images and text in a single learning paradigm for open-world 3D scene understanding.

Scene Understanding, Semantic Segmentation

GaussianAD: Gaussian-Centric End-to-End Autonomous Driving

1 code implementation • 13 Dec 2024 • Wenzhao Zheng, Junjie Wu, Yao Zheng, Sicheng Zuo, Zixun Xie, Longchao Yang, Yong Pan, Zhihui Hao, Peng Jia, Xianpeng Lang, Shanghang Zhang

We initialize the scene with uniform 3D Gaussians and use surrounding-view images to progressively refine them to obtain the 3D Gaussian scene representation.

Autonomous Driving, Decision Making, +1 more

DAT: Dialogue-Aware Transformer with Modality-Group Fusion for Human Engagement Estimation

1 code implementation • 11 Oct 2024 • Jia Li, Yangchen Yu, Yin Chen, Yu Zhang, Peng Jia, Yunbo Xu, Ziqiang Li, Meng Wang, Richang Hong

Engagement estimation plays a crucial role in understanding human social behaviors, attracting increasing research interests in fields such as affective computing and human-computer interaction.

DiVE: DiT-based Video Generation with Enhanced Control

no code implementations • 3 Sep 2024 • Junpeng Jiang, Gangyi Hong, Lijun Zhou, Enhui Ma, Hengtong Hu, Xia Zhou, Jie Xiang, Fan Liu, Kaicheng Yu, Haiyang Sun, Kun Zhan, Peng Jia, Miao Zhang

Generating high-fidelity, temporally consistent videos in autonomous driving scenarios faces a significant challenge, e.g., problematic maneuvers in corner cases.

Autonomous Driving, Video Generation

RMFA-Net: A Neural ISP for Real RAW to RGB Image Reconstruction

no code implementations • 17 Jun 2024 • Fei Li, Wenbo Hou, Peng Jia

It is demonstrated that RMFA-Net outperforms previous algorithms, achieving a PSNR score of over 25 dB, surpassing the state-of-the-art by +1 dB.

Image Reconstruction, Tone Mapping

S2-Track: A Simple yet Strong Approach for End-to-End 3D Multi-Object Tracking

no code implementations • 4 Jun 2024 • Tao Tang, Lijun Zhou, Pengkun Hao, Zihang He, Kalok Ho, Shuo Gu, Zhihui Hao, Haiyang Sun, Kun Zhan, Peng Jia, Xianpeng Lang, Xiaodan Liang

In this paper, we first summarize the current end-to-end 3D MOT framework by decomposing it into three constituent parts: query initialization, query propagation, and query matching.

3D Multi-Object Tracking, Autonomous Driving, +3 more

Unleashing Generalization of End-to-End Autonomous Driving with Controllable Long Video Generation

no code implementations • 3 Jun 2024 • Enhui Ma, Lijun Zhou, Tao Tang, Zhan Zhang, Dong Han, Junpeng Jiang, Kun Zhan, Peng Jia, Xianpeng Lang, Haiyang Sun, Di Lin, Kaicheng Yu

Instead of randomly generating new data, we further design a sampling policy to let Delphi generate new data that are similar to those failure cases to improve the sample efficiency.

Autonomous Driving, Video Generation

A Versatile Framework for Analyzing Galaxy Image Data by Implanting Human-in-the-loop on a Large Vision Model

no code implementations • 17 May 2024 • Mingxiang Fu, Yu Song, Jiameng Lv, Liang Cao, Peng Jia, Nan Li, Xiangru Li, Jifeng Liu, A-Li Luo, Bo Qiu, Shiyin Shen, Liangping Tu, Lili Wang, Shoulin Wei, Haifeng Yang, Zhenping Yi, Zhiqiang Zou

Hence, as an example to present how to overcome the issue, we built a framework for general analysis of galaxy images, based on a large vision model (LVM) plus downstream tasks (DST), including galaxy morphological classification, image restoration, object detection, parameter extraction, and more.

Astronomy, Few-Shot Learning, +4 more

An Image Quality Evaluation and Masking Algorithm Based On Pre-trained Deep Neural Networks

no code implementations • 6 May 2024 • Peng Jia, Yu Song, Jiameng Lv, Runyu Ning

With the growing amount of astronomical data, there is an increasing need for automated data processing pipelines, which can extract scientific information from observation data without human interventions.

Motion planning for off-road autonomous driving based on human-like cognition and weight adaptation

no code implementations • 27 Apr 2024 • YuChun Wang, Cheng Gong, Jianwei Gong, Peng Jia

Then, based on human-like generated trajectories in different environments, we design a primitive-based trajectory planner that aims to mimic human trajectories and cost weight selection, generating trajectories that are consistent with the dynamics of off-road vehicles.

Autonomous Driving, Motion Planning

TOD3Cap: Towards 3D Dense Captioning in Outdoor Scenes

1 code implementation • 28 Mar 2024 • Bu Jin, Yupeng Zheng, Pengfei Li, Weize Li, Yuhang Zheng, Sujie Hu, Xinyu Liu, Jinwei Zhu, Zhijie Yan, Haiyang Sun, Kun Zhan, Peng Jia, Xiaoxiao Long, Yilun Chen, Hao Zhao

However, the exploration of 3D dense captioning in outdoor scenes is hindered by two major challenges: 1) the domain gap between indoor and outdoor scenes, such as dynamics and sparse visual inputs, makes it difficult to directly adapt existing indoor methods; 2) the lack of data with comprehensive box-caption pair annotations specifically tailored for outdoor scenes.

3D dense captioning, Dense Captioning

DriveVLM: The Convergence of Autonomous Driving and Large Vision-Language Models

no code implementations • 19 Feb 2024 • Xiaoyu Tian, Junru Gu, Bailin Li, Yicheng Liu, Yang Wang, Zhiyong Zhao, Kun Zhan, Peng Jia, Xianpeng Lang, Hang Zhao

A primary hurdle of autonomous driving in urban environments is understanding complex and long-tail scenarios, such as challenging road conditions and delicate human behaviors.

Autonomous Driving, Scene Understanding, +1 more

PC-NeRF: Parent-Child Neural Radiance Fields Using Sparse LiDAR Frames in Autonomous Driving Environments

1 code implementation • 14 Feb 2024 • Xiuzhong Hu, Guangming Xiong, Zheng Zang, Peng Jia, Yuxuan Han, Junyi Ma

With extensive experiments, PC-NeRF is proven to achieve high-precision novel LiDAR view synthesis and 3D reconstruction in large-scale scenes.

3D Reconstruction, 3D Scene Reconstruction, +3 more

Guidelines in Wastewater-based Epidemiology of SARS-CoV-2 with Diagnosis

no code implementations • 26 Dec 2023 • Madiha Fatima, Zhihua Cao, Aichun Huang, Shengyuan Wu, Xinxian Fan, Yi Wang, Liu Jiren, Ziyun Zhu, Qiongrou Ye, Yuan Ma, Joseph K. F. Chow, Peng Jia, Yangshou Liu, Yubin Lin, Manjun Ye, Tong Wu, Zhixun Li, Cong Cai, Wenhai Zhang, Cheris H. Q. Ding, Yuanzhe Cai, Feijuan Huang

With the global spread and increasing transmission rate of SARS-CoV-2, more and more laboratories and researchers are turning their attention to wastewater-based epidemiology (WBE), hoping it can become an effective tool for large-scale testing and provide more accurate predictions of the number of infected individuals.

Diagnostic Epidemiology

Perception of Misalignment States for Sky Survey Telescopes with the Digital Twin and the Deep Neural Networks

no code implementations • 30 Nov 2023 • Miao Zhang, Peng Jia, Zhengyang Li, Wennan Xiang, Jiameng Lv, Rui Sun

To address this, we need a method to obtain misalignment states, aiding in the reconstruction of accurate point spread functions for data processing methods or facilitating adjustments of optical components for improved image quality.

Astronomy

Target Detection Framework for Lobster Eye X-Ray Telescopes with Machine Learning Algorithms

no code implementations • 11 Dec 2022 • Peng Jia, Wenbo Liu, YuAn Liu, Haiwu Pan

Then an algorithm based on morphological operations and two neural networks would be used to detect candidates of celestial objects with different flux from these 2D images.

Detection of Strongly Lensed Arcs in Galaxy Clusters with Transformers

no code implementations • 11 Nov 2022 • Peng Jia, Ruiqi Sun, Nan Li, Yu Song, Runyu Ning, Hongyan Wei, Rui Luo

We embed prior information of strongly lensed arcs at cluster-scale into the training data through simulation and then train the detection algorithm with simulated images.

Reinforcement Learning for Few-Shot Text Generation Adaptation

1 code implementation • 22 Nov 2021 • Pengsen Cheng, Jinqiao Dai, Jiamiao Liu, Jiayong Liu, Peng Jia

Controlling the generative model to adapt a new domain with limited samples is a difficult challenge and it is receiving increasing attention.

Diversity, Domain Adaptation, +5 more

PNet -- A Deep Learning Based Photometry and Astrometry Bayesian Framework

no code implementations • 28 Jun 2021 • Rui Sun, Peng Jia, Yongyang Sun, Zhimin Yang, Qiang Liu, Hongyan Wei

Time domain astronomy has emerged as a vibrant research field in recent years, focusing on celestial objects that exhibit variable magnitudes or positions.

Astronomy, Deep Learning, +2 more

Smart observation method with wide field small aperture telescopes for real time transient detection

no code implementations • 20 Nov 2020 • Peng Jia, Qiang Liu, Yongyang Sun, Yitian Zheng, Wenbo Liu, Yifei Zhao

The ARGUS uses a deep learning based astronomical detection algorithm implemented in embedded devices in each WFSAT to detect astronomical targets.

Ensemble Learning

Compressive Shack-Hartmann Wavefront Sensor based on Deep Neural Networks

no code implementations • 20 Nov 2020 • Peng Jia, Mingyang Ma, Dongmei Cai, Weihua Wang, Juanjuan Li, Can Li

However, if there exists strong atmospheric turbulence or the brightness of guide stars is low, the accuracy of wavefront measurements will be affected.

Compressive Sensing, Image Deconvolution, +1 more

Data-driven Image Restoration with Option-driven Learning for Big and Small Astronomical Image Datasets

no code implementations • 7 Nov 2020 • Peng Jia, Ruiyu Ning, Ruiqi Sun, Xiaoshan Yang, Dongmei Cai

In recent years, developments of deep neural networks and increments of the number of astronomical images have evoked a lot of data-driven image restoration methods.

Image Restoration

PSF-NET: A Non-parametric Point Spread Function Model for Ground Based Optical Telescopes

no code implementations • 2 Mar 2020 • Peng Jia, Xuebo Wu, Yi Huang, Bojun Cai, Dongmei Cai

Assuming point spread functions induced by the atmospheric turbulence with the same profile belong to the same manifold space, we propose a non-parametric point spread function -- PSF-NET.

Image Restoration

Detection and Classification of Astronomical Targets with Deep Neural Networks in Wide Field Small Aperture Telescopes

no code implementations • 21 Feb 2020 • Peng Jia, Qiang Liu, Yongyang Sun

To increase the generalization ability of our framework, we use both simulated and real observation images to train the neural network.

General Classification, Transfer Learning

Point Spread Function Modelling for Wide Field Small Aperture Telescopes with a Denoising Autoencoder

no code implementations • 31 Jan 2020 • Peng Jia, Xiyu Li, Zhengyang Li, Weinan Wang, Dongmei Cai

For wide field small aperture telescopes, the point spread function is hard to model, because it is affected by many different effects and has strong temporal and spatial variations.

Denoising

A systematic review of fuzzing based on machine learning techniques

no code implementations • 4 Aug 2019 • Yan Wang, Peng Jia, Luping Liu, Jiayong Liu

Next, this paper assesses the performance of the machine learning models based on the frequently used evaluation metrics.

BIG-bench Machine Learning

Solar Image Restoration with the Cycle-GAN Based on Multi-Fractal Properties of Texture Features

no code implementations • 29 Jul 2019 • Peng Jia, Yi Huang, Bojun Cai, Dongmei Cai

Texture is one of the most obvious characteristics in solar images and it is normally described by texture features.

Image Restoration

Perception Evaluation -- A new solar image quality metric based on the multi-fractal property of texture features

1 code implementation • 24 May 2019 • Yi Huang, Peng Jia, Dongmei Cai, Bojun Cai

Next-generation ground-based solar observations require good image quality metrics for post-facto processing techniques.
