Search Results for author: Xu Cao

Found 41 papers, 21 papers with code

Infrared Small Target Detection in Satellite Videos: A New Dataset and A Novel Recurrent Feature Refinement Framework

1 code implementation 19 Sep 2024 Xinyi Ying, Li Liu, Zaipin Lin, Yangsi Shi, Yingqian Wang, Ruojing Li, Xu Cao, Boyang Li, Shilin Zhou

To address the aforementioned challenges, in this paper, we first build a large-scale dataset for MIRST detection in satellite videos (namely IRSatVideo-LEO), and then develop a recurrent feature refinement (RFR) framework as the baseline method.

Motion Compensation Video Generation

TrialBench: Multi-Modal Artificial Intelligence-Ready Clinical Trial Datasets

1 code implementation 30 Jun 2024 Jintai Chen, Yaojun Hu, Yue Wang, Yingzhou Lu, Xu Cao, Miao Lin, Hongxia Xu, Jian Wu, Cao Xiao, Jimeng Sun, Lucas Glass, Kexin Huang, Marinka Zitnik, Tianfan Fu

Clinical trials are pivotal for developing new medical treatments, yet they typically pose some risks such as patient mortality, adverse events, and enrollment failure that waste immense efforts spanning over a decade.

MM-SpuBench: Towards Better Understanding of Spurious Biases in Multimodal LLMs

no code implementations 24 Jun 2024 Wenqian Ye, Guangtao Zheng, Yunsheng Ma, Xu Cao, Bolin Lai, James M. Rehg, Aidong Zhang

Our findings illuminate the persistence of the reliance on spurious correlations from these models and underscore the urge for new methodologies to mitigate spurious biases.

Question Answering Visual Question Answering

What is the Visual Cognition Gap between Humans and Multimodal LLMs?

1 code implementation 14 Jun 2024 Xu Cao, Bolin Lai, Wenqian Ye, Yunsheng Ma, Joerg Heintz, Jintai Chen, Jianguo Cao, James M. Rehg

Recently, Multimodal Large Language Models (MLLMs) have shown great promise in language-guided perceptual tasks such as recognition, segmentation, and object detection.

Object Detection +2

Scaling Multi-Camera 3D Object Detection through Weak-to-Strong Eliciting

1 code implementation 10 Apr 2024 Hao Lu, Jiaqi Tang, Xinli Xu, Xu Cao, Yunpeng Zhang, Guoqing Wang, Dalong Du, Hao Chen, Yingcong Chen

Finally, for MC3D-Det joint training, the elaborate dataset merge strategy is designed to solve the problem of inconsistent camera numbers and camera parameters.

3D Object Detection Autonomous Driving +1

Spurious Correlations in Machine Learning: A Survey

1 code implementation 20 Feb 2024 Wenqian Ye, Guangtao Zheng, Xu Cao, Yunsheng Ma, Aidong Zhang

Machine learning systems are known to be sensitive to spurious correlations between non-essential features of the inputs (e.g., background, texture, and secondary objects) and the corresponding labels.

Survey

MAPLM: A Real-World Large-Scale Vision-Language Benchmark for Map and Traffic Scene Understanding

1 code implementation CVPR 2024 Xu Cao, Tong Zhou, Yunsheng Ma, Wenqian Ye, Can Cui, Kun Tang, Zhipeng Cao, Kaizhao Liang, Ziran Wang, James M. Rehg, Chao Zheng

Specifically, we annotate and leverage large-scale broad-coverage traffic and map data extracted from huge HD map annotations and use CLIP and LLaMA-2 / Vicuna to finetune a baseline model with instruction-following data.

Autonomous Driving Instruction Following +2

If LLM Is the Wizard, Then Code Is the Wand: A Survey on How Code Empowers Large Language Models to Serve as Intelligent Agents

no code implementations 1 Jan 2024 Ke Yang, Jiateng Liu, John Wu, Chaoqi Yang, Yi R. Fung, Sha Li, Zixuan Huang, Xu Cao, Xingyao Wang, Yiquan Wang, Heng Ji, ChengXiang Zhai

The prominent large language models (LLMs) of today differ from past language models not only in size, but also in the fact that they are trained on a combination of natural language and formal language (code).

Code Generation

A Survey on Multimodal Large Language Models for Autonomous Driving

1 code implementation 21 Nov 2023 Can Cui, Yunsheng Ma, Xu Cao, Wenqian Ye, Yang Zhou, Kaizhao Liang, Jintai Chen, Juanwu Lu, Zichong Yang, Kuei-Da Liao, Tianren Gao, Erlong Li, Kun Tang, Zhipeng Cao, Tong Zhou, Ao Liu, Xinrui Yan, Shuqi Mei, Jianguo Cao, Ziran Wang, Chao Zheng

We first introduce the background of Multimodal Large Language Models (MLLMs), the multimodal models development using LLMs, and the history of autonomous driving.

Autonomous Driving

MACP: Efficient Model Adaptation for Cooperative Perception

1 code implementation 25 Oct 2023 Yunsheng Ma, Juanwu Lu, Can Cui, Sicheng Zhao, Xu Cao, Wenqian Ye, Ziran Wang

We approach this objective by identifying the key challenges of shifting from single-agent to cooperative settings, adapting the model by freezing most of its parameters and adding a few lightweight modules.

Receive, Reason, and React: Drive as You Say with Large Language Models in Autonomous Vehicles

no code implementations 12 Oct 2023 Can Cui, Yunsheng Ma, Xu Cao, Wenqian Ye, Ziran Wang

The fusion of human-centric design and artificial intelligence (AI) capabilities has opened up new possibilities for next-generation autonomous vehicles that go beyond transportation.

Autonomous Driving Decision Making

PIE: Simulating Disease Progression via Progressive Image Editing

1 code implementation 21 Sep 2023 Kaizhao Liang, Xu Cao, Kuei-Da Liao, Tianren Gao, Wenqian Ye, Zhengyu Chen, Jianguo Cao, Tejas Nama, Jimeng Sun

Disease progression simulation is a crucial area of research that has significant implications for clinical diagnosis, prognosis, and treatment.

Empowering In-Browser Deep Learning Inference on Edge Devices with Just-in-Time Kernel Optimizations

no code implementations 16 Sep 2023 Fucheng Jia, Shiqi Jiang, Ting Cao, Wei Cui, Tianrui Xia, Xu Cao, Yuanchun Li, Deyu Zhang, Ju Ren, Yunxin Liu, Lili Qiu, Mao Yang

Web is increasingly becoming the primary platform to deliver AI services onto edge devices, making in-browser deep learning (DL) inference more prominent.

Mitigating Transformer Overconfidence via Lipschitz Regularization

1 code implementation 12 Jun 2023 Wenqian Ye, Yunsheng Ma, Xu Cao, Kun Tang

Though Transformers have achieved promising results in many computer vision tasks, they tend to be over-confident in predictions, as the standard Dot Product Self-Attention (DPSA) can barely preserve distance for the unbounded input domain.

Learning Remote Sensing Object Detection with Single Point Supervision

1 code implementation 23 May 2023 Shitian He, Huanxin Zou, Yingqian Wang, Boyang Li, Xu Cao, Ning Jing

In this paper, we make the first attempt to achieve RS object detection with single point supervision, and propose a PSOD method tailored for RS images.

Object object-detection +1

CEMFormer: Learning to Predict Driver Intentions from In-Cabin and External Cameras via Spatial-Temporal Transformers

no code implementations 13 May 2023 Yunsheng Ma, Wenqian Ye, Xu Cao, Amr Abdelraouf, Kyungtae Han, Rohit Gupta, Ziran Wang

Driver intention prediction seeks to anticipate drivers' actions by analyzing their behaviors with respect to surrounding traffic environments.

ECON: Explicit Clothed humans Optimized via Normal integration

1 code implementation CVPR 2023 Yuliang Xiu, Jinlong Yang, Xu Cao, Dimitrios Tzionas, Michael J. Black

To increase robustness for these cases, existing work uses an explicit parametric body model to constrain surface reconstruction, but this limits the recovery of free-form surfaces such as loose clothing that deviates from the body.

3D Human Reconstruction Surface Reconstruction

THMA: Tencent HD Map AI System for Creating HD Map Annotations

no code implementations 14 Dec 2022 Kun Tang, Xu Cao, Zhipeng Cao, Tong Zhou, Erlong Li, Ao Liu, Shengtao Zou, Chang Liu, Shuqi Mei, Elena Sizikova, Chao Zheng

THMA has been deployed by the Tencent Map team to provide services to downstream companies and users, serving over 1,000 labeling workers and producing more than 30,000 kilometers of HD map data per day at most.

Active Learning Weakly-supervised Learning

ViTASD: Robust Vision Transformer Baselines for Autism Spectrum Disorder Facial Diagnosis

1 code implementation 30 Oct 2022 Xu Cao, Wenqian Ye, Elena Sizikova, Xue Bai, Megan Coffee, Hongwu Zeng, Jianguo Cao

Research progress in the field of ASD facial analysis in pediatric patients has been hindered due to a lack of well-established baselines.

Decoder

A Compacted Structure for Cross-domain learning on Monocular Depth and Flow Estimation

no code implementations 25 Aug 2022 Yu Chen, Xu Cao, Xiaoyi Lin, Baoru Huang, Xiao-Yun Zhou, Jian-Qing Zheng, Guang-Zhong Yang

A dual-head mechanism is used to predict optical flow for rigid and non-rigid motion based on a divide-and-conquer manner, which significantly improves the optical flow estimation performance.

Autonomous Driving Depth Estimation +1

Improving Computed Tomography (CT) Reconstruction via 3D Shape Induction

1 code implementation 23 Aug 2022 Elena Sizikova, Xu Cao, Ashia Lewis, Kenny Moise, Megan Coffee

Chest computed tomography (CT) imaging adds valuable insight in the diagnosis and management of pulmonary infectious diseases, like tuberculosis (TB).

3D Reconstruction Computed Tomography (CT) +3

AggPose: Deep Aggregation Vision Transformer for Infant Pose Estimation

1 code implementation 11 May 2022 Xu Cao, Xiaoye Li, Liya Ma, Yi Huang, Xuan Feng, Zening Chen, Hongwu Zeng, Jianguo Cao

We show that AggPose outperforms hybrid model HRFormer and TokenPose in the infant pose estimation dataset.

Keypoint Detection

Normal Integration via Inverse Plane Fitting With Minimum Point-to-Plane Distance

1 code implementation CVPR 2021 Xu Cao, Boxin Shi, Fumio Okura, Yasuyuki Matsushita

Experimental results on analytically computed, synthetic, and real-world surfaces show that our method yields accurate and stable reconstruction for both orthographic and perspective normal maps.

Surface Reconstruction

VeniBot: Towards Autonomous Venipuncture with Automatic Puncture Area and Angle Regression from NIR Images

no code implementations 27 May 2021 Xu Cao, Zijie Chen, Bolin Lai, Yuxuan Wang, Yu Chen, Zhengqing Cao, Zhilin Yang, Nanyang Ye, Junbo Zhao, Xiao-Yun Zhou, Peng Qi

For the automation, we focus on the positioning part and propose a Dual-In-Dual-Out network based on two-step learning and two-task learning, which can achieve fully automatic regression of the suitable puncture area and angle from near-infrared (NIR) images.

Navigate regression

Treatment Planning System for Electron FLASH Radiotherapy: Open-source for Clinical Implementation

no code implementations 9 Mar 2021 Mahbubur Rahman, M. Ramish Ashraf, David J. Gladstone, Petr Bruza, Lesley A. Jarvis, Philip E. Schaner, Xu Cao, Brian W. Pogue, P. Jack Hoopes, Rongxiao Zhang

eFLASH-RT plans were MC forward calculated in Geant4 for a mouse brain treatment and compared to a conventional (Conv-RT) plan in Eclipse for a human patient with metastatic renal cell carcinoma.

Medical Physics

Balanced Joint Adversarial Training for Robust Intent Detection and Slot Filling

no code implementations COLING 2020 Xu Cao, Deyi Xiong, Chongyang Shi, Chao Wang, Yao Meng, Changjian Hu

Joint intent detection and slot filling has recently achieved tremendous success in advancing the performance of utterance understanding.

Intent Detection slot-filling +1

Stereoscopic Flash and No-Flash Photography for Shape and Albedo Recovery

no code implementations CVPR 2020 Xu Cao, Michael Waechter, Boxin Shi, Ye Gao, Bo Zheng, Yasuyuki Matsushita

From the stereo image pair, we recover a rough shape that captures low-frequency shape variation without high-frequency details.

Shadow Detection

CAggNet: Crossing Aggregation Network for Medical Image Segmentation

no code implementations 16 Apr 2020 Xu Cao, Yanghao Lin

In this paper, we present Crossing Aggregation Network (CAggNet), a novel densely connected semantic segmentation approach for medical image analysis.

Image Segmentation Medical Image Analysis +4

UCT-ADP Progressive Bias Algorithm for Solving Gomoku

1 code implementation 11 Dec 2019 Xu Cao, Yanghao Lin

We combine Adaptive Dynamic Programming (ADP), a reinforcement learning method and UCB applied to trees (UCT) algorithm with a more powerful heuristic function based on Progressive Bias method and two pruning strategies for a traditional board game Gomoku.

Reinforcement Learning

Neural Style Transfer for Point Clouds

no code implementations 14 Mar 2019 Xu Cao, Weimin Wang, Katashi Nagao

How can we edit or transform the geometric or color property of a point cloud?

Style Transfer

Point Cloud Colorization Based on Densely Annotated 3D Shape Dataset

no code implementations 12 Oct 2018 Xu Cao, Katashi Nagao

This paper introduces DensePoint, a densely sampled and annotated point cloud dataset containing over 10,000 single objects across 16 categories, by merging different kinds of information from two existing datasets.

Colorization
