Search Results for author: Caiyan Jia

Found 13 papers, 4 papers with code

GraphBEV: Towards Robust BEV Feature Alignment for Multi-Modal 3D Object Detection

no code implementations • 18 Mar 2024 • Ziying Song, Lei Yang, Shaoqing Xu, Lin Liu, Dongyang Xu, Caiyan Jia, Feiyang Jia, Li Wang

Additionally, we propose a Global Align module to rectify the misalignment between LiDAR and camera BEV features.

3D Object Detection • Autonomous Driving +3

Instruction-Guided Scene Text Recognition

no code implementations • 31 Jan 2024 • Yongkun Du, Zhineng Chen, Yuchen Su, Caiyan Jia, Yu-Gang Jiang

Multi-modal models have shown appealing performance in visual tasks recently, as instruction-guided training has evoked the ability to understand fine-grained visual content.

Scene Text Recognition

Robustness-Aware 3D Object Detection in Autonomous Driving: A Review and Outlook

no code implementations • 12 Jan 2024 • Ziying Song, Lin Liu, Feiyang Jia, Yadan Luo, Guoxin Zhang, Lei Yang, Li Wang, Caiyan Jia

In the realm of modern autonomous driving, the perception system is indispensable for accurately assessing the state of the surrounding environment, thereby enabling informed prediction and planning.

3D Object Detection • Autonomous Driving +2

RoboFusion: Towards Robust Multi-Modal 3D Object Detection via SAM

1 code implementation • 8 Jan 2024 • Ziying Song, Guoxing Zhang, Lin Liu, Lei Yang, Shaoqing Xu, Caiyan Jia, Feiyang Jia, Li Wang

To align SAM or SAM-AD with multi-modal methods, we then introduce AD-FPN for upsampling the image features extracted by SAM.

3D Object Detection • Autonomous Driving +2

VoxelNextFusion: A Simple, Unified and Effective Voxel Fusion Framework for Multi-Modal 3D Object Detection

no code implementations • 5 Jan 2024 • Ziying Song, Guoxin Zhang, Jun Xie, Lin Liu, Caiyan Jia, Shaoqing Xu, Zhepeng Wang

In particular, we propose a voxel-based image pipeline that involves projecting point clouds onto images to obtain both pixel- and patch-level features.

3D Object Detection • Feature Importance +2

VGA: Vision and Graph Fused Attention Network for Rumor Detection

no code implementations • 3 Jan 2024 • Lin Bai, Caiyan Jia, Ziying Song, Chaoqun Cui

Moreover, these methods usually extract visual features only in a basic manner and seldom consider tampering or textual information in images.

GraphAlign: Enhancing Accurate Feature Alignment by Graph matching for Multi-Modal 3D Object Detection

no code implementations • ICCV 2023 • Ziying Song, Haiyue Wei, Lin Bai, Lei Yang, Caiyan Jia

Through the projection calibration between the image and point cloud, we project the nearest neighbors of point cloud features onto the image features.

3D Object Detection • Autonomous Driving +3

Context Perception Parallel Decoder for Scene Text Recognition

1 code implementation • 23 Jul 2023 • Yongkun Du, Zhineng Chen, Caiyan Jia, Xiaoting Yin, Chenxia Li, Yuning Du, Yu-Gang Jiang

We first present an empirical study of AR decoding in STR, and discover that the AR decoder not only models linguistic context, but also provides guidance on visual context perception.

Ranked #1 on Scene Text Recognition on CUTE80 (using extra training data)

Decoder • Language Modelling +1

Unsupervised Cross-Domain Rumor Detection with Contrastive Learning and Cross-Attention

no code implementations • 20 Mar 2023 • Hongyan Ran, Caiyan Jia

Moreover, we use a cross-attention mechanism on a pair of source data and target data with the same labels to learn domain-invariant representations.

Contrastive Learning

Deep Embedded Clustering with Distribution Consistency Preservation for Attributed Networks

1 code implementation • 28 May 2022 • Yimei Zheng, Caiyan Jia, Jian Yu, Xuanya Li

Under the assumption of consistency for data in different views, the cluster structure of network topology and that of node attributes should be consistent for an attributed network.

Attribute Clustering

SVTR: Scene Text Recognition with a Single Visual Model

2 code implementations • 30 Apr 2022 • Yongkun Du, Zhineng Chen, Caiyan Jia, Xiaoting Yin, Tianlun Zheng, Chenxia Li, Yuning Du, Yu-Gang Jiang

Dominant scene text recognition models commonly contain two building blocks, a visual model for feature extraction and a sequence model for text transcription.

Ranked #16 on Scene Text Recognition on ICDAR2013