no code implementations • 13 May 2024 • Yao Sun, Tengyu Jing, Jiapeng Wang, Wei Wang
The Transformer branch network is used to extract the temporal characteristics of historical trajectories and capture the impact of the fighter's historical state on future trajectories, while the GAT branch network is used to extract spatial features in historical trajectories and capture potential spatial correlations between fighters. Then we concatenate the outputs of the two branches into a new feature vector and input it into a decoder composed of a fully connected network to predict the future position coordinates of the blue army fighter. The computer simulation results show that the proposed network significantly improves the prediction accuracy of flight trajectories compared to the enhanced CNN-LSTM network (ECNN-LSTM), with improvements of 47% and 34% in both ADE and FDE indicators, providing strong support for subsequent autonomous combat missions.
no code implementations • 8 Mar 2024 • Jiapeng Wang, Chengyu Wang, Tingfeng Cao, Jun Huang, Lianwen Jin
We present DiffChat, a novel method to align Large Language Models (LLMs) to "chat" with prompt-as-input Text-to-Image Synthesis (TIS) models (e. g., Stable Diffusion) for interactive image creation.
no code implementations • 7 Jan 2024 • Zening Lin, Jiapeng Wang, Teng Li, Wenhui Liao, Dayi Huang, Longfei Xiong, Lianwen Jin
However, simply concatenating SER and RE serially can lead to severe error propagation, and it fails to handle cases like multi-line entities in real scenarios.
1 code implementation • ICCV 2023 • Qing Jiang, Jiapeng Wang, Dezhi Peng, Chongyu Liu, Lianwen Jin
To this end, we consolidate a large-scale real STR dataset, namely Union14M, which comprises 4 million labeled images and 10 million unlabeled images, to assess the performance of STR models in more complex real-world scenarios.
no code implementations • 28 May 2023 • Jiapeng Wang, Chengyu Wang, Xiaodan Wang, Jun Huang, Lianwen Jin
Large-scale pre-trained text-image models with dual-encoder architectures (such as CLIP) are typically adopted for various vision-language applications, including text-image retrieval.
no code implementations • 8 Sep 2022 • Jiapeng Wang, Ming Ma, Zhenhua Yu
In this paper, we propose a simple yet effective differentiable network pruning method CWP based on instance complexity weighted filter importance scores.
2 code implementations • ACL 2022 • Jiapeng Wang, Lianwen Jin, Kai Ding
LiLT can be pre-trained on the structured documents of a single language and then directly fine-tuned on other languages with the corresponding off-the-shelf monolingual/multilingual pre-trained textual models.
Ranked #5 on Key Information Extraction on CORD
no code implementations • 24 Jun 2021 • Guozhi Tang, Lele Xie, Lianwen Jin, Jiapeng Wang, Jingdong Chen, Zhen Xu, Qianying Wang, Yaqiang Wu, Hui Li
Through key-value matching based on relevancy evaluation, the proposed MatchVIE can bypass the recognitions to various semantics, and simply focuses on the strong relevancy between entities.
no code implementations • 20 Jun 2021 • Jiapeng Wang, Tianwei Wang, Guozhi Tang, Lianwen Jin, Weihong Ma, Kai Ding, Yichao Huang
Visual information extraction (VIE) has attracted increasing attention in recent years.
1 code implementation • 24 Jan 2021 • Jiapeng Wang, Chongyu Liu, Lianwen Jin, Guozhi Tang, Jiaxin Zhang, Shuaitao Zhang, Qianying Wang, Yaqiang Wu, Mingxiang Cai
Visual information extraction (VIE) has attracted considerable attention recently owing to its various advanced applications such as document understanding, automatic marking and intelligent education.
1 code implementation • 14 Jul 2020 • Weihong Ma, Hesuo Zhang, Lianwen Jin, Sihang Wu, Jiapeng Wang, Yongpan Wang
In this framework, two branches named character branch and layout branch are added behind the feature extraction network.