1 code implementation • 10 Sep 2024 • Jingkai Zhou, Benzhi Wang, Weihua Chen, Jingqi Bai, Dongyang Li, Aixi Zhang, Hao Xu, Mingyang Yang, Fan Wang
2) The hands generated using the DWPose sequence are blurry and unrealistic.
1 code implementation • 5 Sep 2024 • Benzhi Wang, Jingkai Zhou, Jingqi Bai, Yang Yang, Weihua Chen, Fan Wang, Zhen Lei
First, it generates realistic human parts, such as hands or faces, using the original malformed parts as references, ensuring details consistent with the original image.
no code implementations • ICCV 2023 • Benzhi Wang, Yang Yang, Jinlin Wu, Guo-Jun Qi, Zhen Lei
On the other hand, for the same person, the similarity between cross-scale images is often lower than that between same-scale images, which increases the difficulty of matching.
no code implementations • 26 Apr 2022 • Benzhi Wang, Meiyu Liang, Ang Li
With the advent of the information age, the scale of data on the Internet keeps growing, and this data spans text, images, videos, and other modalities.
no code implementations • 29 Mar 2022 • Benzhi Wang, Meiyu Liang, Feifei Kou, Mingying Xu
Science and technology big data contain a large amount of cross-media information; for example, scientific papers include both images and text. Single-modal search methods cannot adequately meet the needs of researchers. This paper proposes a cross-media scientific research achievements retrieval method based on a deep language model (CARDL). It learns the semantic associations between different modalities to build a unified cross-media semantic representation, applies this representation to generate text semantic vectors for research achievements, and then realizes cross-media retrieval through semantic similarity matching across modalities. Experimental results show that the proposed CARDL method achieves better cross-modal retrieval performance than existing methods.
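The retrieval step described above, once both modalities are embedded in a shared semantic space, reduces to ranking candidates by similarity to the query vector. A minimal sketch of that matching step (not the paper's CARDL implementation; the toy embeddings and item names are invented for illustration):

```python
import math

def cosine(a, b):
    # cosine similarity between two embedding vectors
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def retrieve(query_vec, corpus):
    # corpus maps item ids to embeddings in the shared space;
    # return item ids ranked by similarity to the query
    return sorted(corpus, key=lambda k: cosine(query_vec, corpus[k]), reverse=True)

# toy example: a text-query embedding matched against image embeddings
corpus = {
    "img_a": [0.9, 0.1, 0.0],
    "img_b": [0.1, 0.8, 0.3],
    "img_c": [0.0, 0.2, 0.9],
}
query = [0.85, 0.15, 0.05]
print(retrieve(query, corpus))
```

In a real system the embeddings would come from the learned cross-media model rather than hand-written lists, and the ranking would use an approximate nearest-neighbor index instead of an exhaustive sort.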