no code implementations • 7 Apr 2024 • Youze Xue, Binghui Chen, Yifeng Geng, Xuansong Xie, Jiansheng Chen, Hongbing Ma
Customized generative text-to-image models have the ability to produce images that closely resemble a given subject.
no code implementations • 7 Apr 2024 • Binghui Chen, Wenyu Li, Yifeng Geng, Xuansong Xie, WangMeng Zuo
Specifically, we propose a shoe-wearing system, called Shoe-Model, to generate plausible images of human legs interacting with the given shoes.
no code implementations • 1 Mar 2024 • Zhenpeng Huang, Chao Li, Hao Chen, Yongjian Deng, Yifeng Geng, LiMin Wang
Our pre-training overcomes the limitations of previous methods, which either sacrifice temporal information by converting event sequences into 2D images for utilizing pre-trained image models or directly employ paired image data for knowledge distillation to enhance the learning of event streams.
no code implementations • 3 Jan 2024 • Jun-Yan He, Zhi-Qi Cheng, Chenyang Li, Jingdong Sun, Wangmeng Xiang, Yusen Hu, Xianhui Lin, Xiaoyang Kang, Zengke Jin, Bin Luo, Yifeng Geng, Xuansong Xie, Jingren Zhou
This paper introduces the WordArt Designer API, a novel framework for user-driven artistic typography synthesis utilizing Large Language Models (LLMs) on ModelScope.
1 code implementation • 29 Dec 2023 • Jiawen Zhu, Zhi-Qi Cheng, Jun-Yan He, Chenyang Li, Bin Luo, Huchuan Lu, Yifeng Geng, Xuansong Xie
The perception component then generates the tracking results based on the embeddings.
no code implementations • 9 Nov 2023 • Wei Tan, Yifeng Geng, Xuansong Xie
On CoreML, FMViT outperforms MobileOne by 2.6% in top-1 accuracy on the ImageNet dataset, with inference latency comparable to MobileOne (78.5% vs. 75.9%).
1 code implementation • 6 Nov 2023 • Yuxiang Tuo, Wangmeng Xiang, Jun-Yan He, Yifeng Geng, Xuansong Xie
Based on the AnyWord-3M dataset, we propose AnyText-benchmark for the evaluation of visual text generation accuracy and quality.
no code implementations • 20 Oct 2023 • Jun-Yan He, Zhi-Qi Cheng, Chenyang Li, Jingdong Sun, Wangmeng Xiang, Xianhui Lin, Xiaoyang Kang, Zengke Jin, Yusen Hu, Bin Luo, Yifeng Geng, Xuansong Xie, Jingren Zhou
This paper introduces WordArt Designer, a user-driven framework for artistic typography synthesis, relying on the Large Language Model (LLM).
1 code implementation • 4 Sep 2023 • Hanbing Liu, Wangmeng Xiang, Jun-Yan He, Zhi-Qi Cheng, Bin Luo, Yifeng Geng, Xuansong Xie
Accurately estimating the 3D pose of humans in video sequences requires both accuracy and a well-structured architecture.
1 code implementation • 18 Aug 2023 • Hanbing Liu, Jun-Yan He, Zhi-Qi Cheng, Wangmeng Xiang, Qize Yang, Wenhao Chai, Gaoang Wang, Xu Bao, Bin Luo, Yifeng Geng, Xuansong Xie
Typically, PoSynDA uses a diffusion-inspired structure to simulate 3D pose distribution in the target domain.
1 code implementation • ICCV 2023 • Junwen He, Yifan Wang, Lijun Wang, Huchuan Lu, Jun-Yan He, Jin-Peng Lan, Bin Luo, Yifeng Geng, Xuansong Xie
Our method sets the new state of the art for depth-aware panoptic segmentation on both Cityscapes-DVPS and SemKITTI-DVPS datasets.
no code implementations • 6 Jun 2023 • Yanwen Fang, Jintai Chen, Peng-Tao Jiang, Chao Li, Yifeng Geng, Eddy K. F. LAM, Guodong Li
Multi-person motion prediction is a challenging task, especially for real-world scenarios of highly interacted persons.
1 code implementation • 25 May 2023 • Xu Bao, Zhi-Qi Cheng, Jun-Yan He, Chenyang Li, Wangmeng Xiang, Jingdong Sun, Hanbing Liu, Wei Liu, Bin Luo, Yifeng Geng, Xuansong Xie
By spearheading the integration of Multilateration with facial analysis, KeyPosS marks a paradigm shift in facial landmark detection.
1 code implementation • 19 May 2023 • Yuxuan Zhou, Zhi-Qi Cheng, Jun-Yan He, Bin Luo, Yifeng Geng, Xuansong Xie
As a remedy, we propose a threefold strategy: (1) We forge an innovative pathway that encodes bone connectivity by harnessing the power of graph distances.
1 code implementation • 30 Mar 2023 • Jun-Yan He, Zhi-Qi Cheng, Chenyang Li, Wangmeng Xiang, Binghui Chen, Bin Luo, Yifeng Geng, Xuansong Xie
Real-time perception, or streaming perception, is a crucial aspect of autonomous driving that has yet to be thoroughly explored in existing research.
1 code implementation • CVPR 2023 • Junjie He, Pengyu Li, Yifeng Geng, Xuansong Xie
In this paper, we show the strong potential of query-based models on efficient instance segmentation algorithm designs.
1 code implementation • 3 Feb 2023 • Hanyuan Chen, Jun-Yan He, Wangmeng Xiang, Zhi-Qi Cheng, Wei Liu, Hanbing Liu, Bin Luo, Yifeng Geng, Xuansong Xie
Human pose estimation is a challenging task due to its structured data sequence nature.
Ranked #74 on 3D Human Pose Estimation on Human3.6M
no code implementations • CVPR 2023 • Xiaolin Song, Binghui Chen, Pengyu Li, Jun-Yan He, Biao Wang, Yifeng Geng, Xuansong Xie, Honggang Zhang
End-to-end pedestrian detection focuses on training a pedestrian detection model via discarding the Non-Maximum Suppression (NMS) post-processing.
1 code implementation • 17 Nov 2022 • Yuxuan Zhou, Zhi-Qi Cheng, Chao Li, Yanwen Fang, Yifeng Geng, Xuansong Xie, Margret Keuper
Skeleton-based action recognition aims to recognize human actions given human joint coordinates with skeletal interconnections.
Ranked #7 on Skeleton Based Action Recognition on NTU RGB+D 120
2 code implementations • 27 Oct 2022 • Jin-Peng Lan, Zhi-Qi Cheng, Jun-Yan He, Chenyang Li, Bin Luo, Xu Bao, Wangmeng Xiang, Yifeng Geng, Xuansong Xie
Existing Visual Object Tracking (VOT) only takes the target area in the first frame as a template.
2 code implementations • 27 Oct 2022 • Chenyang Li, Zhi-Qi Cheng, Jun-Yan He, Pengyu Li, Bin Luo, Hanyuan Chen, Yifeng Geng, Jin-Peng Lan, Xuansong Xie
Streaming perception is a critical task in autonomous driving that requires balancing the latency and accuracy of the autopilot system.
no code implementations • 13 Jan 2021 • Mengting Chen, Xinggang Wang, Heng Luo, Yifeng Geng, Wenyu Liu
By applying the proposed feature matching block in different layers of the few-shot recognition network, multi-scale information among the compared images can be incorporated into the final cascaded matching feature, which boosts the recognition performance further and generalizes better by learning on relationships.
1 code implementation • 31 Dec 2019 • Mengting Chen, Yuxin Fang, Xinggang Wang, Heng Luo, Yifeng Geng, Xin-Yu Zhang, Chang Huang, Wenyu Liu, Bo Wang
The learning problem of the sample generation (i.e., diversity transfer) is solved via minimizing an effective meta-classification loss in a single-stage network, instead of the generative loss in previous works.