1 code implementation • 10 Sep 2024 • Jingkai Zhou, Benzhi Wang, Weihua Chen, Jingqi Bai, Dongyang Li, Aixi Zhang, Hao Xu, Mingyang Yang, Fan Wang
2) The hands generated using the DWPose sequence are blurry and unrealistic.
no code implementations • 5 Sep 2024 • Benzhi Wang, Jingkai Zhou, Jingqi Bai, Yang Yang, Weihua Chen, Fan Wang, Zhen Lei
First, it generates realistic human parts, such as hands or faces, using the original malformed parts as references, ensuring consistent details with the original image.
no code implementations • 23 Jul 2024 • Canyu Zhao, MingYu Liu, Wen Wang, Weihua Chen, Fan Wang, Hao Chen, Bo Zhang, Chunhua Shen
Our approach utilizes autoregressive models for global narrative coherence, predicting sequences of visual tokens that are subsequently transformed into high-quality video frames through diffusion rendering.
no code implementations • CVPR 2024 • Junyan Wang, Zhenhong Sun, Zhiyu Tan, Xuanbai Chen, Weihua Chen, Hao Li, Cheng Zhang, Yang song
Vanilla text-to-image diffusion models struggle with generating accurate human images, commonly resulting in imperfect anatomies such as unnatural postures or disproportionate limbs. Existing methods address this issue mostly by fine-tuning the model with extra images or adding additional controls -- human-centric priors such as pose or depth maps -- during the image generation phase.
no code implementations • 7 Sep 2023 • Shuting He, Weihua Chen, Kai Wang, Hao Luo, Fan Wang, Wei Jiang, Henghui Ding
Then, to measure the importance of each generated region, we introduce a Region Assessment Module (RAM) that assigns confidence scores to different regions and reduces the negative impact of the occlusion regions by lower scores.
1 code implementation • 15 Jun 2023 • Yuqi Zhang, Qi Qian, Hongsong Wang, Chong Liu, Weihua Chen, Fan Wang
In particular, the plain GCR is extended for cross-camera retrieval and an improved feature propagation formulation is presented to leverage affinity relationships across different cameras.
1 code implementation • 15 Jun 2023 • Chong Liu, Yuqi Zhang, Hongsong Wang, Weihua Chen, Fan Wang, Yan Huang, Yi-Dong Shen, Liang Wang
Most previous works either simply learn coarse-grained representations of the overall image and text, or elaborately establish the correspondence between image regions or pixels and text words.
1 code implementation • 4 May 2023 • Yuan Zhang, Weihua Chen, Yichen Lu, Tao Huang, Xiuyu Sun, Jian Cao
Knowledge distillation is an effective paradigm for boosting the performance of pocket-size model, especially when multiple teacher models are available, the student would break the upper limit again.
4 code implementations • CVPR 2023 • Weihua Chen, Xianzhe Xu, Jian Jia, Hao Luo, Yaohua Wang, Fan Wang, Rong Jin, Xiuyu Sun
Unlike the existing self-supervised learning methods, prior knowledge from human images is utilized in SOLIDER to build pseudo semantic labels and import more semantic information into the learned representation.
Ranked #1 on Person Search on PRW
3 code implementations • 23 Nov 2022 • Xianzhe Xu, Yiqi Jiang, Weihua Chen, Yilun Huang, Yuan Zhang, Xiuyu Sun
Based on these new techs, we build a suite of models at various scales to meet the needs of different scenarios.
Ranked #43 on Real-Time Object Detection on MS COCO
1 code implementation • 24 Oct 2022 • Zhaopeng Dou, Zhongdao Wang, Weihua Chen, YaLi Li, Shengjin Wang
(3) the data uncertainty and the model uncertainty are jointly learned in a unified network, and they serve as two fundamental criteria for the reliability assessment: if a probe is high-quality (low data uncertainty) and the model is confident in the prediction of the probe (low model uncertainty), the final ranking will be assessed as reliable.
no code implementations • 12 Jul 2022 • Xiao Pan, Hao Luo, Weihua Chen, Fan Wang, Hao Li, Wei Jiang, Jianming Zhang, Jianyang Gu, Peike Li
To address this issue, we propose the Ranking-based Backward Compatible Learning (RBCL), which directly optimizes the ranking metric between new features and old features.
1 code implementation • 28 Dec 2021 • Kai Chen, Weihua Chen, Tao He, Rong Du, Fan Wang, Xiuyu Sun, Yuchen Guo, Guiguang Ding
In TAGPerson, we extract information from target scenes and use them to control our parameterized rendering process to generate target-aware synthetic images, which would hold a smaller gap to the real images in the target domain.
no code implementations • 17 Nov 2021 • Ming Yan, Haiyang Xu, Chenliang Li, Junfeng Tian, Bin Bi, Wei Wang, Weihua Chen, Xianzhe Xu, Fan Wang, Zheng Cao, Zhicheng Zhang, Qiyu Zhang, Ji Zhang, Songfang Huang, Fei Huang, Luo Si, Rong Jin
The Visual Question Answering (VQA) task utilizes both visual image and language analysis to answer a textual question with respect to an image.
Ranked #8 on Visual Question Answering (VQA) on VQA v2 test-dev
no code implementations • 29 Sep 2021 • Tao He, Tongkun Xu, Weihua Chen, Yuchen Guo, Guiguang Ding, Zhenhua Guo
Due to the discrepancies between cameras caused by illumination, background, or viewpoint, the underlying difficulty for Re-ID is the camera bias problem, which leads to the large gap of within-identity features from different cameras.
2 code implementations • ICLR 2022 • Tongkun Xu, Weihua Chen, Pichao Wang, Fan Wang, Hao Li, Rong Jin
Along with the pseudo labels, a weight-sharing triple-branch transformer framework is proposed to apply self-attention and cross-attention for source/target feature learning and source-target domain alignment, respectively.
Ranked #4 on Domain Adaptation on Office-31
no code implementations • 23 Aug 2021 • Yiqi Jiang, Weihua Chen, Xiuyu Sun, Xiaoyu Shi, Fan Wang, Hao Li
Recently, GAN based method has demonstrated strong effectiveness in generating augmentation data for person re-identification (ReID), on account of its ability to bridge the gap between domains and enrich the data variety in feature space.
no code implementations • ICCV 2021 • Takashi Isobe, Dong Li, Lu Tian, Weihua Chen, Yi Shan, Shengjin Wang
We observe that these proposed schemes are capable of facilitating the learning of discriminative feature representations.
1 code implementation • 5 Jul 2021 • Yuqi Zhang, Qian Qi, Chong Liu, Weihua Chen, Fan Wang, Hao Li, Rong Jin
In this work, we propose a graph-based re-ranking method to improve learned features while still keeping Euclidean distance as the similarity metric.
1 code implementation • 20 May 2021 • Hao Luo, Weihua Chen, Xianzhe Xu, Jianyang Gu, Yuqi Zhang, Chong Liu, Yiqi Jiang, Shuting He, Fan Wang, Hao Li
We mainly focus on four points, i. e. training data, unsupervised domain-adaptive (UDA) training, post-processing, model ensembling in this challenge.
1 code implementation • 14 May 2021 • Chong Liu, Yuqi Zhang, Hao Luo, Jiasheng Tang, Weihua Chen, Xianzhe Xu, Fan Wang, Hao Li, Yi-Dong Shen
Multi-Target Multi-Camera Tracking has a wide range of applications and is the basis for many advanced inferences and predictions.
1 code implementation • 25 Dec 2020 • Jianyang Gu, Hao Luo, Weihua Chen, Yiqi Jiang, Yuqi Zhang, Shuting He, Fan Wang, Hao Li, Wei Jiang
Considering the large gap between the source domain and target domain, we focused on solving two biases that influenced the performance on domain adaptive pedestrian Re-ID and proposed a two-stage training procedure.
2 code implementations • 22 Apr 2020 • Shuting He, Hao Luo, Weihua Chen, Miao Zhang, Yuqi Zhang, Fan Wang, Hao Li, Wei Jiang
Our solution is based on a strong baseline with bag of tricks (BoT-BS) proposed in person ReID.
3 code implementations • CVPR 2017 • Weihua Chen, Xiaotang Chen, Jian-Guo Zhang, Kaiqi Huang
In particular, a quadruplet deep network using a margin-based online hard negative mining is proposed based on the quadruplet loss for the person ReID.
no code implementations • 19 Jul 2016 • Weihua Chen, Xiaotang Chen, Jian-Guo Zhang, Kaiqi Huang
Person re-identification (ReID) focuses on identifying people across different scenes in video surveillance, which is usually formulated as a binary classification task or a ranking task in current person ReID approaches.
1 code implementation • 12 Feb 2015 • Weihua Chen, Lijun Cao, Xiaotang Chen, Kaiqi Huang
Non-overlapping multi-camera visual object tracking typically consists of two steps: single camera object tracking and inter-camera object tracking.