1 code implementation • 26 May 2025 • Zhongzhan Huang, Guoming Ling, Shanshan Zhong, Hefeng Wu, Liang Lin
Long Context Understanding (LCU) is a critical area for exploration in current large language models (LLMs).
no code implementations • 3 May 2025 • Kaidong Zhang, Rongtao Xu, Pengzhen Ren, Junfan Lin, Hefeng Wu, Liang Lin, Xiaodan Liang
Operating robots in open-ended scenarios with diverse tasks is a crucial research and application direction in robotics.
1 code implementation • 8 Mar 2025 • Zhongzhan Huang, Guoming Ling, Yupei Lin, Yandong Chen, Shanshan Zhong, Hefeng Wu, Liang Lin
This improvement can even surpass the performance of the best single model in the pool and many existing strong LLMs, confirming it as a highly promising paradigm.
no code implementations • 9 Dec 2024 • Haijing Liu, Tao Pu, Hefeng Wu, Keze Wang, Liang Lin
The proposed framework consists of two complementary modules, i.e., an intra-category semantic refinement (ISR) module and an inter-category semantic transfer (IST) module.
1 code implementation • 4 Sep 2024 • Zhibin Liu, Haoye Dong, Aviral Chharia, Hefeng Wu
Generating lifelike 3D humans from a single RGB image remains a challenging task in computer vision, as it requires accurate modeling of geometry, high-quality texture, and plausible unseen parts.
Ranked #1 on
Lifelike 3D Human Generation
on THuman2.0 Dataset
no code implementations • 8 Aug 2024 • Hefeng Wu, Hao Jiang, Keze Wang, Ziyi Tang, Xianghuan He, Liang Lin
The pursuit of greater interpretability in neural networks often results in a degradation of their original performance.
no code implementations • 23 Apr 2024 • Weifeng Chen, Jiacheng Zhang, Jie Wu, Hefeng Wu, Xuefeng Xiao, Liang Lin
The rapid development of diffusion models has triggered diverse applications.
no code implementations • 18 Jan 2024 • Jie Qin, Jie Wu, Weifeng Chen, Yuxi Ren, Huixia Li, Hefeng Wu, Xuefeng Xiao, Rui Wang, Shilei Wen
Diffusion models have opened up new avenues for the field of image generation, resulting in the proliferation of high-quality models shared on open-source platforms.
no code implementations • 13 Jan 2024 • Hefeng Wu, Guangzhi Ye, Ziyang Zhou, Ling Tian, Qing Wang, Liang Lin
Specifically, an instance-view data hallucination module hallucinates each sample of a novel class to generate new data by employing local semantic correlated attention and global semantic feature fusion derived from base classes.
1 code implementation • 16 Nov 2023 • Hefeng Wu, Yandong Chen, Lingbo Liu, Tianshui Chen, Keze Wang, Liang Lin
In the localization stage, the Scale-aware Multi-head Localization (SAML) module utilizes the query tensor to predict the confidence, location, and size of each potential object.
1 code implementation • 15 Nov 2023 • Hefeng Wu, Weifeng Chen, Zhibin Liu, Tianshui Chen, Zhiguang Chen, Liang Lin
Moreover, we propose a proximity data generation (PDG) module to automatically produce more diverse data for cross-modal training.
1 code implementation • 10 Oct 2023 • Fei Wang, Kongzhang Tang, Hefeng Wu, Baoquan Zhao, Hao Cai, Teng Zhou
Compared with natural images, freehand sketches can depict various shapes far more flexibly, offering a valuable, high-potential route to 3D human reconstruction.
1 code implementation • 23 Sep 2023 • Tao Pu, Tianshui Chen, Hefeng Wu, Yongyi Lu, Liang Lin
In this work, we propose a spatial-temporal knowledge-embedded transformer (STKET) that incorporates the prior spatial-temporal knowledge into the multi-head cross-attention mechanism to learn more representative relationship representations.
1 code implementation • 23 May 2023 • Weifeng Chen, Yatai Ji, Jie Wu, Hefeng Wu, Pan Xie, Jiashi Li, Xin Xia, Xuefeng Xiao, Liang Lin
Recent advances in text-to-image (T2I) diffusion models have enabled impressive image generation capabilities guided by text prompts.
no code implementations • 6 May 2023 • Yang Wu, Zhibin Liu, Hefeng Wu, Liang Lin
In this paper, we study video synthesis with emphasis on simplifying the generation conditions.
no code implementations • 15 Nov 2022 • Tao Pu, Qianru Lao, Hefeng Wu, Tianshui Chen, Liang Lin
To reject noisy labels, recent works regard large-loss samples as noise but ignore the semantic correlation among different multi-label images.
1 code implementation • 26 May 2022 • Tao Pu, Tianshui Chen, Hefeng Wu, Yukai Shi, Zhijing Yang, Liang Lin
Specifically, an instance-perspective representation blending (IPRB) module is designed to blend the representations of the known labels in an image with the representations of the corresponding unknown labels in another image to complement these unknown labels.
no code implementations • 8 Apr 2022 • Tao Pu, Mingzhan Sun, Hefeng Wu, Tianshui Chen, Ling Tian, Liang Lin
We also design an object erasing (OE) module to implicitly learn semantic dependency among categories by erasing semantic-aware regions to regularize the network training.
1 code implementation • 4 Mar 2022 • Tao Pu, Tianshui Chen, Hefeng Wu, Liang Lin
However, these algorithms depend on sufficient multi-label annotations to train the models, leading to poor performance, especially when the proportion of known labels is low.
Multi-Label Image Recognition
Multi-label Image Recognition with Partial Labels
1 code implementation • 21 Dec 2021 • Tianshui Chen, Tao Pu, Hefeng Wu, Yuan Xie, Liang Lin
To reduce the annotation cost, we propose a structured semantic transfer (SST) framework that enables training multi-label recognition models with partial labels, i.e., only some labels are known while other labels are missing (also called unknown labels) per image.
Multi-Label Image Recognition
Multi-label Image Recognition with Partial Labels
1 code implementation • 29 Dec 2020 • Tao Pu, Tianshui Chen, Yuan Xie, Hefeng Wu, Liang Lin
In this work, we explore the correlations among the action units and facial expressions, and devise an AU-Expression Knowledge Constrained Representation Learning (AUE-CRL) framework to learn the AU representations without AU annotations and adaptively use representations to facilitate facial expression recognition.
Facial Expression Recognition
Facial Expression Recognition (FER)
1 code implementation • CVPR 2021 • Lingbo Liu, Jiaqi Chen, Hefeng Wu, Guanbin Li, Chenglong Li, Liang Lin
Extensive experiments conducted on the RGBT-CC benchmark demonstrate the effectiveness of our framework for RGBT crowd counting.
no code implementations • 20 Sep 2020 • Tianshui Chen, Liang Lin, Riquan Chen, Xiaolu Hui, Hefeng Wu
The framework exploits prior knowledge to guide adaptive information propagation among different categories to facilitate multi-label analysis and reduce the dependency on training samples.
1 code implementation • 3 Aug 2020 • Yuan Xie, Tianshui Chen, Tao Pu, Hefeng Wu, Liang Lin
However, most of these works focus on holistic feature adaptation, and they ignore local features that are more transferable across different datasets.
Cross-Domain Facial Expression Recognition
Facial Expression Recognition (FER)
1 code implementation • 3 Aug 2020 • Tianshui Chen, Tao Pu, Hefeng Wu, Yuan Xie, Lingbo Liu, Liang Lin
Although each claims to achieve superior performance, fair comparisons are lacking due to the inconsistent choices of the source/target datasets and feature extractors.
Ranked #1 on
Cross-Domain Facial Expression Recognition
on Source: AFE, Target: CK+, JAFFE, SFEW2.0, FER2013, ExpW
Cross-Domain Facial Expression Recognition
Domain Adaptation
1 code implementation • 21 Jul 2020 • Jie Wu, Tianshui Chen, Hefeng Wu, Zhi Yang, Guangchun Luo, Liang Lin
This is primarily due to (i) the conservative characteristic of traditional training objectives that drives the model to generate correct but hardly discriminative captions for similar images and (ii) the uneven word distribution of the ground-truth captions, which encourages generating highly frequent words/phrases while suppressing the less frequent but more concrete ones.
2 code implementations • 23 Mar 2020 • Lingbo Liu, Jiaqi Chen, Hefeng Wu, Tianshui Chen, Guanbin Li, Liang Lin
Crowd counting is an application-oriented task and its inference efficiency is crucial for real-world applications.
2 code implementations • 14 Jan 2020 • Lingbo Liu, Jingwen Chen, Hefeng Wu, Jiajie Zhen, Guanbin Li, Liang Lin
To address this problem, we model a metro system as graphs with various topologies and propose a unified Physical-Virtual Collaboration Graph Network (PVCGN), which can effectively learn the complex ridership patterns from the tailor-designed graphs.
1 code implementation • 21 Nov 2019 • Riquan Chen, Tianshui Chen, Xiaolu Hui, Hefeng Wu, Guanbin Li, Liang Lin
In this work, we represent the semantic correlations in the form of a structured knowledge graph and integrate this graph into deep neural networks to promote few-shot learning by a novel Knowledge Graph Transfer Network (KGTN).
2 code implementations • ICCV 2019 • Tianshui Chen, Muxin Xu, Xiaolu Hui, Hefeng Wu, Liang Lin
Recognizing multiple labels of images is a practical and challenging task, and significant progress has been made by searching semantic-aware regions and modeling label dependency.
Ranked #9 on
Multi-Label Classification
on PASCAL VOC 2007
no code implementations • 29 May 2019 • Hefeng Wu, Yafei Hu, Keze Wang, Hanhui Li, Lin Nie, Hui Cheng
Multi-Person Tracking (MPT) is often addressed within the detection-to-association paradigm.
no code implementations • 28 Dec 2018 • Fei Wang, Shujin Lin, Hanhui Li, Hefeng Wu, Junkun Jiang, Ruomei Wang, Xiaonan Luo
Traditional sketch segmentation methods mainly rely on handcrafted features and complicated models, and their performance is far from satisfactory due to the abstract representation of sketches.
1 code implementation • CVPR 2019 • Ning Liu, Yongchao Long, Changqing Zou, Qun Niu, Li Pan, Hefeng Wu
We propose an attention-injective deformable convolutional network called ADCrowdNet for crowd understanding that can address the accuracy degradation problem in highly congested noisy scenes.
Ranked #2 on
Crowd Counting
on TRANCOS
no code implementations • 20 Jan 2018 • Hanhui Li, Xiangjian He, Hefeng Wu, Saeed Amirgholipour Kasmani, Ruomei Wang, Xiaonan Luo, Liang Lin
In this paper, we aim to tackle the problem of crowd counting in extremely high-density scenes, which contain hundreds or even thousands of people.
no code implementations • 29 Dec 2017 • Daiguo Deng, Ruomei Wang, Hefeng Wu, Huayong He, Qi Li, Xiaonan Luo
Fabric image retrieval is beneficial to many applications including clothing searching, online shopping and cloth modeling.