1 code implementation • 16 Jul 2024 • Guowei Xu, Jiale Tao, Wen Li, Lixin Duan
Expanding on SLD, we introduce a set of motion queries to enhance the diversity of predictions.
1 code implementation • 9 Jul 2024 • Fanyue Wei, Wei Zeng, Zhenyang Li, Dawei Yin, Lixin Duan, Wen Li
Personalized text-to-image models allow users to generate varied styles of images (specified with a sentence) for an object (specified with a set of reference images).
no code implementations • 4 Jul 2024 • Linlong Fan, Ye Huang, Yanqi Ge, Wen Li, Lixin Duan
It has properties such as viewpoint invariance and rotation robustness, which give it an advantage in addressing the 3D object recognition problem under arbitrary views.
no code implementations • 18 Apr 2024 • Xunsong Li, Pengzhan Sun, Yangcen Liu, Lixin Duan, Wen Li
Existing methods usually adopt a two-stage pipeline, where object proposals are first detected using a pretrained detector and then fed to an action recognition model that extracts video features and learns object relations for action recognition.
no code implementations • 10 Apr 2024 • Yanqi Ge, Jiaqi Liu, Qingnan Fan, Xi Jiang, Ye Huang, Shuai Qin, Hong Gu, Wen Li, Lixin Duan
In this work, we propose a novel solution to the text-driven style transfer task, namely, Adaptive Style Incorporation (ASI), to achieve fine-grained feature-level style incorporation.
no code implementations • 26 Jan 2024 • Yanqi Ge, Ye Huang, Wen Li, Lixin Duan
We introduced SSR, which utilizes SAM (segment-anything) as a strong regularizer during training, to greatly enhance the robustness of the image encoder for handling various domains.
1 code implementation • 19 Dec 2023 • Yanqi Ge, Qiang Nie, Ye Huang, Yong liu, Chengjie Wang, Feng Zheng, Wen Li, Lixin Duan
By pulling the learned features to these semantic anchors, several advantages can be attained: 1) intra-class compactness and natural inter-class separability, 2) induced bias or errors from feature learning can be avoided, and 3) robustness to the long-tailed problem.
no code implementations • 24 Nov 2023 • Yuan Qing, Naixing Wu, Shaohua Wan, Lixin Duan
In the source domain, some training samples are of low relevance to the target domain due to differences in viewpoints, action styles, etc.
no code implementations • 15 Mar 2023 • Ye Huang, Di Kang, Shenghua Gao, Wen Li, Lixin Duan
One crucial design of the HFG is to protect the high-level features from being contaminated by using proper stop-gradient operations so that the backbone does not update according to the noisy gradient from the upsampler.
1 code implementation • 11 Jan 2023 • Ye Huang, Di Kang, Liang Chen, Wenjing Jia, Xiangjian He, Lixin Duan, Xuefei Zhe, Linchao Bao
Extensive experiments and ablation studies conducted on multiple benchmark datasets demonstrate that the proposed CAR can boost the accuracy of all baseline models by up to 2.23% mIOU with superior generalization ability.
1 code implementation • CVPR 2023 • Jinhong Deng, Dongli Xu, Wen Li, Lixin Duan
Self-training approaches recently achieved promising results in cross-domain object detection, where people iteratively generate pseudo labels for unlabeled target domain samples with a model, and select high-confidence samples to refine the model.
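The selection step described above can be sketched as follows. This is a minimal illustration, assuming per-sample class probabilities are already available from the model; the function name `select_pseudo_labels`, the threshold of 0.9, and the toy probabilities are hypothetical and not taken from the paper.

```python
def select_pseudo_labels(probabilities, threshold=0.9):
    """Return (sample_index, label) pairs for confident predictions.

    `probabilities` holds one list of class probabilities per unlabeled
    target sample. A sample is kept only when its highest class
    probability reaches `threshold` (an assumed, illustrative value).
    """
    selected = []
    for index, class_probs in enumerate(probabilities):
        # The pseudo label is the class with the highest probability.
        label = max(range(len(class_probs)), key=lambda c: class_probs[c])
        if class_probs[label] >= threshold:
            selected.append((index, label))
    return selected


# Toy predictions for three unlabeled target samples (made-up numbers).
target_predictions = [
    [0.95, 0.03, 0.02],
    [0.40, 0.35, 0.25],
    [0.05, 0.05, 0.90],
]
pseudo_labels = select_pseudo_labels(target_predictions, threshold=0.9)
```

In an iterative self-training loop, the selected pairs would be added to the training set, the model refined, and the selection repeated with the updated predictions.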
1 code implementation • CVPR 2023 • Anqi Zhao, Tong Chu, Yahao Liu, Wen Li, Jingjing Li, Lixin Duan
On the algorithmic side, we derive a new algorithm for black-box targeted attacks based on our theoretical analysis, in which we additionally minimize the maximum model discrepancy (M3D) of the substitute models when training the generator to generate adversarial examples.
no code implementations • 1 Dec 2022 • Junde Wu, Huihui Fang, Yehui Yang, Yuanpei Liu, Jing Gao, Lixin Duan, Weihua Yang, Yanwu Xu
In this paper, we propose a novel neural network framework, called Multi-Rater Prism (MrPrism), to learn medical image segmentation from multiple labels.
no code implementations • 29 Sep 2022 • Borun Xu, Biao Wang, Jinhong Deng, Jiale Tao, Tiezheng Ge, Yuning Jiang, Wen Li, Lixin Duan
Motion transfer aims to transfer the motion of a driving video to a source image.
1 code implementation • 28 Sep 2022 • Jiale Tao, Biao Wang, Tiezheng Ge, Yuning Jiang, Wen Li, Lixin Duan
Image animation aims to animate a source image by using motion learned from a driving video.
2 code implementations • 5 Aug 2022 • Junde Wu, Huihui Fang, Hoayi Xiong, Lixin Duan, Mingkui Tan, Weihua Yang, Huiying Liu, Yanwu Xu
Inspired by this observation, we propose the diagnosis-first principle, which takes disease diagnosis as the criterion to calibrate the inter-observer segmentation uncertainty.
no code implementations • 1 Jun 2022 • Jinhong Deng, Xiaoyue Zhang, Wen Li, Lixin Duan
In particular, we take advantage of the characteristics of cross-attention as used in the detection transformer and propose the spatial-aware token alignment (SpaTA) and the semantic-aware token alignment (SemTA) strategies to guide the token alignment across domains.
1 code implementation • CVPR 2022 • Yahao Liu, Jinhong Deng, Jiale Tao, Tong Chu, Lixin Duan, Wen Li
Existing works typically treat cross-domain semantic segmentation (CDSS) as a data distribution mismatch problem and focus on aligning the marginal distribution or conditional distribution.
1 code implementation • CVPR 2022 • Jiale Tao, Biao Wang, Borun Xu, Tiezheng Ge, Yuning Jiang, Wen Li, Lixin Duan
Specifically, inspired by the known deformable part model (DPM), our DAM introduces two types of anchors or keypoints: i) a number of motion anchors that capture both appearance and motion information from the source image and driving video; ii) a latent root anchor, which is linked to the motion anchors to facilitate better learning of the representations of the object structure information.
no code implementations • CVPR 2022 • Fanyue Wei, Biao Wang, Tiezheng Ge, Yuning Jiang, Wen Li, Lixin Duan
To this end, we propose to learn pixel-level distinctions to improve video highlight detection.
no code implementations • 1 Apr 2022 • Yan Zhang, Changyu Li, Ivor W. Tsang, Hui Xu, Lixin Duan, Hongzhi Yin, Wen Li, Jie Shao
Motivated by the idea of meta-augmentation, in this paper we treat a user's preference over items as a task and propose a Diverse Preference Augmentation framework with multiple source domains based on meta-learning (referred to as MetaDPA) to i) generate diverse ratings in a new domain of interest (known as the target domain) to handle overfitting in the case of sparse interactions, and ii) learn a preference model in the target domain via a meta-learning scheme to alleviate cold-start issues.
1 code implementation • 19 Dec 2021 • Borun Xu, Biao Wang, Jiale Tao, Tiezheng Ge, Yuning Jiang, Wen Li, Lixin Duan
Creative image animations are attractive in e-commerce applications, where motion transfer is one of the important ways to generate animations from static images.
1 code implementation • ACM International Conference on Multimedia 2021 • Pengzhan Sun, Bo Wu, Xunsong Li, Wen Li, Lixin Duan, Chuang Gan
By doing that, our proposed CDN method can better recognize unseen action instances by debiasing the effect of appearances.
no code implementations • CVPR 2021 • Fengmao Lv, Xiang Chen, Yanyong Huang, Lixin Duan, Guosheng Lin
In turn, it also collects the reinforced features from each modality and uses them to generate a reinforced common message.
no code implementations • 17 May 2021 • Andrey Ignatov, Andres Romero, Heewon Kim, Radu Timofte, Chiu Man Ho, Zibo Meng, Kyoung Mu Lee, Yuxiang Chen, Yutong Wang, Zeyu Long, Chenhao Wang, Yifei Chen, Boshen Xu, Shuhang Gu, Lixin Duan, Wen Li, Wang Bofei, Zhang Diankai, Zheng Chengjian, Liu Shaoli, Gao Si, Zhang Xiaofeng, Lu Kaidi, Xu Tianyu, Zheng Hui, Xinbo Gao, Xiumei Wang, Jiaming Guo, Xueyi Zhou, Hao Jia, Youliang Yan
Video super-resolution has recently become one of the most important mobile-related problems due to the rise of video communication and streaming services.
1 code implementation • ICCV 2021 • Yahao Liu, Jinhong Deng, Xinchen Gao, Wen Li, Lixin Duan
By integrating the boundary adaptation and prototype alignment, we are able to train a discriminative and domain-invariant model for cross-domain semantic segmentation.
no code implementations • 2 Nov 2020 • Yan Zhang, Ivor W. Tsang, Lixin Duan
Cold-start has been a critical issue in recommender systems with the explosion of data in e-commerce.
1 code implementation • 8 Sep 2020 • Zhiyu Xue, Lixin Duan, Wen Li, Lin Chen, Jiebo Luo
To that end, in this work we propose a metric-learning-based method named Region Comparison Network (RCN), which is able to reveal how few-shot learning works in a neural network as well as to find the specific regions that are related to each other in images from the query and support sets.
Ranked #34 on Few-Shot Image Classification on CIFAR-FS 5-way (5-shot)
no code implementations • ECCV 2020 • Chaofan Tao, Qinhong Jiang, Lixin Duan, Ping Luo
Existing work addressed this challenge either by learning social spatial interactions represented by the positions of a group of pedestrians, while ignoring their temporal coherence (i.e., dependencies between different long trajectories), or by understanding the complicated scene layout (e.g., scene segmentation) to ensure safe navigation.
no code implementations • 27 Jul 2020 • Changsheng Li, Chong Liu, Lixin Duan, Peng Gao, Kai Zheng
In this paper, we present a novel deep metric learning method to tackle the multi-label image classification problem.
no code implementations • 5 Apr 2020 • Minghao Fu, Zhenshan Xie, Wen Li, Lixin Duan
Cross-domain object detection has recently attracted more and more attention for real-world applications, since it helps build robust detectors adapting well to new environments.
no code implementations • 31 Mar 2020 • Fengmao Lv, Jianyang Zhang, Guowu Yang, Lei Feng, YuFeng Yu, Lixin Duan
Zero-Shot Learning (ZSL) learns models for recognizing new classes.
1 code implementation • CVPR 2021 • Jinhong Deng, Wen Li, Yu-Hua Chen, Lixin Duan
We reveal that there often exists a considerable model bias for the simple mean teacher (MT) model in cross-domain scenarios, and eliminate the model bias with several simple yet highly effective strategies.
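For context, the plain mean teacher update that this abstract refers to can be sketched as below. This is a minimal illustration of the standard exponential-moving-average (EMA) teacher update only, not of the bias-reduction strategies proposed in the paper; the function name `ema_update`, the decay value, and the toy parameter dictionaries are assumptions made for the example.

```python
def ema_update(teacher, student, decay=0.99):
    """Return teacher parameters after one EMA step toward the student.

    `teacher` and `student` map parameter names to float values and are
    assumed to share the same keys. `decay` controls how slowly the
    teacher follows the student (an illustrative value, not a tuned one).
    """
    return {
        name: decay * teacher[name] + (1.0 - decay) * student[name]
        for name in teacher
    }


# Toy parameters standing in for real network weights.
teacher_params = {"w": 1.0, "b": 0.0}
student_params = {"w": 0.0, "b": 1.0}
teacher_params = ema_update(teacher_params, student_params, decay=0.9)
```

In the usual setup the student is trained by gradient descent and the teacher is only ever updated through this averaging step, which is why any bias in the student's predictions can carry over to the teacher.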
no code implementations • ICLR 2020 • Yan Zhang, Ivor W. Tsang, Lixin Duan, Guowu Yang
Cold-start and efficiency issues of the Top-k recommendation are critical to large-scale recommender systems.
no code implementations • Findings of the Association for Computational Linguistics 2020 • Yiming Xu, Lin Chen, Zhongwei Cheng, Lixin Duan, Jiebo Luo
A straightforward solution is to fine-tune a pre-trained source model by using those limited labeled target data, but it usually cannot work well due to the considerable difference between the data distributions of the source and target domains.
1 code implementation • ICCV 2019 • Qing Lian, Fengmao Lv, Lixin Duan, Boqing Gong
We propose a new approach, called self-motivated pyramid curriculum domain adaptation (PyCDA), to facilitate the adaptation of semantic segmentation neural networks from synthetic source domains to real target domains.
Ranked #14 on Image-to-Image Translation on SYNTHIA-to-Cityscapes
no code implementations • 24 Jun 2019 • Zhaoquan Yuan, Siyuan Sun, Lixin Duan, Xiao Wu, Changsheng Xu
In AMN, inspired by generative adversarial networks, we propose to learn multimodal feature representations by finding a more coherent subspace for video clips and the corresponding texts (e.g., subtitles and questions).
no code implementations • 10 May 2019 • Jin Chen, Xinxiao wu, Lixin Duan, Shenghua Gao
In this more general and practical scenario, a major challenge is how to select source instances in the shared classes across different domains for positive transfer.
1 code implementation • 3 May 2019 • Qing Lian, Wen Li, Lin Chen, Lixin Duan
Particularly, in open set domain adaptation, we allow the classes from the source and target domains to be partially overlapped.
no code implementations • 21 Apr 2019 • Chaofan Tao, Fengmao Lv, Lixin Duan, Min Wu
Unlike most existing approaches which employ a generator to deal with domain difference, MMEN focuses on learning the categorical information from unlabeled target samples with the help of labeled source samples.
no code implementations • 20 Feb 2019 • Xiao-Yu Zhang, Haichao Shi, Changsheng Li, Kai Zheng, Xiaobin Zhu, Lixin Duan
Action recognition in videos has attracted a lot of attention in the past decade.
no code implementations • 11 May 2018 • Feiwu Yu, Xinxiao wu, Yuchao Sun, Lixin Duan
By taking advantage of this two-level adversarial learning, our method is capable of learning a domain-invariant feature representation of source images and target videos.
no code implementations • 15 Dec 2016 • Hao Liu, Yang Yang, Fumin Shen, Lixin Duan, Heng Tao Shen
Along with the success of recurrent neural networks in modelling sequential data and the power of attention mechanisms in automatically identifying salient information, image captioning, a.k.a. image description, has advanced remarkably in recent years.
no code implementations • CVPR 2013 • Lin Chen, Lixin Duan, Dong Xu
In this work, we propose to leverage a large number of loosely labeled web videos (e.g., from YouTube) and web images (e.g., from Google/Bing image search) for visual event recognition in consumer videos without requiring any labeled consumer videos.