Search Results for author: Lixin Duan

Found 41 papers, 15 papers with code

Simultaneous Detection and Interaction Reasoning for Object-Centric Action Recognition

no code implementations 18 Apr 2024 Xunsong Li, Pengzhan Sun, Yangcen Liu, Lixin Duan, Wen Li

Existing methods usually adopt a two-stage pipeline, where object proposals are first detected using a pretrained detector, and then are fed to an action recognition model for extracting video features and learning the object relations for action recognition.

Tuning-Free Adaptive Style Incorporation for Structure-Consistent Text-Driven Style Transfer

no code implementations 10 Apr 2024 Yanqi Ge, Jiaqi Liu, Qingnan Fan, Xi Jiang, Ye Huang, Shuai Qin, Hong Gu, Wen Li, Lixin Duan

In this work, we propose a novel solution to the text-driven style transfer task, namely, Adaptive Style Incorporation (ASI), to achieve fine-grained feature-level style incorporation.

Style Transfer

SSR: SAM is a Strong Regularizer for domain adaptive semantic segmentation

no code implementations 26 Jan 2024 Yanqi Ge, Ye Huang, Wen Li, Lixin Duan

We introduced SSR, which utilizes SAM (segment-anything) as a strong regularizer during training, to greatly enhance the robustness of the image encoder for handling various domains.

Semantic Segmentation

Beyond Prototypes: Semantic Anchor Regularization for Better Representation Learning

1 code implementation 19 Dec 2023 Yanqi Ge, Qiang Nie, Ye Huang, Yong Liu, Chengjie Wang, Feng Zheng, Wen Li, Lixin Duan

By pulling the learned features to these semantic anchors, several advantages can be attained: 1) intra-class compactness and natural inter-class separability, 2) induced bias or errors from feature learning can be avoided, and 3) robustness to the long-tailed problem.

Disentanglement

Multi-modal Instance Refinement for Cross-domain Action Recognition

no code implementations 24 Nov 2023 Yuan Qing, Naixing Wu, Shaohua Wan, Lixin Duan

In the source domain, some training samples are of low relevance to the target domain due to differences in viewpoints, action styles, etc.

Action Recognition Domain Adaptation +3

High-level Feature Guided Decoding for Semantic Segmentation

no code implementations 15 Mar 2023 Ye Huang, Di Kang, Shenghua Gao, Wen Li, Lixin Duan

One crucial design of the HFG is to protect the high-level features from being contaminated by using proper stop-gradient operations so that the backbone does not update according to the noisy gradient from the upsampler.

Semantic Segmentation Vocal Bursts Intensity Prediction

CARD: Semantic Segmentation with Efficient Class-Aware Regularized Decoder

1 code implementation 11 Jan 2023 Ye Huang, Di Kang, Liang Chen, Wenjing Jia, Xiangjian He, Lixin Duan, Xuefei Zhe, Linchao Bao

Extensive experiments and ablation studies conducted on multiple benchmark datasets demonstrate that the proposed CAR can boost the accuracy of all baseline models by up to 2.23% mIOU with superior generalization ability.

Representation Learning Semantic Segmentation +1

Harmonious Teacher for Cross-Domain Object Detection

1 code implementation CVPR 2023 Jinhong Deng, Dongli Xu, Wen Li, Lixin Duan

Self-training approaches recently achieved promising results in cross-domain object detection, where people iteratively generate pseudo labels for unlabeled target domain samples with a model, and select high-confidence samples to refine the model.

Object object-detection +1
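
The self-training loop described in this entry (generate pseudo labels with a model, then keep the high-confidence ones) can be sketched as follows. This is a minimal illustration, not the paper's method: the function name `select_pseudo_labels`, the `(label, score)` tuple layout, and the 0.8 threshold are all assumptions made for the example.

```python
from typing import List, Tuple

# Hypothetical prediction format: (class_label, confidence_score).
Prediction = Tuple[str, float]


def select_pseudo_labels(
    predictions: List[Prediction], threshold: float = 0.8
) -> List[Prediction]:
    """Keep only predictions whose confidence meets the threshold.

    The default threshold is an assumption chosen for illustration.
    """
    return [(label, score) for label, score in predictions if score >= threshold]


# Example: detections a model might produce on an unlabeled target-domain image.
target_predictions = [("car", 0.95), ("person", 0.42), ("bus", 0.81)]
pseudo_labels = select_pseudo_labels(target_predictions)
print(pseudo_labels)
```

In a full self-training pipeline, the selected pseudo labels would be used to refine the model, and the generate-and-select step would then be repeated.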

Minimizing Maximum Model Discrepancy for Transferable Black-box Targeted Attacks

1 code implementation CVPR 2023 Anqi Zhao, Tong Chu, Yahao Liu, Wen Li, Jingjing Li, Lixin Duan

On the algorithmic side, we derive a new algorithm for black-box targeted attacks based on our theoretical analysis, in which we additionally minimize the maximum model discrepancy (M3D) of the substitute models when training the generator to generate adversarial examples.

Multi-rater Prism: Learning self-calibrated medical image segmentation from multiple raters

no code implementations 1 Dec 2022 Junde Wu, Huihui Fang, Yehui Yang, Yuanpei Liu, Jing Gao, Lixin Duan, Weihua Yang, Yanwu Xu

In this paper, we propose a novel neural network framework, called Multi-Rater Prism (MrPrism), to learn medical image segmentation from multiple labels.

Image Segmentation Medical Image Segmentation +2

Motion Transformer for Unsupervised Image Animation

1 code implementation 28 Sep 2022 Jiale Tao, Biao Wang, Tiezheng Ge, Yuning Jiang, Wen Li, Lixin Duan

Image animation aims to animate a source image by using motion learned from a driving video.

Image Animation

Calibrate the inter-observer segmentation uncertainty via diagnosis-first principle

2 code implementations 5 Aug 2022 Junde Wu, Huihui Fang, Hoayi Xiong, Lixin Duan, Mingkui Tan, Weihua Yang, Huiying Liu, Yanwu Xu

Inspired by this observation, we propose the diagnosis-first principle, which is to take disease diagnosis as the criterion to calibrate the inter-observer segmentation uncertainty.

Image Segmentation Lesion Segmentation +3

Cross-domain Detection Transformer based on Spatial-aware and Semantic-aware Token Alignment

no code implementations 1 Jun 2022 Jinhong Deng, Xiaoyue Zhang, Wen Li, Lixin Duan

In particular, we take advantage of the characteristics of cross-attention as used in detection transformer and propose the spatial-aware token alignment (SpaTA) and the semantic-aware token alignment (SemTA) strategies to guide the token alignment across domains.

Domain Adaptation object-detection +1

Undoing the Damage of Label Shift for Cross-domain Semantic Segmentation

1 code implementation CVPR 2022 Yahao Liu, Jinhong Deng, Jiale Tao, Tong Chu, Lixin Duan, Wen Li

Existing works typically treat cross-domain semantic segmentation (CDSS) as a data distribution mismatch problem and focus on aligning the marginal distribution or conditional distribution.

Semantic Segmentation

Structure-Aware Motion Transfer with Deformable Anchor Model

1 code implementation CVPR 2022 Jiale Tao, Biao Wang, Borun Xu, Tiezheng Ge, Yuning Jiang, Wen Li, Lixin Duan

Specifically, inspired by the known deformable part model (DPM), our DAM introduces two types of anchors or keypoints: i) a number of motion anchors that capture both appearance and motion information from the source image and driving video; ii) a latent root anchor, which is linked to the motion anchors to facilitate better learning of the representations of the object structure information.

Diverse Preference Augmentation with Multiple Domains for Cold-start Recommendations

no code implementations 1 Apr 2022 Yan Zhang, Changyu Li, Ivor W. Tsang, Hui Xu, Lixin Duan, Hongzhi Yin, Wen Li, Jie Shao

Motivated by the idea of meta-augmentation, in this paper, by treating a user's preference over items as a task, we propose a so-called Diverse Preference Augmentation framework with multiple source domains based on meta-learning (referred to as MetaDPA) to i) generate diverse ratings in a new domain of interest (known as the target domain) to handle overfitting in the case of sparse interactions, and to ii) learn a preference model in the target domain via a meta-learning scheme to alleviate cold-start issues.

Domain Adaptation Meta-Learning +1

Move As You Like: Image Animation in E-Commerce Scenario

1 code implementation 19 Dec 2021 Borun Xu, Biao Wang, Jiale Tao, Tiezheng Ge, Yuning Jiang, Wen Li, Lixin Duan

Creative image animations are attractive in e-commerce applications, where motion transfer is one of the important ways to generate animations from static images.

Image Animation

BAPA-Net: Boundary Adaptation and Prototype Alignment for Cross-Domain Semantic Segmentation

1 code implementation ICCV 2021 Yahao Liu, Jinhong Deng, Xinchen Gao, Wen Li, Lixin Duan

By integrating the boundary adaptation and prototype alignment, we are able to train a discriminative and domain-invariant model for cross-domain semantic segmentation.

Segmentation Semantic Segmentation +1

Collaborative Generative Hashing for Marketing and Fast Cold-start Recommendation

no code implementations 2 Nov 2020 Yan Zhang, Ivor W. Tsang, Lixin Duan

Cold-start has been a critical issue in recommender systems with the explosion of data in e-commerce.

Marketing Recommendation Systems

Region Comparison Network for Interpretable Few-shot Image Classification

1 code implementation 8 Sep 2020 Zhiyu Xue, Lixin Duan, Wen Li, Lin Chen, Jiebo Luo

For that, in this work, we propose a metric learning based method named Region Comparison Network (RCN), which is able to reveal how few-shot learning works in a neural network as well as to find specific regions that are related to each other in images coming from the query and support sets.

Classification Few-Shot Image Classification +3

Dynamic and Static Context-aware LSTM for Multi-agent Motion Prediction

no code implementations ECCV 2020 Chaofan Tao, Qinhong Jiang, Lixin Duan, Ping Luo

Existing work addressed this challenge by either learning social spatial interactions represented by the positions of a group of pedestrians, while ignoring their temporal coherence (i.e., dependencies between different long trajectories), or by understanding the complicated scene layout (e.g., scene segmentation) to ensure safe navigation.

motion prediction Trajectory Prediction

Reconstruction Regularized Deep Metric Learning for Multi-label Image Classification

no code implementations 27 Jul 2020 Changsheng Li, Chong Liu, Lixin Duan, Peng Gao, Kai Zheng

In this paper, we present a novel deep metric learning method to tackle the multi-label image classification problem.

General Classification Metric Learning +1

Deeply Aligned Adaptation for Cross-domain Object Detection

no code implementations 5 Apr 2020 Minghao Fu, Zhenshan Xie, Wen Li, Lixin Duan

Cross-domain object detection has recently attracted more and more attention for real-world applications, since it helps build robust detectors adapting well to new environments.

Object object-detection +1

Unbiased Mean Teacher for Cross-domain Object Detection

1 code implementation CVPR 2021 Jinhong Deng, Wen Li, Yu-Hua Chen, Lixin Duan

We reveal that there often exists a considerable model bias for the simple mean teacher (MT) model in cross-domain scenarios, and eliminate the model bias with several simple yet highly effective strategies.

Object object-detection +2
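
For readers unfamiliar with the mean teacher (MT) model this entry refers to, the teacher's weights are usually maintained as an exponential moving average (EMA) of the student's weights. The sketch below shows only that generic update on plain Python lists; the name `ema_update` and the decay values are assumptions for illustration, and the bias-removal strategies the paper proposes are not shown.

```python
from typing import List


def ema_update(
    teacher: List[float], student: List[float], decay: float = 0.99
) -> List[float]:
    """Return new teacher weights: decay * teacher + (1 - decay) * student."""
    if len(teacher) != len(student):
        raise ValueError("teacher and student must have the same number of weights")
    return [decay * t + (1.0 - decay) * s for t, s in zip(teacher, student)]


# Toy example: the teacher drifts toward a fixed student over three updates.
teacher_weights = [0.0, 1.0]
student_weights = [1.0, 1.0]
for _ in range(3):
    teacher_weights = ema_update(teacher_weights, student_weights, decay=0.5)
print(teacher_weights)
```

A decay close to 1 makes the teacher change slowly, which is why it tends to give smoother predictions than the student it tracks.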

Open-Ended Visual Question Answering by Multi-Modal Domain Adaptation

no code implementations Findings of the Association for Computational Linguistics 2020 Yiming Xu, Lin Chen, Zhongwei Cheng, Lixin Duan, Jiebo Luo

A straightforward solution is to fine-tune a pre-trained source model by using those limited labeled target data, but it usually cannot work well due to the considerable difference between the data distributions of the source and target domains.

Domain Adaptation Question Answering +1

Constructing Self-motivated Pyramid Curriculums for Cross-Domain Semantic Segmentation: A Non-Adversarial Approach

1 code implementation ICCV 2019 Qing Lian, Fengmao Lv, Lixin Duan, Boqing Gong

We propose a new approach, called self-motivated pyramid curriculum domain adaptation (PyCDA), to facilitate the adaptation of semantic segmentation neural networks from synthetic source domains to real target domains.

Segmentation Semantic Segmentation +2

Adversarial Multimodal Network for Movie Question Answering

no code implementations 24 Jun 2019 Zhaoquan Yuan, Siyuan Sun, Lixin Duan, Xiao Wu, Changsheng Xu

In AMN, as inspired by generative adversarial networks, we propose to learn multimodal feature representations by finding a more coherent subspace for video clips and the corresponding texts (e.g., subtitles and questions).

Question Answering Video Question Answering +1

Domain Adversarial Reinforcement Learning for Partial Domain Adaptation

no code implementations 10 May 2019 Jin Chen, Xinxiao Wu, Lixin Duan, Shenghua Gao

In this more general and practical scenario, a major challenge is how to select source instances in the shared classes across different domains for positive transfer.

Partial Domain Adaptation Q-Learning +2

Known-class Aware Self-ensemble for Open Set Domain Adaptation

1 code implementation 3 May 2019 Qing Lian, Wen Li, Lin Chen, Lixin Duan

Particularly, in open set domain adaptation, we allow the classes from the source and target domains to be partially overlapped.

Domain Adaptation

MiniMax Entropy Network: Learning Category-Invariant Features for Domain Adaptation

no code implementations 21 Apr 2019 Chaofan Tao, Fengmao Lv, Lixin Duan, Min Wu

Unlike most existing approaches which employ a generator to deal with domain difference, MMEN focuses on learning the categorical information from unlabeled target samples with the help of labeled source samples.

Domain Adaptation

Exploiting Images for Video Recognition with Hierarchical Generative Adversarial Networks

no code implementations 11 May 2018 Feiwu Yu, Xinxiao Wu, Yuchao Sun, Lixin Duan

By taking advantage of this two-level adversarial learning, our method is capable of learning a domain-invariant feature representation of source images and target videos.

Domain Adaptation Video Recognition

Recurrent Image Captioner: Describing Images with Spatial-Invariant Transformation and Attention Filtering

no code implementations 15 Dec 2016 Hao Liu, Yang Yang, Fumin Shen, Lixin Duan, Heng Tao Shen

Along with the prosperity of recurrent neural networks in modelling sequential data and the power of attention mechanisms in automatically identifying salient information, image captioning, a.k.a. image description, has been remarkably advanced in recent years.

Image Captioning Variational Inference

Event Recognition in Videos by Learning from Heterogeneous Web Sources

no code implementations CVPR 2013 Lin Chen, Lixin Duan, Dong Xu

In this work, we propose to leverage a large number of loosely labeled web videos (e.g., from YouTube) and web images (e.g., from Google/Bing image search) for visual event recognition in consumer videos without requiring any labeled consumer videos.

Domain Adaptation Image Retrieval
