Search Results for author: Changsheng Xu

Found 90 papers, 45 papers with code

StoryImager: A Unified and Efficient Framework for Coherent Story Visualization and Completion

2 code implementations 9 Apr 2024 Ming Tao, Bing-Kun Bao, Hao Tang, YaoWei Wang, Changsheng Xu

3) The story visualization and continuation models are trained and inferred independently, which is not user-friendly.

Image Generation Story Visualization

CreativeSynth: Creative Blending and Synthesis of Visual Arts based on Multimodal Diffusion

1 code implementation 25 Jan 2024 Nisha Huang, WeiMing Dong, Yuxin Zhang, Fan Tang, Ronghui Li, Chongyang Ma, Xiu Li, Changsheng Xu

Large-scale text-to-image generative models have made impressive strides, showcasing their ability to synthesize a vast array of high-quality images.

Image Generation Style Transfer

Hierarchical Prompts for Rehearsal-free Continual Learning

no code implementations 21 Jan 2024 Yukun Zuo, Hantao Yao, Lu Yu, Liansheng Zhuang, Changsheng Xu

Nonetheless, these learnable prompts tend to concentrate on the discriminatory knowledge of the current task while ignoring past task knowledge, so they still suffer from catastrophic forgetting.

Continual Learning

Hierarchical Augmentation and Distillation for Class Incremental Audio-Visual Video Recognition

no code implementations 11 Jan 2024 Yukun Zuo, Hantao Yao, Liansheng Zhuang, Changsheng Xu

We introduce Hierarchical Augmentation and Distillation (HAD), which comprises the Hierarchical Augmentation Module (HAM) and Hierarchical Distillation Module (HDM) to efficiently utilize the hierarchical structure of data and models, respectively.

Video Recognition

Erasing Self-Supervised Learning Backdoor by Cluster Activation Masking

1 code implementation 13 Dec 2023 Shengsheng Qian, Yifei Wang, Dizhan Xue, Shengjie Zhang, Huaiwen Zhang, Changsheng Xu

After obtaining the threat model trained on the poisoned dataset, our method can precisely detect poisonous samples based on the assumption that masking the backdoor trigger can effectively change the activation of a downstream clustering model.

backdoor defense Self-Supervised Learning

MotionCrafter: One-Shot Motion Customization of Diffusion Models

1 code implementation 8 Dec 2023 Yuxin Zhang, Fan Tang, Nisha Huang, Haibin Huang, Chongyang Ma, WeiMing Dong, Changsheng Xu

The essence of a video lies in its dynamic motions, including character actions, object movements, and camera movements.

Disentanglement Motion Disentanglement +3

TCP: Textual-based Class-aware Prompt tuning for Visual-Language Model

1 code implementation 30 Nov 2023 Hantao Yao, Rui Zhang, Changsheng Xu

However, those textual tokens have a limited generalization ability regarding unseen domains, as they cannot dynamically adjust to the distribution of testing classes.

Language Modelling

Test-time Adaptive Vision-and-Language Navigation

no code implementations 22 Nov 2023 Junyu Gao, Xuan Yao, Changsheng Xu

Then, these components are adaptively accumulated to pinpoint a concordant direction for fast model adaptation.

Test-time Adaptation Vision and Language Navigation

Learning Transferable Conceptual Prototypes for Interpretable Unsupervised Domain Adaptation

no code implementations 12 Oct 2023 Junyu Gao, Xinhong Ma, Changsheng Xu

Despite the great progress of unsupervised domain adaptation (UDA) with deep neural networks, current UDA models are opaque and cannot provide promising explanations, limiting their applications in scenarios that require safe and controllable model decisions.

Decision Making Pseudo Label +2

A Survey on Interpretable Cross-modal Reasoning

1 code implementation 5 Sep 2023 Dizhan Xue, Shengsheng Qian, Zuyi Zhou, Changsheng Xu

In recent years, cross-modal reasoning (CMR), the process of understanding and reasoning across different modalities, has emerged as a pivotal area with applications spanning from multimedia analysis to healthcare diagnostics.

Cross-Modal Retrieval Decision Making +8

Exploring Multi-Modal Contextual Knowledge for Open-Vocabulary Object Detection

no code implementations 30 Aug 2023 Yifan Xu, Mengdan Zhang, Xiaoshan Yang, Changsheng Xu

In this paper, we explore, for the first time, helpful multi-modal contextual knowledge to understand novel categories for open-vocabulary object detection (OVD).

Knowledge Distillation Language Modelling +4

Introducing Foundation Models as Surrogate Models: Advancing Towards More Practical Adversarial Attacks

no code implementations 13 Jul 2023 Jiaming Zhang, Jitao Sang, Qi Yi, Changsheng Xu

Harnessing the concept of non-robust features, we elaborate on two guiding principles for surrogate model selection to explain why the foundational model is an optimal choice for this role.

Adversarial Attack Attribute +1

Multimodal Imbalance-Aware Gradient Modulation for Weakly-supervised Audio-Visual Video Parsing

no code implementations 5 Jul 2023 Jie Fu, Junyu Gao, Changsheng Xu

In this paper, to balance the feature learning processes of different modalities, a dynamic gradient modulation (DGM) mechanism is explored, where a novel and effective metric function is designed to measure the imbalanced feature learning between audio and visual modalities.

Multi-modal Queried Object Detection in the Wild

1 code implementation NeurIPS 2023 Yifan Xu, Mengdan Zhang, Chaoyou Fu, Peixian Chen, Xiaoshan Yang, Ke Li, Changsheng Xu

To address the learning inertia problem brought by the frozen detector, a vision conditioned masked language prediction strategy is proposed.

Few-Shot Object Detection Object +2

ProSpect: Prompt Spectrum for Attribute-Aware Personalization of Diffusion Models

3 code implementations 25 May 2023 Yuxin Zhang, WeiMing Dong, Fan Tang, Nisha Huang, Haibin Huang, Chongyang Ma, Tong-Yee Lee, Oliver Deussen, Changsheng Xu

We apply ProSpect in various personalized attribute-aware image generation applications, such as image-guided or text-driven manipulations of materials, style, and layout, achieving previously unattainable results from a single image input without fine-tuning the diffusion models.

Attribute Disentanglement +1

Camera-Incremental Object Re-Identification with Identity Knowledge Evolution

1 code implementation 25 May 2023 Hantao Yao, Lu Yu, Jifei Luo, Changsheng Xu

In this paper, we propose a novel Identity Knowledge Evolution (IKE) framework for CIOR, consisting of the Identity Knowledge Association (IKA), Identity Knowledge Distillation (IKD), and Identity Knowledge Update (IKU).

Knowledge Distillation Object

CLIP-VG: Self-paced Curriculum Adapting of CLIP for Visual Grounding

1 code implementation 15 May 2023 Linhui Xiao, Xiaoshan Yang, Fang Peng, Ming Yan, YaoWei Wang, Changsheng Xu

In order to utilize vision and language pre-trained models to address the grounding problem, and reasonably take advantage of pseudo-labels, we propose CLIP-VG, a novel method that can conduct self-paced curriculum adapting of CLIP with pseudo-language labels.

Transfer Learning Visual Grounding

Visual-Language Prompt Tuning with Knowledge-guided Context Optimization

1 code implementation CVPR 2023 Hantao Yao, Rui Zhang, Changsheng Xu

Representative CoOp-based work combines the learnable textual tokens with the class tokens to obtain specific textual knowledge.

Language Modelling

A Unified Arbitrary Style Transfer Framework via Adaptive Contrastive Learning

1 code implementation 9 Mar 2023 Yuxin Zhang, Fan Tang, WeiMing Dong, Haibin Huang, Chongyang Ma, Tong-Yee Lee, Changsheng Xu

Our framework consists of three key components, i.e., a parallel contrastive learning scheme for style representation and style transfer, a domain enhancement module for effective learning of style distribution, and a generative network for style transfer.

Contrastive Learning Representation Learning +1

Backdoor for Debias: Mitigating Model Bias with Backdoor Attack-based Artificial Bias

no code implementations 1 Mar 2023 Shangxi Wu, Qiuyang He, Fangzhao Wu, Jitao Sang, YaoWei Wang, Changsheng Xu

In this work, we found that the backdoor attack can construct an artificial bias similar to the model bias derived in standard training.

Backdoor Attack Knowledge Distillation

Region-Aware Diffusion for Zero-shot Text-driven Image Editing

1 code implementation 23 Feb 2023 Nisha Huang, Fan Tang, WeiMing Dong, Tong-Yee Lee, Changsheng Xu

Different from current mask-based image editing methods, we propose a novel region-aware diffusion model (RDM) for entity-level image editing, which could automatically locate the region of interest and replace it following given text prompts.

Image Manipulation

Variational Causal Inference Network for Explanatory Visual Question Answering

1 code implementation ICCV 2023 Dizhan Xue, Shengsheng Qian, Changsheng Xu

To address these issues, we propose a Variational Causal Inference Network (VCIN) that establishes the causal correlation between predicted answers and explanations, and captures cross-modal relationships to generate rational explanations.

Explanation Generation Explanatory Visual Question Answering +2

Active Exploration of Multimodal Complementarity for Few-Shot Action Recognition

no code implementations CVPR 2023 Yuyang Wanyan, Xiaoshan Yang, Chaofan Chen, Changsheng Xu

In meta-training, we design an Active Sample Selection (ASS) module to organize query samples with large differences in the reliability of modalities into different groups based on modality-specific posterior distributions.

Few-Shot action recognition Few Shot Action Recognition +2

Cascade Evidential Learning for Open-World Weakly-Supervised Temporal Action Localization

no code implementations CVPR 2023 Mengyuan Chen, Junyu Gao, Changsheng Xu

Aiming to recognize and localize action instances with only video-level labels during training, Weakly-supervised Temporal Action Localization (WTAL) has achieved significant progress in recent years.

Open Set Learning Weakly-supervised Temporal Action Localization +1

VQACL: A Novel Visual Question Answering Continual Learning Setting

1 code implementation CVPR 2023 Xi Zhang, Feifei Zhang, Changsheng Xu

Research on continual learning has recently led to a variety of work in the unimodal community; however, little attention has been paid to multimodal tasks like visual question answering (VQA).

Continual Learning Question Answering +2

Collecting Cross-Modal Presence-Absence Evidence for Weakly-Supervised Audio-Visual Event Perception

1 code implementation CVPR 2023 Junyu Gao, Mengyuan Chen, Changsheng Xu

We argue that, for an event residing in one modality, the modality itself should provide ample presence evidence of this event, while the other complementary modality is encouraged to afford the absence evidence as a reference signal.

UTM: A Unified Multiple Object Tracking Model With Identity-Aware Feature Enhancement

no code implementations CVPR 2023 Sisi You, Hantao Yao, Bing-Kun Bao, Changsheng Xu

Multiple Object Tracking, which consists of object detection, feature embedding, and identity association, has recently achieved great success.

Multiple Object Tracking object-detection +1

Unlearnable Clusters: Towards Label-agnostic Unlearnable Examples

1 code implementation CVPR 2023 Jiaming Zhang, Xingjun Ma, Qi Yi, Jitao Sang, Yu-Gang Jiang, YaoWei Wang, Changsheng Xu

Furthermore, we propose to leverage Vision-and-Language Pre-trained Models (VLPMs) like CLIP as the surrogate model to improve the transferability of the crafted UCs to diverse domains.

Data Poisoning

SgVA-CLIP: Semantic-guided Visual Adapting of Vision-Language Models for Few-shot Image Classification

no code implementations 28 Nov 2022 Fang Peng, Xiaoshan Yang, Linhui Xiao, YaoWei Wang, Changsheng Xu

Although significant progress has been made in few-shot learning, most existing few-shot image classification methods require supervised pre-training on a large number of samples of base classes, which limits their generalization ability in real-world applications.

Few-Shot Image Classification Few-Shot Learning +2

Inversion-Based Style Transfer with Diffusion Models

1 code implementation CVPR 2023 Yuxin Zhang, Nisha Huang, Fan Tang, Haibin Huang, Chongyang Ma, WeiMing Dong, Changsheng Xu

Our key idea is to learn artistic style directly from a single painting and then guide the synthesis without providing complex textual descriptions.

Denoising Style Transfer +1

DiffStyler: Controllable Dual Diffusion for Text-Driven Image Stylization

1 code implementation 19 Nov 2022 Nisha Huang, Yuxin Zhang, Fan Tang, Chongyang Ma, Haibin Huang, Yong Zhang, WeiMing Dong, Changsheng Xu

Despite the impressive results of arbitrary image-guided style transfer methods, text-driven image stylization has recently been proposed for transferring a natural image into a stylized one according to textual descriptions of the target style provided by the user.

Denoising Image Stylization

Understanding and Mitigating Overfitting in Prompt Tuning for Vision-Language Models

1 code implementation 4 Nov 2022 Chengcheng Ma, Yang Liu, Jiankang Deng, Lingxi Xie, WeiMing Dong, Changsheng Xu

Pretrained vision-language models (VLMs) such as CLIP have shown impressive generalization capability in downstream vision tasks with appropriate text prompts.

object-detection Open Vocabulary Object Detection +2

MMT: Image-guided Story Ending Generation with Multimodal Memory Transformer

1 code implementation ACM MM 2022 Dizhan Xue, Shengsheng Qian, Quan Fang, Changsheng Xu

Finally, a multimodal transformer decoder constructs attention among multimodal features to learn the story dependency and generates informative, reasonable, and coherent story endings.

Image Captioning Image-guided Story Ending Generation +2

Draw Your Art Dream: Diverse Digital Art Synthesis with Multimodal Guided Diffusion

1 code implementation 27 Sep 2022 Nisha Huang, Fan Tang, WeiMing Dong, Changsheng Xu

Extensive experimental results on the quality and quantity of the generated digital art paintings confirm the effectiveness of the combination of the diffusion model and multimodal guidance.

Integrating multi-label contrastive learning with dual adversarial graph neural networks for cross-modal retrieval

1 code implementation IEEE Transactions on Pattern Analysis and Machine Intelligence 2022 Dizhan Xue, Shengsheng Qian, Quan Fang, Changsheng Xu

To date, most of the existing techniques mainly convert multimodal data into a common representation space where similarities in semantics between samples can be easily measured across multiple modalities.

Contrastive Learning Cross-Modal Retrieval +1

Learning Muti-expert Distribution Calibration for Long-tailed Video Classification

no code implementations 22 May 2022 Yufan Hu, Junyu Gao, Changsheng Xu

Most existing state-of-the-art video classification methods assume that the training data obey a uniform distribution.

Classification Image Classification +1

Domain Enhanced Arbitrary Image Style Transfer via Contrastive Learning

1 code implementation 19 May 2022 Yuxin Zhang, Fan Tang, WeiMing Dong, Haibin Huang, Chongyang Ma, Tong-Yee Lee, Changsheng Xu

Our framework consists of three key components, i.e., a multi-layer style projector for style code encoding, a domain enhancement module for effective learning of style distribution, and a generative network for image style transfer.

Contrastive Learning Image Stylization +1

MGDCF: Distance Learning via Markov Graph Diffusion for Neural Collaborative Filtering

2 code implementations 5 Apr 2022 Jun Hu, Bryan Hooi, Shengsheng Qian, Quan Fang, Changsheng Xu

Based on a Markov process that trades off two types of distances, we present Markov Graph Diffusion Collaborative Filtering (MGDCF) to generalize some state-of-the-art GNN-based CF models.

Collaborative Filtering Recommendation Systems +1

Learning Commonsense-aware Moment-Text Alignment for Fast Video Temporal Grounding

1 code implementation 4 Apr 2022 Ziyue Wu, Junyu Gao, Shucheng Huang, Changsheng Xu

Then, a commonsense-aware interaction module is designed to obtain bridged visual and text features by utilizing the learned commonsense concepts.

Natural Language Queries

Fine-grained Temporal Contrastive Learning for Weakly-supervised Temporal Action Localization

1 code implementation CVPR 2022 Junyu Gao, Mengyuan Chen, Changsheng Xu

We target the task of weakly-supervised action localization (WSAL), where only video-level action labels are available during model training.

Classification Contrastive Learning +4

StyTr2: Image Style Transfer With Transformers

3 code implementations CVPR 2022 Yingying Deng, Fan Tang, WeiMing Dong, Chongyang Ma, Xingjia Pan, Lei Wang, Changsheng Xu

The goal of image style transfer is to render an image with artistic features guided by a style reference while maintaining the original content.

Style Transfer

Dynamic Scene Graph Generation via Anticipatory Pre-Training

no code implementations CVPR 2022 Yiming Li, Xiaoshan Yang, Changsheng Xu

Humans can not only see the collection of objects in visual scenes, but also identify the relationship between objects.

Graph Generation Scene Graph Generation

Dual Cluster Contrastive learning for Object Re-Identification

1 code implementation 9 Dec 2021 Hantao Yao, Changsheng Xu

Unlike the individual-based updating mechanism, the centroid-based updating mechanism that applies the mean feature of each cluster to update the cluster memory can reduce the impact of individual samples.

Contrastive Learning Object +1

SSAGCN: Social Soft Attention Graph Convolution Network for Pedestrian Trajectory Prediction

no code implementations 5 Dec 2021 Pei Lv, Wentong Wang, Yunxin Wang, Yuzhen Zhang, Mingliang Xu, Changsheng Xu

In detail, when modeling social interaction, we propose a new social soft attention function, which fully considers various interaction factors among pedestrians.

Autonomous Driving Pedestrian Trajectory Prediction +1

Contrastive Adaptive Propagation Graph Neural Networks for Efficient Graph Learning

1 code implementation 2 Dec 2021 Jun Hu, Shengsheng Qian, Quan Fang, Changsheng Xu

Recently the field has advanced from local propagation schemes that focus on local neighbors towards extended propagation schemes that can directly deal with extended neighbors consisting of both local and high-order neighbors.

Graph Learning Self-Supervised Learning

Weakly-Supervised Video Object Grounding via Causal Intervention

no code implementations 1 Dec 2021 Wei Wang, Junyu Gao, Changsheng Xu

With this in mind, we design a unified causal framework to learn the deconfounded object-relevant association for more accurate and robust video object grounding.

Contrastive Learning Object +1

GRecX: An Efficient and Unified Benchmark for GNN-based Recommendation

1 code implementation 19 Nov 2021 Desheng Cai, Jun Hu, Quan Zhao, Shengsheng Qian, Quan Fang, Changsheng Xu

In this paper, we present GRecX, an open-source TensorFlow framework for benchmarking GNN-based recommendation models in an efficient and unified way.

Benchmarking Management

Towards Predictable Feature Attribution: Revisiting and Improving Guided BackPropagation

no code implementations 29 Sep 2021 Guanhua Zheng, Jitao Sang, Wang Haonan, Changsheng Xu

Recently, backpropagation (BP)-based feature attribution methods have been widely adopted to interpret the internal mechanisms of convolutional neural networks (CNNs), and are expected to be human-understandable (lucidity) and faithful to decision-making processes (fidelity).

Decision Making

Evo-ViT: Slow-Fast Token Evolution for Dynamic Vision Transformer

1 code implementation 3 Aug 2021 Yifan Xu, Zhijie Zhang, Mengdan Zhang, Kekai Sheng, Ke Li, WeiMing Dong, Liqing Zhang, Changsheng Xu, Xing Sun

Vision transformers (ViTs) have recently received explosive popularity, but the huge computational cost is still a severe issue.

Efficient ViTs

DualVGR: A Dual-Visual Graph Reasoning Unit for Video Question Answering

1 code implementation 10 Jul 2021 Jianyu Wang, Bing-Kun Bao, Changsheng Xu

However, existing graph-based methods fail to perform multi-step reasoning well, neglecting two properties of VideoQA: (1) even for the same video, different questions may require different amounts of video clips or objects to infer the answer with relational reasoning; (2) during reasoning, appearance and motion features have a complicated interdependence and are correlated and complementary to each other.

Graph Attention Question Answering +3

ECKPN: Explicit Class Knowledge Propagation Network for Transductive Few-shot Learning

no code implementations CVPR 2021 Chaofan Chen, Xiaoshan Yang, Changsheng Xu, Xuhui Huang, Zhe Ma

Specifically, we first employ the comparison module to explore the pairwise sample relations to learn rich sample representations in the instance-level graph.

Few-Shot Learning

User-Guided Personalized Image Aesthetic Assessment based on Deep Reinforcement Learning

no code implementations 14 Jun 2021 Pei Lv, Jianqi Fan, Xixi Nie, WeiMing Dong, Xiaoheng Jiang, Bing Zhou, Mingliang Xu, Changsheng Xu

This framework leverages user interactions to retouch and rank images for aesthetic assessment based on deep reinforcement learning (DRL), and generates personalized aesthetic distribution that is more in line with the aesthetic preferences of different users.

Image Enhancement reinforcement-learning +1

StyTr$^2$: Image Style Transfer with Transformers

4 code implementations 30 May 2021 Yingying Deng, Fan Tang, WeiMing Dong, Chongyang Ma, Xingjia Pan, Lei Wang, Changsheng Xu

The goal of image style transfer is to render an image with artistic features guided by a style reference while maintaining the original content.

Style Transfer

Dual adversarial graph neural networks for multi-label cross-modal retrieval

1 code implementation AAAI 2021 Shengsheng Qian, Dizhan Xue, Huaiwen Zhang, Quan Fang, Changsheng Xu

To date, most existing methods transform multimodal data into a common representation space where semantic similarities between items can be directly measured across different modalities.

Cross-Modal Retrieval Retrieval

Towards Corruption-Agnostic Robust Domain Adaptation

no code implementations 21 Apr 2021 Yifan Xu, Kekai Sheng, WeiMing Dong, Baoyuan Wu, Changsheng Xu, Bao-Gang Hu

However, due to unpredictable corruptions (e.g., noise and blur) in real data like web images, domain adaptation methods are increasingly required to be corruption robust on target domains.

Domain Adaptation

Health Status Prediction with Local-Global Heterogeneous Behavior Graph

no code implementations 23 Mar 2021 Xuan Ma, Xiaoshan Yang, Junyu Gao, Changsheng Xu

However, these data streams are multi-source and heterogeneous, containing complex temporal structures with local contextual and global temporal aspects, which makes the feature learning and data joint utilization challenging.

Management

Unveiling the Potential of Structure Preserving for Weakly Supervised Object Localization

1 code implementation CVPR 2021 Xingjia Pan, Yingguo Gao, Zhiwen Lin, Fan Tang, WeiMing Dong, Haolei Yuan, Feiyue Huang, Changsheng Xu

Weakly supervised object localization (WSOL) remains an open problem given the deficiency of finding object extent information using a classification network.

Classification General Classification +3

Efficient Graph Deep Learning in TensorFlow with tf_geometric

1 code implementation 27 Jan 2021 Jun Hu, Shengsheng Qian, Quan Fang, Youze Wang, Quan Zhao, Huaiwen Zhang, Changsheng Xu

We introduce tf_geometric, an efficient and friendly library for graph deep learning, which is compatible with both TensorFlow 1.x and 2.x.

General Classification Graph Classification +5

Active Universal Domain Adaptation

no code implementations ICCV 2021 Xinhong Ma, Junyu Gao, Changsheng Xu

This paper proposes a new paradigm for unsupervised domain adaptation, termed as Active Universal Domain Adaptation (AUDA), which removes all label set assumptions and aims for not only recognizing target samples from source classes but also inferring those from target-private classes by using active learning to annotate a small budget of target data.

Active Learning Universal Domain Adaptation +1

Fast Video Moment Retrieval

no code implementations ICCV 2021 Junyu Gao, Changsheng Xu

To tackle this issue, we replace the cross-modal interaction module with a cross-modal common space, in which moment-query alignment is learned and efficient moment search can be performed.

Moment Retrieval Retrieval +1

Effective Label Propagation for Discriminative Semi-Supervised Domain Adaptation

no code implementations 4 Dec 2020 Zhiyong Huang, Kekai Sheng, WeiMing Dong, Xing Mei, Chongyang Ma, Feiyue Huang, Dengwen Zhou, Changsheng Xu

For intra-domain propagation, we propose an effective self-training strategy to mitigate the noises in pseudo-labeled target domain data and improve the feature discriminability in the target domain.

Domain Adaptation Image Classification +1

Arbitrary Video Style Transfer via Multi-Channel Correlation

no code implementations 17 Sep 2020 Yingying Deng, Fan Tang, Wei-Ming Dong, Haibin Huang, Chongyang Ma, Changsheng Xu

Towards this end, we propose Multi-Channel Correction network (MCCNet), which can be trained to fuse the exemplar style features and input content features for efficient style transfer while naturally maintaining the coherence of input videos.

Style Transfer Video Style Transfer

MMCGAN: Generative Adversarial Network with Explicit Manifold Prior

no code implementations 18 Jun 2020 Guanhua Zheng, Jitao Sang, Changsheng Xu

Since the basic assumption of conventional manifold learning fails in the case of sparse and uneven data distribution, we introduce a new target, Minimum Manifold Coding (MMC), for manifold learning to encourage simple and unfolded manifolds.

Generative Adversarial Network

Distribution Aligned Multimodal and Multi-Domain Image Stylization

no code implementations 2 Jun 2020 Minxuan Lin, Fan Tang, Wei-Ming Dong, Xiao Li, Chongyang Ma, Changsheng Xu

Currently, there are few methods that can perform both multimodal and multi-domain stylization simultaneously.

Image Stylization

Attribute-Induced Bias Eliminating for Transductive Zero-Shot Learning

no code implementations 31 May 2020 Hantao Yao, Shaobo Min, Yongdong Zhang, Changsheng Xu

Then, an attentional graph attribute embedding is proposed to reduce the semantic bias between seen and unseen categories, which utilizes the graph operation to capture the semantic relationship between categories.

Attribute Transfer Learning +1

Joint Person Objectness and Repulsion for Person Search

no code implementations 30 May 2020 Hantao Yao, Changsheng Xu

Based on this repulsion constraint, the repulsion term is proposed to reduce the similarity of distractor images that are not most similar to the probe person.

Human Detection Person Search

Arbitrary Style Transfer via Multi-Adaptation Network

2 code implementations 27 May 2020 Yingying Deng, Fan Tang, Wei-Ming Dong, Wen Sun, Feiyue Huang, Changsheng Xu

Arbitrary style transfer is a significant topic with research value and application prospects.

Disentanglement Style Transfer

Adaptive Adversarial Logits Pairing

no code implementations 25 May 2020 Shangxi Wu, Jitao Sang, Kaiyuan Xu, Guanhua Zheng, Changsheng Xu

Specifically, AALP consists of an adaptive feature optimization module with Guided Dropout to systematically pursue fewer high-contribution features, and an adaptive sample weighting module by setting sample-specific training weights to balance between logits pairing loss and classification loss.

Classification General Classification +1

Dynamic Refinement Network for Oriented and Densely Packed Object Detection

1 code implementation CVPR 2020 Xingjia Pan, Yuqiang Ren, Kekai Sheng, Wei-Ming Dong, Haolei Yuan, Xiaowei Guo, Chongyang Ma, Changsheng Xu

However, the detection of oriented and densely packed objects remains challenging because of the following inherent reasons: (1) receptive fields of neurons are all axis-aligned and of the same shape, whereas objects are usually of diverse shapes and align along various directions; (2) detection models are typically trained with generic knowledge and may not generalize well to handle specific objects at test time; (3) the limited dataset hinders the development of this task.

feature selection object-detection +2

Time-Guided High-Order Attention Model of Longitudinal Heterogeneous Healthcare Data

no code implementations 28 Nov 2019 Yi Huang, Xiaoshan Yang, Changsheng Xu

(1) It can model longitudinal heterogeneous EHR data by capturing the 3-order correlations of different modalities and the irregular temporal impact of historical events.

Management Mortality Prediction +1

A Generalization Theory based on Independent and Task-Identically Distributed Assumption

no code implementations 28 Nov 2019 Guanhua Zheng, Jitao Sang, Houqiang Li, Jian Yu, Changsheng Xu

The derived generalization bound based on the ITID assumption identifies the significance of hypothesis invariance in guaranteeing generalization performance.

Image Classification

Adversarial Multimodal Network for Movie Question Answering

no code implementations 24 Jun 2019 Zhaoquan Yuan, Siyuan Sun, Lixin Duan, Xiao Wu, Changsheng Xu

In AMN, as inspired by generative adversarial networks, we propose to learn multimodal feature representations by finding a more coherent subspace for video clips and the corresponding texts (e.g., subtitles and questions).

Question Answering Video Question Answering +1

Depth Information Guided Crowd Counting for Complex Crowd Scenes

no code implementations 3 Mar 2018 Mingliang Xu, Zhaoyang Ge, Xiaoheng Jiang, Gaoge Cui, Pei Lv, Bing Zhou, Changsheng Xu

DigCrowd first uses the depth information of an image to segment the scene into a far-view region and a near-view region.

Crowd Counting

Understanding Deep Learning Generalization by Maximum Entropy

no code implementations ICLR 2018 Guanhua Zheng, Jitao Sang, Changsheng Xu

DNN is then regarded as approximating the feature conditions with multilayer feature learning, and proved to be a recursive solution towards the maximum entropy principle.

regression

Structural Sparse Tracking

no code implementations CVPR 2015 Tianzhu Zhang, Si Liu, Changsheng Xu, Shuicheng Yan, Bernard Ghanem, Narendra Ahuja, Ming-Hsuan Yang

Sparse representation has been applied to visual tracking by finding the best target candidate with minimal reconstruction error by use of target templates.

Visual Tracking

Matching-CNN Meets KNN: Quasi-Parametric Human Parsing

no code implementations CVPR 2015 Si Liu, Xiaodan Liang, Luoqi Liu, Xiaohui Shen, Jianchao Yang, Changsheng Xu, Liang Lin, Xiaochun Cao, Shuicheng Yan

Under the classic K Nearest Neighbor (KNN)-based nonparametric framework, the parametric Matching Convolutional Neural Network (M-CNN) is proposed to predict the matching confidence and displacements of the best matched region in the testing image for a particular semantic region in one KNN image.

Human Parsing
