Search Results for author: Xiaojun Chang

Found 118 papers, 43 papers with code

Compound Rank-k Projections for Bilinear Analysis

no code implementations 23 Nov 2014 Xiaojun Chang, Feiping Nie, Sen Wang, Yi Yang, Xiaofang Zhou, Chengqi Zhang

In many real-world applications, data are represented by matrices or high-order tensors.

Semi-supervised Feature Analysis by Mining Correlations among Multiple Tasks

no code implementations 23 Nov 2014 Xiaojun Chang, Yi Yang

In this paper, we propose a novel semi-supervised feature selection framework by mining correlations among multiple tasks and apply it to different multimedia applications.

feature selection

Balanced k-Means and Min-Cut Clustering

no code implementations 23 Nov 2014 Xiaojun Chang, Feiping Nie, Zhigang Ma, Yi Yang

Clustering is an effective technique in data mining to generate groups that are the matter of interest.

Clustering

Improved Spectral Clustering via Embedded Label Propagation

no code implementations 23 Nov 2014 Xiaojun Chang, Feiping Nie, Yi Yang, Heng Huang

Our algorithm is built upon two advancements of the state of the art: 1) label propagation, which propagates a node's labels to neighboring nodes according to their proximity; and 2) manifold learning, which has been widely used in its capacity to leverage the manifold structure of data points.

Clustering

A Convex Formulation for Spectral Shrunk Clustering

no code implementations 23 Nov 2014 Xiaojun Chang, Feiping Nie, Zhigang Ma, Yi Yang, Xiaofang Zhou

Thus, applying manifold information obtained from the original space to the clustering process in a low-dimensional subspace is prone to inferior performance.

Clustering Dimensionality Reduction

A Convex Sparse PCA for Feature Analysis

no code implementations 23 Nov 2014 Xiaojun Chang, Feiping Nie, Yi Yang, Heng Huang

In addition, based on the sparse model used in CSPCA, an optimal weight is assigned to each of the original features, which in turn provides the output with good interpretability.

Dimensionality Reduction feature selection +1

Unsupervised Feature Analysis with Class Margin Optimization

no code implementations 3 Jun 2015 Sen Wang, Feiping Nie, Xiaojun Chang, Lina Yao, Xue Li, Quan Z. Sheng

In this paper, we propose an unsupervised feature selection method seeking a feature coefficient matrix to select the most distinctive features.

Clustering Feature Correlation +1

Dynamic Concept Composition for Zero-Example Event Detection

no code implementations 14 Jan 2016 Xiaojun Chang, Yi Yang, Guodong Long, Chengqi Zhang, Alexander G. Hauptmann

In this paper, we focus on automatically detecting events in unconstrained videos without the use of any visual training exemplars.

Event Detection Zero-Shot Learning

Strategies for Searching Video Content with Text Queries or Video Examples

no code implementations 17 Jun 2016 Shoou-I Yu, Yi Yang, Zhongwen Xu, Shicheng Xu, Deyu Meng, Zexi Mao, Zhigang Ma, Ming Lin, Xuanchong Li, Huan Li, Zhenzhong Lan, Lu Jiang, Alexander G. Hauptmann, Chuang Gan, Xingzhong Du, Xiaojun Chang

The large number of user-generated videos uploaded onto the Internet every day has led to many commercial video search engines, which mainly rely on text metadata for search.

Event Detection Retrieval +1

Uncovering Locally Discriminative Structure for Feature Analysis

no code implementations 9 Jul 2016 Sen Wang, Feiping Nie, Xiaojun Chang, Xue Li, Quan Z. Sheng, Lina Yao

We propose a method that utilizes both the manifold structure of data and local discriminant information.

Simple to Complex Cross-modal Learning to Rank

no code implementations 4 Feb 2017 Minnan Luo, Xiaojun Chang, Zhihui Li, Liqiang Nie, Alexander G. Hauptmann, Qinghua Zheng

The heterogeneity-gap between different modalities brings a significant challenge to multimedia information retrieval.

Cross-Modal Retrieval Information Retrieval +3

Complex Event Detection by Identifying Reliable Shots From Untrimmed Videos

no code implementations ICCV 2017 Hehe Fan, Xiaojun Chang, De Cheng, Yi Yang, Dong Xu, Alexander G. Hauptmann

relevant) to the given event class, we formulate this task as a multi-instance learning (MIL) problem by taking each video as a bag and the video shots in each video as instances.

Event Detection

Reinforcement Cutting-Agent Learning for Video Object Segmentation

no code implementations CVPR 2018 Junwei Han, Le Yang, Dingwen Zhang, Xiaojun Chang, Xiaodan Liang

In this paper, we formulate this problem as a Markov Decision Process, where agents are learned to segment object regions under a deep reinforcement learning framework.

Decision Making Object +5

Multi-shot Person Re-identification through Set Distance with Visual Distributional Representation

no code implementations 3 Aug 2018 Ting-yao Hu, Xiaojun Chang, Alexander G. Hauptmann

In this work, we propose the idea of visual distributional representation, which interprets an image set as samples drawn from an unknown distribution in appearance feature space.

Person Re-Identification

RCAA: Relational Context-Aware Agents for Person Search

no code implementations ECCV 2018 Xiaojun Chang, Po-Yao Huang, Yi-Dong Shen, Xiaodan Liang, Yi Yang, Alexander G. Hauptmann

In this paper, we address this problem by training relational context-aware agents which learn the actions to localize the target person from the gallery of whole scene images.

Person Search

MMALFM: Explainable Recommendation by Leveraging Reviews and Images

no code implementations 12 Nov 2018 Zhiyong Cheng, Xiaojun Chang, Lei Zhu, Rose C. Kanjirathinkal, Mohan Kankanhalli

Then the aspect importance is integrated into a novel aspect-aware latent factor model (ALFM), which learns users' and items' latent factors based on ratings.

Explainable Recommendation

Ensemble Teaching for Hybrid Label Propagation

no code implementations 8 Apr 2019 Chen Gong, DaCheng Tao, Xiaojun Chang, Jian Yang

More importantly, HyDEnT conducts propagation under the guidance of an ensemble of teachers.

Continual Reinforcement Learning with Diversity Exploration and Adversarial Self-Correction

no code implementations 21 Jun 2019 Fengda Zhu, Xiaojun Chang, Runhao Zeng, Mingkui Tan

We first develop an unsupervised diversity exploration method to learn task-specific skills using an unsupervised objective.

Autonomous Driving Continuous Control +2

Multi-Head Attention with Diversity for Learning Grounded Multilingual Multimodal Representations

no code implementations IJCNLP 2019 Po-Yao Huang, Xiaojun Chang, Alexander Hauptmann

With the aim of promoting and understanding the multilingual version of image search, we leverage visual object detection and propose a model with diverse multi-head attention to learn grounded multilingual multimodal representations.

Image Retrieval object-detection +2

Vision-Language Navigation with Self-Supervised Auxiliary Reasoning Tasks

no code implementations CVPR 2020 Fengda Zhu, Yi Zhu, Xiaojun Chang, Xiaodan Liang

In this paper, we introduce Auxiliary Reasoning Navigation (AuxRN), a framework with four self-supervised auxiliary reasoning tasks to take advantage of the additional training signals derived from the semantic information.

Navigate Vision-Language Navigation

Unity Style Transfer for Person Re-Identification

no code implementations CVPR 2020 Chong Liu, Xiaojun Chang, Yi-Dong Shen

To solve this problem, we propose a UnityStyle adaption method, which can smooth the style disparities within the same camera and across different cameras.

Person Re-Identification Style Transfer +1

ZSTAD: Zero-Shot Temporal Activity Detection

no code implementations CVPR 2020 Lingling Zhang, Xiaojun Chang, Jun Liu, Minnan Luo, Sen Wang, ZongYuan Ge, Alexander Hauptmann

An integral part of video analysis and surveillance is temporal activity detection, which means to simultaneously recognize and localize activities in long untrimmed videos.

Action Detection Activity Detection

Vision-Dialog Navigation by Exploring Cross-modal Memory

1 code implementation CVPR 2020 Yi Zhu, Fengda Zhu, Zhaohuan Zhan, Bingqian Lin, Jianbin Jiao, Xiaojun Chang, Xiaodan Liang

Benefiting from the collaborative learning of the L-mem and the V-mem, our CMN is able to explore the memory about the decision making of historical navigation actions which is for the current step.

Decision Making

Connecting the Dots: Multivariate Time Series Forecasting with Graph Neural Networks

2 code implementations 24 May 2020 Zonghan Wu, Shirui Pan, Guodong Long, Jing Jiang, Xiaojun Chang, Chengqi Zhang

Modeling multivariate time series has long been a subject that has attracted researchers from a diverse range of fields including economics, finance, and traffic.

Graph Learning Multivariate Time Series Forecasting +3

A Comprehensive Survey of Neural Architecture Search: Challenges and Solutions

no code implementations 1 Jun 2020 Pengzhen Ren, Yun Xiao, Xiaojun Chang, Po-Yao Huang, Zhihui Li, Xiaojiang Chen, Xin Wang

Neural Architecture Search (NAS) is just such a revolutionary algorithm, and the related research work is complicated and rich.

Neural Architecture Search

Auxiliary Signal-Guided Knowledge Encoder-Decoder for Medical Report Generation

2 code implementations 6 Jun 2020 Mingjie Li, Fuyu Wang, Xiaojun Chang, Xiaodan Liang

Firstly, the regions of primary interest to radiologists are usually located in a small area of the global image, meaning that the remaining parts of the image could be considered as irrelevant noise in the training procedure.

Image Captioning Medical Report Generation +1

Multi-view Drone-based Geo-localization via Style and Spatial Alignment

no code implementations 23 Jun 2020 Siyi Hu, Xiaojun Chang

In this paper, we focus on the task of multi-view multi-source geo-localization, which serves as an important auxiliary method of GPS positioning by matching drone-view image and satellite-view image with pre-annotated GPS tag.

TAG

Accurate Bounding-box Regression with Distance-IoU Loss for Visual Tracking

no code implementations 3 Jul 2020 Di Yuan, Xiu Shu, Nana Fan, Xiaojun Chang, Qiao Liu, Zhenyu He

Moreover, we introduce a classification part that is trained online and optimized with a Conjugate-Gradient-based strategy to guarantee real-time tracking speed.

regression Visual Tracking

Self-Weighted Robust LDA for Multiclass Classification with Edge Classes

no code implementations 24 Sep 2020 Caixia Yan, Xiaojun Chang, Minnan Luo, Qinghua Zheng, Xiaoqin Zhang, Zhihui Li, Feiping Nie

In this regard, a novel self-weighted robust LDA with l21-norm based pairwise between-class distance criterion, called SWRLDA, is proposed for multi-class classification especially with edge classes.

Classification Computational Efficiency +2

Hierarchical Neural Architecture Search for Deep Stereo Matching

1 code implementation NeurIPS 2020 Xuelian Cheng, Yiran Zhong, Mehrtash Harandi, Yuchao Dai, Xiaojun Chang, Tom Drummond, Hongdong Li, ZongYuan Ge

To reduce the human efforts in neural network design, Neural Architecture Search (NAS) has been applied with remarkable success to various high-level vision tasks such as classification and semantic segmentation.

Neural Architecture Search Semantic Segmentation +3

Differentiable Neural Architecture Search in Equivalent Space with Exploration Enhancement

1 code implementation NeurIPS 2020 Miao Zhang, Huiqi Li, Shirui Pan, Xiaojun Chang, ZongYuan Ge, Steven Su

A probabilistic exploration enhancement method is accordingly devised to encourage intelligent exploration during the architecture search in the latent space, to avoid local optima in architecture search.

Bilevel Optimization Neural Architecture Search

Exploring Inter-Channel Correlation for Diversity-Preserved Knowledge Distillation

2 code implementations ICCV 2021 Li Liu, Qingle Huang, Sihao Lin, Hongwei Xie, Bing Wang, Xiaojun Chang, Xiaodan Liang

Extensive experiments on two vision tasks, including ImageNet classification and Pascal VOC segmentation, demonstrate the superiority of our ICKD, which consistently outperforms many existing methods, advancing the state-of-the-art in the fields of Knowledge Distillation.

Knowledge Distillation

UPDeT: Universal Multi-agent RL via Policy Decoupling with Transformers

no code implementations ICLR 2021 Siyi Hu, Fengda Zhu, Xiaojun Chang, Xiaodan Liang

Recent advances in multi-agent reinforcement learning have been largely limited in training one model from scratch for every new task.

reinforcement-learning Reinforcement Learning (RL) +1

EXPLORING VULNERABILITIES OF BERT-BASED APIS

no code implementations 1 Jan 2021 Xuanli He, Lingjuan Lyu, Lichao Sun, Xiaojun Chang, Jun Zhao

We then demonstrate how the extracted model can be exploited to develop effective attribute inference attack to expose sensitive information of the training data.

Attribute Inference Attack +4

UPDeT: Universal Multi-agent Reinforcement Learning via Policy Decoupling with Transformers

1 code implementation 20 Jan 2021 Siyi Hu, Fengda Zhu, Xiaojun Chang, Xiaodan Liang

Recent advances in multi-agent reinforcement learning have been largely limited in training one model from scratch for every new task.

reinforcement-learning Reinforcement Learning (RL) +1

A Comprehensive Survey of Scene Graphs: Generation and Application

no code implementations 17 Mar 2021 Xiaojun Chang, Pengzhen Ren, Pengfei Xu, Zhihui Li, Xiaojiang Chen, Alex Hauptmann

For example, given an image, we want to not only detect and recognize objects in the image, but also know the relationship between objects (visual relationship detection), and generate a text description (image captioning) based on the image content.

Image Captioning Question Answering +4

NAS-TC: Neural Architecture Search on Temporal Convolutions for Complex Action Recognition

no code implementations 17 Mar 2021 Pengzhen Ren, Gang Xiao, Xiaojun Chang, Yun Xiao, Zhihui Li, Xiaojiang Chen

Accordingly, because of the automated design of its network structure, Neural architecture search (NAS) has achieved great success in the image processing field and attracted substantial research attention in recent years.

Action Recognition In Videos Neural Architecture Search

BossNAS: Exploring Hybrid CNN-transformers with Block-wisely Self-supervised Neural Architecture Search

1 code implementation ICCV 2021 Changlin Li, Tao Tang, Guangrun Wang, Jiefeng Peng, Bing Wang, Xiaodan Liang, Xiaojun Chang

In this work, we present Block-wisely Self-supervised Neural Architecture Search (BossNAS), an unsupervised NAS method that addresses the problem of inaccurate architecture rating caused by large weight-sharing space and biased supervision in previous methods.

Image Classification Neural Architecture Search +1

Dynamic Slimmable Network

1 code implementation CVPR 2021 Changlin Li, Guangrun Wang, Bing Wang, Xiaodan Liang, Zhihui Li, Xiaojun Chang

Here, we explore a dynamic network slimming regime, named Dynamic Slimmable Network (DS-Net), which aims to achieve good hardware-efficiency via dynamically adjusting filter numbers of networks at test time with respect to different inputs, while keeping filters stored statically and contiguously in hardware to prevent the extra burden.

Fairness Model Compression

SOON: Scenario Oriented Object Navigation with Graph-based Exploration

1 code implementation CVPR 2021 Fengda Zhu, Xiwen Liang, Yi Zhu, Xiaojun Chang, Xiaodan Liang

In this task, an agent is required to navigate from an arbitrary position in a 3D embodied environment to localize a target following a scene description.

Attribute Navigate +2

Attribute-Modulated Generative Meta Learning for Zero-Shot Classification

no code implementations 22 Apr 2021 Yun Li, Zhe Liu, Lina Yao, Xiaojun Chang

The promising strategies for ZSL are to synthesize visual features of unseen classes conditioned on semantic side information and to incorporate meta-learning to eliminate the model's inherent bias towards seen classes.

Attribute Classification +6

Person Search Challenges and Solutions: A Survey

no code implementations 1 May 2021 Xiangtan Lin, Pengzhen Ren, Yun Xiao, Xiaojun Chang, Alex Hauptmann

This paper surveyed the recent works on image-based and text-based person search from the perspective of challenges and solutions.

Person Search Text based Person Search

iDARTS: Differentiable Architecture Search with Stochastic Implicit Gradients

1 code implementation 21 Jun 2021 Miao Zhang, Steven Su, Shirui Pan, Xiaojun Chang, Ehsan Abbasnejad, Reza Haffari

A key challenge to the scalability and quality of the learned architectures is the need for differentiating through the inner-loop optimisation.

Neural Architecture Search

Deep Learning for Embodied Vision Navigation: A Survey

no code implementations 7 Jul 2021 Fengda Zhu, Yi Zhu, Vincent CS Lee, Xiaodan Liang, Xiaojun Chang

A navigation agent is supposed to have various intelligent skills, such as visual perceiving, mapping, planning, exploring and reasoning, etc.

Autonomous Driving Navigate +1

Legislator Representation Learning with Social Context and Expert Knowledge

1 code implementation 9 Aug 2021 Shangbin Feng, Zhaoxuan Tan, Zilong Chen, Peisheng Yu, Qinghua Zheng, Xiaojun Chang, Minnan Luo

Modeling the ideological perspectives of political actors is an essential task in computational political science with applications in many downstream tasks.

Representation Learning Stance Detection

Unsupervised Person Re-Identification: A Systematic Survey of Challenges and Solutions

no code implementations 1 Sep 2021 Xiangtan Lin, Pengzhen Ren, Chung-Hsing Yeh, Lina Yao, Andy Song, Xiaojun Chang

Therefore, comprehensive surveys on this topic are essential to summarise challenges and solutions to foster future research.

Unsupervised Person Re-Identification

Semantics-Guided Contrastive Network for Zero-Shot Object detection

no code implementations 4 Sep 2021 Caixia Yan, Xiaojun Chang, Minnan Luo, Huan Liu, Xiaoqin Zhang, Qinghua Zheng

To address these issues, we develop a novel Semantics-Guided Contrastive Network for ZSD, named ContrastZSD, a detection framework that first brings contrastive learning mechanism into the realm of zero-shot detection.

Contrastive Learning Generalized Zero-Shot Object Detection +3

DS-Net++: Dynamic Weight Slicing for Efficient Inference in CNNs and Transformers

1 code implementation 21 Sep 2021 Changlin Li, Guangrun Wang, Bing Wang, Xiaodan Liang, Zhihui Li, Xiaojun Chang

Dynamic networks have shown their promising capability in reducing theoretical computation complexity by adapting their architectures to the input during inference.

Fairness Model Compression

Role Diversity Matters: A Study of Cooperative Training Strategies for Multi-Agent RL

no code implementations 29 Sep 2021 Siyi Hu, Chuanlong Xie, Xiaodan Liang, Xiaojun Chang

In addition, role diversity can help to find a better training strategy and increase performance in cooperative MARL.

SMAC+ Starcraft +1

Reliable Shot Identification for Complex Event Detection via Visual-Semantic Embedding

no code implementations 12 Oct 2021 Minnan Luo, Xiaojun Chang, Chen Gong

In this paper, we decompose the video into several segments and intuitively model the task of complex event detection as a multiple instance learning problem by representing each video as a "bag" of segments in which each segment is referred to as an instance.

Event Detection Multiple Instance Learning

Active Learning for Deep Visual Tracking

no code implementations 17 Oct 2021 Di Yuan, Xiaojun Chang, Yi Yang, Qiao Liu, Dehua Wang, Zhenyu He

In this paper, we propose an active learning method for deep visual tracking, which selects and annotates the unlabeled samples to train the deep CNNs model.

Active Learning Visual Tracking

Dynamic Slimmable Denoising Network

no code implementations 17 Oct 2021 Zutao Jiang, Changlin Li, Xiaojun Chang, Jihua Zhu, Yi Yang

Here, we present dynamic slimmable denoising network (DDS-Net), a general method to achieve good denoising quality with less computational complexity, via dynamically adjusting the channel configurations of networks at test time with respect to different noisy images.

Fairness Image Denoising

Signature-Graph Networks

no code implementations 22 Oct 2021 Ali Hamdi, Flora Salim, Du Yong Kim, Xiaojun Chang

SGN constructs unique undirected graphs for each image based on the CNN feature maps.

Image Classification Representation Learning

An Entropy-guided Reinforced Partial Convolutional Network for Zero-Shot Learning

no code implementations 3 Nov 2021 Yun Li, Zhe Liu, Lina Yao, Xianzhi Wang, Julian McAuley, Xiaojun Chang

Zero-Shot Learning (ZSL) aims to transfer learned knowledge from observed classes to unseen classes via semantic correlations.

Generalized Zero-Shot Learning

BaLeNAS: Differentiable Architecture Search via the Bayesian Learning Rule

no code implementations CVPR 2022 Miao Zhang, Jilin Hu, Steven Su, Shirui Pan, Xiaojun Chang, Bin Yang, Gholamreza Haffari

Differentiable Architecture Search (DARTS) has received massive attention in recent years, mainly because it significantly reduces the computational cost through weight sharing and continuous relaxation.

Neural Architecture Search Variational Inference

Diversity-boosted Generalization-Specialization Balancing for Zero-shot Learning

no code implementations 6 Jan 2022 Yun Li, Zhe Liu, Xiaojun Chang, Julian McAuley, Lina Yao

We further propose a differentiable dataset-level balance and update the weights in a linear annealing schedule to simulate network pruning and thus obtain the optimal structure for BSNet with dataset-level balance achieved.

Meta-Learning Network Pruning +1

Exploring Inter-Channel Correlation for Diversity-preserved Knowledge Distillation

1 code implementation 8 Feb 2022 Li Liu, Qingle Huang, Sihao Lin, Hongwei Xie, Bing Wang, Xiaojun Chang, Xiaodan Liang

Extensive experiments on two vision tasks, including ImageNet classification and Pascal VOC segmentation, demonstrate the superiority of our ICKD, which consistently outperforms many existing methods, advancing the state-of-the-art in the fields of Knowledge Distillation.

Knowledge Distillation

Voice-Face Homogeneity Tells Deepfake

no code implementations 4 Mar 2022 Harry Cheng, Yangyang Guo, Tianyi Wang, Qi Li, Xiaojun Chang, Liqiang Nie

To this end, a voice-face matching method is devised to measure the matching degree of these two.

Beyond Fixation: Dynamic Window Visual Transformer

1 code implementation CVPR 2022 Pengzhen Ren, Changlin Li, Guangrun Wang, Yun Xiao, Qing Du, Xiaodan Liang, Xiaojun Chang

Recently, a surge of interest in visual transformers is to reduce the computational cost by limiting the calculation of self-attention to a local window.

Automated Progressive Learning for Efficient Training of Vision Transformers

1 code implementation CVPR 2022 Changlin Li, Bohan Zhuang, Guangrun Wang, Xiaodan Liang, Xiaojun Chang, Yi Yang

First, we develop a strong manual baseline for progressive learning of ViTs, by introducing momentum growth (MoGrow) to bridge the gap brought by model growth.

PRE-NAS: Predictor-assisted Evolutionary Neural Architecture Search

no code implementations 27 Apr 2022 Yameng Peng, Andy Song, Vic Ciesielski, Haytham M. Fayek, Xiaojun Chang

This often requires a high computational overhead to evaluate a number of candidate networks from the set of all possible networks in the search space during the search.

Neural Architecture Search

Towards Explanation for Unsupervised Graph-Level Representation Learning

1 code implementation 20 May 2022 Qinghua Zheng, Jihong Wang, Minnan Luo, YaoLiang Yu, Jundong Li, Lina Yao, Xiaojun Chang

Due to the superior performance of Graph Neural Networks (GNNs) in various domains, there is an increasing interest in the GNN explanation problem "which fraction of the input graph is the most crucial to decide the model's decision?"

Decision Making Graph Classification +2

Policy Diagnosis via Measuring Role Diversity in Cooperative Multi-agent RL

no code implementations 1 Jun 2022 Siyi Hu, Chuanlong Xie, Xiaodan Liang, Xiaojun Chang

In this study, we quantify the agent's behavior difference and build its relationship with the policy performance via Role Diversity, a metric to measure the characteristics of MARL tasks.

SMAC+ Starcraft

Cross-modal Clinical Graph Transformer for Ophthalmic Report Generation

no code implementations CVPR 2022 Mingjie Li, Wenjia Cai, Karin Verspoor, Shirui Pan, Xiaodan Liang, Xiaojun Chang

To endow models with the capability of incorporating expert knowledge, we propose a Cross-modal clinical Graph Transformer (CGT) for ophthalmic report generation (ORG), in which clinical relation triples are injected into the visual features as prior knowledge to drive the decoding procedure.

Clinical Knowledge Medical Report Generation

Domain Adaptive Nuclei Instance Segmentation and Classification via Category-aware Feature Alignment and Pseudo-labelling

no code implementations 4 Jul 2022 Canran Li, Dongnan Liu, Haoran Li, Zheng Zhang, Guangming Lu, Xiaojun Chang, Weidong Cai

In this work, we propose a novel deep neural network, namely Category-Aware feature alignment and Pseudo-Labelling Network (CAPL-Net) for UDA nuclei instance segmentation and classification.

Classification Instance Segmentation +3

An Efficient Spatio-Temporal Pyramid Transformer for Action Detection

no code implementations 21 Jul 2022 Yuetian Weng, Zizheng Pan, Mingfei Han, Xiaojun Chang, Bohan Zhuang

The task of action detection aims at deducing both the action category and localization of the start and end moment for each action instance in a long, untrimmed video.

Action Detection Video Understanding

Spatial-temporal Analysis for Automated Concrete Workability Estimation

no code implementations 24 Jul 2022 Litao Yu, Jian Zhang, Mohammed Bennamoun, Xiaojun Chang, Vute Sirivivatnanon, Ali Nezhad

Concrete workability measure is mostly determined based on subjective assessment of a certified assessor with visual inspections.

regression

Prompt-driven efficient Open-set Semi-supervised Learning

no code implementations 28 Sep 2022 Haoran Li, Chun-Mei Feng, Tao Zhou, Yong Xu, Xiaojun Chang

In this paper, we propose a prompt-driven efficient OSSL framework, called OpenPrompt, which can propagate class information from labeled to unlabeled data with only a small number of trainable parameters.

Computational Efficiency Outlier Detection

MARLlib: A Scalable and Efficient Multi-agent Reinforcement Learning Library

1 code implementation 11 Oct 2022 Siyi Hu, Yifan Zhong, Minquan Gao, Weixun Wang, Hao Dong, Xiaodan Liang, Zhihui Li, Xiaojun Chang, Yaodong Yang

A significant challenge facing researchers in the area of multi-agent reinforcement learning (MARL) pertains to the identification of a library that can offer fast and compatible development for multi-agent tasks and algorithm combinations, while obviating the need to consider compatibility issues.

Multi-agent Reinforcement Learning reinforcement-learning +1

PAR: Political Actor Representation Learning with Social Context and Expert Knowledge

1 code implementation 15 Oct 2022 Shangbin Feng, Zhaoxuan Tan, Zilong Chen, Ningnan Wang, Peisheng Yu, Qinghua Zheng, Xiaojun Chang, Minnan Luo

Extensive experiments demonstrate that PAR is better at augmenting political text understanding and successfully advances the state-of-the-art in political perspective detection and roll call vote prediction.

Representation Learning

Learning Self-Regularized Adversarial Views for Self-Supervised Vision Transformers

1 code implementation 16 Oct 2022 Tao Tang, Changlin Li, Guangrun Wang, Kaicheng Yu, Xiaojun Chang, Xiaodan Liang

Despite the success, its development and application on self-supervised vision transformers have been hindered by several barriers, including the high search cost, the lack of supervision, and the unsuitable search space.

Data Augmentation Image Retrieval +3

Simple Primitives with Feasibility- and Contextuality-Dependence for Open-World Compositional Zero-shot Learning

no code implementations 5 Nov 2022 Zhe Liu, Yun Li, Lina Yao, Xiaojun Chang, Wei Fang, XiaoJun Wu, Yi Yang

We design Semantic Attention (SA) and generative Knowledge Disentanglement (KD) to learn the dependence of feasibility and contextuality, respectively.

Compositional Zero-Shot Learning Disentanglement

3D-TOGO: Towards Text-Guided Cross-Category 3D Object Generation

no code implementations 2 Dec 2022 Zutao Jiang, Guansong Lu, Xiaodan Liang, Jihua Zhu, Wei Zhang, Xiaojun Chang, Hang Xu

Here, we make the first attempt to achieve generic text-guided cross-category 3D object generation via a new 3D-TOGO model, which integrates a text-to-views generation module and a views-to-3D generation module.

3D Generation Contrastive Learning +2

HTML: Hybrid Temporal-scale Multimodal Learning Framework for Referring Video Object Segmentation

no code implementations ICCV 2023 Mingfei Han, Yali Wang, Zhihui Li, Lina Yao, Xiaojun Chang, Yu Qiao

To tackle this problem, we propose a concise Hybrid Temporal-scale Multimodal Learning (HTML) framework, which can effectively align lingual and visual features to discover core object semantics in the video, by learning multimodal interaction hierarchically from different temporal scales.

Ranked #6 on Referring Video Object Segmentation on Refer-YouTube-VOS (using extra training data)

Object Referring Video Object Segmentation +2

ViewCo: Discovering Text-Supervised Segmentation Masks via Multi-View Semantic Consistency

1 code implementation 31 Jan 2023 Pengzhen Ren, Changlin Li, Hang Xu, Yi Zhu, Guangrun Wang, Jianzhuang Liu, Xiaojun Chang, Xiaodan Liang

Specifically, we first propose text-to-views consistency modeling to learn correspondence for multiple views of the same input image.

Segmentation Semantic Segmentation

Guided Image-to-Image Translation by Discriminator-Generator Communication

no code implementations 7 Mar 2023 Yuanjiang Cao, Lina Yao, Le Pan, Quan Z. Sheng, Xiaojun Chang

The goal of Image-to-image (I2I) translation is to transfer an image from a source domain to a target domain, which has recently drawn increasing attention.

Generative Adversarial Network Image-to-Image Translation +1

Dynamic Graph Enhanced Contrastive Learning for Chest X-ray Report Generation

1 code implementation CVPR 2023 Mingjie Li, Bingqian Lin, Zicong Chen, Haokun Lin, Xiaodan Liang, Xiaojun Chang

To address the limitation, we propose a knowledge graph with Dynamic structure and nodes to facilitate medical report generation with Contrastive Learning, named DCL.

Contrastive Learning General Knowledge +2

Toward the Automated Construction of Probabilistic Knowledge Graphs for the Maritime Domain

no code implementations 4 May 2023 Fatemeh Shiri, Teresa Wang, Shirui Pan, Xiaojun Chang, Yuan-Fang Li, Reza Haffari, Van Nguyen, Shuang Yu

In order to exploit the potentially useful and rich information from such sources, it is necessary to extract not only the relevant entities and concepts but also their semantic relations, together with the uncertainty associated with the extracted knowledge (i.e., in the form of probabilistic knowledge graphs).

Knowledge Graphs

Maximum Entropy Heterogeneous-Agent Reinforcement Learning

1 code implementation 19 Jun 2023 Jiarong Liu, Yifan Zhong, Siyi Hu, Haobo Fu, Qiang Fu, Xiaojun Chang, Yaodong Yang

We embed cooperative MARL problems into probabilistic graphical models, from which we derive the maximum entropy (MaxEnt) objective for MARL.

Multi-agent Reinforcement Learning reinforcement-learning +1

Two-stream Multi-level Dynamic Point Transformer for Two-person Interaction Recognition

no code implementations 22 Jul 2023 Yao Liu, Gangfeng Cui, Jiahui Luo, Lina Yao, Xiaojun Chang

Subsequently, a frame features learning module and a two-stream multi-level feature aggregation module extract global and partial features from the sampled frames, effectively representing the local-region spatial information, appearance information, and motion information related to the interactions.

Action Recognition Temporal Action Localization

FULLER: Unified Multi-modality Multi-task 3D Perception via Multi-level Gradient Calibration

no code implementations ICCV 2023 Zhijian Huang, Sihao Lin, Guiyu Liu, Mukun Luo, Chaoqiang Ye, Hang Xu, Xiaojun Chang, Xiaodan Liang

Specifically, the gradients, produced by the task heads and used to update the shared backbone, will be calibrated at the backbone's last layer to alleviate the task conflict.

Autonomous Driving Multi-Task Learning

SSMG: Spatial-Semantic Map Guided Diffusion Model for Free-form Layout-to-Image Generation

no code implementations 20 Aug 2023 Chengyou Jia, Minnan Luo, Zhuohang Dang, Guang Dai, Xiaojun Chang, Mengmeng Wang, Jingdong Wang

Despite significant progress in Text-to-Image (T2I) generative models, even lengthy and complex text descriptions still struggle to convey detailed controls.

Layout-to-Image Generation

PSDiff: Diffusion Model for Person Search with Iterative and Collaborative Refinement

no code implementations 20 Sep 2023 Chengyou Jia, Minnan Luo, Zhuohang Dang, Guang Dai, Xiaojun Chang, Jingdong Wang

Dominant Person Search methods aim to localize and recognize query persons in a unified network, which jointly optimizes two sub-tasks, i.e., pedestrian detection and Re-IDentification (ReID).

Denoising Pedestrian Detection +2

No Token Left Behind: Efficient Vision Transformer via Dynamic Token Idling

1 code implementation 9 Oct 2023 Xuwei Xu, Changlin Li, Yudong Chen, Xiaojun Chang, Jiajun Liu, Sen Wang

By allowing the idle tokens to be re-selected in the following layers, IdleViT mitigates the negative impact of improper pruning in the early stages.

Mask Propagation for Efficient Video Semantic Segmentation

1 code implementation NeurIPS 2023 Yuetian Weng, Mingfei Han, Haoyu He, Mingjie Li, Lina Yao, Xiaojun Chang, Bohan Zhuang

By reusing predictions from key frames, we circumvent the need to process a large volume of video frames individually with resource-intensive segmentors, alleviating temporal redundancy and significantly reducing computational costs.

Semantic Segmentation Video Semantic Segmentation

Disentangled Representation Learning with Transmitted Information Bottleneck

no code implementations 3 Nov 2023 Zhuohang Dang, Minnan Luo, Chengyou Jia, Guang Dai, Jihong Wang, Xiaojun Chang, Jingdong Wang, Qinghua Zheng

Encoding only the task-related information from the raw data, i.e., disentangled representation learning, can greatly contribute to the robustness and generalizability of models.

Disentanglement Variational Inference

Video Recognition in Portrait Mode

1 code implementation 21 Dec 2023 Mingfei Han, Linjie Yang, Xiaojie Jin, Jiashi Feng, Xiaojun Chang, Heng Wang

While existing datasets mainly comprise landscape mode videos, our paper seeks to introduce portrait mode videos to the research community and highlight the unique challenges associated with this video format.

Data Augmentation Video Recognition

DNA Family: Boosting Weight-Sharing NAS with Block-Wise Supervisions

1 code implementation 2 Mar 2024 Guangrun Wang, Changlin Li, Liuchun Yuan, Jiefeng Peng, Xiaoyu Xian, Xiaodan Liang, Xiaojun Chang, Liang Lin

Addressing this problem, we modularize a large search space into blocks with small search spaces and develop a family of models with the distilling neural architecture (DNA) techniques.

Neural Architecture Search

SWAP-NAS: Sample-Wise Activation Patterns for Ultra-fast NAS

no code implementations 7 Mar 2024 Yameng Peng, Andy Song, Haytham M. Fayek, Vic Ciesielski, Xiaojun Chang

The SWAP-Score is strongly correlated with ground-truth performance across various search spaces and tasks, outperforming 15 existing training-free metrics on NAS-Bench-101/201/301 and TransNAS-Bench-101.

Neural Architecture Search

NavCoT: Boosting LLM-Based Vision-and-Language Navigation via Learning Disentangled Reasoning

1 code implementation 12 Mar 2024 Bingqian Lin, Yunshuang Nie, Ziming Wei, Jiaqi Chen, Shikui Ma, Jianhua Han, Hang Xu, Xiaojun Chang, Xiaodan Liang

Vision-and-Language Navigation (VLN), as a crucial research problem of Embodied AI, requires an embodied agent to navigate through complex 3D environments following natural language instructions.

Navigate Vision and Language Navigation

Unified Static and Dynamic Network: Efficient Temporal Filtering for Video Grounding

1 code implementation 21 Mar 2024 Jingjing Hu, Dan Guo, Kun Li, Zhan Si, Xun Yang, Xiaojun Chang, Meng Wang

Inspired by the activity-silent and persistent activity mechanisms in human visual perception biology, we design a Unified Static and Dynamic Network (UniSDNet), to learn the semantic association between the video and text/audio queries in a cross-modal environment for efficient video grounding.

Video Grounding

Self-Supervised Multi-Frame Neural Scene Flow

no code implementations 24 Mar 2024 Dongrui Liu, Daqi Liu, Xueqian Li, Sihao Lin, Hongwei Xie, Bing Wang, Xiaojun Chang, Lei Chu

Neural Scene Flow Prior (NSFP) and Fast Neural Scene Flow (FNSF) have shown remarkable adaptability in the context of large out-of-distribution autonomous driving.

Autonomous Driving Scene Flow Estimation

LongVLM: Efficient Long Video Understanding via Large Language Models

1 code implementation 4 Apr 2024 Yuetian Weng, Mingfei Han, Haoyu He, Xiaojun Chang, Bohan Zhuang

In this way, we encode video representations that incorporate both local and global information, enabling the LLM to generate comprehensive responses for long-term videos.

Question Answering Video Question Answering +1

MLP Can Be A Good Transformer Learner

1 code implementation 8 Apr 2024 Sihao Lin, Pumeng Lyu, Dongrui Liu, Tao Tang, Xiaodan Liang, Andy Song, Xiaojun Chang

We identify that, for the attention layers in the bottom blocks, their subsequent MLP layers, i.e., two feed-forward layers, can elicit the same entropy quantity.

Mining Inter-Video Proposal Relations for Video Object Detection

1 code implementation ECCV 2020 Mingfei Han, Yali Wang, Xiaojun Chang, Yu Qiao

Recent studies have shown that aggregating context information from proposals in different frames can clearly enhance the performance of video object detection.

Object object-detection +3
