Search Results for author: Xiaojun Chang

Found 118 papers, 43 papers with code

Compound Rank-k Projections for Bilinear Analysis

no code implementations • 23 Nov 2014 • Xiaojun Chang, Feiping Nie, Sen Wang, Yi Yang, Xiaofang Zhou, Chengqi Zhang

In many real-world applications, data are represented by matrices or high-order tensors.

Semi-supervised Feature Analysis by Mining Correlations among Multiple Tasks

no code implementations • 23 Nov 2014 • Xiaojun Chang, Yi Yang

In this paper, we propose a novel semi-supervised feature selection framework by mining correlations among multiple tasks and apply it to different multimedia applications.

feature selection

Balanced k-Means and Min-Cut Clustering

no code implementations • 23 Nov 2014 • Xiaojun Chang, Feiping Nie, Zhigang Ma, Yi Yang

Clustering is an effective technique in data mining to generate groups that are the matter of interest.

Clustering

Improved Spectral Clustering via Embedded Label Propagation

no code implementations • 23 Nov 2014 • Xiaojun Chang, Feiping Nie, Yi Yang, Heng Huang

Our algorithm is built upon two advancements of the state of the art: 1) label propagation, which propagates a node's labels to neighboring nodes according to their proximity; and 2) manifold learning, which has been widely used in its capacity to leverage the manifold structure of data points.

Clustering

A Convex Formulation for Spectral Shrunk Clustering

no code implementations • 23 Nov 2014 • Xiaojun Chang, Feiping Nie, Zhigang Ma, Yi Yang, Xiaofang Zhou

Thus, applying manifold information obtained from the original space to the clustering process in a low-dimensional subspace is prone to inferior performance.

Clustering Dimensionality Reduction

A Convex Sparse PCA for Feature Analysis

no code implementations • 23 Nov 2014 • Xiaojun Chang, Feiping Nie, Yi Yang, Heng Huang

In addition, based on the sparse model used in CSPCA, an optimal weight is assigned to each of the original features, which in turn provides the output with good interpretability.

Dimensionality Reduction feature selection +1

Unsupervised Feature Analysis with Class Margin Optimization

no code implementations • 3 Jun 2015 • Sen Wang, Feiping Nie, Xiaojun Chang, Lina Yao, Xue Li, Quan Z. Sheng

In this paper, we propose an unsupervised feature selection method seeking a feature coefficient matrix to select the most distinctive features.

Clustering Feature Correlation +1

Dynamic Concept Composition for Zero-Example Event Detection

no code implementations • 14 Jan 2016 • Xiaojun Chang, Yi Yang, Guodong Long, Chengqi Zhang, Alexander G. Hauptmann

In this paper, we focus on automatically detecting events in unconstrained videos without the use of any visual training exemplars.

Event Detection Zero-Shot Learning

They Are Not Equally Reliable: Semantic Event Search Using Differentiated Concept Classifiers

no code implementations • CVPR 2016 • Xiaojun Chang, Yao-Liang Yu, Yi Yang, Eric P. Xing

Complex event detection on unconstrained Internet videos has seen much progress in recent years.

Event Detection

Strategies for Searching Video Content with Text Queries or Video Examples

no code implementations • 17 Jun 2016 • Shoou-I Yu, Yi Yang, Zhongwen Xu, Shicheng Xu, Deyu Meng, Zexi Mao, Zhigang Ma, Ming Lin, Xuanchong Li, Huan Li, Zhenzhong Lan, Lu Jiang, Alexander G. Hauptmann, Chuang Gan, Xingzhong Du, Xiaojun Chang

The large number of user-generated videos uploaded to the Internet every day has led to many commercial video search engines, which mainly rely on text metadata for search.

Event Detection Retrieval +1

Uncovering Locally Discriminative Structure for Feature Analysis

no code implementations • 9 Jul 2016 • Sen Wang, Feiping Nie, Xiaojun Chang, Xue Li, Quan Z. Sheng, Lina Yao

We propose a method that utilizes both the manifold structure of data and local discriminant information.

Simple to Complex Cross-modal Learning to Rank

no code implementations • 4 Feb 2017 • Minnan Luo, Xiaojun Chang, Zhihui Li, Liqiang Nie, Alexander G. Hauptmann, Qinghua Zheng

The heterogeneity-gap between different modalities brings a significant challenge to multimedia information retrieval.

Cross-Modal Retrieval Information Retrieval +3

Complex Event Detection by Identifying Reliable Shots From Untrimmed Videos

no code implementations • ICCV 2017 • Hehe Fan, Xiaojun Chang, De Cheng, Yi Yang, Dong Xu, Alexander G. Hauptmann

relevant) to the given event class, we formulate this task as a multi-instance learning (MIL) problem by taking each video as a bag and the video shots in each video as instances.

Event Detection

Reinforcement Cutting-Agent Learning for Video Object Segmentation

no code implementations • CVPR 2018 • Junwei Han, Le Yang, Dingwen Zhang, Xiaojun Chang, Xiaodan Liang

In this paper, we formulate this problem as a Markov Decision Process, where agents learn to segment object regions under a deep reinforcement learning framework.

Decision Making Object +5

Multi-shot Person Re-identification through Set Distance with Visual Distributional Representation

no code implementations • 3 Aug 2018 • Ting-yao Hu, Xiaojun Chang, Alexander G. Hauptmann

In this work, we propose the idea of visual distributional representation, which interprets an image set as samples drawn from an unknown distribution in appearance feature space.

Person Re-Identification

RCAA: Relational Context-Aware Agents for Person Search

no code implementations • ECCV 2018 • Xiaojun Chang, Po-Yao Huang, Yi-Dong Shen, Xiaodan Liang, Yi Yang, Alexander G. Hauptmann

In this paper, we address this problem by training relational context-aware agents which learn the actions to localize the target person from the gallery of whole scene images.

Person Search

MMALFM: Explainable Recommendation by Leveraging Reviews and Images

no code implementations • 12 Nov 2018 • Zhiyong Cheng, Xiaojun Chang, Lei Zhu, Rose C. Kanjirathinkal, Mohan Kankanhalli

Then the aspect importance is integrated into a novel aspect-aware latent factor model (ALFM), which learns user's and item's latent factors based on ratings.

Explainable Recommendation

Distributionally Robust Semi-Supervised Learning for People-Centric Sensing

no code implementations • 12 Nov 2018 • Kaixuan Chen, Lina Yao, Dalin Zhang, Xiaojun Chang, Guodong Long, Sen Wang

Semi-supervised learning is crucial for alleviating labelling burdens in people-centric sensing.

Activity Recognition Gesture Recognition +2

Ensemble Teaching for Hybrid Label Propagation

no code implementations • 8 Apr 2019 • Chen Gong, DaCheng Tao, Xiaojun Chang, Jian Yang

More importantly, HyDEnT conducts propagation under the guidance of an ensemble of teachers.

Continual Reinforcement Learning with Diversity Exploration and Adversarial Self-Correction

no code implementations • 21 Jun 2019 • Fengda Zhu, Xiaojun Chang, Runhao Zeng, Mingkui Tan

We first develop an unsupervised diversity exploration method to learn task-specific skills using an unsupervised objective.

Autonomous Driving Continuous Control +2

Multi-Head Attention with Diversity for Learning Grounded Multilingual Multimodal Representations

no code implementations • IJCNLP 2019 • Po-Yao Huang, Xiaojun Chang, Alexander Hauptmann

With the aim of promoting and understanding the multilingual version of image search, we leverage visual object detection and propose a model with diverse multi-head attention to learn grounded multilingual multimodal representations.

Image Retrieval object-detection +2

Vision-Language Navigation with Self-Supervised Auxiliary Reasoning Tasks

no code implementations • CVPR 2020 • Fengda Zhu, Yi Zhu, Xiaojun Chang, Xiaodan Liang

In this paper, we introduce Auxiliary Reasoning Navigation (AuxRN), a framework with four self-supervised auxiliary reasoning tasks to take advantage of the additional training signals derived from the semantic information.

Ranked #13 on Vision and Language Navigation on VLN Challenge

Navigate Vision-Language Navigation

Blockwisely Supervised Neural Architecture Search with Knowledge Distillation

1 code implementation • 29 Nov 2019 • Changlin Li, Jiefeng Peng, Liuchun Yuan, Guangrun Wang, Xiaodan Liang, Liang Lin, Xiaojun Chang

Moreover, we find that the knowledge of a network model lies not only in the network parameters but also in the network architecture.

Ranked #1 on Neural Architecture Search on CIFAR-100

Knowledge Distillation Neural Architecture Search

230

Argus: Efficient Activity Detection System for Extended Video Analysis

1 code implementation • Proceedings of the IEEE Winter Conference on Applications of Computer Vision Workshops 2020 • Wenhe Liu, Guoliang Kang, Po-Yao Huang, Xiaojun Chang, Yijun Qian, Junwei Liang, Liangke Gui, Jing Wen, Peng Chen

We propose an Efficient Activity Detection System, Argus, for Extended Video Analysis in the surveillance scenario.

Action Detection Activity Detection +5

455

Unity Style Transfer for Person Re-Identification

no code implementations • CVPR 2020 • Chong Liu, Xiaojun Chang, Yi-Dong Shen

To solve this problem, we propose a UnityStyle adaption method, which can smooth the style disparities within the same camera and across different cameras.

Person Re-Identification Style Transfer +1

ZSTAD: Zero-Shot Temporal Activity Detection

no code implementations • CVPR 2020 • Lingling Zhang, Xiaojun Chang, Jun Liu, Minnan Luo, Sen Wang, ZongYuan Ge, Alexander Hauptmann

An integral part of video analysis and surveillance is temporal activity detection, which means to simultaneously recognize and localize activities in long untrimmed videos.

Action Detection Activity Detection

Vision-Dialog Navigation by Exploring Cross-modal Memory

1 code implementation • CVPR 2020 • Yi Zhu, Fengda Zhu, Zhaohuan Zhan, Bingqian Lin, Jianbin Jiao, Xiaojun Chang, Xiaodan Liang

Benefiting from the collaborative learning of the L-mem and the V-mem, our CMN is able to explore the memory about the decision making of historical navigation actions which is for the current step.

Decision Making

Unsupervised Multimodal Neural Machine Translation with Pseudo Visual Pivoting

no code implementations • ACL 2020 • Po-Yao Huang, Junjie Hu, Xiaojun Chang, Alexander Hauptmann

In this paper, we investigate how to utilize visual content for disambiguation and promoting latent space alignment in unsupervised MMT.

Translation Unsupervised Machine Translation

Connecting the Dots: Multivariate Time Series Forecasting with Graph Neural Networks

2 code implementations • 24 May 2020 • Zonghan Wu, Shirui Pan, Guodong Long, Jing Jiang, Xiaojun Chang, Chengqi Zhang

Modeling multivariate time series has long been a subject that has attracted researchers from a diverse range of fields including economics, finance, and traffic.

Ranked #1 on Univariate Time Series Forecasting on Electricity

Graph Learning Multivariate Time Series Forecasting +3

2,476

A Comprehensive Survey of Neural Architecture Search: Challenges and Solutions

no code implementations • 1 Jun 2020 • Pengzhen Ren, Yun Xiao, Xiaojun Chang, Po-Yao Huang, Zhihui Li, Xiaojiang Chen, Xin Wang

Neural Architecture Search (NAS) is just such a revolutionary algorithm, and the related research work is complicated and rich.

Neural Architecture Search

Auxiliary Signal-Guided Knowledge Encoder-Decoder for Medical Report Generation

2 code implementations • 6 Jun 2020 • Mingjie Li, Fuyu Wang, Xiaojun Chang, Xiaodan Liang

Firstly, the regions of primary interest to radiologists are usually located in a small area of the global image, meaning that the remaining parts of the image could be considered as irrelevant noise in the training procedure.

Image Captioning Medical Report Generation +1

Melanoma Diagnosis with Spatio-Temporal Feature Learning on Sequential Dermoscopic Images

no code implementations • 19 Jun 2020 • Zhen Yu, Jennifer Nguyen, Xiaojun Chang, John Kelly, Catriona Mclean, Lei Zhang, Victoria Mar, ZongYuan Ge

Existing studies for automated melanoma diagnosis are based on single-time point images of lesions.

Melanoma Diagnosis

Multi-view Drone-based Geo-localization via Style and Spatial Alignment

no code implementations • 23 Jun 2020 • Siyi Hu, Xiaojun Chang

In this paper, we focus on the task of multi-view multi-source geo-localization, which serves as an important auxiliary method of GPS positioning by matching drone-view image and satellite-view image with pre-annotated GPS tag.

TAG

Accurate Bounding-box Regression with Distance-IoU Loss for Visual Tracking

no code implementations • 3 Jul 2020 • Di Yuan, Xiu Shu, Nana Fan, Xiaojun Chang, Qiao Liu, Zhenyu He

Moreover, we introduce a classification part that is trained online and optimized with a Conjugate-Gradient-based strategy to guarantee real-time tracking speed.

regression Visual Tracking

A Survey of Deep Active Learning

1 code implementation • 30 Aug 2020 • Pengzhen Ren, Yun Xiao, Xiaojun Chang, Po-Yao Huang, Zhihui Li, Brij B. Gupta, Xiaojiang Chen, Xin Wang

Therefore, deep active learning (DAL) has emerged.

Active Learning speech-recognition +1

Self-Weighted Robust LDA for Multiclass Classification with Edge Classes

no code implementations • 24 Sep 2020 • Caixia Yan, Xiaojun Chang, Minnan Luo, Qinghua Zheng, Xiaoqin Zhang, Zhihui Li, Feiping Nie

In this regard, a novel self-weighted robust LDA with l21-norm based pairwise between-class distance criterion, called SWRLDA, is proposed for multi-class classification especially with edge classes.

Classification Computational Efficiency +2

Hierarchical Neural Architecture Search for Deep Stereo Matching

1 code implementation • NeurIPS 2020 • Xuelian Cheng, Yiran Zhong, Mehrtash Harandi, Yuchao Dai, Xiaojun Chang, Tom Drummond, Hongdong Li, ZongYuan Ge

To reduce the human efforts in neural network design, Neural Architecture Search (NAS) has been applied with remarkable success to various high-level vision tasks such as classification and semantic segmentation.

Ranked #2 on Stereo Disparity Estimation on Scene Flow

Neural Architecture Search Semantic Segmentation +3

252

Differentiable Neural Architecture Search in Equivalent Space with Exploration Enhancement

1 code implementation • NeurIPS 2020 • Miao Zhang, Huiqi Li, Shirui Pan, Xiaojun Chang, ZongYuan Ge, Steven Su

A probabilistic exploration enhancement method is accordingly devised to encourage intelligent exploration during the architecture search in the latent space, to avoid local optima in architecture search.

Bilevel Optimization Neural Architecture Search

Exploring Inter-Channel Correlation for Diversity-Preserved Knowledge Distillation

2 code implementations • ICCV 2021 • Li Liu, Qingle Huang, Sihao Lin, Hongwei Xie, Bing Wang, Xiaojun Chang, Xiaodan Liang

Extensive experiments on two vision tasks, including ImageNet classification and Pascal VOC segmentation, demonstrate the superiority of our ICKD, which consistently outperforms many existing methods, advancing the state-of-the-art in the fields of Knowledge Distillation.

Ranked #21 on Knowledge Distillation on ImageNet

Knowledge Distillation

1,258

UPDeT: Universal Multi-agent RL via Policy Decoupling with Transformers

no code implementations • ICLR 2021 • Siyi Hu, Fengda Zhu, Xiaojun Chang, Xiaodan Liang

Recent advances in multi-agent reinforcement learning have been largely limited to training one model from scratch for every new task.

reinforcement-learning Reinforcement Learning (RL) +1

Exploring Vulnerabilities of BERT-based APIs

no code implementations • 1 Jan 2021 • Xuanli He, Lingjuan Lyu, Lichao Sun, Xiaojun Chang, Jun Zhao

We then demonstrate how the extracted model can be exploited to develop an effective attribute inference attack to expose sensitive information of the training data.

Attribute Inference Attack +4

UPDeT: Universal Multi-agent Reinforcement Learning via Policy Decoupling with Transformers

1 code implementation • 20 Jan 2021 • Siyi Hu, Fengda Zhu, Xiaojun Chang, Xiaodan Liang

Recent advances in multi-agent reinforcement learning have been largely limited to training one model from scratch for every new task.

reinforcement-learning Reinforcement Learning (RL) +1

125

A Comprehensive Survey of Scene Graphs: Generation and Application

no code implementations • 17 Mar 2021 • Xiaojun Chang, Pengzhen Ren, Pengfei Xu, Zhihui Li, Xiaojiang Chen, Alex Hauptmann

For example, given an image, we want to not only detect and recognize objects in the image, but also know the relationship between objects (visual relationship detection), and generate a text description (image captioning) based on the image content.

Image Captioning Question Answering +4

NAS-TC: Neural Architecture Search on Temporal Convolutions for Complex Action Recognition

no code implementations • 17 Mar 2021 • Pengzhen Ren, Gang Xiao, Xiaojun Chang, Yun Xiao, Zhihui Li, Xiaojiang Chen

Accordingly, because of the automated design of its network structure, Neural architecture search (NAS) has achieved great success in the image processing field and attracted substantial research attention in recent years.

Action Recognition In Videos Neural Architecture Search

BossNAS: Exploring Hybrid CNN-transformers with Block-wisely Self-supervised Neural Architecture Search

1 code implementation • ICCV 2021 • Changlin Li, Tao Tang, Guangrun Wang, Jiefeng Peng, Bing Wang, Xiaodan Liang, Xiaojun Chang

In this work, we present Block-wisely Self-supervised Neural Architecture Search (BossNAS), an unsupervised NAS method that addresses the problem of inaccurate architecture rating caused by large weight-sharing space and biased supervision in previous methods.

Ranked #1 on Neural Architecture Search on NATS-Bench Size, CIFAR-100

Image Classification Neural Architecture Search +1

133

Dynamic Slimmable Network

1 code implementation • CVPR 2021 • Changlin Li, Guangrun Wang, Bing Wang, Xiaodan Liang, Zhihui Li, Xiaojun Chang

Here, we explore a dynamic network slimming regime, named Dynamic Slimmable Network (DS-Net), which aims to achieve good hardware-efficiency via dynamically adjusting filter numbers of networks at test time with respect to different inputs, while keeping filters stored statically and contiguously in hardware to prevent the extra burden.

Fairness Model Compression

220

SOON: Scenario Oriented Object Navigation with Graph-based Exploration

1 code implementation • CVPR 2021 • Fengda Zhu, Xiwen Liang, Yi Zhu, Xiaojun Chang, Xiaodan Liang

In this task, an agent is required to navigate from an arbitrary position in a 3D embodied environment to localize a target following a scene description.

Ranked #5 on Visual Navigation on SOON Test

Attribute Navigate +2

Attribute-Modulated Generative Meta Learning for Zero-Shot Classification

no code implementations • 22 Apr 2021 • Yun Li, Zhe Liu, Lina Yao, Xiaojun Chang

The promising strategies for ZSL are to synthesize visual features of unseen classes conditioned on semantic side information and to incorporate meta-learning to eliminate the model's inherent bias towards seen classes.

Attribute Classification +6

Person Search Challenges and Solutions: A Survey

no code implementations • 1 May 2021 • Xiangtan Lin, Pengzhen Ren, Yun Xiao, Xiaojun Chang, Alex Hauptmann

This paper surveyed the recent works on image-based and text-based person search from the perspective of challenges and solutions.

Person Search Text based Person Search

Vision-Language Navigation with Random Environmental Mixup

1 code implementation • ICCV 2021 • Chong Liu, Fengda Zhu, Xiaojun Chang, Xiaodan Liang, ZongYuan Ge, Yi-Dong Shen

Then, we cross-connect the key views of different scenes to construct augmented scenes.

Ranked #38 on Vision and Language Navigation on VLN Challenge

Data Augmentation Navigate +1

iDARTS: Differentiable Architecture Search with Stochastic Implicit Gradients

1 code implementation • 21 Jun 2021 • Miao Zhang, Steven Su, Shirui Pan, Xiaojun Chang, Ehsan Abbasnejad, Reza Haffari

A key challenge to the scalability and quality of the learned architectures is the need for differentiating through the inner-loop optimisation.

Ranked #22 on Neural Architecture Search on NAS-Bench-201, CIFAR-10

Neural Architecture Search

Deep Learning for Embodied Vision Navigation: A Survey

no code implementations • 7 Jul 2021 • Fengda Zhu, Yi Zhu, Vincent CS Lee, Xiaodan Liang, Xiaojun Chang

A navigation agent is supposed to have various intelligent skills, such as visual perceiving, mapping, planning, exploring and reasoning, etc.

Autonomous Driving Navigate +1

Legislator Representation Learning with Social Context and Expert Knowledge

1 code implementation • 9 Aug 2021 • Shangbin Feng, Zhaoxuan Tan, Zilong Chen, Peisheng Yu, Qinghua Zheng, Xiaojun Chang, Minnan Luo

Modeling the ideological perspectives of political actors is an essential task in computational political science with applications in many downstream tasks.

Representation Learning Stance Detection

KGAP: Knowledge Graph Augmented Political Perspective Detection in News Media

1 code implementation • 9 Aug 2021 • Shangbin Feng, Zilong Chen, Wenqian Zhang, Qingyao Li, Qinghua Zheng, Xiaojun Chang, Minnan Luo

Specifically, we construct a political knowledge graph to serve as domain-specific external knowledge.

Argument Mining Knowledge Graphs

FFA-IR: Towards an Explainable and Reliable Medical Report Generation Benchmark

1 code implementation • Thirty-fifth Conference on Neural Information Processing Systems Datasets and Benchmarks Track (Round 2) 2021 • Mingjie Li, Wenjia Cai, Rui Liu, Yuetian Weng, Xiaoyun Zhao, Cong Wang, Xin Chen, Zhong Liu, Caineng Pan, Mengke Li, Yizhi Liu, Flora D Salim, Karin Verspoor, Xiaodan Liang, Xiaojun Chang

Researchers have explored advanced methods from computer vision and natural language processing to incorporate medical domain knowledge for the generation of readable medical reports.

Medical Report Generation Text Generation

Unsupervised Person Re-Identification: A Systematic Survey of Challenges and Solutions

no code implementations • 1 Sep 2021 • Xiangtan Lin, Pengzhen Ren, Chung-Hsing Yeh, Lina Yao, Andy Song, Xiaojun Chang

Therefore, comprehensive surveys on this topic are essential to summarise challenges and solutions to foster future research.

Unsupervised Person Re-Identification

Semantics-Guided Contrastive Network for Zero-Shot Object detection

no code implementations • 4 Sep 2021 • Caixia Yan, Xiaojun Chang, Minnan Luo, Huan Liu, Xiaoqin Zhang, Qinghua Zheng

To address these issues, we develop a novel Semantics-Guided Contrastive Network for ZSD, named ContrastZSD, a detection framework that first brings contrastive learning mechanism into the realm of zero-shot detection.

Ranked #4 on Zero-Shot Object Detection on MS-COCO

Contrastive Learning Generalized Zero-Shot Object Detection +3

DS-Net++: Dynamic Weight Slicing for Efficient Inference in CNNs and Transformers

1 code implementation • 21 Sep 2021 • Changlin Li, Guangrun Wang, Bing Wang, Xiaodan Liang, Zhihui Li, Xiaojun Chang

Dynamic networks have shown their promising capability in reducing theoretical computation complexity by adapting their architectures to the input during inference.

Fairness Model Compression

220

Role Diversity Matters: A Study of Cooperative Training Strategies for Multi-Agent RL

no code implementations • 29 Sep 2021 • Siyi Hu, Chuanlong Xie, Xiaodan Liang, Xiaojun Chang

In addition, role diversity can help to find a better training strategy and increase performance in cooperative MARL.

SMAC+ Starcraft +1

Reliable Shot Identification for Complex Event Detection via Visual-Semantic Embedding

no code implementations • 12 Oct 2021 • Minnan Luo, Xiaojun Chang, Chen Gong

In this paper, we decompose the video into several segments and intuitively model the task of complex event detection as a multiple instance learning problem by representing each video as a "bag" of segments in which each segment is referred to as an instance.

Event Detection Multiple Instance Learning

Active Learning for Deep Visual Tracking

no code implementations • 17 Oct 2021 • Di Yuan, Xiaojun Chang, Yi Yang, Qiao Liu, Dehua Wang, Zhenyu He

In this paper, we propose an active learning method for deep visual tracking, which selects and annotates the unlabeled samples to train the deep CNNs model.

Active Learning Visual Tracking

Dynamic Slimmable Denoising Network

no code implementations • 17 Oct 2021 • Zutao Jiang, Changlin Li, Xiaojun Chang, Jihua Zhu, Yi Yang

Here, we present dynamic slimmable denoising network (DDS-Net), a general method to achieve good denoising quality with less computational complexity, via dynamically adjusting the channel configurations of networks at test time with respect to different noisy images.

Fairness Image Denoising

Signature-Graph Networks

no code implementations • 22 Oct 2021 • Ali Hamdi, Flora Salim, Du Yong Kim, Xiaojun Chang

SGN constructs unique undirected graphs for each image based on the CNN feature maps.

Image Classification Representation Learning

An Entropy-guided Reinforced Partial Convolutional Network for Zero-Shot Learning

no code implementations • 3 Nov 2021 • Yun Li, Zhe Liu, Lina Yao, Xianzhi Wang, Julian McAuley, Xiaojun Chang

Zero-Shot Learning (ZSL) aims to transfer learned knowledge from observed classes to unseen classes via semantic correlations.

Generalized Zero-Shot Learning

BaLeNAS: Differentiable Architecture Search via the Bayesian Learning Rule

no code implementations • CVPR 2022 • Miao Zhang, Jilin Hu, Steven Su, Shirui Pan, Xiaojun Chang, Bin Yang, Gholamreza Haffari

Differentiable Architecture Search (DARTS) has received massive attention in recent years, mainly because it significantly reduces the computational cost through weight sharing and continuous relaxation.

Ranked #6 on Neural Architecture Search on NAS-Bench-201, ImageNet-16-120

Neural Architecture Search Variational Inference

Self-Supervised Global-Local Structure Modeling for Point Cloud Domain Adaptation With Reliable Voted Pseudo Labels

no code implementations • CVPR 2022 • Hehe Fan, Xiaojun Chang, Wanyue Zhang, Yi Cheng, Ying Sun, Mohan Kankanhalli

In this paper, we propose an unsupervised domain adaptation method for deep point cloud representation learning.

Representation Learning Unsupervised Domain Adaptation

Diversity-boosted Generalization-Specialization Balancing for Zero-shot Learning

no code implementations • 6 Jan 2022 • Yun Li, Zhe Liu, Xiaojun Chang, Julian McAuley, Lina Yao

We further propose a differentiable dataset-level balance and update the weights in a linear annealing schedule to simulate network pruning and thus obtain the optimal structure for BSNet with dataset-level balance achieved.

Meta-Learning Network Pruning +1

Exploring Inter-Channel Correlation for Diversity-preserved Knowledge Distillation

1 code implementation • 8 Feb 2022 • Li Liu, Qingle Huang, Sihao Lin, Hongwei Xie, Bing Wang, Xiaojun Chang, Xiaodan Liang

Extensive experiments on two vision tasks, including ImageNet classification and Pascal VOC segmentation, demonstrate the superiority of our ICKD, which consistently outperforms many existing methods, advancing the state-of-the-art in the fields of Knowledge Distillation.

Knowledge Distillation

Voice-Face Homogeneity Tells Deepfake

no code implementations • 4 Mar 2022 • Harry Cheng, Yangyang Guo, Tianyi Wang, Qi Li, Xiaojun Chang, Liqiang Nie

To this end, a voice-face matching method is devised to measure the matching degree of these two.

Beyond Fixation: Dynamic Window Visual Transformer

1 code implementation • CVPR 2022 • Pengzhen Ren, Changlin Li, Guangrun Wang, Yun Xiao, Qing Du, Xiaodan Liang, Xiaojun Chang

Recently, a surge of interest in visual transformers is to reduce the computational cost by limiting the calculation of self-attention to a local window.

CGUA: Context-Guided and Unpaired-Assisted Weakly Supervised Person Search

no code implementations • 27 Mar 2022 • Chengyou Jia, Minnan Luo, Caixia Yan, Xiaojun Chang, Qinghua Zheng

On the other hand, there are numerous unpaired persons in real-world scene images.

Person Search

Automated Progressive Learning for Efficient Training of Vision Transformers

1 code implementation • CVPR 2022 • Changlin Li, Bohan Zhuang, Guangrun Wang, Xiaodan Liang, Xiaojun Chang, Yi Yang

First, we develop a strong manual baseline for progressive learning of ViTs, by introducing momentum growth (MoGrow) to bridge the gap brought by model growth.

Dual-AI: Dual-path Actor Interaction Learning for Group Activity Recognition

no code implementations • CVPR 2022 • Mingfei Han, David Junhao Zhang, Yali Wang, Rui Yan, Lina Yao, Xiaojun Chang, Yu Qiao

Learning spatial-temporal relation among multiple actors is crucial for group activity recognition.

Group Activity Recognition

PRE-NAS: Predictor-assisted Evolutionary Neural Architecture Search

no code implementations • 27 Apr 2022 • Yameng Peng, Andy Song, Vic Ciesielski, Haytham M. Fayek, Xiaojun Chang

This often requires a high computational overhead to evaluate a number of candidate networks from the set of all possible networks in the search space during the search.

Ranked #16 on Neural Architecture Search on NAS-Bench-201, CIFAR-10

Neural Architecture Search

Towards Explanation for Unsupervised Graph-Level Representation Learning

1 code implementation • 20 May 2022 • Qinghua Zheng, Jihong Wang, Minnan Luo, YaoLiang Yu, Jundong Li, Lina Yao, Xiaojun Chang

Due to the superior performance of Graph Neural Networks (GNNs) in various domains, there is an increasing interest in the GNN explanation problem "which fraction of the input graph is the most crucial to decide the model's decision?"

Decision Making Graph Classification +2

301

Knowledge Distillation via the Target-aware Transformer

2 code implementations • CVPR 2022 • Sihao Lin, Hongwei Xie, Bing Wang, Kaicheng Yu, Xiaojun Chang, Xiaodan Liang, Gang Wang

To this end, we propose a novel one-to-all spatial matching knowledge distillation approach.

Knowledge Distillation

Policy Diagnosis via Measuring Role Diversity in Cooperative Multi-agent RL

no code implementations • 1 Jun 2022 • Siyi Hu, Chuanlong Xie, Xiaodan Liang, Xiaojun Chang

In this study, we quantify the agent's behavior difference and build its relationship with the policy performance via Role Diversity, a metric to measure the characteristics of MARL tasks.

SMAC+ Starcraft

Cross-modal Clinical Graph Transformer for Ophthalmic Report Generation

no code implementations • CVPR 2022 • Mingjie Li, Wenjia Cai, Karin Verspoor, Shirui Pan, Xiaodan Liang, Xiaojun Chang

To endow models with the capability of incorporating expert knowledge, we propose a Cross-modal clinical Graph Transformer (CGT) for ophthalmic report generation (ORG), in which clinical relation triples are injected into the visual features as prior knowledge to drive the decoding procedure.

Clinical Knowledge Medical Report Generation

Paper
Add Code

Domain Adaptive Nuclei Instance Segmentation and Classification via Category-aware Feature Alignment and Pseudo-labelling

no code implementations • 4 Jul 2022 • Canran Li, Dongnan Liu, Haoran Li, Zheng Zhang, Guangming Lu, Xiaojun Chang, Weidong Cai

In this work, we propose a novel deep neural network, namely Category-Aware feature alignment and Pseudo-Labelling Network (CAPL-Net) for UDA nuclei instance segmentation and classification.

Classification Instance Segmentation +3

Paper
Add Code

CLMFormer: Mitigating Data Redundancy to Revitalize Transformer-based Long-Term Time Series Forecasting System

2 code implementations • 16 Jul 2022 • Mingjie Li, Rui Liu, Guangsi Shi, Mingfei Han, Changling Li, Lina Yao, Xiaojun Chang, Ling Chen

To further enhance forecasting accuracy, we introduce a memory-driven decoder.

Data Augmentation Time Series +1

Paper
Code

An Efficient Spatio-Temporal Pyramid Transformer for Action Detection

no code implementations • 21 Jul 2022 • Yuetian Weng, Zizheng Pan, Mingfei Han, Xiaojun Chang, Bohan Zhuang

The task of action detection aims at deducing both the action category and localization of the start and end moment for each action instance in a long, untrimmed video.

Action Detection Video Understanding

Paper
Add Code

Spatial-temporal Analysis for Automated Concrete Workability Estimation

no code implementations • 24 Jul 2022 • Litao Yu, Jian Zhang, Mohammed Bennamoun, Xiaojun Chang, Vute Sirivivatnanon, Ali Nezhad

Concrete workability measure is mostly determined based on subjective assessment of a certified assessor with visual inspections.

regression

Paper
Add Code

Prompt-driven efficient Open-set Semi-supervised Learning

no code implementations • 28 Sep 2022 • Haoran Li, Chun-Mei Feng, Tao Zhou, Yong Xu, Xiaojun Chang

In this paper, we propose a prompt-driven efficient OSSL framework, called OpenPrompt, which can propagate class information from labeled to unlabeled data with only a small number of trainable parameters.

Computational Efficiency Outlier Detection

Paper
Add Code

MARLlib: A Scalable and Efficient Multi-agent Reinforcement Learning Library

1 code implementation • 11 Oct 2022 • Siyi Hu, Yifan Zhong, Minquan Gao, Weixun Wang, Hao Dong, Xiaodan Liang, Zhihui Li, Xiaojun Chang, Yaodong Yang

A significant challenge facing researchers in the area of multi-agent reinforcement learning (MARL) pertains to the identification of a library that can offer fast and compatible development for multi-agent tasks and algorithm combinations, while obviating the need to consider compatibility issues.

Multi-agent Reinforcement Learning reinforcement-learning +1

762

Paper
Code

ViLPAct: A Benchmark for Compositional Generalization on Multimodal Human Activities

no code implementations • 11 Oct 2022 • Terry Yue Zhuo, Yaqing Liao, Yuecheng Lei, Lizhen Qu, Gerard de Melo, Xiaojun Chang, Yazhou Ren, Zenglin Xu

We introduce ViLPAct, a novel vision-language benchmark for human activity planning.

Paper
Add Code

PAR: Political Actor Representation Learning with Social Context and Expert Knowledge

1 code implementation • 15 Oct 2022 • Shangbin Feng, Zhaoxuan Tan, Zilong Chen, Ningnan Wang, Peisheng Yu, Qinghua Zheng, Xiaojun Chang, Minnan Luo

Extensive experiments demonstrate that PAR is better at augmenting political text understanding and successfully advances the state-of-the-art in political perspective detection and roll call vote prediction.

Representation Learning

Paper
Code

Learning Self-Regularized Adversarial Views for Self-Supervised Vision Transformers

1 code implementation • 16 Oct 2022 • Tao Tang, Changlin Li, Guangrun Wang, Kaicheng Yu, Xiaojun Chang, Xiaodan Liang

Despite the success, its development and application on self-supervised vision transformers have been hindered by several barriers, including the high search cost, the lack of supervision, and the unsuitable search space.

Data Augmentation Image Retrieval +3

Paper
Code

Simple Primitives with Feasibility- and Contextuality-Dependence for Open-World Compositional Zero-shot Learning

no code implementations • 5 Nov 2022 • Zhe Liu, Yun Li, Lina Yao, Xiaojun Chang, Wei Fang, XiaoJun Wu, Yi Yang

We design Semantic Attention (SA) and generative Knowledge Disentanglement (KD) to learn the dependence of feasibility and contextuality, respectively.

Compositional Zero-Shot Learning Disentanglement

Paper
Add Code

3D-TOGO: Towards Text-Guided Cross-Category 3D Object Generation

no code implementations • 2 Dec 2022 • Zutao Jiang, Guansong Lu, Xiaodan Liang, Jihua Zhu, Wei zhang, Xiaojun Chang, Hang Xu

Here, we make the first attempt to achieve generic text-guided cross-category 3D object generation via a new 3D-TOGO model, which integrates a text-to-views generation module and a views-to-3D generation module.

3D Generation Contrastive Learning +2

Paper
Add Code

HTML: Hybrid Temporal-scale Multimodal Learning Framework for Referring Video Object Segmentation

no code implementations • ICCV 2023 • Mingfei Han, Yali Wang, Zhihui Li, Lina Yao, Xiaojun Chang, Yu Qiao

To tackle this problem, we propose a concise Hybrid Temporal-scale Multimodal Learning (HTML) framework, which can effectively align lingual and visual features to discover core object semantics in the video, by learning multimodal interaction hierarchically from different temporal scales.

Ranked #6 on Referring Video Object Segmentation on Refer-YouTube-VOS (using extra training data)

Object Referring Video Object Segmentation +2

Paper
Add Code

ViewCo: Discovering Text-Supervised Segmentation Masks via Multi-View Semantic Consistency

1 code implementation • 31 Jan 2023 • Pengzhen Ren, Changlin Li, Hang Xu, Yi Zhu, Guangrun Wang, Jianzhuang Liu, Xiaojun Chang, Xiaodan Liang

Specifically, we first propose text-to-views consistency modeling to learn correspondence for multiple views of the same input image.

Segmentation Semantic Segmentation

Paper
Code

Guided Image-to-Image Translation by Discriminator-Generator Communication

no code implementations • 7 Mar 2023 • Yuanjiang Cao, Lina Yao, Le Pan, Quan Z. Sheng, Xiaojun Chang

The goal of Image-to-image (I2I) translation is to transfer an image from a source domain to a target domain, which has recently drawn increasing attention.

Generative Adversarial Network Image-to-Image Translation +1

Paper
Add Code

Dynamic Graph Enhanced Contrastive Learning for Chest X-ray Report Generation

1 code implementation • CVPR 2023 • Mingjie Li, Bingqian Lin, Zicong Chen, Haokun Lin, Xiaodan Liang, Xiaojun Chang

To address the limitation, we propose a knowledge graph with Dynamic structure and nodes to facilitate medical report generation with Contrastive Learning, named DCL.

Contrastive Learning General Knowledge +2

Paper
Code

A Benchmark for Cycling Close Pass Near Miss Event Detection from Video Streams

1 code implementation • 24 Apr 2023 • Mingjie Li, Tharindu Rathnayake, Ben Beck, Lingheng Meng, Zijue Chen, Akansel Cosgun, Xiaojun Chang, Dana Kulić

Instance-level detection aims to detect which vehicle in the scene gives rise to a close pass near miss.

Event Detection

Paper
Code

Towards Medical Artificial General Intelligence via Knowledge-Enhanced Multimodal Pretraining

1 code implementation • 26 Apr 2023 • Bingqian Lin, Zicong Chen, Mingjie Li, Haokun Lin, Hang Xu, Yi Zhu, Jianzhuang Liu, Wenjia Cai, Lei Yang, Shen Zhao, Chenfei Wu, Ling Chen, Xiaojun Chang, Yi Yang, Lei Xing, Xiaodan Liang

In MOTOR, we combine two kinds of basic medical knowledge, i.e., general and specific knowledge, in a complementary manner to boost the general pretraining process.

Medical Visual Question Answering Question Answering +1

Paper
Code

Toward the Automated Construction of Probabilistic Knowledge Graphs for the Maritime Domain

no code implementations • 4 May 2023 • Fatemeh Shiri, Teresa Wang, Shirui Pan, Xiaojun Chang, Yuan-Fang Li, Reza Haffari, Van Nguyen, Shuang Yu

In order to exploit the potentially useful and rich information from such sources, it is necessary to extract not only the relevant entities and concepts but also their semantic relations, together with the uncertainty associated with the extracted knowledge (i.e., in the form of probabilistic knowledge graphs).

Knowledge Graphs

Paper
Add Code

Maximum Entropy Heterogeneous-Agent Reinforcement Learning

1 code implementation • 19 Jun 2023 • Jiarong Liu, Yifan Zhong, Siyi Hu, Haobo Fu, Qiang Fu, Xiaojun Chang, Yaodong Yang

We embed cooperative MARL problems into probabilistic graphical models, from which we derive the maximum entropy (MaxEnt) objective for MARL.

Multi-agent Reinforcement Learning reinforcement-learning +1

339

Paper
Code

Two-stream Multi-level Dynamic Point Transformer for Two-person Interaction Recognition

no code implementations • 22 Jul 2023 • Yao Liu, Gangfeng Cui, Jiahui Luo, Lina Yao, Xiaojun Chang

Subsequently, a frame features learning module and a two-stream multi-level feature aggregation module extract global and partial features from the sampled frames, effectively representing the local-region spatial information, appearance information, and motion information related to the interactions.

Action Recognition Temporal Action Localization

Paper
Add Code

FULLER: Unified Multi-modality Multi-task 3D Perception via Multi-level Gradient Calibration

no code implementations • ICCV 2023 • Zhijian Huang, Sihao Lin, Guiyu Liu, Mukun Luo, Chaoqiang Ye, Hang Xu, Xiaojun Chang, Xiaodan Liang

Specifically, the gradients, produced by the task heads and used to update the shared backbone, will be calibrated at the backbone's last layer to alleviate the task conflict.

Autonomous Driving Multi-Task Learning

Paper
Add Code

SSMG: Spatial-Semantic Map Guided Diffusion Model for Free-form Layout-to-Image Generation

no code implementations • 20 Aug 2023 • Chengyou Jia, Minnan Luo, Zhuohang Dang, Guang Dai, Xiaojun Chang, Mengmeng Wang, Jingdong Wang

Despite significant progress in Text-to-Image (T2I) generative models, even lengthy and complex text descriptions still struggle to convey detailed controls.

Layout-to-Image Generation

Paper
Add Code

ProAgent: Building Proactive Cooperative Agents with Large Language Models

no code implementations • 22 Aug 2023 • Ceyao Zhang, Kaijie Yang, Siyi Hu, ZiHao Wang, Guanghe Li, Yihang Sun, Cheng Zhang, Zhaowei Zhang, Anji Liu, Song-Chun Zhu, Xiaojun Chang, Junge Zhang, Feng Yin, Yitao Liang, Yaodong Yang

Building agents with adaptive behavior in cooperative tasks stands as a paramount goal in the realm of multi-agent systems.

Paper
Add Code

PSDiff: Diffusion Model for Person Search with Iterative and Collaborative Refinement

no code implementations • 20 Sep 2023 • Chengyou Jia, Minnan Luo, Zhuohang Dang, Guang Dai, Xiaojun Chang, Jingdong Wang

Dominant Person Search methods aim to localize and recognize query persons in a unified network, which jointly optimizes two sub-tasks, i.e., pedestrian detection and Re-IDentification (ReID).

Denoising Pedestrian Detection +2

Paper
Add Code

No Token Left Behind: Efficient Vision Transformer via Dynamic Token Idling

1 code implementation • 9 Oct 2023 • Xuwei Xu, Changlin Li, Yudong Chen, Xiaojun Chang, Jiajun Liu, Sen Wang

By allowing the idle tokens to be re-selected in the following layers, IdleViT mitigates the negative impact of improper pruning in the early stages.

Paper
Code

Mask Propagation for Efficient Video Semantic Segmentation

1 code implementation • NeurIPS 2023 • Yuetian Weng, Mingfei Han, Haoyu He, Mingjie Li, Lina Yao, Xiaojun Chang, Bohan Zhuang

By reusing predictions from key frames, we circumvent the need to process a large volume of video frames individually with resource-intensive segmentors, alleviating temporal redundancy and significantly reducing computational costs.

Semantic Segmentation Video Semantic Segmentation

Paper
Code

Disentangled Representation Learning with Transmitted Information Bottleneck

no code implementations • 3 Nov 2023 • Zhuohang Dang, Minnan Luo, Chengyou Jia, Guang Dai, Jihong Wang, Xiaojun Chang, Jingdong Wang, Qinghua Zheng

Encoding only the task-related information from the raw data, i.e., disentangled representation learning, can greatly contribute to the robustness and generalizability of models.

Disentanglement Variational Inference

Paper
Add Code

Generating Action-conditioned Prompts for Open-vocabulary Video Action Recognition

no code implementations • 4 Dec 2023 • Chengyou Jia, Minnan Luo, Xiaojun Chang, Zhuohang Dang, Mingfei Han, Mengmeng Wang, Guang Dai, Sizhe Dang, Jingdong Wang

To realize this, we innovatively blend video models with Large Language Models (LLMs) to devise Action-conditioned Prompts.

Action Recognition Descriptive +1

Paper
Add Code

Shot2Story20K: A New Benchmark for Comprehensive Understanding of Multi-shot Videos

1 code implementation • 16 Dec 2023 • Mingfei Han, Linjie Yang, Xiaojun Chang, Heng Wang

A human needs to both capture the event in every shot and associate them together to understand the story behind it.

Ranked #1 on video narration captioning on Shot2Story20K

Video Captioning video narration captioning +4

Paper
Code

Video Recognition in Portrait Mode

1 code implementation • 21 Dec 2023 • Mingfei Han, Linjie Yang, Xiaojie Jin, Jiashi Feng, Xiaojun Chang, Heng Wang

While existing datasets mainly comprise landscape mode videos, our paper seeks to introduce portrait mode videos to the research community and highlight the unique challenges associated with this video format.

Data Augmentation Video Recognition

Paper
Code

Noisy Correspondence Learning with Self-Reinforcing Errors Mitigation

no code implementations • 27 Dec 2023 • Zhuohang Dang, Minnan Luo, Chengyou Jia, Guang Dai, Xiaojun Chang, Jingdong Wang

Cross-modal retrieval relies on well-matched large-scale datasets that are laborious to collect in practice.

Cross-Modal Retrieval Memorization +2

Paper
Add Code

MatchNAS: Optimizing Edge AI in Sparse-Label Data Contexts via Automating Deep Neural Network Porting for Mobile Deployment

no code implementations • 21 Feb 2024 • Hongtao Huang, Xiaojun Chang, Wen Hu, Lina Yao

In this paper, we propose MatchNAS, a novel scheme for porting DNNs to mobile devices.

Paper
Add Code

DNA Family: Boosting Weight-Sharing NAS with Block-Wise Supervisions

1 code implementation • 2 Mar 2024 • Guangrun Wang, Changlin Li, Liuchun Yuan, Jiefeng Peng, Xiaoyu Xian, Xiaodan Liang, Xiaojun Chang, Liang Lin

Addressing this problem, we modularize a large search space into blocks with small search spaces and develop a family of models with the distilling neural architecture (DNA) techniques.

Neural Architecture Search

230

Paper
Code

SWAP-NAS: Sample-Wise Activation Patterns for Ultra-fast NAS

no code implementations • 7 Mar 2024 • Yameng Peng, Andy Song, Haytham M. Fayek, Vic Ciesielski, Xiaojun Chang

The SWAP-Score is strongly correlated with ground-truth performance across various search spaces and tasks, outperforming 15 existing training-free metrics on NAS-Bench-101/201/301 and TransNAS-Bench-101.

Neural Architecture Search

Paper
Add Code

NavCoT: Boosting LLM-Based Vision-and-Language Navigation via Learning Disentangled Reasoning

1 code implementation • 12 Mar 2024 • Bingqian Lin, Yunshuang Nie, Ziming Wei, Jiaqi Chen, Shikui Ma, Jianhua Han, Hang Xu, Xiaojun Chang, Xiaodan Liang

Vision-and-Language Navigation (VLN), as a crucial research problem of Embodied AI, requires an embodied agent to navigate through complex 3D environments following natural language instructions.

Navigate Vision and Language Navigation

Paper
Code

Unified Static and Dynamic Network: Efficient Temporal Filtering for Video Grounding

1 code implementation • 21 Mar 2024 • Jingjing Hu, Dan Guo, Kun Li, Zhan Si, Xun Yang, Xiaojun Chang, Meng Wang

Inspired by the activity-silent and persistent activity mechanisms in human visual perception biology, we design a Unified Static and Dynamic Network (UniSDNet), to learn the semantic association between the video and text/audio queries in a cross-modal environment for efficient video grounding.

Video Grounding

Paper
Code

Self-Supervised Multi-Frame Neural Scene Flow

no code implementations • 24 Mar 2024 • Dongrui Liu, Daqi Liu, Xueqian Li, Sihao Lin, Hongwei Xie, Bing Wang, Xiaojun Chang, Lei Chu

Neural Scene Flow Prior (NSFP) and Fast Neural Scene Flow (FNSF) have shown remarkable adaptability in the context of large out-of-distribution autonomous driving.

Autonomous Driving Scene Flow Estimation

Paper
Add Code

LongVLM: Efficient Long Video Understanding via Large Language Models

1 code implementation • 4 Apr 2024 • Yuetian Weng, Mingfei Han, Haoyu He, Xiaojun Chang, Bohan Zhuang

In this way, we encode video representations that incorporate both local and global information, enabling the LLM to generate comprehensive responses for long-term videos.

Question Answering Video Question Answering +1

Paper
Code

MLP Can Be A Good Transformer Learner

1 code implementation • 8 Apr 2024 • Sihao Lin, Pumeng Lyu, Dongrui Liu, Tao Tang, Xiaodan Liang, Andy Song, Xiaojun Chang

We identify that regarding the attention layer in bottom blocks, their subsequent MLP layers, i.e., two feed-forward layers, can elicit the same entropy quantity.

Paper
Code

Mining Inter-Video Proposal Relations for Video Object Detection

1 code implementation • ECCV 2020 • Mingfei Han, Yali Wang, Xiaojun Chang, Yu Qiao

Recent studies have shown that aggregating context information from proposals in different frames can clearly enhance the performance of video object detection.

Ranked #11 on Video Object Detection on ImageNet VID

Object object-detection +3

Paper
Code
