Search Results for author: Mingkui Tan

Found 115 papers, 66 papers with code

Towards Ultrahigh Dimensional Feature Selection for Big Data

no code implementations 24 Sep 2012 Mingkui Tan, Ivor W. Tsang, Li Wang

In this paper, we present a new adaptive feature scaling scheme for ultrahigh-dimensional feature selection on Big Data.

feature selection Selection bias

Matching Pursuit LASSO Part II: Applications and Sparse Recovery over Batch Signals

no code implementations 20 Feb 2013 Mingkui Tan, Ivor W. Tsang, Li Wang

In Part I \cite{TanPMLPart1}, a Matching Pursuit LASSO (MPL) algorithm has been presented for solving large-scale sparse recovery (SR) problems.

Compressive Sensing Face Recognition

Scalable Nuclear-norm Minimization by Subspace Pursuit Proximal Riemannian Gradient

no code implementations 10 Mar 2015 Mingkui Tan, Shijie Xiao, Junbin Gao, Dong Xu, Anton Van Den Hengel, Qinfeng Shi

Nuclear-norm regularization plays a vital role in many learning tasks, such as low-rank matrix recovery (MR), and low-rank representation (LRR).

Clustering Matrix Completion

Fast Algorithms for Linear and Kernel SVM+

no code implementations CVPR 2016 Wen Li, Dengxin Dai, Mingkui Tan, Dong Xu, Luc van Gool

The SVM+ approach has shown excellent performance in visual recognition tasks for exploiting privileged information in the training data.

Blind Image Deconvolution by Automatic Gradient Activation

no code implementations CVPR 2016 Dong Gong, Mingkui Tan, Yanning Zhang, Anton Van Den Hengel, Qinfeng Shi

We show here that a subset of the image gradients is adequate to estimate the blur kernel robustly, regardless of whether the gradient image is sparse.

Image Deconvolution

The Shallow End: Empowering Shallower Deep-Convolutional Networks through Auxiliary Outputs

1 code implementation 6 Nov 2016 Yong Guo, Jian Chen, Qing Du, Anton Van Den Hengel, Qinfeng Shi, Mingkui Tan

As a result, the representation power of intermediate layers can be very weak and the model becomes very redundant with limited performance.

Model Compression Model Selection

Self-Paced Kernel Estimation for Robust Blind Image Deblurring

no code implementations ICCV 2017 Dong Gong, Mingkui Tan, Yanning Zhang, Anton Van Den Hengel, Qinfeng Shi

Rather than attempt to identify outliers to the model a priori, we instead propose to sequentially identify inliers, and gradually incorporate them into the estimation process.

Blind Image Deblurring Image Deblurring

Towards Effective Low-bitwidth Convolutional Neural Networks

2 code implementations CVPR 2018 Bohan Zhuang, Chunhua Shen, Mingkui Tan, Lingqiao Liu, Ian Reid

This paper tackles the problem of training a deep convolutional neural network with both low-precision weights and low-bitwidth activations.

Quantization

Adaptive Cost-sensitive Online Classification

no code implementations 6 Apr 2018 Peilin Zhao, Yifan Zhang, Min Wu, Steven C. H. Hoi, Mingkui Tan, Junzhou Huang

Cost-sensitive online classification has drawn extensive attention in recent years, where the main approach is to directly optimize, in an online manner, two well-known cost-sensitive metrics: (i) the weighted sum of sensitivity and specificity; (ii) the weighted misclassification cost.

Anomaly Detection Classification +2

Multi-modality Sensor Data Classification with Selective Attention

no code implementations 16 Apr 2018 Xiang Zhang, Lina Yao, Chaoran Huang, Sen Wang, Mingkui Tan, Guodong Long, Can Wang

Multimodal wearable sensor data classification plays an important role in ubiquitous computing and has a wide range of applications in scenarios from healthcare to entertainment.

Classification General Classification

Visual Grounding via Accumulated Attention

no code implementations CVPR 2018 Chaorui Deng, Qi Wu, Qingyao Wu, Fuyuan Hu, Fan Lyu, Mingkui Tan

There are three main challenges in VG: 1) what is the main focus in a query; 2) how to understand an image; 3) how to locate an object.

Sentence Visual Grounding

Adversarial Learning with Local Coordinate Coding

no code implementations ICML 2018 Jiezhang Cao, Yong Guo, Qingyao Wu, Chunhua Shen, Junzhou Huang, Mingkui Tan

Generative adversarial networks (GANs) aim to generate realistic data from some prior distribution (e.g., Gaussian noise).

Dual Reconstruction Nets for Image Super-Resolution with Gradient Sensitive Loss

no code implementations 19 Sep 2018 Yong Guo, Qi Chen, Jian Chen, Junzhou Huang, Yanwu Xu, JieZhang Cao, Peilin Zhao, Mingkui Tan

However, most deep learning methods employ feed-forward architectures, and thus the dependencies between LR and HR images are not fully exploited, leading to limited learning performance.

Image Super-Resolution

Learning Joint Wasserstein Auto-Encoders for Joint Distribution Matching

no code implementations 27 Sep 2018 JieZhang Cao, Yong Guo, Langyuan Mo, Peilin Zhao, Junzhou Huang, Mingkui Tan

We study the joint distribution matching problem which aims at learning bidirectional mappings to match the joint distribution of two domains.

Open-Ended Question Answering Unsupervised Image-To-Image Translation +2

MPTV: Matching Pursuit Based Total Variation Minimization for Image Deconvolution

no code implementations 12 Oct 2018 Dong Gong, Mingkui Tan, Qinfeng Shi, Anton Van Den Hengel, Yanning Zhang

Compared to existing methods, MPTV is less sensitive to the choice of the trade-off parameter between data fitting and regularization.

Image Deconvolution

Structured Binary Neural Networks for Accurate Image Classification and Semantic Segmentation

no code implementations CVPR 2019 Bohan Zhuang, Chunhua Shen, Mingkui Tan, Lingqiao Liu, Ian Reid

In this paper, we propose to train convolutional neural networks (CNNs) with both binarized weights and activations, leading to quantized models specifically for mobile devices with limited power capacity and computation resources.

General Classification Image Classification +2

You Only Look & Listen Once: Towards Fast and Accurate Visual Grounding

no code implementations 12 Feb 2019 Chaorui Deng, Qi Wu, Guanghui Xu, Zhuliang Yu, Yanwu Xu, Kui Jia, Mingkui Tan

Most state-of-the-art methods in VG operate in a two-stage manner: in the first stage, an object detector is adopted to generate a set of object proposals from the input image, and the second stage is simply formulated as a cross-modal matching problem that finds the best match between the language query and all region proposals.

object-detection Object Detection +2

Training Quantized Neural Networks with a Full-precision Auxiliary Module

no code implementations CVPR 2020 Bohan Zhuang, Lingqiao Liu, Mingkui Tan, Chunhua Shen, Ian Reid

In this paper, we seek to tackle a challenge in training low-precision networks: the notorious difficulty in propagating gradient through a low-precision network due to the non-differentiable quantization function.

Image Classification object-detection +2

Auto-Embedding Generative Adversarial Networks for High Resolution Image Synthesis

1 code implementation 27 Mar 2019 Yong Guo, Qi Chen, Jian Chen, Qingyao Wu, Qinfeng Shi, Mingkui Tan

To address this issue, we develop a novel GAN called Auto-Embedding Generative Adversarial Network (AEGAN), which simultaneously encodes the global structure features and captures the fine-grained details.

Generative Adversarial Network Image Generation +2

Domain-Symmetric Networks for Adversarial Domain Adaptation

1 code implementation CVPR 2019 Yabin Zhang, Hui Tang, Kui Jia, Mingkui Tan

Since target samples are unlabeled, we also propose a scheme of cross-domain training to help learn the target classifier.

Unsupervised Domain Adaptation

Deep Multi-View Learning using Neuron-Wise Correlation-Maximizing Regularizers

no code implementations 25 Apr 2019 Kui Jia, Jiehong Lin, Mingkui Tan, DaCheng Tao

Such a perspective enables us to study deep multi-view learning in the context of regularized network training, for which we present control experiments of benchmark image classification to show the efficacy of our proposed CorrReg.

3D Object Recognition General Classification +3

Continual Reinforcement Learning with Diversity Exploration and Adversarial Self-Correction

no code implementations 21 Jun 2019 Fengda Zhu, Xiaojun Chang, Runhao Zeng, Mingkui Tan

We first develop an unsupervised diversity exploration method to learn task-specific skills using an unsupervised objective.

Autonomous Driving Continuous Control +2

Attention Guided Network for Retinal Image Segmentation

2 code implementations 25 Jul 2019 Shihao Zhang, Huazhu Fu, Yuguang Yan, Yubing Zhang, Qingyao Wu, Ming Yang, Mingkui Tan, Yanwu Xu

Learning structural information is critical for producing an ideal result in retinal image segmentation.

Image Segmentation Segmentation +1

Effective Training of Convolutional Neural Networks with Low-bitwidth Weights and Activations

no code implementations 10 Aug 2019 Bohan Zhuang, Jing Liu, Mingkui Tan, Lingqiao Liu, Ian Reid, Chunhua Shen

Furthermore, we propose a second progressive quantization scheme which gradually decreases the bit-width from high-precision to low-precision during training.

Knowledge Distillation Quantization

Deep High-Resolution Representation Learning for Visual Recognition

42 code implementations 20 Aug 2019 Jingdong Wang, Ke Sun, Tianheng Cheng, Borui Jiang, Chaorui Deng, Yang Zhao, Dong Liu, Yadong Mu, Mingkui Tan, Xinggang Wang, Wenyu Liu, Bin Xiao

High-resolution representations are essential for position-sensitive vision problems, such as human pose estimation, semantic segmentation, and object detection.

Ranked #1 on Object Detection on COCO test-dev (Hardware Burden metric)

Dichotomous Image Segmentation Face Alignment +7

Graph Convolutional Networks for Temporal Action Localization

1 code implementation ICCV 2019 Runhao Zeng, Wenbing Huang, Mingkui Tan, Yu Rong, Peilin Zhao, Junzhou Huang, Chuang Gan

Then we apply the GCNs over the graph to model the relations among different proposals and learn powerful representations for the action classification and localization.

Ranked #4 on Temporal Action Localization on THUMOS’14 (mAP IOU@0.1 metric)

Action Classification Temporal Action Localization

Structured Binary Neural Networks for Image Recognition

no code implementations 22 Sep 2019 Bohan Zhuang, Chunhua Shen, Mingkui Tan, Peng Chen, Lingqiao Liu, Ian Reid

Experiments on classification, semantic segmentation, and object detection tasks demonstrate the superior performance of the proposed methods over various quantized networks in the literature.

object-detection Object Detection +2

Towards Interpreting Deep Neural Networks via Understanding Layer Behaviors

no code implementations 25 Sep 2019 JieZhang Cao, Jincheng Li, Xiping Hu, Peilin Zhao, Mingkui Tan

ii) the $W$-distance of a specific layer to the target distribution tends to decrease along training iterations.

NAT: Neural Architecture Transformer for Accurate and Compact Architectures

1 code implementation NeurIPS 2019 Yong Guo, Yin Zheng, Mingkui Tan, Qi Chen, Jian Chen, Peilin Zhao, Junzhou Huang

To verify the effectiveness of the proposed strategies, we apply NAT on both hand-crafted architectures and NAS based architectures.

Neural Architecture Search

Multi-marginal Wasserstein GAN

3 code implementations NeurIPS 2019 Jiezhang Cao, Langyuan Mo, Yifan Zhang, Kui Jia, Chunhua Shen, Mingkui Tan

The multiple marginal matching problem aims at learning mappings to match a source domain to multiple target domains, and it has attracted great attention in many applications, such as multi-domain image translation.

Image Generation Translation

Collaborative Unsupervised Domain Adaptation for Medical Image Diagnosis

1 code implementation 17 Nov 2019 Yifan Zhang, Ying WEI, Peilin Zhao, Shuaicheng Niu, Qingyao Wu, Mingkui Tan, Junzhou Huang

In this paper, we seek to exploit rich labeled data from relevant domains to help the learning in the target task with unsupervised domain adaptation (UDA).

Unsupervised Domain Adaptation

Online Adaptive Asymmetric Active Learning with Limited Budgets

1 code implementation 18 Nov 2019 Yifan Zhang, Peilin Zhao, Shuaicheng Niu, Qingyao Wu, JieZhang Cao, Junzhou Huang, Mingkui Tan

In these problems, there are two key challenges: the query budget is often limited; the ratio between classes is highly imbalanced.

Active Learning Anomaly Detection

Discrimination-aware Network Pruning for Deep Model Compression

1 code implementation 4 Jan 2020 Jing Liu, Bohan Zhuang, Zhuangwei Zhuang, Yong Guo, Junzhou Huang, Jinhui Zhu, Mingkui Tan

In this paper, we propose a simple-yet-effective method called discrimination-aware channel pruning (DCP) to choose the channels that actually contribute to the discriminative power.

Face Recognition Image Classification +2

Intelligent Home 3D: Automatic 3D-House Design from Linguistic Descriptions Only

1 code implementation CVPR 2020 Qi Chen, Qi Wu, Rui Tang, Yu-Han Wang, Shuai Wang, Mingkui Tan

To this end, we propose a House Plan Generative Model (HPGM) that first translates the language input to a structural graph representation and then predicts the layout of rooms with a Graph Conditioned Layout Prediction Network (GC LPN) and generates the interior texture with a Language Conditioned Texture GAN (LCT-GAN).

Text to 3D

Cost-Sensitive Portfolio Selection via Deep Reinforcement Learning

no code implementations 6 Mar 2020 Yifan Zhang, Peilin Zhao, Qingyao Wu, Bin Li, Junzhou Huang, Mingkui Tan

This task, however, has two main difficulties: (i) the non-stationary price series and complex asset correlations make the learning of feature representation very hard; (ii) the practicality principle in financial markets requires controlling both transaction and risk costs.

reinforcement-learning Reinforcement Learning (RL)

Generative Low-bitwidth Data Free Quantization

3 code implementations ECCV 2020 Shoukai Xu, Haokun Li, Bohan Zhuang, Jing Liu, JieZhang Cao, Chuangrun Liang, Mingkui Tan

More critically, our method achieves much higher accuracy on 4-bit quantization than the existing data free quantization method.

Data Free Quantization

Disturbance-immune Weight Sharing for Neural Architecture Search

no code implementations 29 Mar 2020 Shuaicheng Niu, Jiaxiang Wu, Yifan Zhang, Yong Guo, Peilin Zhao, Junzhou Huang, Mingkui Tan

To alleviate the performance disturbance issue, we propose a new disturbance-immune update strategy for model updating.

Neural Architecture Search

Dense Regression Network for Video Grounding

1 code implementation CVPR 2020 Runhao Zeng, Haoming Xu, Wenbing Huang, Peihao Chen, Mingkui Tan, Chuang Gan

The key idea of this paper is to use the distances between the frame within the ground truth and the starting (ending) frame as dense supervisions to improve the video grounding accuracy.

Natural Language Moment Retrieval Natural Language Queries +2

COVID-DA: Deep Domain Adaptation from Typical Pneumonia to COVID-19

1 code implementation 30 Apr 2020 Yifan Zhang, Shuaicheng Niu, Zhen Qiu, Ying WEI, Peilin Zhao, Jianhua Yao, Junzhou Huang, Qingyao Wu, Mingkui Tan

There are two main challenges: 1) the discrepancy of data distributions between domains; 2) the task difference between the diagnosis of typical pneumonia and COVID-19.

COVID-19 Diagnosis Domain Adaptation

A Real-time Action Representation with Temporal Encoding and Deep Compression

no code implementations 17 Jun 2020 Kun Liu, Wu Liu, Huadong Ma, Mingkui Tan, Chuang Gan

Our method achieves clear improvements over state-of-the-art real-time methods on the UCF101 action recognition benchmark: 5.4% higher accuracy and 2 times faster inference, with a model that requires less than 5MB of storage.

Action Recognition

Relation-Aware Transformer for Portfolio Policy Learning

2 code implementations IJCAI 2020 Ke Xu, Yifan Zhang, Deheng Ye, Peilin Zhao, Mingkui Tan

One of the key issues is how to represent the non-stationary price series of assets in a portfolio, which is important for portfolio decisions.

Relation

Breaking the Curse of Space Explosion: Towards Efficient NAS with Curriculum Search

1 code implementation ICML 2020 Yong Guo, Yaofo Chen, Yin Zheng, Peilin Zhao, Jian Chen, Junzhou Huang, Mingkui Tan

With the proposed search strategy, our Curriculum Neural Architecture Search (CNAS) method significantly improves the search efficiency and finds better architectures than existing NAS methods.

Neural Architecture Search

AQD: Towards Accurate Fully-Quantized Object Detection

1 code implementation CVPR 2021 Peng Chen, Jing Liu, Bohan Zhuang, Mingkui Tan, Chunhua Shen

Network quantization allows inference to be conducted using low-precision arithmetic for improved inference efficiency of deep neural networks on edge devices.

Image Classification Object +3

Generating Visually Aligned Sound from Videos

1 code implementation 14 Jul 2020 Peihao Chen, Yang Zhang, Mingkui Tan, Hongdong Xiao, Deng Huang, Chuang Gan

During testing, the audio forwarding regularizer is removed to ensure that REGNET can produce purely aligned sound only from visual features.

Length-Controllable Image Captioning

1 code implementation ECCV 2020 Chaorui Deng, Ning Ding, Mingkui Tan, Qi Wu

We verify the merit of the proposed length level embedding on three models: two state-of-the-art (SOTA) autoregressive models with different types of decoder, as well as our proposed non-autoregressive model, to show its generalization ability.

controllable image captioning

Improving Generative Adversarial Networks with Local Coordinate Coding

1 code implementation 28 Jul 2020 Jiezhang Cao, Yong Guo, Qingyao Wu, Chunhua Shen, Junzhou Huang, Mingkui Tan

In this paper, rather than sampling from the predefined prior distribution, we propose an LCCGAN model with local coordinate coding (LCC) to improve the performance of generating data.

Location-aware Graph Convolutional Networks for Video Question Answering

1 code implementation 7 Aug 2020 Deng Huang, Peihao Chen, Runhao Zeng, Qing Du, Mingkui Tan, Chuang Gan

In this work, we propose to represent the contents in the video as a location-aware graph by incorporating the location information of an object into the graph construction.

Action Recognition graph construction +3

SkeletonNet: A Topology-Preserving Solution for Learning Mesh Reconstruction of Object Surfaces from RGB Images

1 code implementation 13 Aug 2020 Jiapeng Tang, Xiaoguang Han, Mingkui Tan, Xin Tong, Kui Jia

However, they all have their own drawbacks, and cannot properly reconstruct the surface shapes of complex topologies, arguably due to a lack of constraints on the topological structures in their learning frameworks.

Surface Reconstruction

Self-Supervised Gait Encoding with Locality-Aware Attention for Person Re-Identification

1 code implementation 21 Aug 2020 Haocong Rao, Siqi Wang, Xiping Hu, Mingkui Tan, Huang Da, Jun Cheng, Bin Hu

Unlike previous methods, we propose, for the first time, a generic gait encoding approach that can utilize unlabeled skeleton data to learn gait representations in a self-supervised manner.

Person Re-Identification

A Self-Supervised Gait Encoding Approach with Locality-Awareness for 3D Skeleton Based Person Re-Identification

1 code implementation 5 Sep 2020 Haocong Rao, Siqi Wang, Xiping Hu, Mingkui Tan, Yi Guo, Jun Cheng, Xinwang Liu, Bin Hu

This paper proposes a self-supervised gait encoding approach that can leverage unlabeled skeleton data to learn gait representations for person Re-ID.

Contrastive Learning Person Re-Identification +2

Double Forward Propagation for Memorized Batch Normalization

no code implementations 10 Oct 2020 Yong Guo, Qingyao Wu, Chaorui Deng, Jian Chen, Mingkui Tan

Although the standard BN can significantly accelerate the training of DNNs and improve the generalization performance, it has several underlying limitations which may hamper the performance in both training and inference.

RSPNet: Relative Speed Perception for Unsupervised Video Representation Learning

1 code implementation 27 Oct 2020 Peihao Chen, Deng Huang, Dongliang He, Xiang Long, Runhao Zeng, Shilei Wen, Mingkui Tan, Chuang Gan

We study unsupervised video representation learning that seeks to learn both motion and appearance features from unlabeled video only, which can be reused for downstream tasks such as action recognition.

Representation Learning Retrieval +2

Modular Graph Attention Network for Complex Visual Relational Reasoning

no code implementations 22 Nov 2020 Yihan Zheng, Zhiquan Wen, Mingkui Tan, Runhao Zeng, Qi Chen, YaoWei Wang, Qi Wu

Moreover, to capture the complex logic in a query, we construct a relational graph to represent the visual objects and their relationships, and propose a multi-step reasoning method to progressively understand the complex logic.

Graph Attention Question Answering +5

Pareto-Frontier-aware Neural Architecture Search

no code implementations 1 Jan 2021 Yong Guo, Yaofo Chen, Yin Zheng, Peilin Zhao, Jian Chen, Junzhou Huang, Mingkui Tan

To find promising architectures under different budgets, existing methods may have to perform an independent search for each budget, which is very inefficient and unnecessary.

Neural Architecture Search

How to Train Your Agent to Read and Write

1 code implementation 4 Jan 2021 Li Liu, Mengge He, Guanghui Xu, Mingkui Tan, Qi Wu

Typically, this requires an agent to fully understand the knowledge from the given text materials and generate correct and fluent novel paragraphs, which is very challenging in practice.

KG-to-Text Generation Knowledge Graphs

Single-path Bit Sharing for Automatic Loss-aware Model Compression

no code implementations 13 Jan 2021 Jing Liu, Bohan Zhuang, Peng Chen, Chunhua Shen, Jianfei Cai, Mingkui Tan

By jointly training the binary gates in conjunction with network parameters, the compression configurations of each layer can be automatically determined.

Model Compression Network Pruning +1

Deep View Synthesis via Self-Consistent Generative Network

1 code implementation 19 Jan 2021 Zhuoman Liu, Wei Jia, Ming Yang, Peiyao Luo, Yong Guo, Mingkui Tan

To address the above issues, in this paper, we propose a novel deep generative model, called Self-Consistent Generative Network (SCGN), which synthesizes novel views from the given input views without explicitly exploiting the geometric information.

Towards Accurate and Compact Architectures via Neural Architecture Transformer

2 code implementations 20 Feb 2021 Yong Guo, Yin Zheng, Mingkui Tan, Qi Chen, Zhipeng Li, Jian Chen, Peilin Zhao, Junzhou Huang

To address this issue, we propose a Neural Architecture Transformer++ (NAT++) method which further enlarges the set of candidate transitions to improve the performance of architecture optimization.

Neural Architecture Search valid

Pareto-Frontier-aware Neural Architecture Generation for Diverse Budgets

no code implementations 27 Feb 2021 Yong Guo, Yaofo Chen, Yin Zheng, Qi Chen, Peilin Zhao, Jian Chen, Junzhou Huang, Mingkui Tan

To this end, we propose a Pareto-Frontier-aware Neural Architecture Generator (NAG) which takes an arbitrary budget as input and produces the Pareto optimal architecture for the target budget.

Internal Wasserstein Distance for Adversarial Attack and Defense

no code implementations 13 Mar 2021 Qicheng Wang, Shuhai Zhang, JieZhang Cao, Jincheng Li, Mingkui Tan, Yang Xiang

Existing attack methods often construct adversarial examples relying on some metrics like the $\ell_p$ distance to perturb samples.

Adversarial Attack Adversarial Defense +2

Learning Defense Transformers for Counterattacking Adversarial Examples

1 code implementation 13 Mar 2021 Jincheng Li, JieZhang Cao, Yifan Zhang, Jian Chen, Mingkui Tan

Relying on this, we learn a defense transformer to counterattack the adversarial examples by parameterizing the affine transformations and exploiting the boundary information of DNNs.

Adversarial Defense

Towards Accurate Text-based Image Captioning with Content Diversity Exploration

1 code implementation CVPR 2021 Guanghui Xu, Shuaicheng Niu, Mingkui Tan, Yucheng Luo, Qing Du, Qi Wu

This task, however, is very challenging because an image often contains complex text and visual information that is hard to describe comprehensively.

Caption Generation Image Captioning

Source-free Domain Adaptation via Avatar Prototype Generation and Adaptation

1 code implementation 18 Jun 2021 Zhen Qiu, Yifan Zhang, Hongbin Lin, Shuaicheng Niu, Yanxia Liu, Qing Du, Mingkui Tan

(2) prototype adaptation: based on the generated source prototypes and target pseudo labels, we develop a new robust contrastive prototype adaptation strategy to align each pseudo-labeled target data to the corresponding source prototypes.

Contrastive Learning Source-Free Domain Adaptation +1

Perception-Aware Multi-Sensor Fusion for 3D LiDAR Semantic Segmentation

1 code implementation ICCV 2021 Zhuangwei Zhuang, Rong Li, Kui Jia, Qicheng Wang, Yuanqing Li, Mingkui Tan

In this work, we investigate a collaborative fusion scheme called perception-aware multi-sensor fusion (PMF) to exploit perceptual information from two modalities, namely, appearance information from RGB images and spatio-depth information from point clouds.

LIDAR Semantic Segmentation Scene Understanding +2

Content-Aware Convolutional Neural Networks

1 code implementation 30 Jun 2021 Yong Guo, Yaofo Chen, Mingkui Tan, Kui Jia, Jian Chen, Jingdong Wang

In practice, the convolutional operation on some of the windows (e.g., smooth windows that contain very similar pixels) can be very redundant and may introduce noise into the computation.

AdaXpert: Adapting Neural Architecture for Growing Data

1 code implementation 1 Jul 2021 Shuaicheng Niu, Jiaxiang Wu, Guanghui Xu, Yifan Zhang, Yong Guo, Peilin Zhao, Peng Wang, Mingkui Tan

To address this, we present a neural architecture adaptation method, namely Adaptation eXpert (AdaXpert), to efficiently adjust previous architectures on the growing data.

Instance Segmentation in 3D Scenes using Semantic Superpoint Tree Networks

1 code implementation ICCV 2021 Zhihao Liang, Zhihao LI, Songcen Xu, Mingkui Tan, Kui Jia

State-of-the-art methods largely rely on a general pipeline that first learns point-wise features discriminative at semantic and instance levels, followed by a separate step of point grouping for proposing object instances.

3D Instance Segmentation Scene Understanding +1

V2C: Visual Voice Cloning

no code implementations CVPR 2022 Qi Chen, Yuanqing Li, Yuankai Qi, Jiaqiu Zhou, Mingkui Tan, Qi Wu

Existing Voice Cloning (VC) tasks aim to convert a paragraph of text into speech with the desired voice specified by a reference audio clip.

Voice Cloning

Debiased Visual Question Answering from Feature and Sample Perspectives

1 code implementation NeurIPS 2021 Zhiquan Wen, Guanghui Xu, Mingkui Tan, Qingyao Wu, Qi Wu

From the sample perspective, we construct two types of negative samples to assist the training of the models, without introducing additional annotations.

Bias Detection Question Answering +1

Graph Convolutional Module for Temporal Action Localization in Videos

no code implementations 1 Dec 2021 Runhao Zeng, Wenbing Huang, Mingkui Tan, Yu Rong, Peilin Zhao, Junzhou Huang, Chuang Gan

To this end, we propose a general graph convolutional module (GCM) that can be easily plugged into existing action localization methods, including two-stage and one-stage paradigms.

Ranked #2 on Temporal Action Localization on THUMOS’14 (mAP IOU@0.1 metric)

Action Recognition Temporal Action Localization

Boost Test-Time Performance with Closed-Loop Inference

no code implementations 21 Mar 2022 Shuaicheng Niu, Jiaxiang Wu, Yifan Zhang, Guanghui Xu, Haokun Li, Peilin Zhao, Junzhou Huang, YaoWei Wang, Mingkui Tan

Motivated by this, we propose to predict those hard-classified test samples in a looped manner to boost the model performance.

Auxiliary Learning

Efficient Test-Time Model Adaptation without Forgetting

1 code implementation 6 Apr 2022 Shuaicheng Niu, Jiaxiang Wu, Yifan Zhang, Yaofo Chen, Shijian Zheng, Peilin Zhao, Mingkui Tan

Test-time adaptation (TTA) seeks to tackle potential distribution shifts between training and testing data by adapting a given model w.r.t.

Test-time Adaptation

Towards Lightweight Super-Resolution with Dual Regression Learning

2 code implementations 16 Jul 2022 Yong Guo, Jingdong Wang, Qi Chen, JieZhang Cao, Zeshuai Deng, Yanwu Xu, Jian Chen, Mingkui Tan

Nevertheless, it is hard for existing model compression methods to accurately identify the redundant components due to the extremely large SR mapping space.

Image Super-Resolution Model Compression +1

Prototype-Guided Continual Adaptation for Class-Incremental Unsupervised Domain Adaptation

1 code implementation 22 Jul 2022 Hongbin Lin, Yifan Zhang, Zhen Qiu, Shuaicheng Niu, Chuang Gan, Yanxia Liu, Mingkui Tan

2) Prototype-based alignment and replay: based on the identified label prototypes, we align both domains and enforce the model to retain previous knowledge.

Unsupervised Domain Adaptation

DAS: Densely-Anchored Sampling for Deep Metric Learning

1 code implementation 30 Jul 2022 Lizhao Liu, Shangxin Huang, Zhuangwei Zhuang, Ran Yang, Mingkui Tan, YaoWei Wang

To this end, we propose a Densely-Anchored Sampling (DAS) scheme that considers the embedding with corresponding data point as "anchor" and exploits the anchor's nearby embedding space to densely produce embeddings without data points.

Face Recognition Image Retrieval +2

Calibrate the inter-observer segmentation uncertainty via diagnosis-first principle

2 code implementations 5 Aug 2022 Junde Wu, Huihui Fang, Hoayi Xiong, Lixin Duan, Mingkui Tan, Weihua Yang, Huiying Liu, Yanwu Xu

Inspired by this observation, we propose the diagnosis-first principle, which takes disease diagnosis as the criterion to calibrate the inter-observer segmentation uncertainty.

Image Segmentation Lesion Segmentation +3

Multi-Scale Multi-Target Domain Adaptation for Angle Closure Classification

no code implementations 25 Aug 2022 Zhen Qiu, Yifan Zhang, Fei Li, Xiulan Zhang, Yanwu Xu, Mingkui Tan

Based on these domain-invariant features at different scales, the deep model trained on the source domain is able to classify angle closure on multiple target domains even without any annotations in these domains.

Domain Adaptation Multi-target Domain Adaptation

Pareto-aware Neural Architecture Generation for Diverse Computational Budgets

1 code implementation 14 Oct 2022 Yong Guo, Yaofo Chen, Yin Zheng, Qi Chen, Peilin Zhao, Jian Chen, Junzhou Huang, Mingkui Tan

More critically, these independent search processes cannot share their learned knowledge (i.e., the distribution of good architectures) with each other and thus often result in limited search results.

Weakly-Supervised Multi-Granularity Map Learning for Vision-and-Language Navigation

1 code implementation 14 Oct 2022 Peihao Chen, Dongyu Ji, Kunyang Lin, Runhao Zeng, Thomas H. Li, Mingkui Tan, Chuang Gan

To achieve accurate and efficient navigation, it is critical to build a map that accurately represents both spatial location and the semantic information of the environment objects.

Navigate Vision and Language Navigation

Learning Active Camera for Multi-Object Navigation

no code implementations 14 Oct 2022 Peihao Chen, Dongyu Ji, Kunyang Lin, Weiwen Hu, Wenbing Huang, Thomas H. Li, Mingkui Tan, Chuang Gan

How to make robots perceive the environment as efficiently as humans is a fundamental problem in robotics.

Navigate Object

Towards Stable Test-Time Adaptation in Dynamic Wild World

1 code implementation 24 Feb 2023 Shuaicheng Niu, Jiaxiang Wu, Yifan Zhang, Zhiquan Wen, Yaofo Chen, Peilin Zhao, Mingkui Tan

In this paper, we investigate the reasons for this instability and find that the batch norm layer is a crucial factor hindering TTA stability.

Test-time Adaptation

Hard Sample Matters a Lot in Zero-Shot Quantization

1 code implementation CVPR 2023 Huantong Li, Xiangmiao Wu, Fanbing Lv, Daihai Liao, Thomas H. Li, Yonggang Zhang, Bo Han, Mingkui Tan

Nonetheless, we find that the synthetic samples constructed in existing ZSQ methods can be easily fitted by models.

Quantization

Towards Efficient Task-Driven Model Reprogramming with Foundation Models

no code implementations 5 Apr 2023 Shoukai Xu, Jiangchao Yao, Ran Luo, Shuhai Zhang, Zihao Lian, Mingkui Tan, Bo Han, YaoWei Wang

Moreover, the data used for pretraining foundation models are usually invisible and very different from the target data of downstream tasks.

Knowledge Distillation Transfer Learning

Imbalance-Agnostic Source-Free Domain Adaptation via Avatar Prototype Alignment

no code implementations 22 May 2023 Hongbin Lin, Mingkui Tan, Yifan Zhang, Zhen Qiu, Shuaicheng Niu, Dong Liu, Qing Du, Yanxia Liu

To address this issue, we study a more practical SF-UDA task, termed imbalance-agnostic SF-UDA, where the class distributions of both the unseen source domain and unlabeled target domain are unknown and could be arbitrarily skewed.

Pseudo Label Source-Free Domain Adaptation +1

Detecting Adversarial Data by Probing Multiple Perturbations Using Expected Perturbation Score

1 code implementation 25 May 2023 Shuhai Zhang, Feng Liu, Jiahao Yang, Yifan Yang, Changsheng Li, Bo Han, Mingkui Tan

Last, we propose an EPS-based adversarial detection (EPS-AD) method, in which we develop EPS-based maximum mean discrepancy (MMD) as a metric to measure the discrepancy between the test sample and natural samples.

Cross-Ray Neural Radiance Fields for Novel-view Synthesis from Unconstrained Image Collections

1 code implementation ICCV 2023 Yifan Yang, Shuhai Zhang, Zixiong Huang, Yubing Zhang, Mingkui Tan

To mimic the perception process of humans, in this paper, we propose Cross-Ray NeRF (CR-NeRF) that leverages interactive information across multiple rays to synthesize occlusion-free novel views with the same appearances as the images.

Novel View Synthesis

CPCM: Contextual Point Cloud Modeling for Weakly-supervised Point Cloud Semantic Segmentation

1 code implementation ICCV 2023 Lizhao Liu, Zhuangwei Zhuang, Shangxin Huang, Xunlong Xiao, Tianhang Xiang, Cen Chen, Jingdong Wang, Mingkui Tan

CMT disentangles the learning of supervised segmentation and unsupervised masked context prediction for effectively learning the very limited labeled points and mass unlabeled points, respectively.

Representation Learning Scene Understanding +2

Learning Vision-and-Language Navigation from YouTube Videos

1 code implementation ICCV 2023 Kunyang Lin, Peihao Chen, Diwei Huang, Thomas H. Li, Mingkui Tan, Chuang Gan

In this paper, we propose to learn an agent from these videos by creating a large-scale dataset which comprises reasonable path-instruction pairs from house tour videos and pre-training the agent on it.

Navigate Vision and Language Navigation

$A^2$Nav: Action-Aware Zero-Shot Robot Navigation by Exploiting Vision-and-Language Ability of Foundation Models

no code implementations 15 Aug 2023 Peihao Chen, Xinyu Sun, Hongyan Zhi, Runhao Zeng, Thomas H. Li, Gaowen Liu, Mingkui Tan, Chuang Gan

We study the task of zero-shot vision-and-language navigation (ZS-VLN), a practical yet challenging problem in which an agent learns to navigate following a path described by language instructions without requiring any path-instruction annotation data.

Navigate Robot Navigation +1

Likelihood-Based Text-to-Image Evaluation with Patch-Level Perceptual and Semantic Credit Assignment

1 code implementation 16 Aug 2023 Qi Chen, Chaorui Deng, Zixiong Huang, BoWen Zhang, Mingkui Tan, Qi Wu

In this paper, we propose to evaluate text-to-image generation performance by directly estimating the likelihood of the generated images using a pre-trained likelihood-based text-to-image generative model, i.e., a higher likelihood indicates better perceptual quality and better text-image alignment.

Text-to-Image Generation

Efficient Test-Time Adaptation for Super-Resolution with Second-Order Degradation and Reconstruction

1 code implementation NeurIPS 2023 Zeshuai Deng, Zhuokun Chen, Shuaicheng Niu, Thomas H. Li, Bohan Zhuang, Mingkui Tan

Then, we adapt the SR model by implementing feature-level reconstruction learning from the initial test image to its second-order degraded counterparts, which helps the SR model generate plausible HR images.

Image Super-Resolution Test-time Adaptation

Contrastive Vision-Language Alignment Makes Efficient Instruction Learner

1 code implementation 29 Nov 2023 Lizhao Liu, Xinyu Sun, Tianhang Xiang, Zhuangwei Zhuang, Liuren Yin, Mingkui Tan

To address this, existing methods typically train a visual adapter to align the representation between a pre-trained vision transformer (ViT) and the LLM by a generative image captioning loss.

Contrastive Learning Image Captioning +4

DCIR: Dynamic Consistency Intrinsic Reward for Multi-Agent Reinforcement Learning

no code implementations 10 Dec 2023 Kunyang Lin, Yufeng Wang, Peihao Chen, Runhao Zeng, Siyuan Zhou, Mingkui Tan, Chuang Gan

In this paper, we propose a new approach that enables agents to learn whether their behaviors should be consistent with that of other agents by utilizing intrinsic rewards to learn the optimal policy for each agent.

Multi-agent Reinforcement Learning reinforcement-learning +2

Decoupled Prototype Learning for Reliable Test-Time Adaptation

no code implementations 15 Jan 2024 Guowei Wang, Changxing Ding, Wentao Tan, Mingkui Tan

Second, we propose a memory-based strategy to enhance DPL's robustness for the small batch sizes often encountered in TTA.

Domain Generalization Test-time Adaptation

Detecting Machine-Generated Texts by Multi-Population Aware Optimization for Maximum Mean Discrepancy

1 code implementation 25 Feb 2024 Shuhai Zhang, Yiliao Song, Jiahao Yang, Yuanqing Li, Bo Han, Mingkui Tan

Unfortunately, it is challenging to distinguish MGTs and human-written texts because the distributional discrepancy between them is often very subtle due to the remarkable performance of LLMs.

Hallucination Sentence

Towards Robust and Efficient Cloud-Edge Elastic Model Adaptation via Selective Entropy Distillation

1 code implementation 27 Feb 2024 Yaofo Chen, Shuaicheng Niu, Shoukai Xu, Hengjie Song, YaoWei Wang, Mingkui Tan

Moreover, with the increasing data collected at the edge, this paradigm also fails to further adapt the cloud model for better performance.

Uncertainty-Calibrated Test-Time Model Adaptation without Forgetting

no code implementations 18 Mar 2024 Mingkui Tan, Guohao Chen, Jiaxiang Wu, Yifan Zhang, Yaofo Chen, Peilin Zhao, Shuaicheng Niu

To tackle this, we further propose EATA with Calibration (EATA-C) to separately exploit the reducible model uncertainty and the inherent data uncertainty for calibrated TTA.

Image Classification Semantic Segmentation +1

AlphaFin: Benchmarking Financial Analysis with Retrieval-Augmented Stock-Chain Framework

1 code implementation 19 Mar 2024 Xiang Li, Zhenyu Li, Chen Shi, Yong Xu, Qing Du, Mingkui Tan, Jun Huang, Wei Lin

The task of financial analysis primarily encompasses two key areas: stock trend prediction and the corresponding financial question answering.

Benchmarking Question Answering +2

HiLo: Detailed and Robust 3D Clothed Human Reconstruction with High-and Low-Frequency Information of Parametric Models

2 code implementations 7 Apr 2024 Yifan Yang, Dong Liu, Shuhai Zhang, Zeshuai Deng, Zixiong Huang, Mingkui Tan

We empirically find that the high-frequency (HF) and low-frequency (LF) information from a parametric model has the potential to enhance geometry details and improve robustness to noise, respectively.

Virtual Try-on

Deep Transferring Quantization

1 code implementation ECCV 2020 Zheng Xie, Zhiquan Wen, Jing Liu, Zhi-Qiang Liu, Xixian Wu, Mingkui Tan

Specifically, we propose a method named deep transferring quantization (DTQ) to effectively exploit the knowledge in a pre-trained full-precision model.

Face Recognition Image Classification +2
