1 code implementation • ECCV 2020 • Zheng Xie, Zhiquan Wen, Jing Liu, Zhi-Qiang Liu, Xixian Wu, Mingkui Tan
Specifically, we propose a method named deep transferring quantization (DTQ) to effectively exploit the knowledge in a pre-trained full-precision model.
no code implementations • 15 Apr 2025 • Jinwu Hu, Wei zhang, Yufeng Wang, Yu Hu, Bin Xiao, Mingkui Tan, Qing Du
We model prompt compression as a Markov Decision Process (MDP), enabling the DCP-Agent to sequentially remove redundant tokens by adapting to dynamic contexts and retaining crucial content.
1 code implementation • 18 Dec 2024 • Qianyue Wang, Jinwu Hu, ZhengPing Li, Yufeng Wang, daiyuan li, Yu Hu, Mingkui Tan
The long-form story generation task aims to produce coherent and sufficiently lengthy text, which is essential for applications such as novel writing and interactive storytelling.
no code implementations • 17 Dec 2024 • Yaofo Chen, Zeng You, Shuhai Zhang, Haokun Li, Yirui Li, YaoWei Wang, Mingkui Tan
As a result, the computational and memory complexity will be significantly reduced.
no code implementations • 11 Dec 2024 • Shuhai Zhang, Jiahao Yang, Hui Luo, Jie Chen, Li Wang, Feng Liu, Bo Han, Mingkui Tan
Leveraging this insight, we propose Consistency Model-based Adversarial Purification (CMAP), which optimizes vectors within the latent space of a pre-trained consistency model to generate samples for restoring clean data.
no code implementations • 10 Dec 2024 • Jinwu Hu, Yufeng Wang, Shuhai Zhang, Kai Zhou, Guohao Chen, Yu Hu, Bin Xiao, Mingkui Tan
Ensemble reasoning for the strengths of different LLM experts is critical to achieving consistent and satisfactory performance on diverse inputs across a wide range of tasks.
no code implementations • 9 Dec 2024 • Zeng You, Zhiquan Wen, Yaofo Chen, Xin Li, Runhao Zeng, YaoWei Wang, Mingkui Tan
To avoid interference from redundant information in videos, we introduce a Semantic Redundancy Reduction mechanism that removes redundancy at both the visual and textual levels.
no code implementations • 2 Dec 2024 • Zhuokun Chen, Jinwu Hu, Zeshuai Deng, Yufeng Wang, Bohan Zhuang, Mingkui Tan
By merging the parameters of language models from these MLLMs, VisionFuse allows a single language model to align with various vision encoders, significantly reducing deployment overhead.
1 code implementation • 2 Dec 2024 • Hongyan Zhi, Peihao Chen, Junyan Li, Shuailei Ma, Xinyu Sun, Tianhang Xiang, Yinjie Lei, Mingkui Tan, Chuang Gan
Experiments show that our method surpasses existing methods on both large scene understanding and existing scene understanding benchmarks.
no code implementations • 1 Dec 2024 • Haowei Sun, Jinwu Hu, Zhirui Zhang, Haoyuan Tian, Xinze Xie, Yufeng Wang, Zhuliang Yu, Xiaohua Xie, Mingkui Tan
However, accurate Drone Visual Active Tracking using reinforcement learning remains challenging due to the absence of a unified benchmark, the complexity of open-world environments with frequent interference, and the diverse motion behavior of dynamic targets.
1 code implementation • 19 Nov 2024 • Zhuangwei Zhuang, Ziyin Wang, Sitao Chen, Lizhao Liu, Hui Luo, Mingkui Tan
Recent methods are mainly built on the 2D-to-3D transformation that relies on sensor calibration to project the 2D image information into the 3D space.
no code implementations • 27 Sep 2024 • Yanyuan Qiao, Wenqi Lyu, Hui Wang, Zixu Wang, Zerui Li, Yuan Zhang, Mingkui Tan, Qi Wu
Vision-and-Language Navigation (VLN) tasks require an agent to follow textual instructions to navigate through 3D environments.
1 code implementation • 4 Jun 2024 • Changhao Li, Xinyu Sun, Peihao Chen, Jugang Fan, Zixu Wang, Yanxia Liu, Jinhui Zhu, Chuang Gan, Mingkui Tan
To achieve this goal, the agent needs a fundamental collaborative navigation ability: it should infer human intention by observing human activities and then navigate to the human's intended destination ahead of the human.
no code implementations • 22 May 2024 • Diwei Huang, Kunyang Lin, Peihao Chen, Qing Du, Mingkui Tan
Few-shot audio-visual acoustics modeling seeks to synthesize the room impulse response in arbitrary locations with few-shot observations.
1 code implementation • CVPR 2024 • Zixiong Huang, Qi Chen, Libo Sun, Yifan Yang, Naizhou Wang, Mingkui Tan, Qi Wu
Novel view synthesis aims to generate new view images from a given collection of view images.
1 code implementation • CVPR 2024 • Yifan Yang, Dong Liu, Shuhai Zhang, Zeshuai Deng, Zixiong Huang, Mingkui Tan
We empirically find that the high-frequency (HF) and low-frequency (LF) information from a parametric model has the potential to enhance geometry details and improve robustness to noise, respectively.
1 code implementation • 19 Mar 2024 • Xiang Li, Zhenyu Li, Chen Shi, Yong Xu, Qing Du, Mingkui Tan, Jun Huang, Wei Lin
The task of financial analysis primarily encompasses two key areas: stock trend prediction and the corresponding financial question answering.
no code implementations • 18 Mar 2024 • Mingkui Tan, Guohao Chen, Jiaxiang Wu, Yifan Zhang, Yaofo Chen, Peilin Zhao, Shuaicheng Niu
To tackle this, we further propose EATA with Calibration (EATA-C) to separately exploit the reducible model uncertainty and the inherent data uncertainty for calibrated TTA.
1 code implementation • 27 Feb 2024 • Yaofo Chen, Shuaicheng Niu, YaoWei Wang, Shoukai Xu, Hengjie Song, Mingkui Tan
Moreover, with the increasing data collected at the edge, this paradigm also fails to further adapt the cloud model for better performance.
1 code implementation • 25 Feb 2024 • Shuhai Zhang, Yiliao Song, Jiahao Yang, Yuanqing Li, Bo Han, Mingkui Tan
Unfortunately, it is challenging to distinguish MGTs and human-written texts because the distributional discrepancy between them is often very subtle due to the remarkable performance of LLMs.
no code implementations • 15 Jan 2024 • Guowei Wang, Changxing Ding, Wentao Tan, Mingkui Tan
Second, we propose a memory-based strategy to enhance DPL's robustness for the small batch sizes often encountered in TTA.
no code implementations • 10 Dec 2023 • Kunyang Lin, Yufeng Wang, Peihao Chen, Runhao Zeng, Siyuan Zhou, Mingkui Tan, Chuang Gan
In this paper, we propose a new approach that enables agents to learn whether their behaviors should be consistent with that of other agents by utilizing intrinsic rewards to learn the optimal policy for each agent.
Multi-agent Reinforcement Learning
reinforcement-learning
1 code implementation • 29 Nov 2023 • Lizhao Liu, Xinyu Sun, Tianhang Xiang, Zhuangwei Zhuang, Liuren Yin, Mingkui Tan
To address this, existing methods typically train a visual adapter to align the representation between a pre-trained vision transformer (ViT) and the LLM by a generative image captioning loss.
1 code implementation • NeurIPS 2023 • Zeshuai Deng, Zhuokun Chen, Shuaicheng Niu, Thomas H. Li, Bohan Zhuang, Mingkui Tan
Then, we adapt the SR model by implementing feature-level reconstruction learning from the initial test image to its second-order degraded counterparts, which helps the SR model generate plausible HR images.
1 code implementation • 16 Aug 2023 • Qi Chen, Chaorui Deng, Zixiong Huang, BoWen Zhang, Mingkui Tan, Qi Wu
In this paper, we propose to evaluate text-to-image generation performance by directly estimating the likelihood of the generated images using a pre-trained likelihood-based text-to-image generative model, i.e., a higher likelihood indicates better perceptual quality and better text-image alignment.
no code implementations • 15 Aug 2023 • Peihao Chen, Xinyu Sun, Hongyan Zhi, Runhao Zeng, Thomas H. Li, Gaowen Liu, Mingkui Tan, Chuang Gan
We study the task of zero-shot vision-and-language navigation (ZS-VLN), a practical yet challenging problem in which an agent learns to navigate following a path described by language instructions without requiring any path-instruction annotation data.
1 code implementation • ICCV 2023 • Kunyang Lin, Peihao Chen, Diwei Huang, Thomas H. Li, Mingkui Tan, Chuang Gan
In this paper, we propose to learn an agent from these videos by creating a large-scale dataset which comprises reasonable path-instruction pairs from house tour videos and pre-training the agent on it.
1 code implementation • ICCV 2023 • Lizhao Liu, Zhuangwei Zhuang, Shangxin Huang, Xunlong Xiao, Tianhang Xiang, Cen Chen, Jingdong Wang, Mingkui Tan
CMT disentangles the learning of supervised segmentation and unsupervised masked context prediction for effectively learning the very limited labeled points and mass unlabeled points, respectively.
1 code implementation • ICCV 2023 • Yifan Yang, Shuhai Zhang, Zixiong Huang, Yubing Zhang, Mingkui Tan
To mimic the perception process of humans, in this paper, we propose Cross-Ray NeRF (CR-NeRF) that leverages interactive information across multiple rays to synthesize occlusion-free novel views with the same appearances as the images.
1 code implementation • 25 May 2023 • Shuhai Zhang, Feng Liu, Jiahao Yang, Yifan Yang, Changsheng Li, Bo Han, Mingkui Tan
Last, we propose an EPS-based adversarial detection (EPS-AD) method, in which we develop EPS-based maximum mean discrepancy (MMD) as a metric to measure the discrepancy between the test sample and natural samples.
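The maximum mean discrepancy mentioned above can be illustrated with a minimal sketch. This is only the generic (biased) squared-MMD estimator with a Gaussian kernel, not the EPS-AD method itself: the feature vectors stand in for whatever representation is compared, and the function names and the `bandwidth` parameter are assumptions made for illustration.

```python
import math


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel between two equal-length feature vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mmd_squared(xs, ys, bandwidth=1.0):
    """Biased estimate of the squared MMD between two sample sets.

    xs and ys are non-empty lists of feature vectors (hypothetical inputs).
    """
    def mean_kernel(a, b):
        total = sum(gaussian_kernel(u, v, bandwidth) for u in a for v in b)
        return total / (len(a) * len(b))

    # MMD^2 = E[k(x, x')] + E[k(y, y')] - 2 E[k(x, y)]
    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)
```

A larger value suggests that the two sample sets come from more different distributions; identical sets give a value of zero up to floating-point error.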
no code implementations • 22 May 2023 • Hongbin Lin, Mingkui Tan, Yifan Zhang, Zhen Qiu, Shuaicheng Niu, Dong Liu, Qing Du, Yanxia Liu
To address this issue, we study a more practical SF-UDA task, termed imbalance-agnostic SF-UDA, where the class distributions of both the unseen source domain and unlabeled target domain are unknown and could be arbitrarily skewed.
no code implementations • 5 Apr 2023 • Shoukai Xu, Jiangchao Yao, Ran Luo, Shuhai Zhang, Zihao Lian, Mingkui Tan, Bo Han, YaoWei Wang
Moreover, the data used for pretraining foundation models are usually invisible and very different from the target data of downstream tasks.
1 code implementation • CVPR 2023 • Huantong Li, Xiangmiao Wu, Fanbing Lv, Daihai Liao, Thomas H. Li, Yonggang Zhang, Bo Han, Mingkui Tan
Nonetheless, we find that the synthetic samples constructed in existing ZSQ methods can be easily fitted by models.
1 code implementation • 24 Feb 2023 • Shuaicheng Niu, Jiaxiang Wu, Yifan Zhang, Zhiquan Wen, Yaofo Chen, Peilin Zhao, Mingkui Tan
In this paper, we investigate the causes of this instability and find that the batch norm layer is a crucial factor hindering TTA stability.
1 code implementation • 31 Oct 2022 • Yaofo Chen, Yong Guo, Daihai Liao, Fanbing Lv, Hengjie Song, James Tin-Yau Kwok, Mingkui Tan
Then, we perform a local search within the mined subspace to find effective architectures.
Ranked #2 on Neural Architecture Search on NAS-Bench-201, ImageNet-16-120 (Accuracy (Val) metric)
1 code implementation • 14 Oct 2022 • Peihao Chen, Dongyu Ji, Kunyang Lin, Runhao Zeng, Thomas H. Li, Mingkui Tan, Chuang Gan
To achieve accurate and efficient navigation, it is critical to build a map that accurately represents both spatial location and the semantic information of the environment objects.
1 code implementation • 14 Oct 2022 • Yong Guo, Yaofo Chen, Yin Zheng, Qi Chen, Peilin Zhao, Jian Chen, Junzhou Huang, Mingkui Tan
More critically, these independent search processes cannot share their learned knowledge (i.e., the distribution of good architectures) with each other and thus often yield limited search results.
no code implementations • 14 Oct 2022 • Peihao Chen, Dongyu Ji, Kunyang Lin, Weiwen Hu, Wenbing Huang, Thomas H. Li, Mingkui Tan, Chuang Gan
How to make robots perceive the environment as efficiently as humans is a fundamental problem in robotics.
2 code implementations • CVPR 2023 • Xinyu Sun, Peihao Chen, LiangWei Chen, Changhao Li, Thomas H. Li, Mingkui Tan, Chuang Gan
The latest attempts seek to learn a representation model by predicting the appearance contents in the masked regions.
Ranked #2 on Self-Supervised Action Recognition on HMDB51
no code implementations • 25 Aug 2022 • Zhen Qiu, Yifan Zhang, Fei Li, Xiulan Zhang, Yanwu Xu, Mingkui Tan
Based on these domain-invariant features at different scales, the deep model trained on the source domain is able to classify angle closure on multiple target domains even without any annotations in these domains.
2 code implementations • 5 Aug 2022 • Junde Wu, Huihui Fang, Hoayi Xiong, Lixin Duan, Mingkui Tan, Weihua Yang, Huiying Liu, Yanwu Xu
Inspired by this observation, we propose the diagnosis-first principle, which takes disease diagnosis as the criterion to calibrate the inter-observer segmentation uncertainty.
1 code implementation • 30 Jul 2022 • Lizhao Liu, Shangxin Huang, Zhuangwei Zhuang, Ran Yang, Mingkui Tan, YaoWei Wang
To this end, we propose a Densely-Anchored Sampling (DAS) scheme that considers the embedding with corresponding data point as "anchor" and exploits the anchor's nearby embedding space to densely produce embeddings without data points.
Ranked #2 on Metric Learning on CUB-200-2011
1 code implementation • 22 Jul 2022 • Hongbin Lin, Yifan Zhang, Zhen Qiu, Shuaicheng Niu, Chuang Gan, Yanxia Liu, Mingkui Tan
2) Prototype-based alignment and replay: based on the identified label prototypes, we align both domains and enforce the model to retain previous knowledge.
2 code implementations • 16 Jul 2022 • Yong Guo, Mingkui Tan, Zeshuai Deng, Jingdong Wang, Qi Chen, JieZhang Cao, Yanwu Xu, Jian Chen
Nevertheless, it is hard for existing model compression methods to accurately identify the redundant components due to the extremely large SR mapping space.
1 code implementation • 6 Apr 2022 • Shuaicheng Niu, Jiaxiang Wu, Yifan Zhang, Yaofo Chen, Shijian Zheng, Peilin Zhao, Mingkui Tan
Test-time adaptation (TTA) seeks to tackle potential distribution shifts between training and testing data by adapting a given model w.r.t.
no code implementations • 21 Mar 2022 • Shuaicheng Niu, Jiaxiang Wu, Yifan Zhang, Guanghui Xu, Haokun Li, Peilin Zhao, Junzhou Huang, YaoWei Wang, Mingkui Tan
Motivated by this, we propose to predict those hard-classified test samples in a looped manner to boost the model performance.
1 code implementation • NeurIPS 2021 • Zhiquan Wen, Guanghui Xu, Mingkui Tan, Qingyao Wu, Qi Wu
From the sample perspective, we construct two types of negative samples to assist the training of the models, without introducing additional annotations.
no code implementations • 1 Dec 2021 • Runhao Zeng, Wenbing Huang, Mingkui Tan, Yu Rong, Peilin Zhao, Junzhou Huang, Chuang Gan
To this end, we propose a general graph convolutional module (GCM) that can be easily plugged into existing action localization methods, including two-stage and one-stage paradigms.
Ranked #2 on Temporal Action Localization on THUMOS’14 (mAP IOU@0.1 metric)
no code implementations • CVPR 2022 • Qi Chen, Yuanqing Li, Yuankai Qi, Jiaqiu Zhou, Mingkui Tan, Qi Wu
Existing Voice Cloning (VC) tasks aim to convert a paragraph of text into speech with the desired voice specified by a reference audio.
1 code implementation • ICCV 2021 • Zhihao Liang, Zhihao LI, Songcen Xu, Mingkui Tan, Kui Jia
State-of-the-art methods largely rely on a general pipeline that first learns point-wise features discriminative at semantic and instance levels, followed by a separate step of point grouping for proposing object instances.
Ranked #10 on 3D Instance Segmentation on S3DIS
no code implementations • 3 Aug 2021 • Jing Liu, Bohan Zhuang, Mingkui Tan, Xu Liu, Dinh Phung, Yuanqing Li, Jianfei Cai
More critically, EAS is able to find compact architectures within 0.1 second for 50 deployment scenarios.
1 code implementation • 1 Jul 2021 • Shuaicheng Niu, Jiaxiang Wu, Guanghui Xu, Yifan Zhang, Yong Guo, Peilin Zhao, Peng Wang, Mingkui Tan
To address this, we present a neural architecture adaptation method, namely Adaptation eXpert (AdaXpert), to efficiently adjust previous architectures on the growing data.
1 code implementation • 30 Jun 2021 • Yong Guo, Yaofo Chen, Mingkui Tan, Kui Jia, Jian Chen, Jingdong Wang
In practice, the convolutional operation on some of the windows (e.g., smooth windows that contain very similar pixels) can be very redundant and may introduce noise into the computation.
1 code implementation • ICCV 2021 • Mingkui Tan, Zhuangwei Zhuang, Sitao Chen, Rong Li, Kui Jia, Qicheng Wang, Yuanqing Li
We then explore more efficient contextual modules under perspective projection and fuse the LiDAR features into the camera stream to boost the performance of the two-stream network.
Ranked #12 on Semantic Segmentation on KITTI-360
1 code implementation • 18 Jun 2021 • Zhen Qiu, Yifan Zhang, Hongbin Lin, Shuaicheng Niu, Yanxia Liu, Qing Du, Mingkui Tan
(2) prototype adaptation: based on the generated source prototypes and target pseudo labels, we develop a new robust contrastive prototype adaptation strategy to align each pseudo-labeled target data to the corresponding source prototypes.
Ranked #15 on Domain Adaptation on Office-31
no code implementations • ICCV 2021 • Deng Huang, Wenhao Wu, Weiwen Hu, Xu Liu, Dongliang He, Zhihua Wu, Xiangmiao Wu, Mingkui Tan, Errui Ding
Specifically, we propose two tasks to learn the appearance and speed consistency, respectively.
1 code implementation • CVPR 2021 • Guanghui Xu, Shuaicheng Niu, Mingkui Tan, Yucheng Luo, Qing Du, Qi Wu
This task, however, is very challenging because an image often contains complex texts and visual information that is hard to describe comprehensively.
1 code implementation • 13 Mar 2021 • Jincheng Li, JieZhang Cao, Yifan Zhang, Jian Chen, Mingkui Tan
Relying on this, we learn a defense transformer to counterattack the adversarial examples by parameterizing the affine transformations and exploiting the boundary information of DNNs.
no code implementations • 13 Mar 2021 • Qicheng Wang, Shuhai Zhang, JieZhang Cao, Jincheng Li, Mingkui Tan, Yang Xiang
Existing attack methods often construct adversarial examples relying on some metrics like the $\ell_p$ distance to perturb samples.
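As a small illustration of the $\ell_p$ distance mentioned above, the sketch below measures how far a perturbed sample lies from the original. It shows only the standard definition of the metric; the function name and the flat-list input format are assumptions, and no particular attack method is implied.

```python
import math


def lp_distance(x, y, p=2.0):
    """l_p distance between two equal-length vectors, for p >= 1 or p = inf."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    diffs = [abs(a - b) for a, b in zip(x, y)]
    if math.isinf(p):
        # l_inf: the largest absolute coordinate-wise change
        return max(diffs, default=0.0)
    if p < 1:
        raise ValueError("p must be >= 1 or infinity")
    return sum(d ** p for d in diffs) ** (1.0 / p)
```

An attack constrained by such a metric typically keeps `lp_distance(original, perturbed, p)` below a fixed budget; the choice of `p` determines whether many small changes or a few large ones are allowed.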
1 code implementation • CVPR 2021 • Yaofo Chen, Yong Guo, Qi Chen, Minli Li, Wei Zeng, YaoWei Wang, Mingkui Tan
One of the key steps in Neural Architecture Search (NAS) is to estimate the performance of candidate architectures.
no code implementations • 27 Feb 2021 • Yong Guo, Yaofo Chen, Yin Zheng, Qi Chen, Peilin Zhao, Jian Chen, Junzhou Huang, Mingkui Tan
To this end, we propose a Pareto-Frontier-aware Neural Architecture Generator (NAG) which takes an arbitrary budget as input and produces the Pareto optimal architecture for the target budget.
2 code implementations • 20 Feb 2021 • Yong Guo, Yin Zheng, Mingkui Tan, Qi Chen, Zhipeng Li, Jian Chen, Peilin Zhao, Junzhou Huang
To address this issue, we propose a Neural Architecture Transformer++ (NAT++) method which further enlarges the set of candidate transitions to improve the performance of architecture optimization.
1 code implementation • 19 Jan 2021 • Zhuoman Liu, Wei Jia, Ming Yang, Peiyao Luo, Yong Guo, Mingkui Tan
To address the above issues, in this paper, we propose a novel deep generative model, called Self-Consistent Generative Network (SCGN), which synthesizes novel views from the given input views without explicitly exploiting the geometric information.
no code implementations • 13 Jan 2021 • Jing Liu, Bohan Zhuang, Peng Chen, Chunhua Shen, Jianfei Cai, Mingkui Tan
By jointly training the binary gates in conjunction with network parameters, the compression configurations of each layer can be automatically determined.
1 code implementation • 4 Jan 2021 • Li Liu, Mengge He, Guanghui Xu, Mingkui Tan, Qi Wu
Typically, this requires an agent to fully understand the knowledge from the given text materials and generate correct and fluent novel paragraphs, which is very challenging in practice.
Ranked #3 on KG-to-Text Generation on AGENDA
no code implementations • 1 Jan 2021 • Yong Guo, Yaofo Chen, Yin Zheng, Peilin Zhao, Jian Chen, Junzhou Huang, Mingkui Tan
To find promising architectures under different budgets, existing methods may have to perform an independent search for each budget, which is very inefficient and unnecessary.
no code implementations • 22 Nov 2020 • Yihan Zheng, Zhiquan Wen, Mingkui Tan, Runhao Zeng, Qi Chen, YaoWei Wang, Qi Wu
Moreover, to capture the complex logic in a query, we construct a relational graph to represent the visual objects and their relationships, and propose a multi-step reasoning method to progressively understand the complex logic.
Ranked #2 on Referring Expression Comprehension on CLEVR-Ref+
1 code implementation • 27 Oct 2020 • Peihao Chen, Deng Huang, Dongliang He, Xiang Long, Runhao Zeng, Shilei Wen, Mingkui Tan, Chuang Gan
We study unsupervised video representation learning that seeks to learn both motion and appearance features from unlabeled video only, which can be reused for downstream tasks such as action recognition.
Ranked #11 on Self-Supervised Action Recognition on UCF101
no code implementations • 10 Oct 2020 • Yong Guo, Qingyao Wu, Chaorui Deng, Jian Chen, Mingkui Tan
Although the standard BN can significantly accelerate the training of DNNs and improve the generalization performance, it has several underlying limitations which may hamper the performance in both training and inference.
1 code implementation • 5 Sep 2020 • Haocong Rao, Siqi Wang, Xiping Hu, Mingkui Tan, Yi Guo, Jun Cheng, Xinwang Liu, Bin Hu
This paper proposes a self-supervised gait encoding approach that can leverage unlabeled skeleton data to learn gait representations for person Re-ID.
1 code implementation • 21 Aug 2020 • Haocong Rao, Siqi Wang, Xiping Hu, Mingkui Tan, Huang Da, Jun Cheng, Bin Hu
Unlike previous methods, we for the first time propose a generic gait encoding approach that can utilize unlabeled skeleton data to learn gait representations in a self-supervised manner.
1 code implementation • 13 Aug 2020 • Jiapeng Tang, Xiaoguang Han, Mingkui Tan, Xin Tong, Kui Jia
However, they all have their own drawbacks and cannot properly reconstruct the surface shapes of complex topologies, arguably due to a lack of constraints on the topological structures in their learning frameworks.
1 code implementation • 7 Aug 2020 • Deng Huang, Peihao Chen, Runhao Zeng, Qing Du, Mingkui Tan, Chuang Gan
In this work, we propose to represent the contents in the video as a location-aware graph by incorporating the location information of an object into the graph construction.
1 code implementation • 28 Jul 2020 • Jiezhang Cao, Yong Guo, Qingyao Wu, Chunhua Shen, Junzhou Huang, Mingkui Tan
In this paper, rather than sampling from the predefined prior distribution, we propose an LCCGAN model with local coordinate coding (LCC) to improve the performance of generating data.
1 code implementation • ECCV 2020 • Chaorui Deng, Ning Ding, Mingkui Tan, Qi Wu
We verify the merit of the proposed length level embedding on three models: two state-of-the-art (SOTA) autoregressive models with different types of decoder, as well as our proposed non-autoregressive model, to show its generalization ability.
no code implementations • 15 Jul 2020 • Shihao Zhang, Huazhu Fu, Yanwu Xu, Yanxia Liu, Mingkui Tan
Retinal image segmentation plays an important role in automatic disease diagnosis.
1 code implementation • 14 Jul 2020 • Peihao Chen, Yang Zhang, Mingkui Tan, Hongdong Xiao, Deng Huang, Chuang Gan
During testing, the audio forwarding regularizer is removed to ensure that REGNET can produce purely aligned sound only from visual features.
1 code implementation • CVPR 2021 • Peng Chen, Jing Liu, Bohan Zhuang, Mingkui Tan, Chunhua Shen
Network quantization allows inference to be conducted using low-precision arithmetic for improved inference efficiency of deep neural networks on edge devices.
1 code implementation • ICML 2020 • Yong Guo, Yaofo Chen, Yin Zheng, Peilin Zhao, Jian Chen, Junzhou Huang, Mingkui Tan
With the proposed search strategy, our Curriculum Neural Architecture Search (CNAS) method significantly improves the search efficiency and finds better architectures than existing NAS methods.
1 code implementation • 5 Jul 2020 • Yifan Zhang, Ying WEI, Qingyao Wu, Peilin Zhao, Shuaicheng Niu, Junzhou Huang, Mingkui Tan
Deep learning based medical image diagnosis has shown great potential in clinical medicine.
2 code implementations • IJCAI 2020 • Ke Xu, Yifan Zhang, Deheng Ye, Peilin Zhao, Mingkui Tan
One of the key issues is how to represent the non-stationary price series of assets in a portfolio, which is important for portfolio decisions.
no code implementations • 17 Jun 2020 • Kun Liu, Wu Liu, Huadong Ma, Mingkui Tan, Chuang Gan
Our method achieves clear improvements over state-of-the-art real-time methods on the UCF101 action recognition benchmark: 5.4% higher accuracy and 2 times faster inference, with a model that requires less than 5MB of storage.
no code implementations • 5 May 2020 • Huazhu Fu, Fei Li, Xu sun, Xingxing Cao, Jingan Liao, Jose Ignacio Orlando, Xing Tao, Yuexiang Li, Shihao Zhang, Mingkui Tan, Chenglang Yuan, Cheng Bian, Ruitao Xie, Jiongcheng Li, Xiaomeng Li, Jing Wang, Le Geng, Panming Li, Huaying Hao, Jiang Liu, Yan Kong, Yongyong Ren, Hrvoje Bogunovic, Xiulan Zhang, Yanwu Xu
To address this, we organized the Angle closure Glaucoma Evaluation challenge (AGE), held in conjunction with MICCAI 2019.
1 code implementation • 30 Apr 2020 • Yifan Zhang, Shuaicheng Niu, Zhen Qiu, Ying WEI, Peilin Zhao, Jianhua Yao, Junzhou Huang, Qingyao Wu, Mingkui Tan
There are two main challenges: 1) the discrepancy of data distributions between domains; 2) the task difference between the diagnosis of typical pneumonia and COVID-19.
1 code implementation • CVPR 2020 • Runhao Zeng, Haoming Xu, Wenbing Huang, Peihao Chen, Mingkui Tan, Chuang Gan
The key idea of this paper is to use the distances between the frame within the ground truth and the starting (ending) frame as dense supervisions to improve the video grounding accuracy.
Natural Language Moment Retrieval
Natural Language Queries
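The dense supervision idea described in this entry can be sketched as follows: every frame inside the ground-truth segment receives a regression target made of its distances to the starting and ending frames. This is a minimal illustration under assumed conventions (integer frame indices, inclusive boundaries, a hypothetical function name), not the paper's implementation.

```python
def dense_boundary_targets(num_frames, start, end):
    """Map each frame inside [start, end] to (distance to start, distance to end).

    Frames outside the ground-truth segment get no target. Indices are
    0-based and both boundaries are inclusive (an assumption of this sketch).
    """
    if not 0 <= start <= end < num_frames:
        raise ValueError("segment must lie inside the video")
    return {t: (t - start, end - t) for t in range(start, end + 1)}
```

Because every frame in the segment contributes a target, the number of training signals grows with the segment length instead of being limited to the two boundary frames.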
no code implementations • 31 Mar 2020 • Chendi Rao, JieZhang Cao, Runhao Zeng, Qi Chen, Huazhu Fu, Yanwu Xu, Mingkui Tan
In this paper, we aim to review various adversarial attack and defense methods on chest X-rays.
no code implementations • 29 Mar 2020 • Shuaicheng Niu, Jiaxiang Wu, Yifan Zhang, Yong Guo, Peilin Zhao, Junzhou Huang, Mingkui Tan
To alleviate the performance disturbance issue, we propose a new disturbance-immune update strategy for model updating.
3 code implementations • CVPR 2020 • Yong Guo, Jian Chen, Jingdong Wang, Qi Chen, JieZhang Cao, Zeshuai Deng, Yanwu Xu, Mingkui Tan
Extensive experiments with paired training data and unpaired real-world data demonstrate our superiority over existing methods.
3 code implementations • ECCV 2020 • Shoukai Xu, Haokun Li, Bohan Zhuang, Jing Liu, JieZhang Cao, Chuangrun Liang, Mingkui Tan
More critically, our method achieves much higher accuracy on 4-bit quantization than the existing data free quantization method.
Ranked #2 on Data Free Quantization on CIFAR-100
no code implementations • 6 Mar 2020 • Yifan Zhang, Peilin Zhao, Qingyao Wu, Bin Li, Junzhou Huang, Mingkui Tan
This task, however, has two main difficulties: (i) the non-stationary price series and complex asset correlations make the learning of feature representation very hard; (ii) the practicality principle in financial markets requires controlling both transaction and risk costs.
1 code implementation • CVPR 2020 • Qi Chen, Qi Wu, Rui Tang, Yu-Han Wang, Shuai Wang, Mingkui Tan
To this end, we propose a House Plan Generative Model (HPGM) that first translates the language input to a structural graph representation, then predicts the layout of rooms with a Graph Conditioned Layout Prediction Network (GC-LPN), and finally generates the interior texture with a Language Conditioned Texture GAN (LCT-GAN).
no code implementations • 1 Mar 2020 • JieZhang Cao, Langyuan Mo, Qing Du, Yong Guo, Peilin Zhao, Junzhou Huang, Mingkui Tan
However, the resultant optimization problem is still intractable.
1 code implementation • 4 Jan 2020 • Jing Liu, Bohan Zhuang, Zhuangwei Zhuang, Yong Guo, Junzhou Huang, Jinhui Zhu, Mingkui Tan
In this paper, we propose a simple-yet-effective method called discrimination-aware channel pruning (DCP) to choose the channels that actually contribute to the discriminative power.
1 code implementation • 18 Nov 2019 • Yifan Zhang, Peilin Zhao, Shuaicheng Niu, Qingyao Wu, JieZhang Cao, Junzhou Huang, Mingkui Tan
In these problems, there are two key challenges: the query budget is often limited; the ratio between classes is highly imbalanced.
1 code implementation • 17 Nov 2019 • Yifan Zhang, Ying WEI, Peilin Zhao, Shuaicheng Niu, Qingyao Wu, Mingkui Tan, Junzhou Huang
In this paper, we seek to exploit rich labeled data from relevant domains to help the learning in the target task with unsupervised domain adaptation (UDA).
3 code implementations • NeurIPS 2019 • Jiezhang Cao, Langyuan Mo, Yifan Zhang, Kui Jia, Chunhua Shen, Mingkui Tan
Multiple marginal matching problem aims at learning mappings to match a source domain to multiple target domains and it has attracted great attention in many applications, such as multi-domain image translation.
1 code implementation • NeurIPS 2019 • Yong Guo, Yin Zheng, Mingkui Tan, Qi Chen, Jian Chen, Peilin Zhao, Junzhou Huang
To verify the effectiveness of the proposed strategies, we apply NAT on both hand-crafted architectures and NAS based architectures.
no code implementations • 25 Sep 2019 • JieZhang Cao, Jincheng Li, Xiping Hu, Peilin Zhao, Mingkui Tan
ii) the $W$-distance of a specific layer to the target distribution tends to decrease along training iterations.
no code implementations • 22 Sep 2019 • Bohan Zhuang, Chunhua Shen, Mingkui Tan, Peng Chen, Lingqiao Liu, Ian Reid
Experiments on classification, semantic segmentation, and object detection tasks demonstrate the superior performance of the proposed methods over various quantized networks in the literature.
1 code implementation • ICCV 2019 • Runhao Zeng, Wenbing Huang, Mingkui Tan, Yu Rong, Peilin Zhao, Junzhou Huang, Chuang Gan
Then we apply the GCNs over the graph to model the relations among different proposals and learn powerful representations for the action classification and localization.
Ranked #4 on Temporal Action Localization on THUMOS’14 (mAP IOU@0.1 metric)
42 code implementations • 20 Aug 2019 • Jingdong Wang, Ke Sun, Tianheng Cheng, Borui Jiang, Chaorui Deng, Yang Zhao, Dong Liu, Yadong Mu, Mingkui Tan, Xinggang Wang, Wenyu Liu, Bin Xiao
High-resolution representations are essential for position-sensitive vision problems, such as human pose estimation, semantic segmentation, and object detection.
Ranked #1 on Object Detection on COCO test-dev (Hardware Burden metric)
no code implementations • 10 Aug 2019 • Bohan Zhuang, Jing Liu, Mingkui Tan, Lingqiao Liu, Ian Reid, Chunhua Shen
Furthermore, we propose a second progressive quantization scheme which gradually decreases the bit-width from high-precision to low-precision during training.
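One way to picture a progressive scheme of this kind is a bit-width schedule that steps down over training, paired with a uniform quantizer. The sketch below is an assumption-laden illustration: the linear schedule, the 8-to-2-bit range, the [0, 1] value range, and both function names are hypothetical and not taken from the paper.

```python
def bit_width_schedule(epoch, total_epochs, start_bits=8, end_bits=2):
    """Linearly decrease the bit-width from start_bits to end_bits over training."""
    if total_epochs <= 1:
        return end_bits
    fraction = epoch / (total_epochs - 1)
    return round(start_bits - fraction * (start_bits - end_bits))


def quantize_uniform(value, bits):
    """Quantize a value in [0, 1] onto a uniform grid with 2**bits levels."""
    steps = (1 << bits) - 1  # number of intervals between grid points
    clipped = min(max(value, 0.0), 1.0)
    return round(clipped * steps) / steps
```

During training, each epoch would call `bit_width_schedule` to pick the current precision and apply `quantize_uniform` with that many bits, so the network adapts to coarse grids gradually rather than all at once.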
2 code implementations • 25 Jul 2019 • Shihao Zhang, Huazhu Fu, Yuguang Yan, Yubing Zhang, Qingyao Wu, Ming Yang, Mingkui Tan, Yanwu Xu
Learning structural information is critical for producing an ideal result in retinal image segmentation.
no code implementations • 21 Jun 2019 • Fengda Zhu, Xiaojun Chang, Runhao Zeng, Mingkui Tan
We first develop an unsupervised diversity exploration method to learn task-specific skills using an unsupervised objective.
no code implementations • 25 Apr 2019 • Kui Jia, Jiehong Lin, Mingkui Tan, DaCheng Tao
Such a perspective enables us to study deep multi-view learning in the context of regularized network training, for which we present control experiments of benchmark image classification to show the efficacy of our proposed CorrReg.
1 code implementation • CVPR 2019 • Yabin Zhang, Hui Tang, Kui Jia, Mingkui Tan
Since target samples are unlabeled, we also propose a scheme of cross-domain training to help learn the target classifier.
no code implementations • CVPR 2020 • Bohan Zhuang, Lingqiao Liu, Mingkui Tan, Chunhua Shen, Ian Reid
In this paper, we seek to tackle a challenge in training low-precision networks: the notorious difficulty in propagating gradient through a low-precision network due to the non-differentiable quantization function.
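The difficulty mentioned above can be seen numerically: a quantizer is piecewise constant, so its derivative is zero almost everywhere and undefined at the jumps. The sketch below is only an illustration of that problem, not the paper's method; `quantize` and `finite_difference` are hypothetical helper names, and the 2-bit uniform quantizer on [0, 1] is an assumed example.

```python
def quantize(x, bits=2):
    """Uniform quantizer on [0, 1]; piecewise constant in x."""
    levels = 2 ** bits - 1
    x = min(1.0, max(0.0, x))
    return round(x * levels) / levels


def finite_difference(f, x, eps=1e-4):
    """Central finite-difference estimate of df/dx at x."""
    return (f(x + eps) - f(x - eps)) / (2 * eps)


if __name__ == "__main__":
    # Inside a flat region the estimated gradient vanishes.
    print(finite_difference(quantize, 0.4))
    # Across a rounding threshold the estimate blows up as eps shrinks.
    print(finite_difference(quantize, 0.5))
```

Because the estimate is either zero or very large, plain backpropagation through such a function carries no useful learning signal, which is why low-precision training needs a workaround.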
1 code implementation • 27 Mar 2019 • Yong Guo, Qi Chen, Jian Chen, Qingyao Wu, Qinfeng Shi, Mingkui Tan
To address this issue, we develop a novel GAN called Auto-Embedding Generative Adversarial Network (AEGAN), which simultaneously encodes the global structure features and captures the fine-grained details.
no code implementations • 12 Feb 2019 • Chaorui Deng, Qi Wu, Guanghui Xu, Zhuliang Yu, Yanwu Xu, Kui Jia, Mingkui Tan
Most state-of-the-art methods in VG operate in a two-stage manner: in the first stage, an object detector is adopted to generate a set of object proposals from the input image, and the second stage is simply formulated as a cross-modal matching problem that finds the best match between the language query and all region proposals.
no code implementations • CVPR 2019 • Bohan Zhuang, Chunhua Shen, Mingkui Tan, Lingqiao Liu, Ian Reid
In this paper, we propose to train convolutional neural networks (CNNs) with both binarized weights and activations, leading to quantized models specifically for mobile devices with limited power capacity and computation resources.
1 code implementation • NeurIPS 2018 • Zhuangwei Zhuang, Mingkui Tan, Bohan Zhuang, Jing Liu, Yong Guo, Qingyao Wu, Junzhou Huang, Jinhui Zhu
Channel pruning is one of the predominant approaches for deep model compression.
no code implementations • 12 Oct 2018 • Dong Gong, Mingkui Tan, Qinfeng Shi, Anton Van Den Hengel, Yanning Zhang
Compared to existing methods, MPTV is less sensitive to the choice of the trade-off parameter between data fitting and regularization.
no code implementations • 27 Sep 2018 • JieZhang Cao, Yong Guo, Langyuan Mo, Peilin Zhao, Junzhou Huang, Mingkui Tan
We study the joint distribution matching problem which aims at learning bidirectional mappings to match the joint distribution of two domains.
Open-Ended Question Answering
Unsupervised Image-To-Image Translation
no code implementations • 19 Sep 2018 • Yong Guo, Qi Chen, Jian Chen, Junzhou Huang, Yanwu Xu, JieZhang Cao, Peilin Zhao, Mingkui Tan
However, most deep learning methods employ feed-forward architectures, and thus the dependencies between LR and HR images are not fully exploited, leading to limited learning performance.
no code implementations • ICML 2018 • Jiezhang Cao, Yong Guo, Qingyao Wu, Chunhua Shen, Junzhou Huang, Mingkui Tan
Generative adversarial networks (GANs) aim to generate realistic data from some prior distribution (e.g., Gaussian noise).
no code implementations • CVPR 2018 • Chaorui Deng, Qi Wu, Qingyao Wu, Fuyuan Hu, Fan Lyu, Mingkui Tan
There are three main challenges in VG: 1) what is the main focus in a query; 2) how to understand an image; 3) how to locate an object.
no code implementations • 16 Apr 2018 • Xiang Zhang, Lina Yao, Chaoran Huang, Sen Wang, Mingkui Tan, Guodong Long, Can Wang
Multimodal wearable sensor data classification plays an important role in ubiquitous computing and has a wide range of applications in scenarios from healthcare to entertainment.
no code implementations • 6 Apr 2018 • Peilin Zhao, Yifan Zhang, Min Wu, Steven C. H. Hoi, Mingkui Tan, Junzhou Huang
Cost-Sensitive Online Classification has drawn extensive attention in recent years, where the main approach is to directly optimize, in an online manner, two well-known cost-sensitive metrics: (i) weighted sum of sensitivity and specificity; (ii) weighted misclassification cost.
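For concreteness, the two cost-sensitive metrics can be computed from confusion counts as in the sketch below. The function names, the +1/-1 label convention, and the default weights (`w_pos`, `c_pos`, `c_neg`) are assumptions chosen for illustration and are not taken from the paper.

```python
def confusion_counts(y_true, y_pred):
    """Count (tp, fn, tn, fp) for labels in {+1, -1}."""
    tp = fn = tn = fp = 0
    for t, p in zip(y_true, y_pred):
        if t == 1 and p == 1:
            tp += 1
        elif t == 1:
            fn += 1
        elif p == 1:
            fp += 1
        else:
            tn += 1
    return tp, fn, tn, fp


def weighted_sum(y_true, y_pred, w_pos=0.5):
    """w_pos * sensitivity + (1 - w_pos) * specificity (higher is better)."""
    tp, fn, tn, fp = confusion_counts(y_true, y_pred)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return w_pos * sensitivity + (1.0 - w_pos) * specificity


def weighted_cost(y_true, y_pred, c_pos=0.9, c_neg=0.1):
    """c_pos * (#false negatives) + c_neg * (#false positives) (lower is better)."""
    _, fn, _, fp = confusion_counts(y_true, y_pred)
    return c_pos * fn + c_neg * fp


if __name__ == "__main__":
    y_true = [1, 1, -1, -1, -1]
    y_pred = [1, -1, -1, -1, 1]
    print(weighted_sum(y_true, y_pred), weighted_cost(y_true, y_pred))
```

Raising `w_pos` or `c_pos` makes mistakes on the positive class count more, which is the usual choice when positives are rare.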
2 code implementations • CVPR 2018 • Bohan Zhuang, Chunhua Shen, Mingkui Tan, Lingqiao Liu, Ian Reid
This paper tackles the problem of training a deep convolutional neural network with both low-precision weights and low-bitwidth activations.
no code implementations • ICCV 2017 • Dong Gong, Mingkui Tan, Yanning Zhang, Anton Van Den Hengel, Qinfeng Shi
Rather than attempt to identify outliers to the model a priori, we instead propose to sequentially identify inliers, and gradually incorporate them into the estimation process.
1 code implementation • 6 Nov 2016 • Yong Guo, Jian Chen, Qing Du, Anton Van Den Hengel, Qinfeng Shi, Mingkui Tan
As a result, the representation power of intermediate layers can be very weak and the model becomes very redundant with limited performance.
no code implementations • CVPR 2016 • Wen Li, Dengxin Dai, Mingkui Tan, Dong Xu, Luc van Gool
The SVM+ approach has shown excellent performance in visual recognition tasks for exploiting privileged information in the training data.
no code implementations • CVPR 2016 • Dong Gong, Mingkui Tan, Yanning Zhang, Anton Van Den Hengel, Qinfeng Shi
We show here that a subset of the image gradients is adequate to estimate the blur kernel robustly, regardless of whether the gradient image is sparse.
no code implementations • CVPR 2016 • Mingkui Tan, Shijie Xiao, Junbin Gao, Dong Xu, Anton Van Den Hengel, Qinfeng Shi
Trace-norm regularization plays an important role in many areas such as machine learning and computer vision.
no code implementations • CVPR 2015 • Mingkui Tan, Qinfeng Shi, Anton Van Den Hengel, Chunhua Shen, Junbin Gao, Fuyuan Hu, Zhen Zhang
Exploiting label dependency for multi-label image classification can significantly improve classification performance.
no code implementations • 10 Mar 2015 • Mingkui Tan, Shijie Xiao, Junbin Gao, Dong Xu, Anton Van Den Hengel, Qinfeng Shi
Nuclear-norm regularization plays a vital role in many learning tasks, such as low-rank matrix recovery (MR) and low-rank representation (LRR).
no code implementations • 20 Feb 2013 • Mingkui Tan, Ivor W. Tsang, Li Wang
In Part I, a Matching Pursuit LASSO (MPL) algorithm has been presented for solving large-scale sparse recovery (SR) problems.
no code implementations • 24 Sep 2012 • Mingkui Tan, Ivor W. Tsang, Li Wang
In this paper, we present a new adaptive feature scaling scheme for ultrahigh-dimensional feature selection on Big Data.