3 code implementations • 21 Dec 2022 • Jiaxian Guo, Junnan Li, Dongxu Li, Anthony Meng Huat Tiong, Boyang Li, DaCheng Tao, Steven C. H. Hoi
To address this issue, we propose Img2Prompt, a plug-and-play module that provides the prompts that can bridge the aforementioned modality and task disconnections, so that LLMs can perform zero-shot VQA tasks without end-to-end training.
1 code implementation • 23 Jul 2018 • Ya Li, Mingming Gong, Xinmei Tian, Tongliang Liu, DaCheng Tao
With the conditional invariant representation, the invariance of the joint distribution $\mathbb{P}(h(X), Y)$ can be guaranteed if the class prior $\mathbb{P}(Y)$ does not change across training and test domains.
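The statement above follows from factorizing the joint distribution. As a brief sketch, with superscripts $s$ and $t$ (notation assumed here for illustration, not taken from the paper) denoting the training and test domains:

```latex
\begin{align}
\mathbb{P}^{s}(h(X), Y) &= \mathbb{P}^{s}(h(X) \mid Y)\,\mathbb{P}^{s}(Y) \\
                        &= \mathbb{P}^{t}(h(X) \mid Y)\,\mathbb{P}^{t}(Y) \\
                        &= \mathbb{P}^{t}(h(X), Y)
\end{align}
```

The middle step uses both conditions at once: conditional invariance gives $\mathbb{P}^{s}(h(X) \mid Y) = \mathbb{P}^{t}(h(X) \mid Y)$, and the unchanged class prior gives $\mathbb{P}^{s}(Y) = \mathbb{P}^{t}(Y)$.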
5 code implementations • 26 Apr 2022 • Yufei Xu, Jing Zhang, Qiming Zhang, DaCheng Tao
In this paper, we show the surprisingly good capabilities of plain vision transformers for pose estimation from various aspects, namely simplicity in model structure, scalability in model size, flexibility in training paradigm, and transferability of knowledge between models, through a simple baseline model called ViTPose.
Ranked #1 on Pose Estimation on COCO test-dev
1 code implementation • 7 Dec 2022 • Yufei Xu, Jing Zhang, Qiming Zhang, DaCheng Tao
In this paper, we show the surprisingly good properties of plain vision transformers for body pose estimation from various aspects, namely simplicity in model structure, scalability in model size, flexibility in training paradigm, and transferability of knowledge between models, through a simple baseline model dubbed ViTPose.
Ranked #1 on Animal Pose Estimation on AP-10K (using extra training data)
1 code implementation • 13 Jan 2023 • Shiye Lei, DaCheng Tao
Dataset distillation, a dataset reduction method, addresses this problem by synthesizing a small typical dataset from substantial data and has attracted much attention from the deep learning community.
2 code implementations • NeurIPS 2019 • Yixing Xu, Yunhe Wang, Hanting Chen, Kai Han, Chunjing Xu, DaCheng Tao, Chang Xu
In practice, only a small portion of the original training set is required as positive examples and more useful training examples can be obtained from the massive unlabeled data on the cloud through a PU classifier with an attention-based multi-scale feature extractor.
1 code implementation • 30 Oct 2020 • Jizhizi Li, Jing Zhang, Stephen J. Maybank, DaCheng Tao
Furthermore, we provide a benchmark containing 2,000 high-resolution real-world animal images and 10,000 portrait images along with their manually labeled alpha mattes to serve as a test bed for evaluating matting models' generalization ability on real-world images.
Ranked #2 on Image Matting on AM-2K
4 code implementations • CVPR 2022 • Haofei Xu, Jing Zhang, Jianfei Cai, Hamid Rezatofighi, DaCheng Tao
Learning-based optical flow estimation has been dominated by the pipeline of cost volume with convolutions for flow regression, which is inherently limited to local correlations and thus struggles to address the long-standing challenge of large displacements.
Ranked #8 on Optical Flow Estimation on Spring
1 code implementation • 10 Nov 2022 • Haofei Xu, Jing Zhang, Jianfei Cai, Hamid Rezatofighi, Fisher Yu, DaCheng Tao, Andreas Geiger
We present a unified formulation and model for three motion and 3D perception tasks: optical flow, rectified stereo matching and unrectified stereo depth estimation from posed images.
Ranked #1 on Optical Flow Estimation on Sintel-clean
4 code implementations • NeurIPS 2023 • Hanting Chen, Yunhe Wang, Jianyuan Guo, DaCheng Tao
In this study, we introduce VanillaNet, a neural network architecture that embraces elegance in design.
1 code implementation • 28 Jun 2023 • Jianzong Wu, Xiangtai Li, Shilin Xu, Haobo Yuan, Henghui Ding, Yibo Yang, Xia Li, Jiangning Zhang, Yunhai Tong, Xudong Jiang, Bernard Ghanem, DaCheng Tao
To our knowledge, this is the first comprehensive literature review of open vocabulary learning.
2 code implementations • NeurIPS 2023 • Di Wang, Jing Zhang, Bo Du, Minqiang Xu, Lin Liu, DaCheng Tao, Liangpei Zhang
In this study, we leverage SAM and existing RS object detection datasets to develop an efficient pipeline for generating a large-scale RS segmentation dataset, dubbed SAMRS.
1 code implementation • 29 Nov 2023 • Wenquan Lu, Yufei Xu, Jing Zhang, Chaoyue Wang, DaCheng Tao
Given a generated image that fails due to malformed hands, we utilize ControlNet modules to re-inject the correct hand information.
4 code implementations • 21 Feb 2022 • Qiming Zhang, Yufei Xu, Jing Zhang, DaCheng Tao
Vision transformers have shown great potential in various computer vision tasks owing to their strong capability to model long-range dependency using the self-attention mechanism.
Ranked #2 on Image Classification on ImageNet ReaL
5 code implementations • CVPR 2018 • Huan Fu, Mingming Gong, Chaohui Wang, Kayhan Batmanghelich, DaCheng Tao
These methods model depth estimation as a regression problem and train the regression networks by minimizing mean squared error, which suffers from slow convergence and unsatisfactory local solutions.
Ranked #13 on Depth Estimation on NYU-Depth V2
7 code implementations • CVPR 2019 • Zhou Yu, Jun Yu, Yuhao Cui, DaCheng Tao, Qi Tian
In this paper, we propose a deep Modular Co-Attention Network (MCAN) that consists of Modular Co-Attention (MCA) layers cascaded in depth.
Ranked #5 on Question Answering on SQA3D
2 code implementations • 6 Apr 2022 • Di Wang, Jing Zhang, Bo Du, Gui-Song Xia, DaCheng Tao
To this end, we train different networks from scratch with the help of the largest RS scene recognition dataset up to now -- MillionAID, to obtain a series of RS pretrained backbones, including both convolutional neural networks (CNN) and vision transformers such as Swin and ViTAE, which have shown promising performance on computer vision tasks.
Ranked #1 on Aerial Scene Classification on UCM (80% as trainset)
Aerial Scene Classification • Building change detection for remote sensing images • +5
2 code implementations • 8 Aug 2022 • Di Wang, Qiming Zhang, Yufei Xu, Jing Zhang, Bo Du, DaCheng Tao, Liangpei Zhang
Large-scale vision foundation models have made significant progress in visual tasks on natural images, with vision transformers being the primary choice due to their good scalability and representation ability.
Ranked #1 on Aerial Scene Classification on AID (50% as trainset)
1 code implementation • 19 Mar 2023 • Kang Liao, Lang Nie, Shujuan Huang, Chunyu Lin, Jing Zhang, Yao Zhao, Moncef Gabbouj, DaCheng Tao
In this paper, we provide a comprehensive survey of learning-based camera calibration techniques, by analyzing their strengths and limitations.
1 code implementation • 15 Jul 2021 • Jizhizi Li, Jing Zhang, DaCheng Tao
To address the problem, a novel end-to-end matting network is proposed, which can predict a generalized trimap for any image of the above types as a unified semantic representation.
Ranked #2 on Image Matting on AIM-500
1 code implementation • 10 Jul 2022 • Xiangtai Li, Jiangning Zhang, Yibo Yang, Guangliang Cheng, Kuiyuan Yang, Yunhai Tong, DaCheng Tao
In this paper, we focus on exploring effective methods for faster, accurate, and domain-agnostic semantic segmentation.
1 code implementation • CVPR 2020 • Jingyuan Li, Ning Wang, Lefei Zhang, Bo Du, DaCheng Tao
To capture information from distant places in the feature map for RFR, we further develop KCA and incorporate it in RFR.
6 code implementations • CVPR 2021 • Zhaohui Yang, Yunhe Wang, Xinghao Chen, Jianyuan Guo, Wei Zhang, Chao Xu, Chunjing Xu, DaCheng Tao, Chang Xu
To achieve an extremely fast NAS while preserving the high accuracy, we propose to identify the vital blocks and make them the priority in the architecture search.
4 code implementations • NeurIPS 2020 • Yehui Tang, Yunhe Wang, Yixing Xu, DaCheng Tao, Chunjing Xu, Chao Xu, Chang Xu
To increase the reliability of the results, we prefer to have a more rigorous research design by including a scientific control group as an essential part to minimize the effect of all factors except the association between the filter and expected network output.
7 code implementations • CVPR 2021 • Yehui Tang, Yunhe Wang, Yixing Xu, Yiping Deng, Chao Xu, DaCheng Tao, Chang Xu
Then, the manifold relationship between instances and the pruned sub-networks will be aligned in the training procedure.
1 code implementation • 29 Apr 2021 • Jizhizi Li, Sihan Ma, Jing Zhang, DaCheng Tao
We systematically evaluate both trimap-free and trimap-based matting methods on P3M-10k and find that existing matting methods show different generalization capabilities when following the Privacy-Preserving Training (PPT) setting, i.e., training on face-blurred images and testing on arbitrary images.
Ranked #3 on Image Matting on P3M-10k
1 code implementation • 4 Dec 2021 • Yikai Wang, Fuchun Sun, Wenbing Huang, Fengxiang He, DaCheng Tao
For the application of dense image prediction, the validity of CEN is tested by four different scenarios: multimodal fusion, cycle multimodal fusion, multitask learning, and multimodal multitask learning.
Ranked #7 on Semantic Segmentation on LLRGBD-synthetic
1 code implementation • 15 Jul 2022 • Shengchao Hu, Li Chen, Penghao Wu, Hongyang Li, Junchi Yan, DaCheng Tao
In particular, we propose a spatial-temporal feature learning scheme towards a set of more representative features for perception, prediction and planning tasks simultaneously, which is called ST-P3.
Ranked #7 on Bird's-Eye View Semantic Segmentation on nuScenes (IoU ped - 224x480 - Vis filter. - 100x100 at 0.5 metric)
1 code implementation • 22 Feb 2023 • Junjie Yang, Tianlong Chen, Mingkang Zhu, Fengxiang He, DaCheng Tao, Yingbin Liang, Zhangyang Wang
While the optimizer generalization has been recently studied, the optimizee generalization (or learning to generalize) has not been rigorously studied in the L2O context, which is the aim of this paper.
2 code implementations • 28 Jun 2017 • Chaoyue Wang, Chang Xu, Chaohui Wang, DaCheng Tao
The proposed PAN consists of two feed-forward convolutional neural networks (CNNs), the image transformation network T and the discriminative network D. Through combining the generative adversarial loss and the proposed perceptual adversarial loss, these two networks can be trained alternately to solve image-to-image transformation tasks.
2 code implementations • NeurIPS 2021 • Yufei Xu, Qiming Zhang, Jing Zhang, DaCheng Tao
Nevertheless, vision transformers treat an image as a 1D sequence of visual tokens, lacking an intrinsic inductive bias (IB) in modeling local visual structures and dealing with scale variance.
Ranked #2 on Video Object Segmentation on DAVIS 2017
3 code implementations • 28 Jan 2016 • Bolun Cai, Xiangmin Xu, Kui Jia, Chunmei Qing, DaCheng Tao
The key to achieving haze removal is to estimate a medium transmission map for an input hazy image.
Ranked #7 on Image Dehazing on RS-Haze
1 code implementation • CVPR 2022 • Zhi Hou, Baosheng Yu, DaCheng Tao
We perform extensive experiments on over ten datasets and the proposed method achieves significant improvements on different data scarcity applications without any bells and whistles, including the tasks of long-tailed recognition, compositional zero-shot learning, domain generalization, and contrastive learning.
Ranked #17 on Long-tail Learning on iNaturalist 2018
1 code implementation • 4 Apr 2022 • Zhi Hou, Baosheng Yu, Chaoyue Wang, Yibing Zhan, DaCheng Tao
Specifically, when applying the proposed module, it employs a two-stream pipeline during training, i.e., either with or without a BatchFormerV2 module, where the batchformer stream can be removed for testing.
1 code implementation • 29 Feb 2024 • Jianbin Zheng, Minghui Hu, Zhongyi Fan, Chaoyue Wang, Changxing Ding, DaCheng Tao, Tat-Jen Cham
Consequently, we introduce Trajectory Consistency Distillation (TCD), which encompasses trajectory consistency function and strategic stochastic sampling.
2 code implementations • CVPR 2023 • Hongwei Yi, Hualin Liang, Yifei Liu, Qiong Cao, Yandong Wen, Timo Bolkart, DaCheng Tao, Michael J. Black
This work addresses the problem of generating 3D holistic body motions from human speech.
Ranked #2 on Gesture Generation on BEAT2
1 code implementation • 20 Feb 2024 • Xiaohan Xu, Ming Li, Chongyang Tao, Tao Shen, Reynold Cheng, Jinyang Li, Can Xu, DaCheng Tao, Tianyi Zhou
In the era of Large Language Models (LLMs), Knowledge Distillation (KD) emerges as a pivotal methodology for transferring advanced capabilities from leading proprietary LLMs, such as GPT-4, to their open-source counterparts like LLaMA and Mistral.
2 code implementations • 30 Nov 2020 • Yufei Xu, Jing Zhang, Stephen J. Maybank, DaCheng Tao
In this paper, we attempt to tackle the video stabilization problem in a deep unsupervised learning manner, which borrows the divide-and-conquer idea from traditional stabilizers while leveraging the representation power of DNNs to handle the challenges in real-world scenarios.
3 code implementations • 10 Jul 2022 • Maoyuan Ye, Jing Zhang, Shanshan Zhao, Juhua Liu, Bo Du, DaCheng Tao
However, these methods built upon the detection transformer framework might achieve sub-optimal training efficiency and performance due to coarse positional query modeling. In addition, the point label form exploited in previous works implies the reading order of humans, which impedes the detection robustness from our observation.
Ranked #3 on Scene Text Detection on SCUT-CTW1500
2 code implementations • CVPR 2023 • Maoyuan Ye, Jing Zhang, Shanshan Zhao, Juhua Liu, Tongliang Liu, Bo Du, DaCheng Tao
In this paper, we present DeepSolo, a simple DETR-like baseline that lets a single Decoder with Explicit Points Solo for text detection and recognition simultaneously.
Ranked #1 on Text Spotting on Total-Text (using extra training data)
2 code implementations • 31 May 2023 • Maoyuan Ye, Jing Zhang, Shanshan Zhao, Juhua Liu, Tongliang Liu, Bo Du, DaCheng Tao
In this paper, we present DeepSolo++, a simple DETR-like baseline that lets a single decoder with explicit points solo for text detection, recognition, and script identification simultaneously.
Ranked #1 on Text Spotting on Inverse-Text
1 code implementation • CVPR 2023 • Jizhizi Li, Jing Zhang, DaCheng Tao
Different from conventional image matting, which either requires user-defined scribbles/trimap to extract a specific foreground object or directly extracts all the foreground objects in the image indiscriminately, we introduce a new task named Referring Image Matting (RIM) in this paper, which aims to extract the meticulous alpha matte of the specific object that best matches the given natural language description, thus enabling a more natural and simpler instruction for image matting.
Ranked #1 on Referring Image Matting (RefMatte-RW100) on RefMatte
3 code implementations • 13 Jan 2022 • Qianyu Zhou, Xiangtai Li, Lu He, Yibo Yang, Guangliang Cheng, Yunhai Tong, Lizhuang Ma, DaCheng Tao
Detection Transformer (DETR) and Deformable DETR have been proposed to eliminate the need for many hand-designed components in object detection while demonstrating performance as good as previous complex hand-crafted detectors.
Ranked #4 on Video Object Detection on ImageNet VID (using extra training data)
1 code implementation • 19 Feb 2023 • Qihuang Zhong, Liang Ding, Juhua Liu, Bo Du, DaCheng Tao
Recently, ChatGPT has attracted great attention, as it can generate fluent and high-quality responses to human inquiries.
1 code implementation • 11 Jul 2022 • Sen Zhang, Jing Zhang, DaCheng Tao
Unsupervised monocular depth and ego-motion estimation has drawn extensive research attention in recent years.
2 code implementations • 10 Aug 2017 • Zhou Yu, Jun Yu, Chenchao Xiang, Jianping Fan, DaCheng Tao
For fine-grained image and question representations, a 'co-attention' mechanism is developed by using a deep neural network architecture to jointly learn the attentions for both the image and the question, which can allow us to reduce the irrelevant features effectively and obtain more discriminative features for image and question representations.
6 code implementations • ICCV 2017 • Zhou Yu, Jun Yu, Jianping Fan, DaCheng Tao
For multi-modal feature fusion, here we develop a Multi-modal Factorized Bilinear (MFB) pooling approach to efficiently and effectively combine multi-modal features, which results in superior performance for VQA compared with other bilinear pooling approaches.
3 code implementations • 9 Feb 2015 • Yong Luo, DaCheng Tao, Yonggang Wen, Kotagiri Ramamohanarao, Chao Xu
As a consequence, the high order correlation information contained in the different views is explored and thus a more reliable common subspace shared by all features can be obtained.
1 code implementation • CVPR 2018 • Chao Li, Cheng Deng, Ning Li, Wei Liu, Xinbo Gao, DaCheng Tao
In addition, we harness a self-supervised semantic network to discover high-level semantic information in the form of multi-label annotations.
1 code implementation • ECCV 2018 • Yongcheng Jing, Yang Liu, Yezhou Yang, Zunlei Feng, Yizhou Yu, DaCheng Tao, Mingli Song
In this paper, we present a stroke controllable style transfer network that can achieve continuous and spatial stroke size control.
1 code implementation • 2 Apr 2019 • Zhe Chen, Jing Zhang, DaCheng Tao
To this end, LiDAR sensor data can be incorporated to improve the visual image-based road detection, because LiDAR data is less susceptible to visual noises.
1 code implementation • 10 Apr 2023 • Jizhizi Li, Jing Zhang, DaCheng Tao
Image matting refers to extracting precise alpha matte from natural images, and it plays a critical role in various downstream applications, such as image editing.
1 code implementation • CVPR 2023 • Dingfeng Shi, Yujie Zhong, Qiong Cao, Lin Ma, Jia Li, DaCheng Tao
In this paper, we present a one-stage framework TriDet for temporal action detection.
Ranked #2 on Temporal Action Localization on EPIC-KITCHENS-100
3 code implementations • 11 Sep 2023 • Dingfeng Shi, Qiong Cao, Yujie Zhong, Shan An, Jian Cheng, Haogang Zhu, DaCheng Tao
Temporal action detection (TAD) aims to detect all action boundaries and their corresponding categories in an untrimmed video.
Ranked #1 on Temporal Action Localization on MultiTHUMOS
2 code implementations • 18 Apr 2022 • Qiming Zhang, Yufei Xu, Jing Zhang, DaCheng Tao
Attention within windows has been widely explored in vision transformers to balance the performance, computation complexity, and memory footprint.
1 code implementation • 11 Sep 2021 • Shiyu Tang, Ruihao Gong, Yan Wang, Aishan Liu, Jiakai Wang, Xinyun Chen, Fengwei Yu, Xianglong Liu, Dawn Song, Alan Yuille, Philip H. S. Torr, DaCheng Tao
Thus, we propose RobustART, the first comprehensive Robustness investigation benchmark on ImageNet regarding ARchitecture design (49 human-designed off-the-shelf architectures and 1200+ networks from neural architecture search) and Training techniques (10+ techniques, e.g., data augmentation) towards diverse noises (adversarial, natural, and system noises).
1 code implementation • NeurIPS 2019 • Qiming Zhang, Jing Zhang, Wei Liu, DaCheng Tao
Although there has been progress in matching the marginal distributions between two domains, the classifier favors the source domain features and makes incorrect predictions on the target domain due to category-agnostic feature alignment.
Ranked #24 on Image-to-Image Translation on SYNTHIA-to-Cityscapes
1 code implementation • CVPR 2022 • Liyao Tang, Yibing Zhan, Zhe Chen, Baosheng Yu, DaCheng Tao
Point cloud segmentation is fundamental in understanding 3D environments.
Ranked #17 on Semantic Segmentation on S3DIS
1 code implementation • CVPR 2019 • Shanshan Zhao, Huan Fu, Mingming Gong, DaCheng Tao
Supervised depth estimation has achieved high accuracy due to the advanced deep network architectures.
Ranked #68 on Monocular Depth Estimation on KITTI Eigen split
1 code implementation • 29 Sep 2023 • Hao Liu, Jiarui Feng, Lecheng Kong, Ningyue Liang, DaCheng Tao, Yixin Chen, Muhan Zhang
For in-context learning on graphs, OFA introduces a novel graph prompting paradigm that appends prompting substructures to the input graph, which enables it to address varied tasks without fine-tuning.
2 code implementations • 9 Dec 2020 • Shaoli Huang, Xinchao Wang, DaCheng Tao
As the main discriminative information of a fine-grained image usually resides in subtle regions, methods along this line are prone to heavy label noise in fine-grained recognition.
Ranked #31 on Fine-Grained Image Classification on CUB-200-2011
4 code implementations • 28 Aug 2021 • Hang Yu, Yufei Xu, Jing Zhang, Wei Zhao, Ziyu Guan, DaCheng Tao
The experimental results provide sound empirical evidence on the superiority of learning from diverse animal species in terms of both accuracy and generalization ability.
4 code implementations • 12 Jun 2022 • Yuxiang Yang, Junjie Yang, Yufei Xu, Jing Zhang, Long Lan, DaCheng Tao
Based on APT-36K, we benchmark several representative models on the following three tracks: (1) supervised animal pose estimation on a single frame under intra- and inter-domain transfer learning settings, (2) inter-species domain generalization test for unseen animals, and (3) animal pose estimation with animal tracking.
1 code implementation • 31 Jan 2024 • Maoyuan Ye, Jing Zhang, Juhua Liu, Chenyu Liu, BaoCai Yin, Cong Liu, Bo Du, DaCheng Tao
In terms of the AMG mode, Hi-SAM segments text stroke foreground masks initially, then samples foreground points for hierarchical text mask generation and achieves layout analysis in passing.
Ranked #1 on Hierarchical Text Segmentation on HierText
1 code implementation • 27 Mar 2023 • Qiming Zhang, Jing Zhang, Yufei Xu, DaCheng Tao
Window-based attention has become a popular choice in vision transformers due to its superior performance, lower computational complexity, and smaller memory footprint.
3 code implementations • 23 Nov 2021 • Haoyu He, Jianfei Cai, Jing Liu, Zizheng Pan, Jing Zhang, DaCheng Tao, Bohan Zhuang
Relying on the single-path space, we introduce learnable binary gates to encode the operation choices in MSA layers.
Ranked #18 on Efficient ViTs on ImageNet-1K (with DeiT-T)
1 code implementation • 15 Apr 2022 • Chuang Liu, Yibing Zhan, Jia Wu, Chang Li, Bo Du, Wenbin Hu, Tongliang Liu, DaCheng Tao
Graph neural networks have emerged as a leading architecture for many graph-level tasks, such as graph classification and graph generation.
2 code implementations • 4 Dec 2017 • Jun Yu, Xingxin Xu, Fei Gao, Shengjie Shi, Meng Wang, DaCheng Tao, Qingming Huang
Experimental results show that our method is capable of generating both visually comfortable and identity-preserving face sketches/photos over a wide range of challenging data.
Ranked #1 on Face Sketch Synthesis on CUFS (FID metric)
1 code implementation • ICCV 2021 • Wenyuan Xue, Baosheng Yu, Wen Wang, DaCheng Tao, Qingyong Li
A table arranging data in rows and columns is a very effective data structure, which has been widely used in business and scientific research.
1 code implementation • 27 Jul 2021 • Wen Wang, Yang Cao, Jing Zhang, Fengxiang He, Zheng-Jun Zha, Yonggang Wen, DaCheng Tao
In DQFA, a novel domain query is used to aggregate and align global context from the token sequence of both domains.
1 code implementation • 7 Jun 2021 • Jie Gui, Xiaofeng Cong, Yuan Cao, Wenqi Ren, Jun Zhang, Jing Zhang, Jiuxin Cao, DaCheng Tao
With the development of convolutional neural networks, hundreds of deep learning based dehazing methods have been proposed.
1 code implementation • 20 Mar 2024 • Di Wang, Jing Zhang, Minqiang Xu, Lin Liu, Dongsheng Wang, Erzhong Gao, Chengxi Han, HaoNan Guo, Bo Du, DaCheng Tao, Liangpei Zhang
However, transferring the pretrained models to downstream tasks may encounter task discrepancy due to their formulation of pretraining as image classification or object discrimination tasks.
Ranked #1 on Semantic Segmentation on SpaceNet 1 (using extra training data)
Aerial Scene Classification • Building change detection for remote sensing images • +12
1 code implementation • 6 Jan 2022 • Chen Chen, Zhe Chen, Jing Zhang, DaCheng Tao
We observe that the prevailing set abstraction design for down-sampling points may maintain too much unimportant background information that can affect feature learning for detecting objects.
1 code implementation • 19 Apr 2023 • Kunping Huang, Sen Zhang, Jing Zhang, DaCheng Tao
This paper presents a timely and comprehensive review of event-based vSLAM algorithms that exploit the benefits of asynchronous and irregular event streams for localization and mapping tasks.
1 code implementation • 10 Apr 2022 • Shilin Xu, Xiangtai Li, Jingbo Wang, Guangliang Cheng, Yunhai Tong, DaCheng Tao
This work focuses on joint human fashion segmentation and attribute recognition.
1 code implementation • 12 Dec 2022 • Haibin He, Xinyuan Chen, Chaoyue Wang, Juhua Liu, Bo Du, DaCheng Tao, Yu Qiao
Specifically, a large stroke-wise dataset is constructed, and a stroke-wise diffusion model is proposed to preserve the structure and the completion of each generated character.
1 code implementation • 26 Mar 2024 • Gan Pei, Jiangning Zhang, Menghan Hu, Zhenyu Zhang, Chengjie Wang, Yunsheng Wu, Guangtao Zhai, Jian Yang, Chunhua Shen, DaCheng Tao
Deepfake is a technology dedicated to creating highly realistic facial images and videos under specific conditions, which has significant application potential in fields such as entertainment, movie production, and digital human creation, to name a few.
1 code implementation • 20 Dec 2022 • Qingyu Lu, Liang Ding, Liping Xie, Kanjian Zhang, Derek F. Wong, DaCheng Tao
To this end, we augment BARTScore by incorporating the human-like error analysis strategies, namely BARTScore++, where the final score consists of both the evaluations of major errors and minor errors.
1 code implementation • 24 Mar 2023 • Qingyu Lu, Baopu Qiu, Liang Ding, Kanjian Zhang, Tom Kocmi, DaCheng Tao
To further improve the performance of LLMs on MT quality assessment, we investigate several prompting designs, and propose a new prompting method called Error Analysis Prompting (EAPrompt) by combining Chain-of-Thoughts (Wei et al., 2022) and Error Analysis (Lu et al., 2023).
1 code implementation • CVPR 2022 • Yang Yang, Chaoyue Wang, Risheng Liu, Lin Zhang, Xiaojie Guo, DaCheng Tao
With estimated scene depth, our method is capable of re-rendering hazy images with different thicknesses which further benefits the training of the dehazing network.
2 code implementations • 20 Jul 2023 • Mang Ye, Xiuwen Fang, Bo Du, Pong C. Yuen, DaCheng Tao
Therefore, a systematic survey on this topic about the research challenges and state-of-the-art is essential.
4 code implementations • ECCV 2020 • Zhi Hou, Xiaojiang Peng, Yu Qiao, DaCheng Tao
The integration of decomposition and composition enables VCL to share object and verb features among different HOI samples and images, and to generate new interaction samples and new types of HOI, and thus largely alleviates the long-tail distribution problem and benefits low-shot or zero-shot HOI detection.
Ranked #3 on Affordance Recognition on HICO-DET(Unknown Concepts)
1 code implementation • CVPR 2021 • Zhi Hou, Baosheng Yu, Yu Qiao, Xiaojiang Peng, DaCheng Tao
With the proposed object fabricator, we are able to generate large-scale HOI samples for rare and unseen categories to alleviate the open long-tailed issues in HOI detection.
Ranked #4 on Affordance Recognition on HICO-DET
2 code implementations • CVPR 2021 • Zhi Hou, Baosheng Yu, Yu Qiao, Xiaojiang Peng, DaCheng Tao
The proposed method can thus be used to 1) improve the performance of HOI detection, especially for the HOIs with unseen objects; and 2) infer the affordances of novel objects.
Ranked #2 on Affordance Recognition on HICO-DET(Unknown Concepts)
2 code implementations • 27 Mar 2022 • Zhi Hou, Baosheng Yu, DaCheng Tao
Therefore, the proposed method enables the learning on both known and unknown HOI concepts.
Affordance Recognition • Human-Object Interaction Concept Discovery • +1
2 code implementations • 17 Mar 2022 • Wen Wang, Jing Zhang, Yang Cao, Yongliang Shen, DaCheng Tao
Besides, we introduce a simple yet effective label augmentation method to provide richer supervision and improve data efficiency.
1 code implementation • 31 Mar 2022 • Sihan Ma, Jizhizi Li, Jing Zhang, He Zhang, DaCheng Tao
P3M-10k consists of 10,421 high-resolution face-blurred portrait images along with high-quality alpha mattes, which enables us to systematically evaluate both trimap-free and trimap-based matting methods and obtain some useful findings about model generalization ability under the privacy-preserving training (PPT) setting.
Ranked #1 on Image Matting on P3M-10k
1 code implementation • 16 Jul 2022 • Haimei Zhao, Jing Zhang, Sen Zhang, DaCheng Tao
A naive way is to accomplish them independently in a sequential or parallel manner, but there are many drawbacks, i.e., 1) the depth and VO results suffer from the inherent scale ambiguity issue; 2) the BEV layout is directly predicted from the front-view image without using any depth-related information, although the depth map contains useful geometry clues for inferring scene layouts.
1 code implementation • ICLR 2023 • Yibo Yang, Haobo Yuan, Xiangtai Li, Zhouchen Lin, Philip Torr, DaCheng Tao
In this paper, we deal with this misalignment dilemma in FSCIL inspired by the recently discovered phenomenon named neural collapse, which reveals that the last-layer features of the same class will collapse into a vertex, and the vertices of all classes are aligned with the classifier prototypes, which are formed as a simplex equiangular tight frame (ETF).
Ranked #3 on Few-Shot Class-Incremental Learning on CUB-200-2011
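The simplex equiangular tight frame mentioned in this entry has a standard closed form, which a short numerical check can make concrete. The following is a minimal sketch, not the authors' implementation: the function name `simplex_etf` is hypothetical, and the identity matrix is assumed as the rotation so that each prototype lives in a space of dimension equal to the number of classes.

```python
import math


def simplex_etf(num_classes):
    """Return num_classes prototype vectors forming a simplex ETF.

    Assumption for this sketch: the rotation is the identity, so each
    prototype has num_classes coordinates and no external library is needed.
    """
    if num_classes < 2:
        raise ValueError("a simplex ETF needs at least two classes")
    k = num_classes
    scale = math.sqrt(k / (k - 1))
    # Prototype i is scale * (e_i - (1/k) * ones), with e_i the i-th basis vector.
    return [
        [scale * ((1.0 if row == col else 0.0) - 1.0 / k) for col in range(k)]
        for row in range(k)
    ]


def dot(u, v):
    """Plain inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


prototypes = simplex_etf(5)
# Every prototype should have unit length.
norms = [math.sqrt(dot(p, p)) for p in prototypes]
# Every distinct pair should share the same inner product, -1 / (k - 1).
pairwise = [
    dot(prototypes[i], prototypes[j])
    for i in range(5)
    for j in range(i + 1, 5)
]
```

With five classes, every prototype has unit norm and every distinct pair has inner product -1/4, the maximal equal separation that a set of five unit vectors can achieve.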
1 code implementation • 6 Feb 2023 • Yibo Yang, Haobo Yuan, Xiangtai Li, Zhouchen Lin, Philip Torr, DaCheng Tao
In this paper, we deal with this misalignment dilemma in FSCIL inspired by the recently discovered phenomenon named neural collapse, which reveals that the last-layer features of the same class will collapse into a vertex, and the vertices of all classes are aligned with the classifier prototypes, which are formed as a simplex equiangular tight frame (ETF).
1 code implementation • 24 Mar 2023 • Keqin Peng, Liang Ding, Qihuang Zhong, Li Shen, Xuebo Liu, Min Zhang, Yuanxin Ouyang, DaCheng Tao
We show that: 1) The performance of ChatGPT depends largely on temperature, and a lower temperature usually can achieve better performance; 2) Emphasizing the task information can further improve ChatGPT's performance, particularly in complex MT tasks; 3) Introducing domain information can elicit ChatGPT's generalization ability and improve its performance in the specific domain; 4) ChatGPT tends to generate hallucinations for non-English-centric MT tasks, which can be partially addressed by our proposed prompts but still need to be highlighted for the MT/NLP community.
2 code implementations • 3 Aug 2023 • Yibo Yang, Haobo Yuan, Xiangtai Li, Jianlong Wu, Lefei Zhang, Zhouchen Lin, Philip Torr, DaCheng Tao, Bernard Ghanem
Beyond the normal case, long-tail class incremental learning and few-shot class incremental learning are also proposed to consider the data imbalance and data scarcity, respectively, which are common in real-world implementations and further exacerbate the well-known problem of catastrophic forgetting.
3 code implementations • 1 Mar 2018 • Chaoyue Wang, Chang Xu, Xin Yao, DaCheng Tao
In this paper, we propose a novel GAN framework called evolutionary generative adversarial networks (E-GAN) for stable GAN training and improved generative performance.
1 code implementation • CVPR 2019 • Huan Fu, Mingming Gong, Chaohui Wang, Kayhan Batmanghelich, Kun Zhang, DaCheng Tao
Unsupervised domain mapping aims to learn a function $G_{XY}$ that translates domain $X$ to $Y$ in the absence of paired examples.
1 code implementation • NAACL 2019 • Yu Cao, Meng Fang, DaCheng Tao
Graph convolutional networks are used to obtain a relation-aware representation of nodes for entity graphs built from documents with multi-level features.
1 code implementation • 22 Dec 2020 • Haoyu He, Jing Zhang, Bhavani Thuraisingham, DaCheng Tao
In this paper, we devise a novel Progressive One-shot Parsing network (POPNet) to address two critical challenges, i.e., testing bias and small sizes.
1 code implementation • 4 May 2021 • Haoyu He, Bohan Zhuang, Jing Zhang, Jianfei Cai, DaCheng Tao
To address three main challenges in OSHP, i.e., small sizes, testing bias, and similar parts, we devise an End-to-end One-shot human Parsing Network (EOP-Net).
1 code implementation • 3 Aug 2021 • Bo Du, Jian Ye, Jing Zhang, Juhua Liu, DaCheng Tao
Existing methods for arbitrary-shaped text detection in natural scenes face two critical issues, i.e., 1) fracture detections at the gaps in a text instance; and 2) inaccurate detections of arbitrary-shaped text instances with diverse background context.
Ranked #5 on Scene Text Detection on SCUT-CTW1500
1 code implementation • 18 Aug 2022 • Yi-Fan Zhang, Jindong Wang, Jian Liang, Zhang Zhang, Baosheng Yu, Liang Wang, DaCheng Tao, Xing Xie
Our bound motivates two strategies to reduce the gap: the first one is ensembling multiple classifiers to enrich the hypothesis space, then we propose effective gap estimation methods for guiding the selection of a better hypothesis for the target.
1 code implementation • 25 Aug 2020 • Shanshan Zhao, Mingming Gong, Huan Fu, DaCheng Tao
Furthermore, considering the multi-modality of input data, we exploit the graph propagation on the two modalities respectively to extract multi-modal representations.
1 code implementation • 10 Oct 2022 • Guozheng Ma, Zhen Wang, Zhecheng Yuan, Xueqian Wang, Bo Yuan, DaCheng Tao
Visual reinforcement learning (RL), which makes decisions directly from high-dimensional visual inputs, has demonstrated significant potential in various domains.
1 code implementation • 27 Nov 2019 • Haoyu He, Jing Zhang, Qiming Zhang, DaCheng Tao
In this paper, we propose a novel GRAph PYramid Mutual Learning (Grapy-ML) method to address the cross-dataset human parsing problem, where the annotations are at different granularities.
1 code implementation • 1 Jun 2022 • Rong Dai, Li Shen, Fengxiang He, Xinmei Tian, DaCheng Tao
In this work, we propose a novel personalized federated learning framework in a decentralized (peer-to-peer) communication protocol named Dis-PFL, which employs personalized sparse masks to customize sparse local models on the edge.
1 code implementation • NeurIPS 2019 • Yali Du, Lei Han, Meng Fang, Ji Liu, Tianhong Dai, DaCheng Tao
A great challenge in cooperative decentralized multi-agent reinforcement learning (MARL) is generating diversified behaviors for each individual agent when receiving only a team reward.
1 code implementation • CVPR 2020 • Xin Lin, Changxing Ding, Jinquan Zeng, DaCheng Tao
There are three key properties of scene graph that have been underexplored in recent works: namely, the edge direction information, the difference in priority between nodes, and the long-tailed distribution of relationships.
Ranked #5 on Scene Graph Generation on Visual Genome
1 code implementation • NeurIPS 2020 • Xiaobo Xia, Tongliang Liu, Bo Han, Nannan Wang, Mingming Gong, Haifeng Liu, Gang Niu, DaCheng Tao, Masashi Sugiyama
Learning with instance-dependent label noise is challenging, because it is hard to model such real-world noise.
1 code implementation • NeurIPS 2020 • Shanshan Zhao, Mingming Gong, Tongliang Liu, Huan Fu, DaCheng Tao
To arrive at this, some methods introduce a domain discriminator through adversarial learning to match the feature distributions in multiple source domains.
Ranked #43 on Domain Generalization on PACS
1 code implementation • CVPR 2021 • Song Guo, Jingya Wang, Xinchao Wang, DaCheng Tao
On the other hand, such reliable embeddings can boost identity-awareness through memory aggregation, hence strengthen attention modules and suppress drifts.
2 code implementations • 28 Jun 2021 • Hongchen Luo, Wei Zhai, Jing Zhang, Yang Cao, DaCheng Tao
To empower robots with this ability in unseen scenarios, we consider the challenging one-shot affordance detection problem in this paper, i.e., given a support image that depicts the action purpose, all objects in a scene with the common affordance should be detected.
1 code implementation • 8 Aug 2021 • Wei Zhai, Hongchen Luo, Jing Zhang, Yang Cao, DaCheng Tao
To empower robots with this ability in unseen scenarios, we first study the challenging one-shot affordance detection problem in this paper, i.e., given a support image that depicts the action purpose, all objects in a scene with the common affordance should be detected.
1 code implementation • 27 Jul 2021 • Zefeng Ding, Changxing Ding, Zhiyin Shao, DaCheng Tao
Third, we introduce a Compound Ranking (CR) loss that makes use of textual descriptions for other images of the same identity to provide extra supervision, thereby effectively reducing the intra-class variance in textual features.
Ranked #1 on Image Retrieval on ICFG-PEDES
1 code implementation • 5 Dec 2021 • Haobo Yuan, Xiangtai Li, Yibo Yang, Guangliang Cheng, Jing Zhang, Yunhai Tong, Lefei Zhang, DaCheng Tao
The Depth-aware Video Panoptic Segmentation (DVPS) is a new challenging vision problem that aims to predict panoptic segmentation and depth in a video simultaneously.
1 code implementation • 17 Jul 2023 • Shiye Lei, Hao Chen, Sen Zhang, Bo Zhao, DaCheng Tao
With the rapid development of Artificial Intelligence Generated Content (AIGC), it has become common practice in many learning tasks to train or fine-tune large models on synthetic data due to the data-scarcity and privacy leakage problems.
2 code implementations • CVPR 2023 • Haoyu He, Jianfei Cai, Zizheng Pan, Jing Liu, Jing Zhang, DaCheng Tao, Bohan Zhuang
In this paper, we propose a simple yet effective query design for semantic segmentation termed Dynamic Focus-aware Positional Queries (DFPQ), which dynamically generates positional queries conditioned on the cross-attention scores from the preceding decoder block and the positional encodings for the corresponding image features, simultaneously.
Ranked #21 on Semantic Segmentation on ADE20K
1 code implementation • CVPR 2020 • Yiding Yang, Jiayan Qiu, Mingli Song, DaCheng Tao, Xinchao Wang
To enable the knowledge transfer from the teacher GCN to the student, we propose a local structure preserving module that explicitly accounts for the topological semantics of the teacher.
1 code implementation • ICCV 2021 • Haibo Qiu, Baosheng Yu, Dihong Gong, Zhifeng Li, Wei Liu, DaCheng Tao
We then analyze the underlying causes behind the performance gap, e.g., the poor intra-class variations and the domain gap between synthetic and real face images.
1 code implementation • 6 Jun 2019 • Zhou Yu, Dejing Xu, Jun Yu, Ting Yu, Zhou Zhao, Yueting Zhuang, DaCheng Tao
It is both crucial and natural to extend this research direction to the video domain for video question answering (VideoQA).
Ranked #29 on Video Question Answering on ActivityNet-QA
1 code implementation • 27 Sep 2019 • Yuqing Ma, Xianglong Liu, Shihao Bai, Lei Wang, Aishan Liu, DaCheng Tao, Edwin Hancock
To address these problems, we propose a generic inpainting framework capable of handling incomplete images with both continuous and discontinuous large missing areas, in an adversarial manner.
1 code implementation • 10 Aug 2020 • Jing Zhang, Yang Cao, Zheng-Jun Zha, DaCheng Tao
To address this issue, we propose a novel synthetic method called 3R to simulate nighttime hazy images from daytime clear images, which first reconstructs the scene geometry, then simulates the light rays and object reflectance, and finally renders the haze effects.
1 code implementation • 10 Apr 2022 • Xiangtai Li, Shilin Xu, Yibo Yang, Guangliang Cheng, Yunhai Tong, DaCheng Tao
To the best of our knowledge, we are the first to solve the PPS problem via a unified and end-to-end transformer model.
1 code implementation • 3 Jan 2023 • Xiangtai Li, Shilin Xu, Yibo Yang, Haobo Yuan, Guangliang Cheng, Yunhai Tong, Zhouchen Lin, Ming-Hsuan Yang, DaCheng Tao
Third, inspired by Mask2Former, based on our meta-architecture, we propose Panoptic-PartFormer++ and design a new part-whole cross-attention scheme to boost part segmentation qualities further.
1 code implementation • ICCV 2023 • Haoyu He, Jianfei Cai, Jing Zhang, DaCheng Tao, Bohan Zhuang
Visual Parameter-Efficient Fine-Tuning (PEFT) has become a powerful alternative to full fine-tuning for adapting pre-trained vision models to downstream tasks: it tunes only a small number of parameters while freezing the vast majority, which eases the storage burden and optimization difficulty.
1 code implementation • 12 Dec 2023 • Jiangning Zhang, Xuhai Chen, Yabiao Wang, Chengjie Wang, Yong Liu, Xiangtai Li, Ming-Hsuan Yang, DaCheng Tao
Following this spirit, this paper explores plain ViT architecture for MUAD.
1 code implementation • 16 Apr 2024 • Jiangning Zhang, Chengjie Wang, Xiangtai Li, Guanzhong Tian, Zhucun Xue, Yong Liu, Guansong Pang, DaCheng Tao
Moreover, current metrics such as AU-ROC have nearly reached saturation on simple datasets, which prevents a comprehensive evaluation of different methods.
3 code implementations • 20 Jun 2022 • Xudong Tian, Zhizhong Zhang, Cong Wang, Wensheng Zhang, Yanyun Qu, Lizhuang Ma, Zongze Wu, Yuan Xie, DaCheng Tao
Information Bottleneck (IB) based multi-view learning provides an information theoretic principle for seeking shared information contained in heterogeneous data descriptions.
2 code implementations • 1 Sep 2020 • Baosheng Yu, DaCheng Tao
Previous methods to overcome the sub-pixel localization problem usually rely on high-resolution heatmaps.
1 code implementation • 30 Jun 2018 • Yu Liu, Guanlong Zhao, Boyuan Gong, Yang Li, Ritu Raj, Niraj Goel, Satya Kesav, Sandeep Gottimukkala, Zhangyang Wang, Wenqi Ren, DaCheng Tao
Here we explore two related but important tasks based on the recently released REalistic Single Image DEhazing (RESIDE) benchmark dataset: (i) single image dehazing as a low-level image restoration problem; and (ii) high-level visual understanding (e.g., object detection) of hazy images.
1 code implementation • 6 Jun 2021 • Jian Cheng, Ziyang Liu, Hao Guan, Zhenzhou Wu, Haogang Zhu, Jiyang Jiang, Wei Wen, DaCheng Tao, Tao Liu
In this paper, a novel 3D convolutional network, called two-stage-age-network (TSAN), is proposed to estimate brain age from T1-weighted MRI data.
1 code implementation • 13 Jan 2023 • Jie Gui, Tuo Chen, Jing Zhang, Qiong Cao, Zhenan Sun, Hao Luo, DaCheng Tao
Deep supervised learning algorithms typically require a large volume of labeled data to achieve satisfactory performance.
1 code implementation • ECCV 2018 • Xiyu Yu, Tongliang Liu, Mingming Gong, DaCheng Tao
We therefore reason that the transition probabilities will be different.
1 code implementation • CVPR 2021 • Xinqi Zhu, Chang Xu, DaCheng Tao
We thus impose a perturbation on a certain dimension of the latent code, and expect to identify the perturbation along this dimension from the generated images so that the encoding of simple variations can be enforced.
1 code implementation • 19 Jan 2022 • Chunhui Zhang, Guanjie Huang, Li Liu, Shan Huang, Yinan Yang, Xiang Wan, Shiming Ge, DaCheng Tao
In this work, we propose WebUAV-3M, the largest public UAV tracking benchmark to date, to facilitate both the development and evaluation of deep UAV trackers.
1 code implementation • 27 Nov 2022 • Minghui Hu, Chuanxia Zheng, Heliang Zheng, Tat-Jen Cham, Chaoyue Wang, Zuopeng Yang, DaCheng Tao, Ponnuthurai N. Suganthan
The recently developed discrete diffusion models perform extraordinarily well in the text-to-image task, showing significant promise for handling the multi-modality signals.
1 code implementation • 10 Jan 2021 • Yiding Yang, Xinchao Wang, Mingli Song, Junsong Yuan, DaCheng Tao
SPAGAN therefore allows for a more informative and intact exploration of the graph structure and further a more effective aggregation of information from distant neighbors into the center node, as compared to node-based GCN methods.
1 code implementation • 12 Oct 2023 • Hongling Zheng, Li Shen, Anke Tang, Yong Luo, Han Hu, Bo Du, DaCheng Tao
LFM focuses on the research, modification, and design of FM based on the model interface, so as to better understand the model structure and weights (in a black box environment), and to generalize the model to downstream tasks.
1 code implementation • IJCAI 2018 • Erkun Yang, Cheng Deng, Tongliang Liu, Wei Liu, DaCheng Tao
Hashing is becoming increasingly popular for approximate nearest neighbor searching in massive databases due to its storage and search efficiency.
1 code implementation • 8 Mar 2022 • Jun Rao, Fei Wang, Liang Ding, Shuhan Qi, Yibing Zhan, Weifeng Liu, DaCheng Tao
In contrast to previous works, we focus on the reproducibility of the approaches and the examination of the elements that lead to improved performance by pretrained and nonpretrained models in retrieving images and text.
1 code implementation • 13 Jan 2022 • Qihuang Zhong, Liang Ding, Juhua Liu, Bo Du, Hua Jin, DaCheng Tao
To this end, we propose a knowledge graph augmented network KGAN, which aims to effectively incorporate external knowledge with explicitly syntactic and contextual information.
2 code implementations • 28 Mar 2022 • Shaopeng Fu, Fengxiang He, Yang Liu, Li Shen, DaCheng Tao
To address this concern, methods are proposed to make data unlearnable for deep learning models by adding a type of error-minimizing noise.
1 code implementation • 20 Sep 2022 • Changtong Zan, Keqin Peng, Liang Ding, Baopu Qiu, Boan Liu, Shwai He, Qingyu Lu, Zheng Zhang, Chuang Liu, Weifeng Liu, Yibing Zhan, DaCheng Tao
As for model sizes, we scale the Transformer-Big up to an extremely large model with nearly 4.7 billion parameters, to fully enhance the model capacity for our Vega-MT.
Ranked #1 on Machine Translation on WMT 2022 English-Russian
2 code implementations • 19 Apr 2023 • Di Wang, Jing Zhang, Bo Du, Liangpei Zhang, DaCheng Tao
Hyperspectral image (HSI) classification is challenging due to spatial variability caused by complex imaging conditions.
1 code implementation • CVPR 2022 • Zhe Chen, Jing Zhang, DaCheng Tao
Then, a glimpse-based decoder is introduced to provide refined detection results based on both the glimpse features and the attention modeling outputs of the previous stage.
Ranked #1 on Object Detection on MS COCO (GFlops metric)
1 code implementation • 17 Mar 2022 • Yibo Yang, Shixiang Chen, Xiangtai Li, Liang Xie, Zhouchen Lin, DaCheng Tao
Modern deep neural networks for classification usually jointly learn a backbone for representation and a linear classifier to output the logit of each class.
Ranked #26 on Long-tail Learning on CIFAR-10-LT (ρ=100)
1 code implementation • 21 Aug 2021 • Haibo Qiu, Dihong Gong, Zhifeng Li, Wei Liu, DaCheng Tao
However, the state-of-the-art general face recognition models do not generalize well to occluded face images, which are exactly the common cases in real-world scenarios.
2 code implementations • CVPR 2022 • Hongchen Luo, Wei Zhai, Jing Zhang, Yang Cao, DaCheng Tao
To empower an agent with such ability, this paper proposes a task of affordance grounding from exocentric view, i.e., given exocentric human-object interaction and egocentric object images, learning the affordance knowledge of the object and transferring it to the egocentric image using only the affordance label as supervision.
2 code implementations • 28 Aug 2022 • Hongchen Luo, Wei Zhai, Jing Zhang, Yang Cao, DaCheng Tao
Due to the diversity of interactive affordance, the uniqueness of different individuals leads to diverse interactions, which makes it difficult to establish an explicit link between object parts and affordance labels.
1 code implementation • 6 Jul 2022 • Haibo Qiu, Baosheng Yu, DaCheng Tao
However, recent projection-based methods for point cloud semantic segmentation usually utilize a vanilla late fusion strategy for the predictions of different views, failing to explore the complementary information from a geometric perspective during the representation learning.
Ranked #1 on Robust 3D Semantic Segmentation on nuScenes-C
1 code implementation • 14 Jul 2022 • Dingfeng Shi, Yujie Zhong, Qiong Cao, Jing Zhang, Lin Ma, Jia Li, DaCheng Tao
Moreover, we propose two losses to facilitate and stabilize the training of action classification.
Ranked #15 on Temporal Action Localization on THUMOS’14
1 code implementation • 11 Oct 2022 • Peng Mi, Li Shen, Tianhe Ren, Yiyi Zhou, Xiaoshuai Sun, Rongrong Ji, DaCheng Tao
One of the popular solutions is Sharpness-Aware Minimization (SAM), which smooths the loss landscape via minimizing the maximized change of training loss when adding a perturbation to the weight.
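The SAM update described in the abstract above can be sketched on a toy problem. This is a minimal illustration of the generic two-step SAM rule (perturb the weights along the normalized gradient, then descend using the gradient taken at the perturbed point). It is not the method proposed in this entry, and `sam_step`, `quad_grad`, `quad_loss`, `lr`, and `rho` are hypothetical names chosen for the example.

```python
import math


def sam_step(weights, grad_fn, lr=0.1, rho=0.05):
    """One Sharpness-Aware Minimization step on a list of weights.

    grad_fn maps a weight list to the gradient of the training loss.
    rho is the radius of the weight perturbation (an assumed default).
    """
    grad = grad_fn(weights)
    norm = math.sqrt(sum(g * g for g in grad))
    if norm == 0.0:
        # Stationary point: there is no ascent direction to perturb along.
        return list(weights)
    # Step 1: move to the approximate worst-case point within radius rho.
    perturbed = [w + rho * g / norm for w, g in zip(weights, grad)]
    # Step 2: descend from the original weights using the perturbed gradient.
    sharp_grad = grad_fn(perturbed)
    return [w - lr * g for w, g in zip(weights, sharp_grad)]


def quad_loss(w):
    """Toy loss 0.5 * (w0^2 + 10 * w1^2), sharp along the second axis."""
    return 0.5 * (w[0] ** 2 + 10.0 * w[1] ** 2)


def quad_grad(w):
    """Gradient of quad_loss."""
    return [w[0], 10.0 * w[1]]


start = [1.0, 1.0]
updated = sam_step(start, quad_grad)
```

Each step costs two gradient evaluations, which is the overhead that later SAM variants typically try to reduce.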
2 code implementations • 22 May 2019 • Sicheng Wang, Bihan Wen, Junru Wu, DaCheng Tao, Zhangyang Wang
Several recent works discussed application-driven image restoration neural networks, which are capable of not only removing noise in images but also preserving their semantic-aware details, making them suitable for various high-level computer vision tasks as the pre-processing step.
1 code implementation • 11 Jun 2019 • Jing Zhang, DaCheng Tao
Single image dehazing is a critical image pre-processing step for subsequent high-level computer vision tasks.
1 code implementation • NeurIPS 2019 • Tingting Qiao, Jing Zhang, Duanqing Xu, DaCheng Tao
Given a text description, we immediately imagine an overall visual impression using this prior and, based on this, we draw a picture by progressively adding more and more details.
1 code implementation • 12 Nov 2020 • Zeke Xie, Fengxiang He, Shaopeng Fu, Issei Sato, DaCheng Tao, Masashi Sugiyama
Thus it motivates us to design a similar mechanism named artificial neural variability (ANV), which helps artificial neural networks learn some advantages from "natural" neural networks.
1 code implementation • 11 Jan 2019 • Chongyi Li, Chunle Guo, Wenqi Ren, Runmin Cong, Junhui Hou, Sam Kwong, DaCheng Tao
In this paper, we construct an Underwater Image Enhancement Benchmark (UIEB) including 950 real-world underwater images, 890 of which have the corresponding reference images.
Ranked #5 on Underwater Image Restoration on LSUI (using extra training data)
2 code implementations • 24 Nov 2021 • Yufei Xu, Qiming Zhang, Jing Zhang, DaCheng Tao
In this paper, we make the first attempt to demonstrate the importance of both regions in cropping from a complete perspective and propose a simple yet effective pretext task called Region Contrastive Learning (RegionCL).
1 code implementation • NeurIPS 2020 • Youjian Zhang, Chaoyue Wang, DaCheng Tao
However, in complicated real-world situations, the temporal priors of videos, i.e., frames per second (FPS) and frame exposure time, may vary across different camera sensors.
1 code implementation • 5 Jun 2019 • Chenhong Zhou, Changxing Ding, Xinchao Wang, Zhentai Lu, DaCheng Tao
The model cascade (MC) strategy significantly alleviates the class imbalance issue via running a set of individual deep models for coarse-to-fine segmentation.
Ranked #1 on Brain Tumor Segmentation on BRATS-2015
1 code implementation • 6 Oct 2020 • Youjian Zhang, Chaoyue Wang, Stephen J. Maybank, DaCheng Tao
However, the motion information contained in a blurry image has yet to be fully explored and accurately formulated because: (i) the ground truth of dynamic motion is difficult to obtain; (ii) the temporal ordering is destroyed during the exposure; and (iii) the motion estimation from a blurry image is highly ill-posed.
1 code implementation • 25 Jun 2022 • Tongtian Zhu, Fengxiang He, Lan Zhang, Zhengyang Niu, Mingli Song, DaCheng Tao
Our theory indicates that the generalizability of D-SGD is positively correlated with the spectral gap, and can explain why consensus control in initial training phase can ensure better generalization.
1 code implementation • 29 Jun 2023 • Sihan Ma, Qiong Cao, Hongwei Yi, Jing Zhang, DaCheng Tao
Demystifying complex human-ground interactions is essential for accurate and realistic 3D human motion reconstruction from RGB videos, as it ensures consistency between the humans and the ground plane.
1 code implementation • Findings of the Association for Computational Linguistics 2020 • Yu Cao, Wei Bi, Meng Fang, DaCheng Tao
In this work, we study dialogue models with multiple input sources adapted from the pretrained language model GPT2.
1 code implementation • 19 Jun 2022 • Jiangning Zhang, Xiangtai Li, Yabiao Wang, Chengjie Wang, Yibo Yang, Yong Liu, DaCheng Tao
Motivated by biological evolution, this paper explains the rationality of Vision Transformer by analogy with the proven practical Evolutionary Algorithm (EA) and derives that both have consistent mathematical formulation.
1 code implementation • CVPR 2023 • Yifan Shi, Yingqi Liu, Kang Wei, Li Shen, Xueqian Wang, DaCheng Tao
Specifically, DP-FedSAM integrates Sharpness Aware Minimization (SAM) optimizer to generate local flatness models with better stability and weight perturbation robustness, which results in the small norm of local updates and robustness to DP noise, thereby improving the performance.
1 code implementation • 1 May 2023 • Yifan Shi, Kang Wei, Li Shen, Yingqi Liu, Xueqian Wang, Bo Yuan, DaCheng Tao
To defend against inference attacks and mitigate sensitive information leakage in Federated Learning (FL), client-level Differentially Private FL (DPFL) is the de-facto standard for privacy protection by clipping local updates and adding random noise.
1 code implementation • NeurIPS 2020 • Zhaohui Yang, Yunhe Wang, Kai Han, Chunjing Xu, Chao Xu, DaCheng Tao, Chang Xu
Quantized neural networks with low-bit weights and activations are attractive for developing AI accelerators.
1 code implementation • CVPR 2021 • Xubin Zhong, Xian Qu, Changxing Ding, DaCheng Tao
In this paper, we propose a novel one-stage method, namely Glance and Gaze Network (GGNet), which adaptively models a set of action-aware points (ActPoints) via glance and gaze steps.
Ranked #17 on Human-Object Interaction Detection on V-COCO
1 code implementation • CVPR 2022 • Lin Zhang, Li Shen, Liang Ding, DaCheng Tao, Ling-Yu Duan
Instead, we propose a data-free knowledge distillation method to fine-tune the global model in the server (FedFTG), which relieves the issue of direct model aggregation.
1 code implementation • CVPR 2023 • Xu Zhang, Wen Wang, Zhe Chen, Yufei Xu, Jing Zhang, DaCheng Tao
Motivated by the progress of visual-language research, we propose that pre-trained language models (e.g., CLIP) can facilitate animal pose estimation by providing rich prior knowledge for describing animal keypoints in text.
3 code implementations • CVPR 2023 • Sanqing Qu, Tianpei Zou, Florian Roehrbein, Cewu Lu, Guang Chen, DaCheng Tao, Changjun Jiang
We examine the superiority of our GLC on multiple benchmarks with different category shift scenarios, including partial-set, open-set, and open-partial-set DA.
Ranked #2 on Universal Domain Adaptation on VisDA2017
2 code implementations • 21 Mar 2024 • Sanqing Qu, Tianpei Zou, Florian Röhrbein, Cewu Lu, Guang Chen, DaCheng Tao, Changjun Jiang
GLC++ enhances the novel category clustering accuracy of GLC by 4.3% in open-set scenarios on Office-Home.
2 code implementations • CVPR 2019 • Tingting Qiao, Jing Zhang, Duanqing Xu, DaCheng Tao
Generating an image from a given text description has two goals: visual realism and semantic consistency.
Ranked #8 on Text-to-Image Generation on CUB (Inception score metric)
1 code implementation • 20 Oct 2020 • Yuxuan Du, Tao Huang, Shan You, Min-Hsiu Hsieh, DaCheng Tao
Variational quantum algorithms (VQAs) are expected to be a path to quantum advantages on noisy intermediate-scale quantum devices.
1 code implementation • 25 Dec 2020 • Jun Yu, Hao Zhou, Yibing Zhan, DaCheng Tao
Essentially, DGCPN addresses the inaccurate similarity problem by exploring and exploiting the data's intrinsic relationships in a graph.
1 code implementation • ICCV 2021 • Yufei Xu, Jing Zhang, DaCheng Tao
However, since the view outside the boundary is not available during warping, the resulting holes around the boundary of the stabilized frame must be discarded (i.e., cropping) to maintain visual consistency, which leads to a tradeoff between stability and cropping ratio.
1 code implementation • 6 Apr 2022 • Sanqing Qu, Guang Chen, Jing Zhang, Zhijun Li, Wei He, DaCheng Tao
Source-free Domain Adaptation (SFDA) aims to adapt a pre-trained source model to the unlabeled target domain without accessing the well-labeled source data, which is a much more practical setting due to the data privacy, security, and transmission issues.
1 code implementation • 21 Feb 2023 • Yan Sun, Li Shen, Tiansheng Huang, Liang Ding, DaCheng Tao
Federated learning is an emerging distributed machine learning framework which jointly trains a global model via a large number of local devices with data privacy protections.
1 code implementation • 19 May 2023 • Yan Sun, Li Shen, Shixiang Chen, Liang Ding, DaCheng Tao
In federated learning (FL), a cluster of local clients are chaired under the coordination of the global server and cooperatively train one model with privacy protection.
1 code implementation • 3 Feb 2020 • Jing Zhang, Zhe Chen, DaCheng Tao
Human keypoint detection from a single image is very challenging due to occlusion, blur, illumination and scale variance.
Ranked #5 on Pose Estimation on COCO test-dev
1 code implementation • 25 Apr 2020 • Zhou Yu, Yuhao Cui, Jun Yu, Meng Wang, DaCheng Tao, Qi Tian
Most existing works focus on a single task and design neural architectures manually, which are highly task-specific and hard to generalize to different tasks.
Ranked #19 on Visual Question Answering (VQA) on VQA v2 test-std
2 code implementations • 4 Apr 2019 • Xinyuan Chen, Chang Xu, Xiaokang Yang, Li Song, DaCheng Tao
We propose adversarial gated networks (Gated GAN) to transfer multiple styles in a single model.
1 code implementation • 28 Jul 2021 • Xiangtai Li, Hao He, Yibo Yang, Henghui Ding, Kuiyuan Yang, Guangliang Cheng, Yunhai Tong, DaCheng Tao
To incorporate both temporal and scale information, we propose a Temporal Pyramid Routing (TPR) strategy to conditionally align and conduct pixel-level aggregation from a feature pyramid pair of two adjacent frames.
3 code implementations • CVPR 2022 • Yu Feng, Benteng Ma, Jing Zhang, Shanshan Zhao, Yong Xia, DaCheng Tao
However, designing a unified BA method that can be applied to various MIA systems is challenging due to the diversity of imaging modalities (e.g., X-Ray, CT, and MRI) and analysis tasks (e.g., classification, detection, and segmentation).
1 code implementation • 31 Aug 2023 • Zehao Dong, Weidong Cao, Muhan Zhang, DaCheng Tao, Yixin Chen, Xuan Zhang
The electronic design automation of analog circuits has been a longstanding challenge in the integrated circuit field due to the huge design space and complex design trade-offs among circuit specifications.
1 code implementation • 18 Feb 2024 • Xikun Zhang, Dongjin Song, DaCheng Tao
To bridge the gap, we provide a comprehensive review of existing continual graph learning (CGL) algorithms by elucidating the different task settings and categorizing the existing methods based on their characteristics.
1 code implementation • 9 May 2018 • Zhou Yu, Jun Yu, Chenchao Xiang, Zhou Zhao, Qi Tian, DaCheng Tao
Visual grounding aims to localize an object in an image referred to by a textual query phrase.
Ranked #9 on Phrase Grounding on Flickr30k Entities Test
1 code implementation • CVPR 2020 • Tianyu Guo, Chang Xu, Jiajun Huang, Yunhe Wang, Boxin Shi, Chao Xu, DaCheng Tao
In contrast, it is more reasonable to treat the generated data as unlabeled, which could be positive or negative according to their quality.
1 code implementation • 7 Jun 2021 • Xinqi Zhu, Chang Xu, DaCheng Tao
Instead, we propose to encode the data variations with groups, a structure that can not only equivariantly represent variations but also be adaptively optimized to preserve the properties of data variations.
1 code implementation • 20 Jul 2021 • Li Gao, Jing Zhang, Lefei Zhang, DaCheng Tao
In addition, feature-level alignment is carried out by aligning the feature maps of the source and target images from the student network using a weighted maximum mean discrepancy loss.
Ranked #18 on Synthetic-to-Real Translation on SYNTHIA-to-Cityscapes
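The maximum mean discrepancy (MMD) loss named in the abstract above can be illustrated with a minimal sketch. It computes the plain, unweighted, biased MMD² estimate with a Gaussian kernel between two small sets of feature vectors. The weighting scheme used in the paper is not described in this listing and is therefore not reproduced; `gaussian_kernel`, `mmd_squared`, and `sigma` are hypothetical names.

```python
import math


def gaussian_kernel(a, b, sigma=1.0):
    """Gaussian (RBF) kernel between two equal-length feature vectors."""
    squared_distance = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-squared_distance / (2.0 * sigma ** 2))


def mmd_squared(source, target, sigma=1.0):
    """Biased estimate of MMD^2 between two sets of feature vectors.

    MMD^2 = E[k(s, s')] + E[k(t, t')] - 2 * E[k(s, t)], with each
    expectation replaced by the mean over all pairs.
    """
    def mean_kernel(xs, ys):
        total = sum(gaussian_kernel(x, y, sigma) for x in xs for y in ys)
        return total / (len(xs) * len(ys))

    return (
        mean_kernel(source, source)
        + mean_kernel(target, target)
        - 2.0 * mean_kernel(source, target)
    )


source_features = [[0.0, 0.0], [1.0, 1.0]]
target_features = [[5.0, 5.0], [6.0, 6.0]]
gap = mmd_squared(source_features, target_features)
```

The estimate is zero when both sets are identical and grows as the two feature distributions move apart, which is why minimizing it pulls source and target features together.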
1 code implementation • 15 Oct 2023 • Shwai He, Run-Ze Fan, Liang Ding, Li Shen, Tianyi Zhou, DaCheng Tao
Although a sparse Mixture of Experts (MoE) can reduce the cost by activating a small subset of parameters (e.g., one expert) for each input, its computation escalates significantly as the number of activated experts increases, limiting its practical utility.
1 code implementation • ICCV 2017 • Lei Huang, Xianglong Liu, Yang Liu, Bo Lang, DaCheng Tao
Training deep neural networks is difficult because of the pathological curvature problem.
1 code implementation • 19 Jun 2020 • Elija Perrier, Christopher Ferrie, DaCheng Tao
Our results demonstrate how geometric control techniques can be used to both (a) verify the extent to which geometrically synthesised quantum circuits lie along geodesic, and thus time-optimal, routes and (b) synthesise those circuits.
1 code implementation • ECCV 2020 • Yu-Tong Cao, Jingya Wang, DaCheng Tao
The current state-of-the-art methods either focus on learning better cross-modal embeddings by mining only seen data, or they explicitly use generative adversarial networks (GANs) to synthesize unseen features.
1 code implementation • NeurIPS 2020 • Benteng Ma, Jing Zhang, Yong Xia, DaCheng Tao
Attention modules have been demonstrated to be effective in strengthening the representation ability of a neural network via reweighting spatial or channel features or stacking both operations sequentially.
1 code implementation • COLING 2022 • Bing Wang, Liang Ding, Qihuang Zhong, Ximing Li, DaCheng Tao
Aspect-based sentiment analysis (ABSA) is a fine-grained sentiment analysis task, which focuses on detecting the sentiment polarity towards the aspect in a sentence.
1 code implementation • 12 Dec 2017 • Boyi Li, Wenqi Ren, Dengpan Fu, DaCheng Tao, Dan Feng, Wen-Jun Zeng, Zhangyang Wang
We present a comprehensive study and evaluation of existing single image dehazing algorithms, using a new large-scale benchmark consisting of both synthetic and real-world hazy images, called REalistic Single Image DEhazing (RESIDE).
1 code implementation • 4 Jun 2020 • Shengcong Chen, Changxing Ding, DaCheng Tao
Accordingly, in this paper, we devise a Boundary-assisted Region Proposal Network (BRP-Net) that achieves robust instance-level nucleus segmentation.
2 code implementations • 7 Aug 2020 • Xubin Zhong, Changxing Ding, Xian Qu, DaCheng Tao
To address this issue, in this paper, we propose a novel Polysemy Deciphering Network (PD-Net) that decodes the visual polysemy of verbs for HOI detection in three distinct ways.
Ranked #19 on Human-Object Interaction Detection on V-COCO
1 code implementation • ECCV 2020 • Xubin Zhong, Changxing Ding, Xian Qu, DaCheng Tao
First, PD-Net augments human pose and spatial features for HOI detection using language priors, enabling the verb classifiers to receive language hints that reduce the intra-class variation of the same verb.
1 code implementation • ACL 2022 • Yu Cao, Wei Bi, Meng Fang, Shuming Shi, DaCheng Tao
To alleviate the above data issues, we propose a data manipulation method, which is model-agnostic to be packed with any persona-based dialogue generation model to improve its performance.
1 code implementation • 11 Jun 2022 • Wei Li, Qiming Zhang, Jing Zhang, Zhen Huang, Xinmei Tian, DaCheng Tao
To address these issues, we establish a new high-quality dataset named RealRain-1k, consisting of 1,120 high-resolution paired clean and rainy images with low- and high-density rain streaks, respectively.