no code implementations • 26 May 2023 • Gaole Dai, Wei Wu, Ziyu Wang, Jie Fu, Shanghang Zhang, Tiejun Huang
By incorporating hand-designed optimizers as the second component in our hybrid approach, we are able to retain the benefits of learned optimizers while stabilizing the training process and, more importantly, improving testing performance.
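To make the idea concrete, here is a minimal sketch of such a hybrid step, assuming a hypothetical `learned_update` callable standing in for the learned optimizer; the blend weight `alpha`, the learning rate, and all names are illustrative rather than the paper's actual formulation:

```python
import torch

def hybrid_step(param, grad, learned_update, alpha=0.5, lr=1e-3):
    """Blend a learned optimizer's proposed update with plain SGD.

    learned_update is a hypothetical callable (e.g., a small MLP/RNN)
    mapping a gradient to an update direction; alpha trades off the
    learned component against the hand-designed SGD component.
    """
    with torch.no_grad():
        learned = learned_update(grad)   # update proposed by the learned optimizer
        sgd = -lr * grad                 # hand-designed SGD update
        param += alpha * learned + (1.0 - alpha) * sgd
    return param

# Toy usage with a stand-in "learned" optimizer.
p = hybrid_step(torch.zeros(3), torch.ones(3), lambda g: -1e-3 * torch.tanh(g))
```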
no code implementations • 21 May 2023 • Yijia Zhang, Lingran Zhao, Shijie Cao, WenQiang Wang, Ting Cao, Fan Yang, Mao Yang, Shanghang Zhang, Ningyi Xu
In this study, we conduct a comparative analysis of INT and FP quantization with the same bit-width, revealing that the optimal quantization format varies across different layers due to the complexity and diversity of tensor distribution.
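The layer-wise INT-vs-FP comparison can be illustrated with a small NumPy experiment; the E2M1 value set and the scales below are illustrative assumptions, not the paper's exact setup:

```python
import numpy as np

def quantize_to_grid(x, grid):
    """Round each element of x to its nearest representable value in grid."""
    idx = np.abs(x[:, None] - grid[None, :]).argmin(axis=1)
    return grid[idx]

int4_grid = 0.5 * np.arange(-8, 8)  # 16 uniform INT4 levels (illustrative scale)
fp4_mags = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])  # E2M1 magnitudes
fp4_grid = np.concatenate([-fp4_mags, fp4_mags])

x = np.random.standard_t(4, size=10000)  # heavy-tailed "activation" tensor
for name, grid in [("INT4", int4_grid), ("FP4", fp4_grid)]:
    mse = np.mean((x - quantize_to_grid(x, np.sort(grid))) ** 2)
    print(name, "MSE:", mse)  # which format wins depends on the distribution
```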
1 code implementation • CVPR 2023 • Yuheng Lu, Chenfeng Xu, Xiaobao Wei, Xiaodong Xie, Masayoshi Tomizuka, Kurt Keutzer, Shanghang Zhang
In this paper, we address open-vocabulary 3D point-cloud detection by a dividing-and-conquering strategy, which involves: 1) developing a point-cloud detector that can learn a general representation for localizing various objects, and 2) connecting textual and point-cloud representations to enable the detector to classify novel object categories based on text prompting.
no code implementations • 24 Mar 2023 • Yulin Luo, Rui Zhao, Xiaobao Wei, Jinwei Chen, Yijie Lu, Shenghao Xie, Tianyu Wang, Ruiqin Xiong, Ming Lu, Shanghang Zhang
Our MoWE achieves SOTA performance on the upstream task on the proposed dataset and two public datasets, i.e., All-Weather and Rain/Fog-Cityscapes, and also produces better perceptual results on the downstream segmentation task than other methods.
1 code implementation • 17 Mar 2023 • Senqiao Yang, Jiarui Wu, Jiaming Liu, Xiaoqi Li, Qizhe Zhang, Mingjie Pan, Shanghang Zhang
Therefore, we propose a novel Sparse Visual Domain Prompts (SVDP) approach tailored to domain shift problems in semantic segmentation, which keeps only a minimal number of discrete trainable parameters (e.g., 10%) in the prompt and thereby preserves more spatial information.
1 code implementation • CVPR 2023 • Anthony Chen, Kevin Zhang, Renrui Zhang, Zihan Wang, Yuheng Lu, Yandong Guo, Shanghang Zhang
Masked Autoencoders learn strong visual representations and achieve state-of-the-art results in several independent modalities, yet very few works have addressed their capabilities in multi-modality settings.
1 code implementation • CVPR 2023 • Jianyang Gu, Kai Wang, Hao Luo, Chen Chen, Wei Jiang, Yuqiang Fang, Shanghang Zhang, Yang You, Jian Zhao
Neural Architecture Search (NAS) has become increasingly appealing to the object Re-Identification (ReID) community, since task-specific architectures significantly improve retrieval performance.
Ranked #6 on Vehicle Re-Identification on VeRi-776
1 code implementation • 8 Feb 2023 • Xiuyu Li, Long Lian, Yijiang Liu, Huanrui Yang, Zhen Dong, Daniel Kang, Shanghang Zhang, Kurt Keutzer
Diffusion models have achieved great success in synthesizing diverse and high-fidelity images.
no code implementations • CVPR 2023 • Lianzhe Wang, Shiji Zhou, Shanghang Zhang, Xu Chu, Heng Chang, Wenwu Zhu
Despite the broad interest in meta-learning, the generalization problem remains one of the significant challenges in this field.
1 code implementation • CVPR 2023 • Yuqing Ma, Hainan Li, Zhange Zhang, Jinyang Guo, Shanghang Zhang, Ruihao Gong, Xianglong Liu
To the best of our knowledge, this is the first OWOD work without manual unknown selection.
no code implementations • 6 Dec 2022 • Lirui Xiao, Huanrui Yang, Zhen Dong, Kurt Keutzer, Li Du, Shanghang Zhang
CSQ stabilizes the bit-level mixed-precision training process with a bi-level gradual continuous sparsification on both the bit values of the quantized weights and the bit selection in determining the quantization precision of each layer.
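A loose sketch of the continuous-sparsification ingredient, assuming a per-bit learnable score gated by a sigmoid whose temperature is annealed; CSQ's exact bi-level formulation differs in detail:

```python
import torch

def soft_bit_gate(score, beta):
    """Continuous relaxation of a binary bit-selection gate: as beta grows
    during training, sigmoid(beta * score) anneals toward a hard 0/1 mask."""
    return torch.sigmoid(beta * score)

scores = torch.randn(4, requires_grad=True)  # one learnable score per candidate bit
for beta in (1.0, 4.0, 16.0, 64.0):          # illustrative annealing schedule
    print(beta, soft_bit_gate(scores, beta).detach().numpy().round(3))
```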
no code implementations • CVPR 2023 • Yulu Gan, Mingjie Pan, Rongyu Zhang, Zijian Ling, Lingran Zhao, Jiaming Liu, Shanghang Zhang
To enable the device model to deal with changing environments, we propose a new learning paradigm of Cloud-Device Collaborative Continual Adaptation, which encourages collaboration between cloud and device and improves the generalization of the device model.
no code implementations • CVPR 2023 • Xiaowei Chi, Jiaming Liu, Ming Lu, Rongyu Zhang, Zhaoqing Wang, Yandong Guo, Shanghang Zhang
In order to find them, we further propose a LiDAR-guided sampling strategy to leverage the statistical distribution of LiDAR to determine the heights of local slices.
no code implementations • 1 Dec 2022 • Jianing Li, Ming Lu, Jiaming Liu, Yandong Guo, Li Du, Shanghang Zhang
In this paper, we propose a unified framework named BEV-LGKD to transfer knowledge in a teacher-student manner.
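As a sketch of masked BEV feature distillation (a generic form, not the exact BEV-LGKD loss), a foreground weighting map, e.g., derived from projected LiDAR points, focuses the teacher-student feature mimicry on informative regions:

```python
import torch
import torch.nn.functional as F

def masked_bev_distill(student_feat, teacher_feat, fg_mask):
    """Masked MSE between student and teacher BEV feature maps.
    fg_mask: (B, 1, H, W) weighting map emphasizing foreground cells."""
    diff = F.mse_loss(student_feat, teacher_feat, reduction="none")
    return (diff * fg_mask).sum() / fg_mask.sum().clamp(min=1.0)

s, t = torch.randn(2, 64, 128, 128), torch.randn(2, 64, 128, 128)
mask = (torch.rand(2, 1, 128, 128) > 0.8).float()  # toy LiDAR-derived mask
print(masked_bev_distill(s, t, mask))
```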
no code implementations • 30 Nov 2022 • Jiaming Liu, Rongyu Zhang, Xiaowei Chi, Xiaoqi Li, Ming Lu, Yandong Guo, Shanghang Zhang
Vision-centric Bird's-Eye-View (BEV) perception has shown promising potential and attracted increasing attention in autonomous driving.
no code implementations • CVPR 2023 • Yijiang Liu, Huanrui Yang, Zhen Dong, Kurt Keutzer, Li Du, Shanghang Zhang
Building on the theoretical insight, NoisyQuant achieves the first success in actively altering the heavy-tailed activation distribution with an additive noisy bias to fit a given quantizer.
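The core mechanism can be sketched in a few lines: a fixed additive bias is applied before uniform quantization and removed after dequantization, flattening the rounding error on heavy-tailed activations (step size and noise range below are illustrative):

```python
import torch

def noisy_quant(x, step, noise):
    """Quantize with a fixed additive noisy bias, then remove the bias.
    noise is sampled once (e.g., uniform in [-step/2, step/2]) and frozen."""
    q = torch.round((x + noise) / step) * step  # uniform quantizer on biased input
    return q - noise                            # subtract the bias after dequant

x = torch.distributions.StudentT(2.0).sample((8, 16))  # heavy-tailed activations
noise = (torch.rand(16) - 0.5) * 0.1                   # fixed per-channel bias
print((noisy_quant(x, 0.1, noise) - x).abs().mean())   # mean quantization error
```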
2 code implementations • 21 Nov 2022 • Xiangyang Zhu, Renrui Zhang, Bowei He, Ziyao Zeng, Shanghang Zhang, Peng Gao
Contrastive Language-Image Pre-training (CLIP) has shown promising open-world performance on 2D image tasks, while its transferred capacity on 3D point clouds, i.e., PointCLIP, is still far from satisfactory.
Ranked #1 on Zero-Shot Transfer 3D Point Cloud Classification on ScanObjectNN (using extra training data)
1 code implementation • 10 Oct 2022 • Yixiong Zou, Shanghang Zhang, Yuhua Li, Ruixuan Li
Few-shot class-incremental learning (FSCIL) is designed to incrementally recognize novel classes with only a few training samples after (pre-)training on base classes with sufficient samples, focusing on both base-class performance and novel-class generalization.
class-incremental learning • Few-Shot Class-Incremental Learning • +1
1 code implementation • 27 Sep 2022 • Xiuying Wei, Yunchen Zhang, Xiangguo Zhang, Ruihao Gong, Shanghang Zhang, Qi Zhang, Fengwei Yu, Xianglong Liu
With the trends of large NLP models, the increasing memory and computation costs hinder their efficient deployment on resource-limited devices.
1 code implementation • 26 Aug 2022 • Jiaming Liu, Qizhe Zhang, Jianing Li, Ming Lu, Tiejun Huang, Shanghang Zhang
Neuromorphic spike data, an emerging modality with high temporal resolution, has shown promising potential in real-world applications due to its inherent advantage in overcoming high-velocity motion blur.
no code implementations • 26 Aug 2022 • Jianing Li, Jiaming Liu, Xiaobao Wei, Jiyuan Zhang, Ming Lu, Lei Ma, Li Du, Tiejun Huang, Shanghang Zhang
In this paper, we propose a novel Uncertainty-Guided Depth Fusion (UGDF) framework to fuse the predictions of monocular and stereo depth estimation networks for spike camera.
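A generic inverse-variance sketch of uncertainty-guided fusion (the paper's UGDF rule may differ); each prediction is weighted by the reciprocal of its estimated uncertainty:

```python
import numpy as np

def fuse_depth(d_mono, d_stereo, u_mono, u_stereo, eps=1e-6):
    """Fuse two per-pixel depth maps, trusting the lower-uncertainty one more.
    u_*: per-pixel uncertainty estimates (larger means less trusted)."""
    w_mono, w_stereo = 1.0 / (u_mono + eps), 1.0 / (u_stereo + eps)
    return (w_mono * d_mono + w_stereo * d_stereo) / (w_mono + w_stereo)

print(fuse_depth(np.full((2, 2), 5.0), np.full((2, 2), 6.0),
                 np.full((2, 2), 0.2), np.full((2, 2), 0.1)))
```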
1 code implementation • 20 Jul 2022 • Xiaoqi Li, Jiaming Liu, Shizun Wang, Cheng Lyu, Ming Lu, Yurong Chen, Anbang Yao, Yandong Guo, Shanghang Zhang
Our method significantly reduces the computational cost and achieves even better performance, paving the way for applying neural video delivery techniques to practical applications.
no code implementations • 5 Jul 2022 • Yuheng Lu, Chenfeng Xu, Xiaobao Wei, Xiaodong Xie, Masayoshi Tomizuka, Kurt Keutzer, Shanghang Zhang
Current point-cloud detection methods have difficulty detecting the open-vocabulary objects in the real world, due to their limited generalization capability.
1 code implementation • 20 Jun 2022 • Tian Li, Xiang Chen, Zhen Dong, Weijiang Yu, Yijun Yan, Kurt Keutzer, Shanghang Zhang
Then during training, DASK injects pivot-related knowledge graph information into source domain texts.
no code implementations • 4 May 2022 • Zhen Dong, Kaicheng Zhou, Guohao Li, Qiang Zhou, Mingfei Guo, Bernard Ghanem, Kurt Keutzer, Shanghang Zhang
Neural architecture search (NAS) has shown great success in the automatic design of deep neural networks (DNNs).
1 code implementation • 3 May 2022 • Jinze Yu, Jiaming Liu, Xiaobao Wei, Haoyi Zhou, Yohei Nakata, Denis Gudovskiy, Tomoyuki Okuno, JianXin Li, Kurt Keutzer, Shanghang Zhang
To solve this problem, we propose an end-to-end cross-domain detection Transformer based on the mean teacher framework, MTTrans, which can fully exploit unlabeled target domain data in object detection training and transfer knowledge between domains via pseudo labels.
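The mean-teacher backbone that MTTrans builds on is standard: the teacher is an exponential moving average of the student and supplies pseudo labels on the unlabeled target domain. A minimal sketch of the EMA update:

```python
import torch

@torch.no_grad()
def ema_update(teacher, student, momentum=0.999):
    """Mean-teacher update: teacher weights track an EMA of the student's."""
    for t_p, s_p in zip(teacher.parameters(), student.parameters()):
        t_p.mul_(momentum).add_(s_p, alpha=1.0 - momentum)

student, teacher = torch.nn.Linear(8, 2), torch.nn.Linear(8, 2)
teacher.load_state_dict(student.state_dict())  # initialize teacher = student
ema_update(teacher, student)
```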
1 code implementation • ICLR 2022 • Shikuang Deng, Yuhang Li, Shanghang Zhang, Shi Gu
Then we introduce the temporal efficient training (TET) approach to compensate for the loss of momentum in the gradient descent with SG so that the training process can converge into flatter minima with better generalizability.
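In spirit, TET supervises the SNN output at every time step rather than only the time-averaged output; a minimal sketch of that per-step loss (the paper adds a further regularization term omitted here):

```python
import torch
import torch.nn.functional as F

def tet_style_loss(outputs, target):
    """Average the cross-entropy over all T time steps.
    outputs: (T, B, num_classes) pre-synaptic outputs per time step."""
    return torch.stack([F.cross_entropy(o, target) for o in outputs]).mean()

outputs = torch.randn(4, 2, 10)   # T=4 time steps, batch of 2, 10 classes
target = torch.tensor([3, 7])
print(tet_style_loss(outputs, target))
```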
no code implementations • NeurIPS 2021 • Yuhang Li, Yufei Guo, Shanghang Zhang, Shikuang Deng, Yongqing Hai, Shi Gu
Based on the introduced finite difference gradient, we propose a new family of Differentiable Spike (Dspike) functions that can adaptively evolve during training to find the optimal shape and smoothness for gradient estimation.
Ranked #2 on Event data classification on CIFAR10-DVS
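Surrogate gradients of this kind are typically implemented with a custom autograd function; the sketch below uses a fixed sigmoid-derivative surrogate as a stand-in, where the temperature `k` plays the role Dspike's learned shape/smoothness parameter would:

```python
import torch

class SurrogateSpike(torch.autograd.Function):
    """Heaviside spike in the forward pass, smooth surrogate in the backward."""

    @staticmethod
    def forward(ctx, v, k=5.0):
        ctx.save_for_backward(v)
        ctx.k = k
        return (v >= 0).float()         # hard spike: fires when v crosses 0

    @staticmethod
    def backward(ctx, grad_out):
        (v,) = ctx.saved_tensors
        sig = torch.sigmoid(ctx.k * v)  # sigmoid-derivative surrogate
        return grad_out * ctx.k * sig * (1 - sig), None

v = torch.randn(5, requires_grad=True)
SurrogateSpike.apply(v).sum().backward()
print(v.grad)
```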
no code implementations • 27 Oct 2021 • Haojin Liao, Xiaolin Song, Sicheng Zhao, Shanghang Zhang, Xiangyu Yue, Xingxu Yao, Yueming Zhang, Tengfei Xing, Pengfei Xu, Qiang Wang
The Visual Domain Adaptation (VisDA) 2021 Challenge calls for unsupervised domain adaptation (UDA) methods that can deal with both input distribution shift and label set variance between the source and target domains.
no code implementations • 29 Sep 2021 • Lianzhe Wang, Shiji Zhou, Shanghang Zhang, Wenpeng Zhang, Heng Chang, Wenwu Zhu
Even though meta-learning has attracted wide research attention in recent years, the generalization problem of meta-learning is still not well addressed.
1 code implementation • ICCV 2021 • Zhipeng Luo, Zhongang Cai, Changqing Zhou, Gongjie Zhang, Haiyu Zhao, Shuai Yi, Shijian Lu, Hongsheng Li, Shanghang Zhang, Ziwei Liu
In addition, existing 3D domain adaptive detection methods often assume prior access to the target domain annotations, which is rarely feasible in the real world.
1 code implementation • ICCV 2021 • Yunze Liu, Qingnan Fan, Shanghang Zhang, Hao Dong, Thomas Funkhouser, Li Yi
Another approach is to concatenate all the modalities into a tuple and then contrast positive and negative tuple correspondences.
Ranked #38 on Semantic Segmentation on NYU Depth v2
1 code implementation • CVPR 2022 • Chongzhi Zhang, Mingyuan Zhang, Shanghang Zhang, Daisheng Jin, Qiang Zhou, Zhongang Cai, Haiyu Zhao, Xianglong Liu, Ziwei Liu
By comprehensively investigating these GE-ViTs and comparing them with their corresponding CNN models, we observe: 1) for the enhanced models, larger ViTs still benefit more in OOD generalization.
no code implementations • 11 Jun 2021 • Shiji Zhou, Han Zhao, Shanghang Zhang, Lianzhe Wang, Heng Chang, Zhi Wang, Wenwu Zhu
Our theoretical results show that OSAMD can fast adapt to changing environments with active queries.
1 code implementation • CVPR 2021 • Xiangyu Yue, Zangwei Zheng, Shanghang Zhang, Yang Gao, Trevor Darrell, Kurt Keutzer, Alberto Sangiovanni Vincentelli
In this paper, we propose an end-to-end Prototypical Cross-domain Self-Supervised Learning (PCS) framework for Few-shot Unsupervised Domain Adaptation (FUDA).
Ranked #6 on Semantic Segmentation on DensePASS
1 code implementation • 23 Mar 2021 • Colorado J. Reed, Xiangyu Yue, Ani Nrusimha, Sayna Ebrahimi, Vivek Vijaykumar, Richard Mao, Bo Li, Shanghang Zhang, Devin Guillory, Sean Metzger, Kurt Keutzer, Trevor Darrell
Through experimentation on 16 diverse vision datasets, we show HPT converges up to 80x faster, improves accuracy across tasks, and improves the robustness of the self-supervised pretraining process to changes in the image augmentation policy or amount of pretraining data.
no code implementations • 24 Dec 2020 • Yunze Liu, Li Yi, Shanghang Zhang, Qingnan Fan, Thomas Funkhouser, Hao Dong
Self-supervised representation learning is a critical problem in computer vision, as it provides a way to pretrain feature extractors on large unlabeled datasets that can be used as an initialization for more efficient and effective training on downstream tasks.
5 code implementations • 14 Dec 2020 • Haoyi Zhou, Shanghang Zhang, Jieqi Peng, Shuai Zhang, JianXin Li, Hui Xiong, Wancai Zhang
Many real-world applications require the prediction of long sequence time-series, such as electricity consumption planning.
Ranked #5 on Time Series Forecasting on ETTh2 (48) (using extra training data)
Multivariate Time Series Forecasting • Univariate Time Series Forecasting
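The heart of Informer's efficiency is ProbSparse self-attention, which keeps only the queries whose attention distribution deviates most from uniform; a NumPy sketch of that selection (the paper scores against a sampled subset of keys, which this simplification skips):

```python
import numpy as np

def select_sparse_queries(Q, K, u):
    """Return indices of the top-u 'active' queries.
    Sparsity score per query: max minus mean of its scaled dot products."""
    scores = Q @ K.T / np.sqrt(Q.shape[-1])    # (L_Q, L_K) attention logits
    sparsity = scores.max(axis=1) - scores.mean(axis=1)
    return np.argsort(-sparsity)[:u]           # other queries get a trivial output

Q, K = np.random.randn(64, 16), np.random.randn(64, 16)
print(select_sparse_queries(Q, K, u=8))
```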
1 code implementation • 5 Dec 2020 • Tian Li, Xiang Chen, Shanghang Zhang, Zhen Dong, Kurt Keutzer
In this paper, we propose a contrastive learning framework for cross-domain sentiment classification.
no code implementations • 30 Nov 2020 • Yixiong Zou, Shanghang Zhang, Guangyao Chen, Yonghong Tian, Kurt Keutzer, José M. F. Moura
In this paper, we target a new problem, Annotation-Efficient Video Recognition, to reduce the annotation requirements for both the large number of samples and the action locations.
1 code implementation • 30 Oct 2020 • Tian Li, Xiang Chen, Shanghang Zhang, Zhen Dong, Kurt Keutzer
Due to scarcity of labels on the target domain, we introduce mutual information maximization (MIM) apart from CL to exploit the features that best support the final prediction.
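The contrastive component is recognizable as an InfoNCE-style objective; a generic sketch over paired feature batches (not the paper's exact in-domain/cross-domain variant):

```python
import torch
import torch.nn.functional as F

def info_nce(z1, z2, tau=0.1):
    """InfoNCE over a batch of positive pairs (z1[i], z2[i]); all other
    cross-batch pairs act as negatives."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / tau            # (B, B) similarity matrix
    labels = torch.arange(z1.size(0))     # positives sit on the diagonal
    return F.cross_entropy(logits, labels)

print(info_nce(torch.randn(4, 16), torch.randn(4, 16)))
```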
no code implementations • CVPR 2021 • Bo Li, Yezhen Wang, Shanghang Zhang, Dongsheng Li, Trevor Darrell, Kurt Keutzer, Han Zhao
First, we provide a finite sample bound for both classification and regression problems under Semi-DA.
1 code implementation • 1 Sep 2020 • Sicheng Zhao, Xiangyu Yue, Shanghang Zhang, Bo Li, Han Zhao, Bichen Wu, Ravi Krishna, Joseph E. Gonzalez, Alberto L. Sangiovanni-Vincentelli, Sanjit A. Seshia, Kurt Keutzer
To cope with limited labeled training data, many have attempted to directly apply models trained on a large-scale labeled source domain to another sparsely labeled or unlabeled target domain.
2 code implementations • ECCV 2020 • Ke Mei, Chuang Zhu, Jiaqi Zou, Shanghang Zhang
In this paper, we propose an instance adaptive self-training framework for UDA on the task of semantic segmentation.
Ranked #13 on Image-to-Image Translation on SYNTHIA-to-Cityscapes
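Self-training of this kind starts from confidence-thresholded pseudo labels; the fixed-threshold sketch below shows the baseline that IAST improves on by adapting the threshold per instance and per class:

```python
import numpy as np

def pseudo_labels(probs, thresh=0.9):
    """Per-pixel pseudo labels from softmax outputs; -1 marks ignored pixels.
    probs: (N, C) softmax predictions for N pixels of one target image."""
    conf, label = probs.max(axis=1), probs.argmax(axis=1)
    label[conf < thresh] = -1     # drop low-confidence pixels from training
    return label

probs = np.random.dirichlet(np.ones(19), size=6)  # 6 pixels, 19 classes
print(pseudo_labels(probs))
```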
no code implementations • 7 Aug 2020 • Yixiong Zou, Shanghang Zhang, JianPeng Yu, Yonghong Tian, José M. F. Moura
To solve this problem, cross-domain FSL (CDFSL) has recently been proposed to transfer knowledge from general-domain base classes to special-domain novel classes.
no code implementations • ECCV 2020 • Xinwei Sun, Yilun Xu, Peng Cao, Yuqing Kong, Lingjing Hu, Shanghang Zhang, Yizhou Wang
In this paper, we propose a novel information-theoretic approach, namely Total Correlation Gain Maximization (TCGM), for semi-supervised multi-modal learning, which enjoys promising properties: (i) it can effectively utilize information across different modalities of unlabeled data points to facilitate training classifiers for each modality, and (ii) it has a theoretical guarantee to identify Bayesian classifiers, i.e., the ground-truth posteriors of all modalities.
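For context, the total correlation underlying TCGM is a standard information-theoretic quantity; writing $X_i$ for the representation of modality $i$:

```latex
\mathrm{TC}(X_1, \dots, X_n) \;=\; \sum_{i=1}^{n} H(X_i) \;-\; H(X_1, \dots, X_n)
```

It is nonnegative and vanishes exactly when the modalities are independent, which is why maximizing a gain in it encourages agreement among the per-modality classifiers.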
no code implementations • 23 Jun 2020 • Bo Li, Yezhen Wang, Tong Che, Shanghang Zhang, Sicheng Zhao, Pengfei Xu, Wei Zhou, Yoshua Bengio, Kurt Keutzer
In this paper, in order to devise robust DA algorithms, we first systematically analyze the limitations of DM-based methods, and then build new benchmarks with more realistic domain shifts to evaluate the well-accepted DM methods.
1 code implementation • 19 Jun 2020 • Xingyi Yang, Xuehai He, Yuxiao Liang, Yue Yang, Shanghang Zhang, Pengtao Xie
There has not been a clear understanding of what properties of data and tasks make one approach outperform the other.
no code implementations • 12 May 2020 • Yixiong Zou, Shanghang Zhang, Ke Chen, Yonghong Tian, Yao-Wei Wang, José M. F. Moura
Inspired by humans' ability to learn visual primitives and compose them to recognize novel classes, we propose an FSL approach that learns a feature representation composed of important primitives, jointly trained with two parts, i.e., primitive discovery and primitive enhancing.
1 code implementation • medRxiv 2020 • Xuehai He, Xingyi Yang, Shanghang Zhang, Jinyu Zhao, Yichen Zhang, Eric Xing, Pengtao Xie
Moreover, these works require a large number of CTs to train accurate diagnosis models, which are difficult to obtain.
1 code implementation • ICLR 2021 • Xuezhe Ma, Xiang Kong, Shanghang Zhang, Eduard Hovy
In this work, we propose a new generative model that is capable of automatically decoupling global and local representations of images in an entirely unsupervised setting, by embedding a generative flow in the VAE framework to model the decoder.
19 code implementations • 30 Mar 2020 • Xingyi Yang, Xuehai He, Jinyu Zhao, Yichen Zhang, Shanghang Zhang, Pengtao Xie
Using this dataset, we develop diagnosis methods based on multi-task learning and self-supervised learning that achieve an F1 of 0.90, an AUC of 0.98, and an accuracy of 0.89.
no code implementations • 26 Nov 2019 • Siyan Dong, Songyin Wu, Yixin Zhuang, Kai Xu, Shanghang Zhang, Baoquan Chen
To address this issue, we approach camera relocalization with a decoupled solution where feature extraction, coordinate regression, and pose estimation are performed separately.
1 code implementation • 22 Nov 2019 • Sicheng Zhao, Guangzhi Wang, Shanghang Zhang, Yang Gu, Yaxian Li, Zhichao Song, Pengfei Xu, Runbo Hu, Hua Chai, Kurt Keutzer
Deep neural networks suffer from performance decay when there is domain shift between the labeled source domain and unlabeled target domain, which motivates the research on domain adaptation (DA).
Domain Adaptation • Multi-Source Unsupervised Domain Adaptation
no code implementations • 28 Sep 2019 • Congzheng Song, Shanghang Zhang, Najmeh Sadoughi, Pengtao Xie, Eric Xing
The International Classification of Diseases (ICD) is a list of classification codes for diagnoses.
no code implementations • NeurIPS 2019 • Jian Ni, Shanghang Zhang, Haiyong Xie
In particular, the primal GAN learns to synthesize inter-class discriminative and semantics-preserving visual features from both the semantic representations of seen/unseen classes and the ones reconstructed by the dual GAN.
2 code implementations • NeurIPS 2019 • Xuezhe Ma, Xiang Kong, Shanghang Zhang, Eduard Hovy
Flow-based generative models, conceptually attractive due to the tractability of both exact log-likelihood computation and latent-variable inference, as well as the efficiency of training and sampling, have led to a number of impressive empirical successes and spawned many advanced variants and theoretical investigations.
Ranked #3 on Image Generation on CelebA 256x256
no code implementations • NeurIPS 2018 • Han Zhao, Shanghang Zhang, Guanhang Wu, José M. F. Moura, Joao P. Costeira, Geoffrey J. Gordon
In this paper we propose new generalization bounds and algorithms under both classification and regression settings for unsupervised multiple source domain adaptation.
Ranked #3 on Domain Adaptation on GTA5+Synscapes to Cityscapes
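Bounds of this flavor (in the style of Ben-David et al., extended to $k$ sources) control the target risk by a $\lambda$-weighted combination of source risks plus divergence and joint-error terms; a schematic form under those assumptions, not the paper's exact statement:

```latex
\epsilon_T(h) \;\le\; \sum_{i=1}^{k} \lambda_i \left( \epsilon_{S_i}(h)
  + \tfrac{1}{2}\, d_{\mathcal{H}\Delta\mathcal{H}}\!\left(\mathcal{D}_{S_i}, \mathcal{D}_T\right)
  + \eta_i \right), \qquad \lambda_i \ge 0,\; \sum_{i=1}^{k} \lambda_i = 1,
```

where $\eta_i$ denotes the best joint error achievable on source $i$ and the target.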
1 code implementation • 14 Oct 2018 • Chen Li, Xutan Peng, Shanghang Zhang, Hao Peng, Philip S. Yu, Min He, Linfeng Du, Lihong Wang
By treating relations and multi-hop paths as two different input sources, we use a feature extractor, shared by two downstream components (i.e., a relation classifier and a source discriminator), to capture shared/similar information between them.
no code implementations • CVPR 2018 • Shanghang Zhang, Xiaohui Shen, Zhe Lin, Radomír Měch, João P. Costeira, José M. F. Moura
In this paper, we propose a unified framework to estimate a spatially-varying blur map and understand its desirability in terms of image quality at the same time.
no code implementations • ICLR 2018 • Han Zhao, Shanghang Zhang, Guanhang Wu, João P. Costeira, José M. F. Moura, Geoffrey J. Gordon
We propose a new generalization bound for domain adaptation when there are multiple source domains with labeled instances and one target domain with unlabeled instances.
2 code implementations • ICLR 2018 • Jian Du, Shanghang Zhang, Guanhang Wu, Jose M. F. Moura, Soummya Kar
Spectral graph convolutional neural networks (CNNs) require approximation to the convolution to alleviate the computational complexity, resulting in performance loss.
1 code implementation • ICCV 2017 • Shanghang Zhang, Guanhang Wu, João P. Costeira, José M. F. Moura
To overcome the limitations of existing methods and incorporate the temporal information of traffic video, we design a novel FCN-rLSTM network to jointly estimate vehicle density and vehicle count by connecting fully convolutional neural networks (FCN) with long short-term memory (LSTM) networks in a residual learning fashion.
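A toy sketch of the FCN-rLSTM idea: the FCN predicts per-frame density maps, integrating a density map gives a base count, and an LSTM over the frame sequence learns a residual correction to that count (layer sizes are illustrative, and the real model feeds richer density features to the LSTM):

```python
import torch
import torch.nn as nn

class FCNrLSTMSketch(nn.Module):
    def __init__(self, hidden=32):
        super().__init__()
        self.fcn = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
                                 nn.Conv2d(8, 1, 1))        # per-frame density map
        self.lstm = nn.LSTM(input_size=1, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)                    # residual count

    def forward(self, frames):                # frames: (B, T, 3, H, W)
        B, T = frames.shape[:2]
        density = self.fcn(frames.flatten(0, 1)).view(B, T, -1)
        base = density.sum(-1, keepdim=True)  # integrate density -> base count
        h, _ = self.lstm(base)                # temporal context across frames
        return base + self.head(h)            # residual learning of the count

model = FCNrLSTMSketch()
print(model(torch.randn(2, 5, 3, 32, 32)).shape)  # -> torch.Size([2, 5, 1])
```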
4 code implementations • 26 May 2017 • Han Zhao, Shanghang Zhang, Guanhang Wu, João P. Costeira, José M. F. Moura, Geoffrey J. Gordon
As a step toward bridging the gap, we propose a new generalization bound for domain adaptation when there are multiple source domains with labeled instances and one target domain with unlabeled instances.
1 code implementation • CVPR 2017 • Shanghang Zhang, Guanhang Wu, João P. Costeira, José M. F. Moura
Understanding traffic density from large-scale web camera (webcam) videos is a challenging problem because such videos have low spatial and temporal resolution, high occlusion and large perspective.