Search Results for author: Shanghang Zhang

Found 65 papers, 35 papers with code

HUB: Guiding Learned Optimizers with Continuous Prompt Tuning

no code implementations26 May 2023 Gaole Dai, Wei Wu, Ziyu Wang, Jie Fu, Shanghang Zhang, Tiejun Huang

By incorporating hand-designed optimizers as the second component in our hybrid approach, we are able to retain the benefits of learned optimizers while stabilizing the training process and, more importantly, improving testing performance.
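The hybrid idea above can be sketched as a convex blend of a learned optimizer's proposed update with a plain hand-designed SGD step. This is a minimal illustration, not the paper's HUB method; `hybrid_step`, `alpha`, and the stand-in "learned" update are all hypothetical names chosen here.

```python
def hybrid_step(w, grad, learned_update, alpha=0.5, lr=0.01):
    """Blend a (hypothetical) learned-optimizer update with plain SGD.

    The hand-designed SGD term stabilizes training, while the learned
    term retains adaptivity; alpha trades off between the two.
    """
    sgd_update = -lr * grad
    return w + alpha * learned_update + (1 - alpha) * sgd_update

# toy example: one step on f(w) = w^2, so f'(w) = 2w
w, grad = 2.0, 4.0
learned = -0.05 * grad   # stand-in for a learned optimizer's output
w_new = hybrid_step(w, grad, learned)   # moves w toward the minimum at 0
```

Even when the learned component misbehaves, the SGD term bounds how far a single step can stray, which is one way to read the stabilization claim.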


Integer or Floating Point? New Outlooks for Low-Bit Quantization on Large Language Models

no code implementations21 May 2023 Yijia Zhang, Lingran Zhao, Shijie Cao, WenQiang Wang, Ting Cao, Fan Yang, Mao Yang, Shanghang Zhang, Ningyi Xu

In this study, we conduct a comparative analysis of INT and FP quantization with the same bit-width, revealing that the optimal quantization format varies across different layers due to the complexity and diversity of tensor distribution.
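The comparison the abstract describes can be mimicked with a toy experiment: quantize the same tensor onto a uniform INT4 grid and onto a non-uniform FP4-style grid, and compare the mean squared error. The FP4 values below are illustrative stand-ins, not a format specified by the paper; which grid wins depends on the tensor's distribution, which is exactly the paper's point.

```python
import numpy as np

def mse_after_grid_quant(x, grid):
    """Quantize each value of x to its nearest grid point; return the MSE."""
    q = grid[np.abs(x[:, None] - grid[None, :]).argmin(axis=1)]
    return float(np.mean((x - q) ** 2))

rng = np.random.default_rng(0)
x = rng.standard_normal(4096)

# uniform INT4 grid vs. a toy non-uniform FP4-style grid
# (the FP4 values below are illustrative, not a paper-specified format)
int4_grid = np.arange(-8, 8) * (np.abs(x).max() / 7)
fp4_pos = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])
fp4_grid = np.concatenate([-fp4_pos[:0:-1], fp4_pos]) * (np.abs(x).max() / 6)

err_int = mse_after_grid_quant(x, int4_grid)
err_fp = mse_after_grid_quant(x, fp4_grid)
```

Repeating this per layer with the layer's real tensors is the spirit of the paper's layer-wise format selection.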


Open-Vocabulary Point-Cloud Object Detection without 3D Annotation

1 code implementation CVPR 2023 Yuheng Lu, Chenfeng Xu, Xiaobao Wei, Xiaodong Xie, Masayoshi Tomizuka, Kurt Keutzer, Shanghang Zhang

In this paper, we address open-vocabulary 3D point-cloud detection by a dividing-and-conquering strategy, which involves: 1) developing a point-cloud detector that can learn a general representation for localizing various objects, and 2) connecting textual and point-cloud representations to enable the detector to classify novel object categories based on text prompting.

3D Object Detection Cloud Detection +2

MoWE: Mixture of Weather Experts for Multiple Adverse Weather Removal

no code implementations24 Mar 2023 Yulin Luo, Rui Zhao, Xiaobao Wei, Jinwei Chen, Yijie Lu, Shenghao Xie, Tianyu Wang, Ruiqin Xiong, Ming Lu, Shanghang Zhang

Our MoWE achieves SOTA performance on the upstream task on the proposed dataset and two public datasets, i.e., All-Weather and Rain/Fog-Cityscapes, and also achieves better perceptual results on the downstream segmentation task compared to other methods.

Autonomous Driving Rain Removal

Exploring Sparse Visual Prompt for Cross-domain Semantic Segmentation

1 code implementation17 Mar 2023 Senqiao Yang, Jiarui Wu, Jiaming Liu, Xiaoqi Li, Qizhe Zhang, Mingjie Pan, Shanghang Zhang

Therefore, we propose a novel Sparse Visual Domain Prompts (SVDP) approach tailored to domain shift problems in semantic segmentation, which holds minimal discrete trainable parameters (e.g., 10%) of the prompt and reserves more spatial information.

Domain Adaptation Semantic Segmentation
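Applying a sparse visual prompt amounts to adding trainable values at only a small fraction of spatial positions. The snippet below is a minimal sketch of that mechanism under the abstract's ~10% figure, not the SVDP implementation; `apply_sparse_prompt` and the mask construction are assumptions made here.

```python
import numpy as np

def apply_sparse_prompt(image, prompt_values, mask):
    """Sparse visual prompt sketch: add trainable values only at the
    masked positions (mask True = prompted pixel)."""
    out = image.copy()
    out[mask] += prompt_values
    return out

rng = np.random.default_rng(0)
img = rng.standard_normal((8, 8))
mask = rng.random((8, 8)) < 0.10     # roughly 10% of spatial positions
prompt = np.zeros(mask.sum())        # trainable parameters, initialized to 0
prompted = apply_sparse_prompt(img, prompt, mask)
```

Because the prompt is zero-initialized, the prompted image starts identical to the input; training then updates only the masked entries, leaving the rest of the spatial layout untouched.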

PiMAE: Point Cloud and Image Interactive Masked Autoencoders for 3D Object Detection

1 code implementation CVPR 2023 Anthony Chen, Kevin Zhang, Renrui Zhang, Zihan Wang, Yuheng Lu, Yandong Guo, Shanghang Zhang

Masked Autoencoders learn strong visual representations and achieve state-of-the-art results in several independent modalities, yet very few works have addressed their capabilities in multi-modality settings.

3D Object Detection object-detection +2

MSINet: Twins Contrastive Search of Multi-Scale Interaction for Object ReID

1 code implementation CVPR 2023 Jianyang Gu, Kai Wang, Hao Luo, Chen Chen, Wei Jiang, Yuqiang Fang, Shanghang Zhang, Yang You, Jian Zhao

Neural Architecture Search (NAS) has become increasingly appealing to the object Re-Identification (ReID) community, as task-specific architectures significantly improve retrieval performance.

Image Classification Neural Architecture Search +3

Improving Generalization of Meta-Learning With Inverted Regularization at Inner-Level

no code implementations CVPR 2023 Lianzhe Wang, Shiji Zhou, Shanghang Zhang, Xu Chu, Heng Chang, Wenwu Zhu

Despite the broad interest in meta-learning, the generalization problem remains one of the significant challenges in this field.


CSQ: Growing Mixed-Precision Quantization Scheme with Bi-level Continuous Sparsification

no code implementations6 Dec 2022 Lirui Xiao, Huanrui Yang, Zhen Dong, Kurt Keutzer, Li Du, Shanghang Zhang

CSQ stabilizes the bit-level mixed-precision training process with a bi-level gradual continuous sparsification on both the bit values of the quantized weights and the bit selection in determining the quantization precision of each layer.


Cloud-Device Collaborative Adaptation to Continual Changing Environments in the Real-world

no code implementations CVPR 2023 Yulu Gan, Mingjie Pan, Rongyu Zhang, Zijian Ling, Lingran Zhao, Jiaming Liu, Shanghang Zhang

To enable the device model to deal with changing environments, we propose a new learning paradigm of Cloud-Device Collaborative Continual Adaptation, which encourages collaboration between cloud and device and improves the generalization of the device model.

object-detection Object Detection

BEV-SAN: Accurate BEV 3D Object Detection via Slice Attention Networks

no code implementations CVPR 2023 Xiaowei Chi, Jiaming Liu, Ming Lu, Rongyu Zhang, Zhaoqing Wang, Yandong Guo, Shanghang Zhang

In order to find them, we further propose a LiDAR-guided sampling strategy to leverage the statistical distribution of LiDAR to determine the heights of local slices.

3D Object Detection Autonomous Driving +1

NoisyQuant: Noisy Bias-Enhanced Post-Training Activation Quantization for Vision Transformers

no code implementations CVPR 2023 Yijiang Liu, Huanrui Yang, Zhen Dong, Kurt Keutzer, Li Du, Shanghang Zhang

Building on the theoretical insight, NoisyQuant achieves the first success on actively altering the heavy-tailed activation distribution with additive noisy bias to fit a given quantizer.
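The core mechanism can be sketched as: add a fixed random bias to the activations before the quantizer and subtract it after, so the heavy-tailed distribution seen by the quantizer is smoothed. This is a rough illustration, not NoisyQuant itself; the bias range below is a guess rather than the paper's calibrated choice.

```python
import numpy as np

def uniform_quant(x, n_bits=4):
    # plain symmetric uniform quantizer (shared by both paths)
    scale = np.abs(x).max() / (2 ** (n_bits - 1) - 1)
    return np.round(x / scale) * scale

def noisy_quant(x, n_bits=4, seed=0):
    # add a fixed random bias before quantization, subtract it afterwards;
    # the bias range here is an assumption, not the paper's calibration
    scale = np.abs(x).max() / (2 ** (n_bits - 1) - 1)
    bias = np.random.default_rng(seed).uniform(-scale / 2, scale / 2, x.shape)
    return uniform_quant(x + bias, n_bits) - bias

x = np.random.default_rng(1).standard_normal(256)
q_plain, q_noisy = uniform_quant(x), noisy_quant(x)
```

Since the bias is fixed (not sampled per inference), subtracting it exactly undoes the shift outside the rounding step, so only the rounding error distribution changes.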


PointCLIP V2: Adapting CLIP for Powerful 3D Open-world Learning

2 code implementations21 Nov 2022 Xiangyang Zhu, Renrui Zhang, Bowei He, Ziyao Zeng, Shanghang Zhang, Peng Gao

Contrastive Language-Image Pre-training (CLIP) has shown promising open-world performance on 2D image tasks, while its transferred capacity on 3D point clouds, i.e., PointCLIP, is still far from satisfactory.

3D Classification 3D Object Detection +4

Margin-Based Few-Shot Class-Incremental Learning with Class-Level Overfitting Mitigation

1 code implementation10 Oct 2022 Yixiong Zou, Shanghang Zhang, Yuhua Li, Ruixuan Li

Few-shot class-incremental learning (FSCIL) is designed to incrementally recognize novel classes with only a few training samples after the (pre-)training on base classes with sufficient samples, focusing on both base-class performance and novel-class generalization.

class-incremental learning Few-Shot Class-Incremental Learning +1

Outlier Suppression: Pushing the Limit of Low-bit Transformer Language Models

1 code implementation27 Sep 2022 Xiuying Wei, Yunchen Zhang, Xiangguo Zhang, Ruihao Gong, Shanghang Zhang, Qi Zhang, Fengwei Yu, Xianglong Liu

With the trends of large NLP models, the increasing memory and computation costs hinder their efficient deployment on resource-limited devices.


Unsupervised Spike Depth Estimation via Cross-modality Cross-domain Knowledge Transfer

1 code implementation26 Aug 2022 Jiaming Liu, Qizhe Zhang, Jianing Li, Ming Lu, Tiejun Huang, Shanghang Zhang

Neuromorphic spike data, an upcoming modality with high temporal resolution, has shown promising potential in real-world applications due to its inherent advantage in overcoming high-velocity motion blur.

Autonomous Driving Depth Estimation +2

Uncertainty Guided Depth Fusion for Spike Camera

no code implementations26 Aug 2022 Jianing Li, Jiaming Liu, Xiaobao Wei, Jiyuan Zhang, Ming Lu, Lei Ma, Li Du, Tiejun Huang, Shanghang Zhang

In this paper, we propose a novel Uncertainty-Guided Depth Fusion (UGDF) framework to fuse the predictions of monocular and stereo depth estimation networks for spike camera.

Autonomous Driving Stereo Depth Estimation
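One common way to fuse two depth estimates with per-branch uncertainties is inverse-uncertainty weighting. The snippet below is a minimal sketch of that idea, not the UGDF architecture; `fuse_depth` and the uncertainty semantics (larger = less certain, strictly positive) are assumptions made here.

```python
import numpy as np

def fuse_depth(d_mono, d_stereo, u_mono, u_stereo):
    # weight each branch's depth by the inverse of its uncertainty (> 0)
    w_mono, w_stereo = 1.0 / u_mono, 1.0 / u_stereo
    return (w_mono * d_mono + w_stereo * d_stereo) / (w_mono + w_stereo)

# a pixel where the stereo branch is 3x less certain than the monocular one
fused = fuse_depth(np.array([10.0]), np.array([12.0]),
                   np.array([1.0]), np.array([3.0]))   # -> 10.5
```

The fused value leans toward the more confident branch, which is the behavior a monocular/stereo fusion scheme wants in regions where one cue degrades.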

Efficient Meta-Tuning for Content-aware Neural Video Delivery

1 code implementation20 Jul 2022 Xiaoqi Li, Jiaming Liu, Shizun Wang, Cheng Lyu, Ming Lu, Yurong Chen, Anbang Yao, Yandong Guo, Shanghang Zhang

Our method significantly reduces the computational cost and achieves even better performance, paving the way for applying neural video delivery techniques to practical applications.


Open-Vocabulary 3D Detection via Image-level Class and Debiased Cross-modal Contrastive Learning

no code implementations5 Jul 2022 Yuheng Lu, Chenfeng Xu, Xiaobao Wei, Xiaodong Xie, Masayoshi Tomizuka, Kurt Keutzer, Shanghang Zhang

Current point-cloud detection methods have difficulty detecting the open-vocabulary objects in the real world, due to their limited generalization capability.

Cloud Detection Contrastive Learning

MTTrans: Cross-Domain Object Detection with Mean-Teacher Transformer

1 code implementation3 May 2022 Jinze Yu, Jiaming Liu, Xiaobao Wei, Haoyi Zhou, Yohei Nakata, Denis Gudovskiy, Tomoyuki Okuno, JianXin Li, Kurt Keutzer, Shanghang Zhang

To solve this problem, we propose an end-to-end cross-domain detection Transformer based on the mean teacher framework, MTTrans, which can fully exploit unlabeled target domain data in object detection training and transfer knowledge between domains via pseudo labels.

Domain Adaptation object-detection +2
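The mean teacher framework the abstract builds on keeps a teacher model whose weights are an exponential moving average (EMA) of the student's; the teacher then produces pseudo labels on unlabeled target data. Below is a minimal sketch of the EMA update with parameters as plain dicts, an assumption for illustration, not the MTTrans code.

```python
def ema_update(teacher, student, momentum=0.999):
    """Mean-teacher EMA update: the teacher's weights track a slow
    moving average of the student's (parameters as plain dicts here)."""
    return {k: momentum * teacher[k] + (1 - momentum) * student[k]
            for k in teacher}

teacher = {"w": 1.0}
student = {"w": 0.0}
teacher = ema_update(teacher, student)   # teacher drifts slowly toward student
```

The high momentum makes the teacher a temporal ensemble of past students, which is why its pseudo labels on the unlabeled target domain tend to be more stable than the student's own predictions.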

Temporal Efficient Training of Spiking Neural Network via Gradient Re-weighting

1 code implementation ICLR 2022 Shikuang Deng, Yuhang Li, Shanghang Zhang, Shi Gu

Then we introduce the temporal efficient training (TET) approach to compensate for the loss of momentum in the gradient descent with SG so that the training process can converge into flatter minima with better generalizability.
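The TET loss can be paraphrased as: compute the classification loss at every timestep of the spiking network's output and average, instead of applying the loss once to the time-averaged output. The sketch below illustrates that averaging with plain NumPy cross-entropy; the function name and toy logits are chosen here, not taken from the paper's code.

```python
import numpy as np

def tet_style_loss(outputs_per_t, target_onehot):
    """Average the cross-entropy over every timestep's logits instead of
    applying it once to the time-averaged output (TET-style, sketched)."""
    losses = []
    for o in outputs_per_t:                      # o: logits at one timestep
        p = np.exp(o - o.max())
        p /= p.sum()                             # softmax
        losses.append(-np.sum(target_onehot * np.log(p + 1e-12)))
    return float(np.mean(losses))

outs = [np.array([2.0, 0.5]), np.array([1.5, 0.2])]  # T=2 timesteps, 2 classes
loss = tet_style_loss(outs, np.array([1.0, 0.0]))
```

Supervising each timestep individually is what injects the extra gradient signal the abstract credits with compensating for the surrogate gradient's loss of momentum.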

Differentiable Spike: Rethinking Gradient-Descent for Training Spiking Neural Networks

no code implementations NeurIPS 2021 Yuhang Li, Yufei Guo, Shanghang Zhang, Shikuang Deng, Yongqing Hai, Shi Gu

Based on the introduced finite difference gradient, we propose a new family of Differentiable Spike (Dspike) functions that can adaptively evolve during training to find the optimal shape and smoothness for gradient estimation.

Event data classification Image Classification

2nd Place Solution for VisDA 2021 Challenge -- Universally Domain Adaptive Image Recognition

no code implementations27 Oct 2021 Haojin Liao, Xiaolin Song, Sicheng Zhao, Shanghang Zhang, Xiangyu Yue, Xingxu Yao, Yueming Zhang, Tengfei Xing, Pengfei Xu, Qiang Wang

The Visual Domain Adaptation (VisDA) 2021 Challenge calls for unsupervised domain adaptation (UDA) methods that can deal with both input distribution shift and label set variance between the source and target domains.

Universal Domain Adaptation Unsupervised Domain Adaptation

Meta Learning with Minimax Regularization

no code implementations29 Sep 2021 Lianzhe Wang, Shiji Zhou, Shanghang Zhang, Wenpeng Zhang, Heng Chang, Wenwu Zhu

Even though meta-learning has attracted wide research attention in recent years, its generalization problem is still not well addressed.

Few-Shot Learning

Unsupervised Domain Adaptive 3D Detection with Multi-Level Consistency

1 code implementation ICCV 2021 Zhipeng Luo, Zhongang Cai, Changqing Zhou, Gongjie Zhang, Haiyu Zhao, Shuai Yi, Shijian Lu, Hongsheng Li, Shanghang Zhang, Ziwei Liu

In addition, existing 3D domain adaptive detection methods often assume prior access to the target domain annotations, which is rarely feasible in the real world.

3D Object Detection Autonomous Driving +1

Delving Deep into the Generalization of Vision Transformers under Distribution Shifts

1 code implementation CVPR 2022 Chongzhi Zhang, Mingyuan Zhang, Shanghang Zhang, Daisheng Jin, Qiang Zhou, Zhongang Cai, Haiyu Zhao, Xianglong Liu, Ziwei Liu

By comprehensively investigating these GE-ViTs and comparing with their corresponding CNN models, we observe: 1) For the enhanced model, larger ViTs still benefit more for the OOD generalization.

Out-of-Distribution Generalization Self-Supervised Learning

Online Continual Adaptation with Active Self-Training

no code implementations11 Jun 2021 Shiji Zhou, Han Zhao, Shanghang Zhang, Lianzhe Wang, Heng Chang, Zhi Wang, Wenwu Zhu

Our theoretical results show that OSAMD can fast adapt to changing environments with active queries.

Self-Supervised Pretraining Improves Self-Supervised Pretraining

1 code implementation23 Mar 2021 Colorado J. Reed, Xiangyu Yue, Ani Nrusimha, Sayna Ebrahimi, Vivek Vijaykumar, Richard Mao, Bo Li, Shanghang Zhang, Devin Guillory, Sean Metzger, Kurt Keutzer, Trevor Darrell

Through experimentation on 16 diverse vision datasets, we show HPT converges up to 80x faster, improves accuracy across tasks, and improves the robustness of the self-supervised pretraining process to changes in the image augmentation policy or amount of pretraining data.

Image Augmentation

P4Contrast: Contrastive Learning with Pairs of Point-Pixel Pairs for RGB-D Scene Understanding

no code implementations24 Dec 2020 Yunze Liu, Li Yi, Shanghang Zhang, Qingnan Fan, Thomas Funkhouser, Hao Dong

Self-supervised representation learning is a critical problem in computer vision, as it provides a way to pretrain feature extractors on large unlabeled datasets that can be used as an initialization for more efficient and effective training on downstream tasks.

Contrastive Learning Representation Learning +1

Annotation-Efficient Untrimmed Video Action Recognition

no code implementations30 Nov 2020 Yixiong Zou, Shanghang Zhang, Guangyao Chen, Yonghong Tian, Kurt Keutzer, José M. F. Moura

In this paper, we target a new problem, Annotation-Efficient Video Recognition, to reduce the annotation requirements for both the large number of samples and the action locations.

Action Recognition Contrastive Learning +3

Cross-Domain Sentiment Classification with Contrastive Learning and Mutual Information Maximization

1 code implementation30 Oct 2020 Tian Li, Xiang Chen, Shanghang Zhang, Zhen Dong, Kurt Keutzer

Due to scarcity of labels on the target domain, we introduce mutual information maximization (MIM) apart from CL to exploit the features that best support the final prediction.

Contrastive Learning General Classification +3

A Review of Single-Source Deep Unsupervised Visual Domain Adaptation

1 code implementation1 Sep 2020 Sicheng Zhao, Xiangyu Yue, Shanghang Zhang, Bo Li, Han Zhao, Bichen Wu, Ravi Krishna, Joseph E. Gonzalez, Alberto L. Sangiovanni-Vincentelli, Sanjit A. Seshia, Kurt Keutzer

To cope with limited labeled training data, many have attempted to directly apply models trained on a large-scale labeled source domain to another sparsely labeled or unlabeled target domain.

Unsupervised Domain Adaptation

Revisiting Mid-Level Patterns for Cross-Domain Few-Shot Recognition

no code implementations7 Aug 2020 Yixiong Zou, Shanghang Zhang, JianPeng Yu, Yonghong Tian, José M. F. Moura

To solve this problem, cross-domain FSL (CDFSL) has been proposed very recently to transfer knowledge from general-domain base classes to special-domain novel classes.

cross-domain few-shot learning

TCGM: An Information-Theoretic Framework for Semi-Supervised Multi-Modality Learning

no code implementations ECCV 2020 Xinwei Sun, Yilun Xu, Peng Cao, Yuqing Kong, Lingjing Hu, Shanghang Zhang, Yizhou Wang

In this paper, we propose a novel information-theoretic approach, namely Total Correlation Gain Maximization (TCGM), for semi-supervised multi-modal learning, which is endowed with promising properties: (i) it can effectively utilize the information across different modalities of unlabeled data points to facilitate training classifiers of each modality; (ii) it has a theoretical guarantee to identify Bayesian classifiers, i.e., the ground-truth posteriors of all modalities.

Disease Prediction Emotion Recognition +1

Rethinking Distributional Matching Based Domain Adaptation

no code implementations23 Jun 2020 Bo Li, Yezhen Wang, Tong Che, Shanghang Zhang, Sicheng Zhao, Pengfei Xu, Wei Zhou, Yoshua Bengio, Kurt Keutzer

In this paper, in order to devise robust DA algorithms, we first systematically analyze the limitations of DM based methods, and then build new benchmarks with more realistic domain shifts to evaluate the well-accepted DM methods.

Domain Adaptation

Transfer Learning or Self-supervised Learning? A Tale of Two Pretraining Paradigms

1 code implementation19 Jun 2020 Xingyi Yang, Xuehai He, Yuxiao Liang, Yue Yang, Shanghang Zhang, Pengtao Xie

There has not been a clear understanding of what properties of data and tasks render one approach superior to the other.

Self-Supervised Learning Transfer Learning

Compositional Few-Shot Recognition with Primitive Discovery and Enhancing

no code implementations12 May 2020 Yixiong Zou, Shanghang Zhang, Ke Chen, Yonghong Tian, Yao-Wei Wang, José M. F. Moura

Inspired by this human ability to learn visual primitives and compose them to recognize novel classes, we propose an approach to FSL that learns a feature representation composed of important primitives, jointly trained with two parts, i.e., primitive discovery and primitive enhancing.

Few-Shot Image Classification Few-Shot Learning +1

Decoupling Global and Local Representations via Invertible Generative Flows

1 code implementation ICLR 2021 Xuezhe Ma, Xiang Kong, Shanghang Zhang, Eduard Hovy

In this work, we propose a new generative model that is capable of automatically decoupling global and local representations of images in an entirely unsupervised setting, by embedding a generative flow in the VAE framework to model the decoder.

Density Estimation Image Generation +2

COVID-CT-Dataset: A CT Scan Dataset about COVID-19

19 code implementations30 Mar 2020 Xingyi Yang, Xuehai He, Jinyu Zhao, Yichen Zhang, Shanghang Zhang, Pengtao Xie

Using this dataset, we develop diagnosis methods based on multi-task learning and self-supervised learning, which achieve an F1 of 0.90, an AUC of 0.98, and an accuracy of 0.89.

Computed Tomography (CT) COVID-19 Diagnosis +2

Decoupling Features and Coordinates for Few-shot RGB Relocalization

no code implementations26 Nov 2019 Siyan Dong, Songyin Wu, Yixin Zhuang, Kai Xu, Shanghang Zhang, Baoquan Chen

To address this issue, we approach camera relocalization with a decoupled solution where feature extraction, coordinate regression, and pose estimation are performed separately.

Camera Relocalization Pose Estimation +1

Multi-source Distilling Domain Adaptation

1 code implementation22 Nov 2019 Sicheng Zhao, Guangzhi Wang, Shanghang Zhang, Yang Gu, Yaxian Li, Zhichao Song, Pengfei Xu, Runbo Hu, Hua Chai, Kurt Keutzer

Deep neural networks suffer from performance decay when there is domain shift between the labeled source domain and unlabeled target domain, which motivates the research on domain adaptation (DA).

Domain Adaptation Multi-Source Unsupervised Domain Adaptation

Generalized Zero-shot ICD Coding

no code implementations28 Sep 2019 Congzheng Song, Shanghang Zhang, Najmeh Sadoughi, Pengtao Xie, Eric Xing

The International Classification of Diseases (ICD) is a list of classification codes for diagnoses.

General Classification Generalized Zero-Shot Learning +3

Dual Adversarial Semantics-Consistent Network for Generalized Zero-Shot Learning

no code implementations NeurIPS 2019 Jian Ni, Shanghang Zhang, Haiyong Xie

In particular, the primal GAN learns to synthesize inter-class discriminative and semantics-preserving visual features from both the semantic representations of seen/unseen classes and the ones reconstructed by the dual GAN.

Generalized Zero-Shot Learning Transfer Learning

MaCow: Masked Convolutional Generative Flow

2 code implementations NeurIPS 2019 Xuezhe Ma, Xiang Kong, Shanghang Zhang, Eduard Hovy

Flow-based generative models, conceptually attractive due to the tractability of both exact log-likelihood computation and latent-variable inference, and the efficiency of both training and sampling, have led to a number of impressive empirical successes and spawned many advanced variants and theoretical investigations.

Density Estimation Image Generation

Adversarial Multiple Source Domain Adaptation

no code implementations NeurIPS 2018 Han Zhao, Shanghang Zhang, Guanhang Wu, José M. F. Moura, Joao P. Costeira, Geoffrey J. Gordon

In this paper we propose new generalization bounds and algorithms under both classification and regression settings for unsupervised multiple source domain adaptation.

Classification Domain Adaptation +5

Modeling relation paths for knowledge base completion via joint adversarial training

1 code implementation14 Oct 2018 Chen Li, Xutan Peng, Shanghang Zhang, Hao Peng, Philip S. Yu, Min He, Linfeng Du, Lihong Wang

By treating relations and multi-hop paths as two different input sources, we use a feature extractor, which is shared by two downstream components (i.e., relation classifier and source discriminator), to capture shared/similar information between them.

Knowledge Base Completion

Learning to Understand Image Blur

no code implementations CVPR 2018 Shanghang Zhang, Xiaohui Shen, Zhe Lin, Radomír Měch, João P. Costeira, José M. F. Moura

In this paper, we propose a unified framework to estimate a spatially-varying blur map and understand its desirability in terms of image quality at the same time.

Multiple Source Domain Adaptation with Adversarial Learning

no code implementations ICLR 2018 Han Zhao, Shanghang Zhang, Guanhang Wu, João P. Costeira, José M. F. Moura, Geoffrey J. Gordon

We propose a new generalization bound for domain adaptation when there are multiple source domains with labeled instances and one target domain with unlabeled instances.

Domain Adaptation Sentiment Analysis

Topology Adaptive Graph Convolutional Networks

2 code implementations ICLR 2018 Jian Du, Shanghang Zhang, Guanhang Wu, Jose M. F. Moura, Soummya Kar

Spectral graph convolutional neural networks (CNNs) require approximation to the convolution to alleviate the computational complexity, resulting in performance loss.

FCN-rLSTM: Deep Spatio-Temporal Neural Networks for Vehicle Counting in City Cameras

1 code implementation ICCV 2017 Shanghang Zhang, Guanhang Wu, João P. Costeira, José M. F. Moura

To overcome limitations of existing methods and incorporate the temporal information of traffic video, we design a novel FCN-rLSTM network to jointly estimate vehicle density and vehicle count by connecting fully convolutional networks (FCN) with long short-term memory (LSTM) networks in a residual learning fashion.

Multiple Source Domain Adaptation with Adversarial Training of Neural Networks

4 code implementations26 May 2017 Han Zhao, Shanghang Zhang, Guanhang Wu, João P. Costeira, José M. F. Moura, Geoffrey J. Gordon

As a step toward bridging the gap, we propose a new generalization bound for domain adaptation when there are multiple source domains with labeled instances and one target domain with unlabeled instances.

Domain Adaptation Sentiment Analysis

Understanding Traffic Density from Large-Scale Web Camera Data

1 code implementation CVPR 2017 Shanghang Zhang, Guanhang Wu, João P. Costeira, José M. F. Moura

Understanding traffic density from large-scale web camera (webcam) videos is a challenging problem because such videos have low spatial and temporal resolution, high occlusion and large perspective.

