Search Results for author: Han Hu

Found 99 papers, 59 papers with code

Scalable Differential Privacy with Certified Robustness in Adversarial Learning

1 code implementation ICML 2020 Hai Phan, My T. Thai, Han Hu, Ruoming Jin, Tong Sun, Dejing Dou

In this paper, we aim to develop a scalable algorithm to preserve differential privacy (DP) in adversarial learning for deep neural networks (DNNs), with certified robustness to adversarial examples.

Few-Shot Relation Classification: A Survey (小样本关系分类研究综述)

no code implementations CCL 2020 Han Hu, Pengyuan Liu

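Relation classification, as a key step in building structured knowledge, has attracted considerable attention in natural language processing. In many application domains (such as medicine and finance), however, it is very difficult to collect enough data to train relation classification models. In recent years, few-shot learning, which requires only a small number of training samples, has been emerging across many fields. This paper presents a systematic survey of recent few-shot relation classification models and methods. According to the metric method used, existing methods are divided into two categories: prototype-based and distribution-based. According to whether additional information is used, models are divided into pre-trained and non-pre-trained. In addition to few-shot learning under the conventional setting, this paper also reviews few-shot learning in cross-domain and low-resource scenarios, discusses the limitations of current few-shot relation classification methods, and analyzes the technical challenges facing cross-domain few-shot learning. Finally, it outlines future directions for few-shot relation classification.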

Few-Shot Relation Classification

Human Pose as Compositional Tokens

no code implementations 21 Mar 2023 Zigang Geng, Chunyu Wang, Yixuan Wei, Ze Liu, Houqiang Li, Han Hu

Human pose is typically represented by a coordinate vector of body joints or their heatmap embeddings.

Pose Estimation

Pseudo Supervised Metrics: Evaluating Unsupervised Image to Image Translation Models In Unsupervised Cross-Domain Classification Frameworks

no code implementations 18 Mar 2023 Firas Al-Hindawi, Md Mahfuzur Rahman Siddiquee, Teresa Wu, Han Hu, Ying Sun

In this paper, we introduce a new method called Pseudo Supervised Metrics, designed specifically to support cross-domain classification applications, in contrast to commonly used metrics such as FID, which was designed to evaluate a model by the quality of its generated images from a human-eye perspective.

Translation Unsupervised Image-To-Image Translation

Efficient Diffusion Training via Min-SNR Weighting Strategy

1 code implementation 16 Mar 2023 Tiankai Hang, Shuyang Gu, Chen Li, Jianmin Bao, Dong Chen, Han Hu, Xin Geng, Baining Guo

Denoising diffusion models have been a mainstream approach for image generation; however, training these models often suffers from slow convergence.

Denoising Image Generation +2

DeepMIM: Deep Supervision for Masked Image Modeling

1 code implementation 15 Mar 2023 Sucheng Ren, Fangyun Wei, Samuel Albanie, Zheng Zhang, Han Hu

Deep supervision, which applies extra supervision to the intermediate features of a neural network, was widely used in image classification in the early deep learning era because, compared with vanilla training, it significantly reduces training difficulty and eases optimization, for example by avoiding vanishing gradients.

Image Classification object-detection +2

SGDA: Towards 3D Universal Pulmonary Nodule Detection via Slice Grouped Domain Attention

1 code implementation 7 Mar 2023 Rui Xu, Zhi Liu, Yong Luo, Han Hu, Li Shen, Bo Du, Kaiming Kuang, Jiancheng Yang

To address this issue, we propose a slice grouped domain attention (SGDA) module to enhance the generalization capability of the pulmonary nodule detection networks.

Computed Tomography (CT)

Subspace based Federated Unlearning

no code implementations 24 Feb 2023 Guanghao Li, Li Shen, Yan Sun, Yue Hu, Han Hu, DaCheng Tao

Federated learning (FL) enables multiple clients to train a machine learning model collaboratively without exchanging their local data.

Federated Learning

Side Adapter Network for Open-Vocabulary Semantic Segmentation

1 code implementation 23 Feb 2023 Mengde Xu, Zheng Zhang, Fangyun Wei, Han Hu, Xiang Bai

A side network is attached to a frozen CLIP model with two branches: one for predicting mask proposals, and the other for predicting attention bias which is applied in the CLIP model to recognize the class of masks.

Language Modelling Open Vocabulary Semantic Segmentation +1

FedABC: Targeting Fair Competition in Personalized Federated Learning

no code implementations 15 Feb 2023 Dui Wang, Li Shen, Yong Luo, Han Hu, Kehua Su, Yonggang Wen, DaCheng Tao

In particular, we adopt the "one-vs-all" training strategy in each client to alleviate the unfair competition between classes by constructing a personalized binary classification problem for each class.

Personalized Federated Learning

Training-free Lexical Backdoor Attacks on Language Models

1 code implementation 8 Feb 2023 Yujin Huang, Terry Yue Zhuo, Qiongkai Xu, Han Hu, Xingliang Yuan, Chunyang Chen

In this work, we propose Training-Free Lexical Backdoor Attack (TFLexAttack) as the first training-free backdoor attack on language models.

Backdoor Attack Data Poisoning +1

All in Tokens: Unifying Output Space of Visual Tasks via Soft Token

1 code implementation 5 Jan 2023 Jia Ning, Chen Li, Zheng Zhang, Zigang Geng, Qi Dai, Kun He, Han Hu

With these new techniques and other designs, we show that the proposed general-purpose task-solver can perform both instance segmentation and depth estimation well.

Instance Segmentation Monocular Depth Estimation +1

TinyMIM: An Empirical Study of Distilling MIM Pre-trained Models

1 code implementation 3 Jan 2023 Sucheng Ren, Fangyun Wei, Zheng Zhang, Han Hu

Our TinyMIM model of tiny size achieves 79.6% top-1 accuracy on ImageNet-1K image classification, which sets a new record for small vision models of the same size and computation budget.

Image Classification Semantic Segmentation

Attentive Mask CLIP

no code implementations 16 Dec 2022 Yifan Yang, Weiquan Huang, Yixuan Wei, Houwen Peng, Xinyang Jiang, Huiqiang Jiang, Fangyun Wei, Yin Wang, Han Hu, Lili Qiu, Yuqing Yang

To address this issue, we propose an attentive token removal approach for CLIP training, which retains tokens with a high semantic correlation to the text description.

Contrastive Learning Retrieval +1

ResFormer: Scaling ViTs with Multi-Resolution Training

no code implementations 1 Dec 2022 Rui Tian, Zuxuan Wu, Qi Dai, Han Hu, Yu Qiao, Yu-Gang Jiang

Vision Transformers (ViTs) have achieved overwhelming success, yet they suffer from vulnerable resolution scalability, i.e., the performance drops drastically when presented with input resolutions that are unseen during training.

Action Recognition Image Classification +2

Exploring Discrete Diffusion Models for Image Captioning

1 code implementation 21 Nov 2022 Zixin Zhu, Yixuan Wei, JianFeng Wang, Zhe Gan, Zheng Zhang, Le Wang, Gang Hua, Lijuan Wang, Zicheng Liu, Han Hu

The image captioning task is typically realized by an auto-regressive method that decodes the text tokens one by one.

Image Captioning Image Generation

Could Giant Pretrained Image Models Extract Universal Representations?

no code implementations 3 Nov 2022 Yutong Lin, Ze Liu, Zheng Zhang, Han Hu, Nanning Zheng, Stephen Lin, Yue Cao

In this paper, we present a study of frozen pretrained models when applied to diverse and representative computer vision tasks, including object detection, semantic segmentation and video action recognition.

Action Recognition In Videos Instance Segmentation +5

Expediting Large-Scale Vision Transformer for Dense Prediction without Fine-tuning

no code implementations 3 Oct 2022 Weicong Liang, Yuhui Yuan, Henghui Ding, Xiao Luo, WeiHong Lin, Ding Jia, Zheng Zhang, Chao Zhang, Han Hu

Vision transformers have recently achieved competitive results across various vision tasks but still suffer from heavy computation costs when processing a large number of tokens.

Depth Estimation Image Classification +4

One-to-Many Semantic Communication Systems: Design, Implementation, Performance Evaluation

no code implementations 20 Sep 2022 Han Hu, Xingwu Zhu, Fuhui Zhou, Wei Wu, Rose Qingyang Hu, Hongbo Zhu

To effectively exploit the benefits enabled by semantic communication, in this paper, we propose a one-to-many semantic communication system.

Transfer Learning

Not All Instances Contribute Equally: Instance-adaptive Class Representation Learning for Few-Shot Visual Recognition

no code implementations 7 Sep 2022 Mengya Han, Yibing Zhan, Yong Luo, Bo Du, Han Hu, Yonggang Wen, DaCheng Tao

To address the above issues, we propose a novel metric-based meta-learning framework termed instance-adaptive class representation learning network (ICRL-Net) for few-shot visual recognition.

Meta-Learning Representation Learning

Leveraging GAN Priors for Few-Shot Part Segmentation

1 code implementation 27 Jul 2022 Mengya Han, Heliang Zheng, Chaoyue Wang, Yong Luo, Han Hu, Bo Du

Overall, this work is an attempt to explore the internal relevance between generation tasks and perception tasks by prompt designing.

Image Generation

DETRs with Hybrid Matching

1 code implementation 26 Jul 2022 Ding Jia, Yuhui Yuan, Haodi He, Xiaopei Wu, Haojun Yu, WeiHong Lin, Lei Sun, Chao Zhang, Han Hu

This end-to-end signature is important for the versatility of DETR, and it has been generalized to a wide range of visual problems, including instance/semantic segmentation, human pose estimation, and point cloud/multi-view-images based detection, etc.

Object Detection Pose Estimation +2

Lifelong DP: Consistently Bounded Differential Privacy in Lifelong Machine Learning

1 code implementation 26 Jul 2022 Phung Lai, Han Hu, NhatHai Phan, Ruoming Jin, My T. Thai, An M. Chen

In this paper, we show that the process of continually learning new tasks and memorizing previous tasks introduces unknown privacy risks and challenges to bound the privacy loss.

BIG-bench Machine Learning

On Data Scaling in Masked Image Modeling

1 code implementation 9 Jun 2022 Zhenda Xie, Zheng Zhang, Yue Cao, Yutong Lin, Yixuan Wei, Qi Dai, Han Hu

Our study reveals that: (i) Masked image modeling is also demanding on larger data.

Self-Supervised Learning

Tutel: Adaptive Mixture-of-Experts at Scale

2 code implementations 7 Jun 2022 Changho Hwang, Wei Cui, Yifan Xiong, Ziyue Yang, Ze Liu, Han Hu, Zilong Wang, Rafael Salas, Jithin Jose, Prabhat Ram, Joe Chau, Peng Cheng, Fan Yang, Mao Yang, Yongqiang Xiong

On effectiveness, the SwinV2-MoE model achieves higher accuracy than its dense counterpart in both pre-training and downstream computer vision tasks such as COCO object detection, indicating the readiness of Tutel for end-to-end real-world model training and inference.

Object Detection

Contrastive Learning Rivals Masked Image Modeling in Fine-tuning via Feature Distillation

1 code implementation 27 May 2022 Yixuan Wei, Han Hu, Zhenda Xie, Zheng Zhang, Yue Cao, Jianmin Bao, Dong Chen, Baining Guo

These properties, which we aggregately refer to as optimization friendliness, are identified and analyzed by a set of attention- and optimization-related diagnosis tools.

Ranked #2 on Instance Segmentation on COCO test-dev (using extra training data)

Contrastive Learning Image Classification +5

Revealing the Dark Secrets of Masked Image Modeling

1 code implementation 26 May 2022 Zhenda Xie, Zigang Geng, Jingcheng Hu, Zheng Zhang, Han Hu, Yue Cao

In this paper, we compare MIM with the long-dominant supervised pre-trained models from two perspectives, the visualizations and the experiments, to uncover their key representational differences.

Inductive Bias Monocular Depth Estimation +3

CDFKD-MFS: Collaborative Data-free Knowledge Distillation via Multi-level Feature Sharing

1 code implementation 24 May 2022 Zhiwei Hao, Yong Luo, Zhi Wang, Han Hu, Jianping An

To tackle this challenge, we propose a framework termed collaborative data-free knowledge distillation via multi-level feature sharing (CDFKD-MFS), which consists of a multi-header student module, an asymmetric adversarial data-free KD module, and an attention-based aggregation module.

Knowledge Distillation

Multi-Agent Collaborative Inference via DNN Decoupling: Intermediate Feature Compression and Edge Learning

1 code implementation 24 May 2022 Zhiwei Hao, Guanyu Xu, Yong Luo, Han Hu, Jianping An, Shiwen Mao

In this paper, we study the multi-agent collaborative inference scenario, where a single edge server coordinates the inference of multiple UEs.

Feature Compression

Deeper Insights into the Robustness of ViTs towards Common Corruptions

no code implementations 26 Apr 2022 Rui Tian, Zuxuan Wu, Qi Dai, Han Hu, Yu-Gang Jiang

With Vision Transformers (ViTs) making great advances in a variety of computer vision tasks, recent literature has proposed many variants of vanilla ViTs to achieve better efficiency and efficacy.

Benchmarking Data Augmentation

iCAR: Bridging Image Classification and Image-text Alignment for Visual Recognition

no code implementations 22 Apr 2022 Yixuan Wei, Yue Cao, Zheng Zhang, Zhuliang Yao, Zhenda Xie, Han Hu, Baining Guo

Second, we convert the image classification problem from learning parametric category classifier weights to learning a text encoder as a meta network to generate category classifier weights.

Action Recognition Classification +7

Enhancing the Robustness, Efficiency, and Diversity of Differentiable Architecture Search

no code implementations 10 Apr 2022 Chao Li, Jia Ning, Han Hu, Kun He

Differentiable architecture search (DARTS) has attracted much attention due to its simplicity and significant improvement in efficiency.

RankSeg: Adaptive Pixel Classification with Image Category Ranking for Segmentation

2 code implementations 8 Mar 2022 Haodi He, Yuhui Yuan, Xiangyu Yue, Han Hu

Given an input image or video, our framework first conducts multi-label classification over the complete label, then sorts the complete label and selects a small subset according to their class confidence scores.

Classification Instance Segmentation +5

Energy Efficiency and Delay Tradeoff in an MEC-Enabled Mobile IoT Network

no code implementations 8 Feb 2022 Han Hu, Weiwei Song, Qun Wang, Rose Qingyang Hu, Hongbo Zhu

Theoretical analysis proves that the proposed algorithm can achieve a $[O(1/V), O(V)]$ tradeoff between EE and service delay.

Edge-computing Stochastic Optimization

Semi-Supervised Adversarial Recognition of Refined Window Structures for Inverse Procedural Façade Modeling

no code implementations 22 Jan 2022 Han Hu, Xinrong Liang, Yulin Ding, Qisen Shang, Bo Xu, Xuming Ge, Min Chen, Ruofei Zhong, Qing Zhu

Unfortunately, the large amount of interactive sample labeling efforts has dramatically hindered the application of deep learning methods, especially for 3D modeling tasks, which require heterogeneous samples.

A Simple Baseline for Open-Vocabulary Semantic Segmentation with Pre-trained Vision-language Model

1 code implementation 29 Dec 2021 Mengde Xu, Zheng Zhang, Fangyun Wei, Yutong Lin, Yue Cao, Han Hu, Xiang Bai

However, semantic segmentation and the CLIP model operate at different visual granularities: semantic segmentation works on pixels, while CLIP works on whole images.

Image Classification Language Modelling +6

Swin Transformer V2: Scaling Up Capacity and Resolution

16 code implementations CVPR 2022 Ze Liu, Han Hu, Yutong Lin, Zhuliang Yao, Zhenda Xie, Yixuan Wei, Jia Ning, Yue Cao, Zheng Zhang, Li Dong, Furu Wei, Baining Guo

Three main techniques are proposed: 1) a residual-post-norm method combined with cosine attention to improve training stability; 2) a log-spaced continuous position bias method to effectively transfer models pre-trained using low-resolution images to downstream tasks with high-resolution inputs; 3) a self-supervised pre-training method, SimMIM, to reduce the need for vast amounts of labeled images.

Ranked #4 on Instance Segmentation on COCO minival (using extra training data)

Action Classification Image Classification +3

SimMIM: A Simple Framework for Masked Image Modeling

2 code implementations CVPR 2022 Zhenda Xie, Zheng Zhang, Yue Cao, Yutong Lin, Jianmin Bao, Zhuliang Yao, Qi Dai, Han Hu

We also leverage this approach to facilitate the training of a 3B model (SwinV2-G), which, using $40\times$ less data than in previous practice, achieves state-of-the-art results on four representative vision benchmarks.

Representation Learning Self-Supervised Image Classification

FLSys: Toward an Open Ecosystem for Federated Learning Mobile Apps

no code implementations 17 Nov 2021 Xiaopeng Jiang, Han Hu, Vijaya Datta Mayyuri, An Chen, Devu M. Shila, Adriaan Larmuseau, Ruoming Jin, Cristian Borcea, NhatHai Phan

This article presents the design, implementation, and evaluation of FLSys, a mobile-cloud federated learning (FL) system, which can be a key component for an open ecosystem of FL models and apps.

Data Augmentation Federated Learning +3

Bootstrap Your Object Detector via Mixed Training

1 code implementation NeurIPS 2021 Mengde Xu, Zheng Zhang, Fangyun Wei, Yutong Lin, Yue Cao, Stephen Lin, Han Hu, Xiang Bai

We introduce MixTraining, a new training paradigm for object detection that can improve the performance of existing detectors for free.

Data Augmentation object-detection +1

Joint Task Offloading and Resource Allocation for IoT Edge Computing with Sequential Task Dependency

no code implementations 23 Oct 2021 Xuming An, Rongfei Fan, Han Hu, Ning Zhang, Saman Atapattu, Theodoros A. Tsiftsis

To solve this challenging problem, we decompose it into a one-dimensional search over the task offloading decision and a non-convex optimization problem with the task offloading decision given.


Semi-Supervised Semantic Segmentation via Adaptive Equalization Learning

1 code implementation NeurIPS 2021 Hanzhe Hu, Fangyun Wei, Han Hu, Qiwei Ye, Jinshi Cui, LiWei Wang

The confidence bank is leveraged as an indicator to tilt training towards under-performing categories, instantiated in three strategies: 1) adaptive Copy-Paste and CutMix data augmentation approaches which give more chance for under-performing categories to be copied or cut; 2) an adaptive data sampling approach to encourage pixels from under-performing category to be sampled; 3) a simple yet effective re-weighting method to alleviate the training noise raised by pseudo-labeling.

Data Augmentation Semi-Supervised Semantic Segmentation

Meta-learning an Intermediate Representation for Few-shot Block-wise Prediction of Landslide Susceptibility

1 code implementation 3 Oct 2021 Li Chen, Yulin Ding, Saeid Pirasteh, Han Hu, Qing Zhu, Haowei Zeng, Haojia Yu, Qisen Shang, Yongfei Song

Then, the critical problem is that, in each block with limited samples, it is impossible to train and test a model that gives a satisfactory LSM prediction, especially in dangerous mountainous areas where landslide surveying is expensive.


Energy-Efficient Design for IRS-Assisted MEC Networks with NOMA

no code implementations 19 Sep 2021 Qun Wang, Fuhui Zhou, Han Hu, Rose Qingyang Hu

Energy-efficient design is of crucial importance in wireless internet of things (IoT) networks.


Video Swin Transformer

12 code implementations CVPR 2022 Ze Liu, Jia Ning, Yue Cao, Yixuan Wei, Zheng Zhang, Stephen Lin, Han Hu

The vision community is witnessing a modeling shift from CNNs to Transformers, where pure Transformer architectures have attained top accuracy on the major video recognition benchmarks.

Ranked #21 on Action Classification on Kinetics-600 (using extra training data)

Action Classification Action Recognition +5

End-to-End Semi-Supervised Object Detection with Soft Teacher

6 code implementations ICCV 2021 Mengde Xu, Zheng Zhang, Han Hu, JianFeng Wang, Lijuan Wang, Fangyun Wei, Xiang Bai, Zicheng Liu

This paper presents an end-to-end semi-supervised object detection approach, in contrast to previous more complex multi-stage methods.

Instance Segmentation object-detection +4

Aligning Pretraining for Detection via Object-Level Contrastive Learning

1 code implementation NeurIPS 2021 Fangyun Wei, Yue Gao, Zhirong Wu, Han Hu, Stephen Lin

Image-level contrastive representation learning has proven to be highly effective as a generic model for transfer learning.

Contrastive Learning object-detection +5

TENSILE: A Tensor granularity dynamic GPU memory scheduling method toward multiple dynamic workloads system

no code implementations 27 May 2021 Kaixin Zhang, Hongzhi Wang, Han Hu, Songling Zou, Jiye Qiu, Tongxin Li, Zhishun Wang

In this paper, we demonstrate TENSILE, a method of managing GPU memory at tensor granularity to reduce the GPU memory peak while accounting for multiple dynamic workloads.

Management Scheduling

Group-Free 3D Object Detection via Transformers

3 code implementations ICCV 2021 Ze Liu, Zheng Zhang, Yue Cao, Han Hu, Xin Tong

Instead of grouping local points to each object candidate, our method computes the feature of an object from all the points in the point cloud with the help of the attention mechanism in Transformers, where the contribution of each point is automatically learned in the network training.

3D Object Detection object-detection

Capsule Network is Not More Robust than Convolutional Network

no code implementations CVPR 2021 Jindong Gu, Volker Tresp, Han Hu

The examination reveals five major new/different components in CapsNet: a transformation process, a dynamic routing layer, a squashing function, a marginal loss other than cross-entropy loss, and an additional class-conditional reconstruction loss for regularization.

Image Classification

Boosting Adversarial Transferability through Enhanced Momentum

no code implementations 19 Mar 2021 Xiaosen Wang, Jiadong Lin, Han Hu, Jingdong Wang, Kun He

Various momentum iterative gradient-based methods have been shown to be effective in improving adversarial transferability.

Adversarial Attack

Mobility-Aware Offloading and Resource Allocation in MEC-Enabled IoT Networks

no code implementations 16 Mar 2021 Han Hu, Weiwei Song, Qun Wang, Fuhui Zhou, Rose Qingyang Hu

In this paper, the offloading decision and resource allocation problem is studied with mobility consideration.

Association Autonomous Driving +1

Secure and Energy-Efficient Offloading and Resource Allocation in a NOMA-Based MEC Network

no code implementations 9 Feb 2021 Qun Wang, Han Hu, Haijian Sun, Rose Qingyang Hu

In this paper, we study the task offloading and resource allocation problem in a non-orthogonal multiple access (NOMA) assisted MEC network with security and energy efficiency considerations.


Robustness of on-device Models: Adversarial Attack to Deep Learning Models on Android Apps

1 code implementation 12 Jan 2021 Yujin Huang, Han Hu, Chunyang Chen

Deep learning has shown its power in many applications, including object detection in images, natural-language understanding, and speech recognition.

Adversarial Attack Image Classification +3

Global Context Networks

3 code implementations 24 Dec 2020 Yue Cao, Jiarui Xu, Stephen Lin, Fangyun Wei, Han Hu

The Non-Local Network (NLNet) presents a pioneering approach for capturing long-range dependencies within an image, via aggregating query-specific global context to each query position.

Instance Segmentation Object Detection

Evading Web Application Firewalls with Reinforcement Learning

no code implementations CUHK Course IERG5350 2020 Xianbo Wang, Han Hu

Our framework successfully discovered a number of evasion payloads for each WAF in our experiments and can significantly outperform the baseline policy.

OpenAI Gym reinforcement-learning +1

Depth-Enhanced Feature Pyramid Network for Occlusion-Aware Verification of Buildings from Oblique Images

no code implementations 26 Nov 2020 Qing Zhu, Shengzhi Huang, Han Hu, Haifeng Li, Min Chen, Ruofei Zhong

Finally, multi-view information from both the nadir and oblique images is used in a robust voting procedure to label changes in existing buildings.

Joint Task Offloading and Resource Allocation for IoT Edge Computing with Sequential Task Dependency

no code implementations 25 Nov 2020 Xuming An, Rongfei Fan, Han Hu, Ning Zhang, Saman Atapattu, Theodoros A. Tsiftsis

To solve this challenging problem, we decompose it into a one-dimensional search over the task offloading decision and a non-convex optimization problem with the task offloading decision given.

Edge-computing Information Theory

Structure-Aware Completion of Photogrammetric Meshes in Urban Road Environment

1 code implementation 23 Nov 2020 Qing Zhu, Qisen Shang, Han Hu, Haojia Yu, Ruofei Zhong

Finally, the completed rendered image is deintegrated to the original texture atlas and the triangles for the vehicles are also flattened for improved meshes.

object-detection Object Detection

Propagate Yourself: Exploring Pixel-Level Consistency for Unsupervised Visual Representation Learning

7 code implementations CVPR 2021 Zhenda Xie, Yutong Lin, Zheng Zhang, Yue Cao, Stephen Lin, Han Hu

We argue that the power of contrastive learning has yet to be fully unleashed, as current methods are trained only on instance-level pretext tasks, leading to representations that may be sub-optimal for downstream tasks requiring dense pixel predictions.

Contrastive Learning object-detection +3

RepPoints V2: Verification Meets Regression for Object Detection

1 code implementation NeurIPS 2020 Yihong Chen, Zheng Zhang, Yue Cao, Li-Wei Wang, Stephen Lin, Han Hu

Though RepPoints provides high performance, we find that its heavy reliance on regression for object localization leaves room for improvement.

Instance Segmentation object-detection +5

A Closer Look at Local Aggregation Operators in Point Cloud Analysis

1 code implementation ECCV 2020 Ze Liu, Han Hu, Yue Cao, Zheng Zhang, Xin Tong

Our investigation reveals that despite the different designs of these operators, all of these operators make surprisingly similar contributions to the network performance under the same network input and feature numbers and result in the state-of-the-art accuracy on standard benchmarks.

3D Semantic Segmentation

Disentangled Non-Local Neural Networks

4 code implementations ECCV 2020 Minghao Yin, Zhuliang Yao, Yue Cao, Xiu Li, Zheng Zhang, Stephen Lin, Han Hu

This paper first studies the non-local block in depth, where we find that its attention computation can be split into two terms, a whitened pairwise term accounting for the relationship between two pixels and a unary term representing the saliency of every pixel.

Action Recognition object-detection +2

Ontology-based Interpretable Machine Learning for Textual Data

2 code implementations 1 Apr 2020 Phung Lai, NhatHai Phan, Han Hu, Anuja Badeti, David Newman, Dejing Dou

In this paper, we introduce a novel interpreting framework that learns an interpretable model based on an ontology-based sampling technique to explain agnostic prediction models.

BIG-bench Machine Learning Interpretable Machine Learning

Memory Enhanced Global-Local Aggregation for Video Object Detection

2 code implementations CVPR 2020 Yihong Chen, Yue Cao, Han Hu, Li-Wei Wang

We argue that there are two important cues for humans to recognize objects in videos: the global semantic information and the local localization information.

object-detection Video Object Detection

Fast and Regularized Reconstruction of Building Façades from Street-View Images using Binary Integer Programming

1 code implementation 20 Feb 2020 Han Hu, Libin Wang, Mier Zhang, Yulin Ding, Qing Zhu

Regularized arrangement of primitives on building façades to aligned locations and consistent sizes is important for structured reconstruction of urban environments.

3D Reconstruction

Deep Fusion of Local and Non-Local Features for Precision Landslide Recognition

1 code implementation 20 Feb 2020 Qing Zhu, Lin Chen, Han Hu, Binzhi Xu, Yeting Zhang, Haifeng Li

The second uses a scale attention mechanism to guide the up-sampling of features from the coarse level by a learned weight map.

Semantic Segmentation

Dense RepPoints: Representing Visual Objects with Dense Point Sets

2 code implementations ECCV 2020 Ze Yang, Yinghao Xu, Han Xue, Zheng Zhang, Raquel Urtasun, Li-Wei Wang, Stephen Lin, Han Hu

We present a new object representation, called Dense RepPoints, that utilizes a large set of points to describe an object at multiple levels, including both box level and pixel level.

Object Detection

MAP-Net: Multi Attending Path Neural Network for Building Footprint Extraction from Remote Sensed Imagery

1 code implementation 26 Oct 2019 Qing Zhu, Cheng Liao, Han Hu, Xiaoming Mei, Haifeng Li

This paper proposes a novel multi attending path neural network (MAP-Net) for accurately extracting multiscale building footprints and precise boundaries.

Differential Privacy in Adversarial Learning with Provable Robustness

no code implementations 25 Sep 2019 NhatHai Phan, My T. Thai, Ruoming Jin, Han Hu, Dejing Dou

In this paper, we aim to develop a novel mechanism to preserve differential privacy (DP) in adversarial learning for deep neural networks, with provable robustness to adversarial examples.

XCMRC: Evaluating Cross-lingual Machine Reading Comprehension

no code implementations 15 Aug 2019 Pengyuan Liu, Yuning Deng, Chenghao Zhu, Han Hu

Chinese and English form a rich-resource language pair. To study low-resource cross-lingual machine reading comprehension (XMRC), besides defining the common XCMRC task, which has no restrictions on the use of external language resources, we also define a pseudo low-resource XCMRC task by limiting the language resources that may be used.

Machine Reading Comprehension

Spatial-Temporal Relation Networks for Multi-Object Tracking

no code implementations ICCV 2019 Jiarui Xu, Yue Cao, Zheng Zhang, Han Hu

Recent progress in multiple object tracking (MOT) has shown that a robust similarity score is key to the success of trackers.

Multi-Object Tracking Multiple Object Tracking

GCNet: Non-local Networks Meet Squeeze-Excitation Networks and Beyond

9 code implementations 25 Apr 2019 Yue Cao, Jiarui Xu, Stephen Lin, Fangyun Wei, Han Hu

In this paper, we take advantage of this finding to create a simplified network based on a query-independent formulation, which maintains the accuracy of NLNet but with significantly less computation.

Instance Segmentation Object Detection +1

Preserving Differential Privacy in Adversarial Learning with Provable Robustness

no code implementations 23 Mar 2019 NhatHai Phan, My T. Thai, Ruoming Jin, Han Hu, Dejing Dou

In this paper, we aim to develop a novel mechanism to preserve differential privacy (DP) in adversarial learning for deep neural networks, with provable robustness to adversarial examples.

Cryptography and Security

Deep Metric Transfer for Label Propagation with Limited Annotated Data

1 code implementation 20 Dec 2018 Bin Liu, Zhirong Wu, Han Hu, Stephen Lin

In this paper, we propose a generic framework that utilizes unlabeled data to aid generalization for all three tasks.

Metric Learning Object Recognition +1

Deformable ConvNets v2: More Deformable, Better Results

21 code implementations CVPR 2019 Xizhou Zhu, Han Hu, Stephen Lin, Jifeng Dai

The superior performance of Deformable Convolutional Networks arises from its ability to adapt to the geometric variations of objects.

Instance Segmentation Object Detection +1

Learning Region Features for Object Detection

no code implementations ECCV 2018 Jiayuan Gu, Han Hu, Li-Wei Wang, Yichen Wei, Jifeng Dai

While most steps in the modern object detection methods are learnable, the region feature extraction step remains largely hand-crafted, featured by RoI pooling methods.

object-detection Object Detection

Relation Networks for Object Detection

6 code implementations CVPR 2018 Han Hu, Jiayuan Gu, Zheng Zhang, Jifeng Dai, Yichen Wei

Although it has been widely believed for years that modeling relations between objects would help object recognition, there has been no evidence that the idea works in the deep learning era.

object-detection Object Detection +1

Adaptive Laplace Mechanism: Differential Privacy Preservation in Deep Learning

2 code implementations 18 Sep 2017 NhatHai Phan, Xintao Wu, Han Hu, Dejing Dou

In this paper, we focus on developing a novel mechanism to preserve differential privacy in deep neural networks, such that: (1) The privacy budget consumption is totally independent of the number of training steps; (2) It has the ability to adaptively inject noise into features based on the contribution of each to the output; and (3) It could be applied in a variety of different deep neural networks.

WordSup: Exploiting Word Annotations for Character based Text Detection

no code implementations ICCV 2017 Han Hu, Chengquan Zhang, Yuxuan Luo, Yuzhuo Wang, Junyu Han, Errui Ding

When applied in scene text detection, we are thus able to train a robust character detector by exploiting word annotations in the rich large-scale real scene text datasets, e.g., ICDAR15 and COCO-text.

Scene Text Detection

Deformable Convolutional Networks

37 code implementations ICCV 2017 Jifeng Dai, Haozhi Qi, Yuwen Xiong, Yi Li, Guodong Zhang, Han Hu, Yichen Wei

Convolutional neural networks (CNNs) are inherently limited in modeling geometric transformations due to the fixed geometric structures in their building modules.

Object Detection Semantic Segmentation

Power Data Classification: A Hybrid of a Novel Local Time Warping and LSTM

no code implementations 15 Aug 2016 Yuanlong Li, Han Hu, Yonggang Wen, Jun Zhang

Finally, using the power consumption data from a real data center, we show that the proposed LTW can improve the classification accuracy of DTW from about 84% to 90%.

Classification General Classification +2

Smooth Representation Clustering

no code implementations CVPR 2014 Han Hu, Zhouchen Lin, Jianjiang Feng, Jie Zhou

Based on our analysis, we propose the SMooth Representation (SMR) model.

Pose from Flow and Flow from Pose

no code implementations CVPR 2013 Katerina Fragkiadaki, Han Hu, Jianbo Shi

The pose labeled segments and corresponding articulated joints are used to improve the motion flow fields by proposing kinematically constrained affine displacements on body parts.

Motion Estimation
