Search Results for author: Xiangyu Zhang

Found 144 papers, 85 papers with code

Detecting Backdoors in Pre-trained Encoders

1 code implementation 23 Mar 2023 Shiwei Feng, Guanhong Tao, Siyuan Cheng, Guangyu Shen, Xiangzhe Xu, Yingqi Liu, Kaiyuan Zhang, Shiqing Ma, Xiangyu Zhang

We show the effectiveness of our method on image encoders pre-trained on ImageNet and OpenAI's CLIP 400 million image-text pairs.

Self-Supervised Learning

Exploring Object-Centric Temporal Modeling for Efficient Multi-View 3D Object Detection

1 code implementation 21 Mar 2023 Shihao Wang, Yingfei Liu, Tiancai Wang, Ying Li, Xiangyu Zhang

In this paper, we propose a long-sequence modeling framework, named StreamPETR, for multi-view 3D object detection.

3D Object Detection object-detection

VoxelNeXt: Fully Sparse VoxelNet for 3D Object Detection and Tracking

1 code implementation 20 Mar 2023 Yukang Chen, Jianhui Liu, Xiangyu Zhang, Xiaojuan Qi, Jiaya Jia

Our core insight is to predict objects directly based on sparse voxel features, without relying on hand-crafted proxies.

3D Object Detection object-detection

Exploring Recurrent Long-term Temporal Fusion for Multi-view 3D Perception

no code implementations 10 Mar 2023 Chunrui Han, Jianjian Sun, Zheng Ge, Jinrong Yang, Runpei Dong, HongYu Zhou, Weixin Mao, Yuang Peng, Xiangyu Zhang

In this paper, we explore an embarrassingly simple long-term recurrent fusion strategy built upon LSS-based methods and find that it already enjoys the merits of both sides, i.e., rich long-term information and an efficient fusion pipeline.

motion prediction object-detection +1

Referring Multi-Object Tracking

1 code implementation 6 Mar 2023 Dongming Wu, Wencheng Han, Tiancai Wang, Xingping Dong, Xiangyu Zhang, Jianbing Shen

In this paper, we propose a new and general referring understanding task, termed referring multi-object tracking (RMOT).

Multi-Object Tracking

Contrast with Reconstruct: Contrastive 3D Representation Learning Guided by Generative Pretraining

2 code implementations 5 Feb 2023 Zekun Qi, Runpei Dong, Guofan Fan, Zheng Ge, Xiangyu Zhang, Kaisheng Ma, Li Yi

This motivates us to learn 3D representations by sharing the merits of both paradigms, which is non-trivial due to the pattern difference between the two paradigms.

3D Point Cloud Linear Classification Few-Shot 3D Point Cloud Classification +2

KNOD: Domain Knowledge Distilled Tree Decoder for Automated Program Repair

1 code implementation 3 Feb 2023 Nan Jiang, Thibaud Lutellier, Yiling Lou, Lin Tan, Dan Goldwasser, Xiangyu Zhang

KNOD has two major novelties, including (1) a novel three-stage tree decoder, which directly generates Abstract Syntax Trees of patched code according to the inherent tree structure, and (2) a novel domain-rule distillation, which leverages syntactic and semantic rules and teacher-student distributions to explicitly inject the domain knowledge into the decoding procedure during both the training and inference phases.

Program Repair

BEAGLE: Forensics of Deep Learning Backdoor Attack for Better Defense

1 code implementation 16 Jan 2023 Siyuan Cheng, Guanhong Tao, Yingqi Liu, Shengwei An, Xiangzhe Xu, Shiwei Feng, Guangyu Shen, Kaiyuan Zhang, QiuLing Xu, Shiqing Ma, Xiangyu Zhang

Attack forensics, a critical countermeasure for traditional cyber attacks, is hence important for defending against model backdoor attacks.

Backdoor Attack

Cross Modal Transformer: Towards Fast and Robust 3D Object Detection

2 code implementations 3 Jan 2023 Junjie Yan, Yingfei Liu, Jianjian Sun, Fan Jia, Shuailin Li, Tiancai Wang, Xiangyu Zhang

In this paper, we propose a robust 3D detector, named Cross Modal Transformer (CMT), for end-to-end 3D multi-modal detection.

3D Object Detection object-detection

Understanding Imbalanced Semantic Segmentation Through Neural Collapse

2 code implementations 3 Jan 2023 Zhisheng Zhong, Jiequan Cui, Yibo Yang, Xiaoyang Wu, Xiaojuan Qi, Xiangyu Zhang, Jiaya Jia

Based on our empirical and theoretical analysis, we point out that semantic segmentation naturally brings contextual correlation and imbalanced distribution among classes, which breaks the equiangular and maximally separated structure of neural collapse for both feature centers and classifiers.

3D Semantic Segmentation

Reversible Column Networks

1 code implementation 22 Dec 2022 Yuxuan Cai, Yizhuang Zhou, Qi Han, Jianjian Sun, Xiangwen Kong, Jun Li, Xiangyu Zhang

This architectural scheme gives RevCol very different behavior from conventional networks: during forward propagation, features in RevCol are learned to be gradually disentangled as they pass through each column, and their total information is maintained rather than compressed or discarded as in other networks.

Ranked #7 on Object Detection on COCO minival (using extra training data)

Image Classification object-detection +2

Backdoor Vulnerabilities in Normally Trained Deep Learning Models

no code implementations 29 Nov 2022 Guanhong Tao, Zhenting Wang, Siyuan Cheng, Shiqing Ma, Shengwei An, Yingqi Liu, Guangyu Shen, Zhuo Zhang, Yunshu Mao, Xiangyu Zhang

We leverage 20 different types of injected backdoor attacks in the literature as the guidance and study their correspondences in normally trained models, which we call natural backdoor vulnerabilities.

Data Poisoning

Near-Field Channel Estimation for Extremely Large-Scale Array Communications: A model-based deep learning approach

no code implementations 28 Nov 2022 Xiangyu Zhang, Zening Wang, Haiyang Zhang, Luxi Yang

In particular, we first formulate the XL-MIMO near-field channel estimation task as a compressed sensing problem using the spatial gridding-based sparsifying dictionary, and then solve the resulting problem by applying the Learning Iterative Shrinkage and Thresholding Algorithm (LISTA).

Dictionary Learning
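
The near-field channel estimation snippet above names LISTA, which unrolls the classic Iterative Shrinkage and Thresholding Algorithm (ISTA) and learns its per-iteration parameters. As a rough illustration only, the sketch below runs plain, unlearned ISTA on a small real-valued sparse recovery problem. The function names, the toy dictionary, and the real-valued setting are assumptions made for this example; the paper's gridding-based dictionary, complex-valued channels, and training procedure are not reproduced here.

```python
# Plain (unlearned) ISTA for  min_x  0.5 * ||y - A x||^2 + lam * ||x||_1.
# LISTA would instead learn the matrices and thresholds of each unrolled
# iteration; that learning step is deliberately not shown.
# All names here are hypothetical and chosen for illustration.


def soft_threshold(v, t):
    """Element-wise shrinkage: sign(v) * max(|v| - t, 0)."""
    return [max(abs(x) - t, 0.0) * (1.0 if x > 0 else -1.0) for x in v]


def matvec(M, v):
    """Dense matrix-vector product on nested lists."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def transpose(M):
    return [list(col) for col in zip(*M)]


def ista(A, y, lam, step, n_iter=200):
    """Run n_iter ISTA steps; step should not exceed 1 / ||A^T A||."""
    x = [0.0] * len(A[0])
    At = transpose(A)
    for _ in range(n_iter):
        # Gradient step on the least-squares term.
        residual = [yi - ri for yi, ri in zip(y, matvec(A, x))]
        moved = [xi + step * gi for xi, gi in zip(x, matvec(At, residual))]
        # Proximal step for the L1 penalty.
        x = soft_threshold(moved, lam * step)
    return x


if __name__ == "__main__":
    A = [[1.0, 0.0], [0.0, 1.0]]          # toy dictionary (identity)
    print(ista(A, [3.0, 0.1], lam=0.5, step=1.0))
```

With an identity dictionary the result is simply the soft-thresholded measurement, which makes the shrinkage effect easy to see: the small entry is driven to zero and the large entry is reduced by the threshold.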

MatrixVT: Efficient Multi-Camera to BEV Transformation for 3D Perception

2 code implementations 19 Nov 2022 HongYu Zhou, Zheng Ge, Zeming Li, Xiangyu Zhang

This paper proposes an efficient multi-camera to Bird's-Eye-View (BEV) view transformation method for 3D perception, dubbed MatrixVT.

Autonomous Driving object-detection +1

MOTRv2: Bootstrapping End-to-End Multi-Object Tracking by Pretrained Object Detectors

1 code implementation 17 Nov 2022 Yuang Zhang, Tiancai Wang, Xiangyu Zhang

In this paper, we propose MOTRv2, a simple yet effective pipeline to bootstrap end-to-end multi-object tracking with a pretrained object detector.

Ranked #1 on Multi-Object Tracking on DanceTrack (using extra training data)

Association Multi-Object Tracking +2

FLIP: A Provable Defense Framework for Backdoor Mitigation in Federated Learning

1 code implementation 23 Oct 2022 Kaiyuan Zhang, Guanhong Tao, QiuLing Xu, Siyuan Cheng, Shengwei An, Yingqi Liu, Shiwei Feng, Guangyu Shen, Pin-Yu Chen, Shiqing Ma, Xiangyu Zhang

In this work, we theoretically analyze the connection among cross-entropy loss, attack success rate, and clean accuracy in this setting.

Federated Learning

From Model-Based to Model-Free: Learning Building Control for Demand Response

1 code implementation 18 Oct 2022 David Biagioni, Xiangyu Zhang, Christiane Adcock, Michael Sinner, Peter Graf, Jennifer King

We demonstrate, in this context, that hybrid methods offer many benefits over both purely model-free and model-based methods as long as certain requirements are met.

PQLM -- Multilingual Decentralized Portable Quantum Language Model for Privacy Protection

no code implementations 6 Oct 2022 Shuyue Stella Li, Xiangyu Zhang, Shu Zhou, Hongchao Shu, Ruixing Liang, Hexin Liu, Leibny Paola Garcia

In this work, we propose a highly Portable Quantum Language Model (PQLM) that can easily transmit information to downstream tasks on classical machines.

Language Modelling Sentence Embedding +3

Differentiable Architecture Search with Random Features

no code implementations 18 Aug 2022 Xuanyang Zhang, Yonggang Li, Xiangyu Zhang, Yongtao Wang, Jian Sun

Differentiable architecture search (DARTS) has significantly promoted the development of NAS techniques because of its high search efficiency and effectiveness but suffers from performance collapse.

Neural Architecture Search

Understanding Masked Image Modeling via Learning Occlusion Invariant Feature

no code implementations 8 Aug 2022 Xiangwen Kong, Xiangyu Zhang

Recently, Masked Image Modeling (MIM) has achieved great success in self-supervised visual recognition.

Contrastive Learning

Revisiting the Critical Factors of Augmentation-Invariant Representation Learning

1 code implementation 30 Jul 2022 Junqiang Huang, Xiangwen Kong, Xiangyu Zhang

We focus on better understanding the critical factors of augmentation-invariant representation learning.

Representation Learning

Physical Attack on Monocular Depth Estimation with Optimal Adversarial Patches

no code implementations 11 Jul 2022 Zhiyuan Cheng, James Liang, Hongjun Choi, Guanhong Tao, Zhiwen Cao, Dongfang Liu, Xiangyu Zhang

Experimental results show that our method can generate stealthy, effective, and robust adversarial patches for different target objects and models and achieves more than 6 meters mean depth estimation error and 93% attack success rate (ASR) in object detection with a patch of 1/9 of the vehicle's rear area.

3D Object Detection Autonomous Driving +2

DECK: Model Hardening for Defending Pervasive Backdoors

no code implementations 18 Jun 2022 Guanhong Tao, Yingqi Liu, Siyuan Cheng, Shengwei An, Zhuo Zhang, QiuLing Xu, Guangyu Shen, Xiangyu Zhang

As such, using the samples derived from our attack in adversarial training can harden a model against these backdoor vulnerabilities.

Re-parameterizing Your Optimizers rather than Architectures

1 code implementation 30 May 2022 Xiaohan Ding, Honghao Chen, Xiangyu Zhang, Kaiqi Huang, Jungong Han, Guiguang Ding

For the extreme simplicity of model structure, we focus on a VGG-style plain model and showcase that such a simple model trained with a RepOptimizer, which is referred to as RepOpt-VGG, performs on par with or better than the recent well-designed models.


Self-Supervised Visual Representation Learning with Semantic Grouping

1 code implementation 30 May 2022 Xin Wen, Bingchen Zhao, Anlin Zheng, Xiangyu Zhang, Xiaojuan Qi

The semantic grouping is performed by assigning pixels to a set of learnable prototypes, which can adapt to each sample by attentive pooling over the feature and form new slots.

Contrastive Learning Instance Segmentation +6

GL-RG: Global-Local Representation Granularity for Video Captioning

1 code implementation 22 May 2022 Liqi Yan, Qifan Wang, Yiming Cui, Fuli Feng, Xiaojun Quan, Xiangyu Zhang, Dongfang Liu

Video captioning is a challenging task as it needs to accurately transform visual understanding into natural language description.

Video Captioning

Focal Sparse Convolutional Networks for 3D Object Detection

2 code implementations CVPR 2022 Yukang Chen, Yanwei Li, Xiangyu Zhang, Jian Sun, Jiaya Jia

In this paper, we introduce two new modules to enhance the capability of Sparse CNNs, both of which are based on making feature sparsity learnable with position-wise importance prediction.

3D Object Detection object-detection

Simple Baselines for Image Restoration

7 code implementations 10 Apr 2022 Liangyu Chen, Xiaojie Chu, Xiangyu Zhang, Jian Sun

Although there have been significant advances in the field of image restoration recently, the system complexity of the state-of-the-art (SOTA) methods is increasing as well, which may hinder the convenient analysis and comparison of methods.

Deblurring Image Deblurring +2

Near-optimality for infinite-horizon restless bandits with many arms

no code implementations 29 Mar 2022 Xiangyu Zhang, Peter I. Frazier

Although an average-case-optimal policy can be computed via stochastic dynamic programming, the computation required grows exponentially with the number of arms $N$.

Active Learning Management +1

Tree Energy Loss: Towards Sparsely Annotated Semantic Segmentation

1 code implementation CVPR 2022 Zhiyuan Liang, Tiancai Wang, Xiangyu Zhang, Jian Sun, Jianbing Shen

The tree energy loss is effective and easy to incorporate into existing frameworks by combining it with a traditional segmentation loss.

Semantic Segmentation

Relieving Long-tailed Instance Segmentation via Pairwise Class Balance

2 code implementations CVPR 2022 Yin-Yin He, Peizhen Zhang, Xiu-Shen Wei, Xiangyu Zhang, Jian Sun

In this paper, we mine the confusion matrix, which carries fine-grained misclassification details, to relieve the pairwise biases, generalizing the coarse one.

Instance Segmentation Semantic Segmentation

Communication-Efficient TeraByte-Scale Model Training Framework for Online Advertising

no code implementations 5 Jan 2022 Weijie Zhao, Xuewu Jiao, Mingqing Hu, Xiaoyun Li, Xiangyu Zhang, Ping Li

In this paper, we propose a hardware-aware training workflow that couples the hardware topology into the algorithm design.

Click-Through Rate Prediction

Bounded Adversarial Attack on Deep Content Features

1 code implementation CVPR 2022 QiuLing Xu, Guanhong Tao, Xiangyu Zhang

We propose a novel adversarial attack targeting content features in some deep layer, that is, individual neurons in the layer.

Adversarial Attack

Complex Backdoor Detection by Symmetric Feature Differencing

1 code implementation CVPR 2022 Yingqi Liu, Guangyu Shen, Guanhong Tao, Zhenting Wang, Shiqing Ma, Xiangyu Zhang

Our results on the TrojAI competition rounds 2-4, which have patch backdoors and filter backdoors, show that existing scanners may produce hundreds of false positives (i.e., clean models recognized as trojaned), while our technique removes 78-100% of them with a small increase of false negatives by 0-30%, leading to 17-41% overall accuracy improvement.

RepMLPNet: Hierarchical Vision MLP with Re-parameterized Locality

3 code implementations CVPR 2022 Xiaohan Ding, Honghao Chen, Xiangyu Zhang, Jungong Han, Guiguang Ding

Our results reveal that 1) Locality Injection is a general methodology for MLP models; 2) RepMLPNet has a favorable accuracy-efficiency trade-off compared to the other MLPs; 3) RepMLPNet is the first MLP that seamlessly transfers to Cityscapes semantic segmentation.

Image Classification Semantic Segmentation

On Efficient Transformer-Based Image Pre-training for Low-Level Vision

1 code implementation 19 Dec 2021 Wenbo Li, Xin Lu, Shengju Qian, Jiangbo Lu, Xiangyu Zhang, Jiaya Jia

Pre-training has set numerous state-of-the-art results in high-level computer vision, while few attempts have been made to investigate how pre-training acts in image processing systems.

Denoising Super-Resolution

Implicit Feature Refinement for Instance Segmentation

1 code implementation 9 Dec 2021 Lufan Ma, Tiancai Wang, Bin Dong, Jiangpeng Yan, Xiu Li, Xiangyu Zhang

Our IFR enjoys several advantages: 1) it simulates an infinite-depth refinement network while only requiring the parameters of a single residual block; 2) it produces high-level equilibrium instance features with a global receptive field; 3) it serves as a plug-and-play general module that is easily extended to most object recognition frameworks.

Instance Segmentation Object Recognition +2

Two-step Lookahead Bayesian Optimization with Inequality Constraints

no code implementations 6 Dec 2021 Yunxiang Zhang, Xiangyu Zhang, Peter I. Frazier

Recent advances in computationally efficient non-myopic Bayesian optimization (BO) improve query efficiency over traditional myopic methods like expected improvement while only modestly increasing computational cost.

Spherical Motion Dynamics: Learning Dynamics of Normalized Neural Network using SGD and Weight Decay

no code implementations NeurIPS 2021 Ruosi Wan, Zhanxing Zhu, Xiangyu Zhang, Jian Sun

Specifically, 1) we introduce the assumptions that can lead to equilibrium state in SMD, and prove equilibrium can be reached in a linear rate regime under given assumptions; 2) we propose "angular update" as a substitute for effective learning rate to depict the state of SMD, and derive the theoretical value of angular update in equilibrium state; 3) we verify our assumptions and theoretical results on various large-scale computer vision tasks including ImageNet and MSCOCO with standard settings.

Constrained Two-step Look-Ahead Bayesian Optimization

no code implementations NeurIPS 2021 Yunxiang Zhang, Xiangyu Zhang, Peter Frazier

Recent advances in computationally efficient non-myopic Bayesian optimization offer improved query efficiency over traditional myopic methods like expected improvement, with only a modest increase in computational cost.

PowerGridworld: A Framework for Multi-Agent Reinforcement Learning in Power Systems

1 code implementation 10 Nov 2021 David Biagioni, Xiangyu Zhang, Dylan Wald, Deepthi Vaidhynathan, Rohit Chintala, Jennifer King, Ahmed S. Zamzam

We present the PowerGridworld software package to provide users with a lightweight, modular, and customizable framework for creating power-systems-focused, multi-agent Gym environments that readily integrate with existing training frameworks for reinforcement learning (RL).

Multi-agent Reinforcement Learning reinforcement-learning +1

A Comparison of Model-Free and Model Predictive Control for Price Responsive Water Heaters

no code implementations 8 Nov 2021 David J. Biagioni, Xiangyu Zhang, Peter Graf, Devon Sigler, Wesley Jones

We demonstrate that optimal control for this problem is challenging, requiring more than 8-hour lookahead for MPC with perfect forecasting to attain the minimum cost.

Time Series Analysis

Raw Bayer Pattern Image Synthesis for Computer Vision-oriented Image Signal Processing Pipeline Design

no code implementations 25 Oct 2021 Wei Zhou, Xiangyu Zhang, Hongyu Wang, Shenghua Gao, Xin Lou

It is shown that by adding another transformation, the proposed method is able to synthesize high-quality RAW Bayer images with arbitrary size.

Demosaicking Image Generation +3

RWN: Robust Watermarking Network for Image Cropping Localization

no code implementations 12 Oct 2021 Qichao Ying, Xiaoxiao Hu, Xiangyu Zhang, Zhenxing Qian, Xinpeng Zhang

At the recipient's side, ACP extracts the watermark from the attacked image, and we conduct feature matching on the original and extracted watermark to locate the position of the crop in the original image plane.

Image Cropping Image Forensics

Partial to Whole Knowledge Distillation: Progressive Distilling Decomposed Knowledge Boosts Student Better

no code implementations 26 Sep 2021 Xuanyang Zhang, Xiangyu Zhang, Jian Sun

The knowledge distillation field has carefully designed various types of knowledge to shrink the performance gap between a compact student and a large-scale teacher.

Knowledge Distillation

Image Synthesis via Semantic Composition

no code implementations ICCV 2021 Yi Wang, Lu Qi, Ying-Cong Chen, Xiangyu Zhang, Jiaya Jia

In this paper, we present a novel approach to synthesize realistic images based on their semantic layouts.

Image Generation Semantic Composition

Anchor DETR: Query Design for Transformer-Based Object Detection

2 code implementations 15 Sep 2021 Yingming Wang, Xiangyu Zhang, Tong Yang, Jian Sun

Thanks to the query design and the attention variant, the proposed detector, which we call Anchor DETR, can achieve better performance and run faster than DETR with 10$\times$ fewer training epochs.

object-detection Object Detection

Accelerating Markov Random Field Inference with Uncertainty Quantification

no code implementations 2 Aug 2021 Ramin Bashizade, Xiangyu Zhang, Sayan Mukherjee, Alvin R. Lebeck

In this paper, we propose a high-throughput accelerator for Markov Random Field (MRF) inference, a powerful model for representing a wide range of applications, using MCMC with Gibbs sampling.

Motion Estimation Playing the Game of 2048

Restless Bandits with Many Arms: Beating the Central Limit Theorem

no code implementations 25 Jul 2021 Xiangyu Zhang, Peter I. Frazier

Thus, there is substantial value in understanding the performance of index policies and other policies that can be computed efficiently for large $N$.

Active Learning Management +1

The Threat of Offensive AI to Organizations

no code implementations 30 Jun 2021 Yisroel Mirsky, Ambra Demontis, Jaidip Kotak, Ram Shankar, Deng Gelei, Liu Yang, Xiangyu Zhang, Wenke Lee, Yuval Elovici, Battista Biggio

Although offensive AI has been discussed in the past, there is a need to analyze and understand the threat in the context of organizations.

SOLQ: Segmenting Objects by Learning Queries

1 code implementation NeurIPS 2021 Bin Dong, Fangao Zeng, Tiancai Wang, Xiangyu Zhang, Yichen Wei

Moreover, the joint learning of unified query representation can greatly improve the detection performance of DETR.

Ranked #4 on Object Detection on COCO minival (AP75 metric)

Instance Segmentation Object Detection +1

RepMLP: Re-parameterizing Convolutions into Fully-connected Layers for Image Recognition

8 code implementations 5 May 2021 Xiaohan Ding, Chunlong Xia, Xiangyu Zhang, Xiaojie Chu, Jungong Han, Guiguang Ding

We propose RepMLP, a multi-layer-perceptron-style neural network building block for image recognition, which is composed of a series of fully-connected (FC) layers.

Face Recognition Image Classification +1

Points as Queries: Weakly Semi-supervised Object Detection by Points

1 code implementation CVPR 2021 Liangyu Chen, Tong Yang, Xiangyu Zhang, Wei zhang, Jian Sun

We propose a novel point-annotated setting for the weakly semi-supervised object detection task, in which the dataset comprises a small set of fully annotated images and a large set of images weakly annotated by points.

object-detection Object Detection +1

Joint User Association and Power Allocation in Heterogeneous Ultra Dense Network via Semi-Supervised Representation Learning

no code implementations 29 Mar 2021 Xiangyu Zhang, Zhengming Zhang, Luxi Yang

We model the HUDNs as a heterogeneous graph and train a Graph Neural Network (GNN) to approach this representation function by using semi-supervised learning, in which the loss function is composed of the unsupervised part that helps the GNN approach the optimal representation function and the supervised part that utilizes the previous experience to reduce useless exploration.

Association Representation Learning

Diverse Branch Block: Building a Convolution as an Inception-like Unit

2 code implementations CVPR 2021 Xiaohan Ding, Xiangyu Zhang, Jungong Han, Guiguang Ding

We propose a universal building block of Convolutional Neural Network (ConvNet) to improve the performance without any inference-time costs.

Image Classification object-detection +2

You Only Look One-level Feature

6 code implementations CVPR 2021 Qiang Chen, Yingming Wang, Tong Yang, Xiangyu Zhang, Jian Cheng, Jian Sun

From the perspective of optimization, we introduce an alternative way to address the problem instead of adopting the complex feature pyramids: utilizing only one-level feature for detection.

object-detection Object Detection

Backdoor Scanning for Deep Neural Networks through K-Arm Optimization

1 code implementation 9 Feb 2021 Guangyu Shen, Yingqi Liu, Guanhong Tao, Shengwei An, QiuLing Xu, Siyuan Cheng, Shiqing Ma, Xiangyu Zhang

By iteratively and stochastically selecting the most promising labels for optimization with the guidance of an objective function, we substantially reduce the complexity, making it possible to handle models with many classes.

Neural Architecture Search with Random Labels

1 code implementation CVPR 2021 Xuanyang Zhang, Pengfei Hou, Xiangyu Zhang, Jian Sun

In this paper, we investigate a new variant of neural architecture search (NAS) paradigm -- searching with random labels (RLNAS).

Neural Architecture Search

RepVGG: Making VGG-style ConvNets Great Again

18 code implementations CVPR 2021 Xiaohan Ding, Xiangyu Zhang, Ningning Ma, Jungong Han, Guiguang Ding, Jian Sun

We present a simple but powerful architecture of convolutional neural network, which has a VGG-like inference-time body composed of nothing but a stack of 3x3 convolution and ReLU, while the training-time model has a multi-branch topology.

Image Classification Semantic Segmentation
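
The RepVGG snippet above contrasts a multi-branch training-time model with an inference-time body made only of 3x3 convolutions and ReLU. Because convolution is linear, parallel branches can be folded into a single kernel. The sketch below illustrates that idea under simplifying assumptions that the snippet itself does not spell out: a single channel, branches consisting of a 3x3 convolution, a 1x1 convolution, and an identity shortcut, and no batch normalization or bias. All function names are hypothetical.

```python
# Folding parallel linear branches into one 3x3 kernel (single channel).
# Simplified illustration: no batch norm, no bias, no multi-channel handling.


def conv2d_same(img, kernel):
    """Single-channel 2D cross-correlation with zero padding (odd square kernel)."""
    k = len(kernel)
    pad = k // 2
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for di in range(k):
                for dj in range(k):
                    r, c = i + di - pad, j + dj - pad
                    if 0 <= r < h and 0 <= c < w:
                        acc += kernel[di][dj] * img[r][c]
            out[i][j] = acc
    return out


def multi_branch(img, k3, k1):
    """Training-time form: 3x3 branch + 1x1 branch (scalar k1) + identity."""
    y3 = conv2d_same(img, k3)
    return [
        [y3[i][j] + k1 * img[i][j] + img[i][j] for j in range(len(img[0]))]
        for i in range(len(img))
    ]


def fuse_branches(k3, k1):
    """Inference-time form: one 3x3 kernel equivalent to the three branches."""
    fused = [row[:] for row in k3]
    # The 1x1 weight and the identity both act only on the centre tap.
    fused[1][1] += k1 + 1.0
    return fused


if __name__ == "__main__":
    img = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
    k3 = [[1.0, 0.0, -1.0], [2.0, 0.0, -2.0], [1.0, 0.0, -1.0]]
    print(multi_branch(img, k3, 2.0) == conv2d_same(img, fuse_branches(k3, 2.0)))
```

The fused kernel gives the same output as the sum of the branches, so the extra branches add no inference-time cost in this simplified setting.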

Implicit Feature Pyramid Network for Object Detection

no code implementations 25 Dec 2020 Tiancai Wang, Xiangyu Zhang, Jian Sun

In this paper, we present an implicit feature pyramid network (i-FPN) for object detection.

object-detection Object Detection

Deep Feature Space Trojan Attack of Neural Networks by Controlled Detoxification

2 code implementations 21 Dec 2020 Siyuan Cheng, Yingqi Liu, Shiqing Ma, Xiangyu Zhang

Trojan (backdoor) attack is a form of adversarial attack on deep neural networks where the attacker provides victims with a model trained/retrained on malicious data.

Backdoor Attack

Co-mining: Self-Supervised Learning for Sparsely Annotated Object Detection

1 code implementation 3 Dec 2020 Tiancai Wang, Tong Yang, Jiale Cao, Xiangyu Zhang

Object detectors usually achieve promising results with the supervision of complete instance annotations.

MULTI-VIEW LEARNING object-detection +3

Microlensing Predictions: Impact of Galactic Disc Dynamical Models

no code implementations 30 Oct 2020 Hongjing Yang, Shude Mao, Weicheng Zang, Xiangyu Zhang

Additionally, we find the asymptotic power-law behaviors in both $\theta_{\rm E}$ and $\pi_{\rm E}$ distributions, and we provide a simple model to understand them.

Astrophysics of Galaxies Earth and Planetary Astrophysics Solar and Stellar Astrophysics

Joint COCO and Mapillary Workshop at ICCV 2019: COCO Instance Segmentation Challenge Track

no code implementations 6 Oct 2020 Zeming Li, Yuchen Ma, Yukang Chen, Xiangyu Zhang, Jian Sun

In this report, we present our object detection/instance segmentation system, MegDetV2, which works in a two-pass fashion, first to detect instances then to obtain segmentation.

Instance Segmentation object-detection +2

EqCo: Equivalent Rules for Self-supervised Contrastive Learning

1 code implementation 5 Oct 2020 Benjin Zhu, Junqiang Huang, Zeming Li, Xiangyu Zhang, Jian Sun

In this paper, we propose EqCo (Equivalent Rules for Contrastive Learning) to make self-supervised learning irrelevant to the number of negative samples in the contrastive learning framework.

Contrastive Learning Self-Supervised Learning

MPG-Net: Multi-Prediction Guided Network for Segmentation of Retinal Layers in OCT Images

no code implementations 28 Sep 2020 Zeyu Fu, Yang Sun, Xiangyu Zhang, Scott Stainton, Shaun Barney, Jeffry Hogg, William Innes, Satnam Dlay

In this paper, we propose a novel multi-prediction guided attention network (MPG-Net) for automated retinal layer segmentation in OCT images.

Deep Learning & Software Engineering: State of Research and Future Directions

1 code implementation 17 Sep 2020 Prem Devanbu, Matthew Dwyer, Sebastian Elbaum, Michael Lowry, Kevin Moran, Denys Poshyvanyk, Baishakhi Ray, Rishabh Singh, Xiangyu Zhang

The intent of this report is to serve as a potential roadmap to guide future work that sits at the intersection of SE & DL.

Activate or Not: Learning Customized Activation

4 code implementations CVPR 2021 Ningning Ma, Xiangyu Zhang, Ming Liu, Jian Sun

We present a simple, effective, and general activation function we term ACON which learns to activate the neurons or not.

object-detection Object Detection +1

WeightNet: Revisiting the Design Space of Weight Networks

2 code implementations ECCV 2020 Ningning Ma, Xiangyu Zhang, Jiawei Huang, Jian Sun

WeightNet is easy and memory-conserving to train, on the kernel space instead of the feature space.

Funnel Activation for Visual Recognition

6 code implementations ECCV 2020 Ningning Ma, Xiangyu Zhang, Jian Sun

We present a conceptually simple but effective funnel activation for image recognition tasks, called Funnel activation (FReLU), that extends ReLU and PReLU to a 2D activation by adding a negligible overhead of spatial condition.

Scene Generation Semantic Segmentation
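
The funnel activation snippet above says FReLU extends ReLU and PReLU to a 2D activation through a spatial condition. One way to read this is as an element-wise maximum between the input and a spatially computed condition, max(x, T(x)). The sketch below follows that reading, which is an assumption for this example: it uses a single channel, a fixed 3x3 window for T instead of a learned depthwise convolution, and no normalization. All names are hypothetical.

```python
# Funnel-style activation sketch: y = max(x, T(x)), with T a 3x3 window.
# A zero kernel recovers ReLU; a centre-only kernel with weight a recovers
# a PReLU-like max(x, a * x). Real models would learn T per channel.


def spatial_condition(img, kernel):
    """Single-channel 3x3 cross-correlation with zero padding."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for di in range(3):
                for dj in range(3):
                    r, c = i + di - 1, j + dj - 1
                    if 0 <= r < h and 0 <= c < w:
                        acc += kernel[di][dj] * img[r][c]
            out[i][j] = acc
    return out


def funnel_activation(img, kernel):
    """Element-wise max between the input and its spatial condition."""
    cond = spatial_condition(img, kernel)
    return [
        [max(x, t) for x, t in zip(row_x, row_t)]
        for row_x, row_t in zip(img, cond)
    ]


if __name__ == "__main__":
    img = [[-1.0, 2.0], [3.0, -4.0]]
    zero = [[0.0] * 3 for _ in range(3)]
    print(funnel_activation(img, zero))   # behaves like ReLU on this input
```

Choosing the kernel shows how the two simpler activations fall out as special cases: the condition is a constant zero for ReLU and a scaled copy of the same pixel for PReLU, while a general kernel lets each pixel's threshold depend on its neighbours.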

LabelEnc: A New Intermediate Supervision Method for Object Detection

1 code implementation ECCV 2020 Miao Hao, Yitao Liu, Xiangyu Zhang, Jian Sun

In this paper we propose a new intermediate supervision method, named LabelEnc, to boost the training of object detection systems.

object-detection Object Detection

Weight-dependent Gates for Network Pruning

no code implementations 4 Jul 2020 Yun Li, Zechun Liu, Weiqun Wu, Haotian Yao, Xiangyu Zhang, Chi Zhang, Baoqun Yin

In this paper, a simple yet effective network pruning framework is proposed to simultaneously address the problems of pruning indicator, pruning ratio, and efficiency constraint.

Network Pruning

Spherical Motion Dynamics: Learning Dynamics of Neural Network with Normalization, Weight Decay, and SGD

no code implementations 15 Jun 2020 Ruosi Wan, Zhanxing Zhu, Xiangyu Zhang, Jian Sun

In this work, we comprehensively reveal the learning dynamics of neural networks with normalization, weight decay (WD), and SGD (with momentum), which we name Spherical Motion Dynamics (SMD).

D-square-B: Deep Distribution Bound for Natural-looking Adversarial Attack

no code implementations 12 Jun 2020 Qiu-Ling Xu, Guanhong Tao, Xiangyu Zhang

We propose a novel technique that can generate natural-looking adversarial examples by bounding the variations induced for internal activation values in some deep layer(s), through a distribution quantile bound and a polynomial barrier loss function.

Adversarial Attack

Exhaustive goodness-of-fit via smoothed inference and graphics

1 code implementation 26 May 2020 Sara Algeri, Xiangyu Zhang

Classical tests of goodness-of-fit aim to validate the conformity of a postulated model to the data under study.

Methodology Statistics Theory Applications Statistics Theory

Joint Multi-Dimension Pruning via Numerical Gradient Update

no code implementations 18 May 2020 Zechun Liu, Xiangyu Zhang, Zhiqiang Shen, Zhe Li, Yichen Wei, Kwang-Ting Cheng, Jian Sun

To tackle these three naturally different dimensions, we propose a general framework that defines pruning as seeking the best pruning vector (i.e., the numerical values of layer-wise channel number, spatial size, and depth) and constructs a unique mapping from the pruning vector to the pruned network structures.

Angle-based Search Space Shrinking for Neural Architecture Search

1 code implementation ECCV 2020 Yiming Hu, Yuding Liang, Zichao Guo, Ruosi Wan, Xiangyu Zhang, Yichen Wei, Qingyi Gu, Jian Sun

Comprehensive experiments show that ABS can dramatically enhance existing NAS approaches by providing a promising shrunk search space.

Neural Architecture Search

Dynamic Scale Training for Object Detection

4 code implementations 26 Apr 2020 Yukang Chen, Peizhen Zhang, Zeming Li, Yanwei Li, Xiangyu Zhang, Lu Qi, Jian Sun, Jiaya Jia

We propose a Dynamic Scale Training paradigm (abbreviated as DST) to mitigate the scale variation challenge in object detection.

Instance Segmentation Model Optimization +3

Personalized Re-ranking for Improving Diversity in Live Recommender Systems

no code implementations 14 Apr 2020 Yichao Wang, Xiangyu Zhang, Zhirong Liu, Zhenhua Dong, Xinhua Feng, Ruiming Tang, Xiuqiang He

To overcome this limitation, our re-ranking model proposes a personalized DPP to model the trade-off between accuracy and diversity for each individual user.

Recommendation Systems Re-Ranking

Attentive Normalization for Conditional Image Generation

1 code implementation CVPR 2020 Yi Wang, Ying-Cong Chen, Xiangyu Zhang, Jian Sun, Jiaya Jia

Traditional convolution-based generative adversarial networks synthesize images based on hierarchical local operations, where long-range dependency relation is implicitly modeled with a Markov chain.

Conditional Image Generation Semantic correspondence +2

Learning Human-Object Interaction Detection using Interaction Points

1 code implementation CVPR 2020 Tiancai Wang, Tong Yang, Martin Danelljan, Fahad Shahbaz Khan, Xiangyu Zhang, Jian Sun

Human-object interaction (HOI) detection strives to localize both the human and an object and to identify the complex interactions between them.

Human-Object Interaction Detection Keypoint Detection +1

Dynamic Region-Aware Convolution

no code implementations CVPR 2021 Jin Chen, Xijun Wang, Zichao Guo, Xiangyu Zhang, Jian Sun

More gracefully, our DRConv transfers the increasing channel-wise filters to the spatial dimension with a learnable instructor, which not only improves the representation ability of convolution but also maintains the computational cost and translation-invariance of standard convolution.

Face Recognition General Classification +2

Learning Dynamic Routing for Semantic Segmentation

1 code implementation CVPR 2020 Yanwei Li, Lin Song, Yukang Chen, Zeming Li, Xiangyu Zhang, Xingang Wang, Jian Sun

To demonstrate the superiority of the dynamic property, we compare with several static architectures, which can be modeled as special cases in the routing space.

Semantic Segmentation

Detection in Crowded Scenes: One Proposal, Multiple Predictions

3 code implementations CVPR 2020 Xuangeng Chu, Anlin Zheng, Xiangyu Zhang, Jian Sun

We propose a simple yet effective proposal-based object detector, aiming at detecting highly-overlapped instances in crowded scenes.

Object Detection Pedestrian Detection

PointINS: Point-based Instance Segmentation

no code implementations 13 Mar 2020 Lu Qi, Yi Wang, Yukang Chen, Yingcong Chen, Xiangyu Zhang, Jian Sun, Jiaya Jia

In this paper, we explore the mask representation in instance segmentation with Point-of-Interest (PoI) features.

Instance Segmentation Object Detection +2

Learning Delicate Local Representations for Multi-Person Pose Estimation

4 code implementations ECCV 2020 Yuanhao Cai, Zhicheng Wang, Zhengxiong Luo, Binyi Yin, Angang Du, Haoqian Wang, Xiangyu Zhang, Xinyu Zhou, Erjin Zhou, Jian Sun

To tackle this problem, we propose an efficient attention mechanism, the Pose Refine Machine (PRM), to make a trade-off between local and global representations in output features and further refine the keypoint locations.

Keypoint Detection Multi-Person Pose Estimation

Beyond Application End-Point Results: Quantifying Statistical Robustness of MCMC Accelerators

no code implementations 5 Mar 2020 Xiangyu Zhang, Ramin Bashizade, Yicheng Wang, Cheng Lyu, Sayan Mukherjee, Alvin R. Lebeck

Applying the framework to guide design space exploration shows that statistical robustness comparable to floating-point software can be achieved by slightly increasing the bit representation, without floating-point hardware requirements.

Towards Stabilizing Batch Statistics in Backward Propagation of Batch Normalization

1 code implementation ICLR 2020 Junjie Yan, Ruosi Wan, Xiangyu Zhang, Wei Zhang, Yichen Wei, Jian Sun

Therefore, many modified normalization techniques have been proposed, which either fail to fully restore the performance of BN or have to introduce additional nonlinear operations in the inference procedure and greatly increase resource consumption.

Learning-Accelerated ADMM for Distributed Optimal Power Flow

no code implementations 8 Nov 2019 David Biagioni, Peter Graf, Xiangyu Zhang, Ahmed Zamzam, Kyri Baker, Jennifer King

We propose a novel data-driven method to accelerate the convergence of Alternating Direction Method of Multipliers (ADMM) for solving distributed DC optimal power flow (DC-OPF) where lines are shared between independent network partitions.

Distributed Optimization

A Case for Quantifying Statistical Robustness of Specialized Probabilistic AI Accelerators

no code implementations 27 Oct 2019 Xiangyu Zhang, Sayan Mukherjee, Alvin R. Lebeck

Although a common approach is to compare the end-point result quality using community-standard benchmarks and metrics, we claim a probabilistic architecture should provide some measure (or guarantee) of statistical robustness.

Resizable Neural Networks

no code implementations 25 Sep 2019 Yichen Zhu, Xiangyu Zhang, Tong Yang, Jian Sun

We introduce the adaptive resizable networks as dynamic networks, which further improve the performance with less computational cost via data-dependent inference.

Data Augmentation Neural Architecture Search

VAENAS: Sampling Matters in Neural Architecture Search

no code implementations 25 Sep 2019 Shizheng Qin, Yichen Zhu, Pengfei Hou, Xiangyu Zhang, Wenqiang Zhang, Jian Sun

In this paper, we propose a learnable sampling module based on a variational auto-encoder (VAE) for neural architecture search (NAS), named VAENAS, which can be easily embedded into existing weight-sharing NAS frameworks, e.g., the one-shot approach and the gradient-based approach, and significantly improves the performance of the search results.

Neural Architecture Search

Arbitrage of Energy Storage in Electricity Markets with Deep Reinforcement Learning

no code implementations 28 Apr 2019 Hanchen Xu, Xiao Li, Xiangyu Zhang, Junbo Zhang

In this letter, we address the problem of controlling energy storage systems (ESSs) for arbitrage in real-time electricity markets under price uncertainty.

reinforcement-learning Reinforcement Learning (RL)

DetNAS: Backbone Search for Object Detection

2 code implementations NeurIPS 2019 Yukang Chen, Tong Yang, Xiangyu Zhang, Gaofeng Meng, Xinyu Xiao, Jian Sun

In this work, we present DetNAS to use Neural Architecture Search (NAS) for the design of better backbones for object detection.

General Classification Image Classification +3

Meta-SR: A Magnification-Arbitrary Network for Super-Resolution

2 code implementations CVPR 2019 Xuecai Hu, Haoyuan Mu, Xiangyu Zhang, Zilei Wang, Tieniu Tan, Jian Sun

In this work, we propose a novel method called Meta-SR, the first to solve super-resolution of arbitrary scale factors (including non-integer scale factors) with a single model.

Image Super-Resolution

Attacks Meet Interpretability: Attribute-steered Detection of Adversarial Samples

1 code implementation NeurIPS 2018 Guanhong Tao, Shiqing Ma, Yingqi Liu, Xiangyu Zhang

Results show that our technique can achieve 94% detection accuracy for 7 different kinds of attacks with 9.91% false positives on benign inputs.

Face Recognition General Classification

DetNet: Design Backbone for Object Detection

no code implementations ECCV 2018 Zeming Li, Chao Peng, Gang Yu, Xiangyu Zhang, Yangdong Deng, Jian Sun

(1) Recent object detectors like FPN and RetinaNet usually involve extra stages, compared with the task of image classification, to handle objects of various scales.

Classification General Classification +5

CrowdHuman: A Benchmark for Detecting Human in a Crowd

1 code implementation 30 Apr 2018 Shuai Shao, Zijian Zhao, Boxun Li, Tete Xiao, Gang Yu, Xiangyu Zhang, Jian Sun

There are a total of $470K$ human instances from the train and validation subsets, and $\sim 22.6$ persons per image, with various kinds of occlusions in the dataset.

Ranked #5 on Pedestrian Detection on Caltech (using extra training data)

Human Detection Object Detection +1

DetNet: A Backbone network for Object Detection

2 code implementations 17 Apr 2018 Zeming Li, Chao Peng, Gang Yu, Xiangyu Zhang, Yangdong Deng, Jian Sun

Due to the gap between image classification and object detection, we propose DetNet in this paper, a novel backbone network specifically designed for object detection.

Classification General Classification +5

ExFuse: Enhancing Feature Fusion for Semantic Segmentation

no code implementations ECCV 2018 Zhenli Zhang, Xiangyu Zhang, Chao Peng, Dazhi Cheng, Jian Sun

Modern semantic segmentation frameworks usually combine low-level and high-level features from pre-trained backbone convolutional models to boost performance.

Ranked #3 on Semantic Segmentation on PASCAL VOC 2012 val (using extra training data)

Semantic Segmentation

MegDet: A Large Mini-Batch Object Detector

6 code implementations CVPR 2018 Chao Peng, Tete Xiao, Zeming Li, Yuning Jiang, Xiangyu Zhang, Kai Jia, Gang Yu, Jian Sun

The improvements in recent CNN-based object detection works, from R-CNN [11] and Fast/Faster R-CNN [10, 31] to the recent Mask R-CNN [14] and RetinaNet [24], mainly come from new networks, new frameworks, or novel loss designs.

object-detection Object Detection

Light-Head R-CNN: In Defense of Two-Stage Object Detector

5 code implementations 20 Nov 2017 Zeming Li, Chao Peng, Gang Yu, Xiangyu Zhang, Yangdong Deng, Jian Sun

More importantly, simply replacing the backbone with a tiny network (e.g., Xception), our Light-Head R-CNN gets 30.7 mmAP at 102 FPS on COCO, significantly outperforming the single-stage, fast detectors like YOLO and SSD on both speed and accuracy.

Channel Pruning for Accelerating Very Deep Neural Networks

1 code implementation ICCV 2017 Yihui He, Xiangyu Zhang, Jian Sun

In this paper, we introduce a new channel pruning method to accelerate very deep convolutional neural networks. Given a trained CNN model, we propose an iterative two-step algorithm to effectively prune each layer, using LASSO-regression-based channel selection followed by least squares reconstruction.


ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices

29 code implementations CVPR 2018 Xiangyu Zhang, Xinyu Zhou, Mengxiao Lin, Jian Sun

We introduce an extremely computation-efficient CNN architecture named ShuffleNet, which is designed specially for mobile devices with very limited computing power (e.g., 10-150 MFLOPs).

General Classification Image Classification +2

Large Kernel Matters -- Improve Semantic Segmentation by Global Convolutional Network

2 code implementations CVPR 2017 Chao Peng, Xiangyu Zhang, Gang Yu, Guiming Luo, Jian Sun

One of the recent trends [30, 31, 14] in network architecture design is stacking small filters (e.g., 1x1 or 3x3) in the entire network, because stacked small filters are more efficient than a large kernel, given the same computational complexity.

Semantic Segmentation

Identity Mappings in Deep Residual Networks

54 code implementations 16 Mar 2016 Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun

Deep residual networks have emerged as a family of extremely deep architectures showing compelling accuracy and nice convergence behaviors.

Image Classification

Deep Residual Learning for Image Recognition

416 code implementations CVPR 2016 Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun

Deep residual nets are foundations of our submissions to ILSVRC & COCO 2015 competitions, where we also won the 1st places on the tasks of ImageNet detection, ImageNet localization, COCO detection, and COCO segmentation.

Image-to-Image Translation Medical Image Classification +7

Accelerating Very Deep Convolutional Networks for Classification and Detection

no code implementations 26 May 2015 Xiangyu Zhang, Jianhua Zou, Kaiming He, Jian Sun

This paper aims to accelerate the test-time computation of convolutional neural networks (CNNs), especially very deep CNNs that have substantially impacted the computer vision community.

Classification General Classification +3

Object Detection Networks on Convolutional Feature Maps

no code implementations 23 Apr 2015 Shaoqing Ren, Kaiming He, Ross Girshick, Xiangyu Zhang, Jian Sun

We discover that aside from deep feature maps, a deep and convolutional per-region classifier is of particular importance for object detection, whereas latest superior image classification models (such as ResNets and GoogLeNets) do not directly lead to good detection accuracy without using such a per-region classifier.

General Classification Image Classification +2

Efficient and Accurate Approximations of Nonlinear Convolutional Networks

no code implementations CVPR 2015 Xiangyu Zhang, Jianhua Zou, Xiang Ming, Kaiming He, Jian Sun

This paper aims to accelerate the test-time computation of deep convolutional neural networks (CNNs).

Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition

13 code implementations 18 Jun 2014 Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun

This requirement is "artificial" and may reduce the recognition accuracy for the images or sub-images of an arbitrary size/scale.

General Classification Image Classification +3
