Search Results for author: Xiao Tan

Found 29 papers, 16 papers with code

A Simple Structure For Building A Robust Model

1 code implementation25 Apr 2022 Xiao Tan, Jingbo Gao, Ruolin Li

As deep learning applications, especially programs of computer vision, are increasingly deployed in our lives, we have to think more urgently about the security of these applications. One effective way to improve the security of deep learning models is to perform adversarial training, which allows the model to be compatible with samples that are deliberately created for use in attacking the model. Based on this, we propose a simple architecture to build a model with a certain degree of robustness, which improves the robustness of the trained network by adding an adversarial sample detection network for cooperative training. At the same time, we design a new data sampling strategy that incorporates multiple existing attacks, allowing the model to adapt to many different adversarial attacks with a single training. We conducted some experiments to test the effectiveness of this design based on Cifar10 dataset, and the results indicate that it has some degree of positive effect on the robustness of the model. Our code could be found at https://github. com/dowdyboy/simple_structure_for_robust_model.

GitNet: Geometric Prior-based Transformation for Birds-Eye-View Segmentation

no code implementations16 Apr 2022 Shi Gong, Xiaoqing Ye, Xiao Tan, Jingdong Wang, Errui Ding, Yu Zhou, Xiang Bai

Birds-eye-view (BEV) semantic segmentation is critical for autonomous driving for its powerful spatial representation ability.

Autonomous Driving Semantic Segmentation

Reusing the Task-specific Classifier as a Discriminator: Discriminator-free Adversarial Domain Adaptation

1 code implementation8 Apr 2022 Lin Chen, Huaian Chen, Zhixiang Wei, Xin Jin, Xiao Tan, Yi Jin, Enhong Chen

Such NWD can be coupled with the classifier to serve as a discriminator satisfying the K-Lipschitz constraint without the requirements of additional weight clipping or gradient penalty strategy.

Unsupervised Domain Adaptation

Rope3D: TheRoadside Perception Dataset for Autonomous Driving and Monocular 3D Object Detection Task

no code implementations25 Mar 2022 Xiaoqing Ye, Mao Shu, Hanyu Li, Yifeng Shi, YingYing Li, Guangjie Wang, Xiao Tan, Errui Ding

On the other hand, the data captured from roadside cameras have strengths over frontal-view data, which is believed to facilitate a safer and more intelligent autonomous driving system.

Autonomous Driving Monocular 3D Object Detection

SGM3D: Stereo Guided Monocular 3D Object Detection

1 code implementation3 Dec 2021 Zheyuan Zhou, Liang Du, Xiaoqing Ye, Zhikang Zou, Xiao Tan, Li Zhang, xiangyang xue, Jianfeng Feng

Monocular 3D object detection aims to predict the object location, dimension and orientation in 3D space alongside the object category given only a monocular image.

Autonomous Driving Depth Estimation +2

Weakly-Supervised Spatio-Temporal Anomaly Detection in Surveillance Video

no code implementations9 Aug 2021 Jie Wu, Wei zhang, Guanbin Li, Wenhao Wu, Xiao Tan, YingYing Li, Errui Ding, Liang Lin

In this paper, we introduce a novel task, referred to as Weakly-Supervised Spatio-Temporal Anomaly Detection (WSSTAD) in surveillance video.

Anomaly Detection

Good Practices and A Strong Baseline for Traffic Anomaly Detection

1 code implementation9 May 2021 Yuxiang Zhao, Wenhao Wu, Yue He, YingYing Li, Xiao Tan, Shifeng Chen

In this paper, we propose a straightforward and efficient framework that includes pre-processing, a dynamic track module, and post-processing.

Anomaly Detection Video Stabilization

On the Undesired Equilibria Induced by Control Barrier Function Based Quadratic Programs

no code implementations30 Apr 2021 Xiao Tan, Dimos V. Dimarogonas

We then provide analytical results on how a certain parameter in the original quadratic program should be chosen to remove the undesired equilibrium points or to confine them in a small neighborhood of the origin.

High-order Barrier Functions: Robustness, Safety and Performance-Critical Control

no code implementations31 Mar 2021 Xiao Tan, Wenceslao Shaw Cortez, Dimos V. Dimarogonas

Furthermore, the proposed formulation accounts for "performance-critical" control: it guarantees that a subset of the forward invariant set will admit any existing, bounded control law, while still ensuring forward invariance of the set.

Revealing the Reciprocal Relations Between Self-Supervised Stereo and Monocular Depth Estimation

no code implementations ICCV 2021 Zhi Chen, Xiaoqing Ye, Wei Yang, Zhenbo Xu, Xiao Tan, Zhikang Zou, Errui Ding, Xinming Zhang, Liusheng Huang

Second, we introduce an occlusion-aware distillation (OA Distillation) module, which leverages the predicted depths from StereoNet in non-occluded regions to train our monocular depth estimation network named SingleNet.

Monocular Depth Estimation Stereo Matching

Understanding Image Retrieval Re-Ranking: A Graph Neural Network Perspective

1 code implementation14 Dec 2020 Xuanmeng Zhang, Minyue Jiang, Zhedong Zheng, Xiao Tan, Errui Ding, Yi Yang

We argue that the first phase equals building the k-nearest neighbor graph, while the second phase can be viewed as spreading the message within the graph.

Image Retrieval Re-Ranking

Coherent Loss: A Generic Framework for Stable Video Segmentation

no code implementations25 Oct 2020 Mingyang Qian, Yi Fu, Xiao Tan, YingYing Li, Jinqing Qi, Huchuan Lu, Shilei Wen, Errui Ding

Video segmentation approaches are of great importance for numerous vision tasks especially in video manipulation for entertainment.

Frame Semantic Segmentation +2

Discriminative Sounding Objects Localization via Self-supervised Audiovisual Matching

1 code implementation NeurIPS 2020 Di Hu, Rui Qian, Minyue Jiang, Xiao Tan, Shilei Wen, Errui Ding, Weiyao Lin, Dejing Dou

First, we propose to learn robust object representations by aggregating the candidate sound localization results in the single source scenes.

Object Localization

Face Sketch Synthesis with Style Transfer using Pyramid Column Feature

1 code implementation18 Sep 2020 Chaofeng Chen, Xiao Tan, Kwan-Yee K. Wong

We utilize a fully convolutional neural network (FCNN) to create the content image, and propose a style transfer approach to introduce textures and shadings based on a newly proposed pyramid column feature.

Face Sketch Synthesis Style Transfer

PointTrack++ for Effective Online Multi-Object Tracking and Segmentation

1 code implementation3 Jul 2020 Zhenbo Xu, Wei zhang, Xiao Tan, Wei Yang, Xiangbo Su, Yuchen Yuan, Hongwu Zhang, Shilei Wen, Errui Ding, Liusheng Huang

In this work, we present PointTrack++, an effective on-line framework for MOTS, which remarkably extends our recently proposed PointTrack framework.

Data Augmentation Instance Segmentation +5

Segment as Points for Efficient Online Multi-Object Tracking and Segmentation

1 code implementation ECCV 2020 Zhenbo Xu, Wei zhang, Xiao Tan, Wei Yang, Huan Huang, Shilei Wen, Errui Ding, Liusheng Huang

The resulting online MOTS framework, named PointTrack, surpasses all the state-of-the-art methods including 3D tracking methods by large margins (5. 4% higher MOTSA and 18 times faster over MOTSFusion) with the near real-time speed (22 FPS).

Multi-Object Tracking Multi-Object Tracking and Segmentation +1

ZoomNet: Part-Aware Adaptive Zooming Neural Network for 3D Object Detection

1 code implementation1 Mar 2020 Zhenbo Xu, Wei zhang, Xiaoqing Ye, Xiao Tan, Wei Yang, Shilei Wen, Errui Ding, Ajin Meng, Liusheng Huang

The pipeline of ZoomNet begins with an ordinary 2D object detection model which is used to obtain pairs of left-right bounding boxes.

2D object detection 3D Object Detection +2

Dynamic Inference: A New Approach Toward Efficient Video Action Recognition

no code implementations9 Feb 2020 Wenhao Wu, Dongliang He, Xiao Tan, Shifeng Chen, Yi Yang, Shilei Wen

In a nutshell, we treat input frames and network depth of the computational graph as a 2-dimensional grid, and several checkpoints are placed on this grid in advance with a prediction module.

Action Recognition Action Recognition In Videos +1

Perspective-Guided Convolution Networks for Crowd Counting

1 code implementation ICCV 2019 Zhaoyi Yan, Yuchen Yuan, WangMeng Zuo, Xiao Tan, Yezhen Wang, Shilei Wen, Errui Ding

In this paper, we propose a novel perspective-guided convolution (PGC) for convolutional neural network (CNN) based crowd counting (i. e. PGCNet), which aims to overcome the dramatic intra-scene scale variations of people due to the perspective effect.

Crowd Counting

Recognizing Part Attributes with Insufficient Data

1 code implementation ICCV 2019 Xiangyun Zhao, Yi Yang, Feng Zhou, Xiao Tan, Yuchen Yuan, Yingze Bao, Ying Wu

Although great progress has been made to apply object-level recognition, recognizing the attributes of parts remains less applicable since the training data for part attributes recognition is usually scarce especially for internet-scale applications.

Semi-Supervised Learning for Face Sketch Synthesis in the Wild

1 code implementation12 Dec 2018 Chaofeng Chen, Wei Liu, Xiao Tan, Kwan-Yee K. Wong

Instead of supervising the network with ground truth sketches, we first perform patch matching in feature space between the input photo and photos in a small reference set of photo-sketch pairs.

Face Sketch Synthesis Patch Matching

Fine-grained Video Categorization with Redundancy Reduction Attention

no code implementations ECCV 2018 Chen Zhu, Xiao Tan, Feng Zhou, Xiao Liu, Kaiyu Yue, Errui Ding, Yi Ma

Specifically, it firstly summarizes the video by weight-summing all feature vectors in the feature maps of selected frames with a spatio-temporal soft attention, and then predicts which channels to suppress or to enhance according to this summary with a learned non-linear transform.

Video Classification

Improving Annotation for 3D Pose Dataset of Fine-Grained Object Categories

2 code implementations19 Oct 2018 Yaming Wang, Xiao Tan, Yi Yang, Ziyu Li, Xiao Liu, Feng Zhou, Larry S. Davis

Existing 3D pose datasets of object categories are limited to generic object types and lack of fine-grained information.

3D Pose Estimation Object Recognition

Multipoint Filtering with Local Polynomial Approximation and Range Guidance

no code implementations CVPR 2014 Xiao Tan, Changming Sun, Tuan D. Pham

By using the hybrid of the local polynomial model and color/intensity based range guidance, the proposed method not only preserves edges but also does a much better job in preserving spatial variation than existing popular filtering methods.

Depth Image Upsampling Image Denoising

Cannot find the paper you are looking for? You can Submit a new open access paper.