Search Results for author: Xiaolin Hu

Found 76 papers, 51 papers with code

Boosting Decision-based Black-box Adversarial Attacks with Random Sign Flip

no code implementations ECCV 2020 Wei-Lun Chen, Zhao-Xiang Zhang, Xiaolin Hu, Baoyuan Wu

Decision-based black-box adversarial attacks (decision-based attack) pose a severe threat to current deep neural networks, as they only need the predicted label of the target model to craft adversarial examples.

TDFNet: An Efficient Audio-Visual Speech Separation Model with Top-down Fusion

1 code implementation 25 Jan 2024 Samuel Pegg, Kai Li, Xiaolin Hu

TDANet serves as the architectural foundation for the auditory and visual networks within TDFNet, offering an efficient model with fewer parameters.

Speech Recognition +1

HEDNet: A Hierarchical Encoder-Decoder Network for 3D Object Detection in Point Clouds

1 code implementation NeurIPS 2023 Gang Zhang, Junnan Chen, Guohuan Gao, Jianmin Li, Xiaolin Hu

To reduce computational costs, these methods resort to submanifold sparse convolutions, which prevent the information exchange among spatially disconnected features.

3D Object Detection Autonomous Driving +2

RTFS-Net: Recurrent Time-Frequency Modelling for Efficient Audio-Visual Speech Separation

1 code implementation 29 Sep 2023 Samuel Pegg, Kai Li, Xiaolin Hu

This is the first time-frequency domain audio-visual speech separation method to outperform all contemporary time-domain counterparts.

Audio-Visual Speech Recognition Speech Recognition +2

IIANet: An Intra- and Inter-Modality Attention Network for Audio-Visual Speech Separation

no code implementations 16 Aug 2023 Kai Li, Runxuan Yang, Fuchun Sun, Xiaolin Hu

Recent research has made significant progress in designing fusion modules for audio-visual speech separation.

Speech Separation

NP-SemiSeg: When Neural Processes meet Semi-Supervised Semantic Segmentation

1 code implementation 5 Aug 2023 JianFeng Wang, Daniela Massiceti, Xiaolin Hu, Vladimir Pavlovic, Thomas Lukasiewicz

This is useful in a wide range of real-world applications where collecting pixel-wise labels is not feasible in time or cost.

Segmentation Self-Driving Cars +3

Physically Realizable Natural-Looking Clothing Textures Evade Person Detectors via 3D Modeling

1 code implementation CVPR 2023 Zhanhao Hu, Wenda Chu, Xiaopei Zhu, Hui Zhang, Bo Zhang, Xiaolin Hu

To craft natural-looking adversarial clothes that can evade person detectors at multiple viewing angles, we propose adversarial camouflage textures (AdvCaT) that resemble camouflage, a texture typical of everyday clothes.

Audio-Visual Speech Separation in Noisy Environments with a Lightweight Iterative Model

1 code implementation 31 May 2023 Héctor Martel, Julius Richter, Kai Li, Xiaolin Hu, Timo Gerkmann

We propose Audio-Visual Lightweight ITerative model (AVLIT), an effective and lightweight neural network that uses Progressive Learning (PL) to perform audio-visual speech separation in noisy environments.

Speech Separation

Amplification trojan network: Attack deep neural networks by amplifying their inherent weakness

1 code implementation 28 May 2023 Zhanhao Hu, Jun Zhu, Bo Zhang, Xiaolin Hu

Recent work has found that deep neural networks (DNNs) can be fooled by adversarial examples, which are crafted by adding adversarial noise to clean inputs.

On the Importance of Backbone to the Adversarial Robustness of Object Detectors

no code implementations 27 May 2023 Xiao Li, Hang Chen, Xiaolin Hu

We argue that using adversarially pre-trained backbone networks is essential for enhancing the adversarial robustness of object detectors.

Adversarial Robustness Autonomous Driving +3

Meta Semantics: Towards better natural language understanding and reasoning

no code implementations 20 Apr 2023 Xiaolin Hu

Natural language understanding is one of the most challenging topics in artificial intelligence.

Natural Language Understanding

MIXPGD: Hybrid Adversarial Training for Speech Recognition Systems

no code implementations 10 Mar 2023 Aminul Huq, Weiyi Zhang, Xiaolin Hu

We merge the capabilities of both supervised and unsupervised approaches in our method to generate new adversarial samples which aid in improving model robustness.

Adversarial Attack Automatic Speech Recognition +2

CEDNet: A Cascade Encoder-Decoder Network for Dense Prediction

2 code implementations 13 Feb 2023 Gang Zhang, Ziyi Li, Chufeng Tang, Jianmin Li, Xiaolin Hu

A hallmark of CEDNet is its ability to incorporate high-level features from early stages to guide low-level feature learning in subsequent stages, thereby enhancing the effectiveness of multi-scale feature fusion.

Instance Segmentation Object Detection +3

NP-Match: Towards a New Probabilistic Model for Semi-Supervised Learning

1 code implementation 31 Jan 2023 JianFeng Wang, Xiaolin Hu, Thomas Lukasiewicz

In this work, we adapt neural processes (NPs) to the semi-supervised image classification task, resulting in a new method named NP-Match.

Classification Semi-Supervised Image Classification

Language-Driven Anchors for Zero-Shot Adversarial Robustness

1 code implementation 30 Jan 2023 Xiao Li, Wei Zhang, Yining Liu, Zhanhao Hu, Bo Zhang, Xiaolin Hu

Previous research has mainly focused on improving adversarial robustness in the fully supervised setting, leaving the challenging domain of zero-shot adversarial robustness an open question.

Adversarial Defense Adversarial Robustness +3

An Audio-Visual Speech Separation Model Inspired by Cortico-Thalamo-Cortical Circuits

2 code implementations 21 Dec 2022 Kai Li, Fenghua Xie, Hang Chen, Kexin Yuan, Xiaolin Hu

Then, inspired by the large number of connections between cortical regions and the thalamus, the model fuses the auditory and visual information in a thalamic subnetwork through top-down connections.

Speech Separation

Extracting Semantic Knowledge from GANs with Unsupervised Learning

no code implementations 30 Nov 2022 Jianjin Xu, Zhaoxiang Zhang, Xiaolin Hu

Second, we train image-to-image translation networks on the synthesized datasets, enabling semantic-conditional image synthesis without human annotations.

Image Segmentation Image-to-Image Translation +2

Drug repositioning for Alzheimer's disease with transfer learning

no code implementations 27 Oct 2022 Yetao Wu, Han Liu, Jie Yan, Xiaolin Hu

After training, the model is used for virtual screening to find potential drugs for Alzheimer's disease (AD) treatment.

Drug Discovery Transfer Learning

An efficient encoder-decoder architecture with top-down attention for speech separation

1 code implementation 30 Sep 2022 Kai Li, Runxuan Yang, Xiaolin Hu

In addition, a large-size version of TDANet obtained SOTA results on three datasets, with MACs still only 10% of Sepformer and the CPU inference time only 24% of Sepformer.

Speech Separation

On the Privacy Effect of Data Enhancement via the Lens of Memorization

1 code implementation 17 Aug 2022 Xiao Li, Qiongxiu Li, Zhanhao Hu, Xiaolin Hu

We demonstrate that the generalization gap and privacy leakage are less correlated than previous results suggested.

Adversarial Robustness Data Augmentation +1

Visual Recognition by Request

1 code implementation CVPR 2023 Chufeng Tang, Lingxi Xie, Xiaopeng Zhang, Xiaolin Hu, Qi Tian

Humans can recognize visual semantics at unlimited granularity, but existing visual recognition algorithms cannot achieve this goal.

Instance Segmentation Semantic Segmentation

Active Pointly-Supervised Instance Segmentation

1 code implementation 23 Jul 2022 Chufeng Tang, Lingxi Xie, Gang Zhang, Xiaopeng Zhang, Qi Tian, Xiaolin Hu

In this paper, we present an economical active learning setting, named active pointly-supervised instance segmentation (APIS), which starts with box-level annotations and iteratively samples a point within the box and asks if it falls on the object.

Active Learning Instance Segmentation +2

NP-Match: When Neural Processes meet Semi-Supervised Learning

1 code implementation 3 Jul 2022 JianFeng Wang, Thomas Lukasiewicz, Daniela Massiceti, Xiaolin Hu, Vladimir Pavlovic, Alexandros Neophytou

Semi-supervised learning (SSL) has been widely explored in recent years, and it is an effective way of leveraging unlabeled data to reduce the reliance on labeled data.

Semi-Supervised Image Classification

Infrared Invisible Clothing: Hiding from Infrared Detectors at Multiple Angles in Real World

no code implementations 12 May 2022 Xiaopei Zhu, Zhanhao Hu, Siyuan Huang, Jianmin Li, Xiaolin Hu

We simulated the process from cloth to clothing in the digital world and then designed the adversarial "QR code" pattern.

An STDP-Based Supervised Learning Algorithm for Spiking Neural Networks

no code implementations 7 Mar 2022 Zhanhao Hu, Tao Wang, Xiaolin Hu

Compared with rate-based artificial neural networks, Spiking Neural Networks (SNNs) provide a more biologically plausible model for the brain.

The Winning Solution to the iFLYTEK Challenge 2021 Cultivated Land Extraction from High-Resolution Remote Sensing Image

1 code implementation 22 Feb 2022 Zhen Zhao, Yuqiu Liu, Gang Zhang, Liang Tang, Xiaolin Hu

This report introduces our solution to the iFLYTEK challenge 2021 cultivated land extraction from high-resolution remote sensing image.

Instance Segmentation Segmentation +1

Infrared Invisible Clothing: Hiding From Infrared Detectors at Multiple Angles in Real World

no code implementations CVPR 2022 Xiaopei Zhu, Zhanhao Hu, Siyuan Huang, Jianmin Li, Xiaolin Hu

We simulated the process from cloth to clothing in the digital world and then designed the adversarial "QR code" pattern.

RSG: A Simple but Effective Module for Learning Imbalanced Datasets

1 code implementation CVPR 2021 JianFeng Wang, Thomas Lukasiewicz, Xiaolin Hu, Jianfei Cai, Zhenghua Xu

Imbalanced datasets are common in practice and are a great challenge for training deep neural models with good generalization on infrequent classes.

Long-tail Learning

Convolutional Neural Networks with Gated Recurrent Connections

1 code implementation 5 Jun 2021 JianFeng Wang, Xiaolin Hu

The critical element of RCNN is the recurrent convolutional layer (RCL), which incorporates recurrent connections between neurons in the standard convolutional layer.

Object Detection +2

Attack on practical speaker verification system using universal adversarial perturbations

1 code implementation 19 May 2021 Weiyi Zhang, Shuning Zhao, Le Liu, Jianmin Li, Xingliang Cheng, Thomas Fang Zheng, Xiaolin Hu

In authentication scenarios, applications of practical speaker verification systems usually require a person to read a dynamic authentication text.

Real-World Adversarial Attack Room Impulse Response (RIR) +3

RefineMask: Towards High-Quality Instance Segmentation with Fine-Grained Features

1 code implementation CVPR 2021 Gang Zhang, Xin Lu, Jingru Tan, Jianmin Li, Zhaoxiang Zhang, Quanquan Li, Xiaolin Hu

In this work, we propose a new method called RefineMask for high-quality instance segmentation of objects and scenes, which incorporates fine-grained features during the instance-wise segmenting process in a multi-stage manner.

Instance Segmentation Semantic Segmentation +1

Rethinking Natural Adversarial Examples for Classification Models

1 code implementation 23 Feb 2021 Xiao Li, Jianmin Li, Ting Dai, Jie Shi, Jun Zhu, Xiaolin Hu

A detection model based on the classification model EfficientNet-B7 achieved a top-1 accuracy of 53.95%, surpassing previous state-of-the-art classification models trained on ImageNet, suggesting that accurate localization information can significantly boost the performance of classification models on ImageNet-A.

Classification General Classification +2

The MSR-Video to Text Dataset with Clean Annotations

1 code implementation 12 Feb 2021 Haoran Chen, Jianmin Li, Simone Frintrop, Xiaolin Hu

We cleaned the MSR-VTT annotations by removing these problems, then tested several typical video captioning models on the cleaned dataset.

Sentence Video Captioning

Frame Difference-Based Temporal Loss for Video Stylization

2 code implementations 11 Feb 2021 Jianjin Xu, Zheyang Xiong, Xiaolin Hu

To ensure temporal consistency between the frames of the stylized video, a common approach is to estimate the optical flow of the pixels in the original video and make the generated pixels match the estimated optical flow.

Optical Flow Estimation Style Transfer

Fooling thermal infrared pedestrian detectors in real world using small bulbs

no code implementations 20 Jan 2021 Xiaopei Zhu, Xiao Li, Jianmin Li, Zheyao Wang, Xiaolin Hu

By using a combination method, we successfully hide from the visible light and infrared object detection systems at the same time.

Autonomous Driving Object Detection +1

DAM: Discrepancy Alignment Metric for Face Recognition

no code implementations ICCV 2021 Jiaheng Liu, Yudong Wu, Yichao Wu, Chuming Li, Xiaolin Hu, Ding Liang, Mengyu Wang

To estimate the LID of each face image in the verification process, we propose two types of LID Estimation (LIDE) methods, which are reference-based and learning-based estimation methods, respectively.

Face Recognition

Generating Adjacency-Constrained Subgoals in Hierarchical Reinforcement Learning

1 code implementation NeurIPS 2020 Tianren Zhang, Shangqi Guo, Tian Tan, Xiaolin Hu, Feng Chen

In this paper, we show that this problem can be effectively alleviated by restricting the high-level action space from the whole goal space to a $k$-step adjacent region of the current state using an adjacency constraint.

Continuous Control Hierarchical Reinforcement Learning +2

Generalized Focal Loss: Learning Qualified and Distributed Bounding Boxes for Dense Object Detection

7 code implementations NeurIPS 2020 Xiang Li, Wenhai Wang, Lijun Wu, Shuo Chen, Xiaolin Hu, Jun Li, Jinhui Tang, Jian Yang

Specifically, we merge the quality estimation into the class prediction vector to form a joint representation of localization quality and classification, and use a vector to represent arbitrary distribution of box locations.

Dense Object Detection General Classification

End-to-End Face Parsing via Interlinked Convolutional Neural Networks

1 code implementation 12 Feb 2020 Zi Yin, Valentin Yiu, Xiaolin Hu, Liang Tang

Face parsing is an important computer vision task that requires accurate pixel segmentation of facial parts (such as eyes, nose, mouth, etc.).

Face Parsing

6D Object Pose Regression via Supervised Learning on Point Clouds

1 code implementation 24 Jan 2020 Ge Gao, Mikko Lauri, Yulong Wang, Xiaolin Hu, Jianwei Zhang, Simone Frintrop

We use depth information represented by point clouds as the input to both deep networks and geometry-based pose refinement and use separate networks for rotation and translation regression.

Object regression +1

Delving Deeper into the Decoder for Video Captioning

1 code implementation 16 Jan 2020 Haoran Chen, Jianmin Li, Xiaolin Hu

Video captioning is an advanced multi-modal task which aims to describe a video clip using a natural language sentence.

Sentence Video Captioning +1

Interpretable Disentanglement of Neural Networks by Extracting Class-Specific Subnetwork

no code implementations 7 Oct 2019 Yulong Wang, Xiaolin Hu, Hang Su

We also apply extracted subnetworks in visual explanation and adversarial example detection tasks by merely replacing the original full model with class-specific subnetworks.

Disentanglement

Pruning from Scratch

1 code implementation 27 Sep 2019 Yulong Wang, Xiaolu Zhang, Lingxi Xie, Jun Zhou, Hang Su, Bo Zhang, Xiaolin Hu

Network pruning is an important research field aiming at reducing computational costs of neural networks.

Network Pruning

A Semantics-Assisted Video Captioning Model Trained with Scheduled Sampling

2 code implementations 31 Aug 2019 Haoran Chen, Ke Lin, Alexander Maye, Jianming Li, Xiaolin Hu

Given the features of a video, recurrent neural networks can be used to automatically generate a caption for the video.

Sentence Video Captioning

Spatial Group-wise Enhance: Improving Semantic Feature Learning in Convolutional Networks

3 code implementations 23 May 2019 Xiang Li, Xiaolin Hu, Jian Yang

Convolutional Neural Networks (CNNs) generate the feature representation of complex objects by collecting hierarchical and different parts of semantic sub-features.

Image Classification Object Detection

Knowledge Distillation via Route Constrained Optimization

1 code implementation ICCV 2019 Xiao Jin, Baoyun Peng, Yi-Chao Wu, Yu Liu, Jiaheng Liu, Ding Liang, Xiaolin Hu

However, we find that the representation of a converged heavy model is still a strong constraint for training a small student model, which leads to a high lower bound of congruence loss.

Face Recognition Knowledge Distillation

Selective Kernel Networks

20 code implementations CVPR 2019 Xiang Li, Wenhai Wang, Xiaolin Hu, Jian Yang

A building block called Selective Kernel (SK) unit is designed, in which multiple branches with different kernel sizes are fused using softmax attention that is guided by the information in these branches.

Ranked #98 on Image Classification on CIFAR-100 (using extra training data)

Image Classification

Dynamic Multi-path Neural Network

no code implementations 28 Feb 2019 Yingcheng Su, Shunfeng Zhou, Yi-Chao Wu, Tian Su, Ding Liang, Jiaheng Liu, Dixin Zheng, Yingxu Wang, Junjie Yan, Xiaolin Hu

Although deeper and larger neural networks have achieved better performance, the complex network structure and increasing computational cost cannot meet the demands of many resource-constrained applications.

Interlinked Convolutional Neural Networks for Face Parsing

no code implementations 7 Jun 2018 Yisu Zhou, Xiaolin Hu, Bo Zhang

It amounts to labeling each pixel with appropriate facial parts such as eyes and nose.

Face Parsing

Interpret Neural Networks by Identifying Critical Data Routing Paths

no code implementations CVPR 2018 Yulong Wang, Hang Su, Bo Zhang, Xiaolin Hu

Interpretability of a deep neural network aims to explain the rationale behind its decisions and enable users to understand the intelligent agents, which has become an important issue in practical applications.

High Performance Visual Tracking With Siamese Region Proposal Network

5 code implementations CVPR 2018 Bo Li, Junjie Yan, Wei Wu, Zheng Zhu, Xiaolin Hu

Visual object tracking has been a fundamental topic in recent years and many deep learning based trackers have achieved state-of-the-art performance on multiple benchmarks.

Region Proposal Visual Object Tracking +2

Adversarial Attacks and Defences Competition

1 code implementation 31 Mar 2018 Alexey Kurakin, Ian Goodfellow, Samy Bengio, Yinpeng Dong, Fangzhou Liao, Ming Liang, Tianyu Pang, Jun Zhu, Xiaolin Hu, Cihang Xie, Jian-Yu Wang, Zhishuai Zhang, Zhou Ren, Alan Yuille, Sangxia Huang, Yao Zhao, Yuzhe Zhao, Zhonglin Han, Junjiajia Long, Yerkebulan Berdibekov, Takuya Akiba, Seiya Tokui, Motoki Abe

To accelerate research on adversarial examples and robustness of machine learning classifiers, Google Brain organized a NIPS 2017 competition that encouraged researchers to develop new methods to generate adversarial examples as well as to develop new ways to defend against them.

BIG-bench Machine Learning

Understanding the Disharmony between Dropout and Batch Normalization by Variance Shift

4 code implementations CVPR 2019 Xiang Li, Shuo Chen, Xiaolin Hu, Jian Yang

Theoretically, we find that Dropout shifts the variance of a specific neural unit when the network is switched from training to test mode.

A Hierarchical Recurrent Neural Network for Symbolic Melody Generation

2 code implementations 14 Dec 2017 Jian Wu, Changran Hu, Yulong Wang, Xiaolin Hu, Jun Zhu

In this paper, we present a hierarchical recurrent neural network for melody generation, which consists of three Long Short-Term Memory (LSTM) subnetworks working in a coarse-to-fine manner along time.

Sound Multimedia

Gated Recurrent Convolution Neural Network for OCR

1 code implementation NeurIPS 2017 Jianfeng Wang, Xiaolin Hu

Its critical component, Gated Recurrent Convolution Layer (GRCL), is constructed by adding a gate to the Recurrent Convolution Layer (RCL), the critical component of RCNN.

General Classification Image Classification +2

Boosting Adversarial Attacks with Momentum

7 code implementations CVPR 2018 Yinpeng Dong, Fangzhou Liao, Tianyu Pang, Hang Su, Jun Zhu, Xiaolin Hu, Jianguo Li

To further improve the success rates for black-box attacks, we apply momentum iterative algorithms to an ensemble of models, and show that the adversarially trained models with a strong defense ability are also vulnerable to our black-box attacks.

Adversarial Attack

Estimation of the volume of the left ventricle from MRI images using deep neural networks

1 code implementation 13 Feb 2017 Fangzhou Liao, Xi Chen, Xiaolin Hu, Sen Song

In 2016, Kaggle organized a competition to estimate the volume of LV from MRI images.

UnrealStereo: Controlling Hazardous Factors to Analyze Stereo Vision

no code implementations 14 Dec 2016 Yi Zhang, Weichao Qiu, Qi Chen, Xiaolin Hu, Alan Yuille

We generate a large synthetic image dataset with automatically computed hazardous regions and analyze algorithms on these regions.

Image Generation

Joint Training of Cascaded CNN for Face Detection

no code implementations CVPR 2016 Hongwei Qin, Junjie Yan, Xiu Li, Xiaolin Hu

Cascades have been widely used in face detection, where classifiers with low computational cost are first used to reject most of the background while maintaining recall.

Face Detection Region Proposal

Convolutional Neural Networks with Intra-Layer Recurrent Connections for Scene Labeling

no code implementations NeurIPS 2015 Ming Liang, Xiaolin Hu, Bo Zhang

We adopt a deep recurrent convolutional neural network (RCNN) for this task, which was originally proposed for object recognition.

Object Recognition Scene Labeling

Recurrent Convolutional Neural Network for Object Recognition

no code implementations CVPR 2015 Ming Liang, Xiaolin Hu

Inspired by this fact, we propose a recurrent CNN (RCNN) for object recognition by incorporating recurrent connections into each convolutional layer.

Object Object Recognition

A Reverse Hierarchy Model for Predicting Eye Fixations

no code implementations CVPR 2014 Tianlin Shi, Liang Ming, Xiaolin Hu

A body of psychological and physiological evidence suggests that early visual attention works in a coarse-to-fine way, which lays a basis for the reverse hierarchy theory (RHT).

Image Super-Resolution Saliency Detection
