Boosting Deep Neural Network Efficiency with Dual-Module Inference

no code implementations ICML 2020 Liu Liu, Lei Deng, Zhaodong Chen, Yuke Wang, Shuangchen Li, Jingwei Zhang, Yihua Yang, Zhenyu Gu, Yufei Ding, Yuan Xie

Using Deep Neural Networks (DNNs) in machine learning tasks promises high-quality results, but meeting stringent latency requirements and energy constraints is challenging because of the memory-bound and compute-bound execution patterns of DNNs.

LatticeNet: Towards Lightweight Image Super-resolution with Lattice Block

no code implementations ECCV 2020 Xiaotong Luo, Yuan Xie, Yulun Zhang, Yanyun Qu, Cuihua Li, Yun Fu

Drawing lessons from the lattice filter bank, we design the lattice block (LB), in which two butterfly structures are applied to combine two residual blocks (RBs).

Image Super-Resolution

Understanding GNN Computational Graph: A Coordinated Computation, IO, and Memory Perspective

no code implementations 18 Oct 2021 Hengrui Zhang, Zhongming Yu, Guohao Dai, Guyue Huang, Yufei Ding, Yuan Xie, Yu Wang

The same data are propagated through the graph structure to perform the same neural operation multiple times in GNNs, leading to redundant computation which accounts for 92.4% of total operators.

Program-to-Circuit: Exploiting GNNs for Program Representation and Circuit Translation

no code implementations 13 Sep 2021 Nan Wu, Huake He, Yuan Xie, Pan Li, Cong Hao

Pioneering in this direction, we expect more GNN endeavors to revolutionize this high-demand Program-to-Circuit problem and to enrich the expressiveness of GNNs on programs.

Transfer Learning Translation

H2Learn: High-Efficiency Learning Accelerator for High-Accuracy Spiking Neural Networks

no code implementations 25 Jul 2021 Ling Liang, Zheng Qu, Zhaodong Chen, Fengbin Tu, Yujie Wu, Lei Deng, Guoqi Li, Peng Li, Yuan Xie

Although spiking neural networks (SNNs) benefit from bio-plausible neural modeling, their low accuracy under common local synaptic plasticity learning rules limits their application in many practical tasks.

Dual Reweighting Domain Generalization for Face Presentation Attack Detection

no code implementations 30 Jun 2021 Shubao Liu, Ke-Yue Zhang, Taiping Yao, Kekai Sheng, Shouhong Ding, Ying Tai, Jilin Li, Yuan Xie, Lizhuang Ma

Face anti-spoofing approaches based on domain generalization (DG) have drawn growing attention due to their robustness for unseen scenarios.

Domain Generalization Face Anti-Spoofing +1

Novelty Detection via Contrastive Learning with Negative Data Augmentation

no code implementations 18 Jun 2021 Chengwei Chen, Yuan Xie, Shaohui Lin, Ruizhi Qiao, Jian Zhou, Xin Tan, Yi Zhang, Lizhuang Ma

Moreover, our model is more stable for training in a non-adversarial manner, compared to other adversarial based novelty detection methods.

Contrastive Learning Data Augmentation +2

Towards Compact Single Image Super-Resolution via Contrastive Self-distillation

1 code implementation 25 May 2021 Yanbo Wang, Shaohui Lin, Yanyun Qu, Haiyan Wu, Zhizhong Zhang, Yuan Xie, Angela Yao

Convolutional neural networks (CNNs) are highly successful for super-resolution (SR) but often require sophisticated architectures with heavy memory cost and computational overhead, which significantly restricts their practical deployment on resource-limited devices.

Image Super-Resolution SSIM +1

Discrete-continuous Action Space Policy Gradient-based Attention for Image-Text Matching

no code implementations CVPR 2021 ShiYang Yan, Li Yu, Yuan Xie

We propose a novel attention scheme which projects the image and text embedding into a common space and optimises the attention weights directly towards the evaluation metrics.

Text Matching

Contrastive Learning for Compact Single Image Dehazing

2 code implementations CVPR 2021 Haiyan Wu, Yanyun Qu, Shaohui Lin, Jian Zhou, Ruizhi Qiao, Zhizhong Zhang, Yuan Xie, Lizhuang Ma

In this paper, we propose a novel contrastive regularization (CR) built upon contrastive learning to exploit both the information of hazy images and clear images as negative and positive samples, respectively.

Contrastive Learning Image Dehazing +1

Farewell to Mutual Information: Variational Distillation for Cross-Modal Person Re-Identification

1 code implementation CVPR 2021 Xudong Tian, Zhizhong Zhang, Shaohui Lin, Yanyun Qu, Yuan Xie, Lizhuang Ma

The Information Bottleneck (IB) provides an information-theoretic principle for representation learning by retaining all information relevant for predicting the label while minimizing redundancy.

Cross-Modality Person Re-identification Cross-Modal Person Re-Identification +3

Roles of the Narrow Electronic Band near the Fermi Level in 1$T$-TaS$_2$-Related Layered Materials

no code implementations 11 Mar 2021 Chenhaoping Wen, Jingjing Gao, Yuan Xie, Qing Zhang, Pengfei Kong, Jinghui Wang, Yilan Jiang, Xuan Luo, Jun Li, Wenjian Lu, Yu-Ping Sun, Shichao Yan

4$H_{\rm b}$-TaS$_2$ is a superconducting compound with alternating 1$T$-TaS$_2$ and 1$H$-TaS$_2$ layers, where the 1$H$-TaS$_2$ layer has a weak charge density wave (CDW) pattern and reduces the CDW coupling between the adjacent 1$T$-TaS$_2$ layers.

Mesoscale and Nanoscale Physics Materials Science

MPU: Towards Bandwidth-abundant SIMT Processor via Near-bank Computing

no code implementations 11 Mar 2021 Xinfeng Xie, Peng Gu, Yufei Ding, Dimin Niu, Hongzhong Zheng, Yuan Xie

For general purpose scenarios, lightweight hardware designs for diverse data paths, architectural supports for the SIMT programming model, and end-to-end software optimizations remain challenging.

Hardware Architecture

A Case for 3D Integrated System Design for Neuromorphic Computing & AI Applications

no code implementations 2 Mar 2021 Eren Kurshan, Hai Li, Mingoo Seok, Yuan Xie

Over the last decade, artificial intelligence has found many application areas in society.

A Survey of Machine Learning for Computer Architecture and Systems

no code implementations 16 Feb 2021 Nan Wu, Yuan Xie

Computer architecture and systems have long been optimized to enable efficient execution of machine learning (ML) algorithms or models.

IronMan: GNN-assisted Design Space Exploration in High-Level Synthesis via Reinforcement Learning

no code implementations 16 Feb 2021 Nan Wu, Yuan Xie, Cong Hao

Despite the great success of High-Level Synthesis (HLS) tools, we observe several unresolved challenges: 1) the high-level abstraction of programming styles in HLS sometimes conceals optimization opportunities; 2) existing HLS tools do not provide flexible trade-off (Pareto) solutions among different objectives and constraints; 3) the actual quality of the resulting RTL designs is hard to predict.

Boundary-Aware Geometric Encoding for Semantic Segmentation of Point Clouds

no code implementations 7 Jan 2021 Jingyu Gong, Jiachen Xu, Xin Tan, Jie Zhou, Yanyun Qu, Yuan Xie, Lizhuang Ma

Boundary information plays a significant role in 2D image segmentation, while usually being ignored in 3D point cloud segmentation where ambiguous features might be generated in feature extraction, leading to misclassification in the transition area between two objects.

Point Cloud Segmentation Semantic Segmentation

Redefining Self-Normalization Property

no code implementations 1 Jan 2021 Zhaodong Chen, Zhao WeiQin, Lei Deng, Guoqi Li, Yuan Xie

Moreover, analysis on the activation's mean in the forward pass reveals that the self-normalization property gets weaker with larger fan-in of each layer, which explains the performance degradation on large benchmarks like ImageNet.

Data Augmentation

Perturbed Self-Distillation: Weakly Supervised Large-Scale Point Cloud Semantic Segmentation

no code implementations ICCV 2021 Yachao Zhang, Yanyun Qu, Yuan Xie, Zonghao Li, Shanshan Zheng, Cuihua Li

In this way, the graph topology of the whole point cloud can be effectively established by the introduced auxiliary supervision, such that the information propagation between the labeled and unlabeled points will be realized.

Self-Supervised Learning Semantic Segmentation

AU-Expression Knowledge Constrained Representation Learning for Facial Expression Recognition

1 code implementation 29 Dec 2020 Tao Pu, Tianshui Chen, Yuan Xie, Hefeng Wu, Liang Lin

In this work, we explore the correlations among the action units and facial expressions, and devise an AU-Expression Knowledge Constrained Representation Learning (AUE-CRL) framework to learn the AU representations without AU annotations and adaptively use representations to facilitate facial expression recognition.

Facial Expression Recognition Representation Learning

Training and Inference for Integer-Based Semantic Segmentation Network

no code implementations 30 Nov 2020 Jiayi Yang, Lei Deng, Yukuan Yang, Yuan Xie, Guoqi Li

However, neural network quantization can be used to reduce computation load while maintaining comparable accuracy and original network structure.

Quantization Semantic Segmentation

Rubik: A Hierarchical Architecture for Efficient Graph Learning

no code implementations 26 Sep 2020 Xiaobing Chen, Yuke Wang, Xinfeng Xie, Xing Hu, Abanti Basak, Ling Liang, Mingyu Yan, Lei Deng, Yufei Ding, Zidong Du, Yunji Chen, Yuan Xie

Graph convolutional network (GCN) emerges as a promising direction to learn the inductive representation in graph data commonly used in widespread applications, such as E-commerce, social networks, and knowledge graphs.

Hardware Architecture

Adversarial Graph Representation Adaptation for Cross-Domain Facial Expression Recognition

1 code implementation 3 Aug 2020 Yuan Xie, Tianshui Chen, Tao Pu, Hefeng Wu, Liang Lin

However, most of these works focus on holistic feature adaptation, and they ignore local features that are more transferable across different datasets.

Facial Expression Recognition

Cross-Domain Facial Expression Recognition: A Unified Evaluation Benchmark and Adversarial Graph Learning

1 code implementation 3 Aug 2020 Tianshui Chen, Tao Pu, Hefeng Wu, Yuan Xie, Lingbo Liu, Liang Lin

Although each claims to achieve superior performance, fair comparisons are lacking due to the inconsistent choices of the source/target datasets and feature extractors.

Domain Adaptation Facial Expression Recognition +2

Brain Tumor Anomaly Detection via Latent Regularized Adversarial Network

no code implementations 9 Jul 2020 Nan Wang, Chengwei Chen, Yuan Xie, Lizhuang Ma

The brain structure in the collected data is complicated; hence, doctors must spend considerable effort when diagnosing brain abnormalities.

Anomaly Detection Semi-supervised Anomaly Detection

GNNAdvisor: An Adaptive and Efficient Runtime System for GNN Acceleration on GPUs

1 code implementation 11 Jun 2020 Yuke Wang, Boyuan Feng, Gushu Li, Shuangchen Li, Lei Deng, Yuan Xie, Yufei Ding

As an emerging trend in graph-based deep learning, Graph Neural Networks (GNNs) excel in their capability to generate high-quality node feature vectors (embeddings).

Distributed, Parallel, and Cluster Computing

SmartExchange: Trading Higher-cost Memory Storage/Access for Lower-cost Computation

no code implementations 7 May 2020 Yang Zhao, Xiaohan Chen, Yue Wang, Chaojian Li, Haoran You, Yonggan Fu, Yuan Xie, Zhangyang Wang, Yingyan Lin

We present SmartExchange, an algorithm-hardware co-design framework to trade higher-cost memory storage/access for lower-cost computation, for energy-efficient inference of deep neural networks (DNNs).

Model Compression Quantization

TIMELY: Pushing Data Movements and Interfaces in PIM Accelerators Towards Local and in Time Domain

no code implementations 3 May 2020 Weitao Li, Pengfei Xu, Yang Zhao, Haitong Li, Yuan Xie, Yingyan Lin

Resistive-random-access-memory (ReRAM) based processing-in-memory (R$^2$PIM) accelerators show promise in bridging the gap between Internet of Things devices' constrained resources and Convolutional/Deep Neural Networks' (CNNs/DNNs') prohibitive energy cost.

Comparing SNNs and RNNs on Neuromorphic Vision Datasets: Similarities and Differences

1 code implementation 2 May 2020 Weihua He, Yujie Wu, Lei Deng, Guoqi Li, Haoyu Wang, Yang Tian, Wei Ding, Wenhui Wang, Yuan Xie

Neuromorphic data, recording frameless spike events, have attracted considerable attention for the spatiotemporal information components and the event-driven processing fashion.


Computation on Sparse Neural Networks: an Inspiration for Future Hardware

no code implementations 24 Apr 2020 Fei Sun, Minghai Qin, Tianyun Zhang, Liu Liu, Yen-Kuang Chen, Yuan Xie

We show that for practically complicated problems, it is more beneficial to search large and sparse models in the weight dominated region.

Meta Segmentation Network for Ultra-Resolution Medical Images

no code implementations 19 Feb 2020 Tong Wu, Yuan Xie, Yanyun Qu, Bicheng Dai, Shuxin Chen

MSN can quickly generate the weights of fusion layers through a simple meta-learner, requiring only a few training samples and epochs to converge.

Meta-Learning Semantic Segmentation

Anomaly Detection by One Class Latent Regularized Networks

no code implementations 5 Feb 2020 Chengwei Chen, Pan Chen, Haichuan Song, Yiqing Tao, Yuan Xie, Shouhong Ding, Lizhuang Ma

Anomaly detection is a fundamental problem in computer vision with many real-world applications.

Anomaly Detection

Novelty Detection via Non-Adversarial Generative Network

no code implementations 3 Feb 2020 Chengwei Chen, Wang Yuan, Yuan Xie, Yanyun Qu, Yiqing Tao, Haichuan Song, Lizhuang Ma

One-class novelty detection is the process of determining if a query example differs from the training examples (the target class).

Image Reconstruction

SceneEncoder: Scene-Aware Semantic Segmentation of Point Clouds with A Learnable Scene Descriptor

1 code implementation 24 Jan 2020 Jiachen Xu, Jingyu Gong, Jie Zhou, Xin Tan, Yuan Xie, Lizhuang Ma

Besides local features, global information plays an essential role in semantic segmentation, while recent works usually fail to explicitly extract the meaningful global information and make full use of it.

Semantic Segmentation

Memristor Hardware-Friendly Reinforcement Learning

no code implementations 20 Jan 2020 Nan Wu, Adrien Vincent, Dmitri Strukov, Yuan Xie

Namely, neuromorphic architectures that leverage memristors, the programmable and nonvolatile two-terminal devices, as synaptic weights in hardware neural networks, are candidates of choice to realize such highly energy-efficient and complex nervous systems.

HyGCN: A GCN Accelerator with Hybrid Architecture

no code implementations 7 Jan 2020 Mingyu Yan, Lei Deng, Xing Hu, Ling Liang, Yujing Feng, Xiaochun Ye, Zhimin Zhang, Dongrui Fan, Yuan Xie

In this work, we first characterize the hybrid execution patterns of GCNs on an Intel Xeon CPU.

Distributed, Parallel, and Cluster Computing

Exploring Adversarial Attack in Spiking Neural Networks with Spike-Compatible Gradient

no code implementations 1 Jan 2020 Ling Liang, Xing Hu, Lei Deng, Yujie Wu, Guoqi Li, Yufei Ding, Peng Li, Yuan Xie

Recently, learning algorithms inspired by backpropagation through time have been widely introduced into SNNs to improve performance, which makes it possible to attack the models accurately given spatio-temporal gradient maps.

Adversarial Attack

A Comprehensive and Modularized Statistical Framework for Gradient Norm Equality in Deep Neural Networks

1 code implementation 1 Jan 2020 Zhaodong Chen, Lei Deng, Bangyan Wang, Guoqi Li, Yuan Xie

Powered by our metric and framework, we analyze extensive initialization, normalization, and network structures.

Proq: Projection-based Runtime Assertions for Debugging on a Quantum Computer

no code implementations 28 Nov 2019 Gushu Li, Li Zhou, Nengkun Yu, Yufei Ding, Mingsheng Ying, Yuan Xie

In this paper, we propose Proq, a runtime assertion scheme for testing and debugging quantum programs on a quantum computer.

DARB: A Density-Aware Regular-Block Pruning for Deep Neural Networks

no code implementations 19 Nov 2019 Ao Ren, Tao Zhang, Yuhao Wang, Sheng Lin, Peiyan Dong, Yen-Kuang Chen, Yuan Xie, Yanzhi Wang

As a further optimization, we propose a density-adaptive regular-block (DARB) pruning that outperforms prior structured pruning work with high pruning ratio and decoding efficiency.

Model Compression Network Pruning

Comprehensive SNN Compression Using ADMM Optimization and Activity Regularization

no code implementations 3 Nov 2019 Lei Deng, Yujie Wu, Yifan Hu, Ling Liang, Guoqi Li, Xing Hu, Yufei Ding, Peng Li, Yuan Xie

As is well known, the huge memory and compute costs of both artificial neural networks (ANNs) and spiking neural networks (SNNs) greatly hinder their efficient deployment on edge devices.

Model Compression Quantization

Training High-Performance and Large-Scale Deep Neural Networks with Full 8-bit Integers

2 code implementations 5 Sep 2019 Yukuan Yang, Shuang Wu, Lei Deng, Tianyi Yan, Yuan Xie, Guoqi Li

In this way, all the operations in the training and inference can be bit-wise operations, pushing towards faster processing speed, decreased memory cost, and higher energy efficiency.


AccD: A Compiler-based Framework for Accelerating Distance-related Algorithms on CPU-FPGA Platforms

no code implementations 26 Aug 2019 Yuke Wang, Boyuan Feng, Gushu Li, Lei Deng, Yuan Xie, Yufei Ding

As a promising solution to boost the performance of distance-related algorithms (e.g., K-means and KNN), FPGA-based acceleration attracts lots of attention, but also comes with numerous challenges.

Distributed, Parallel, and Cluster Computing Programming Languages

Semi-Supervised Video Salient Object Detection Using Pseudo-Labels

1 code implementation ICCV 2019 Pengxiang Yan, Guanbin Li, Yuan Xie, Zhen Li, Chuan Wang, Tianshui Chen, Liang Lin

Specifically, we present an effective video saliency detector that consists of a spatial refinement network and a spatiotemporal module.

Salient Object Detection Unsupervised Video Object Segmentation +1

Neural Network Model Extraction Attacks in Edge Devices by Hearing Architectural Hints

no code implementations 10 Mar 2019 Xing Hu, Ling Liang, Lei Deng, Shuangchen Li, Xinfeng Xie, Yu Ji, Yufei Ding, Chang Liu, Timothy Sherwood, Yuan Xie

As neural networks continue their reach into nearly every aspect of software operations, the details of those networks become an increasingly sensitive subject.

Cryptography and Security Hardware Architecture

FPSA: A Full System Stack Solution for Reconfigurable ReRAM-based NN Accelerator Architecture

no code implementations 28 Jan 2019 Yu Ji, Youyang Zhang, Xinfeng Xie, Shuangchen Li, Peiqi Wang, Xing Hu, Youhui Zhang, Yuan Xie

In this paper, we propose a full system stack solution, composed of a reconfigurable architecture design, Field Programmable Synapse Array (FPSA) and its software system including neural synthesizer, temporal-to-spatial mapper, and placement & routing.

QGAN: Quantized Generative Adversarial Networks

no code implementations 24 Jan 2019 Peiqi Wang, Dongsheng Wang, Yu Ji, Xinfeng Xie, Haoxuan Song, XuXin Liu, Yongqiang Lyu, Yuan Xie

The intensive computation and memory requirements of generative adversarial neural networks (GANs) hinder their real-world deployment on edge devices such as smartphones.


A Secure and Persistent Memory System for Non-volatile Memory

no code implementations 3 Jan 2019 Pengfei Zuo, Yu Hua, Yuan Xie

Specifically, SecPM leverages the CWT scheme to guarantee the crash consistency via ensuring both the data and its counter are durable before the data flush completes, and leverages the CWR scheme to improve the system performance via exploiting the spatial locality of counter storage, log and data writes.

Distributed, Parallel, and Cluster Computing Hardware Architecture Cryptography and Security

Facial Landmark Machines: A Backbone-Branches Architecture with Progressive Representation Learning

no code implementations 10 Dec 2018 Lingbo Liu, Guanbin Li, Yuan Xie, Yizhou Yu, Qing Wang, Liang Lin

In this paper, we propose a novel cascaded backbone-branches fully convolutional neural network (BB-FCN) for rapidly and accurately localizing facial landmarks in unconstrained and cluttered settings.

Face Alignment Face Detection +2

Designing by Training: Acceleration Neural Network for Fast High-Dimensional Convolution

no code implementations NeurIPS 2018 Longquan Dai, Liang Tang, Yuan Xie, Jinhui Tang

Over the decades, people took a handmade approach to design fast algorithms for the Gaussian convolution.

TETRIS: TilE-matching the TRemendous Irregular Sparsity

no code implementations NeurIPS 2018 Yu Ji, Ling Liang, Lei Deng, Youyang Zhang, Youhui Zhang, Yuan Xie

Increasing the sparsity granularity can lead to better hardware utilization, but it will compromise the sparsity for maintaining accuracy.

HitNet: Hybrid Ternary Recurrent Neural Network

no code implementations NeurIPS 2018 Peiqi Wang, Xinfeng Xie, Lei Deng, Guoqi Li, Dongsheng Wang, Yuan Xie

For example, we improve the perplexity per word (PPW) of a ternary LSTM on the Penn Tree Bank (PTB) corpus from 126 (the state-of-the-art result to the best of our knowledge) to 110.3, with a full-precision model at 97.2, and that of a ternary GRU from 142 to 113.5, with a full-precision model at 102.7.


Image Captioning Based on a Hierarchical Attention Mechanism and Policy Gradient Optimization

no code implementations 13 Nov 2018 Shi-Yang Yan, Yuan Xie, Fang-Yu Wu, Jeremy S. Smith, Wenjin Lu, Bai-Ling Zhang

Automatically generating the descriptions of an image, i.e., image captioning, is an important and fundamental topic in artificial intelligence, which bridges the gap between computer vision and natural language processing.

Image Captioning

Bi-GANs-ST for Perceptual Image Super-resolution

no code implementations 1 Nov 2018 Xiaotong Luo, Rong Chen, Yuan Xie, Yanyun Qu, Cuihua Li

In this paper, motivated by [1], we aim to generate a high-quality SR result which balances between the two indices, i.e., the perception index and root-mean-square error (RMSE).

Image Super-Resolution SSIM

Batch Normalization Sampling

no code implementations 25 Oct 2018 Zhaodong Chen, Lei Deng, Guoqi Li, Jiawei Sun, Xing Hu, Xin Ma, Yuan Xie

In this paper, we propose alleviating this problem through sampling only a small fraction of data for normalization at each iteration.
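
The sampling idea in this excerpt can be illustrated with a minimal sketch. It assumes a one-dimensional batch of floats and uniform random sampling of the subset; the function name `sampled_batch_norm`, the `sample_fraction` parameter, and the sampling strategy are hypothetical choices for illustration and are not taken from the paper.

```python
import random
import statistics


def sampled_batch_norm(batch, sample_fraction=0.25, eps=1e-5, rng=None):
    """Normalize `batch` with mean/variance estimated from a random subset.

    Illustrative only: `sample_fraction` and uniform sampling are assumptions,
    not the authors' method.
    """
    rng = rng or random.Random(0)
    # Use at least two samples, and never more than the batch holds.
    k = min(len(batch), max(2, int(len(batch) * sample_fraction)))
    sample = rng.sample(batch, k)
    # Statistics come from the subset only, which is the source of the savings.
    mean = statistics.fmean(sample)
    var = statistics.pvariance(sample, mu=mean)
    scale = (var + eps) ** -0.5
    # The whole batch is normalized with the subset statistics.
    return [(x - mean) * scale for x in batch]


if __name__ == "__main__":
    data = [float(v) for v in range(1, 9)]
    print(sampled_batch_norm(data, sample_fraction=0.25))
```

With `sample_fraction=1.0` the sketch reduces to ordinary batch normalization over the full batch; smaller fractions trade statistical accuracy for fewer reduction operations.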

Dynamic Sparse Graph for Efficient Deep Learning

no code implementations ICLR 2019 Liu Liu, Lei Deng, Xing Hu, Maohua Zhu, Guoqi Li, Yufei Ding, Yuan Xie

We propose to execute deep neural networks (DNNs) with dynamic and sparse graph (DSG) structure for compressive memory and accelerative execution during both training and inference.

Dimensionality Reduction

Jointly Deep Multi-View Learning for Clustering Analysis

no code implementations 19 Aug 2018 Bingqian Lin, Yuan Xie, Yanyun Qu, Cuihua Li, Xiaodan Liang

To the best of our knowledge, this is the first work to model multi-view clustering in a deep joint framework, which offers a meaningful perspective on unsupervised multi-view learning.


Crossbar-aware neural network pruning

no code implementations 25 Jul 2018 Ling Liang, Lei Deng, Yueling Zeng, Xing Hu, Yu Ji, Xin Ma, Guoqi Li, Yuan Xie

Devices based on crossbar architectures have been widely adopted in neural network accelerators because of their high efficiency on vector-matrix multiplication (VMM) operations.

Network Pruning

Structurally Sparsified Backward Propagation for Faster Long Short-Term Memory Training

no code implementations 1 Jun 2018 Maohua Zhu, Jason Clemons, Jeff Pool, Minsoo Rhu, Stephen W. Keckler, Yuan Xie

Further, we can enforce structured sparsity in the gate gradients to make the LSTM backward pass up to 45% faster than the state-of-the-art dense approach and 168% faster than the state-of-the-art sparsifying method on modern GPUs.

Weakly Supervised Salient Object Detection Using Image Labels

no code implementations 17 Mar 2018 Guanbin Li, Yuan Xie, Liang Lin

Our algorithm is based on alternately exploiting a graphical model and training a fully convolutional network for model updating.

RGB Salient Object Detection Saliency Detection +1

L1-Norm Batch Normalization for Efficient Training of Deep Neural Networks

no code implementations 27 Feb 2018 Shuang Wu, Guoqi Li, Lei Deng, Liu Liu, Yuan Xie, Luping Shi

Batch Normalization (BN) has been proven to be quite effective at accelerating and improving the training of deep neural networks (DNNs).


Bridging the Gap Between Neural Networks and Neuromorphic Hardware with A Neural Network Compiler

no code implementations 15 Nov 2017 Yu Ji, Youhui Zhang, WenGuang Chen, Yuan Xie

Unlike developing neural networks (NNs) for general-purpose processors, development for NN chips usually faces some hardware-specific restrictions, such as limited precision of network signals and parameters, constrained computation scale, and limited types of non-linear functions.

Effective Image Retrieval via Multilinear Multi-index Fusion

no code implementations 27 Sep 2017 Zhizhong Zhang, Yuan Xie, Wensheng Zhang, Qi Tian

In this paper, we propose a new multi-index fusion scheme for image retrieval.

Image Retrieval

Robust Kernelized Multi-View Self-Representations for Clustering by Tensor Multi-Rank Minimization

no code implementations 15 Sep 2017 Yanyun Qu, Jinyan Liu, Yuan Xie, Wensheng Zhang

In particular, the original tensor-based multi-view self-representation clustering problem is a special case of our approach and can be solved by our algorithm.

Face Clustering

On Unifying Multi-View Self-Representations for Clustering by Tensor Multi-Rank Minimization

no code implementations 23 Oct 2016 Yuan Xie, DaCheng Tao, Wensheng Zhang, Lei Zhang, Yan Liu, Yanyun Qu

Unlike the traditional unfolding-based tensor norm, this low-rank tensor constraint has optimality properties similar to those of the matrix rank derived from SVD, so the complementary information among views can be explored more efficiently and thoroughly.

Multi-view Subspace Clustering

CNNLab: a Novel Parallel Framework for Neural Networks using GPU and FPGA-a Practical Study with Trade-off Analysis

no code implementations 20 Jun 2016 Maohua Zhu, Liu Liu, Chao Wang, Yuan Xie

To improve the performance and maintain the scalability, we present CNNLab, a novel deep learning framework using GPU and FPGA-based accelerators.

DLAU: A Scalable Deep Learning Accelerator Unit on FPGA

no code implementations 23 May 2016 Chao Wang, Qi Yu, Lei Gong, Xi Li, Yuan Xie, Xuehai Zhou

As an emerging field of machine learning, deep learning shows excellent ability in solving complex learning problems.

Weighted Schatten $p$-Norm Minimization for Image Denoising and Background Subtraction

no code implementations 3 Dec 2015 Yuan Xie, Shuhang Gu, Yan Liu, WangMeng Zuo, Wensheng Zhang, Lei Zhang

However, NNM tends to over-shrink the rank components and treats the different rank components equally, limiting its flexibility in practical applications.

Image Denoising

A New Low-Rank Tensor Model for Video Completion

no code implementations 7 Sep 2015 Wenrui Hu, DaCheng Tao, Wensheng Zhang, Yuan Xie, Yehui Yang

On the other hand, t-TNN is equal to the nuclear norm of the block circulant matricization of the twist tensor in the original domain, which extends the traditional matrix nuclear norm in a block circulant way.

Distortion-driven Turbulence Effect Removal using Variational Model

no code implementations 17 Jan 2014 Yuan Xie, Wensheng Zhang, DaCheng Tao, Wenrui Hu, Yanyun Qu, Hanzi Wang

To solve, or at least reduce, these effects, we propose a new scheme to recover a latent image from observed frames by integrating a new variational model and distortion-driven spatial-temporal kernel regression.

