Search Results for author: Tao Luo

Found 59 papers, 12 papers with code

Demystifying Lazy Training of Neural Networks from a Macroscopic Viewpoint

no code implementations • 7 Apr 2024 • Yuqing Li, Tao Luo, Qixuan Zhou

While NTK typically assumes that $\lim_{m\to\infty}\frac{\log \kappa}{\log m}=\frac{1}{2}$ and requires each weight parameter to be scaled by the factor $\frac{1}{\sqrt{m}}$, in our $\theta$-lazy regime we discard this factor and relax the condition to $\lim_{m\to\infty}\frac{\log \kappa}{\log m}>0$.
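
As a brief worked reading of this condition (an illustrative choice, not taken from the paper): if the lazy-training scale follows a power law $\kappa = m^{\gamma}$ in the width $m$, then
$$\lim_{m\to\infty}\frac{\log \kappa}{\log m}=\gamma,$$
so the NTK setting corresponds to the single exponent $\gamma=\tfrac{1}{2}$, while the $\theta$-lazy regime admits any exponent $\gamma>0$.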

On the dynamics of three-layer neural networks: initial condensation

no code implementations • 25 Feb 2024 • Zheng-an Chen, Tao Luo

Empirical and theoretical works show that the input weights of two-layer neural networks, when initialized with small values, converge towards isolated orientations.

A priori Estimates for Deep Residual Network in Continuous-time Reinforcement Learning

no code implementations • 24 Feb 2024 • Shuyu Yin, Qixuan Zhou, Fei Wen, Tao Luo

However, existing performance analyses ignore the unique characteristics of continuous-time control problems, are unable to directly estimate the generalization error of the Bellman optimal loss, and require a boundedness assumption.

Importance-Aware Image Segmentation-based Semantic Communication for Autonomous Driving

no code implementations • 16 Jan 2024 • Jie Lv, Haonan Tong, Qiang Pan, Zhilong Zhang, Xinxin He, Tao Luo, Changchuan Yin

Therefore, we propose a vehicular image segmentation-oriented semantic communication system, termed VIS-SemCom, where image segmentation features of important objects are transmitted to reduce transmission redundancy.

Autonomous Driving • Image Segmentation +2

VQA4CIR: Boosting Composed Image Retrieval with Visual Question Answering

1 code implementation • 19 Dec 2023 • Chun-Mei Feng, Yang Bai, Tao Luo, Zhen Li, Salman Khan, WangMeng Zuo, Xinxing Xu, Rick Siow Mong Goh, Yong Liu

By feeding the retrieved image and the question to the VQA model, one can identify images that are inconsistent with the relative caption whenever the VQA answer disagrees with the answer in the QA pair.

Image Retrieval • Question Answering +2
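
A minimal sketch of the consistency check described in this entry, assuming a generic VQA interface; `vqa_model`, `judge_same`, and the QA-pair format are hypothetical stand-ins, not the released VQA4CIR code:

```python
def is_consistent(retrieved_image, qa_pairs, vqa_model, judge_same):
    """Keep a retrieved image only if the VQA answers agree with the QA pairs
    generated from the relative caption."""
    for question, expected_answer in qa_pairs:
        predicted = vqa_model(image=retrieved_image, question=question)
        if not judge_same(predicted, expected_answer):
            return False  # answer mismatch: image likely inconsistent with the caption
    return True

# Hypothetical usage: filter or re-rank the retrieval candidates.
# kept = [img for img in candidates if is_consistent(img, qa_pairs, vqa_model, judge_same)]
```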

Ultra-Long Sequence Distributed Transformer

no code implementations • 4 Nov 2023 • Xiao Wang, Isaac Lyngaas, Aristeidis Tsaris, Peng Chen, Sajal Dash, Mayanka Chandra Shekar, Tao Luo, Hong-Jun Yoon, Mohamed Wahib, John Gounley

This paper presents a novel and efficient distributed training method, the Long Short-Sequence Transformer (LSS Transformer), for training transformers with long sequences.

Structure and Gradient Dynamics Near Global Minima of Two-layer Neural Networks

no code implementations • 1 Sep 2023 • Leyang Zhang, Yaoyu Zhang, Tao Luo

Under mild assumptions, we investigate the structure of the loss landscape of two-layer neural networks near global minima, determine the set of parameters that gives perfect generalization, and fully characterize the gradient flows around it.

Exploring Winograd Convolution for Cost-effective Neural Network Fault Tolerance

no code implementations • 16 Aug 2023 • Xinghua Xue, Cheng Liu, Bo Liu, Haitong Huang, Ying Wang, Tao Luo, Lei Zhang, Huawei Li, Xiaowei Li

When applied to fault-tolerant neural networks enhanced with fault-aware retraining and constrained activation functions, the resulting model accuracy generally shows significant improvement in the presence of various faults.

Computational Efficiency

Optimistic Estimate Uncovers the Potential of Nonlinear Models

no code implementations • 18 Jul 2023 • Yaoyu Zhang, Zhongwang Zhang, Leyang Zhang, Zhiwei Bai, Tao Luo, Zhi-Qin John Xu

We propose an optimistic estimate to evaluate the best possible fitting performance of nonlinear models.

Stochastic Modified Equations and Dynamics of Dropout Algorithm

no code implementations • 25 May 2023 • Zhongwang Zhang, Yuqing Li, Tao Luo, Zhi-Qin John Xu

In order to investigate the underlying mechanism by which dropout facilitates the identification of flatter minima, we study the noise structure of the derived stochastic modified equation for dropout.

Relation

DeepFire2: A Convolutional Spiking Neural Network Accelerator on FPGAs

no code implementations • 9 May 2023 • Myat Thu Linn Aung, Daniel Gerlinghoff, Chuping Qu, Liwei Yang, Tian Huang, Rick Siow Mong Goh, Tao Luo, Weng-Fai Wong

Brain-inspired spiking neural networks (SNNs) replace the multiply-accumulate operations of traditional neural networks by integrate-and-fire neurons, with the goal of achieving greater energy efficiency.
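
A minimal sketch contrasting a multiply-accumulate layer with an integrate-and-fire layer; this is a generic textbook formulation for illustration, not the DeepFire2 FPGA design (layer sizes, threshold, and reset rule are assumptions):

```python
import numpy as np

def mac_layer(x, w):
    # Conventional ANN layer: one multiplication per weight (multiply-accumulate).
    return x @ w

def integrate_and_fire(spikes, w, threshold=1.0):
    # SNN layer: binary spikes turn the weighted sum into conditional additions;
    # each neuron accumulates a membrane potential and fires when it crosses threshold.
    steps, _ = spikes.shape
    potential = np.zeros(w.shape[1])
    out = np.zeros((steps, w.shape[1]))
    for t in range(steps):
        potential += spikes[t] @ w        # spikes[t] is 0/1, so only additions of weights
        fired = potential >= threshold
        out[t] = fired
        potential[fired] = 0.0            # reset the neurons that fired
    return out
```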

Hierarchical Weight Averaging for Deep Neural Networks

no code implementations • 23 Apr 2023 • Xiaozhe Gu, Zixun Zhang, Yuncheng Jiang, Tao Luo, Ruimao Zhang, Shuguang Cui, Zhen Li

Despite their simplicity, stochastic gradient descent (SGD)-like algorithms are successful in training deep neural networks (DNNs).

Phase Diagram of Initial Condensation for Two-layer Neural Networks

no code implementations • 12 Mar 2023 • Zhengan Chen, Yuqing Li, Tao Luo, Zhangchen Zhou, Zhi-Qin John Xu

The phenomenon of distinct behaviors exhibited by neural networks under varying scales of initialization remains an enigma in deep learning research.

Vocal Bursts Valence Prediction

Linear Stability Hypothesis and Rank Stratification for Nonlinear Models

no code implementations • 21 Nov 2022 • Yaoyu Zhang, Zhongwang Zhang, Leyang Zhang, Zhiwei Bai, Tao Luo, Zhi-Qin John Xu

By these results, model rank of a target function predicts a minimal training data size for its successful recovery.

Desire Backpropagation: A Lightweight Training Algorithm for Multi-Layer Spiking Neural Networks based on Spike-Timing-Dependent Plasticity

1 code implementation • 10 Nov 2022 • Daniel Gerlinghoff, Tao Luo, Rick Siow Mong Goh, Weng-Fai Wong

Spiking neural networks (SNNs) are a viable alternative to conventional artificial neural networks when resource efficiency and computational complexity are of importance.

Statistical Modeling of Soft Error Influence on Neural Networks

no code implementations • 12 Oct 2022 • Haitong Huang, Xinghua Xue, Cheng Liu, Ying Wang, Tao Luo, Long Cheng, Huawei Li, Xiaowei Li

Prior work mainly relies on fault simulation to analyze the influence of soft errors on NN processing.

Quantization

URGLQ: An Efficient Covariance Matrix Reconstruction Method for Robust Adaptive Beamforming

1 code implementation • 5 Oct 2022 • Tao Luo, Peng Chen, Zhenxin Cao, Le Zheng, Zongxin Wang

The computational complexity of the conventional adaptive beamformer is relatively large, and its performance degrades significantly due to model mismatch errors and unwanted signals in the received data.

Embedding Principle in Depth for the Loss Landscape Analysis of Deep Neural Networks

no code implementations • 26 May 2022 • Zhiwei Bai, Tao Luo, Zhi-Qin John Xu, Yaoyu Zhang

Regarding the easy training of deep networks, we show that a local minimum of an NN can be lifted to strict saddle points of a deeper NN.

An Experimental Comparison Between Temporal Difference and Residual Gradient with Neural Network Approximation

no code implementations • 25 May 2022 • Shuyu Yin, Tao Luo, Peilin Liu, Zhi-Qin John Xu

In this work, we perform extensive experiments to show that TD outperforms RG; that is, when the training leads to a small Bellman residual error, the solution found by TD has a better policy and is more robust to perturbations of the neural network parameters.

Q-Learning • reinforcement-learning +1
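
A minimal sketch of the two update rules being compared, written for a linear value function; the semi-gradient TD versus residual-gradient distinction below is the standard textbook one, used only to illustrate the comparison, not the paper's neural-network setup:

```python
import numpy as np

def td_update(w, phi_s, phi_next, reward, gamma=0.99, lr=0.01):
    # Temporal difference (semi-gradient): the bootstrap target is treated as a constant.
    delta = reward + gamma * (w @ phi_next) - w @ phi_s
    return w + lr * delta * phi_s

def rg_update(w, phi_s, phi_next, reward, gamma=0.99, lr=0.01):
    # Residual gradient: differentiate the squared Bellman residual through both terms.
    delta = reward + gamma * (w @ phi_next) - w @ phi_s
    grad = delta * (gamma * phi_next - phi_s)
    return w - lr * grad
```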

Empirical Phase Diagram for Three-layer Neural Networks with Infinite Width

no code implementations • 24 May 2022 • Hanxu Zhou, Qixuan Zhou, Zhenyuan Jin, Tao Luo, Yaoyu Zhang, Zhi-Qin John Xu

Through experiments under the three-layer condition, our phase diagram suggests a complicated set of dynamical regimes, consisting of three possible regimes together with their mixtures, for deep NNs, and provides guidance for studying deep NNs in different initialization regimes, which reveals the possibility of completely different dynamics emerging within a deep NN for its different layers.

Winograd Convolution: A Perspective from Fault Tolerance

no code implementations • 17 Feb 2022 • Xinghua Xue, Haitong Huang, Cheng Liu, Ying Wang, Tao Luo, Lei Zhang

Winograd convolution was originally proposed to reduce computing overhead by trading multiplications in neural networks (NNs) for additions via linear transformations.
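
A minimal NumPy sketch of the idea for the 1-D case F(2, 3), where two outputs of a 3-tap convolution are obtained with four multiplications instead of six; this is the standard Winograd construction, shown only for illustration (the paper studies its fault-tolerance implications, not this derivation):

```python
import numpy as np

# Standard Winograd F(2, 3) transform matrices.
B_T = np.array([[1, 0, -1, 0],
                [0, 1, 1, 0],
                [0, -1, 1, 0],
                [0, 1, 0, -1]], dtype=float)
G = np.array([[1.0, 0.0, 0.0],
              [0.5, 0.5, 0.5],
              [0.5, -0.5, 0.5],
              [0.0, 0.0, 1.0]])
A_T = np.array([[1, 1, 1, 0],
                [0, 1, -1, -1]], dtype=float)

d = np.array([1.0, 2.0, 3.0, 4.0])   # 4 input samples
g = np.array([0.5, -1.0, 2.0])       # 3-tap filter

# Element-wise product in the transformed domain: 4 multiplications.
y_winograd = A_T @ ((G @ g) * (B_T @ d))

# Direct computation of the same two outputs: 6 multiplications.
y_direct = np.array([d[0]*g[0] + d[1]*g[1] + d[2]*g[2],
                     d[1]*g[0] + d[2]*g[1] + d[3]*g[2]])

assert np.allclose(y_winograd, y_direct)
```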

Limitation of Characterizing Implicit Regularization by Data-independent Functions

no code implementations • 28 Jan 2022 • Leyang Zhang, Zhi-Qin John Xu, Tao Luo, Yaoyu Zhang

In recent years, understanding the implicit regularization of neural networks (NNs) has become a central task in deep learning theory.

Learning Theory

Overview frequency principle/spectral bias in deep learning

no code implementations • 19 Jan 2022 • Zhi-Qin John Xu, Yaoyu Zhang, Tao Luo

This low-frequency implicit bias reveals the strength of neural network in learning low-frequency functions as well as its deficiency in learning high-frequency functions.

Optimizing for In-memory Deep Learning with Emerging Memory Technology

no code implementations • 1 Dec 2021 • Zhehui Wang, Tao Luo, Rick Siow Mong Goh, Wei Zhang, Weng-Fai Wong

In-memory deep learning has already demonstrated orders of magnitude higher performance density and energy efficiency.

Embedding Principle: a hierarchical structure of loss landscape of deep neural networks

no code implementations • 30 Nov 2021 • Yaoyu Zhang, Yuqing Li, Zhongwang Zhang, Tao Luo, Zhi-Qin John Xu

We prove a general Embedding Principle of the loss landscape of deep neural networks (NNs) that unravels a hierarchical structure of the loss landscape of NNs, i.e., the loss landscape of an NN contains all critical points of all narrower NNs.

MOD-Net: A Machine Learning Approach via Model-Operator-Data Network for Solving PDEs

no code implementations • 8 Jul 2021 • Lulu Zhang, Tao Luo, Yaoyu Zhang, Weinan E, Zhi-Qin John Xu, Zheng Ma

In this paper, we propose a machine learning approach via a model-operator-data network (MOD-Net) for solving PDEs.

Privacy Budget Scheduling

1 code implementation • 29 Jun 2021 • Tao Luo, Mingen Pan, Pierre Tholoniat, Asaf Cidon, Roxana Geambasu, Mathias Lécuyer

We describe PrivateKube, an extension to the popular Kubernetes datacenter orchestrator that adds privacy as a new type of resource to be managed alongside other traditional compute resources, such as CPU, GPU, and memory.

Fairness • Scheduling

Towards Understanding the Condensation of Neural Networks at Initial Training

no code implementations • 25 May 2021 • Hanxu Zhou, Qixuan Zhou, Tao Luo, Yaoyu Zhang, Zhi-Qin John Xu

Our theoretical analysis confirms the experiments for two cases: one is the activation function of multiplicity one with arbitrary-dimensional input, which covers many common activation functions, and the other is the layer with one-dimensional input and arbitrary multiplicity.

DTNN: Energy-efficient Inference with Dendrite Tree Inspired Neural Networks for Edge Vision Applications

no code implementations • 25 May 2021 • Tao Luo, Wai Teng Tang, Matthew Kay Fei Lee, Chuping Qu, Weng-Fai Wong, Rick Goh

DTNN achieved significant energy savings (19.4X and 64.9X improvement on ResNet-18 and VGG-11 with ImageNet, respectively) with negligible loss of accuracy.

Quantization

Efficient Spiking Neural Networks with Radix Encoding

no code implementations • 14 May 2021 • Zhehui Wang, Xiaozhe Gu, Rick Goh, Joey Tianyi Zhou, Tao Luo

Traditionally, a spike train needs around one thousand time steps to approach accuracy similar to that of its ANN counterpart.
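
A minimal sketch contrasting conventional rate coding with a radix-2 (binary) code, only to illustrate why radix encoding can shorten spike trains; the paper's actual encoding and decoding scheme may differ (the bit width and decoding below are assumptions):

```python
import numpy as np

def rate_encode(value, steps=1000, seed=0):
    # Rate coding: a value in [0, 1] becomes a firing probability per step,
    # so high precision needs on the order of a thousand time steps.
    rng = np.random.default_rng(seed)
    return (rng.random(steps) < value).astype(int)

def radix2_encode(value, bits=8):
    # Radix coding: each time step carries one bit, so 8 steps give 8-bit precision.
    q = int(round(value * (2**bits - 1)))
    return [(q >> (bits - 1 - i)) & 1 for i in range(bits)]

def radix2_decode(spikes):
    bits = len(spikes)
    q = sum(s << (bits - 1 - i) for i, s in enumerate(spikes))
    return q / (2**bits - 1)
```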

Nonlinear Weighted Directed Acyclic Graph and A Priori Estimates for Neural Networks

no code implementations • 30 Mar 2021 • Yuqing Li, Tao Luo, Chao Ma

In an attempt to better understand the structural benefits and generalization power of deep neural networks, we first present a novel graph-theoretical formulation of neural network models, including fully connected networks, residual networks (ResNet), and densely connected networks (DenseNet).

RCT: Resource Constrained Training for Edge AI

no code implementations • 26 Mar 2021 • Tian Huang, Tao Luo, Ming Yan, Joey Tianyi Zhou, Rick Goh

For example, the quantisation-aware training (QAT) method involves two copies of the model parameters, which is usually beyond the capacity of on-chip memory in edge devices.
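
A minimal sketch of why standard quantisation-aware training keeps two copies of the parameters (a full-precision master copy that is updated, plus a quantised copy used in the forward pass); this is the generic QAT pattern referenced in this entry, not the RCT method itself:

```python
import numpy as np

def fake_quantize(w, num_bits=8):
    # Uniform symmetric quantisation used for the forward pass.
    scale = np.max(np.abs(w)) / (2 ** (num_bits - 1) - 1) + 1e-12
    return np.round(w / scale) * scale

w_master = np.random.randn(256, 256).astype(np.float32)    # copy 1: full precision, updated
w_quant = fake_quantize(w_master)                          # copy 2: quantised, used in forward

grad = np.random.randn(*w_master.shape).astype(np.float32) # placeholder gradient from backprop
w_master -= 0.01 * grad                                    # update the master copy
w_quant = fake_quantize(w_master)                          # refresh the quantised copy
```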

QROSS: QUBO Relaxation Parameter Optimisation via Learning Solver Surrogates

no code implementations • 19 Mar 2021 • Tian Huang, Siong Thye Goh, Sabrish Gopalakrishnan, Tao Luo, Qianxiao Li, Hoong Chuin Lau

In this way, we are able to capture the common structure of the instances and their interactions with the solver, and produce good choices of penalty parameters with a smaller number of calls to the QUBO solver.

Traveling Salesman Problem

Linear Frequency Principle Model to Understand the Absence of Overfitting in Neural Networks

no code implementations • 30 Jan 2021 • Yaoyu Zhang, Tao Luo, Zheng Ma, Zhi-Qin John Xu

Why heavily parameterized neural networks (NNs) do not overfit the data is an important, long-standing open question.

Open-Ended Question Answering

Meta-Reinforcement Learning for Reliable Communication in THz/VLC Wireless VR Networks

1 code implementation • 29 Jan 2021 • Yining Wang, Mingzhe Chen, Zhaohui Yang, Walid Saad, Tao Luo, Shuguang Cui, H. Vincent Poor

The problem is formulated as an optimization problem whose goal is to maximize the reliability of the VR network by selecting the appropriate VAPs to be turned on and controlling the user association with SBSs.

Meta-Learning • Meta Reinforcement Learning +2

Adaptive Precision Training for Resource Constrained Devices

no code implementations • 23 Dec 2020 • Tian Huang, Tao Luo, Joey Tianyi Zhou

We use a model of the same precision for both the forward and backward pass in order to reduce memory usage during training.

A comprehensive study on the semileptonic decay of heavy flavor mesons

no code implementations • 8 Dec 2020 • Lu Zhang, Xian-Wei Kang, Xin-Heng Guo, Ling-Yun Dai, Tao Luo, Chao Wang

The semileptonic decay of heavy flavor mesons offers a clean environment for the extraction of the Cabibbo-Kobayashi-Maskawa (CKM) matrix elements, which describe the CP-violating and flavor-changing processes in the Standard Model.

High Energy Physics - Phenomenology • High Energy Physics - Experiment

Fourier-domain Variational Formulation and Its Well-posedness for Supervised Learning

no code implementations • 6 Dec 2020 • Tao Luo, Zheng Ma, Zhiwei Wang, Zhi-Qin John Xu, Yaoyu Zhang

A supervised learning problem is to find a function in a hypothesis function space given values on isolated data points.

On the exact computation of linear frequency principle dynamics and its generalization

1 code implementation • 15 Oct 2020 • Tao Luo, Zheng Ma, Zhi-Qin John Xu, Yaoyu Zhang

Recent works show an intriguing phenomenon of Frequency Principle (F-Principle) that deep neural networks (DNNs) fit the target function from low to high frequency during the training, which provides insight into the training and generalization behavior of DNNs in complex tasks.

A regularized deep matrix factorized model of matrix completion for image restoration

2 code implementations • 29 Jul 2020 • Zhemin Li, Zhi-Qin John Xu, Tao Luo, Hongxia Wang

In this work, we propose a Regularized Deep Matrix Factorized (RDMF) model for image restoration, which utilizes the implicit bias of the low rank of deep neural networks and the explicit bias of total variation.

Image Restoration • Matrix Completion
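
A minimal sketch of a deep-matrix-factorised objective with a total-variation term, in the spirit of this entry's description (three factor matrices and the anisotropic TV penalty are illustrative assumptions, not the released RDMF code):

```python
import numpy as np

def total_variation(X):
    # Anisotropic total variation of a 2-D array (explicit smoothness bias).
    return np.abs(np.diff(X, axis=0)).sum() + np.abs(np.diff(X, axis=1)).sum()

def rdmf_style_loss(W1, W2, W3, observed, mask, lam=0.1):
    # Implicit low-rank bias comes from the deep factorisation W1 @ W2 @ W3;
    # the data term fits only the observed entries (matrix completion).
    X = W1 @ W2 @ W3
    data_fit = np.sum(((X - observed) * mask) ** 2)
    return data_fit + lam * total_variation(X)
```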

Phase diagram for two-layer ReLU neural networks at infinite-width limit

1 code implementation • 15 Jul 2020 • Tao Luo, Zhi-Qin John Xu, Zheng Ma, Yaoyu Zhang

In this work, inspired by the phase diagram in statistical mechanics, we draw the phase diagram for the two-layer ReLU neural network at the infinite-width limit for a complete characterization of its dynamical regimes and their dependence on hyperparameters related to initialization.

Towards an Understanding of Residual Networks Using Neural Tangent Hierarchy (NTH)

no code implementations • 7 Jul 2020 • Yuqing Li, Tao Luo, Nung Kwan Yip

Gradient descent yields zero training loss in polynomial time for deep neural networks despite the non-convex nature of the objective function.

Two-Layer Neural Networks for Partial Differential Equations: Optimization and Generalization Theory

no code implementations • 28 Jun 2020 • Tao Luo, Haizhao Yang

The problem of solving partial differential equations (PDEs) can be formulated into a least-squares minimization problem, where neural networks are used to parametrize PDE solutions.
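
A minimal sketch of the least-squares formulation for a 1-D Poisson problem $-u''(x)=f(x)$ with a two-layer network, using central differences for the second derivative; the specific equation, network width, and sampling are illustrative assumptions, not the paper's setting:

```python
import numpy as np

rng = np.random.default_rng(0)
m = 64                                   # hidden width (illustrative)
W = rng.normal(size=m)                   # input weights
b = rng.normal(size=m)                   # biases
a = rng.normal(size=m) / m               # output weights

def u(x):
    # Two-layer tanh network: u(x) = sum_k a_k * tanh(W_k * x + b_k).
    return np.tanh(np.outer(x, W) + b) @ a

def pde_residual_loss(x, f, h=1e-3):
    # Least-squares residual of -u''(x) = f(x); u'' by central differences.
    u_xx = (u(x + h) - 2 * u(x) + u(x - h)) / h ** 2
    return np.mean((-u_xx - f(x)) ** 2)   # boundary terms omitted for brevity

f = lambda x: np.pi ** 2 * np.sin(np.pi * x)   # source term for u(x) = sin(pi x)
x_train = rng.uniform(0.0, 1.0, size=128)
print(pde_residual_loss(x_train, f))
```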

EDCompress: Energy-Aware Model Compression for Dataflows

no code implementations • 8 Jun 2020 • Zhehui Wang, Tao Luo, Joey Tianyi Zhou, Rick Siow Mong Goh

EDCompress could also find the optimal dataflow type for specific neural networks in terms of energy consumption, which can guide the deployment of CNN models on hardware systems.

Model Compression

Deep Learning for Optimal Deployment of UAVs with Visible Light Communications

no code implementations • 28 Nov 2019 • Yining Wang, Mingzhe Chen, Zhaohui Yang, Tao Luo, Walid Saad

Using GRUs and CNNs, the UAVs can model the long-term historical illumination distribution and predict the future illumination distribution.

Gated Recurrent Units Learning for Optimal Deployment of Visible Light Communications Enabled UAVs

no code implementations • 17 Sep 2019 • Yining Wang, Mingzhe Chen, Zhaohui Yang, Xue Hao, Tao Luo, Walid Saad

This problem is formulated as an optimization problem whose goal is to minimize the total transmit power while meeting the illumination and communication requirements of users.

Theory of the Frequency Principle for General Deep Neural Networks

1 code implementation • 21 Jun 2019 • Tao Luo, Zheng Ma, Zhi-Qin John Xu, Yaoyu Zhang

Along with fruitful applications of Deep Neural Networks (DNNs) to realistic problems, recently, some empirical studies of DNNs reported a universal phenomenon of Frequency Principle (F-Principle): a DNN tends to learn a target function from low to high frequencies during the training.
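
A minimal sketch of how the low-to-high-frequency fitting order can be observed: track the relative error of selected Fourier components of the model output against the target over training (a generic diagnostic consistent with the F-Principle description, not the released code):

```python
import numpy as np

def frequency_errors(x, y_target, y_model, freqs):
    # Relative error between model and target at selected frequencies,
    # estimated from samples via a discrete Fourier projection.
    errs = []
    for k in freqs:
        basis = np.exp(-2j * np.pi * k * x)
        c_target = np.mean(y_target * basis)
        c_model = np.mean(y_model * basis)
        errs.append(abs(c_model - c_target) / (abs(c_target) + 1e-12))
    return errs

# Evaluated once per epoch, the low-frequency errors typically shrink first,
# followed by the high-frequency ones, matching the F-Principle.
```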

Explicitizing an Implicit Bias of the Frequency Principle in Two-layer Neural Networks

1 code implementation • 24 May 2019 • Yaoyu Zhang, Zhi-Qin John Xu, Tao Luo, Zheng Ma

It remains a puzzle why deep neural networks (DNNs), with more parameters than samples, often generalize well.

A type of generalization error induced by initialization in deep neural networks

no code implementations • 19 May 2019 • Yaoyu Zhang, Zhi-Qin John Xu, Tao Luo, Zheng Ma

Overall, our work serves as a baseline for the further investigation of the impact of initialization and loss function on the generalization of DNNs, which can potentially guide and improve the training of DNNs in practice.

Frequency Principle: Fourier Analysis Sheds Light on Deep Neural Networks

3 code implementations • 19 Jan 2019 • Zhi-Qin John Xu, Yaoyu Zhang, Tao Luo, Yanyang Xiao, Zheng Ma

We study the training process of Deep Neural Networks (DNNs) from the Fourier analysis perspective.
