We also conclude that in Agg-GNNs the selectivity of the mapping operators is tied to the properties of the filters only in the first layer of the CNN stage.
We consider resource management problems in multi-user wireless networks, which can be cast as optimizing a network-wide utility function, subject to constraints on the long-term average performance of users across the network.
In this paper, we exploit this concept to design a potential function of the hypothesis velocity fields, and prove that, if such a function diminishes to zero during the training procedure, the trajectory of the densities generated by the hypothesis velocity fields converges to the solution of the FPE in the Wasserstein-2 sense.
Moreover, our experiments on multi-resolution datasets also demonstrate that VNNs are amenable to transferability of performance over covariance matrices of different dimensions; a feature that is infeasible for PCA-based approaches.
Machine learning frameworks such as graph neural networks typically rely on a given, fixed graph to exploit relational inductive biases and thus effectively learn from network data.
Graph Neural Networks (GNNs) are powerful convolutional architectures that have shown remarkable performance in various node-level and graph-level tasks.
We consider the problems of downlink user selection and power control in wireless networks, comprising multiple transmitters and receivers communicating with each other over a shared wireless medium.
Considering a primal-dual approach, we optimize the primal variables, corresponding to the model parameters, as well as the dual variables, corresponding to the constraints.
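A minimal sketch of one such primal-dual iteration; the names (loss_grad, constraint_vals), step sizes, and the plain projected dual ascent are illustrative placeholders rather than the actual algorithm used:

```python
import numpy as np

def primal_dual_step(theta, lam, loss_grad, constraint_vals, constraint_grads,
                     lr_primal=1e-2, lr_dual=1e-2):
    """One generic primal-dual (Lagrangian) iteration.

    theta            : primal variables (model parameters), array
    lam              : dual variables, one per constraint (kept >= 0)
    loss_grad        : gradient of the objective at theta
    constraint_vals  : values g_i(theta), with g_i(theta) <= 0 meaning feasible
    constraint_grads : list of gradients of each g_i at theta
    """
    # Descend the Lagrangian L = f(theta) + sum_i lam_i * g_i(theta) in theta
    lagrangian_grad = loss_grad + sum(l * g for l, g in zip(lam, constraint_grads))
    theta = theta - lr_primal * lagrangian_grad
    # Ascend the Lagrangian in lam, projecting onto the nonnegative orthant
    lam = np.maximum(0.0, lam + lr_dual * np.asarray(constraint_vals))
    return theta, lam
```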
As the number of learnable parameters in a neural network grows with the size of the input signal, deep reinforcement learning may fail to scale, limiting the immediate generalization of such scheduling and resource allocation policies to large-scale systems.
In this paper, we study the problem of training GNNs on graphs of moderate size and transferring them to large-scale graphs.
In particular, we leverage semi-infinite optimization and non-convex duality theory to show that adversarial training is equivalent to a statistical problem over perturbation distributions, which we characterize completely.
Hence, in this paper, we analyze the stability properties of convolutional neural networks on manifolds to understand the stability of GNNs on large graphs.
Federated Learning (FL) has emerged as the tool of choice for training deep models over heterogeneous and decentralized datasets.
In this paper we provide stability results for algebraic neural networks (AlgNNs) based on non-commutative algebras.
At training time, the joint wide and deep architecture learns nonlinear representations from data.
We consider the broad class of decentralized optimal resource allocation problems in wireless networks, which can be formulated as constrained statistical learning problems with a localized information structure.
Our framework is implemented by a cascade of a convolutional and a graph neural network (CNN / GNN), addressing agent-level visual perception and feature learning, as well as swarm-level communication, local information aggregation and agent action inference, respectively.
In particular, it proves that the expected output difference between the GCNN over randomly perturbed graphs and the GCNN over the nominal graph is upper bounded by a factor that is linear in the link loss probability.
Graph neural networks (GNNs) use graph convolutions to exploit network invariances and learn meaningful feature representations from network data.
We then define two frequency dependent manifold filters that split the infinite dimensional spectrum of the LB operator in finite partitions, and prove that these filters are stable to absolute and relative perturbations of the LB operator respectively.
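Schematically, writing $\mathcal{L}$ for the Laplace-Beltrami operator with eigenpairs $\{(\lambda_i, \phi_i)\}_{i \ge 1}$, such a spectral filter acts as (the notation is an illustrative paraphrase, not the paper's)

$$
\mathbf{h}(\mathcal{L}) f \;=\; \sum_{i=1}^{\infty} \hat{h}(\lambda_i)\, \langle f, \phi_i \rangle\, \phi_i,
\qquad
\hat{h}(\lambda) \;=\; \hat{h}_k(\lambda) \ \text{ for } \lambda \in \Lambda_k,\ k = 1,\dots,K,
$$

where $\{\Lambda_k\}_{k=1}^{K}$ is a finite partition of the infinite-dimensional spectrum and $\hat{h}_k$ is the frequency response assigned to band $\Lambda_k$.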
Graph neural networks (GNNs) are processing architectures that exploit graph structural information to model representations from network data.
We demonstrate the performance of our GNN-based learning approach in a scenario of active target coverage with large networks of robots.
We present a reinforcement learning algorithm for learning sparse non-parametric controllers in a Reproducing Kernel Hilbert Space.
To overcome this challenge, we propose a task-agnostic, decentralized, low-latency method for data distribution in ad-hoc networks using Graph Neural Networks (GNN).
In this paper, we overcome this issue by learning in the empirical dual domain, where constrained statistical learning problems become unconstrained and deterministic.
Constrained reinforcement learning involves multiple rewards that must individually accumulate to given thresholds.
We are able to demonstrate the scalability of our methods for a large number of robots by employing a graph neural network (GNN) to parameterize policies for the robots.
This paper introduces the constrained Sufficiently Accurate model learning approach, provides examples of such problems, and presents a theorem on how close some approximate solutions can be.
Dynamical systems consisting of a set of autonomous agents face the challenge of having to accomplish a global task, relying only on local information.
Prediction credibility measures, in the form of confidence intervals or probability distributions, are fundamental in statistics and machine learning to characterize model robustness, detect out-of-distribution samples (outliers), and protect against adversarial attacks.
In this regard, we propose a novel Sinkhorn Natural Gradient (SiNG) algorithm which acts as a steepest descent method on the probability space endowed with the Sinkhorn divergence.
We capture the asynchrony by modeling the activation pattern as a characteristic of each node and train a policy-based resource allocation method.
In this work, we approach GCNNs from a state-space perspective revealing that the graph convolutional module is a minimalistic linear state-space model, in which the state update matrix is the graph shift operator.
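As a rough illustration of this state-space view (a sketch assuming single-feature signals, not the paper's implementation), a graph convolution $\sum_k h_k S^k x$ can be evaluated by running the recursion with the shift operator as the state update matrix:

```python
import numpy as np

def graph_convolution(S, x, taps):
    """Evaluate y = sum_k taps[k] * S^k x as a linear state-space recursion
    whose state update matrix is the graph shift operator S."""
    z = x.copy()              # state at k = 0, i.e. S^0 x
    y = taps[0] * z
    for h_k in taps[1:]:
        z = S @ z             # state update: z_k = S z_{k-1}
        y = y + h_k * z       # output accumulates the k-th filter tap
    return y

# Example on a 5-node directed cycle
S = np.roll(np.eye(5), 1, axis=1)          # cycle shift operator
y = graph_convolution(S, np.random.randn(5), taps=[1.0, 0.5, 0.25])
```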
In this paper we consider a problem known as multi-task learning, consisting of fitting a set of classifiers or regression functions intended for solving different tasks.
We then extend this analysis by interpreting the graphon neural network as a generating model for GNNs on deterministic and stochastic graphs instantiated from the original and perturbed graphons.
Algebraic neural networks (AlgNNs) are composed of a cascade of layers, each one associated with an algebraic signal model, and information is mapped between layers by means of a nonlinear function.
We define a notion of discriminability tied to the stability of the architecture, show that GNNs are at least as discriminative as linear graph filter banks, and characterize the signals that cannot be discriminated by either.
To that end we compute unbiased stochastic gradients of the value function which we use as ascent directions to update the policy.
Spherical convolutional neural networks (Spherical CNNs) learn nonlinear representations from 3D data by exploiting the data structure and have shown promising performance in shape analysis, object classification, and planning among others.
Activation functions are crucial in graph neural networks (GNNs) as they allow defining a nonlinear family of functions to capture the relationship between the input graph data and their representations.
Instead, each agent must form a local model and decide what information is fundamental to the learning problem, which will be sent to a central unit.
An AlgNN is a stacked layered information processing structure where each layer is composed of an algebra, a vector space, and a homomorphism between the algebra and the space of endomorphisms of the vector space.
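In symbols, and as a convenient paraphrase of this description rather than the paper's exact notation, a layer can be summarized as

$$
(\mathcal{A}, \mathcal{M}, \rho), \qquad \rho : \mathcal{A} \to \operatorname{End}(\mathcal{M}),
\qquad x_{\ell+1} = \sigma_\ell\big(\rho_\ell(a_\ell)\, x_\ell\big),
$$

where $\mathcal{A}$ is the algebra, $\mathcal{M}$ the vector space of signals, $\rho$ the homomorphism into the endomorphisms of $\mathcal{M}$, $a_\ell$ a learnable filter in the algebra, and $\sigma_\ell$ the nonlinearity mapping information into the next layer.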
Wireless control systems replace traditional wired communication with wireless networks to exchange information between actuators, plants and sensors in a control system.
They are presented here as generalizations of convolutional neural networks (CNNs) in which individual layers contain banks of graph convolutional filters instead of banks of classical convolutional filters.
This paper investigates the general problem of resource allocation for mitigating channel fading effects in Free Space Optical (FSO) communications.
Stochastic gradient descent is a canonical tool for addressing stochastic optimization problems, and forms the bedrock of modern machine learning and statistics.
Deterministic Policy Gradient (DPG) removes a level of randomness from standard randomized-action Policy Gradient (PG), and demonstrates substantial empirical success for tackling complex dynamic problems involving Markov decision processes.
At testing time, the deep part (nonlinear) is left unchanged, while the wide part is retrained online, leading to a convex problem.
To overcome this issue, we prove that under mild conditions the empirical dual problem of constrained learning is also a PAC constrained learner that now leads to a practical constrained learning algorithm based solely on solving unconstrained problems.
In recent years, considerable work has been done to tackle the issue of designing control laws based on observations to allow unknown dynamical systems to perform pre-specified tasks.
Dynamical systems comprised of autonomous agents arise in many relevant problems such as multi-agent robotics, smart grids, or smart cities.
We also introduce GNN extensions using edge-varying and autoregressive moving average graph filters and discuss their properties.
In this work, we propose a new strategy for pooling and sampling on GNNs using graphons which preserves the spectral properties of the graph.
We consider the problem of downlink power control in wireless networks, consisting of multiple transmitter-receiver pairs communicating with each other over a single shared wireless medium.
This paper is concerned with the study of constrained statistical learning problems, the unconstrained versions of which are at the core of virtually all of modern information processing.
More specifically, we consider that each robot has access to a visual perception of the immediate surroundings, and communication capabilities to transmit and receive messages from other neighboring robots.
This is a general linear and local operation that a node can perform and encompasses under one formulation all existing graph convolutional neural networks (GCNNs) as well as graph attention networks (GATs).
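A minimal sketch of such an operation, assuming per-hop edge-weight matrices supported on the graph (hypothetical interface): choosing every matrix as a scalar multiple of the shift operator recovers a standard graph convolution, while attention-derived coefficients yield a GAT-style aggregation.

```python
import numpy as np

def edge_varying_filter(Phi_list, x):
    """Apply a linear, local operation in which each exchange round k uses
    its own edge-weight matrix Phi_list[k], whose sparsity pattern matches
    the graph so a node only mixes values received from its neighbors."""
    z = x.astype(float).copy()
    y = z.copy()                     # k = 0 term (identity weights)
    for Phi in Phi_list:
        z = Phi @ z                  # one local exchange with edge-dependent weights
        y = y + z
    return y
```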
We train the model to imitate an expert algorithm, and use the resulting model online in decentralized planning involving only local communication and local observations.
Despite the simplicity and intuitive interpretation of Minimum Mean Squared Error (MMSE) estimators, their effectiveness in certain scenarios is questionable.
Upon further assuming the use of near-universal policy parameterizations, we also develop explicit bounds on the gap between optimal values of initial, infinite dimensional resource allocation problems, and dual values of their parameterized smoothed surrogates.
The latter is generally addressed by formulating the conflicting requirements as a constrained RL problem and solving it using Primal-Dual methods.
In this paper, we set out to study the effect that a change in the underlying graph topology that supports the signal has on the output of a GNN.
Actor-critic algorithms combine the merits of both approaches by alternating between steps to estimate the value function and policy gradient updates.
Radio on Free Space Optics (RoFSO), as a universal platform for heterogeneous wireless services, is able to transmit multiple radio frequency signals at high rates in free space optical networks.
In this work, we extend scattering transforms to network data by using multiresolution graph wavelets, whose computation can be obtained by means of graph convolutions.
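A small sketch of this construction, assuming each wavelet is specified by the taps of a polynomial in the graph shift operator S (illustrative interface, not the paper's code):

```python
import numpy as np

def graph_scattering(S, x, wavelet_banks, depth=2):
    """Cascade graph wavelet filter banks (each wavelet given as the taps of
    a polynomial in the shift operator S) with modulus nonlinearities, and
    collect the averaged coefficients at every layer as invariant features."""
    def wavelet(taps, u):
        z, out = u.copy(), taps[0] * u
        for t in taps[1:]:
            z = S @ z
            out = out + t * z
        return out

    layer, feats = [x.astype(float)], [float(x.mean())]   # low-pass of the input
    for _ in range(depth):
        nxt = []
        for u in layer:
            for taps in wavelet_banks:                    # one wavelet per scale
                v = np.abs(wavelet(taps, u))              # modulus nonlinearity
                feats.append(float(v.mean()))             # scattering coefficient
                nxt.append(v)
        layer = nxt
    return np.array(feats)
```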
Graph neural networks (GNNs) have emerged as a powerful tool for nonlinear processing of graph signals, exhibiting success in recommender systems, power outage prediction, and motion planning, among others.
To address the complexity issues, we then write the function estimation problem as a sparse functional program that explicitly minimizes the support of the representation leading to low complexity solutions.
Graph neural networks (GNNs) are information processing architectures tailored to these graph signals and made of stacked layers that compose graph convolutional filters with nonlinear activation functions.
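A minimal single-feature sketch of such a stacked architecture, with ReLU standing in for a generic activation (illustrative, not a specific paper's model):

```python
import numpy as np

def gnn_forward(S, x, layer_taps):
    """Run stacked GNN layers: each layer applies a graph convolutional
    filter (a polynomial in the shift operator S) followed by a pointwise
    ReLU nonlinearity; single-feature signals for simplicity."""
    z = x.astype(float)
    for taps in layer_taps:
        u, y = z.copy(), taps[0] * z
        for h_k in taps[1:]:
            u = S @ u
            y = y + h_k * u
        z = np.maximum(y, 0.0)       # pointwise activation
    return z
```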
We consider the problem of finding distributed controllers for large networks of mobile robots with interacting dynamics and sparsely available communications.
Graph processes model a number of important problems such as identifying the epicenter of an earthquake or predicting weather.
This paper reviews graph convolutional neural networks (GCNNs) through the lens of edge-variant graph filters.
In this paper, we investigate a method to regularize model learning techniques to provide better error characteristics for traditional control and planning algorithms.
Even if they are, recovering sparse solutions using convex relaxations requires assumptions that may be hard to meet in practice.
Graph neural networks (GNNs) have been shown to replicate convolutional neural networks' (CNNs) superior performance in many problems involving graphs.
In this paper, we propose a Distributed Accumulated Newton Conjugate gradiEnt (DANCE) method in which sample size is gradually increasing to quickly obtain a solution whose empirical loss is under satisfactory statistical accuracy.
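A sketch of the increasing-sample-size idea only; the inner solver below is plain gradient descent for brevity, whereas the method described uses Newton-type conjugate gradient steps, and all names and constants are placeholders:

```python
import numpy as np

def adaptive_sample_size_solve(grad_fn, theta, n_total,
                               n0=128, growth=2.0, inner_steps=50, lr=1e-2):
    """Solve ERM on a small subsample, then warm-start the ERM problem on a
    subsample of roughly twice the size, repeating until the full dataset is
    used.  grad_fn(theta, idx) returns the gradient on the samples in idx."""
    n = n0
    while True:
        idx = np.arange(min(n, n_total))       # current subsample
        for _ in range(inner_steps):           # inexact inner solve
            theta = theta - lr * grad_fn(theta, idx)
        if n >= n_total:
            return theta
        n = int(growth * n)                    # enlarge the sample size
```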
This paper considers the design of optimal resource allocation policies in wireless communication systems which are generically modeled as a functional optimization problem with stochastic constraints.
When the number of agents increases, the dimensionality of the input and control spaces increase as well, and these methods do not scale well.
We consider Markov Decision Problems defined over continuous state and action spaces, where an autonomous agent seeks to learn a map from its states to actions so as to maximize its long-term discounted accumulation of rewards.
Superior performance and ease of implementation have fostered the adoption of Convolutional Neural Networks (CNNs) for a wide array of inference and reconstruction tasks.
Convolutional neural networks (CNNs) are being applied to an increasing number of problems and fields due to their superior performance in classification and regression tasks.
That is, we establish that with constant step-size selections agents' functions converge to a neighborhood of the globally optimal one while satisfying the consensus constraints as the penalty parameter is increased.
Theoretical analyses show that the use of adaptive sample size methods reduces the overall computational cost of achieving the statistical accuracy of the whole dataset for a broad range of deterministic and stochastic first-order methods.
In this paper, we propose a novel adaptive sample size second-order method, which reduces the cost of computing the Hessian by solving a sequence of ERM problems corresponding to a subset of samples and lowers the cost of computing the Hessian inverse using a truncated eigenvalue decomposition.
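A sketch of the truncated-eigendecomposition idea in isolation, with an assumed tail-curvature heuristic for the discarded directions; it is not the proposed method itself:

```python
import numpy as np

def truncated_newton_step(grad, hessian, k=10, tail=1e-2):
    """Approximate a Newton step using a rank-k truncated eigenvalue
    decomposition of the Hessian; directions outside the kept subspace are
    treated as having curvature `tail`."""
    evals, evecs = np.linalg.eigh(hessian)
    top = np.argsort(evals)[::-1][:k]          # indices of the k largest eigenvalues
    V, L = evecs[:, top], evals[top]
    coef = V.T @ grad                          # gradient in the kept subspace
    step = V @ (coef / L) + (grad - V @ coef) / tail
    return -step                               # approximate Newton descent direction
```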
Despite their attractiveness, popular perception is that techniques for nonparametric function approximation do not scale to streaming data due to an intractable growth in the amount of storage they require.
We prove that the proposed DIAG method not only converges linearly to the optimal solution, but also that its linear convergence factor justifies the advantage of incremental methods over GD.
Existing approaches to resource allocation in today's stochastic networks struggle to meet fast convergence and tolerable delay requirements.
This paper characterizes hierarchical clustering methods that abide by two previously introduced axioms -- thus denominated admissible methods -- and proposes tractable algorithms for their implementation.
We introduce two practical properties of hierarchical clustering methods for (possibly asymmetric) network data: excisiveness and linear scale preservation.
To do so, we depart from the canonical decentralized optimization framework where agreement constraints are enforced, and instead formulate a problem where each agent minimizes a global objective while enforcing network proximity constraints.
Algorithms that are parallel in either of these dimensions exist, but RAPSA is the first attempt at a methodology that is parallel in both the selection of blocks and the selection of elements of the training set.
We consider discriminative dictionary learning in a distributed online setting, where a network of agents aims to learn a common set of dictionary elements of a feature space and model parameters while sequentially receiving observations.
The resulting dual D-BFGS method is a fully decentralized algorithm in which nodes approximate curvature information of themselves and their neighbors through the satisfaction of a secant condition.
In this paper, we address tracking of a time-varying parameter with unknown dynamics.
The decentralized double stochastic averaging gradient (DSA) algorithm is proposed as a solution alternative that relies on: (i) The use of local stochastic averaging gradients.
Global convergence of an online (stochastic) limited memory version of the Broyden-Fletcher-Goldfarb-Shanno (BFGS) quasi-Newton method for solving optimization problems with stochastic objectives that arise in large scale machine learning is established.
This paper introduces hierarchical quasi-clustering methods, a generalization of hierarchical clustering for asymmetric networks where the output structure preserves the asymmetry of the input data.
This paper adapts a recently developed regularized stochastic version of the Broyden, Fletcher, Goldfarb, and Shanno (BFGS) quasi-Newton method for the solution of support vector machine classification problems.
Numerical experiments showcase reductions in convergence time relative to stochastic gradient descent algorithms and non-regularized stochastic versions of BFGS.
Our construction of hierarchical clustering methods is based on defining admissible methods to be those that abide by two axioms: value (in a two-node network, the nodes are clustered together at the maximum of the two dissimilarities between them) and transformation (when dissimilarities are reduced, the network may become more clustered but not less).
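For reference, a minimal sketch of the single-linkage (minimax chain cost) ultrametric for symmetric dissimilarities, which is consistent with both axioms; the construction above covers a broader admissible family, in particular for asymmetric data:

```python
import numpy as np

def single_linkage_ultrametric(D):
    """Compute the single-linkage ultrametric of a symmetric dissimilarity
    matrix D: the merge resolution of two nodes is the minimum over chains
    of the largest dissimilarity along the chain (minimax chain cost).
    A Floyd-Warshall-style update is used for clarity, not speed."""
    U = D.astype(float).copy()
    n = U.shape[0]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                U[i, j] = min(U[i, j], max(U[i, k], U[k, j]))
    return U
```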