A theory of synaptic neural balance: From local to global order

We develop a general theory of synaptic neural balance and how it can emerge or be enforced in neural networks. For a given additive cost function R (regularizer), a neuron is said to be in balance if the total cost of its input weights is equal to the total cost of its output weights. The basic example is provided by feedforward networks of ReLU units trained with L_2 regularizers, which exhibit balance after proper training. The theory explains this phenomenon and extends it in several directions. The first direction is the extension to bilinear and other activation functions. The second direction is the extension to more general regularizers, including all L_p (p > 0) regularizers. The third direction is the extension to non-layered architectures, recurrent architectures, and convolutional architectures, as well as to architectures with mixed activation functions and to different balancing algorithms. Gradient descent on the error function alone does not converge in general to a balanced state, where every neuron is in balance, even when starting from a balanced state. However, gradient descent on the regularized error function ought to converge to a balanced state, and thus network balance can be used to assess learning progress. The theory is based on two local neuronal operations: scaling, which is commutative, and balancing, which is not commutative. Finally, and most importantly, given any set of weights, when local balancing operations are applied to each neuron in a stochastic manner, global order always emerges through the convergence of the stochastic balancing algorithm to the same unique set of balanced weights. The reason for this convergence is the existence of an underlying strictly convex optimization problem where the relevant variables are constrained to a linear manifold that depends only on the architecture. Simulations show that balancing neurons prior to learning, or during learning in alternation with gradient descent steps, can improve learning speed and performance, thereby expanding the arsenal of available training tools. Scaling and balancing operations are entirely local and thus physically plausible in biological and neuromorphic neural networks.
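To make the abstract's two operations concrete, the following is a minimal sketch of per-neuron L_2 balancing and the stochastic balancing loop for a plain feedforward ReLU network (biases omitted). All names here (balance_neuron, stochastic_balancing, the list-of-matrices representation Ws) are illustrative assumptions made for exposition, not the paper's implementation.

import numpy as np

def balance_neuron(Ws, layer, j):
    # Balance hidden neuron j of the given layer under the L_2 cost.
    # Ws[layer][j, :] holds the neuron's incoming weights u and
    # Ws[layer + 1][:, j] its outgoing weights v. ReLU is positively
    # homogeneous (relu(lam * x) = lam * relu(x) for lam > 0), so
    # scaling u by lam and v by 1/lam leaves the network's function
    # unchanged. Choosing lam = sqrt(||v|| / ||u||) equalizes the costs:
    # ||lam * u||^2 == ||v / lam||^2, i.e., the neuron is in balance.
    u = Ws[layer][j, :]
    v = Ws[layer + 1][:, j]
    nu, nv = np.linalg.norm(u), np.linalg.norm(v)
    if nu == 0.0 or nv == 0.0:
        return  # degenerate neuron: nothing to balance
    lam = np.sqrt(nv / nu)
    Ws[layer][j, :] = lam * u
    Ws[layer + 1][:, j] = v / lam

def stochastic_balancing(Ws, num_steps=10_000, seed=0):
    # Repeatedly balance hidden neurons chosen uniformly at random.
    # Balancing one neuron can unbalance its neighbors in adjacent
    # layers (the operation is not commutative), yet, per the abstract,
    # the iteration converges to the same unique balanced weights
    # regardless of the random order of updates.
    rng = np.random.default_rng(seed)
    hidden = [(l, j) for l in range(len(Ws) - 1)
              for j in range(Ws[l].shape[0])]
    for _ in range(num_steps):
        l, j = hidden[rng.integers(len(hidden))]
        balance_neuron(Ws, l, j)

# Example: a 4-8-8-2 network. After balancing, every hidden neuron's
# incoming cost sum(Ws[l][j]**2) matches its outgoing cost
# sum(Ws[l+1][:, j]**2), while the induced ReLU network's
# input-output map is unchanged.
Ws = [np.random.randn(8, 4), np.random.randn(8, 8), np.random.randn(2, 8)]
stochastic_balancing(Ws)

For a general L_p regularizer the same recipe applies with lam = (||v||_p / ||u||_p)^{1/2}, since lam^p ||u||_p^p = ||v||_p^p / lam^p at that value.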

Neural networks; Deep learning; Activation functions; Regularization; Scaling; Neural balance

Pierre Baldi, Antonios Alexos, Ian Domingo, Alireza Rahmansetayesh

Department of Computer Science, University of California, Irvine, United States of America

2025

Artificial Intelligence

SCI
ISSN: 0004-3702
Year, volume (issue): 2025, 346 (Sep.)