Photocopying permitted by license only the Gordon and Breach Science Publishers imprint.
Printed in Malaysia.
Cognition: Differential-geometrical View on Neural Networks
S.A. BUFFALOV*
Radio-physical Department, Tomsk State University, Lytkina 24-109, Tomsk 634034, Russia
(Received 4 February 1999)
A neural network taken as a model of a trainable system appears to be nothing but a dynamical system evolving on a tangent bundle with changeable metrics. In other words, to learn means to change the metrics of a definite manifold.
Keywords: Neural network, Cognition, Dynamical system, Manifold, Metrics
I. INTRODUCTION
The application of differential or integro-differential calculus to the modeling of dynamical and self-organizing processes in social and natural systems has become a tradition since the works of A. Lotka, who released the book "Elements of Physical Biology" (Baltimore, 1925), and V. Volterra, whose paper "Sulla periodicita delle fluttuazioni biologiche" appeared in 1927. Lots of complicated problems in mathematics, physics, astronomy, chemistry and biology find their solutions (Hilborn and Tufillaro, 1997) when the modern, sophisticated and carefully elaborated nonlinear-dynamical approach is implemented. It combines dynamical systems (Katok and Hasselblatt, 1995) and category theories, topology (Akin, 1993) and differential geometry, ergodic (Pollicott and Michiko, 1997) and fixed point theories, combinatorics (Harper and Mandelbaum, 1985), representation theory (Vershik, 1992), domain theory (Potts, 1995) etc. This pleiad of theories works wonderfully for natural phenomena and is absolutely helpless as soon as one tries to apply one of them to social and cultural events.
* E-mail: [email protected].
Today one should ask oneself whether the formalism of integro-differential equations one applies in the social realm is sufficient for an adequate synergetic exposition of a phenomenon such as socioeconomic development. To be fair, the most frequent answer is going to be "no". The reason is in man.
Models of some natural, socioeconomic, political etc. dynamical and self-organizing processes should take into account the presence of the anthropological factor intrinsic to them. A man with his diverse set of behavioral patterns enriches any kind of human-loaded phenomena (HLP) with unpredictability and enormous complexity.
In particular, in this humanitarian context the cognitive activity of a human being appears to be the part of HLP that is almost the most difficult to explicate and, at the same time, a generic feature of a carrier of cultural patterns and archetypes. In the modeling of synergetic aspects of physical, chemical and other "behavioral systems" there is no such difficulty. Therefore due regard for cognition in social-synergetic models, being an independent scientific problem, is suitable as a criterion of their completeness.
The article presents an elaboration of HLP models that use differential calculus by introducing a mathematical caption of cognition through consideration of a dynamical system embedded in a manifold with inconstant metrics.
The author shows that such a system is nothing but an "intellectually and mentally inspired" neural network (Buffalov, 1998) capable of learning (Mitchell, 1997), recognition (Ripley, 1996), generalizing and forecasting. It is also shown that metrics alteration actually is the training of this neural network.
There is an alternative attempt by Scott and Fucks (1995) to depict some features of the human brain using the theories of attractors and Sil’nikov chaos. It gives a notion of dynamics complexity and perpetuity by means of dynamical systems theory, and we try to use the same theory and differential geometry to show how to provide a dynamical system with intellectual and mental properties that make it suitable for modeling social and cultural HLP.
Intellectual systems with cognition and self-regulation usually represent a wide class of complex adaptive living beings studied by the humanities and by medical and biological sciences. Machine learning theory (Mitchell, 1997) reflects on man-made self-training devices analogous to their biological prototypes. We address a neural network studied by this theory as one of such artificial systems endowed with a synthetic intellect and cognition suitable for "intellectual" sophistication of the ordinary differential calculus.
II. NEURAL NETWORKS
Neural networks (Ripley, 1996) are an information processing technique based on the way biological nervous systems, such as the brain, process information. The fundamental concept of neural networks is the structure of the information processing system. Composed of a large number of highly interconnected processing elements or neurons, a neural network system uses the human-like technique of learning by example to resolve problems.
The neural network is configured for a specific application, such as data classification or pattern recognition, through a learning process called training. Just as in biological systems, learning involves adjustments to the synaptic connections that exist between the neurons. Neural networks can differ in: the way their neurons are connected; the specific kinds of computations their neurons do; the way they transmit patterns of activity throughout the network; and the way they learn, including their learning rate.
In this article we are going to use the differential-geometrical formalism to describe neural networks of a certain architecture outlined in Petritis (1995) and to implement them for the "intellectualization" of the differential formalism and of dynamical systems in particular. This approach is rather new, though there were some attempts in Potts (1995), concerning forgetful neural networks, to derive the embedding strength decay rate of the stored patterns using recent advances in domain and topology theories.
We consider neural networks that can be defined as a cascade conjunction of several properly constructed layers. The typical one has the following structure (Petritis, 1995):
(1) A level of input neurons fed with a vector of external signals.
(2) A linear transformation level. Here the input vector is multiplied by a matrix of synaptic weights responsible for information storing.
(3) A nonlinear transformations level (a set of neurons with nonlinear transfer functions). Here the linearly transformed signal is nonlinearly converted.
(4) A level of output neurons.
The last layer is fed back to the first one.
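The four levels listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `layered_step` and `run_feedback`, the choice of `tanh` as the transfer function and the weight values are hypothetical and are not taken from Petritis (1995).

```python
import numpy as np

def layered_step(signal, weights, transfer=np.tanh):
    """One pass through the levels: input -> linear -> nonlinear -> output."""
    linear = weights @ signal      # level (2): multiply by the synaptic weight matrix
    return transfer(linear)        # level (3): nonlinear transfer; the result is the output level (4)

def run_feedback(signal, weights, passes=5):
    """Feed the output level back to the input level for a fixed number of passes."""
    history = [np.asarray(signal, dtype=float)]
    for _ in range(passes):
        history.append(layered_step(history[-1], weights))
    return np.array(history)

# Hypothetical synaptic weights and external input, chosen only for illustration.
weights = np.array([[0.5, -0.2],
                    [0.1,  0.3]])
trajectory = run_feedback([1.0, -1.0], weights)
```

Each row of `trajectory` is the state of the output level after one more pass, so the array as a whole records the feedback dynamics of the network.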
The previous passage outlines the neural network’s description framework, giving a strict definition of its structure, which is of fundamental meaning in the neural network technique. Relying on that fact, we assume that any system, including a dynamical one, which allows a description within that framework can be treated as "intellectual" and possessing cognition insofar as a neural network does.
Now one can transfer the concept of cognition to the scene of differential calculus and the theory of dynamical systems in a very simple and universal fashion: just develop a generalized description of dynamical systems in such a manner that it incorporates the neural network’s description as a particular case. Such a unifying generalization will automatically assign all properties of the neural network to the dynamical system and vice versa. The context accompanying the assignment will define the differential-geometric content of cognition.
III. DIFFERENTIAL GEOMETRY BACKGROUND
A topological space is a set of points with certain subsets indicated to be open. It is required that the union of any collection of open sets and the intersection of any finite number of open sets be open as well. The set itself and the empty set should be open. We will work with an important particular case of a topological space, a metric space, for any two points $x$ and $y$ of which there is defined a function $\rho(x, y)$, called the distance between $x$ and $y$, with the following properties:
1. $\rho(x, y) = \rho(y, x)$;
2. $\rho(x, x) = 0$ and $\rho(x, y) > 0$ if $x \neq y$;
3. Triangle inequality: $\rho(x, y) \le \rho(x, z) + \rho(z, y)$.
Given a set $M$, one says that there is a structure of an $n$-dimensional differentiable manifold on $M$ if for each $x \in M$ there exists a neighborhood $U$ of $x$ and a homeomorphism $h$ from $U$ to an open ball in $\mathbb{R}^n$. We call $(U, h)$ a chart (or system of local coordinates) about $x$.
If $M$ is a manifold and $x \in M$ is a point, then we define the tangent space to $M$ at $x$ (denoted $T_xM$) to be the set of all vectors tangent to $M$ at $x$. The tangent bundle of $M$, denoted $TM$, is defined to be the disjoint union over $x \in M$ of $T_xM$, i.e. $TM = \bigcup_{x \in M} T_xM$. We think of $TM$ as the set of pairs $(x, v)$, where $x \in M$ and $v \in T_xM$. The tangent bundle is in fact a manifold itself. One can introduce the cotangent bundle by considering a covector instead of a vector.
Let $M$ be a differentiable manifold. We say that $M$ is a Riemannian manifold if there is an inner product $g_x(\cdot, \cdot)$ defined on each tangent space $T_xM$ for $x \in M$ such that for any smooth vector fields $X$ and $Y$ on $M$ the function $x \mapsto g_x(X(x), Y(x))$ is a smooth function of $x$.
In every neighborhood $U_i$ with local coordinates $(x_i^1, \dots, x_i^n)$ a positive definite symmetric matrix $g_{jk}(x_i^1, \dots, x_i^n)$ sets a Riemannian metric, so that for any vector $\xi$ at a point $x$ the equality $|\xi|^2 = g_{jk}\xi^j\xi^k$ holds.
A metric $g_{ij}(y^1, \dots, y^n)$ is said to be Euclidean if there exists a system of coordinates $x^1, \dots, x^n$, $x^i = x^i(y^1, \dots, y^n)$, $i = 1, \dots, n$, such that
$$\det\left(\frac{\partial x^i}{\partial y^j}\right) \neq 0 \quad \text{and} \quad g_{ij} = \sum_{k=1}^{n} \frac{\partial x^k}{\partial y^i}\frac{\partial x^k}{\partial y^j}.$$
These coordinates $x^1, \dots, x^n$ are called Euclidean.
IV. DYNAMICAL SYSTEMS BACKGROUND
For the purpose of this paper a dynamical system is a topological metric space $X$ and a continuous vector field $F$. The system is denoted as a pair $(X, F)$. Locally it is described by a system of ordinary differential equations of the first order.
There exist two principal approaches to dynamical systems that rest on a developed theoretical base: the Lagrangian and Hamiltonian formalisms. The first is a particular case of the last. That is why we restrict ourselves to Hamiltonian dynamical systems.
In any space $\mathbb{R}^n$ with coordinates $(y^1, \dots, y^n)$ and metric $g_{ij}$, $i, j = 1, \dots, n$, it is possible to define a scalar product and index raising. So the gradient $\nabla f$ of the function $f(y^1, \dots, y^n)$ looks like
$$(\nabla f)^i = g^{ij}\frac{\partial f}{\partial y^j}.$$
The vector field $\nabla f$ has a corresponding system of differential equations $\dot{y}^i = (\nabla f)^i$, called a gradient system.
The space with a skew-symmetrical metric $G = (g_{ij})_{i,j=1}^{2n}$ is called a phase space if it allows such coordinates $(q, p)$ that
$$G = \begin{pmatrix} 0 & I \\ -I & 0 \end{pmatrix},$$
where $I$ is the unit matrix, $p$ is a covector and $(q, p)$ belongs to a cotangent bundle of a configuration manifold.
A gradient system in a phase space is called a Hamiltonian dynamical system. In general, an even-dimensional manifold (phase space), a symplectic structure on it (integral Poincare invariant) and a function on it (Hamiltonian) completely define a Hamiltonian system.
V. "INTELLECTUALIZATION" OF DYNAMICAL SYSTEMS
We avoid considering an arbitrary dynamical system for now and address the Hamiltonian one embedded in a cotangent bundle of a configuration manifold with the Riemannian skew-symmetrical metric $G = (g_{ij})_{i,j=1}^{2n}$. Let it be described by the Hamilton equations for generalized coordinates $q^i$ and momenta $p_i$, which can be written in the following form:
$$\dot{y} = G F(y, t), \qquad (1)$$
where $y^i = q^i$, $y^{n+i} = p_i$, $i = 1, \dots, n$, and $F_j(y, t) = \partial H(y, t)/\partial y^j$, $j = 1, \dots, 2n$.
In the case of an arbitrary, not necessarily gradient, dynamical system
$$\dot{q}^i = Q^i(y, t), \quad \dot{p}_i = P_i(y, t), \quad i = 1, \dots, n,$$
in a cotangent bundle, the quantities in Eq. (1) will have the following denotation: $F^i(y, t) = Q^i(y, t)$, $F^{n+i}(y, t) = P_i(y, t)$, and $G(y, t) = (\partial \tilde{y}^i/\partial y^j)_{i,j=1}^{2n}$ is the Jacobi matrix of frame transformation ($GG^T$ is the Euclidean metric).
Equation (1) can be written in the form of a finite difference scheme with a sufficiently small time discretization step $\tau$. According to the Euler method we obtain an iterative process whose $n$th step gives
$$y_n = y_{n-1} + \tau G_n F(y_{n-1}, t_{n-1}). \qquad (2)$$
It can be easily interpreted in terms of a neural network with input vector $y$, linear transformation $G$, nonlinear transformation, i.e. a set of transfer functions $F_i$, and a feedback signal decay rate $\tau$.
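The iteration (2) can be tried out numerically. The following is a minimal sketch, assuming a constant symplectic matrix $G$ (whereas in the text $G_n$ may change from step to step) and the harmonic-oscillator Hamiltonian $H = (q^2 + p^2)/2$, for which $F = \partial H/\partial y = y$; the function name `euler_network`, the step size and the initial state are illustrative choices, not part of the paper.

```python
import numpy as np

def euler_network(y0, G, F, tau, steps):
    """Iterate y_n = y_{n-1} + tau * G F(y_{n-1}, t_{n-1}) and record every state."""
    y = np.asarray(y0, dtype=float)
    t = 0.0
    states = [y.copy()]
    for _ in range(steps):
        y = y + tau * (G @ F(y, t))   # linear level G applied to the transfer functions F
        t += tau
        states.append(y.copy())
    return np.array(states)

# Assumed example: one degree of freedom, y = (q, p), H = (q^2 + p^2) / 2.
G = np.array([[0.0, 1.0],
              [-1.0, 0.0]])           # constant skew-symmetric matrix (held fixed in this sketch)

def grad_H(y, t):
    return y                          # dH/dy for the harmonic oscillator

traj = euler_network([1.0, 0.0], G, grad_H, tau=0.001, steps=1000)
```

With these assumptions the exact flow is $q = \cos t$, $p = -\sin t$, so the last row of `traj` should lie close to $(\cos 1, -\sin 1)$; letting `G` vary between steps would correspond to the changeable metric discussed in the text.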
It is known from numerical methods that the accuracy of the approximation (2) can be substantially improved by adding to the right-hand side of (2) a vector of errors calculated using the first formula of Runge
$$R = \frac{y_{\tau} - y_{k\tau}}{k^2 - 1}, \qquad (3)$$
where $y_{\tau}$ and $y_{k\tau}$ are approximations calculated with decay rates $\tau$ and $k\tau$ for any integer $k$.
This procedure can be interpreted as a fruitful discussion between two neural networks with different decay rates (or "intellectual levels").
The discretization of Eq. (1) provides two ways of displaying cognition in the framework of dynamical systems by means of interpretations held in terms of neural network theory:
Mathematical caption of cognition through metrics alterations. Any Hamiltonian dynamical system $(X, F)$ evolving in the phase space $X$ with the changeable Riemannian skew-symmetrical metric $G$ defines a neural network with the set of transfer functions $F_i$, $i = 1, \dots, 2n$, and $G$ as the matrix of synaptic weights.
Mathematical caption of cognition through frame alterations. Both an arbitrary non-Hamiltonian dynamical system $(X, F)$ evolving in the metric space $X$ and an inconstant Jacobi matrix $G$ of frame transformation define a neural network with the set of transfer functions $F_i$, $i = 1, \dots, 2n$, and $G$ as the matrix of synaptic weights.
Resting upon one of these interpretations one can
treat a dynamical system in a differentiable manifold with a changeable metric as a neural network undergoing a training process;
give a more explicit solution of the central problem of neural network theory: memorization of an arbitrary set of patterns and determination of their attraction basins. With a certain network architecture in hand, this problem is solved by an appropriate choice of its transfer functions (i.e. a vector field $F_i$, $i = 1, \dots, n$, which is in fact a dynamical system) and training algorithm (a law of evolution of the manifold metric). In other words, the solution is given by correctly setting up a dynamical system $(X, F)$, where $X$ is a metric space with a metric $G$ providing the best (according to a given criterion) memorization of the patterns;
make use of the rich toolkit of topology and smooth theories for investigating the "knowledge" structures generated by a neural network that are invariant to continuous and smooth changes of coordinates, i.e. patterns remaining stable in the memory of the network under its training. Such patterns can be called unconditioned reflexes;
address fixed point theory as the most powerful tool for the perception of patterns stable under the network’s "cognitive" dynamics when recognizing, generalizing, predicting etc. In particular, these patterns can be called conditioned reflexes obtained through learning for certain external inputs;
refine and deepen the research of neural networks using Lie algebras of vector fields and the phase portrait of the trained neural network (the dynamics of its output signal during recognition etc.), namely, of the appropriate dynamical system in a curved manifold;
generalize one’s investigations by means of the categories of topological spaces and vector fields elaborated in category theory.
VI. METRICS ALTERATION VERSUS TRAINING
"Intellectualization" endows a dynamical system $(X, F)$ with one more degree of freedom, revealed in the plasticity of the quantities defining the metric of $X$. This plasticity reflects the training abilities of the neural network associated with the dynamical system.
Let us consider autonomous differential equations establishing an arbitrary training algorithm:
$$\dot{G} = G_{tr}(y), \qquad (4)$$
where $y$ is defined through an integral with $G$ in the integrand [refer to Eq. (2)].
Close enough to the end of the training process, the integro-differential equation (4) pertaining to $G$ can be simplified to an ordinary differential equation (see Appendix)
$$\ddot{g}_{ij} = R_{ij}^{kr} g_{kr}, \qquad (5)$$
where
$$R_{ij}^{kr}(y, t) = \frac{\partial G^{tr}_{ij}}{\partial y^k} F_r(y, t), \quad i, j, k, r = 1, \dots, 2n.$$
Here and further we mean a summation over all values of the dummy indices.
As one can see, the metrics evolution equations (5) describe the motion of $2n$ coupled oscillators.
VII. SOLUTION OF THE METRICS EVOLUTION EQUATIONS
We rewrite Eqs. (5) in a concise matrix form:
$$\ddot{\vec{g}} = R\vec{g}, \qquad (6)$$
where $\vec{g}$ is a vector representation of the metric tensor $G = (g_{ij})_{i,j=1}^{2n}$ and $R$ is an operator representation of the tensor $R_{ij}^{kr}$.
If $R$ is implicitly time-dependent and $\vec{g}(y_0, t_0) = \vec{g}_0$, $\dot{\vec{g}}(y_0, t_0) = \dot{\vec{g}}_0$ are the initial conditions, then Eq. (6) has the solution
$$\vec{g}(y, t) = \cos[W(t - t_0)]\,\vec{g}_0 + W^{-1}\sin[W(t - t_0)]\,\dot{\vec{g}}_0, \qquad (7)$$
where $W(y) = \sqrt{-R}$ and
$$\cos(Wt) = I + \frac{Rt^2}{2!} + \frac{R^2t^4}{4!} + \dots, \qquad W^{-1}\sin(Wt) = It + \frac{Rt^3}{3!} + \frac{R^2t^5}{5!} + \dots$$
We can always find a nonsingular matrix $X$ such that $\vec{g}_0 = X\dot{\vec{g}}_0$. Introducing $X$, we rewrite Eq. (7) as follows:
$$\vec{g}(y, t) = \{X\cos[W(t - t_0)] + W^{-1}\sin[W(t - t_0)]\}\,\dot{\vec{g}}_0.$$
Using the well-known trigonometric relations we obtain
$$\vec{g}(y, t) = A(y)\sin[W(t - t_0) + \Phi(y)]\,\dot{\vec{g}}_0, \qquad (8)$$
where $A = \sqrt{X^2 + W^{-2}}$ and $\Phi = \operatorname{arctg}[WX]$.
Equation (8) is the solution of the metrics evolution equation (6). It describes a complicated oscillatory dynamics of the neural network’s synaptic weights defining the metric of the manifold. Such a solution is very interesting from the neuro-dynamical point of view, since it allows one to speak about the existence in neural network theory of an analog of the unfading oscillatory electrochemical activity of the neocortex, i.e. the brain’s rhythms (Haken and Stadler, 1990).
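The oscillatory solution of $\ddot{\vec{g}} = R\vec{g}$ can be checked numerically in a simple special case. The sketch below assumes a constant, symmetric, negative definite $R$, so that $W$ with $W^2 = -R$ is real; the matrix, the initial data and the function name `metric_solution` are hypothetical and serve only to illustrate the formula $\vec{g}(t) = \cos(Wt)\,\vec{g}_0 + W^{-1}\sin(Wt)\,\dot{\vec{g}}_0$ with $t_0 = 0$.

```python
import numpy as np

def metric_solution(R, g0, gdot0, t):
    """Evaluate g(t) = cos(W t) g0 + W^{-1} sin(W t) gdot0 with W^2 = -R."""
    # Assumption: R is constant, symmetric and negative definite,
    # so W = sqrt(-R) is real and the motion is purely oscillatory.
    lam, V = np.linalg.eigh(-R)              # -R = V diag(lam) V^T with lam > 0
    w = np.sqrt(lam)                         # eigenfrequencies of W
    cos_Wt = V @ np.diag(np.cos(w * t)) @ V.T
    sin_Wt_over_W = V @ np.diag(np.sin(w * t) / w) @ V.T
    return cos_Wt @ g0 + sin_Wt_over_W @ gdot0

# Hypothetical operator and initial conditions, for illustration only.
R = -np.array([[2.0, 0.5],
               [0.5, 1.0]])
g0 = np.array([1.0, 0.0])
gdot0 = np.array([0.0, 0.5])

# Central finite difference for the second time derivative at time t.
t, h = 0.7, 1e-3
g_minus, g_mid, g_plus = (metric_solution(R, g0, gdot0, s) for s in (t - h, t, t + h))
second_derivative = (g_plus - 2.0 * g_mid + g_minus) / h**2
residual = second_derivative - R @ g_mid     # should be close to zero
```

A small `residual` indicates that the closed form satisfies the oscillator equation for this particular choice of $R$; it says nothing about the general case in which $R$ depends on $y$.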
During the training the behavior of $\vec{g}(y, t)$ is rather complicated because of the constantly varying amplitudes, frequencies and phases of the coupled harmonics in (8). But at the very moment when the neural network is trained, all these magnitudes take fixed values and no longer vary in time. The network passes into a phase of unfading oscillations whose parameters reflect the information stored by it.
VIII. CATASTROPHE
As soon as the dynamical system (2) settles down to some fixed point $y_f$, i.e. $F(y_f, t) = 0$, the elements of the metric tensor (or matrix of synaptic weights) are subjected to an unbounded linear growth in time. This becomes evident if one considers Eq. (6) with the right-hand side set to zero.
Such a catastrophic outcome occurs only if $y_f$ is a stable fixed point and the "cognitive" dynamics of the neural network fades (assume that our brain stops functioning. It’s impossible!). Otherwise, when $y_f$ is unstable, the output signals of the network evolve endlessly and never settle down. The catastrophe never occurs, but another problem, that of everlasting dynamics, appears.
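The linear growth claimed for the stable case follows in one step. As a short worked check, writing $\vec{g}_0$ and $\dot{\vec{g}}_0$ for the values of the weights and their rates at the moment $t_0$ when the fixed point is reached (notation assumed here for illustration):

```latex
% At a fixed point F(y_f, t) = 0, so the operator R, which is proportional to F, vanishes
% and the right-hand side of the metrics evolution equation is zero:
\ddot{\vec{g}} = 0
\quad\Longrightarrow\quad
\vec{g}(t) = \vec{g}_0 + \dot{\vec{g}}_0\,(t - t_0).
% Unless \dot{\vec{g}}_0 = 0, every component of \vec{g} grows linearly and without bound.
```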
To solve this problem and to make the training procedure of the neural network die out, one has to restrict the scope of the synaptic weights’ evolution in light of a special kind of dynamical system (2). One of the possible ways, which lies in wonderful agreement with experiment, is to consider a dynamical system displaying Sil’nikov chaos (Scott and Fucks, 1995). In this case it never actually settles in a stable fixed point at all, but continuously evolves in the vicinity of a saddle focus.
So to avoid the catastrophe and to provide an adequate memorization of a given set of patterns we should construct an appropriate dynamical system (2) exhibiting Sil’nikov chaos and a training algorithm (4) in such a manner that
any given pattern is a stable fixed point of the map $G_{tr}$;
any stable fixed point of the map $G_{tr}$ coincides with one of the saddle foci lying on homoclinic orbits of the dynamical system, i.e. transfer functions of the neural network.
Now we say that the neural network is trained when its output signal dynamics is restrained to a vicinity of one of the saddle foci. At this very moment the amplitudes, frequencies and phases of the coupled harmonics in (8) take "fixed" values that vary only insignificantly in time. The network passes into a phase of unfading, slowly varying oscillations whose parameters reflect the information stored by it.
IX. CONCLUSION
We tried to give due regard to cognition in social-synergetic models of HLP that use differential calculus by introducing a mathematical caption of cognition through consideration of a dynamical system embedded in a manifold with changeable metrics.
Any dynamical system $(X, F)$ evolving in the phase space $X$ with changeable Riemannian metric $G$ appears to be a neural network with transfer functions $F_i$, $i = 1, \dots, n$, and $G$ as the matrix of synaptic weights. Such an interpretation has two very important consequences:
It exceedingly enriches neural network theory with the theoretical and computational power of topology and smooth theories, category and ergodic theories, dynamical systems and fixed point theories, Lie algebras, the phase portrait technique etc.
It endows social-synergetic models with extra "cognitive" degrees of freedom, giving a real possibility to grasp the anthropological dimension of some natural, cultural, socioeconomic, political etc. dynamical and self-organizing processes.
When close enough to a fixed point, the dynamics of the synaptic weights defining the metric $G$ is described by the system of differential equations for $2n^2$ coupled oscillators. We find this solution to be in wonderful coherence with the fact of the oscillatory activity of the neocortex.
The idea of a dynamical system embedded in a manifold with inconstant metrics plays a considerable role in the new understanding of neural networks and the nature of training. The interpretation offered here does not lay claim to generality or to a complete exposition of all details. Its main purpose is to designate the new approach to
comprehension of the anthropological dimension in social-synergetic models;
understanding of neural networks within the framework of nonlinear dynamics (synergetics).
References
Akin, E. (1993). The General Topology of Dynamical Systems. Am. Math. Soc., 261.
Buffalov, S.A. (1998). Neural networks in physics. Deposited in Russian J. "Fizica", Tomsk, 26.05.98, 1591-A98. P. 8.
Haken, H. and Stadler, M. (1990). Synergetics of Cognition. Springer, Berlin, p. 438.
Harper, J.R. and Mandelbaum, R. (1985). Combinatorial Methods in Topology and Algebraic Geometry. Am. Math. Soc., p. 349.
Hilborn, R.C. and Tufillaro, N.B. (1997). Resource letter: nonlinear dynamics. Am. J. Phys. 65, 822-834.
Katok, A. and Hasselblatt, B. (1995). Introduction to the Modern Theory of Dynamical Systems. Cambridge University Press, Cambridge, p. 254.
Mitchell, T. (1997). Machine Learning. McGraw Hill, New York, p. 414.
Petritis, D. (1995). Thermodynamic formalism of neural computing. http://www.ma.utexas.edu/mp_arc/mp_arc-home.html.
Pollicott, M. and Michiko, Y. (1997). Dynamical Systems and Ergodic Theory. Cambridge University Press, Cambridge, p. 180.
Potts, P.J. (1995). The Storage Capacity of Forgetful Neural Networks. http://xyz.lanl.gov.
Ripley, B.D. (1996). Pattern Recognition and Neural Networks. Cambridge University Press, Cambridge, p. 416.
Scott, J.A. and Fucks, A. (1995). Self-organizing dynamics of the human brain: critical instabilities and Sil’nikov chaos. Chaos 5(1), 64-69.
Vershik, A.M. (1992). Representation Theory and Dynamical Systems. Am. Math. Soc., p. 267.
APPENDIX
For simplification of Eq. (4) we use the expansion of $G_{tr}(y)$ in the Taylor series in a vicinity of a fixed point $y_f$. So $G_{tr}(y_f) = 0$ implies that the neural network is trained or that the metrics evolution has come into a stationary state:
$$\dot{g}_{ij} = \left.\frac{\partial G^{tr}_{ij}}{\partial y^k}\right|_{y_f}(y^k - y_f^k) + \dots, \quad i, j, k = 1, \dots, 2n.$$
We neglect derivatives of the second and higher orders and then differentiate with respect to time. Here we use the fact that [see Eq. (4)]
$$\frac{\partial}{\partial t}\frac{\partial G^{tr}_{ij}}{\partial y^k} = 0.$$
After this we consider the system of ordinary differential equations