home.eng.iastate.edu/~julied/publications/FCM96.pdf


  • Chapter

    Virtual Worlds as Fuzzy Dynamical Systems

    Julie A. Dickerson and Bart Kosko

    Electrical and Computer Engineering Department, Iowa State University, Ames, IA

    Department of Electrical Engineering-Systems, Signal and Image Processing Institute, University of Southern California, Los Angeles, CA

    Abstract

    Fuzzy cognitive maps (FCMs) can structure virtual worlds that change with time. A FCM links causal events, actors, values, goals, and trends in a fuzzy feedback dynamical system. A fuzzy rule defines a fuzzy patch in the input-output state space of a system. It links commonsense knowledge with state-space geometry. A FCM connects the fuzzy rules or causal flow paths that relate events. It can guide actors in a virtual world as the actors move through a web of cause and effect and react to events and to other actors. Experts draw FCM causal pictures of the virtual world. Complex FCMs can give virtual worlds with "new" or chaotic equilibrium behavior. Simple FCMs give virtual worlds with periodic behavior. They map input states to limit-cycle equilibria. A FCM limit cycle repeats a sequence of events or a chain of actions and responses. Limit cycles can control the steady-state rhythms and patterns in a virtual world. In nested FCMs each causal concept can control its own FCM or fuzzy function approximator. Appendix A shows how an additive fuzzy system can uniformly approximate any continuous (or bounded measurable) function on a compact domain to any degree of accuracy. This gives levels of fuzzy systems that can choose goals and causal webs as well as move objects and guide actors in the webs. FCM matrices sum to give a combined FCM virtual world for any number of knowledge sources. Adaptive FCMs change their fuzzy causal web as causal patterns change and as actors act and experts state their causal knowledge. Neural learning laws change the causal rules and the limit cycles. Actors learn new patterns and reinforce old ones. In complex FCMs the user can choose the dynamical structure of the virtual world from a spectrum that ranges from mildly to wildly nonlinear. We use an adaptive FCM to model an undersea virtual world of dolphins, fish, and sharks.


    Fuzzy Virtual Worlds

    What is a virtual world? It is what changes in a "virtual reality" or "cyberspace." A virtual world links humans and computers in a causal medium that can trick the mind or senses.

    At the broadest level a virtual world is a dynamical system. It changes with time as the user or an actor moves through it. In the simplest case only the user moves in the virtual world. In general both the user and the virtual world change and they change each other.

    Change in a virtual world is causal. Actors cause events to happen as they move in a virtual world. They add new patterns of cause and effect and respond to old ones. In turn the virtual world acts on the actors or on their physical or social environments. The virtual world changes their behavior and can change its own web of cause and effect. This feedback causality between actors and their virtual world makes up a complex dynamical system that can model events, actors, actions, and data as they unfold in time.

    Virtual worlds are fuzzy as well as fed back. Events occur and concepts hold only to some degree. Events cause one another to some degree. In this sense virtual worlds are fuzzy causal worlds. They are fuzzy dynamical systems. A fuzzy rule defines a fuzzy patch in the input-output state space of a system and links commonsense knowledge with state-space geometry. An additive fuzzy system approximates a function by covering its graph with fuzzy patches in the input-output state space and averaging patches that overlap.

    How do we model the fuzzy feedback causality? One way is to write down the differential equations that show how the virtual "flux" or "fluid" changes in time. This gives an exact model. The Navier-Stokes equations used in weather models give a fluid model of how actors move in a type of virtual world. They can show how clouds or tornadoes form and dissolve in a changing atmosphere or how an airplane flies through pockets of turbulence. The inverse kinematic equations of robotics show how an actor moves through or grasps in a virtual joint space. The coupled differential equations of blood glucose and insulin cast the patient as a diabetic actor awash in a virtual world of sugar and hormones. Such math models are hard to find, hard to solve, and hard to run in real time. They paint too fine a picture of the virtual world.

    Fuzzy cognitive maps (FCMs) can model the virtual world in large fuzzy chunks. They model the causal web as a fuzzy directed graph. The nodes and edges show how causal concepts affect one another to some degree in the fuzzy dynamical system. The "size" of the nodes gives the chunk size. In a virtual world the concept nodes can stand for events, actions, values, moods, goals, or trends. The causal edges state fuzzy rules or causal flows between concepts. In a predator-prey world survival threat increases prey runaway. The fuzzy rule states how much one node grows or falls as some other node grows or falls.

    Experts draw the FCMs as causal pictures. They do not state equations. They state concept nodes and link them to other nodes. The FCM system turns each picture into a matrix of fuzzy rule weights. The system weights and adds the FCM matrices to combine any number of causal pictures. More FCMs tend to sum to a better picture of the causal web with rich tangles of feedback and fuzzy edges even if each expert gives binary (present or absent) edges. This makes it easy to add or delete actors or to change the background of a virtual world or to combine virtual worlds that are disjoint or overlap. We can also let a FCM node control its own FCM to give a nested FCM in a hierarchy of virtual worlds. The node FCM can model the complex nonlinearities between the node's input and output. It can drive the motions, sounds, actions, or goals of a virtual actor.

  • Technology for Multimedia

    The FCM itself acts as a nonlinear dynamical system. Like a neural net it maps inputs to output equilibrium states. Each input digs a path through the virtual state space. In simple FCMs the path ends in a fixed point or limit cycle. In more complex FCMs the path may end in an aperiodic or "chaotic" attractor. These fixed points and attractors represent meta-rules of the form "If input, then attractor or fixed point." The rules are stored in the cube itself.

    Additive Fuzzy Systems

    A fuzzy system approximates a function by covering its graph with fuzzy patches and averaging patches that overlap. The approximation improves as the fuzzy patches grow in number and shrink in size. The figure below shows how fuzzy patches in the input-output product space $X \times Y$ cover the real function $f : X \to Y$. In panel (a) a few large patches approximate $f$. In panel (b) several smaller patches better approximate $f$. The approximation improves as we add more small patches but storage and complexity costs increase. This section gives the algebraic details of the fuzzy approximation.

    An additive fuzzy system adds the then-parts of fired if-then rules. Other fuzzy systems combine the then-part sets with pairwise maxima. A fuzzy system has rules of the form "If input conditions hold, then output conditions hold" or "If $X$ is $A$, then $Y$ is $B$" for fuzzy sets $A$ and $B$. Each fuzzy rule defines a fuzzy patch or a Cartesian product $A \times B$, as the rule-patch figure below shows. The fuzzy system covers the graph of a function with fuzzy patches and averages patches that overlap. Uncertain fuzzy sets give a large patch or fuzzy rule. Small or more certain fuzzy sets give small patches.

    Figure: (a) Four large fuzzy patches cover part of the graph of the unknown function $f : X \to Y$. Fewer patches can decrease computation but decrease approximation accuracy. (b) More smaller fuzzy patches better cover $f$ but at greater computational cost. Each fuzzy rule defines a patch in the product space $X \times Y$. A large but finite number of fuzzy rules or precise rules can cover the graph with arbitrary accuracy. [Two panels, each plotting $f$ over the axes $X$ and $Y$.]

    Figure: The fuzzy rule patch "If $X$ is fuzzy set $A_1$, then $Y$ is fuzzy set $B_1$" is the fuzzy Cartesian product $A_1 \times B_1$ in the input-output product space $X \times Y$. [The diagram shows if-part sets $A_1, A_2, A_3$ on the $X$ axis, then-part sets $B_1, B_2, B_3$ on the $Y$ axis, and the patch $A_1 \times B_1$ labeled IF X = A1, THEN Y = B1.]

    Additive fuzzy systems fire all rules in parallel and average the scaled output sets $B'_j$ to get the output fuzzy set $B$, as in the architecture figure below. Correlation product inference scales each output set $B_j$ by the degree $m_{A_j}(x)$ (or $a_j(x)$) that the rule "IF $A_j$, THEN $B_j$" fires. Most rules fire to degree 0. Defuzzification of $B$ gives a number or a control signal output. Centroidal defuzzification with correlation product inference gives the output value $y_k$ at time $k$:

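    \[
    y_k = F(x_k)
        = \frac{\int y \, m_B(y) \, dy}{\int m_B(y) \, dy}
        = \frac{\sum_{j=1}^{m} \mathrm{Volume}(B'_j) \, \mathrm{Centroid}(B'_j)}{\sum_{j=1}^{m} \mathrm{Volume}(B'_j)}
        = \frac{\sum_{j=1}^{m} c_{y_j} V_j \, m_{A_j}(x_k)}{\sum_{j=1}^{m} V_j \, m_{A_j}(x_k)}
    \]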

    $V_j$ is the volume of the $j$th output set $B_j$. We can always normalize the finite volumes $V_j$ to unity to keep some rules from dominating others. $c_{y_j}$ is the centroid of the $j$th output set. Fit value $m_{A_j}(x_k)$ scales the output set $B_j$. $m$ is the number of output fuzzy sets. In practice $B$ is connected. It need not be. But then we could view the rule "If $X$ is $A$, then $Y$ is $B$" as two or more rules of the form "If $X$ is $A$, then $Y$ is $B_1$" and "If $X$ is $A$, then $Y$ is $B_2$" where $B_1$ and $B_2$ are two of the disjoint components of $B$. So assume $B$ is connected. Then the rule patch $A \times B$ is connected and a patch proper.
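The last ratio in the centroid formula above needs only three numbers per rule: the fit value, the then-part volume, and the then-part centroid. A minimal Python sketch of that ratio follows. The function name `sam_output` and the three-rule numbers are hypothetical and chosen only for illustration.

```python
def sam_output(fit_values, volumes, centroids):
    """Centroid output sum(a_j V_j c_j) / sum(a_j V_j) of an additive fuzzy system."""
    # Each rule contributes its centroid with weight a_j * V_j.
    weights = [a * v for a, v in zip(fit_values, volumes)]
    total = sum(weights)
    if total == 0:
        # The formula is undefined when no rule fires.
        raise ValueError("no rule fires for this input")
    return sum(w * c for w, c in zip(weights, centroids)) / total


# Hypothetical three-rule system evaluated at one input x.
fit_values = [0.0, 0.5, 0.25]   # a_j(x): the first rule does not fire
volumes = [1.0, 1.0, 2.0]       # V_j
centroids = [-1.0, 0.0, 3.0]    # c_j
y = sam_output(fit_values, volumes, centroids)
print(y)
```

The rule that fires to degree zero drops out of both sums, so only the rules near the input shape the output.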

    The additive fuzzy system computes the conditional expectation $E[Y \mid X = x]$ if we view fuzzy sets as random sets: if the curve $m_A : X \to [0, 1]$ is a locus of two-point conditional densities. Then $m_A(x)$ is the probability of $A$ given that $X$ takes on $x$, or $m_A(x) = p(x \in A \mid X = x)$ and $1 - m_A(x) = p(x \notin A \mid X = x)$. The conditional mean $E[Y \mid X]$ is the mean-squared optimal estimate of $Y$ given the information known about $X$: given the information in the random or fuzzy subsets $A$ of $X$.


    Figure: Additive fuzzy system architecture. The input $x_k$ acts as a delta pulse (or unit bit vector) and fires each rule to some degree. The system adds the scaled output fuzzy sets. The centroid of this combined set gives the output value $y_k$. The system computes the conditional expectation value $E[Y \mid X = x_k]$. [The diagram feeds the input $x$ to the parallel rules IF $A_1$ THEN $B_1$, IF $A_2$ THEN $B_2$, ..., IF $A_m$ THEN $B_m$, sums the weighted outputs $B'_1, \ldots, B'_m$ with weights $w_1, \ldots, w_m$ into $B$, and passes $B$ through a centroidal defuzzifier to give $y$.]

    In Appendix A we show that a fuzzy system can approximate any continuous real function defined on a compact (closed and bounded in $R^n$) domain and show that even a bivalent expert system can uniformly approximate a bounded measurable function. The fuzzy systems have a feedforward architecture that resembles the feedforward multilayer neural systems used to approximate functions. The uniform approximation of continuous functions allows us to replace each continuous fuzzy set with a finite discretization or a point in a unit hypercube of high dimension.

    Combining the scaled or "fired" consequent fuzzy sets $B'_1, \ldots, B'_m$ in the architecture figure above with pairwise maximum gives the envelope of the fuzzy sets and tends towards the uniform distribution. Max combination ignores overlap in the fuzzy sets $B_j$. Sum combination adds overlap to the peakedness of $B$. When the input changes slightly, the additive output $B$ changes slightly. The max-combined output may ignore small input changes since for large sets of rules most change occurs in the overlap regions of the fuzzy sets $B'_j$. Here an overlap problem arises since the centroid tends to stay the same for small changes in input. But the centroid smoothly tracks changes in the fuzzy-set sum.

    We now formally derive the standard additive model (SAM) that we shall use in this chapter and show how an additive fuzzy system acts as a conditional mean. A general additive fuzzy system is a map $F : R^n \to R^p$. Both in practice and in uniform approximation proofs we restrict the domain to a compact subset $U \subset R^n$ but we need not. Watkins has proved that an additive fuzzy system with just two rules can exactly represent any bounded function $f : R \to R$ even if $f$ is not continuous. In this case the domain is the entire real line.

    The additive fuzzy system stores $m$ fuzzy patches $A_j \times B_j$ or rules of the form "If $X$ is $A_j$, then $Y$ is $B_j$." Here $A_j \subset R^n$ and $B_j \subset R^p$ are multivalued or "fuzzy" sets with set functions $a_j : R^n \to [0, 1]$ and $b_j : R^p \to [0, 1]$. We also use the membership notation $m_{A_j}(x)$ and $m_{B_j}(y)$ in this chapter for the set functions. For the following derivation we use the fit (fuzzy unit) notation $a_j$ and $b_j$ for simplicity.

    In practice we define the if-part set $A_j$ by its $n$ coordinate-projection sets $A_j^1, \ldots, A_j^n$ and thus $A_j = A_j^1 \times A_j^2 \times \cdots \times A_j^n$. How we define this fuzzy Cartesian product dictates the conjunctive (or t-norm) form of how we factor the joint set function $a_j$ into its coordinate set functions $a_j^1, \ldots, a_j^n$. Minimum combination is the most popular form

    \[
    a_j(x) = a_j^1(x_1) \wedge a_j^2(x_2) \wedge \cdots \wedge a_j^n(x_n)
    \]

    for input vector $x = (x_1, \ldots, x_n)$. Product combination

    \[
    a_j(x) = \prod_{i=1}^{n} a_j^i(x_i)
    \]

    can simplify the analysis and computation of additive systems with Gaussian or radial-basis set functions of the form

    \[
    a_j^i(x_i) = s_i^j \exp\left[ -\frac{1}{2} \left( \frac{x_i - \bar{x}_i^j}{\sigma_i^j} \right)^2 \right]
    \]

    for scaling constant $0 < s_i^j \le 1$. The choice of combination operator does not affect the structure of the standard model.
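A short Python sketch can contrast the two ways of combining coordinate fit values for Gaussian if-part sets. The helper names, the unit scaling constant, and the two-input numbers are assumptions made for illustration only. The sketch also shows why the product form is convenient for Gaussians: the product of the coordinate fit values collapses into a single exponential of the summed squared distances.

```python
import math


def gaussian_fit(x_i, mean, sigma, scale=1.0):
    """Coordinate fit value scale * exp(-0.5 * ((x_i - mean) / sigma)**2)."""
    return scale * math.exp(-0.5 * ((x_i - mean) / sigma) ** 2)


def joint_fit_min(x, means, sigmas):
    """Minimum combination of the coordinate fit values."""
    return min(gaussian_fit(xi, m, s) for xi, m, s in zip(x, means, sigmas))


def joint_fit_product(x, means, sigmas):
    """Product combination of the coordinate fit values."""
    return math.prod(gaussian_fit(xi, m, s) for xi, m, s in zip(x, means, sigmas))


# Hypothetical two-input rule with if-part centers (0, 2) and widths (1, 0.5).
x = (1.0, 2.5)
means = (0.0, 2.0)
sigmas = (1.0, 0.5)

a_min = joint_fit_min(x, means, sigmas)
a_prod = joint_fit_product(x, means, sigmas)

# With unit scaling the product equals one exponential of the summed squares.
z_squared = sum(((xi - m) / s) ** 2 for xi, m, s in zip(x, means, sigmas))
a_closed = math.exp(-0.5 * z_squared)
print(a_min, a_prod, a_closed)
```

Since every coordinate fit value lies in [0, 1], the product never exceeds the minimum, so product combination fires a rule no more strongly than min combination does.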

    The first step to show the conditional-mean property is to view each scalar fuzzy set $A_j^i$ as a random set. Then $a_j^i(x_i)$ is not the degree to which $x_i \in A_j^i$ but the conditional probability $p(x_i \in A_j^i \mid X_i = x_i)$. In the same way the complement fit value $1 - a_j^i(x_i)$ is just the dual conditional probability $p(x_i \notin A_j^i \mid X_i = x_i)$. So $A_j^i$ is not a locus of membership degrees but a locus of two-point conditional densities.

    The next step is the additive step. The $m$ fit values $a_j(x)$ "fire" the then-part sets $B_j$ to give the "inferred" sets $B'_j$. Again the result combines $a_j(x)$ and $B_j$ in some conjunctive (t-norm) way and again it depends on how we define the Cartesian patch $A_j \times B_j$. Here min is less popular than product. The min "clip" discards all information in $B_j$ above the fit height $a_j(x)$ and can thus change the centroid of $B_j$ if $B_j$ is not symmetric. Product combination or correlation product decoding keeps all relative information in $B_j$ and does not change its centroid:

    \[
    B'_j = a_j(x) \, B_j
    \]

    We use this product scaling as a default for a SAM. We can also view the inferred sets $B'_j$ as random sets. An additive model then sums these inferred sets to produce the final output set $B$:

    \[
    B = \sum_{j=1}^{m} B'_j
    \]

    Each rule can have a weight $w_j$ that scales $B'_j$ in this sum. Learning can change these weights or we can use them to model frequency or "usuality" rule weights. Here we take them as unity: $w_j = 1$.
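The difference between the min clip and product scaling shows up on any then-part set that is not symmetric. The Python sketch below uses a hypothetical triangular set on [0, 3] with its peak at 1 and a fit value of 0.5, and it estimates centroids with a simple midpoint sum. All names and numbers are assumptions for illustration.

```python
def centroid(set_function, lo, hi, steps=3000):
    """Midpoint-rule estimate of the centroid of a set function on [lo, hi]."""
    dy = (hi - lo) / steps
    ys = [lo + (k + 0.5) * dy for k in range(steps)]
    mass = sum(set_function(y) for y in ys)
    return sum(y * set_function(y) for y in ys) / mass


def b(y):
    """Hypothetical asymmetric triangular then-part set with peak at y = 1."""
    if 0.0 <= y <= 1.0:
        return y
    if 1.0 < y <= 3.0:
        return (3.0 - y) / 2.0
    return 0.0


a = 0.5  # fit value that fires the rule

c_original = centroid(b, 0.0, 3.0)
c_scaled = centroid(lambda y: a * b(y), 0.0, 3.0)       # product scaling
c_clipped = centroid(lambda y: min(a, b(y)), 0.0, 3.0)  # min clip
print(c_original, c_scaled, c_clipped)
```

Scaling multiplies numerator and denominator of the centroid by the same constant, so the centroid stays near 4/3. Clipping flattens the peak and shifts the centroid toward the longer right side of the triangle.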

    The only constraint on $B$ or $b$ is that it have a finite integral or volume:

    \[
    0 < V = \int b(y) \, dy < \infty
    \]

    This means that each input $x$ fires at least one rule to nonzero degree. Then $B/V$ is a probability density function. Indeed it is a conditional probability since it depends on the fuzzy variable $X$ taking on the input value $x$ (the ratio of a joint to a marginal):

    \[
    \frac{B}{V} = p(Y \mid X = x)
    \]

    Note this does not require that we view the if-part sets as probability density functions. They are not. Each is a locus of continuum-many two-point conditional densities. Formally the system accepts input $x_0$ as a delta pulse to produce the $m$ fit values:

    \[
    a_j(x_0) = \int \delta(x - x_0) \, a_j(x) \, dx
    \]

    Then the additive system output $F(x)$ equals the centroid of $B$:

    \[
    F(x) = \frac{\int y \, b(x, y) \, dy}{\int b(x, y) \, dy}
         = \int y \, p(Y \mid X = x) \, dy
         = E[Y \mid X = x]
    \]

    What holds for one realization of a random vector holds for them all. Hence $F = E[Y \mid X]$ as claimed. The SAM model then computes the global conditional mean value $E[Y \mid X = x]$ as a convex sum of local conditional means, as the convex-coefficient form of $F(x)$ later in this section shows.

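    We now assume that the additive fuzzy system maps real vectors into scalars, $F : R^n \to R$. Then put the additive assumption $B = \sum_j B'_j$ in the centroidal output to get the standard form of an additive model we use in this chapter:

    \[
    \begin{aligned}
    F(x) &= \frac{\int_{-\infty}^{\infty} y \sum_{j=1}^{m} b'_j(y) \, dy}{\int_{-\infty}^{\infty} \sum_{j=1}^{m} b'_j(y) \, dy} \\
         &= \frac{\sum_{j=1}^{m} \int_{-\infty}^{\infty} y \, a_j(x) \, b_j(y) \, dy}{\sum_{j=1}^{m} \int_{-\infty}^{\infty} a_j(x) \, b_j(y) \, dy} \\
         &= \frac{\sum_{j=1}^{m} a_j(x) \, V_j \, \dfrac{\int_{-\infty}^{\infty} y \, b_j(y) \, dy}{V_j}}{\sum_{j=1}^{m} a_j(x) \, V_j} \\
         &= \frac{\sum_{j=1}^{m} a_j(x) \, V_j \, c_j}{\sum_{j=1}^{m} a_j(x) \, V_j}
    \end{aligned}
    \]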

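    for then-part set volumes

    \[
    V_j = \int_{-\infty}^{\infty} b_j(y) \, dy
    \]

    and then-part set centroids

    \[
    c_j = \frac{\int_{-\infty}^{\infty} y \, b_j(y) \, dy}{\int_{-\infty}^{\infty} b_j(y) \, dy}
    \]

    The last line of this derivation is the standard additive model or SAM. It is the same as the centroidal output formula given earlier for $y_k$. It holds for $F : R^n \to R^p$ as well.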
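The earlier claim that the approximation improves as the patches grow in number and shrink in size can be checked numerically with the SAM ratio. The Python sketch below is one possible construction and not the chapter's own: it assumes triangular if-part sets spaced evenly on the domain, unit then-part volumes, and then-part centroids set to the value of the target function at each triangle center. The function names are hypothetical.

```python
import math


def triangle(x, center, width):
    """Triangular if-part set function with peak 1 at center and base 2 * width."""
    return max(0.0, 1.0 - abs(x - center) / width)


def make_sam(f, lo, hi, m):
    """Build a scalar SAM with m rules that approximates f on [lo, hi]."""
    width = (hi - lo) / (m - 1)
    centers = [lo + j * width for j in range(m)]
    centroids = [f(c) for c in centers]  # c_j; volumes V_j are taken as 1

    def F(x):
        fits = [triangle(x, c, width) for c in centers]  # a_j(x)
        return sum(a * c for a, c in zip(fits, centroids)) / sum(fits)

    return F


def max_error(f, F, lo, hi, samples=1000):
    """Largest absolute error between f and F on an even grid."""
    grid = (lo + (hi - lo) * k / samples for k in range(samples + 1))
    return max(abs(f(x) - F(x)) for x in grid)


err_few = max_error(math.sin, make_sam(math.sin, 0.0, math.pi, 5), 0.0, math.pi)
err_many = max_error(math.sin, make_sam(math.sin, 0.0, math.pi, 20), 0.0, math.pi)
print(err_few, err_many)
```

With these overlapping triangles at most two rules fire for any input, and the SAM output interpolates between neighboring centroids. More rules shrink the gaps between centers, and the worst-case error falls with them.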

    The standard model reduces to the Gaussian additive model of Wang and Mendel

    \[
    F(x) = \frac{\sum_{j=1}^{m} \bar{z}_j \left( \prod_{i=1}^{n} \mu_{A_i^j}(x_i) \right)}{\sum_{j=1}^{m} \left( \prod_{i=1}^{n} \mu_{A_i^j}(x_i) \right)}
    \]

    for the Gaussian if-part sets defined above and Gaussian then-part sets with these identifications:

    \[
    \begin{aligned}
    y &= z \\
    a_j(x) &= \prod_{i=1}^{n} a_j^i(x_i) = \prod_{i=1}^{n} \mu_{A_i^j}(x_i) \\
    V_j &= 1 \\
    c_j &= \bar{z}_j
    \end{aligned}
    \]

    The choice of product combination gives the factored form of $a_j(x)$. The unity volume $V_j = 1$ follows since Wang and Mendel integrate their $m$ then-part Gaussian sets over all of $R$ (and thus use the scaling constant in the Gaussian set function to account for the input truncation to a compact set). The identification $c_j = \bar{z}_j$ follows because the mode of a Gaussian set equals its centroid and Wang and Mendel use the mode definition: "$\bar{z}_j$ is the point in $R$ at which $\mu_{B^j}(z)$ achieves its maximum value." They used the Stone-Weierstrass Theorem to prove that additive Gaussian systems with all-product combination are uniform approximators of continuous maps on compact sets. This non-constructive result is a special case of the uniform approximation theorem for all additive systems. We review this general theorem and its constructive proof in Appendix A. It holds as well for Gaussian sets with min combination of if-part fit values or min clipping of then-part sets $B_j$.


    Next observe that taking the centroid of the additive $B$ leads to a set of convex coefficients:

    \[
    F(x) = \frac{\sum_{j=1}^{m} a_j(x) \, V_j \, c_j}{\sum_{j=1}^{m} a_j(x) \, V_j}
         = \sum_{j=1}^{m} p_j(x) \, c_j
    \]

    for the $m$ convex coefficients (or $m$ terms of a discrete probability density)

    \[
    p_j(x) = \frac{a_j(x) \, V_j}{\sum_{k=1}^{m} a_k(x) \, V_k}
    \]

    Wang and Mendel refer to the convex sum of centroids in the Gaussian case as a "fuzzy basis function expansion" even though the "basis functions" $p_j(x)$ are not orthogonal.
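The convex-coefficient view is easy to check with numbers. In the Python sketch below the four-rule fit values, volumes, and centroids are hypothetical. The coefficients are nonnegative and sum to one, so the output can never leave the interval spanned by the then-part centroids.

```python
def convex_coefficients(fit_values, volumes):
    """Coefficients p_j = a_j V_j / sum_k a_k V_k for one input."""
    weights = [a * v for a, v in zip(fit_values, volumes)]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical four-rule system evaluated at one input x.
fit_values = [0.2, 0.6, 0.0, 0.2]   # a_j(x)
volumes = [1.0, 2.0, 1.0, 0.5]      # V_j
centroids = [0.0, 1.0, 5.0, 4.0]    # c_j

p = convex_coefficients(fit_values, volumes)
F = sum(pj * cj for pj, cj in zip(p, centroids))
print(p, F)
```

The third rule has the largest centroid but a zero fit value, so it gets a zero coefficient and has no pull on the output.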

    Feedforward fuzzy systems suffer exponential rule explosion as the number of inputs increases. Optimal rules and function representation offer two ways to deal with this "curse of dimensionality." Appendix B shows how supervised learning can tune the parameters of an additive fuzzy system. FCMs allow a fuzzy system to approximate nonlinear dynamical systems with a fixed number of rules.

    Fuzzy Cognitive Maps

    Fuzzy cognitive maps (FCMs) are fuzzy signed digraphs with feedback. An FCM is an additive fuzzy system with feedback. Nodes stand for fuzzy sets or events that occur to some degree. The nodes are causal concepts. They can model events, actions, values, goals, or lumped-parameter processes.

    Directed edges stand for fuzzy rules or the partial causal flow between the concepts. The sign (+ or -) of an edge stands for causal increase or decrease. The positive edge rule in panel (a) of the figure below states that a survival threat increases runaway. It is a positive causal connection. The runaway response grows or falls as the threat grows or falls. The negative edge rule in panel (b) states that running away from a predator decreases the survival threat. It is a negative causal connection. The survival threat grows the less the prey runs away and falls the more the prey runs away. The two rules in panel (c) define a minimal feedback loop in the FCM causal web.

    A FCM with $n$ nodes has $n^2$ edges. The nodes $C_i(t)$ are fuzzy sets and so take values in $[0, 1]$. So a FCM state is the fit (fuzzy unit) vector $C(t) = (C_1(t), \ldots, C_n(t))$ and thus a point in the fuzzy hypercube $I^n = [0, 1]^n$. A FCM inference is a path or point sequence in $I^n$. It is a fuzzy process or indexed family of fuzzy sets $C(t)$. The FCM can only "forward chain" to answer what-if questions. Nonlinearities do not permit reverse causality. FCMs cannot "backward chain" to answer why questions.

    Figure: Directed edges stand for fuzzy rules or the partial causal flow between the concepts. The sign (+ or -) of an edge stands for causal increase or decrease. (a) A positive edge rule states that a survival threat increases runaway. (b) A negative edge rule states that running away from a predator decreases the survival threat. (c) Two rules define a minimal feedback loop in the FCM causal web. [Panel (a): Survival Threat to Run Away with sign +. Panel (b): Run Away to Survival Threat with sign -. Panel (c): both edges joined in a loop.]

    The FCM nonlinear dynamical system acts as a neural network. For each input state $C(0)$ it digs a trajectory in $I^n$ that ends in an equilibrium attractor $A$. The FCM quickly converges or "settles down" to a fixed point, limit cycle, limit torus, or chaotic attractor in the fuzzy cube. The figure below shows three attractors or meta-rules for a 2-D dynamical FCM.

    The output equilibrium is the answer to a causal what-if question: What if $C(0)$ happens? In this sense each FCM stores a set of global rules of the form "If $C(0)$, then equilibrium attractor $A$."

    The size of the attractor regions in the fuzzy cube governs the number of these global rules or "hidden patterns." All points in the attractor region map to the attractor. A FCM with a global fixed point has only one global rule. All input balls "roll down" its "well." FCMs can have large and small attractor regions in the fuzzy cube. The attractor types can vary in complex FCMs with highly nonlinear concepts and edges. Then one input state may lead to chaos and a more distant input state may end in a fixed point or limit cycle.

    Figure: The unit square is the state space for a FCM with two nodes. The system has at most four fuzzy edge rules. In this case it has three fuzzy meta-rules of the form "If input state vector $C$, then attractor $A$." The state $C_0$ converges to a fixed point $F$. [The diagram marks the corners (0,0), (0,1), (1,1), and (1,0) of the unit square, a limit cycle, a chaotic attractor, the fixed point $F$, and the initial state $C_0$.]

    Simple FCMs

    Simple FCMs have bivalent nodes and trivalent edges. Concept values $C_i$ take values in $\{0, 1\}$. Causal edges take values in $\{-1, 0, 1\}$. So each simple FCM state vector is one of the $2^n$ vertices of the fuzzy cube $I^n$. The FCM trajectory hops from vertex to vertex. It ends in a fixed point or limit cycle at the first repeated vector.

    We can draw simple FCMs from articles, editorials, or surveys. Most persons can state the sign of causal flow between nodes. The hard part is to state its degree or magnitude. We can average expert responses as in the weighted FCM sum given later in this section or use neural systems to learn fuzzy edge weights from data. The expert responses can initialize the causal learning or modify it as a type of forcing function.

    The figure below shows a simple FCM with five concept nodes. The connection or edge matrix $E$ lists the causal links between nodes.

    Figure: Simple FCM with five concept nodes. Edges show directed causal flow between nodes. [Concept nodes: C1 Herd Clustering, C2 Fatigue, C3 Rest, C4 Survival Threat, C5 Run Away. Signed edges (+ or -) connect the nodes.]

    [Edge matrix $E$: a 5 x 5 matrix with rows and columns labeled $C_1, \ldots, C_5$ and entries in $\{-1, 0, 1\}$. The entries are not legible in this copy.]

    The $i$th row lists the connection strength of the edges $e_{ik}$ directed out from causal concept $C_i$. The $i$th column lists the edges $e_{ki}$ directed into $C_i$. $C_i$ causally increases $C_k$ if $e_{ik} > 0$, decreases $C_k$ if $e_{ik} < 0$, and has no effect if $e_{ik} = 0$. In the example one causal concept increases two other concepts and decreases a third. Two concepts decrease a common concept. One concept increases another. [The concept indices in these statements are not legible in this copy.]

    FCM Recall

    FCMs recall as the FCM dynamical system equilibrates. Simple FCM inference thresholds a matrix-vector multiplication. State vectors $C_n$ cycle through the FCM adjacency matrix $E$: $C_n \to E \to C_{n+1} \to E \to C_{n+2} \to \cdots$. The system nonlinearly transforms the weighted input to each node $C_i$:

    \[
    C_i(t_{n+1}) = S\left[ \sum_{k=1}^{N} e_{ki}(t_n) \, C_k(t_n) \right]
    \]

    Here $S(x)$ is a bounded signal function. For simple FCMs the sigmoid function

    \[
    S(y) = \frac{1}{1 + e^{-c(y - T)}}
    \]

    with large $c > 0$ approximates a binary threshold function.

    Simple threshold FCMs quickly converge to stable limit cycles or fixed points. These limit cycles show "hidden patterns" in the causal web of the FCM.
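The recall rule can be sketched in a few lines of Python. The three-node ring matrix below is a hypothetical example, not one of the chapter's FCMs, and the threshold at zero and the sigmoid parameters are likewise assumptions. The sketch applies the thresholded sum over incoming edges until a state repeats and then reports the limit cycle.

```python
import math


def threshold(x, T=0.0):
    """Binary threshold signal function: 1 if x exceeds T, else 0."""
    return 1 if x > T else 0


def sigmoid(y, c=50.0, T=0.5):
    """Sigmoid signal function; a large c brings it close to a threshold at T."""
    return 1.0 / (1.0 + math.exp(-c * (y - T)))


def fcm_step(state, E):
    """One recall step: node i thresholds the sum over k of e_ki * C_k."""
    n = len(state)
    return tuple(
        threshold(sum(E[k][i] * state[k] for k in range(n))) for i in range(n)
    )


def recall(state, E):
    """Iterate until a state repeats; return the path and the limit cycle."""
    seen = {}
    path = []
    state = tuple(state)
    while state not in seen:
        seen[state] = len(path)
        path.append(state)
        state = fcm_step(state, E)
    return path, path[seen[state]:]


# Hypothetical ring: C1 increases C2, C2 increases C3, C3 increases C1.
E_ring = [
    [0, 1, 0],
    [0, 0, 1],
    [1, 0, 0],
]
path, cycle = recall((1, 0, 0), E_ring)
print(cycle)
```

A binary FCM with n nodes has only 2^n states, so the loop must meet a repeated state and stop. A limit cycle of length one is a fixed point.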

    The simple five-node FCM in the figure above gives a three-step limit cycle when an input state $C_0$ fires the FCM network. The recall equation and binary thresholding give the sequence of state vectors that ends in this limit cycle. [The worked example lists $C_0$ and the thresholded products $C_n E$ that lead from each state to the next. The vector entries are not legible in this copy.]

    In a virtual world the limit cycle might be, in order: wake up, go to work, come home, then wake up again. Some complex actions such as walking break down into simple cycles of movement.

    Each node in a simple FCM turns actions or goals on and off. Each node can control its own FCM, fuzzy control system, goal-directed animation system, force feedback, or other input-output map. The FCM can control the temporal associations or timing cycles that structure virtual worlds. These patterns establish the rhythm of the world. "Grandmother" nodes can control the time spent on each step in a FCM "avalanche." This can change the update rate and thus the timing for the network.

    Augmented FCMs

    FCM matrices additively combine to form new FCMs. This allows combination of FCMs for different actors or environments in the virtual world. The new (augmented) FCM includes the union of the causal concepts for all the actors and the environment in the virtual world. If a FCM does not include a concept, then those rows and columns are all zero. The sum of the augmented (zero-padded) FCM matrices for each actor forms the virtual world:

    \[
    F = \sum_{i=1}^{n} w_i F_i
    \]

    The $w_i$ are positive weights for the $i$th FCM $F_i$. The weights state the relative value of each FCM in the virtual world and can weight any subgraph of the FCM. Panel (a) of the figure below shows three simple FCMs. The weighted sum combines these FCMs to give the new FCM in panel (b) that has fuzzy or multivalued edges.

    \[
    F = \frac{1}{3} \, (F_1 + F_2 + F_3)
    \]

    [The entries of the resulting 6 x 6 connection matrix are not legible in this copy.]

    The FCM sum helps knowledge acquisition. Any number of experts can describe their FCM virtual world views and the weighted sum will combine them.
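The zero-padded weighted sum can be sketched in Python as follows. The three small expert FCMs, their concept names, and the equal weights of 1/3 are hypothetical and serve only to illustrate the mechanics. Each FCM is stored as a dictionary from (source, target) pairs to signed edge values.

```python
def augment(edges, concepts):
    """Zero-pad one FCM, given as {(source, target): weight}, to all concepts."""
    index = {c: i for i, c in enumerate(concepts)}
    n = len(concepts)
    matrix = [[0.0] * n for _ in range(n)]
    for (src, dst), w in edges.items():
        matrix[index[src]][index[dst]] = w
    return matrix


def combine(fcms, weights):
    """Weighted sum of the augmented matrices over the union of concepts."""
    concepts = sorted({c for edges in fcms for pair in edges for c in pair})
    n = len(concepts)
    total = [[0.0] * n for _ in range(n)]
    for edges, w in zip(fcms, weights):
        padded = augment(edges, concepts)
        for i in range(n):
            for k in range(n):
                total[i][k] += w * padded[i][k]
    return concepts, total


# Three hypothetical expert FCMs with bivalent (+1 or -1) edges.
fcm1 = {("threat", "run"): 1, ("run", "threat"): -1}
fcm2 = {("threat", "run"): 1, ("run", "fatigue"): 1}
fcm3 = {("threat", "run"): 1, ("fatigue", "rest"): 1, ("run", "threat"): -1}

concepts, F = combine([fcm1, fcm2, fcm3], [1 / 3, 1 / 3, 1 / 3])
ix = {c: i for i, c in enumerate(concepts)}
print(concepts)
print(F[ix["threat"]][ix["run"]], F[ix["run"]][ix["threat"]])
```

Edges that all experts agree on keep full strength, while edges that only some experts draw come out fractional. This is how binary expert pictures sum to fuzzy edge weights.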

    Figure: FCMs combine additively. (a) Three bivalent FCMs. (b) Augmented FCM. The augmented FCM takes the union of the causal concepts of the smaller FCMs and sums the augmented connection matrices as in the weighted FCM sum. [Panel (a) shows FCM 1, FCM 2, and FCM 3 with signed edges over subsets of the concepts C1 to C6. Panel (b) shows the augmented FCM over C1 to C6 with edge weights such as 1, 2/3, 1/3, -1/3, and -2/3.]

    The additive structure of combined FCMs also permits a Delphi or questionnaire approach to knowledge acquisition. In contrast an AI expert system is a binary tree with graph search. Two or more trees need not combine to a tree. Combined FCMs tend to have feedback or closed loops and that precludes graph search with forward or backward "chaining." The strong law of large numbers ensures that the knowledge estimate $F$ in the weighted sum improves with the expert sample size $n$ if we view the experts as independent (unique) random knowledge sources with finite variance (bounded uncertainty) and identical distribution (same problem-domain focus). The sample FCM converges to the unknown population FCM as the number of experts grows.

    The FCM sum can lead to new limit cycles that are not found in the individual summed FCMs. The equilibria of the three FCMs in panel (a) of the figure above are as follows. FCM 1 has a fixed point and limit cycles. FCM 2 has a limit cycle. FCM 3 has one fixed point. The combined FCM has no fixed points and one limit cycle. [The state vectors and step counts of these equilibria are not legible in this copy.]

    This limit cycle is distinct from the limit cycles of each of the summed FCMs.

    Nested FCMs

    FCMs can bring goals and intentions to virtual worlds as they define dynamic physical and social environments. This can give the "common representation" needed for a virtual world. The FCM can combine simple actions to model "intelligent" behavior. Each node in turn can control its own simple FCM in a nested FCM. Complex actions such as walking emerge from networks of simple reflexes. Nested simple FCMs can mimic this process as a net of finite state machines with binary limit cycles.

    The output of a simple FCM is a binary limit cycle that describes actions or goals. This holds even if the binary concept nodes change state asynchronously. Each output turns a function on or off as in a robotic neural net. This output can control smaller FCMs or fuzzy control systems. These systems can drive visual, auditory, or tactile outputs of the virtual world. The FCM can control the temporal associations or timing cycles that structure virtual worlds. The FCM state vector drives the motion of each character as in a frame in a cartoon. Simple equations of motion can move each actor between the states.

    FCM nesting extends to any number of fuzzy sets for the inputs. A concept can divide into smaller fuzzy sets or subconcepts. The edges or rules link the sets. This leads to a discrete multivalued output for each node. Enough nodes allow this system to approximate any continuous function for signal functions of the form given in the FCM recall section. The subconcepts $Q_{ij}$ partition the fuzzy concept $C_j$:

    \[
    C_j = \bigcup_{i=1}^{N_j} Q_{ij}
    \]

    The figure below shows the concept of a SURVIVAL THREAT divided into subconcepts. Each subconcept is the degree of threat.

    Figure: Subconcepts map to other concepts. This gives a more varied response. [The diagram links the subconcepts Small Survival Threat, Medium Survival Threat, and Large Survival Threat to the responses Avoid Predator and Evade Predator with signed edges.]

    The FCM edges or rules map one subconcept to another. These subconcept mappings form a fuzzy system or set of fuzzy if-then rules that map inputs to outputs. Each mapping is a fuzzy rule or state-space patch that links fuzzy sets. The patches cover the graph of some function in the input-output state space. The fuzzy system then averages the patches that overlap to give an approximation of a continuous function. The subconcept figure shows how subconcepts can map to different responses in the FCM. This gives a more varied response to changes in the virtual world.
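    The averaging of overlapping patches can be sketched in a few lines. This is a minimal illustration and not the authors' implementation: it assumes triangular if-part sets and the usual additive form in which the output is the centroid of the fit-scaled then-part sets. The rule base `RULES`, the function names, and all numeric values are hypothetical.

```python
def triangle(x, center, half_width):
    """Triangular set function: 1 at the center, 0 at or beyond half_width."""
    return max(0.0, 1.0 - abs(x - center) / half_width)


def additive_fuzzy_output(x, rules):
    """Centroid of the summed, fit-scaled then-part sets.

    Each rule is (if_center, if_half_width, then_centroid, then_area).
    Rules whose if-part sets overlap at x are averaged by their fit values.
    """
    numerator = 0.0
    denominator = 0.0
    for if_center, if_half_width, then_centroid, then_area in rules:
        fit = triangle(x, if_center, if_half_width)
        numerator += fit * then_area * then_centroid
        denominator += fit * then_area
    if denominator == 0.0:
        raise ValueError("no rule fires at this input")
    return numerator / denominator


# Hypothetical rule base: threat degree in [0, 1] maps to swim speed.
RULES = [
    (0.0, 0.5, 0.2, 1.0),  # SMALL threat  -> slow
    (0.5, 0.5, 1.0, 1.0),  # MEDIUM threat -> moderate
    (1.0, 0.5, 3.0, 1.0),  # LARGE threat  -> fast
]

# A threat of 0.75 fires the MEDIUM and LARGE rules to the same degree.
speed = additive_fuzzy_output(0.75, RULES)
```

    An input between two patch centers fires both rules in part, so the output blends their responses instead of jumping between them.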

    Virtual Undersea World

    The dolphin figure below shows a simple FCM for a virtual dolphin. It lists a causal web of goals and actions in the life of a dolphin. The connection matrix $E_D$ states these causal relations in numbers.

    [Matrix $E_D$: 10 x 10 signed connection matrix with rows and columns D1 through D10; the entries are not recoverable here.]

    The ith row lists the connection strength of the edges $e_{ik}$ directed out from causal concept $D_i$ and the ith column lists the edges $e_{ki}$ directed into $D_i$. Row 9 shows how the concept SURVIVAL THREAT changes the other concepts. Column 9 shows the concepts that change SURVIVAL THREAT.

    We can model the effect of a survival threat on the dolphin FCM as a sustained input to D9. This means D9 stays on for all time $t_k$. The first state vector is the initial input state of the dolphin FCM.

    [Figure: signed causal edges connect the concepts D1: Hunger, D2: Companionship, D3: Fatigue, D4: Rest, D5: Herd Clustering, D6: Food Search, D7: Chase Food, D8: Catch & Eat Food, D9: Survival Threat, and D10: Run Away.]

    Figure: Trivalent fuzzy cognitive map for the control of a dolphin actor in a fuzzy virtual world. The rules or edges connect causal concepts in a signed connection matrix.

    [State listing: the initial state, its product with $E_D$, and the thresholded next state; the numeric entries are not recoverable here.]

    The arrow stands for a threshold operation. The next state keeps D9 on since we want to study the effect of a sustained threat. It shows that when threatened the dolphins cluster in a herd and flee the threat. The negative rules in the ninth row of $E_D$ show that a threat to survival turns off other actions. The FCM converges to a limit cycle if the threat lasts.

    [State listing: the successive products $C_k E_D$ and thresholded states $C_{k+1}$ along this limit cycle; the numeric entries are not recoverable here.]

    Flight causes fatigue. The dolphin herd stops and rests, staying close together. All the activity causes hunger. If the threat persists, they again try to flee. A threat suppresses hunger. This limit cycle shows a "hidden" global pattern in the causal virtual world.
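    The threshold iteration with a sustained input can be sketched as follows. This is an illustration only: the four-concept matrix `EDGES` is a hypothetical toy and is not the dolphin matrix, and the threshold value of 0.5 is an assumption. Clamping a concept models the sustained threat.

```python
def step(state, edges, threshold=0.5, clamp=None):
    """One synchronous update: threshold the row vector state * edges.

    clamp maps a concept index to a value that is forced after each update.
    """
    n = len(state)
    nxt = []
    for j in range(n):
        total = sum(state[i] * edges[i][j] for i in range(n))
        nxt.append(1 if total > threshold else 0)
    for index, value in (clamp or {}).items():
        nxt[index] = value
    return tuple(nxt)


def find_limit_cycle(start, edges, clamp=None, max_steps=100):
    """Iterate until a state repeats and return the repeating block of states."""
    seen = {}
    path = []
    state = tuple(start)
    for k in range(max_steps):
        if state in seen:
            return path[seen[state]:]
        seen[state] = k
        path.append(state)
        state = step(state, edges, clamp=clamp)
    raise RuntimeError("no repeated state within max_steps")


# Hypothetical concepts: 0 THREAT, 1 FLEE, 2 FATIGUE, 3 REST.
EDGES = [
    [0, 1, 0, 0],    # THREAT causes FLEE
    [0, 0, 1, 0],    # FLEE causes FATIGUE
    [0, -1, 0, 1],   # FATIGUE stops FLEE and causes REST
    [0, -1, -1, 0],  # REST stops FLEE and removes FATIGUE
]

# The threat stays on for all time, so concept 0 is clamped to 1.
cycle = find_limit_cycle((1, 0, 0, 0), EDGES, clamp={0: 1})
```

    In this toy model the clamped threat drives a repeating flee, tire, rest sequence, which mirrors the kind of cycle the text describes.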

    The FCM converges to a new limit cycle when the shark gives up the chase or eats a dolphin and the threat ends (D9 turns off).

    [State listing: the successive products $C_k E_D$ and thresholded states $C_{k+1}$ along the new limit cycle; the numeric entries are not recoverable here.]

    The dolphin herd rests from the previous chase. Then they begin a hunt of their own. They eat and then they socialize and rest. This makes them hungry and the feeding cycle repeats.

    Augmented Virtual World

    The augmented FCM figure below shows an augmented FCM for an undersea virtual world. It combines fish school, shark, and dolphin herd FCMs with $F = F_{fish} \cup F_{shark} \cup F_{dolphin}$. The new links among these FCMs are those of predator and prey where the larger eats the smaller. The actors chase, flee, and eat one another. A hungry shark chases the dolphins and that leads to the sustained-threat limit cycle above. Augmenting the FCM matrices gives a large but sparse FCM since the actors respond to each other in few ways. A later figure shows the connection matrix for the augmented FCM. The augmented FCM moves the actors in the virtual world. The binary output states of this FCM move the actors. Each FCM state maps to equations or function approximations for movement.
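    One way to picture the augmentation is as a block matrix: each actor's FCM sits on the diagonal and the few predator-prey links fill the otherwise empty off-diagonal region. The sketch below assumes that layout. The helper `augment`, the two tiny actor matrices, and the single cross link are hypothetical.

```python
def augment(blocks, cross_links):
    """Place each actor's FCM on the diagonal, then add links between actors.

    blocks maps an actor name to its square connection matrix.
    cross_links holds ((src_actor, src_index), (dst_actor, dst_index), weight).
    """
    offsets = {}
    size = 0
    for name, matrix in blocks.items():
        offsets[name] = size
        size += len(matrix)
    big = [[0] * size for _ in range(size)]
    for name, matrix in blocks.items():
        base = offsets[name]
        for i, row in enumerate(matrix):
            for j, weight in enumerate(row):
                big[base + i][base + j] = weight
    for (src_actor, src), (dst_actor, dst), weight in cross_links:
        big[offsets[src_actor] + src][offsets[dst_actor] + dst] = weight
    return big


# Hypothetical two-concept actors.
SHARK = [[0, 1], [0, 0]]  # 0 HUNGER causes 1 CHASE FISH
FISH = [[0, 1], [0, 0]]   # 0 SURVIVAL THREAT causes 1 RUN AWAY

# The shark's chase raises the fish school's survival threat.
combined = augment(
    {"shark": SHARK, "fish": FISH},
    [(("shark", 1), ("fish", 0), 1)],
)
```

    The combined matrix stays sparse because only the cross links fall outside the diagonal blocks.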

    We used a simple update equation for position:

    $$p(t_{n+1}) = p(t_n) + (t_{n+1} - t_n)\, v(t_n)$$

    [Figure: signed causal edges connect three groups of concepts. Fish school: F1: Hunger, F2: Fatigue, F3: Rest, F4: School, F5: Catch & Eat Food, F6: Survival Threat, F7: Run Away. Shark: S1: Hunger, S2: Fatigue, S3: Rest, S4: Food Search, S5: Chase Fish, S6: Chase Dolphins, S7: Catch & Eat Food. Dolphin herd: D1: Hunger, D2: Companionship, D3: Fatigue, D4: Rest, D5: Herd Clustering, D6: Food Search, D7: Chase Food, D8: Catch & Eat Food, D9: Survival Threat, D10: Run Away.]

    Figure: Augmented FCM for different actors in a virtual world. The actors interact through linked common causal concepts such as chasing food and avoiding a threat.

    The velocity v(t) does not change during a time step. The FCM finds the direction and magnitude of movement. The magnitude of the velocity depends on the FCM state. If the FCM state is "run away" then the velocity is FAST. If the FCM state is "rest" then the velocity is SLOW. The prey choose the direction that maximizes the distance from the predator. The predator chases the prey. When a predator searches for food it swims at random. Each state moves the actors through the sea.
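    The movement rule can be sketched as a forward step in two dimensions. The speed values in `SPEED`, the planar coordinates, and the function names are assumptions for illustration; the text gives only the labels FAST and SLOW.

```python
import math

# Hypothetical speeds for two FCM states.
SPEED = {"run away": 3.0, "rest": 0.5}


def flee_direction(prey, predator):
    """Unit vector that points from the predator straight toward the prey."""
    dx = prey[0] - predator[0]
    dy = prey[1] - predator[1]
    norm = math.hypot(dx, dy)
    if norm == 0.0:
        return (1.0, 0.0)  # arbitrary choice when the positions coincide
    return (dx / norm, dy / norm)


def update_position(p, v, t_now, t_next):
    """p(t_next) = p(t_now) + (t_next - t_now) * v(t_now)."""
    dt = t_next - t_now
    return (p[0] + dt * v[0], p[1] + dt * v[1])


prey = (3.0, 4.0)
predator = (0.0, 0.0)
direction = flee_direction(prey, predator)
speed = SPEED["run away"]
velocity = (speed * direction[0], speed * direction[1])
new_prey = update_position(prey, velocity, 0.0, 0.5)
```

    The FCM state picks the speed and the geometry picks the direction, so the same update rule serves every actor.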

    The augmented FCM encodes limit cycles between the actors. For example, we start with a hungry shark and set the causal link between concepts S4: FOOD SEARCH and S6: CHASE DOLPHINS equal to zero to look at shark interactions with the fish school. The first state then gives a limit cycle after four transition steps.

    [State listing: the first state and the successive products $C_k E_A$ with their thresholded states; the numeric entries are not recoverable here.]

    [Matrix: 24 x 24 connection matrix with rows and columns grouped as Dolphin (D1 through D10), Shark (S1 through S7), and Fish (F1 through F7); the entries are not recoverable here.]

    Figure: Augmented FCM connection matrix for the dolphin herd, fish school, and shark. The augmented FCM figure shows the nodes and edges. The lines show the FCMs of the actors. The sparse region outside the lines shows the interaction space of the FCMs.

    [State listing continued: the remaining products $C_k E_A$ and thresholded states of the shark and fish limit cycle; the numeric entries are not recoverable here.]

    Figure: FCMs control the virtual world. The augmented FCM controls the actions of the actors. In event A the hungry shark forces the dolphin herd to run away. Each dashed line stands for a dolphin swim path. In event B the shark finds the fish and eats some. Each dashed line stands for the path of a fish in the school. The cross shows the shark eating a fish. In event C the fish run into the dolphins and suffer more losses. The solid lines are the dolphin paths. The dashes are the fish swim paths. The cross shows a dolphin eating a fish.

    In this limit cycle a shark searches for food. The shark finds some fish, chases the fish, and then eats some of the fish. To avoid the shark most fish run away and then regroup as a school. Then the fish rest and eat while the shark rests. In time the shark gets hungry again and searches for fish.

    The result is a complex dance among the actors as they move through the ocean. The virtual world figure shows these movements. The forcing function is a hungry shark. The shark encounters the dolphins who cluster and then flee the shark. The shark chases but cannot keep up. The shark still searches for food and finds the fish. It catches a fish and then rests with its hunger sated. Meanwhile the hungry dolphins search for food and eat more fish. Each actor responds to the actions of the other.

    Figure: Fish change their behavior as the degree of threat changes. (a) The fish minimize time within the sighting angle $\gamma_p$ of the predator. Case 1 ($V_p / V_f \le 1$) shows the angle of escape when the fish swim faster. Case 2 ($V_p / V_f > 1$) shows the desired angle $\alpha_m$ when the predator swims faster. (b) The fish maximize the distance between themselves and the predator to evade the predator. The fish swim straight ahead when the fish swim faster than the predator. The fish swim away at an angle if the predator swims faster.

    Nested FCMs for Fish Schools

    In a simple FCM the threat response concepts link as a rule: SURVIVAL THREAT implies RUN AWAY. Fish change their behavior as the degree of threat changes. This rule does not model the effects of different threats. For that we need a nested FCM or a fuzzy function approximator that links the threat degree to different responses. The size of the threat is a function of the size, speed, and attack angle of the predator. A small threat leads to avoidance behavior. Panel (a) of the fish threat figure shows how fish avoid a predator. The fish move in a direction that maximizes their distance from the predator.

    cot� � cot�m �Vp

    Vf sin�m����

    $V_p$ and $V_f$ are the velocities of the predator and the fish. $\alpha_m$ is the angle that minimizes the time in terms of the predator's sighting angle $\gamma_p$.

    tan�m � � cot p ����

    A large threat causes the fish to evade the predator. The fish try to maximize the minimum distance $D_p$ from the predator:

    $$D_p^2 = \left[(X_0 - V_p t) + V_f t \cos\alpha\right]^2 + (V_f t \sin\alpha)^2$$

    $X_0$ is the initial distance between predator and prey. $\alpha$ is the escape angle of the prey. $V_p$ and $V_f$ are the velocities of the predator and the fish. Panel (b) of the fish threat figure shows how fish evade a predator. A fuzzy system can approximate these responses using hand-picked rules or neural-fuzzy learning. These threat responses cause the "fountain effect" and the "burst effect" in fish schools as each fish tries to increase its chances of survival. The fountain effect occurs when a predator moves towards a fish school and the school splits and flows around the predator. The school re-forms behind the predator. In the burst effect the school expands in the form of a sphere to evade the predator.

    [Figure: signed causal edges connect the concepts F1: Hunger, F2: Fatigue, F3: Rest, F4: School, F5: Catch & Eat Food, F6: Small Survival Threat, F7: Large Survival Threat, F8: Evade, and F9: Avoid.]

    Figure: Example of a nested FCM. The concept of a survival threat divides into two subconcepts that each map to a different survival tactic.
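    The evasion rule can be explored numerically. The sketch below assumes the squared distance is the quadratic in time $[(X_0 - V_p t) + V_f t \cos\alpha]^2 + (V_f t \sin\alpha)^2$ with the escape angle measured from the predator's heading, minimizes it over time in closed form, and then searches a grid of escape angles. The function names and the grid size are hypothetical.

```python
import math


def min_distance(alpha, x0, v_pred, v_fish):
    """Smallest predator-prey distance over all times t >= 0."""
    along = v_fish * math.cos(alpha) - v_pred  # rate of change of the gap
    across = v_fish * math.sin(alpha)          # sideways escape rate
    if along >= 0.0:
        return x0  # the gap never shrinks, so the closest point is at t = 0
    # The quadratic in t has its minimum at t = -x0 * along / (along^2 + across^2).
    return x0 * abs(across) / math.sqrt(along * along + across * across)


def best_escape_angle(x0, v_pred, v_fish, steps=1800):
    """Grid search for the escape angle that maximizes the minimum distance."""
    angles = [k * math.pi / steps for k in range(steps + 1)]
    return max(angles, key=lambda a: min_distance(a, x0, v_pred, v_fish))


# A faster fish keeps its full distance by swimming straight ahead.
angle_fast_fish = best_escape_angle(10.0, v_pred=1.0, v_fish=2.0)

# A slower fish must turn away at an angle.
angle_slow_fish = best_escape_angle(10.0, v_pred=2.0, v_fish=1.0)
```

    Under these assumptions the search agrees with the figure caption: the faster fish swims straight ahead and the slower fish turns away at an angle.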

    A small survival threat may be a slow-moving predator that either has not seen or decided to attack the fish. A large survival threat may be a fast predator such as a barracuda or shark that swims towards the center of the school. If we insert this new sub-FCM into the Fish FCM of the augmented FCM, we get the nested FCM in the figure above. Different limit cycles appear for different degrees of threat. For a small threat (F6) the fish avoid the predator (F9) as they move out of the line-of-sight of the predator. Large threats (F7) cause the fish to scatter quickly to evade the predator (F8). This leads to fatigue and rest (F2 and F3).

    Adaptive Fuzzy Cognitive Maps

    An adaptive FCM changes its causal web in time. The causal web learns from data. The causal edges or rules change in sign and magnitude. The additive scheme is a type of causal learning since it changes the FCM edge strengths. In general an edge $e_{ij}$ changes with some first-order learning law:

    $$\dot{e}_{ij} = f_{ij}(E, C) + g_{ij}(t)$$

    Here $g_{ij}$ is a forcing function. Data fires the concept nodes and in time this leaves a causal pattern in the edge. Causal learning is local in $f_{ij}$. It depends on just its own value and on the node signals that it connects:

    $$\dot{e}_{ij} = f_{ij}(e_{ij}, C_i, C_j, \dot{C}_i, \dot{C}_j) + g_{ij}(t)$$

    Correlation or Hebbian learning can encode some limit cycles in the FCMs or temporal associative memories (TAMs). It adds pairwise correlation matrices as in the weighted sum below. This method can only store a few patterns. Differential Hebbian learning encodes changes in a concept. Both types of learning are local and light in computation.

    To encode binary limit cycles in connection matrix E the TAM method sums the weighted correlation matrices between successive states. To encode the limit cycle $C_1 \to C_2 \to C_3 \to C_1$ we first convert each binary state $C_i$ into a bipolar state vector $X_i$ by replacing each 0 with a -1. Then E is the weighted sum

    $$E = q_1 X_1^T X_2 + q_2 X_2^T X_3 + \cdots + q_{n-1} X_{n-1}^T X_n + q_n X_n^T X_1$$

    The length of the limit cycle should be less than the number of concepts. Else crosstalk can occur. Proper weighting of each correlation matrix pair can improve the encoding and thus increase the FCM storage capacity. Correlation learning is a form of the unsupervised signal Hebbian learning law in neural networks:

    $$\dot{e}_{ij} = -e_{ij} + C_i(x_i)\, C_j(x_j)$$
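    The TAM sum can be sketched directly. The three-state cycle over four concepts is a hypothetical example, and recall here uses bipolar vectors with a zero threshold, which is an assumption and may differ from the recall used in the chapter.

```python
def to_bipolar(state):
    """Replace each 0 with -1."""
    return [2 * bit - 1 for bit in state]


def tam_encode(cycle, weights=None):
    """Sum the weighted outer products of successive bipolar states.

    The last state pairs with the first so that the cycle closes.
    """
    n = len(cycle[0])
    xs = [to_bipolar(c) for c in cycle]
    weights = weights or [1.0] * len(xs)
    edges = [[0.0] * n for _ in range(n)]
    for k, x in enumerate(xs):
        nxt = xs[(k + 1) % len(xs)]
        for i in range(n):
            for j in range(n):
                edges[i][j] += weights[k] * x[i] * nxt[j]
    return edges


def recall(state, edges):
    """Multiply the bipolar state by E and threshold at zero."""
    x = to_bipolar(state)
    n = len(x)
    totals = [sum(x[i] * edges[i][j] for i in range(n)) for j in range(n)]
    return [1 if total > 0 else 0 for total in totals]


# Hypothetical cycle: three states over four concepts.
CYCLE = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0]]
E = tam_encode(CYCLE)
recalled = [recall(state, E) for state in CYCLE]
```

    The cycle is shorter than the number of concepts and its bipolar states are mutually orthogonal, so recall steps through the stored sequence without crosstalk.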

    A virtual world can encode an event sequence with either encoding scheme. A simple chase cycle might be a sequence of three states.

    [State listing: the three binary states of the chase cycle; the numeric entries are not recoverable here.]

    Then the TAM sum gives the FCM connection matrix E when all the weights $q_i$ are equal.

    [Matrix E: 10 x 10 connection matrix with rows and columns D1 through D10, followed by the recall products $C_k E$ and their thresholded states; the numeric entries are not recoverable here.]

    Correlation encoding treats negative and zero causal edges the same. It can encode "spurious" causal implications between concepts such as a positive edge $e_{62}$. This means searching for food causes a desire to socialize. Correlation encoding is a poor model of inferred causality. It says two concepts cause each other if they are on at the same time. Differential Hebbian learning encodes causal changes to avoid spurious causality. The concepts must move in the same or opposite directions to infer a causal link. They must come on and turn off at the same time or one must come on as the other turns off. Just being on does not lead to a new causal link. The patterns of turning on or off must correlate positively or negatively.

    The differential Hebbian learning law correlates concept changes or velocities:

    $$\dot{e}_{ij} = -e_{ij} + \dot{C}_i(x_i)\, \dot{C}_j(x_j)$$

    So $\dot{C}_i(x_i)\, \dot{C}_j(x_j) > 0$ iff concepts $C_i$ and $C_j$ move in the same direction. $\dot{C}_i(x_i)\, \dot{C}_j(x_j) < 0$ iff concepts $C_i$ and $C_j$ move in opposite directions. In this sense the law learns patterns of causal change. The first-order structure of the law implies that $e_{ij}(t)$ is an exponentially weighted average of paired (or lagged) changes. The most recent changes have the most weight. The discrete change $\Delta C_i(t) = C_i(t) - C_i(t-1)$ lies in $\{-1, 0, 1\}$. The discrete differential Hebbian learning can take the form

    $$e_{ij}(t+1) = \begin{cases} e_{ij}(t) + c_t \left[ \Delta C_i(x_i)\, \Delta C_j(x_j) - e_{ij}(t) \right] & \text{if } \Delta C_i(x_i) \neq 0 \\ e_{ij}(t) & \text{if } \Delta C_i(x_i) = 0 \end{cases}$$

    Here $c_t$ is a learning coefficient that decreases in time. The sequence of learning coefficients $\{c_t\}$ should decrease slowly in the sense of

    $$\sum_{t=1}^{\infty} c_t = \infty$$

    but not too slowly in the sense that

    $$\sum_{t=1}^{\infty} c_t^2 < \infty$$

    In practice $c_t \approx 1/t$. $\Delta C_i\, \Delta C_j > 0$ iff concepts $C_i$ and $C_j$ move in the same direction. $\Delta C_i\, \Delta C_j < 0$ iff concepts $C_i$ and $C_j$ move in opposite directions. E changes only if a concept changes. The changed edge slowly "forgets" the old causal changes in favor of the new ones. This causal law can learn higher-order causal relations if it correlates multiple cause changes with effect changes.
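    The discrete law can be sketched on a toy sequence. This is an illustration under stated assumptions: the cause change is paired with the effect change one step later (the lagged reading of "paired or lagged changes"), the learning coefficient is $c_t = 1/t$, and the two-concept sequence is hypothetical.

```python
def dhl_update(edges, d_cause, d_effect, c_t):
    """Row i learns only when the cause concept i changed.

    e_ij <- e_ij + c_t * (d_cause[i] * d_effect[j] - e_ij) if d_cause[i] != 0.
    """
    n = len(d_cause)
    new_edges = [row[:] for row in edges]
    for i in range(n):
        if d_cause[i] == 0:
            continue
        for j in range(n):
            new_edges[i][j] = edges[i][j] + c_t * (d_cause[i] * d_effect[j] - edges[i][j])
    return new_edges


def train(states):
    """Run the update over a state sequence with lagged change pairs."""
    n = len(states[0])
    edges = [[0.0] * n for _ in range(n)]
    deltas = [[b - a for a, b in zip(s0, s1)] for s0, s1 in zip(states, states[1:])]
    for t, (d_cause, d_effect) in enumerate(zip(deltas, deltas[1:]), start=1):
        edges = dhl_update(edges, d_cause, d_effect, 1.0 / t)
    return edges


# Hypothetical sequence: concept 0 turns on, then concept 1 turns on as 0 turns off.
SEQUENCE = [(0, 0), (1, 0), (0, 1), (0, 0)]
learned = train(SEQUENCE)
```

    With lagged pairs, a concept that is on for a single step learns a negative edge to itself, which is one way to read the self-inhibition that the chapter reports on the diagonal of its learned matrix.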

    We used differential Hebbian learning to encode a feeding sequence and a chase sequence in a FCM. The concepts in the ith row learn only when $\Delta C_i(x_i)$ equals 1 or -1. We used

    ct �tk� � ��

    tk

    �N

    ��

    The training data came from the rest, eat, play and the chase sequences in the Virtual Undersea World section. This gave the learned matrix $E_D$.

    [Matrix $E_D$: learned 10 x 10 edge matrix with rows and columns D1 through D10; the entries are not recoverable here.]

    This learned edge matrix $E_D$ resembles the hand-designed dolphin FCM matrix. The causal links it lacks were not in the training set. The diagonal terms give self-inhibition of each concept. This occurs since each concept is on for one cycle before the matrix transitions to the next state. The hunger input state, passed through the threshold, now leads to a limit cycle.

    [State listing: the successive products with $E_D$ and the thresholded states of the learned limit cycle; the numeric entries are not recoverable here.]

    Panel (a) of the limit cycle figure below shows the hand-designed limit cycle from the previous section. Panel (b) shows the limit cycle from the FCM found with differential Hebbian learning. The DHL limit cycle is one step shorter. Both FCMs have just one limit cycle and the null fixed point in the space of $2^{10}$ binary state vectors. The value of $E_D$ does not change over repeated intervals. The learning law learns only if there is a change in the node.

    [Figure: two panels, (a) and (b). In each panel the rows list the concepts D1 Hunger through D10 Run Away and the columns are time steps 0 through 16.]

    Figure: Limit cycle comparison between the hand-designed system and the FCM found with differential Hebbian learning. Each column is a binary state vector. (a) Rest, feed, play, rest limit cycle for the hand-designed dolphin FCM. (b) Limit cycle for the FCM found with differential Hebbian learning.

    Conclusions

    Fuzzy cognitive maps can model the causal web of a virtual world. The FCM can control its local and global nonlinear behavior. The local fuzzy rules or edges and the fuzzy concepts they connect model the causal links within and between events. The global FCM nonlinear dynamics give the virtual world an "arrow of time." A user can change these dynamics at will and thus change the causal processes in the virtual world. FCMs let experts and users choose a causal web by drawing causal pictures instead of by stating equations.

    FCMs can also help visualize data. They show how variables relate to one another in the causal web. The FCM output states can guide a cartoon of the virtual world as shown in the cartoon figure below. This cartoon shows the dolphin chase, rest, eat sequence described earlier. The cartoon animates the FCM dynamics as the system trajectory moves through the FCM state space. This can apply to models in economics, medicine, history, and politics where the social and causal web can change in complex ways that may arise from changing the sign or magnitude of a single FCM causal rule or edge.

    TIME STEP 0: THREAT APPEARS IN THE FORM OF A SHARK.

    TIME STEP 1&2: DOLPHINS FLEE THE SHARK IN A TIGHTLY PACKED HERD.

    TIME STEP 3&4: DOLPHINS CLUSTER TOGETHER AND REST

    TIME STEP 5-7: DOLPHINS AVOID SHARK THEN REST.

    TIME STEP 8&9: DOLPHINS START A SEARCH FOR FOOD.

    TIME STEP 10: THE DOLPHINS FIND A SCHOOL OF FISH THEN BEGIN TO CHASE THEM

    TIME STEP 11: THE DOLPHINS CATCH AND EAT SOME FOOD

    TIME STEP 12-13 : THE DOLPHINS THEN PLAY AND REST. THEN THE CYCLE BEGINS AGAIN.

    Figure: The FCM output states can guide a cartoon of the virtual world. This cartoon shows the dolphin chase, rest, eat sequence described earlier. The cartoon animates the FCM dynamics as the system trajectory moves through the FCM state space.

    The additive structure of combined FCMs permits a Delphi or questionnaire approach to knowledge acquisition. These new causal webs can change an adaptive FCM that learns its causal web as neural-like learning laws process time-series data. Experts can add their FCM matrices to the adaptive FCM to initialize or guide the learning. Such a causal web can learn the user's values and action habits and perhaps can test them or train them.

    More complex FCMs have more complex dynamics and can model more complex virtual worlds. Each concept node can fire on its own time scale and fire in its own nonlinear way. The causal edge flows or rules can have their own time scales too and may increase or decrease the causal flow through them in nonlinear ways. This behavior does not fit in a simple FCM with threshold concepts and constant edge weights.

    A FCM can model these complex virtual worlds if it uses more nonlinear math to change its nodes and edges. The price paid may be a chaotic virtual world with unknown equilibrium behavior. Some users may want this to add novelty to their virtual world or to make it more exciting. A user might choose a virtual world that is mildly nonlinear and has periodic equilibria. At the other extreme the user might choose a virtual world that is so wildly nonlinear it has only aperiodic equilibria. Think of a virtual game of tennis or racquetball where the gravitational potential changes at will or at random.

    Fuzziness and nonlinearity are design parameters for a virtual world. They may give a better model of a real process.

    REFERENCES

    M. Krueger, Artificial Reality II, Second ed. Addison-Wesley.

    W. Gibson, Neuromancer. New York: Ace Books.

    R. A. Brown, Fluid Mechanics of the Atmosphere. New York: Academic Press.

    J. J. Craig, Introduction to Robotics. Reading, MA: Addison-Wesley.

    E. Ackerman, L. Gatewood, J. Rosevear and G. Molnar, "Blood Glucose Regulation and Diabetes," in Concepts and Models of Biomathematics, F. Heinmets, Ed. Marcel Dekker.

    B. Kosko, "Fuzzy Cognitive Maps," International Journal of Man-Machine Studies.

    B. Kosko, "Hidden Patterns in Combined and Adaptive Knowledge Networks," International Journal of Approximate Reasoning.

    B. Kosko, Neural Networks and Fuzzy Systems. Englewood Cliffs: Prentice Hall.

    B. Kosko, "Fuzzy Systems as Universal Approximators," IEEE Transactions on Computers, November.

    H. T. Nguyen, "On Random Sets and Belief Functions," Journal of Mathematical Analysis and Applications.

    K. Hornik, M. Stinchcombe and H. White, "Multilayer Feedforward Networks are Universal Approximators," Neural Networks.

    F. A. Watkins, "Fuzzy Engineering," Ph.D. Thesis, University of California at Irvine.

    L. Wang and J. M. Mendel, "Fuzzy Basis Functions, Universal Approximation, and Orthogonal Least-Squares Learning," IEEE Transactions on Neural Networks, September.

    D. F. Specht, "A General Regression Neural Network," IEEE Transactions on Neural Networks, November.

    B. Kosko, "Optimal Fuzzy Rules Cover Extrema," International Journal of Intelligent Systems.

    F. A. Watkins, "The Representation Problem for Additive Fuzzy Systems," Proceedings of the IEEE International Conference on Fuzzy Systems (IEEE FUZZ), Vol. I.

    J. A. Dickerson and B. Kosko, "Virtual Worlds as Fuzzy Cognitive Maps," Presence, Spring.

    P. H. Winston, Artificial Intelligence, Second ed. Reading, MA: Addison-Wesley.

    W. R. Taber and M. Siegel, "Estimation of Expert Weights with Fuzzy Cognitive Maps," Proceedings of the 1st IEEE International Conference on Neural Networks (ICNN), San Diego, Vol. II.

    B. Kosko, "Bidirectional Associative Memories," IEEE Transactions on Systems, Man, and Cybernetics.

    R. A. Brooks, "A Robot that Walks: Emergent Behaviors from a Carefully Evolved Network," Neural Computation.

    S. Grossberg, Studies of Mind and Brain. Boston: Reidel.

    N. I. Badler, B. L. Webber, J. Kalita and J. Esakov, "Animation from Instructions," in Making Them Move: Mechanics, Control, and Animation of Articulated Figures, N. I. Badler, B. A. Barsky and D. Zeltzer, Eds. San Mateo, CA: Morgan Kaufmann.

    J. H. Connell, Minimalist Mobile Robotics: A Colony-style Architecture for an Artificial Creature. Academic Press, Harcourt Brace Jovanovich.

    S. H. Shane, "Comparison of Bottlenose Dolphin Behavior in Texas and Florida, with a Critique of Methods for Studying Dolphin Behavior," in The Bottlenose Dolphin, S. Leatherwood and R. R. Reeves, Eds. Academic Press.

    B. O. Koopman, Search and Screening. New York: Pergamon Press.

    B. L. Partridge, "The Structure and Function of Fish Schools," Scientific American.

    D. Weihs and P. W. Webb, "Optimal Avoidance and Evasion Tactics in Predator-Prey Interactions," Journal of Theoretical Biology.

    J. A. Dickerson, "Fuzzy Function Approximation with Ellipsoidal Rules," Ph.D. Thesis, University of Southern California.

    Y. F. Wang, J. B. Cruz and J. H. Mulligan, "Guaranteed Recall of All Training Pairs for Bidirectional Associative Memory," IEEE Transactions on Neural Networks.

    W. R. Taber, "Knowledge Processing with Fuzzy Cognitive Maps," Expert Systems with Applications.

    J. P. Martino, Technological Forecasting for Decisionmaking. American Elsevier.

    J. A. Dickerson and B. Kosko, "Fuzzy Function Approximation with Supervised Ellipsoidal Learning," Proceedings of the World Conference on Neural Networks (WCNN), Portland, OR, Vol. II.

    J. A. Dickerson and B. Kosko, "Fuzzy Function Learning with Covariance Ellipsoids," Proceedings of the IEEE International Conference on Neural Networks (IEEE ICNN), San Francisco.

    J. A. Dickerson and B. Kosko, "Fuzzy Function Approximation with Ellipsoidal Rules," IEEE Transactions on Systems, Man, and Cybernetics, August, to appear.

    B. Kosko, "Stochastic Competitive Learning," IEEE Transactions on Neural Networks.

    H. M. Kim and B. Kosko, "Fuzzy Prediction and Filtering in Impulsive Noise," Fuzzy Sets and Systems.

  • �� Technology for Multimedia

    A Proof of the Fuzzy Approximation Theorem

    Fuzzy Approximation Theorem: An additive fuzzy system uniformly approximates $f : X \to Y$ if $X$ is compact and $f$ is continuous.

    Proof. Pick any small constant $\varepsilon > 0$. We must show that $|F(x) - f(x)| < \varepsilon$ for all $x \in X$. $X$ is a compact subset of $R^n$. $F(x)$ is the centroidal output (1) of the additive fuzzy system $F$. Continuity of $f$ on compact $X$ gives uniform continuity. So there is a fixed distance $\delta$ such that, for all $x$ and $z$ in $X$, $|f(x) - f(z)| < \frac{\varepsilon}{4}$ if $|x - z| < \delta$. (Replace $\delta$ by $\delta/n$ for any $L^p$ space with $p \ge 1$.) We can construct a set of open cubes $M_1, \ldots, M_m$ that cover $X$ and that have ordered overlap in their $n$ coordinates so that each cube corner lies at the midpoint $c_j$ of its neighbors $M_j$. Pick symmetric output fuzzy sets $B_j$ centered on $f(c_j)$. So the centroid of $B_j$ is $f(c_j)$.

    Pick $u \in X$. Then by construction $u$ lies in at most $2^n$ overlapping open cubes $M_j$. Pick any $w$ in the same set of cubes. If $u \in M_j$ and $w \in M_k$, then for all $v \in M_j \cap M_k$: $|u - v| < \delta$ and $|v - w| < \delta$. Uniform continuity implies that

    $$|f(u) - f(w)| \le |f(u) - f(v)| + |f(v) - f(w)| < \frac{\varepsilon}{2}.$$

    So for cube centers $c_j$ and $c_k$, $|f(c_j) - f(c_k)| < \frac{\varepsilon}{2}$.

    Pick $x \in X$. Then $x$ too lies in at most $2^n$ open cubes with centers $c_j$ and $|f(c_j) - f(x)| < \frac{\varepsilon}{2}$. Along the $k$th coordinate of the range space $R^p$ the $k$th component of the additive system centroid $F(x)$ lies on or between the $k$th components of the centroids of the $B_j$ sets. So, since $|f(c_j) - f(c_k)| < \frac{\varepsilon}{2}$ for all $f(c_j)$, $|F(x) - f(c_j)| < \frac{\varepsilon}{2}$. Then

    $$|F(x) - f(x)| \le |F(x) - f(c_j)| + |f(c_j) - f(x)| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon.$$

    Q.E.D.
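    The construction in the proof can be checked numerically in one dimension. The sketch below is an illustration only and not code from the chapter. It assumes triangular if-part sets (a hypothetical choice) centered on an evenly spaced grid of centers $c_j$ in $[0, 1]$, equal then-part volumes, and then-part centroids $f(c_j)$. The names `make_additive_system`, `sup_error`, and `target` are made up for the example. It measures the worst-case error on a test grid as the number of rules grows.

```python
import math

def make_additive_system(f, m):
    """Build F(x) = sum_j a_j(x) f(c_j) / sum_j a_j(x) with m triangular sets on [0, 1]."""
    centers = [j / (m - 1) for j in range(m)]
    width = 1.0 / (m - 1)  # each triangle reaches the neighboring centers

    def a(j, x):
        # Triangular if-part set centered at centers[j]
        return max(0.0, 1.0 - abs(x - centers[j]) / width)

    def F(x):
        fires = [a(j, x) for j in range(m)]
        total = sum(fires)
        # Centroidal output: a convex combination of the then-part centroids f(c_j)
        return sum(fj * f(cj) for fj, cj in zip(fires, centers)) / total

    return F

def sup_error(f, F, n_test=1001):
    """Largest |F(x) - f(x)| over an evenly spaced test grid on [0, 1]."""
    points = [i / (n_test - 1) for i in range(n_test)]
    return max(abs(F(x) - f(x)) for x in points)

def target(x):
    # Continuous function on the compact set [0, 1]
    return math.sin(2.0 * math.pi * x)

errors = {m: sup_error(target, make_additive_system(target, m)) for m in (5, 9, 17, 33)}
for m, err in errors.items():
    print(f"m = {m:2d} rules: sup error = {err:.4f}")
```

    The measured error shrinks as the rule count grows, which is the behavior the theorem guarantees for any continuous $f$ on a compact domain. The theorem itself says nothing about how fast the error falls for a given choice of sets.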

    B Learning in SAMs: Unsupervised Clustering and Supervised Gradient Descent

    A fuzzy system learns if and only if its rule patches move or change shape in the input-output product space $X \times Y$. Learning can change the centers or widths of triangle or trapezoidal sets. These changing sets then change the shape or position of the Cartesian rule patches built out of them. The mean-value theorem and the calculus of variations show �� that optimal lone rules cover the extrema or bumps of the approximand. Good learning schemes ���� ��� �� tend to quickly move rule patches to these bumps and then move extra rule patches between them as the rule budget allows. Hybrid schemes use unsupervised clustering to learn the first set of fuzzy rule patches in position and number and to initialize gradient descent in supervised learning.

    Learning changes system parameters with data. Unsupervised learning amounts to blind clustering in the system product space $X \times Y$ to learn and tune the $m$ fuzzy rules or the sets that compose them. Then $k$ quantization vectors $q_j \in X \times Y$ move in the product space to filter or approximate the stream of incoming data pairs $(x(t), y(t))$ or the concatenated data points $z(t) = [x(t) \,|\, y(t)]^T$. The simplest form of such product space clustering �� centers a rule patch at each data point and thus puts $k = m$. In general both the data and the quantizing vectors greatly outnumber the rules and so $k \gg m$.

    A natural way to grow and tune rules is to identify a rule patch with the uncertainty ellipsoid ���� ��� �� that forms around each quantizing vector $q_j$ from the inverse of its positive definite covariance matrix $K_j$. Then sparse or noisy data grows a patch larger and thus a less certain rule than does denser or less noisy data. Unsupervised competitive learning �� can learn these ellipsoidal rules in three steps:

    $$\|z(t) - q_j(t)\| = \min\bigl(\|z(t) - q_1(t)\|, \ldots, \|z(t) - q_k(t)\|\bigr) \tag{B.1}$$

    $$q_i(t+1) = \begin{cases} q_j(t) + \mu_t\,[z(t) - q_j(t)] & \text{if } i = j \\ q_i(t) & \text{if } i \ne j \end{cases} \tag{B.2}$$

    $$K_i(t+1) = \begin{cases} K_j(t) + v_t\,\bigl[(z(t) - q_j(t))^T (z(t) - q_j(t)) - K_j(t)\bigr] & \text{if } i = j \\ K_i(t) & \text{if } i \ne j \end{cases} \tag{B.3}$$

    for the Euclidean norm $\|z\|^2 = z_1^2 + \cdots + z_{n+p}^2$.

    The first step (B.1) is the competitive step ���� It picks the nearest quantizing vector $q_j$ to the incoming data vector $z(t)$ and ignores the rest. Some schemes may count nearby vectors as lying in the winning subset. We used just one winner per datum. This correlation matching approximates the competitive dynamics of nonlinear neural networks. The second step updates the winning quantization or "synaptic" vector and drives it toward the centroid of the sampled data pattern class ���� The third step updates the covariance matrix of the winning quantization vector. We initialize the quantization vectors with sample data ($q_i(0) = z(i)$) to avoid skewed groupings, and we initialize each covariance matrix with small positive numbers on its diagonal to keep it positive definite. Projection schemes ���� ��� �� can then convert the ellipsoids into fuzzy sets along each coordinate of the input-output space. Other schemes can use the unfactored joint set function directly���� Supervised learning can also tune the eigenvalue parameters of the rule ellipsoids.
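    The three steps can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions and not the code used for the chapter. The function name `competitive_step`, the two-cluster test data, and the readings $\mu_t = 1/t$ and $v_t = 0.2\,[1 - t/(1.2N)]$ of the learning coefficients are assumptions made for the example. The covariance step uses the outer product of the difference vector so that each $K_j$ stays a square matrix.

```python
import random

def competitive_step(z, q, K, mu, v):
    """One pass of (B.1)-(B.3) for data point z. Updates q and K in place, returns the winner."""
    # (B.1): the winner is the quantization vector nearest to z in Euclidean distance
    dists = [sum((zi - qi) ** 2 for zi, qi in zip(z, qj)) for qj in q]
    j = dists.index(min(dists))
    diff = [zi - qi for zi, qi in zip(z, q[j])]  # uses q_j(t), before its update
    # (B.2): move only the winner toward the data point
    q[j] = [qi + mu * d for qi, d in zip(q[j], diff)]
    # (B.3): move only the winner's covariance estimate toward the outer product
    n = len(z)
    K[j] = [[K[j][r][s] + v * (diff[r] * diff[s] - K[j][r][s]) for s in range(n)]
            for r in range(n)]
    return j

# Hypothetical test data: two noisy clusters in the plane, sampled alternately
random.seed(0)
N = 2000
cluster_centers = [(0.0, 0.0), (5.0, 5.0)]
data = []
for t in range(N):
    cx, cy = cluster_centers[t % 2]
    data.append([random.gauss(cx, 0.5), random.gauss(cy, 0.5)])

k = 2
q = [list(data[i]) for i in range(k)]                # q_i(0) = z(i)
K = [[[0.01, 0.0], [0.0, 0.01]] for _ in range(k)]   # small positive diagonal

for t, z in enumerate(data, start=1):
    mu_t = 1.0 / t                                   # assumed form of mu_t
    v_t = 0.2 * (1.0 - t / (1.2 * N))                # assumed form of v_t
    competitive_step(z, q, K, mu_t, v_t)

for j in range(k):
    print("q", j, [round(val, 3) for val in q[j]], "diag K", K[j][0][0], K[j][1][1])
```

    Each quantization vector settles near one cluster center, and each covariance estimate stays symmetric with a positive diagonal because every update is a convex combination of the old matrix and an outer product.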

    The sequences of learning coefficients $\{\mu_t\}$ and $\{v_t\}$ should decrease slowly �� in the sense of $\sum_{t=1}^{\infty} \mu_t = \infty$ but not too slowly in the sense of $\sum_{t=1}^{\infty} \mu_t^2 < \infty$. In practice $\mu_t \approx \frac{1}{t}$. The covariance coefficients obey a like constraint as in our choice of $v_t = 0.2\left[1 - \frac{t}{1.2N}\right]$ where $N$ is the total number of data points. The supervised learning schemes below also use a similar sequence of decreasing learning coefficients.
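    A short numerical check makes the two summability conditions concrete. The sketch below assumes the readings $\mu_t = 1/t$ and $v_t = 0.2\,[1 - t/(1.2N)]$ for the coefficients. The function names `mu` and `v` and the value `n_demo` are hypothetical. The partial sums of $\mu_t$ keep growing while the partial sums of $\mu_t^2$ stay below $\pi^2/6$.

```python
import math

def mu(t):
    """Assumed quantization learning coefficient mu_t = 1/t."""
    return 1.0 / t

def v(t, n_points):
    """Assumed covariance learning coefficient v_t = 0.2 * (1 - t / (1.2 N))."""
    return 0.2 * (1.0 - t / (1.2 * n_points))

partial_sums = {}
for horizon in (10, 1000, 100000):
    s1 = sum(mu(t) for t in range(1, horizon + 1))        # grows without bound
    s2 = sum(mu(t) ** 2 for t in range(1, horizon + 1))   # converges
    partial_sums[horizon] = (s1, s2)
    print(f"T = {horizon:6d}: sum mu_t = {s1:.3f}, sum mu_t^2 = {s2:.5f}")

n_demo = 2000  # hypothetical number of data points
print("v_1 =", v(1, n_demo), " v_N =", v(n_demo, n_demo))
```

    Under this reading $v_t$ decreases linearly and stays positive through the last data point $t = N$.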

    Supervised learning changes SAM parameters with error data. The error at each time $t$ is the desired system output minus the actual SAM output: $\varepsilon_t = d_t - F(x_t)$. Unsupervised learning uses the blind data point $z(t)$ instead of the desired or labeled value $d_t$. The teacher or supervisor supervises the learning process by giving the desired value $d_t$ at each training time $t$. Most supervised learning schemes perform stochastic gradient descent on the squared error and do so through iterated use of the chain rule of differential calculus.

    Supervised gradient descent can learn or tune SAM systems ���� ��� by changing the rule weights $w_j$ in (B.4), the then-part volumes $V_j$, the then-part centroids $c_j$, or parameters of the if-part set functions $a_j$. The rule weight $w_j$ enters the ratio form of the general SAM system

    $$F(x) = \frac{\sum_{j=1}^{m} w_j\, a_j(x)\, V_j\, c_j}{\sum_{j=1}^{m} w_j\, a_j(x)\, V_j} \tag{B.4}$$

    in the same way as does the then-part volume $V_j$ in the SAM Theorem. Both cancel from (B.4) if they have the same value: if $w_1 = \cdots = w_m > 0$ or if $V_1 = \cdots = V_m > 0$. So both have the same learning law if we replace the nonzero weight $w_j$ with the nonzero volume $V_j$ ����

    \begin{align}
    w_j(t+1) &= w_j(t) - \mu_t \frac{\partial E_t}{\partial w_j} \tag{B.5} \\
    &= w_j(t) - \mu_t \frac{\partial E_t}{\partial F} \frac{\partial F}{\partial w_j} \tag{B.6} \\
    &= w_j(t) + \mu_t\, \varepsilon_t\, \frac{p_j(x_t)}{w_j(t)}\, [c_j - F(x_t)] \tag{B.7}
    \end{align}

    for instantaneous squared error $E_t = \frac{1}{2}(d_t - F(x_t))^2$ with desired-minus-actual error $\varepsilon_t = d_t - F(x_t)$. We include the rule weights here for completeness. Our fuzzy systems were unweighted and thus used $w_1 = \cdots = w_m = 1$. The volumes then change in the same way if they are independent of the weights (which they may not be in some ellipsoidal learning schemes):

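    \begin{align}
    V_j(t+1) &= V_j(t) - \mu_t \frac{\partial E_t}{\partial V_j} \tag{B.8} \\
    &= V_j(t) + \mu_t\, \varepsilon_t\, \frac{p_j(x_t)}{V_j(t)}\, [c_j - F(x_t)]. \tag{B.9}
    \end{align}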

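    The learning law (B.7) follows since $\frac{\partial E_t}{\partial F} = -\varepsilon_t$ and since

    \begin{align}
    \frac{\partial F}{\partial w_j} &= \frac{a_j(x)\, V_j\, c_j \sum_{i=1}^{m} w_i\, a_i(x)\, V_i \;-\; a_j(x)\, V_j \sum_{i=1}^{m} w_i\, a_i(x)\, V_i\, c_i}{\left(\sum_{i=1}^{m} w_i\, a_i(x)\, V_i\right)^2} \tag{B.10} \\
    &= \frac{w_j\, a_j(x)\, V_j}{w_j \sum_{i=1}^{m} w_i\, a_i(x)\, V_i} \left[ c_j\, \frac{\sum_{i=1}^{m} w_i\, a_i(x)\, V_i}{\sum_{i=1}^{m} w_i\, a_i(x)\, V_i} - \frac{\sum_{i=1}^{m} w_i\, a_i(x)\, V_i\, c_i}{\sum_{i=1}^{m} w_i\, a_i(x)\, V_i} \right] \tag{B.11} \\
    &= \frac{p_j(x)}{w_j}\, [c_j - F(x)] \tag{B.12}
    \end{align}

    from the SAM Theorem.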


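    The centroid $c_j$ in the SAM Theorem has the simplest learning law:

    \begin{align}
    c_j(t+1) &= c_j(t) - \mu_t \frac{\partial E_t}{\partial F} \frac{\partial F}{\partial c_j} \tag{B.13} \\
    &= c_j(t) + \mu_t\, \varepsilon_t\, p_j(x_t). \tag{B.14}
    \end{align}

    So the terms $w_j$, $V_j$, and $c_j$ do not change when $p_j \approx 0$ and thus when the $j$th if-part set barely fires: $a_j(x_t) \approx 0$.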
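    The weight, volume, and centroid laws fit in one short routine. The sketch below is an illustration under stated assumptions and not the authors' implementation. It assumes a scalar output, fixed Gaussian if-part sets, the convex weights $p_j(x) = w_j a_j(x) V_j / \sum_i w_i a_i(x) V_i$, a constant learning rate in place of a decreasing $\mu_t$, and a toy target $x^2$. All function and variable names are hypothetical.

```python
import math

def gaussian(m, s):
    """Fixed Gaussian if-part set with center m and width s (illustrative choice)."""
    return lambda x: math.exp(-0.5 * ((x - m) / s) ** 2)

def sam_output(x, a_funcs, w, V, c):
    """Return F(x) in ratio form and the convex weights p_j(x)."""
    terms = [wj * aj(x) * Vj for wj, aj, Vj in zip(w, a_funcs, V)]
    total = sum(terms)
    p = [term / total for term in terms]
    F = sum(pj * cj for pj, cj in zip(p, c))
    return F, p

def supervised_step(x, d, a_funcs, w, V, c, mu):
    """One stochastic gradient step on E_t = 0.5 (d - F(x))^2 for w_j, V_j, and c_j."""
    F, p = sam_output(x, a_funcs, w, V, c)
    eps = d - F                                  # desired-minus-actual error
    for j in range(len(c)):
        spread = c[j] - F                        # uses c_j(t), before its update
        dw = mu * eps * p[j] / w[j] * spread     # weight law
        dV = mu * eps * p[j] / V[j] * spread     # volume law
        dc = mu * eps * p[j]                     # centroid law
        w[j] += dw
        V[j] += dV
        c[j] += dc
    return 0.5 * eps ** 2

def target(x):
    return x * x

grid = [-1.0 + 0.1 * i for i in range(21)]
a_funcs = [gaussian(m, 0.3) for m in (-1.0, -0.5, 0.0, 0.5, 1.0)]
w = [1.0] * 5
V = [1.0] * 5
c = [0.0] * 5

def mean_squared_error():
    return sum((target(x) - sam_output(x, a_funcs, w, V, c)[0]) ** 2 for x in grid) / len(grid)

mse_before = mean_squared_error()
for epoch in range(50):
    for x in grid:
        supervised_step(x, target(x), a_funcs, w, V, c, mu=0.1)
mse_after = mean_squared_error()
print(f"MSE before: {mse_before:.4f}, after: {mse_after:.4f}")
```

    The sketch does not guard against a weight or volume reaching zero. A practical version would need such a guard, since both laws divide by the current value.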

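    Tuning the if-part sets involves more computation since the update law contains an extra partial derivative. Suppose that the if-part set function $a_j$ is a function of $l$ parameters: $a_j = a_j(m_j^1, \ldots, m_j^l)$. Then we can update each parameter with

    \begin{align}
    m_j^k(t+1) &= m_j^k(t) - \mu_t \frac{\partial E_t}{\partial F} \frac{\partial F}{\partial a_j} \frac{\partial a_j}{\partial m_j^k} \tag{B.15} \\
    &= m_j^k(t) + \mu_t\, \varepsilon_t\, \frac{p_j(x_t)}{a_j(x_t)}\, [c_j - F(x_t)]\, \frac{\partial a_j}{\partial m_j^k}. \tag{B.16}
    \end{align}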

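    Exponential if-part set functions can reduce the learning complexity. They have the form $a_j = e^{f_j(m_j^1, \ldots, m_j^l)}$ and obey $\frac{\partial a_j}{\partial m_j^k} = a_j \frac{\partial f_j(m_j^1, \ldots, m_j^l)}{\partial m_j^k}$. Then the parameter update (B.16) simplifies to

    $$m_j^k(t+1) = m_j^k(t) + \mu_t\, \varepsilon_t\, p_j(x_t)\, [c_j - F(x_t)]\, \frac{\partial f_j}{\partial m_j^k}. \tag{B.17}$$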

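    This can arise for independent exponential or Gaussian sets $a_j(x) = \prod_{i=1}^{n} \exp\{f_j^i(x_i)\} = \exp\{\sum_{i=1}^{n} f_j^i(x_i)\} = \exp\{f_j(x)\}$. The exponential set function

    $$a_j(x) = \exp\Bigl\{\sum_{i=1}^{n} u_j^i\,(v_j^i - x_i)\Bigr\} \tag{B.18}$$

    has partial derivatives $\frac{\partial f_j}{\partial u_j^k} = v_j^k - x_k(t)$ and $\frac{\partial f_j}{\partial v_j^k} = u_j^k$.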
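    A finite-difference check confirms these partial derivatives and the identity $\partial a_j / \partial m = a_j\, \partial f_j / \partial m$ that they rest on. The sketch below is illustrative only. The helper names and the sample values of $x$, $u$, and $v$ are hypothetical.

```python
import math

def exp_set(x, u, v):
    """Exponential if-part set a_j(x) = exp(sum_i u_i (v_i - x_i))."""
    return math.exp(sum(ui * (vi - xi) for ui, vi, xi in zip(u, v, x)))

def exp_set_grads(x, u, v):
    """Analytic partials of a_j: da/du_k = a (v_k - x_k) and da/dv_k = a u_k."""
    a = exp_set(x, u, v)
    da_du = [a * (vk - xk) for vk, xk in zip(v, x)]
    da_dv = [a * uk for uk in u]
    return da_du, da_dv

def finite_difference(func, params, k, h=1e-6):
    """Central-difference estimate of d func / d params[k]."""
    up = list(params)
    up[k] += h
    down = list(params)
    down[k] -= h
    return (func(up) - func(down)) / (2.0 * h)

x = [0.2, -0.4]
u = [1.5, 0.7]
v = [0.1, 0.3]
da_du, da_dv = exp_set_grads(x, u, v)
for k in range(len(x)):
    num_du = finite_difference(lambda p: exp_set(x, p, v), u, k)
    num_dv = finite_difference(lambda p: exp_set(x, u, p), v, k)
    print(f"k = {k}: da/du analytic {da_du[k]:.6f} numeric {num_du:.6f}; "
          f"da/dv analytic {da_dv[k]:.6f} numeric {num_dv:.6f}")
```

    The same check applies to any exponential-form set function once its exponent $f_j$ and the partial derivatives of $f_j$ are known.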

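    The Gaussian set function

    $$a_j(x) = \exp\Bigl\{-\frac{1}{2} \sum_{i=1}^{n} \Bigl(\frac{x_i - m_j^i}{\sigma_j^i}\Bigr)^2\Bigr\} \tag{B.19}$$

    has mean partial derivative $\frac{\partial f_j}{\partial m_j^k} = \frac{x_k - m_j^k}{(\sigma_j^k)^2}$ and variance partial derivative $\frac{\partial f_j}{\partial \sigma_j^k} = \frac{(x_k - m_j^k)^2}{(\sigma_j^k)^3}$. Such Gaussian set functions reduce the SAM model to Specht's �� radial basis function network. We can use the smooth update law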