Learning Communication Strategies in Multiagent Systems

P1: VBI

Applied Intelligence KL598-06-Kinney June 16, 1998 16:20

Applied Intelligence 9, 71–91 (1998)c© 1998 Kluwer Academic Publishers. Manufactured in The Netherlands.

Learning Communication Strategies in Multiagent Systems

MICHAEL KINNEY AND COSTAS TSATSOULISDepartment of Electrical Engineering and Computer Science, The University of Kansas, Lawrence, KS 66045

[email protected]

Abstract. In this paper we describe a dynamic, adaptive communication strategy for multiagent systems. Wediscuss the behavioral parameters of each agent that need to be computed, and provide a quantitative solutionto the problem of controlling these parameters. We also describe the testbed we built and the experiments weperformed to evaluate the effectiveness of our methodology. Several experiments using varying populations andvarying organizations of agents were performed and are reported. A number of performance measurements werecollected as each experiment was performed so the effectiveness of the adaptive communications strategy could bemeasured quantitatively.

The adaptive communications strategy proved effective for fully connected networks of agents. The performanceof these experiments improved for larger populations of agents and even approached optimal performance levels.Experiments with non-fully connected networks showed that the adaptive communications strategy is extremelyeffective, but does not approach optimality. Other experiments investigated the ability of the adaptive communica-tions strategy to compensate for “distracting” agents, for systems where agents are required to assume the role ofinformation routers, and for systems that must decide between routing paths based on cost information.

Keywords: multiagent systems, adaptive communication, quantitative performance

1. Introduction

The issues that need to be addressed by research inmultiagent systems include communication, coordina-tion, coherence, uncertainty, organizational structure,planning, and knowledge representation. Our workaddresses communication, coordination, and coher-ence, and, as we show in the following sections, ourmethodology allows the multiagent system to dynam-ically organize its structure and problem solving, thusmaking the issues of organizational structure and plan-ning redundant.

More specifically, as with any other computing sys-tem, the fundamental goal of research in multiagentsystems is to make the system have “good” perfor-mance. To achieve good performance an agent shouldnot be idle, waiting to receive the information it needs toperform its task, should not duplicate the work of otheragents (with the exception of systems of agents compet-ing for resources, for example, ETHER [1], or systems

where a high degree of redundancy is required), andshould be assigned tasks appropriate to its abilities.At the same time each agent should perform activitiesthat are globally coherent, that is, improve the problemsolving state of the whole collection of agents [2].

Most research in Distributed Artificial Intelligence(DAI) and multiagent systems has tried to solve theseproblems by addressing the communication needs of anagent:what information to send,when, and towhom.Even research that superficially solves a different prob-lem, is primarily addressing the issue of communi-cation between agents. For example, research in thetopology of agents is motivated by the need to performproblem decomposition, and to determine communica-tion links and activities. Research in agent coordinationis primarily interested in communication of informa-tion and agents’ models of other agents. Even researchin an agent’s knowledge representation is driven by thetypes of information communicated and the quality ofthe communication acts (see, for example, [3]).

P1: VBI


72 Kinney and Tsatsoulis

In general, if an agent had complete knowledge of allother agents, their exact state at any time (that is, theirinformation, knowledge, intent, and internal models ofthemselves and the other agents), and the overall goalof the system, then it could be perfectly coordinatedwith the activities of every other agent. Of course, thisis infeasible, since the communication requirementswould overwhelm the problem solver.

The problem of communicating information toachieve coherence has been approached in many ways:through studies of the effects of different agent orga-nizations on the performance of multiagent systems;or by determining an applications specific communi-cations strategy; or by including planning communi-cations actions as part of the regular problem solvingactivities of each agent; or by providing a common areafor groups of agents to store their knowledge (black-board systems). Most approaches to communicationbetween agents result in static organizations and strate-gies that are optimized for the specific application forwhich they are adopted. In our work we are interestedin developing a methodology that allows the dynamic,adaptive generation of communication strategies forany collection of cooperating agents. To address thecommunication and coordination issue for cooperative,multiagent systems, we introduce an adaptive commu-nications strategy. Instead of each agent followinga static communications protocol, agents learn localcommunications strategies that improve the efficiencyof each agent. These local communications strategiesinteract with one another resulting in a global commu-nications strategy that improves the coordination andefficiency of the entire multiagent system.

Our basic approach is to provide an agent with theability to adapt to the environment in which it is placedand the problem solving context in which it is involved.If the agents cooperate in this learning process, theperformance of the entire multiagent system improves.This allows agents to adapt to different organizationalstructures, adapt to the specific needs of different ap-plications, avoid being constrained by a predeterminedplan of action, and eliminate the overhead of a com-mon area for storing knowledge. It also leads to thedynamic establishment of communication strategies,agent organizations, and problem solving contexts.

In the rest of the paper we describe our methodolog-ical approach and the dynamic, adaptive communica-tions strategy we developed. We discuss the behavioralparameters of each agent that need to be determined,and provide a quantitative solution to the problem

of controlling these parameters. We also describe thetestbed we built and the experiments we performed toevaluate the effectiveness of our theory. Several exper-iments using varying populations and varying organi-zations of agents were performed and are reported. Anumber of performance measurements were collectedas each experiment was performed so the effectivenessof the adaptive communications strategy could be mea-sured quantitatively.

The adaptive communications strategy proved ef-fective for fully connected networks of agents. Theperformance of these experiments improved for largerpopulations of agents and even approached optimalperformance levels. Experiments with non-fully con-nected networks showed that the adaptive communi-cations strategy is extremely effective, but does notapproach optimality. Other experiments investigatedthe ability of the adaptive communications strategyto compensate for “distracting” agents, for systemswhere agents are required to assume the role of in-formation routers, and for systems that must decidebetween routing paths based on cost information. Eventhough the adaptive communication strategy improvedthe efficiency of these systems, the results fell shortof the optimal performance levels. The reasons forthe degraded performance of these special cases arediscussed, and possible improvements to the adaptivecommunications strategy are proposed.

2. Approaches to the Problem of Communicationin Multiagent Systems

There is a large volume of previous research addressingcoherence, communication and coordination in Dis-tributed AI (DAI) systems, and in multiagent problemsolving systems, going back to the early blackboardsystems. The major issue of study has always beencoordination of the problem solving activity, and dif-ferent projects have attempted to answer the questionfrom different perspectives.

One solution to the coordination problem is foragents to plan the actions that the system performs.Centralized and distributed planning allow agents ina distributed system to align and synchronize theiractivities towards a set of common goals. Amongthe issues concerning centralized planning are prob-lem decomposition and plan generation, which, inturn, involve plan representation, plan execution, andplan evaluation [4]. Classical frameworks like the

P1: VBI


Communication Strategies 73

Contract Net Protocol generate a cooperating plan bycontractor-contractee task assignment [5, 6]. Dis-tributed planning is an incremental approach, whichinvolves plan reconciliation, plan synchronization [4],partial plan integration, and conflicting plans resolution[7–11]. Partial Global Planning [12–14] and its exten-sion, Generalized Partial Global Planning [15], is aflexible coordination approach which does not assumeany particular distribution of subproblems, expertise,or other resources; instead it lets the nodes coordinateby exchanging information about their objectives andplans. Through information sharing, nodes exchangetheir partial plans to gain a more global view. In a dis-tributed planning system, local decisions always havenonlocal impacts [16], so in order to converge a set oflocal plans to a global one, both multistage negotiationand information sharing are necessary to the systemplanning. Another solution is offered by blackboardsystems: the agents (knowledge sources) share allknowledge by posting it on the blackboard, thus alwayshaving a complete global view of the problem solvingactivity [17–19]. Finally, one possible solution is notto communicate at all. Shoham and Tennenholtz’s “so-cial laws” can be viewed as an approach which tries tochange the structure of the agent’s tasks by, for exam-ple, restricting an agent’s possible activities [20]. Inother research, agents do not communicate but, given aproblem and a set of possible solutions from which theagents need to choose one, focal points are prominentsolutions of the problem to which agents are drawn[21].

In our work agents transmit very low-level parame-ters, and there is no need for them to reason about theirown subproblems, resources and expertise or these oftheir neighbors, i.e., agents connected to them directly.This simplifies the agent’s knowledge and allows mult-iagent systems to use agents that are “savants”, i.e.,capable only of individual problem solving withoutthe need of social knowledge about the overall group,for example, proposed in [22]. We also minimize thecommunications overhead by having the agents learnexactly what to send, to whom, and how often.

Other work has concentrated on the topologiesrequired by DAI systems, since the structure of anorganization provides a framework of constraints andexpectations about the behavior of the agents in the dis-tributed system. In comparisons between hierarchicaland flat topologies it was found that both have their ad-vantages and disadvantages. The best compromise isa topology that is mostly flat, so low-level informationcan be shared, yet also has a hierarchy, so higher levels

of information can be collected [23, 24]. Other workapproached the topology problem through mathemati-cal models of the costs associated with different markettopologies [25]. The agent topologies required by adistributed air traffic control application were studiedin [26]. In [27], it was pointed out that organizationpolicies not only define a task decomposition, but alsoprescribe communication paths among agents.

Our work is not concerned with special agent or-ganizations, since the system will adaptively organizeitself. We assume a swarm as an initial agent orga-nization to study the communications problem at itsmost basic, but our methodology is applicable to spe-cial agent organizations.

In a multiagent system, the ability of an agent to learncan be viewed as learning as a group, and learning toadjust its views or actions [28]. Most work has con-centrated on the contents of the agents only, either theproblem solving components or the shared view of theworld or the vocabulary used. One learning system in-corporated contract net bidding and genetic algorithmsinto a DAI framework. The contract net bidding pro-cess introduced competition between agents, and thegenetic algorithms allowed agents to evolve into moreefficient problem solvers. Adaptation occurred at avery high level, and took place over a long period oftime as weaker problem solving agents were replacedby more efficient problem solving agents, taking ad-vantage of an evolutionary model of learning [28]. In[29, 30] agents in a multiagent system start with differ-ent views of the world, and use different vocabularies todescribe the world. Theory integration uses inductivelearning to resolve their world views and vocabularies.This results in individual agents modifying their rulesets so the group of agents has a consistent view ofthe world. An architecture for agents in a multiagentsystem that combines machine learning and hypothesissharing was proposed in [6]. Agents use an inductivelearning process to build a generalization hierarchy.The generalized hypothesis is shared with other agentswhich incorporate this hypothesis into their own gener-alization hierarchies resulting in a distributed inductivelearning system. In [31] a DAI system is describedwhere the components of rules are separated acrossnodes, and neural network techniques are used to ad-just the weights of the connections between the nodes.This is a fine grain distribution of a problem, and thenodes are so simple that they cannot be considered in-telligent agents. This type of system also requires alarge number of tightly coupled nodes with connectionscapable of high communication bandwidths. More

P1: VBI



recently, a methodology was developed to allow anagent to adapt its reasoning method, meta-control strat-egy, perceptual strategy, control mode, etc., to fit in achanging environment [32].

All this work in learning agents, though, has concen-trated on an individual autonomous agent. Our workapplies learning to the behavior of a multiagent system,concentrating on communication rather than individualproblem solving and reasoning components.

More recent work has started looking at communi-cation schemas between agents. As pointed out in [33],meta-level communication between agents about theirlocal loads can, with a small communication cost, pin-point the true costs and benefits of the various organi-zation structures, allowing an informed organizationaldecision to be made. For example, agents in Dis-DSS(Distributed Dynamic Scheduling System) share theirability and availability with each other [34]. Dis-DSSis a protocol for the exchange and updating of resourceprofiles including agent’s committed resources, avail-able resources, and estimated future demand. In theDAI system described in [35], an agent selects informa-tion by making use of the agent’s own problem solvingknowledge, and the usefulness of information. Knowl-edge about how the information item was found and therole that it played in problem solving is translated intoan expected value through a set of heuristics. Anothertechnique used a scheduler which studied the currenttasks and goals of the system and enforced a combi-nation of up to five predefined coordination algorithmsfor multiagent systems [36].

Most relevant to our work are recent projects whereagents learn organization or communication strategiesto function in a multiagent group. In the work by Prasad,Lesser, and Lander [37], different agents play differentroles in the search towards a goal in a multiagent sys-tem and these agents learn, through supervised learn-ing techniques, what role to play in the organization byestimating a search control strategy based on Utility,Probability, and Cost [38]. The system LEMMINGlearns to reduce the communication load for task ne-gotiation in a Contract Net by case-based reasoning.An agent uses CBR to decide to which other agent itshould sent the Task Announcement messages whichare part of the Contract Net communications protocol.The cases selected are based on the current task theagent is trying to complete [39]. In a brief paper, Foiselet al. [40] mention that agents initiate interactions withother agents and, after the interaction is finished, theyevaluate the benefits and costs of these collaborations

using a priori defined evaluation criteria. Interactionswith agents which led to positive results are reinforced,while interactions with agents which led to negative re-sults are weakened. In other work, interaction betweenagents was represented as a two-player game where theagent’s objective was to look for a strategy that maxi-mizes the expected sum of rewards in the game. Thegoal is to learn the other agent’s model, so its futurebehavior can be predicted [41]. Similar in goal wasthe work described in [42]: learning allowed agents tolearn about other agents’ preferences via past interac-tions in a communication-intensive negotiating agentarchitecture. In this environment agents negotiate it-eratively andrefine their preference functions until aconsensus is reached. Agents try to guess the prefer-ence functions of other agents, and the guess is updatedas more observed behavior of the other agents becomesavailable.

Most of this work aims to develop learning tech-niques for agent models or types of interaction, and,while the results of the adaptive reorganization of themultiagent system will influence the communicationsstrategy indirectly, the explicit goal of the learningagents is not an improved communications strategy.The work most relevant to ours is the one implementedin LEMMING, where agents learn who to talk to andwhat information to transmit. LEMMING uses case-based reasoning so that an agent will remember whichagents were useful in previous contract negotiationsand will prefer communicating with them; in otherwords the agents learn based on whether the informa-tion received was useful to them or not. Our work hassimilar goals, i.e., reduction of the overall communi-cations load, but we take a more global view: agentsconsider their own information needs, but they alsoact as routers of information, thus considering the in-formation needs of the other agents in the network,too. Also, LEMMING is constrained to a ContractNet architecture, while our system develops a generic,domain- and organization-independent adaptive com-munications strategy.

3. An Adaptive Communications Strategyfor Multiagent Systems

3.1. System Components

For the purposes of our work a multiagent system con-sists of sensors and agents connected by links through

P1: VBI



Figure 1. Layered agent architecture with a rule-based system.

which information can be transmitted. A sensor isa source of information, and the agent uses informa-tion collected from other agents and sensors to performproblem solving activities.

All agents are assumed to be cooperative, and work-ing on the same problem. Every agent is assumed tohave the same communications capabilities, and use acommon information syntax. This assumption is re-quired for agents to share information. The only com-ponent of an agent that is allowed to vary is the agent’sspecific problems skills. This means that agents canbe experts in specific areas, but they are required tocooperate with one another to solve a given problem.

Figure 1 shows the layered architecture for an agent:the agent has a problem solving component, a low-level communications component that allows it to talkto other agents and sensors, and an adaptive component,which allows it to learn communication strategies (thisis discussed in detail in Section 4). In this specificexample, the problem solving component of the agent isrule-based.Ai represents a connection to a neighboringagent, andSj represents a connection to a neighboringsensor.

We assume a flat agent organization; this makescommunication and coordination more difficult, sincethere is no hierarchically “superior” agent to guideand control activities of lower level agents. We se-lected a flat organization to make sure we are address-ing the communication and coordination problem at itsmore difficult level. Our work can be easily applied to

hierarchical agent organizations since we allow agentsto have specialized functions (e.g., problem solvers, or-ganizers, information routers, problem decomposers,etc.) and our links can transmit information betweenagents at any levels of a hierarchy.

3.2. Agent Behavior

Each agent is required to make local decisions that helpthe cooperating group of agents move towards a solu-tion to a given problem. Since each agent has infor-mation about itself and a small amount of informationabout its neighbors only, making good decisions is dif-ficult. The behavior of a single agent is similar to that ofa pleasure seeking automaton [43]. Instead of lookingfor food and avoiding predators, the goals of a pleasureseeking automaton in a problem solving environmentare to collect information that helps the group of agentssolve the given problem, and at the same time, to per-form its own problem solving activities. This meansthat information that effects an agent’s own problemsolving activities and information that might effect theproblem solving activities of its neighbors must be col-lected. By correctly balancing the collection of thesetwo types of information, an agent can perform bothproblem solving and information routing. In addition,each agent is required to collect information from theleast expensive sources possible. The actions of send-ing information and problem solving have costs asso-ciated with them. As information propagates througha system, it has a cost attributed to the actions thatwere performed for it to reach its current location. Ifthere are multiple paths through which a piece of in-formation can be sent or generated, the agents that usethis information must favor the least expensive pathto reduce the communication load and to improve theoverall efficiency of the problem solving system. Notethat there are application where the quality of the in-formation may be more important than the cost. In thispaper we are only concerned with information cost,while other work has looked into quality and reliabilityof information [44].

For agents to learn the information needs of theirneighbors and find the least expensive path for routinginformation, they must provide feedback to their neigh-bors concerning their own information needs. Oncethis feedback has been collected, an agent can makeintelligent decisions concerning the routing of infor-mation. Another type of feedback that is required isthe work load of neighboring agents. If a neighboring

P1: VBI



agent is underworked, then more information should besent to that agent. If a neighboring agent is overworked,then less information should be sent to that agent. Thecombination of these two types of feedback providesthe information required by agents to make local deci-sions that effect to global flow of information througha network.

The final aspect of an agent’s behavior is how itbudgets its time towards problem solving activities,communicating with neighboring agents, and collect-ing information from sensors. The amount of timespent on each of the three major activities is critical tothe performance of an agent. If too much time is spentcollecting information from sensors, the informationwill overflow the system, and very little problem solv-ing will be performed. If too much communication isperformed, information will be propagated throughoutthe network, but once again, not enough problem solv-ing will be performed. If too much emphasis is placedon problem solving, an agent will perform all the prob-lem solving possible. This will leave it in an idle state asit waits for more information from neighboring agentsor sensors.

Since the balance among these three activities isnot normally the same for every agent in a system, amechanism must exist to adjust the importance of eachactivity. A simple solution is to rely on the feedbackconcerning the workload of neighboring agents to ad-just the flow of information into an agent. If the flowof information into an agent is increased, that agentis forced to spend more time on communicating thanon problem solving. If the flow of information intoan agent is decreased, it has more time to spend onproblem solving. The control of time spent on collect-ing the information from sensors can also be adjustedby examining the workload of an agent. If an agentrecognizes that it is overworked, it will not only slowdown the flow of information from neighboring agents,but also slow its own collection of information fromsensors. If an agent is underworked, it will request anincrease in the flow of information from neighboringagents, and spend more time on collecting informationfrom sensors.

The behavior of a single agent in a cooperating groupcan be summarized as follows:

1. Collect information that affects your own problemsolving activities.

2. Collect information that might affect the problemsolving activities of your neighbors.

3. Collect the least expensive information possible.4. Share information that might affect the problem

solving activities of your neighbors with your neigh-bors.

5. Share your information needs with your neighbors.6. Share your current workload with your neighbors.7. Budget time between problem solving, communi-

cating with neighboring agents, and collecting in-formation from sensors based on your own work-load and the work loads of neighboring agents.

This behavior needs to be dynamic, and adapt itselfto different problems and configurations of the agentnetwork. It is possible, for example, that different prob-lems will use different parts of the network, requiringthat the agents adapt their behavior to achieve betteroverall network performance. It is also possible thatsome agents or links between agents might disappear,requiring that the rest of the network dynamically adaptits behavior to the changes in its topology. What isneeded is anadaptive communications strategyto mod-ify and adapt an agent’s communications activities as itinteracts with other agents in a multiagent system. If allthe agents in a multiagent system use an adaptive com-munications strategy, the collection of local communi-cations strategies (i.e., the communication activities ofan agent with its immediate neighbors) represents theglobal communications strategy that controls the flowof information through a network of agents.

3.3. The Communications Layer of an Agent

An adaptive communications strategy optimizes anagent’s local communications strategy so that it cantake advantage of its local resources. In addition, theadaptive communications strategy can compensate forfailures in agents and communications links. An idealcommunications strategy uses the minimum amount ofcommunication required for a group of agents to solvea problem in the minimal amount of time. The goal isto design a separate communications layer that has theability to adapt to the problem solving environment inwhich it is placed. This requires that the communica-tions strategy be independent of the problem domain,independent of the network topology, able to adapt tothe information needs of the agents, and able to mini-mize communications with other agents.

Since this communications layer is designed to beused in a variety of environments, it will not be asefficient as a communication strategy designed for a

P1: VBI



specific application. However, the advantages of anadaptive communications strategy outweigh its lim-itations. Since an adaptive communication strategylearns, an application can be implemented in less time.Also, the number and types of agents in a multiagentsystem can be adjusted without redesigning the com-munications software. In addition, the learned com-munications strategy can provide a starting point forimplementing an optimized communications strategy.The areas that an adaptive communications strategyshows the most promise are in prototyping MAS appli-cations and in any dynamic problem solving environ-ment where the communications strategy must changeover time.

The communications layer is modeled as a set ofbehaviors required by an agent in a multiagent system.This set of behaviors is also modeled as a control systemwith a set of inputs, a set of outputs, and a set of systemequations.

4. The Adaptive CommunicationsControl Strategy (ACCS)

The mechanism used to achieve the seven behaviorslisted in the previous section must be simple so theoverhead associated with the adaptive communicationsstrategy does not dominate the work that an agent per-forms. This mechanism is a set of probability tablesthat are used to make information routing and sensorsampling decisions. Each agent is a control systemwith a set of inputs and a set of outputs. The inputsinclude information from agents and sensors and feed-back from neighboring agents. The outputs includeinformation and feedback to other agents and sensorsampling actions. Inside the agent is a system of equa-tions that adjusts the relationship between these inputsand outputs in an attempt to satisfy the agent’s behav-ioral characteristics.

This system of equations also generates two proba-bility tables. One is used to make information routingdecisions, and the other to make sensor sampling deci-sions. As information is received from external sourcesor generated by the problem solving component of anagent, it is classified into independent sets. Knowledgeis grouped in this way to provide an agent with a finerlevel of control over its information routing and sensorsampling decisions. The information routing probabil-ity table contains a row of values for each neighbor-ing agent. Each row contains a value for the differentclasses of information. When information is received

from an external source or information is generatedby the problem solving component of an agent, theinformation is classified. The column of the proba-bility table associated with that information class isthen examined. Each probability value represents thefrequency with which information of a specific classshould be transmitted to an adjacent agent.

The sensor sampling probability table is used in asimilar manner. Each time an agent decides to performa sensor sampling action, this table is examined. Theprobability values in this table represent the frequencywith which information of a specific class should besampled from neighboring sensors. The control sys-tem within an agent adjusts the probability values ineach of these tables to achieve the desired agent be-haviors. The probability tables used by all the agentsin the system are a direct representation of the currentcommunications strategy. This strategy is adaptive be-cause the values in these tables are adjusted over timein an attempt to improve the performance of the system(see Sections 4.6, 4.7, and 4.8 for details).

Several topics must be introduced before the systemof equations within an agent can be described. Theseinclude the classification of information, the usefulnessof information and classes of information, the cost ofinformation, the selfishness of an agent, and the work-load of an agent. After these topics are discussed, theywill be used to describe the system of equations whichgenerate the probability tables that an agent uses tomake information routing and sensor sampling deci-sions.

4.1. Classes of Information

Information must be grouped into separate classes so anagent can make better decisions. If no information clas-sification is performed, feedback would be required foreach piece of information. However, since each pieceof information is unique, the agent that generated theinformation will not have a chance to use the feedbackinformation to make a more intelligent routing deci-sion the next time. This leaves an agent without anymechanism for learning, and it must resort to sendingall new pieces of information across all possible linksso the neighboring agents are guaranteed of acquiringthe information they need to perform their part of theproblem solving.

On the other hand, if a single classification is used,an agent is only able to tell a neighbor the usefulnessof all the information that it has received. Since most

P1: VBI



agents have specific problem solving skills, they onlyneed a small subset of the information. This results inrelatively small feedback values, and thus small prob-ability values, decreasing the frequency with which in-formation is transmitted, and the agents are idle mostof the time.

The first level of classification is required to separateinformation into groups based on who sent the infor-mation. In this way, an agent can track the informationthat is passed across specific links. Then, through feed-back, the traffic across specific links can be adjusted.This helps strengthen or weaken certain paths throughthe network of agents.

The next level of classification depends on the infor-mation needs of the problem solving components of thedifferent agents. The assumption is made that a com-mon set of classification rules exists for all agents andthat all generated information will fall in one of thesepredefined classes. This assumption does not hurt thegenerality of our approach, since it is reasonable toassume that the user or designer of a problem solv-ing network will know a priori the knowledge that thenetwork can deal with.

4.2. Usefulness of Information

As an agent collects information, it groups it by classand by the agent that sent the information. Thesegroups of information are evaluated to determine howuseful they are to the problem solving component of anagent and to the neighboring agents. This means thatthe Adaptive Communications Control System mustknow what the problem solving component of an agentconsiders useful. Once the usefulness of an informa-tion class has been determined, the probability tablescan be computed and fed back to the neighboring agents(see Sections 4.6, 4.7, and 4.8 for details). This feed-back provides each agent with a model of what itsneighbors consider useful and not useful. The samemethod is used to examine the information sent byneighboring sensors, so that an agent has a model ofwhich sensors provide information that is useful to thesystem of agents.

The usefulness of information must be examined attwo different levels. One is the usefulness of a singlepiece of information, and the other is the usefulnessof a class of information. The usefulness values forthe pieces of information in a class are combined withthe usefulness of the information class to neighboringagents to determine the overall usefulness of a class

of information. The usefulness of a single piece ofinformation will depend on the specific needs of theproblem solving component of an agent.

4.3. Cost of Information

There are two actions that have an associated cost in asystem of agents. One is computation, and the otheris communication. As a piece of information is passedacross communication links, its cost increases by thecost of using these links. Also, any piece of informa-tion that is generated by the problem solving compo-nent of an agent has a cost attributed to of performingthe problem solving activity. As a piece of informationpropagates through a system of agents, the cost associ-ated with that information is a direct representation ofthe resources that the information consumed.

The cost is used to help an agent make better infor-mation routing decisions. The less expensive routesshould be preferred over more expensive routes. How-ever, to increase the robustness of the system, routesshould not be completely cut off. Even though oneroute is more expensive than another, if the less ex-pensive route is lost due to an agent or a link fail-ure, the more expensive route will be needed. If themore expensive route is completely cut off, the systemwould not be able to adapt when the failure occurred.This compromise of allowing some amount of trafficto pass through expensive routes may decrease the per-formance of the system, but it is required in order toincrease its robustness and adaptability.

4.4. Agent Selfishness

One behavior that an agent possesses is the ability tobalance its own information needs against the informa-tion needs of its neighbors. If an agent is completelyselfish, it will collect information that it needs for itsown, local problem solving, but it will not provide anyinformation that its neighbors may need. If an agentis completely selfless, it will spend all of its time rout-ing information to its neighbors, and spend no time onproblem solving. Though there are special cases wherecompletely selfish and completely selfless agents areneeded, most of the time a combination of these char-acteristics is required.

To address this problem, we introduce the conceptof a selfishness factor, having a range [0, 1]. A valueof 0 means that an agent is completely selfless, and avalue of 1 means that an agent is completely selfish.

P1: VBI



For most cases, a selfishness value of between 0.5 and1 will be used. This range of values places a higher im-portance on an agent’s own information needs, but itforces an agent to also consider the information needsof its neighbors. The set of equations that determinethe behavior of an agent must use this selfishness factorto balance the activities of problem solving and infor-mation routing.

4.5. Agent Workload

An agent also monitors its own workload. This infor-mation is used to adjust the amount of information thatneighboring agents send and the frequency by whichneighboring sensors are sampled. If an agent is un-derworked, the neighboring agents should send moreinformation, and the neighboring sensors should besampled more often. If an agent is overworked, theneighboring agents should send less information, andthe neighboring sensors should be sampled less often.This leads to several questions: what is the definitionof “work”?; what is considered “underworked”?; andwhat is considered “overworked”?

The goal is for all the agents to be working at max-imum efficiency, which means that they should fullyutilize all their resources. However, it is difficult to tellthe difference between an agent that is overworked,and an agent that is working at maximum efficiency;both are using 100% of their resources. To make thisdistinction, we use a definition of work based on anagent’sidle time, and we demand that all agents func-tioning properly have some idle time. Consequently,if an agent is never idle, it is overworked; if an agentis always idle, then it is underworked; if an agent hasonly a small, predefined amount of idle time, then theagent is working at its maximum possible efficiency.With this definition of work, the difference betweenoverworked and working at maximum efficiency canbe determined.

Some agents, due to their location in the network,or their specific problem solving skills, will have morework to do than other agents. These agents are con-sidered the “weak links” in the system, since the net-work of agents can only process information at the rateof its slowest member. These weak-link agents willquickly become overworked, and the communicationsstrategy developed should take into account such over-worked agents, and make sure that they work at peakperformance, without slowing down the network-wideproblem solving.

It is clear that in some agent topologies it will beimpossible to have all agents working at peak perfor-mance, especially given even a single weak-link agent.The best we can hope to achieve is that the weak-linkagent works with only a small amount of idle time, andthe remaining agents have minimum idle time, whichis computed based on the resources these agents needto contribute to the cooperating group. Therefore, forthe multiagent system to operate efficiently, each agenthas to have an expected amount of idle time that itshould attempt to achieve. To achieve this goal, eachagent contains a control system that adjusts the amountof resources an agent uses in an attempt to match themeasured amount of idle time to the expected amountof idle time.

4.6. General Form of an Adaptive CommunicationsControl System (ACCS)

The concepts of information classification, informa-tion usefulness, information cost, agent selfishness, andagent work load are combined in one set of equationsthat determine the behavior of an agent. Figure 2 repre-sents the control system model of an agent. The inputsare shown on the left side, the outputs on the right side,and the system of equations is inside the agent.

For this model, there aren neighboring agents andmneighboring sensors. Information enters an agent andis either used by the problem solving component togenerate new pieces of information, routed to a neigh-boring agent, or discarded. New pieces of informationgenerated by the problem solving component are ei-ther routed to a neighboring agent or discarded.K A,KT , 1T andTE are parameters to the internal controlsystem that monitors the workload of an agent.

Figure 2. Control system model for an agent.

P1: VBI



The following are the equations that compute theprobability tables for an agent. The probability valuesare limited to the range [0.01, 1.0]; if a probability wereto drop to 0, a link would be completely cut off andnever reactivated.F ′(Ai , k) are theF(k) values fedback to the agent by its neighbors, andg′(a, X) is theusefulness of information fed back from a neighboringagent.

PA(a, c)

=

1.0 for 1.00< F(k)g(X)

F(k)g(X) for 0.01≤ F(k)g(X) ≤ 1.00

0.01 for F(k)g(X) < 0.01

PS(s, c)

=

1.0 for 1.00< F(k)h(Y)

F(k)h(Y) for 0.01≤ F(k)h(Y) ≤ 1.00

0.01 for F(k)h(Y) < 0.01

PT (a, c)

=

1.0 for 1.00< F ′(a, k)× g′(a, X)

F ′(a, k)g′(a, X) for 0.01≤ F ′(a, k)× g′(a, X) ≤ 1.00

0.01 for F ′(a, k)g′(a, X)< 0.01

where

F(k) Scale factor that is generated by the agent’sinternal control system.

F ′(a, k) Scale factor that was fed back from a neigh-boring agenta.

g(X) Usefulness of informationX that wasreceived from agenta and classified intoclassc.

g′(a, X) Usefulness of informationX that was fedback from a neighboring agenta.

h(Y) Usefulness of informationY that was re-ceived from sensors and classified intoclassc.

Figure 3 shows the internal control system of anagent. This control system monitors the workload ofthe agent. The scale factorF(k) is adjusted until theexpected amount of idle timeTE matches the measuredamount of idle timeTI (k). F(k) directly impactsTI (k)becauseF(k) is used to compute the probability tablesthat determine how information is routed and sampled.The control system is sampled every1T seconds.K A

Figure 3. Internal control system of an agent.

and KT are the parameters to the derivative feedbackcontrol system that determine how quickly and accu-rately an agent can adapt to changes in the multiagentsystem. The optimal values for these parameters de-pend on the dynamics of an actual system and have tobe determined experimentally. The valuee(k) is theerror term in the derivative feedback control system.The scale factorF(k) is calculated by summinge(k)over all sample periods. Derivative feedback is usedbecause it has low overhead, and it provides better per-formance than a simple feedback control system.

4.7. The ACCS for a Network of Rule-Based Agents

The reason thatg(X) andh(Y) could not be definedearlier is that these usefulness equations depend on thecharacteristics of the problem solving component of anagent. Assuming a network of heterogeneous agents,each agent would have to have its own definition offact usefulness; a scheduling agent may assign useful-ness to classes of information in a manner differentfrom a qualitative simulation agent. In our work wehave elected to concentrate on rule-based agents, forreasons we will discuss in the following. This doesnot constrain the applicability of our work, since, givendefinitions for information usefulness, the ACCS is ap-plicable to any set of agents.

To defineg(X) andh(Y), the usefulness of a singlefact must be defined. For rule-based systems, the use-fulness of a single fact is defined to be the percentage ofrules in the rule base that a single factmay potentiallytrigger. In our prototype the rule-based systems usedwere represented in a RETE network [45], and the rulesthat a fact may potentially trigger can be computed bythe RETE nodes that are activated by this piece of in-formation.

Once the usefulness of a single fact is defined, theusefulness of all the facts in a class is combined to

P1: VBI



determine the usefulness of an entire class of facts. Thefollowing two equations use the concepts of the classi-fication of information, the usefulness of information,agent selfishness, and the cost of information to deter-mine the usefulness of an entire class of facts. Theseequations assume that facts are classified and separatedaccording to which agent transmitted the facts. Insteadof using theg(X), UA(a, c) is used, and instead ofusingh(Y), US(s, c) is used. The following is the def-inition of class usefulness along with the definitions ofall the parameters that are used by these equations:

g(X) = UA(a, c)

= S1

NA(a, c)

NA(a,c)∑i=1

UAF(a, c, i )

CAF(a, c, i )+ (1− S)

× 1

n− 1

n∑i=1,i 6=a

UT (i, c)

h(Y) = US(s, c)

= S1

NS(s, c)

NS(s,c)∑i=1

USF(s, c, i )

CSF(s, c, i )+ (1− S)

× 1

n− 1

n∑i=1

UT (i, c)

where

UA(a, c) The usefulness of facts in classc receivedfrom agenta. This is the same asg(X)whereX is the set of facts in classc re-ceived from agenta.

US(s, c) The usefulness of facts in classc receivedfrom sensors. This is the same ash(Y)whereY is the set of facts in classc re-ceived from sensors.

S A value between 0 and 1 that representsthe selfishness of an agent. A value closerto 0 means that an agent is more concernedwith the information needs of neighbor-ing agent than with its own informationneeds. A value closer to 1 means that anagent is mainly concerned with its owninformation needs.

NA(a, c) The number of facts in classc receivedfrom agenta.

NS(s, c) The number of facts in classc receivedfrom sensors.

UAF(a, c, i ) The usefulness of thei th fact in classcreceived from agenta.

CAF(a, c, i ) The cost of thei th fact in classc receivedfrom agenta.

USF(s, c, i ) The usefulness of thei th fact in classcreceived from sensors.

CSF(s, c, i ) The cost of thei th fact in classc receivedfrom sensors.

n The number of neighboring agents.UT (i, c) The usefulness of facts in classc to the

neighboring agenti . This is the same asg′(X) whereX is the set of facts in classc that were sent to agenti .

Both equations use the selfishness factorS. Noticethat the factorS is used on the left term, and the factor(1− S) is used on the right term. The left term is theusefulness of a class to the agent, and the right term isthe usefulness of the class to the neighboring agents.WhenS= 0, the left term is zero, and the usefulnessof the class to an agent depends on the usefulness ofthis class to an agent’s neighbors. Therefore, the agentis completely selfless. WhenS= 1, the right term iszero, and the usefulness of the class to an agent dependssolely on its own knowledge needs; therefore, the agentis completely selfish. WhenS has a value between 0and 1, the left and right terms are weighted so an agentcan take into account its own information needs withthe left term and the information needs of its neighborswith the right term.

The left term computes the average usefulness ofa set of facts that are members of a specific class.UAF(a, c, i ), CAF(a, c, i ), USF(s, c, i ), andCSF(s, c, i )are the usefulness and cost information on recently re-ceived facts from neighboring agents and sensors. Theusefulness of each fact in the set is divided by the costof the fact. This reduces the usefulness of expensiveinformation and allows an agent to show a preferencefor the less expensive information routing paths. Onlythe most recently received facts in a class are used,NA(a, c), the number of facts in classc received fromagenta, andNS(s, c), the number of facts in classc re-ceived from sensors. Older facts are ignored becausethe system of agents is very dynamic and the communi-cation strategy is constantly changing and it only makessense to use facts that have been transmitted due to theuse of the most recent communications strategy. Thisway, the communications strategy itself can be thoughtof as a state variable in a control system. All the agentswork together to build a global communications strat-egy based on local viewpoints. Then, every samplingperiod, a new communications strategy replaces the

P1: VBI



old communications strategy. This new communica-tions strategy is derived from the previous one basedon the performance of the system over the sampling pe-riod. Therefore, only recently transmitted facts shouldbe used to determine the usefulness of a class.

The right term computes the average usefulness of aclass of facts to the neighboring agents. This value pro-vides an agent with the information needs of its neigh-bors. Even though an agent’s own problem solvingcomponent may not need a specific class of informa-tion, this agent may be required to route the informa-tion to an agent that does need this specific informationclass. This is the mechanism that provides an agentwith the ability to perform information routing.

4.8. Model for an Adaptive CommunicationsStrategy for Rule-Based Agents

Figure 1 shows the layered architecture for an agentwith a rule-based problem solving component. TheAdaptive Communications Control System uses threeinternal models to make information routing and sensorsampling decisions. The sensor model is the probabil-ity table PS(s, c), the agent model is the probabilitytable PT (a, c), and the model of itself is the proba-bility table PA(a, c). Ai represents a connection to aneighboring agent, andSj represents a connection to aneighboring sensor. Figure 4 shows the final version ofthe control system that the Adaptive CommunicationsControl System uses to build and maintain its modelsof the world. The following system equations describethe relationships between the inputs and outputs of the

Figure 4. Control system model for an agent with a rule-basedsystem.

control system. These system equations also calculatethe probability tables that are used to make the infor-mation routing and sensor sampling decisions:

F(k) = F(k− 1)+ K A

(TI (k)− TE − KT

× TI (k)− TI (k− 1)

1T

)PS(s, c)

=

1.0 for 1.00< F(k)US(s, c)

F(k)US(s, c) for 0.01≤ F(k)US(s, c)≤ 1.00

0.01 for F(k)US(s, c) < 0.01

PT (a, c)

=

1.0 for 1.00< F ′(a, k)×UT (a, c)

F ′(a, k)UT (a, c) for 0.01≤ F ′(a, k)×UT (a, c) ≤ 1.00

0.01 for F ′(a, k)UT (a, c)< 0.01

The agent architecture and control system describedabove are designed to provide an agent with the sevenbehaviors that were defined in Section 3.2:

1. Collect information that affects your own problemsolving activities. This behavior is achieved withthe selfishness factor and the equation forUA(a, c).If the selfishness factor is greater than zero, then theUA(a, c) values contain information on the agent’sown information needs. These values are transmit-ted to neighboring agents and becomeUT (a, c) val-ues that are used to make information routing de-cisions. When a specific class from a neighboringagent is found to be useful, theUA(a, c) value willbe large, and thus theUT (a, c) value in the neigh-boring agent will also be large. ThisUT (a, c) valueis used to computePT (a, c). When the neighboringagent receives facts in the classc, they will likelybe sent to the agent that found the classc useful.Therefore, by feeding back the usefulness of classesto neighboring agents, an agent ensures that usefulfacts are sent its way.

2. Collect information that might affect the problemsolving activities of your neighbors.This behavioris also achieved with the selfishness factor and theequation forUA(a, c). If the selfishness factor is

P1: VBI



less than one, then theUA(a, c) value contains in-formation on the information needs of neighboringagents. These values are transmitted to neighbor-ing agents, and are used to calculatePT (a, c)withinthe neighboring agents. When a neighbor finds aclass of facts useful, this increases theUT (a, c) val-ues for that class. This in turn increasesUA(a, c)which is fed back to other neighbors. This increasesthe probability that facts in classc are transmittedfrom these other neighbors. So, by feeding backthe usefulness of classes to neighboring agents, therecursive nature of the class usefulness equationsinsures that agents collect information that is usefulto their neighbors.

3. Collect the least expensive information possible.This behavior is addressed with the equations forUA(a, c) andUS(s, c). Since these equations dividethe usefulness of individual facts by their cost, ex-pensive facts are less useful than inexpensive facts.Since this information is fed back to neighboringagents, the less expensive sources of facts receivehigh class usefulness values. These classes have ahigher probability of being sent, and thus the lessexpensive paths are favored.

4. Share information that might affect the problemsolving activities of your neighbors with your neigh-bors. This behavior is satisfied through the use ofthePT (a, c) tables. Whenever a fact is received, thePT (a, c) table is checked. If the fact is a memberof a class that is useful to a neighboring agent, andthe route that the fact has traveled is less expensivethan other paths, then the fact is likely transmittedto the neighboring agent.

5. Share your information needs with your neighbors.This behavior is satisfied by the sending ofUA(a, c)values to neighboring agents. These values tell theneighboring agent which classes of facts are usefuland which classes of facts are not useful.

6. Share your current workload with your neighbors.This behavior is satisfied by computing theF(k)values and sending them to neighboring agents.F(k) values are computed from the amount of timethat an agent is idle compared to the amount of timethat an agent is expected to be idle. The scale factorF(k) is adjusted in an attempt to match these twovalues. Therefore,F(k) is an indirect indication ofan agent’s work load.

7. Budget time between problem solving, communicat-ing with neighboring agents, and collecting infor-mation from sensors based on your own workload

and the workload of neighboring agents.An agentmust perform all problem solving possible, and anagent must handle all the messages that it has re-ceived. In addition, the probability tablePS(s, c)must be monitored to determine the frequency thateach neighboring sensor is sampled. TheF(k) andUT (a, c) values received from neighboring agentsdetermine the importance of information routing.The facts received from sensors, an agent’s inter-nal F(k) value, and theUT (a, c) values receivedfrom neighboring agents determine the importanceof sensor sampling. Since the internal control sys-tem matches the measured idle time to the expectedidle time, an agent will have time to perform allof its problem solving along with its informationrouting and sensor sampling activities. So, the rela-tive weight of thePT (a, c) table, thePS(a, c) table,and TE determines how an agent budgets its timetowards each of its activities.

5. Testbed and Experiments

5.1. Introduction

We developed a testbed to implement and evaluate theACCS. The testbed operates on one or more hosts andcommunicates across a live network of SUN worksta-tions. The layered agent architecture provided us witha natural division for the implementation into a set ofmodules.

The low-level communications module is imple-mented using the Remote Procedure Call (RPC) pro-tocol that is supported under SunOS. The details ofthe communications module’s implementation can befound in [46]. For the rule-based problem solving com-ponent of an agent we used CLIPS, a language thatallows the creation of (primarily) forward reasoningrule-based systems [47]. We selected CLIPS as ourproblem solving language because we had access tothe source code and the RETE network, and becauseCLIPS programs can be easily embedded inside otherapplications. To make prototyping simple we defineda language that allows the definition of a network ofagents, and also an extension to CLIPS which allowsthe passing of messages. Examples can be found in[48].

Several experiments were performed to test the per-formance of the ACCS. These include experiments onfully connected networks with varying populations of

P1: VBI



Figure 5. Symbols used to represent agents and sensors.

agents, and non-fully connected networks with varyingpopulations and topologies. The discussion of each ex-periment contains a figure showing the topology of theexperiment along with a condensed version of each rulebase. The symbols in Fig. 5 are used to represent agentsand sensors.

Communication links between agents and sensorsare represented by lines connecting the different nodes.If a communication link is labeled with a value, thatvalue represents the cost of sending a message acrossthat link. If a link is not labeled, then the cost is assumedto be 0.0.

Since we concentrated on studying the experimentalperformance of our model, we did not use any real do-main knowledge or application. Our systems containedrandomly generated rule bases and sensors. The goalwas to run simulations of various agent configurationsand tasks, and this required domain-independent sim-ulations. A condensed version of the rule base is listedunder each agent. These rules show the relationship be-tween different classes of information. For example,the rule v5→ v4, means that facts in class v5 generatefacts in class v4. A single class is also listed undereach sensor. This shows the class of facts that a sensorwill generate when it is sampled. For each experiment,there is a class of facts that is considered a final con-clusion. Whenever a system generates a fact in thisspecial class, the system has successfully performedsome amount of work. The efficiency of a multiagentsystem is measured by determining the rate at which fi-nal conclusions are generated, and how many resourcesthe system consumed in order to generate those finalconclusions. An adaptive communications strategy iseffective if the efficiency of the multiagent/system im-proves over time.

5.2. Performance Measurements

Each agent process collects performance measure-ments as an experiment is performed. These measure-ments include the resources that an agent uses and many

of the state variables contained within an agent’s sys-tem equations. The resources monitored include thenumber of messages an agent sends to neighboring pro-cesses, the number of rules an agent fires, the numberof final conclusions an agent generates, and the totalcost associated with an agent using these resources.The state variables monitored include the agent’s workload (TI ), the probability that neighboring sensors aresampled (PS(s, c)), and the probability that facts ina specific class are transmitted to neighboring agents(PT (a, c)).

The collection of these measurements is coordinatedby the user interface process. The user is allowed to setthe length of time the experiment is to be run, and howoften the measurements are taken. The user interfaceprocess sends a control message to the agent processeseach sampling period. The agent processes respond tothis message by writing all their performance measure-ments to an output file. At the end of the experiment allof the output files are processed to generate agent leveland system level performance statistics. The followingis the list of statistics generated:

System Level Performance Statistics

Final Conclusions per Secondshows the rate at whichfinal conclusions are generated. It is computed bysumming the final conclusion generation rates of allthe agents in the system.

Rule Firings per Secondshows the rate at which rulesare fired in a multiagent system. It is computed bysumming the rule firing rates of all the agents in thesystem.

Messages per Secondshows the rate at which messagesare transmitted in a multiagent system. It is computedby summing the message transmission rates of all theagents in the system.

Cost per Secondis a measure of the resources the mul-tiagent system consumes in a time interval. Bothsending messages and firing rules consume re-sources. Communications links have an associatedvalue that represents the cost of sending a messageacross the link. Agents also have an associated valuethat represents the cost of firing a rule. This measure-ment shows the total cost of transmitting messagesand firing rules. It is computed by summing the costrates of all the agents in the system.

Rule Firings per Final Conclusionshows the num-ber of rules that are fired for each final conclusiongenerated. It measures how efficiently the agents use

P1: VBI



their rule firing resources to generate final conclu-sions. This statistic is significant because a multia-gent system can be analyzed to determine its optimalvalue.

Messages per Final Conclusionshows the number ofmessages that are transmitted for each final con-clusion generated. It measures how efficiently theagents use their communications resources to gen-erate final conclusions. This statistic is significantbecause a multiagent system can be analyzed to de-termine its optimal value.

Cost per Final Conclusionshows of the resources themultiagent system consumes in generating one finalconclusion. It is computed by dividing the Cost perSecond statistic by the Final Conclusions per Secondstatistic.

Agent Level Performance Statistics

Rule Firings per Secondshows the rate at which anagent is firing rules.

Messages per Secondshows the rate at which an agentis transmitting messages. These include fact mes-sages, feedback messages, and synchronization mes-sages.

Final Conclusions per Secondshows the rate at whichan agent is generating final conclusions.

Cost per Secondshows the rate at which an agent isconsuming resources.

Cost per Rule Firing shows how many resources anagent consumes to fire a rule.

Messages per Rule Firingstatistic shows how manymessages an agent transmits to fire a rule.

TI is the amount of time an agent is idle in a given timeperiod.

PS(s, c) is the probability that neighboring sensors willbe sampled.

PT (a, c) is the probability that facts in classc will betransmitted to agenta.

5.3. Experiments with Fully Connected Networks

The first four experiments involve fully connected net-works of agents with different populations. These pop-ulations range from two in the first experiment to fivein the fourth experiment. Even though the agents forma fully connected network, the sensors are not fullyconnected. Instead, a sensor is connected to one of theagents in the system. Each experiment also involves

different rule bases. This is done in an attempt to elimi-nate the dependence of the ACCS on certain rule-basedorganizations. The first four experiments make severalassumptions to prove that a simplified version of theadaptive communications strategy is effective:

1. The cost of transmitting a message across a link isassumed to be zero.

2. The cost of firing a rule in an agent is assumed tobe zero.

3. The agents are assumed to be fully connected.With this organization, the furthest a fact has tobe routed is through one agent node. Since agentsdon’t have to act as information routers, an agentcan be assumed to be completely selfish. Therefore,an agent’s selfishness factor is assumed to be 1.0.This eliminates the right factor of the class useful-ness equations.

With these assumptions, the system equations reduceto the following:

F(k)= F(k− 1)+ K A

(TI (k)− TE − KT

× TI (k)− TI (k− 1)

1T

)UA(a, c)= 1

NA(a, c)

NA(a,c)∑i=1

UAF(a, c, i )

US(s, c)= 1

NS(s, c)

NS(s,c)∑i=1

USF(s, c, i )

PS(s, c)=

1.0 for 1.00< F(k)US(s, c)

F(k)US(s, c) for 0.01≤ F(k)US(s, c)≤ 1.00

0.01 for F(k)US(s, c) < 0.01

PT (a, c)=

1.0 for 1.00< F ′(a, k)×UT (a, c)

F ′(a, k)UT (a, c) for 0.01≤ F ′(a, k)×UT (a, c)≤ 1.00

0.01 for F ′(a, k)UT (a, c)< 0.01

Each experiment was run for a period of 60 s, andperformance measurements were taken once a second.The table for each experiment shows how efficientlythe system performed. These statistics show the sys-tem’s average performance over the last twenty seconds

P1: VBI



of the experiment. These measurements were takenat the end of the experiment so the system’s perfor-mance is measured while it was in steady state. Thecolumn labeled “Initial” shows the performance of thesystem without an adaptive communications strategy.Instead, each agent in the system sends all new facts toall neighboring agents. The column labeled “ACCS”shows the performance of the adaptive communicationsstrategy. The system started with the same strategy asthe “Initial” column, but the adaptive communicationscontrol system was allowed to modify the strategy inan attempt to improve the system’s efficiency. Thelast column contains the optimal performance statis-tics which were determined by examining the entiremultiagent system: the minimal path from sensor inputto final conclusion generation was identified, and thenumber of messages passed and rules fired along thispath were counted. The three graphs in Figs. 6–8 showthe main performance statistics for the first experiment.

Figure 6. Final conclusions per second for Experiment 1.

Figure 7. Messages per second for Experiment 1.

Figure 8. Rule firings per second for Experiment 1.

Figure 9. Network topology of Experiment 1.




The network topology, rule bases, and system wideperformance statistics for the first four experiments areshown in Figs. 9–12 and Tables 1–4.

All of these tables show that the adaptive communi-cations strategy increased the rate at which final con-clusions were generated, decreased the message load,increased the rule firing rate, and almost reached theoptimal values for the resources consumed per final

P1: VBI



Table 1. Summary of results for Experiment 1.

Initial ACCS Optimal

Final conclusions/second 22.90 27.55 —

Messages/second 232.65 88.05 —

Rule firings/second 138.70 167.0 —

Messages/final conclusion 10.16 3.20 3

Rule firings/final conclusion 6.05 6.06 6






















conclusion generated. In fact, the difference betweenthe adaptive communications strategy and these op-timal values can be attributed to the overhead of theadaptive communications strategy and errors in the per-formance measurements.

Figure 13 shows the probability for Agent A to sam-ple sensor S during the third experiment. This graphhas a close correlation to the workloads tracked duringthe experiment: the sharp rise in the sensor samplingprobability corresponds to the point that Agent A had alarge amount of idle time. The internal control systemincreasedF(k), which in turn increasePS(s, c).

Figure 13. Sensor sampling probability for Agent A.

5.4. An Experiment with Distracting Rule Sets

This experiment uses a non-fully connected networkof agents, and a selfishness factor of 0.8. The meansthat an agent is mainly concerned with its own informa-tion needs, and a little concerned with the informationneeds of its neighbors. This experiment also containsa subset of rules that do not move the system towardsgenerating a final conclusion. These rules are placedin the system to measure the effects of distracting ruleson the performance of the system. The adaptive com-munications strategy improved the performance of thesystem across all the system wide performance mea-sures, but the results for this system are not as good asfor the fully connected networks. This is due to the factthat Agent B must route information between Agent Aand Agent C, and Agent B contains a set of distract-ing rules that degrade the system’s performance (seeFig. 14).

The only unusual statistic is the number of rulefirings per final conclusion while using the initialcommunications strategy (see Table 5). This value of









P1: VBI




approximately 5 is less than the ACCS value of 8. Theonly reasonable explanation, though it has not beenverified, is that Agent B fell behind in firing rules asit was overwhelmed with work to perform. The firstfew of its rules would get fired, and the facts generatedwould be sent to neighboring agents. However, AgentB could not finish with all the rule firings and messagetransmissions before more facts would be sent from theneighboring agents who had much smaller workloads.Agent A had to keep increasing the sensor samplingrate for S in order to keep Agent A and Agent C busy.This generated too much work for Agent B to perform.Agent B fell behind, and on average, never fired halfof its rules.

5.5. An Experiment with an Information Router

This experiments shows how the adaptive communi-cations strategy works with a hierarchical topology.Agent A does not have any rules, and is the only con-nection to the other agents in the system (see Fig. 15).Agent A is at the top of the hierarchy, and it mustassume the role of an information router and coordi-nate the activities of the remaining agents who are atthe bottom of the hierarchy. All the system parametersthat were experimentally determined from the previousexamples are used in this system. The initial commu-nications strategy performed very poorly in this exper-iment, and the adaptive communications strategy im-proved upon this initial strategy for all the system wideperformance statistics (see Table 6).

5.6. An Experiment with Resource Costs

This final experiment examines the performance of asystem in which there is a cost associated with trans-mitting messages. Here the rule bases are very simple,









and are purposely selected to see if the system can favorthe least expensive path. Notice that the path throughAgent C is much more expensive than the path throughAgent D. Ideally, Agent A should choose to send mostof its facts in class v2 to Agent D and not to Agent C(see Fig. 16).

Once again, the adaptive communications strategydid much better than the initial communications strat-egy, but it did not get close to the optimal values. Theworst statistic is the Cost per Final Conclusion (seeTable 7).

The reason for the poor performance of this mul-tiagent system can be seen in the probability that theclasses of facts are sent from Agent A. Figure 17 showsthe probability that facts are transmitted from Agent Ato Agent C. This communications link should be almost






Total cost/second 1045.98 439.33 —



Cost/final conclusion 92.56 21.17 0.3

P1: VBI



Figure 17. Experiment 7: probability of Agent A sending facts toAgent C.

Figure 18. Experiment 7: probability of Agent A sending facts toAgent D.

completely cut off due to the high cost of transmittingacross this link. Instead, Agent A has a relatively highprobability that facts in class v2 are sent to Agent C.Figure 18 shows the real problem: the probability thatdifferent classes are sent from Agent A to Agent D hasvery large oscillations. This system, even though itsuccessfully improved the performance, is out of con-trol. The current model for an adaptive communica-tions strategy does not handle cost information well,and as a result, the systems that include cost informa-tion may not achieve steady state. The basic problemis that the model divides the usefulness of a fact byits cost. When there are radically different cost valuesfor two different paths, the class usefulness values canchange by large amounts. This induces the oscillationswhose amplitudes are too large for the control system.These oscillations in the probabilities are the cause ofthe high cost in the system.

5.7. Summary of Results

The ACCS improves the efficiency of fully connectednetworks, and larger agent populations result in greaterimprovements in efficiency. This not only shows theeffectiveness of the adaptive communications strategy,but also the ineffectiveness of the initial communica-tions strategy. In general, the adaptive communicationsstrategy improved the performance of all the experi-ments run. The best improvement was seen for fullyconnected networks with no cost information. Whenthe assumption of a fully connected network of agentsis removed, the adaptive communications strategy isstill very good, but it falls short of the optimal val-ues. Finally, when cost information is introduced, theadaptive communications strategy still provides an im-provement over the initial communications strategy,but it is even farther from the optimal performance lev-els. The last three experiments show where the adap-tive communications strategy needs improvement. Itcannot reach optimal performance levels in systemsthat contain agents with distracting rule sets. It alsohas problems in systems where agents are required tospecialize as information routers. The adaptive com-munications strategy showed its worst performance insystems where cost information is added.

6. Conclusions

For an agent in a multiagent system to contribute tothe problem solving group, it must exhibit behaviorsthat help the group move towards the solution to agiven problem. The largest obstacle that an agent mustovercome is how it decides to communicate with otheragents in the system. The behaviors that provide anagent with the ability to learn how to communicatewith other agents were identified. From these behav-iors, a control system for an agent was derived. Thiscontrol system monitors the information entering andleaving an agent and determines what types of infor-mation are important to an agent. Each agent examinesits own information needs and the information needsof its neighbors to determine an agent’s own commu-nications strategy. When the agents work together tosolve a problem, the combined affect of their local com-munications strategies form a global communicationsstrategy that controls the flow of information through-out a multiagent system.

The theory for an adaptive communications stra-tegy was developed and applied to agents that use a

P1: VBI



rule-based system to perform their problem solvingactivities. The control system, which is the centralcomponent of an adaptive communications strategy,was refined to reflect the needs of rule-based systems.Then, a testbed was designed and built to perform ex-periments on this adaptive communications strategy.This testbed was used in various experiments and itgenerated performance measurements that showed theperformance of the adaptive communications strategyunder different conditions. The results from these ex-periments showed that an adaptive communicationsstrategy is very effective on fully connected networksof agents, greatly improves the performance of generaltopologies, but has problems with distracting rule sets,networks that contain nodes that are communicationbottlenecks, and networks that have costs associatedwith links.

Acknowledgments

This work was partially supported by a grant of theKansas Technology Enterprise Corporation through theCenter for Excellence in Computer-Aided Systems En-gineering.

References

1. W.A. Kornfeld, “ETHER: A parallel problem solving system,”in Proc. of IJCAI-79, 1979, pp. 490–492.

2. Alan H. Bond and L. Gasser (Eds.),Readings in DistributedArtificial Intelligence, Morgan Kaufmann: San Mateo, CA,pp. 3–35, 1988.

3. Ronald Fagin and Moshe Y. Vardi, “Knowledge and implicitknowledge in a distributed environment: Preliminary report,”in Proceedings of the Conference on the Theoretical Aspectsof Reasoning about Knowledge, edited by Joseph Y. Halpern,1986, pp. 187–206.

4. Jeffery S. Rosenschein, “Synchronization of multiagent plans,”in Proceedings of the National Conference of the American Asso-ciation for Artificial Intelligence, Pittsburgh, PA, 1982, pp. 115–119.

5. Reid G. Smith, “The contract net protocol: High level com-munication and control in a distributed problem solver,”IEEETransactions on Computers, vol. C-29, no. 12, pp. 1104–1113,1980.

6. Reid G. Smith and Randal Davis, “Frameworks for cooperationin distributed problem solving,”IEEE Transactions on Systems,Man and Cybernetics, vol. SMC-11, no. 1, pp. 61–70, 1981.

7. Edmund H. Durfee, V.R. Lesser, and Daniel D. Corkill, “Coher-ent cooperation among communicating problem solvers,”IEEETransactions on Computers, vol. C-36, no. 11, pp. 1275–1291,1987.

8. Edmund H. Durfee and V.R. Lesser, “Incremental planning tocontrol a time-constrained, blackboard-based problem solver,”

IEEE Transactions on Aerospace and Electronic Systems,vol. 24, no. 5, pp. 647–662, 1988.

9. Michael Georgeff, “Communication and interaction in multi-agent planning,” inProc. of the 8th International Joint Con-ference on Artificial Intelligence, Karlsruhe, Germany, 1983,pp. 125–129.

10. V.R. Lesser and Lee D. Erman, “Distributed interpretation:A model and experiment,”IEEE Transactions on Computers,vol. C-29, no. 12, pp. 1144–1163, 1980.

11. V.R. Lesser and Daniel D. Corkill, “Functionally accurate, co-operative distributed systems,”IEEE Transactions on Systems,Man and Cybernetics, vol. SMC-11, no. 1, pp. 81–96, 1981.

12. Edmund H. Durfee and V.R. Lesser, “Using partial global plansto coordinate distributed problem solvers,” inProc. of IJCAI-87,Milan, Italy, 1987, pp. 875–883.

13. Edmund H. Durfee and V.R. Lesser, “Negotiating task-ing decomposition and allocation using partial global plan-ning,” in Distributed Artificial Intelligence, edited byL. Gasser and M. Huhns, Morgan Kaufmann, pp. 229–243,1989.

14. Edmund H. Durfee and V.R. Lesser, “Partial global planning: Acoordination framework for distributed hypothesis formation,”IEEE Transactions on Systems, Man, and Cybernetics, vol. 21,no. 5, pp. 1167–1183, 1991.

15. K.S. Decker, “Distributed artificial intelligence testbeds,” inFoundations of Distributed Artificial Intelligence, edited byG. O’Hare and N. Jennings, Wiley Inter-Science, 1994.

16. S.E. Conry, R.A. Meyer, and R.P. Pope, “Mechanisms for assess-ing nonlocal impact of local decisions in distributed planning,”in Distributed Artificial Intelligence, edited by L. Gasser andM. Huhns, Morgan Kaufmann, pp. 245–257, 1989.

17. Robert J. Ensor and John D. Gabbe, “Transaction blackboards,”in Proc. of IJCAI-85, 1985, pp. 340–344.

18. Lee D. Erman, Frederick A. Hayes-Roth, V.R. Lesser, and D.Raj Reddy, “The hearsay-II speech-understanding system: Inte-grating knowledge to resolve uncertainty,”Computing Surveys,vol. 12, no. 2, pp. 213–253, 1980.

19. Barbara Hayes-Roth, “A blackboard architecture for control,”Artificial Intelligence, vol. 26, pp. 251–321, 1985.

20. Y. Shoham and M. Tennenholtz, “On the synthesis of usefulsocial laws for artificial agent societies (preliminary report),” inProc. of AAAI-92, 1992, pp. 704–709.

21. M. Fenster, S. Kraus, and J.S. Rosenschein, “Coordination with-out communication: Experimental validation of focal point tech-niques,” inProc. First Int. Conf. on Multiagent Systems, AAAIPress, 1995, pp. 102–108.

22. Y. Shoham and M. Tennenholtz, “On social laws for artificialagent societies: Off-line design,”Artificial Intelligence, vol. 73,no. 1–2, pp. 231–252, 1995.

23. Mark S. Fox, “An organizational view of distributed systems,”IEEE Transactions on Systems, Man, and Cybernetics, vol.SMC-11, no. 1, pp. 70–80, 1981.

24. R. Wesson, Fredrick Hayes-Roth, John W. Burge, CathleenStasz, and Carl A. Sunshine, “Network structures for dis-tributed situation assessment,”IEEE Transactions on Sys-tems, Man, and Cybernetics, vol. SMC-11, no. 1, pp. 5–23,1981.

25. Thomas W. Malone, “Modeling coordination in organizationsand markets,”Management Science, vol. 33, no. 10, pp. 1317–1332, 1987.

P1: VBI



26. Randal Steeb, Stephanie Cammarata, Fredrick A. Hayes-Roth,Perry W. Thorndyke, and Robert B. Wesson, “Architecturesfor distributed air-traffic control, “Technical Report” R-2728-ARPA, pp. 9–30, 1981.

27. S. Cammarata, D. McArthur, and R. Steeb, “Strategies of Coop-eration in Distributed Problem Solving,” inProc. of IJCAI-83,1983, pp. 767–770.

28. Michael J. Shaw and Andrew B. Whinston, “Learning andadaptation in distributed artificial intelligence systems,” inDistributed Artificial Intelligence, edited by L. Gasser andM.N. Huhns, Pitman Publishing/Morgan Kaufmann Publishers,vol. II, pp. 413–429, 1990.

29. P. Brazdil, M. Games, S. Sian, L. Torgo, and W. van de Velde,“Learning in distributed systems and multiagent environments,”in Proceedings of the European Working Session on Learning,Porto, Portugal, 1991, pp. 413–423.

30. P. Brazdil and S. Muggleton, “Learning to relate terms in amultiple agent environment,” inProceedings of the EuropeanWorking Session on Learning, Porto, Portugal, 1991, pp. 424–439.

31. S. Gallant, “Connectionist expert systems,”Communications ofACM, vol. 31, pp. 152–168, 1988.

32. Barbara Hayes-Roth, “An architecture for adaptive intelligentsystems,”Artificial Intelligence, vol. 72, nos. 1–2, pp. 329–365,1995.

33. K.S. Decker and V.R. Lesser, “Analyzing the need for meta-levelcommunication,” inProc. of IJCAI-94, 1994.

34. Daniel E. Neiman, David W. Hildum, and V.R. Lesser, “Exploit-ing meta-level information in a distributed scheduling system,”in Proc. of IJCAI-94, 1994.

35. Elise H. Turner, “Exploiting problem solving to select informa-tion to include dialogues between cooperating agents,” inProc.of Sixteenth Annual Conference of the Cognitive Science Society,1994, pp. 882–886.

36. K.S. Decker and V.R. Lesser, “Designing a family of coordina-tion algorithms,” inProc. First Int. Conf. on Multiagent Systems,AAAI Press, 1995, pp. 73–80.

37. M. Prasad, V. Nagendra, V.R. Lesser, and S. Lander, “Learningorganizational roles in a heterogeneous multi-agent system,” inAAAI Symposium on Adaptation, Coevolution and Learning inMultiagent Systems, AAAI Technical Report SS-96-01, pp. 73–77, 1996.

38. R. Whitehair and V.R. Lesser, “A framework for the analysisof sophisticated control in interpretation systems, “ComputerScience Technical Report” 93-53, University of Massachusetts,1993.

39. T. Ohkno, K. Hiraki, and Y. Anzai, “Reducing communica-tion load on contract net by case-based reasoning—Extensionwith directed contract and forgetting,” inProc. Second Int.Conf. on Multiagent Systems, AAAI Press, 1996, pp. 244–251.

40. R. Foisel, V. Chevrier, and J.-P. Haton, “Improving global co-herence by adaptive organization in a multi-agent system,” inProc. Second Int. Conf. on Multiagent Systems, AAAI Press,1996, p. 435.

41. D. Carmel and S. Markovitch, “Learning models of intelligentagents,” inProc. of AAAI-96, 1996, pp. 62–67.

42. H.H. Bui, D. Kieronska, and S. Venkatesh, “Learning otheragents’ preferences in multiagent negotiation,” inProc. of AAAI-96, 1996, pp. 114–119.

43. Walter Jacobs, “A structure for systems that plan abstractly,” inProceedings of the AFIPS Conference, Spring Joint ComputerConference, vol. 38, 1971, pp. 357–364.

44. C. Tsatsoulis and G. Yee, “Learning reliability models of otheragents in a multiagent system,”AAAI Workshop on IntelligentAdaptive Agents, AAAI Technical Report, 1996.

45. C.L. Forgy “RETE: A fast algorithm for the many pattern/manypattern object matching problem,”Artificial Intelligence, vol. 19,pp. 17–37, 1982.

46. Michael D. Kinney, “A multi-agent communications tool us-ing remote procedure calls,” CECASE-TR-8640-12, Center forExcellence in Computer Aided Systems Engineering, The Uni-versity of Kansas, 1992.

47. NASA COSMIC,CLIPS Advanced Programming Guide, Ver-sion 4.1, 1989.

48. Michael D. Kinney, “A testbed for an adaptive communicationsstrategy for multi-agent systems,” CECASE-TR-8640-27, Cen-ter for Excellence in Computer Aided Systems Engineering, TheUniversity of Kansas, 1992.

49. Y. Shoham, “Agent-oriented programming,”Artificial Intelli-gence, vol. 60, pp. 51–92, 1993.

Michael Kinney is a Consulting Engineer with DeskStation Tech-nology. He is a member of a small engineering team that designsand manufactures high performance workstations using MIPS andALPHA processors. He received his Bachelor’s degree in ComputerEngineering from the University of Kansas in 1990, and his Mas-ter’s degree in Electrical Engineering from the University of Kansasin 1993, where he pursued research in Multiagent Systems, ObjectOriented Systems, and Machine Learning.

Costas Tsatsoulisis an Associate Professor of Electrical Engineer-ing and Computer Science at the University of Kansas. His researchinterests include artificial intelligence, multiagent systems, data min-ing, case based reasoning, and computer vision. He is a memberof ACM, AAAI, IEEE-Computer Society, and a Senior Member ofIEEE. He is the editor ofAnalysis of SAR Data of the PolarOceans,published in 1998 by Springer-Verlag. Dr. Tsatsoulis received thePh.D. degree in 1987 from Purdue University.

Learning Communication Strategies in Multiagent Systems

Documents

Transcript of Learning Communication Strategies in Multiagent Systems