UNIT I – INTRODUCTION TO AI AND PRODUCTION SYSTEMS
What is AI?
Artificial intelligence (AI) is the intelligence exhibited by machines or software. It is also the name of the academic field that studies how to create computers and computer software capable of intelligent behavior. The central problems (or goals) of AI research include reasoning, knowledge, planning, learning, natural language processing (communication), perception and the ability to move and manipulate objects.
Example
Cleverbot
Cleverbot is a chatterbot that is modeled after human behavior and is able to hold a conversation. It does so by remembering words from conversations. The responses are not programmed. Instead, it finds keywords or phrases that match the input and searches through its saved conversations to find how it should respond.
MySong
MySong is an application that helps people who have no experience in writing songs, or who cannot play any instrument, to create original music by themselves. It automatically chooses chords to accompany a vocal melody that has been sung into a microphone. MySong can also help songwriters record their new ideas and melodies wherever and whenever they occur. However, MySong is not a professional application for producing or editing a song, so other tools or software are needed to develop a song fully.
Artificial Intelligence in Video Games
The AI components used in video games are often a slimmed-down version of a true AI implementation, as the scope of a video game is often limited (e.g. by console memory capacity). The most innovative use of AI is found on personal computers, whose memory can be expanded beyond the capacity of modern gaming consoles. Some examples of AI components typically used in video games are path finding, adaptiveness (learning), perception, and planning (decision
Fatima Michael College of Engineering & Technology
making). Present-day video games offer a variety of "worlds" in which AI concepts can be tested, such as static or dynamic environments, deterministic or non-deterministic transitions, and fully or partially known game worlds. The real-time performance constraint on AI processing in video games must also be considered; it is another reason why video games may implement a "simple" AI, e.g. a finite state machine, which may not even be considered artificial intelligence at heart.
Artificial Intelligence in Mobile Systems
As smartphones have come into our daily lives, there is a need to make these devices even smarter. Researchers are trying to apply traditional AI techniques in the mobile environment. These techniques, including speech recognition, machine learning, classification and natural language processing, enable more powerful applications, such as Siri on iOS and Kinect from Microsoft. AI on mobile devices introduces new challenges, such as limited computational resources and energy consumption.
History of AI
Cybernetics and early neural networks
Turing's test
Game AI
Symbolic reasoning and the Logic Theorist AI
Problem Formulation
States: possible world states
Accessibility: the agent can determine via its sensors in which state it is
Consequences of actions: the agent knows the results of its actions
Levels: problems and actions can be specified at various levels
Constraints: conditions that influence the problem-solving process
Performance: measures to be applied
Costs: utilization of resources
Problem Types
single-state problem
multiple-state problem
contingency problem
exploration problem
Single-State Problem
Exact prediction is possible.
State: known exactly after any sequence of actions
Accessibility of the world: all essential information can be obtained through sensors
Consequences of actions: known to the agent
Goal: for each known initial state, there is a unique goal state that is guaranteed to be reachable via an action sequence
This is the simplest case, but severely restricted.
Multiple-State Problem
Semi-exact prediction is possible.
State: not known exactly, but limited to a set of possible states after each action
Accessibility of the world: not all essential information can be obtained through sensors; reasoning can be used to determine the set of possible states
Consequences of actions: not always or completely known to the agent; actions or the environment might exhibit randomness
Goal: due to ignorance, there may be no fixed action sequence that leads to the goal
This case is less restricted, but more complex.
Contingency Problem
Exact prediction is impossible.
State: unknown in advance; may depend on the outcome of actions and changes in the environment
Accessibility of the world: some essential information may be obtained through sensors only at execution time
Consequences of actions: may not be known at planning time
Goal: instead of single action sequences, there are trees of actions
Contingency: a branching point in the tree of actions
Agent design: different from the previous two cases, because the agent must act on incomplete plans
Exploration Problem
Effects of actions are unknown.
State: the set of possible states may be unknown
Accessibility of the world: some essential information may be obtained through sensors only at execution time
Consequences of actions: may not be known at planning time
Goal: cannot be completely formulated in advance because states and consequences may not be known at planning time
Discovery: what states exist
Experimentation: what are the outcomes of actions
Learning: remember and evaluate experiments
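The reasoning about "the set of possible states" in the multiple-state case can be sketched in a few lines. The example below is only an illustration under assumed details: a hypothetical two-cell vacuum world (cells A and B, actions Left, Right and Suck) that does not appear in these notes, with deterministic actions and no sensing. The agent tracks the set of states it could be in and shrinks that set by acting.

```python
from itertools import product

# Hypothetical two-cell vacuum world (an assumption for illustration):
# a state is (robot_location, dirt_in_A, dirt_in_B).
ALL_STATES = set(product("AB", (True, False), (True, False)))

def result(state, action):
    """Deterministic outcome of one action in one state."""
    loc, dirt_a, dirt_b = state
    if action == "Left":
        return ("A", dirt_a, dirt_b)
    if action == "Right":
        return ("B", dirt_a, dirt_b)
    if action == "Suck":
        # Dirt disappears only in the cell where the robot is standing.
        return (loc, dirt_a and loc != "A", dirt_b and loc != "B")
    raise ValueError(action)

def update(belief, action):
    """Set of states the agent could be in after the action, with no sensing."""
    return {result(s, action) for s in belief}

belief = ALL_STATES  # the agent starts out knowing nothing
for action in ["Right", "Suck", "Left", "Suck"]:
    belief = update(belief, action)
    print(action, len(belief))
```

Each action maps the whole set of possible states to a new set, so the agent can reach a state it knows exactly even though it never observed anything.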
MATCHING
Clever search involves choosing from among the rules that can be applied at a particular point, the ones that are most likely to lead to a solution. We need to extract from the entire collection of rules, those that can be applied at a given point. To do so requires some kind of matching between the current state and the preconditions of the rules.
How should this be done?
One way to select applicable rules is to do a simple search through all the rules, comparing each rule's preconditions to the current state and extracting all the ones that match. Doing better than this requires some indexing of the rules. There are two problems with the simple solution:
A. Interesting problems require a large number of rules. Scanning through all of them at every step would be hopelessly inefficient.
B. It is not always immediately obvious whether a rule’s preconditions are satisfied by a particular state.
Sometimes, instead of searching through the rules, we can use the current state as an index into the rules and select the matching ones immediately. In spite of its limitations, indexing in some form is very important for the efficient operation of rule-based systems.
A more complex matching is required when the preconditions of a rule specify required properties that are not stated explicitly in the description of the current state. In this case, a separate set of rules must be used to describe how some properties can be inferred from others. An even more complex matching process is required if rules should be applied when their preconditions only approximately match the current situation. This is often the case in situations involving physical descriptions of the world.
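The difference between scanning all the rules and using the current state as an index can be sketched as follows. This is a minimal illustration, not a method taken from these notes: the rule names and facts are hypothetical, a rule is assumed to be a name plus a set of precondition facts, and a state is assumed to be a set of facts.

```python
from collections import defaultdict

# Hypothetical rules (assumption): each has a name and a set of precondition facts.
RULES = [
    ("boil_water", {"have_kettle", "have_water"}),
    ("make_tea", {"have_tea_bag", "water_boiled"}),
    ("make_toast", {"have_bread", "have_toaster"}),
]

def match_by_scan(rules, state):
    """Simple search: compare every rule's preconditions with the current state."""
    return [name for name, pre in rules if pre <= state]

def build_index(rules):
    """Map each fact to the rules that mention it in their preconditions."""
    index = defaultdict(list)
    for name, pre in rules:
        for fact in pre:
            index[fact].append((name, pre))
    return index

def match_by_index(index, state):
    """Use the facts of the state as keys, then check only the candidate rules."""
    matched = set()
    for fact in state:
        for name, pre in index.get(fact, []):
            if pre <= state:
                matched.add(name)
    return sorted(matched)

state = {"have_kettle", "have_water", "have_bread"}
print(match_by_scan(RULES, state))
print(match_by_index(build_index(RULES), state))
```

With the index, rules that share no fact with the current state are never examined. A rule with an empty precondition set would not be found by this particular index, which is one example of the limitations of indexing mentioned above.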
LEARNING
Learning is the improvement of performance with experience over time.
The learning element is the portion of a learning AI system that decides how to modify the performance element and implements those modifications.
We all learn new knowledge through different methods, depending on the type of material to be learned, the amount of relevant knowledge we already possess, and the environment in which the learning takes place. There are five methods of learning. They are:
1. Memorization (rote learning)
2. Direct instruction (by being told)
3. Analogy
4. Induction
5. Deduction
Learning by memorization is the simplest form of learning. It requires the least amount of inference and is accomplished by simply copying the knowledge, in the same form in which it will be used, directly into the knowledge base.
Example:- Memorizing multiplication tables, formulae, etc.
Direct instruction is a slightly more complex form of learning. It requires more inference than rote learning, since the knowledge must be transformed into an operational form before being integrated into the knowledge base. We use this type of learning when a teacher presents a number of facts directly to us in a well-organized manner.
Analogical learning is the process of learning a new concept or solution through the use of similar known concepts or solutions. We use this type of learning when solving problems on an exam, where previously learned examples serve as a guide. We make frequent use of analogical learning. This form of learning requires still more inferring than either of the previous forms, since difficult transformations must be made between the known and unknown situations.
Learning by induction is also used frequently by humans. It is a powerful form of learning which, like analogical learning, requires more inferring than the first two methods. This learning requires the use of inductive inference, a form of invalid but useful inference. We use inductive learning when we formulate a general concept after seeing a number of instances or examples of the concept. For example, we learn the concepts of color or sweet taste after experiencing the sensations associated with several examples of colored objects or sweet foods.
Deductive learning is accomplished through a sequence of deductive inference steps using known facts. From the known facts, new facts or relationships are logically derived. Deductive learning usually requires more inference than the other methods.
Review Questions:-
1. What is perception?
2. How do we overcome the perceptual problems?
3. Explain in detail the constraint satisfaction Waltz algorithm.
4. What is learning?
5. What is the learning element?
6. List and explain the methods of learning.
Types of learning:- A classification or taxonomy of learning types serves as a guide in studying or comparing the differences among them. One can develop learning taxonomies based on the type of knowledge representation used (predicate calculus, rules, frames), the type of knowledge learned (concepts, game playing, problem solving), or the area of application (medical diagnosis, scheduling, prediction and so on).
The classification presented here is intuitively more appealing and has become popular among machine learning researchers. It is independent of the knowledge domain and of the representation scheme used. It is based on the type of inference strategy employed or the methods used in the learning process. The five different learning methods under this taxonomy are:
Memorization (rote learning)
Direct instruction (by being told)
Analogy
Induction
Deduction
Learning by memorization is the simplest form of learning. It requires the least amount of inference and is accomplished by simply copying the knowledge, in the same form in which it will be used, directly into the knowledge base. We use this type of learning when we memorize multiplication tables, for example.
A slightly more complex form of learning is by direct instruction. This type of learning requires more understanding and inference than rote learning, since the knowledge must be transformed into an operational form before being integrated into the knowledge base. We use this type of learning when a teacher presents a number of facts directly to us in a well-organized manner.
The third type listed, analogical learning, is the process of learning a new concept or solution through the use of similar known concepts or solutions. We use this type of learning when solving problems on an examination where previously learned examples serve as a guide, or when we learn to drive a truck using our knowledge of car driving. We make frequent use of analogical learning. This form of learning requires still more inferring than either of the previous forms, since difficult transformations must be made between the known and unknown situations. This is a kind of application of knowledge in a new situation.
The fourth type of learning, induction, is also one that is used frequently by humans. It is a powerful form of learning which, like analogical learning, requires more inferring than the first two methods. This form of learning requires the use of inductive inference, a form of invalid but useful inference. We use inductive learning when we formulate a general concept after seeing a number of instances or examples of the concept. For example, we learn the concepts of color or sweet taste after experiencing the sensations associated with several examples of colored objects or sweet foods.
The final type of acquisition is deductive learning. It is accomplished through a sequence of deductive inference steps using known facts. From the known facts, new facts or relationships are logically derived. Deductive learning usually requires more inference than the other methods. The inference method used is, of course, deductive, which is a valid form of inference.
In addition to the above classification, we will sometimes refer to learning methods as either weak methods or knowledge-rich methods. Weak methods are general-purpose methods in which little or no initial knowledge is available. These methods are more mechanical than the classical AI knowledge-rich methods. They often rely on a form of heuristic search in the learning process.
Heuristic Search
All of the search methods in the preceding section are uninformed in that they did not take into account the goal. They do not use any information about where they are trying to get to unless they happen to stumble on a goal. One form of heuristic information about which nodes seem the most promising is a heuristic function h(n), which takes a node n and returns a non-negative real number that is an estimate of the path cost from node n to a goal node. The function h(n) is an underestimate if h(n) is less than or equal to the actual cost of a lowest-cost path from node n to a goal.
The heuristic function is a way to inform the search about the direction to a goal. It provides an informed way to guess which neighbor of a node will lead to a goal.
There is nothing magical about a heuristic function. It must use only information that can be readily obtained about a node. Typically a trade-off exists between the amount of work it takes to derive a heuristic value for a node and how accurately the heuristic value of a node measures the actual path cost from the node to a goal.
A standard way to derive a heuristic function is to solve a simpler problem and to use the actual cost in the simplified problem as the heuristic function of the original problem.
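As a small illustration of deriving a heuristic from a simpler problem, the sketch below uses a hypothetical 4x4 grid with walls (not the delivery robot domain of these notes). The simplified problem ignores the walls, so its exact cost is the Manhattan distance, and the code checks that this value never exceeds the true lowest cost computed by breadth-first search. The grid, the goal position and the unit move cost are all assumptions made for the example.

```python
from collections import deque

# Hypothetical 4x4 grid (assumption): '#' is a wall, '.' is free; each move costs 1.
GRID = [
    "....",
    ".##.",
    ".#..",
    "....",
]
GOAL = (2, 2)  # (row, column)

def h(node):
    """Exact cost in the simplified problem with no walls (Manhattan distance)."""
    return abs(node[0] - GOAL[0]) + abs(node[1] - GOAL[1])

def true_costs():
    """Actual lowest cost from every reachable cell to GOAL, by breadth-first search."""
    dist = {GOAL: 0}
    queue = deque([GOAL])
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < 4 and 0 <= nc < 4
            if inside and GRID[nr][nc] != "#" and (nr, nc) not in dist:
                dist[(nr, nc)] = dist[(r, c)] + 1
                queue.append((nr, nc))
    return dist

costs = true_costs()
is_underestimate = all(h(n) <= cost for n, cost in costs.items())
print(h((0, 1)), costs[(0, 1)], is_underestimate)
```

Computing h takes constant time per node, while the true cost needs a full search, which is the trade-off between effort and accuracy described above.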
The straight-line distance in the world between the node and the goal position can be used as the heuristic function. The examples that follow assume the following heuristic function:
h(mail) = 26    h(ts) = 23      h(o103) = 21
h(o109) = 24    h(o111) = 27    h(o119) = 11
h(o123) = 4     h(o125) = 6     h(r123) = 0
h(b1) = 13      h(b2) = 15      h(b3) = 17
h(b4) = 18      h(c1) = 6       h(c2) = 10
h(c3) = 12      h(storage) = 12
This h function is an underestimate because the h value is less than or equal to the exact cost of a lowest-cost path from the node to a goal. It is the exact cost for node o123. It is very much an underestimate for node b1, which seems to be close, but there is only a long route to the goal. It is very misleading for c1, which also seems close to the goal, but no path exists from that node to the goal.
Consider a delivery robot domain where the state space includes the parcels to be delivered. Suppose the cost function is the total distance traveled by the robot to deliver all of the parcels. One possible heuristic function is the largest distance of a parcel from its destination. If the robot could only carry one parcel, a possible heuristic function is the sum of the distances that the parcels must be carried. If the robot could carry multiple parcels at once, this may not be an underestimate of the actual cost.
The h function can be extended to be applicable to (non-empty) paths. The heuristic value of a path is the heuristic value of the node at the end of the path. That is:
h(⟨n0,...,nk⟩) = h(nk)
A simple use of a heuristic function is to order the neighbors that are added to the stack representing the frontier in depth-first search. The neighbors can be added to the frontier so that the best neighbor is selected first. This is known as heuristic depth-first search. This search chooses the locally best path, but it explores all paths from the selected path before it selects another path. Although it is often used, it suffers from the problems of depth-first search.
Another way to use a heuristic function is to always select a path on the frontier with the lowest heuristic value. This is called best-first search. It usually does not work very well; it can follow paths that look promising because they are close to the goal, but the costs of the paths may keep increasing.
Figure 3.8: A graph that is bad for best-first search
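Best-first search as described above can be sketched with a priority queue ordered by the heuristic value of the last node of each path. The graph and the heuristic values below are hypothetical (they are not the graph of Figure 3.8); they were chosen so that h is an underestimate everywhere, yet the path that looks closest to the goal is the expensive one.

```python
import heapq

# Hypothetical graph (assumption): node -> list of (neighbor, arc cost).
GRAPH = {
    "s": [("a", 1), ("b", 1)],
    "a": [("g", 10)],
    "b": [("c", 1)],
    "c": [("g", 1)],
    "g": [],
}
# Hypothetical heuristic values; each is an underestimate of the true cost to "g".
H = {"s": 3, "a": 1, "b": 2, "c": 1, "g": 0}

def best_first_search(start, goal):
    """Always expand the frontier path whose end node has the lowest h value."""
    frontier = [(H[start], [start], 0)]
    while frontier:
        _, path, cost = heapq.heappop(frontier)
        node = path[-1]
        if node == goal:
            return path, cost
        for neighbor, arc_cost in GRAPH[node]:
            if neighbor not in path:  # simple cycle check
                heapq.heappush(frontier, (H[neighbor], path + [neighbor], cost + arc_cost))
    return None

path, cost = best_first_search("s", "g")
print(path, cost)
```

On this graph the search returns the path through a with cost 11, although the path through b and c costs only 3, because the path cost so far plays no part in choosing which path to expand.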
PROBLEM CHARACTERISTICS
Heuristic search is a very general method applicable to a large class of problems. It includes a variety of techniques. In order to choose an appropriate method, it is necessary to analyze the problem with respect to the following considerations.
1. Is the problem decomposable?
A very large and composite problem can be solved more easily if it can be broken into smaller problems to which the same approach is applied recursively. Suppose we want to solve
Ex:- ∫ (x² + 3x + sin²x cos²x) dx
This can be done by breaking it into three smaller problems and solving each by applying specific rules. Adding the results gives the complete solution.
2. Can solution steps be ignored or undone?
Problems fall into three classes: ignorable, recoverable and irrecoverable. This classification refers to the steps of the solution to a problem. Consider theorem proving. We may proceed by first proving a lemma and later find that it is of no help. We can still proceed further, since nothing is lost by this redundant step. This is an example of ignorable solution steps.
Now consider the 8-puzzle, in which tiles in a tray must be arranged in a specified order. While moving from the start state towards the goal state, we may make some stupid move. But we may backtrack and undo the unwanted move. This only involves additional steps, and the solution steps are recoverable.
Lastly consider the game of chess. If a wrong move is made, it can neither be ignored nor be recovered. The thing to do is to make the best use of the current situation and proceed. This is an example of irrecoverable solution steps.
1. Ignorable problems, in which solution steps can be ignored. Ex:- theorem proving
2. Recoverable problems, in which solution steps can be undone. Ex:- 8 puzzle
3. Irrecoverable problems, in which solution steps can't be undone. Ex:- chess
A knowledge of these classes will help in determining the control structure.
3. Is the universe predictable?
Problems can be classified into those with certain outcomes (the eight puzzle and water jug problems) and those with uncertain outcomes (playing cards). In certain-outcome problems, planning can be used to generate a sequence of operators that is guaranteed to lead to a solution. Planning helps to avoid unwanted solution steps. For uncertain-outcome problems, planning can at best generate a sequence of operators that has a good probability of leading to a solution. Uncertain-outcome problems do not guarantee a solution, and planning for them is often very expensive, since the number of solution paths to be explored increases exponentially with the number of points at which the outcome cannot be predicted. Thus one of the hardest types of problems to solve is the irrecoverable, uncertain-outcome problem (Ex:- playing cards).
4. Is a good solution absolute or relative? (Is the solution a state or a path?)
There are two categories of problems. In one, like the water jug and 8-puzzle problems, we are satisfied with any solution, regardless of the solution path taken. In the other category, not just any solution is acceptable; we want the best, as in the traveling salesman problem, where the solution is the shortest path. In any-path problems, we obtain a solution by heuristic methods and do not explore alternatives. For best-path problems, all possible paths are explored using an exhaustive search until the best path is obtained.
5. Is the knowledge base consistent?
In some problems the knowledge base is consistent and in some it is not. For example, consider the case when a Boolean expression is evaluated. The knowledge base contains theorems and laws of Boolean algebra, which are always true. In contrast, consider a knowledge base that contains facts about production and cost. These keep varying with time. Hence many reasoning schemes that work well in consistent domains are not appropriate in inconsistent domains.
Ex:- Boolean expression evaluation.
6. What is the role of knowledge?
Even if one had unlimited computing power, the size of the knowledge base available for solving the problem does matter in arriving at a good solution. Take for example the game of chess: just the rules for determining legal moves and some simple control mechanism are sufficient to arrive at a solution. But additional knowledge about good strategy and tactics could help to constrain the search and speed up the execution of the program. The solution would then be realistic. Now consider the case of predicting the political trend. This would require an enormous amount of knowledge even to be able to recognize a solution, let alone the best one.
Ex:- 1. Playing chess 2. Newspaper understanding
7. Does the task require interaction with a person?
Problems can again be categorized under two heads.
1. Solitary, in which the computer is given a problem description and produces an answer, with no intermediate communication and with no demand for an explanation of the reasoning process. Simple theorem proving falls under this category: given the basic rules and laws, the theorem can be proved, if a proof exists. Ex:- theorem proving (give basic rules and laws to the computer)
2. Conversational, in which there is intermediate communication between a person and the computer, either to provide additional assistance to the computer or to provide additional information to the user, or both. Problems such as medical diagnosis fall under this category, where people will be unwilling to accept the verdict of the program if they cannot follow its reasoning. Ex:- problems such as medical diagnosis.
8. Problem classification
Actual problems can be examined from the point of view of all these characteristics. In a classification problem, the task is to examine an input and decide which of a set of known classes it belongs to. Ex:- problems such as medical diagnosis, engineering design.
FORMALIZING GRAPH SEARCHING
A directed graph consists of
a set N of nodes and a set A of ordered pairs of nodes called arcs.
In this definition, a node can be anything. All this definition does is constrain arcs to be ordered pairs of nodes. There can be infinitely many nodes and arcs. We do not assume that the graph is represented explicitly; we require only a procedure that can generate nodes and arcs as needed.
The arc ⟨n1,n2⟩ is an outgoing arc from n1 and an incoming arc to n2.
A node n2 is a neighbor of n1 if there is an arc from n1 to n2; that is, if ⟨n1,n2⟩∈A. Note that being a neighbor does not imply symmetry; just because n2 is a neighbor of n1 does not mean that n1 is necessarily a neighbor of n2. Arcs may be labeled, for example, with the action that will take the agent from one state to another.
A path from node s to node g is a sequence of nodes ⟨n0, n1,..., nk⟩ such that s=n0, g=nk, and ⟨ni-1,ni⟩∈A; that is, there is an arc from ni-1 to ni for each i. Sometimes it is useful to view a path as the sequence of arcs, ⟨n0,n1⟩, ⟨n1,n2⟩,..., ⟨nk-1,nk⟩, or a sequence of labels of these arcs.
A cycle is a nonempty path such that the end node is the same as the start node - that is, a cycle is a path ⟨n0, n1,..., nk⟩ such that n0=nk and k≠0. A directed graph without any cycles is called a directed acyclic graph (DAG). This should probably be an acyclic directed graph, because it is a directed graph that happens to be acyclic, not an acyclic graph that happens to be directed, but DAG sounds better than ADG!
A tree is a DAG where there is one node with no incoming arcs and every other node has exactly one incoming arc. The node with no incoming arcs is called the root of the tree and nodes with no outgoing arcs are called leaves.
To encode problems as graphs, one set of nodes is referred to as the start nodes and another set is called the goal nodes. A solution is a path from a start node to a goal node.
Sometimes there is a cost - a positive number - associated with arcs. We write the cost of arc ⟨ni,nj⟩ as cost(⟨ni,nj⟩). The costs of arcs induce a cost of paths.
Given a path p = ⟨n0, n1,..., nk⟩, the cost of path p is the sum of the costs of the arcs in the path:
cost(p) = cost(⟨n0,n1⟩) + ...+ cost(⟨nk-1,nk⟩)
An optimal solution is one of the least-cost solutions; that is, it is a path p from a start node to a goal node such that there is no path p' from a start node to a goal node where cost(p')<cost(p).
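The definitions above translate almost directly into code. The following sketch uses a small hypothetical graph (not the delivery robot graph of Figure 3.2): arcs are ordered pairs with positive costs, neighbors are read off the arcs, the cost of a path is the sum of its arc costs, and an optimal solution is found by brute-force enumeration, which is only reasonable for a graph this small.

```python
# Hypothetical directed graph (assumption): arcs are ordered pairs mapped to positive costs.
ARC_COST = {
    ("s", "a"): 2,
    ("s", "b"): 5,
    ("a", "b"): 1,
    ("a", "g"): 6,
    ("b", "g"): 2,
}

def neighbors(n):
    """Nodes n2 such that the arc (n, n2) is in A."""
    return [n2 for (n1, n2) in ARC_COST if n1 == n]

def path_cost(path):
    """Sum of the arc costs along a path given as a sequence of nodes."""
    return sum(ARC_COST[(path[i - 1], path[i])] for i in range(1, len(path)))

def all_paths(start, goal, path=None):
    """Enumerate every cycle-free path from start to goal."""
    path = [start] if path is None else path
    if path[-1] == goal:
        yield path
        return
    for n in neighbors(path[-1]):
        if n not in path:
            yield from all_paths(start, goal, path + [n])

def optimal_solution(start, goal):
    """A solution p such that no other solution p' has cost(p') < cost(p)."""
    return min(all_paths(start, goal), key=path_cost)

best = optimal_solution("s", "g")
print(best, path_cost(best))
```

Note that b is a neighbor of a in this graph while a is not a neighbor of b, which matches the remark that the neighbor relation is not symmetric.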
Figure 3.2: A graph with arc costs for the delivery robot domain
Production System
Types of Production Systems.
The knowledge representation formalism consists of a collection of condition-action rules (production rules or operators), a database which is modified in accordance with the rules, and a production system interpreter which controls the operation of the rules, i.e. the 'control mechanism' of the production system, which determines the order in which production rules are fired. A system that uses this form of knowledge representation is called a production system.
A production system consists of rules and facts. Knowledge is encoded in a declarative form which comprises a set of rules of the form
Situation → Action
that is, SITUATION implies ACTION.
Example:- IF the initial state is a goal state THEN quit.
The major components of an AI production system are
i. A global database
ii. A set of production rules
iii. A control system
The global database is the central data structure used by an AI production system. The production rules operate on the global database. Each rule has a precondition that is either satisfied or not by the database. If the precondition is satisfied, the rule can be applied. Application of the rule changes the database. The control system chooses which applicable rule should be applied and ceases computation when a termination condition on the database is satisfied. If several rules could fire at the same time, the control system resolves the conflict.
Four classes of production systems:-
1. A monotonic production system
2. A non-monotonic production system
3. A partially commutative production system
4. A commutative production system
Advantages of production systems:-
1. Production systems provide an excellent tool for structuring AI programs.
2. Production systems are highly modular because the individual rules can be added, removed or modified independently.
3. The production rules are expressed in a natural form, so the statements contained in the knowledge base should read like a recording of an expert thinking out loud.
Disadvantages of production systems:-
One important disadvantage is that it may be very difficult to analyse the flow of control within a production system, because the individual rules don't call each other.
Production systems describe the operations that can be performed in a search for a solution to the problem. They can be classified as follows.
Monotonic production system:- A system in which the application of a rule never prevents the later application of another rule that could also have been applied at the time the first rule was selected. A non-monotonic production system is one in which this is not true.
Partially commutative production system:- A production system in which, if the application of a particular sequence of rules transforms state X into state Y, then any allowable permutation of those rules also transforms state X into state Y.
Commutative production system:- A production system that is both monotonic and partially commutative.
Theorem proving falls under monotonic, partially commutative systems. Blocks world and 8-puzzle problems come under non-monotonic, partially commutative systems. Chemical analysis and synthesis come under monotonic, not partially commutative systems. Playing the game of bridge comes under non-monotonic, not partially commutative systems.
For any problem, several production systems exist. Some will be more efficient than others. Though it may seem that there is no relationship between kinds of problems and kinds of production systems, in practice there is a definite relationship.
Partially commutative, monotonic production systems are useful for solving ignorable problems. These systems are important from an implementation standpoint because they can be implemented without the ability to backtrack to previous states when it is discovered that an incorrect path was followed. Such systems increase efficiency, since it is not necessary to keep track of the changes made in the search process.
Non-monotonic, partially commutative systems are useful for problems in which changes occur but can be reversed and in which the order of operations is not critical (ex: 8-puzzle problem).
Production systems that are not partially commutative are useful for many problems in which irreversible changes occur, such as chemical analysis. When dealing with such systems, the order in which operations are performed is very important, and hence correct decisions have to be made the first time.
Control or Search Strategy:
Selecting rules, and keeping track of the sequences of rules that have already been tried and the states produced by them.
The goal state provides a basis for terminating the problem-solving task.
Recognize-act control cycle:
1. PATTERN MATCHING STAGE: Execution of a rule requires a match between the preconditions of the rule and the content of the working memory. When a match is found, the rule is applicable. Several rules may be applicable at once.
2. CONFLICT RESOLUTION (SELECTION STRATEGY) STAGE: Selecting one rule to execute.
3. ACTION STAGE: Applying the action part of the rule changes the content of the workspace, which produces new patterns and new matches, and hence a new set of rules eligible for execution.
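The three stages of the recognize-act cycle can be sketched as a small loop in C. This is a minimal illustration under stated assumptions, not a full production system: facts are assumed to be bits in an unsigned working memory, rules only add facts (so the system is monotonic), and conflict resolution simply picks the first applicable rule. The names Rule and recognize_act are hypothetical.

```c
#include <stddef.h>

/* A rule is applicable when all of its condition facts are in working memory. */
typedef struct {
    unsigned conditions;  /* bitmask of facts the rule requires */
    unsigned additions;   /* bitmask of facts the rule adds when it fires */
} Rule;

/* Runs the recognize-act cycle until every goal fact is in working memory.
 * Returns the number of rule firings, or -1 if no applicable rule remains
 * before the goal is reached. */
int recognize_act(const Rule *rules, size_t n_rules,
                  unsigned *memory, unsigned goal)
{
    int fired = 0;
    while ((*memory & goal) != goal) {
        size_t chosen = n_rules;
        size_t i;
        /* Stages 1 and 2: match every rule against working memory and
         * select the first one that matches and would add a new fact. */
        for (i = 0; i < n_rules; i++) {
            int matches = (rules[i].conditions & *memory) == rules[i].conditions;
            int adds_new = (rules[i].additions & ~*memory) != 0;
            if (matches && adds_new) {
                chosen = i;
                break;
            }
        }
        if (chosen == n_rules)
            return -1;  /* nothing applicable: the search is stuck */
        /* Stage 3: act, which changes working memory and enables new matches. */
        *memory |= rules[chosen].additions;
        fired++;
    }
    return fired;
}
```

For example, with facts A = 1, B = 2, C = 4, D = 8 and the rules "A and B give C" and "C gives D", starting from a memory holding A and B the loop fires two rules before D appears in working memory.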
Search Strategies
Uninformed search strategies have no additional information about states beyond that provided in the problem definition.
Strategies that know whether one non-goal state is "more promising" than another are called informed search or heuristic search strategies.
There are five uninformed search strategies as given below.
o Breadth-first search
o Uniform-cost search
o Depth-first search
o Depth-limited search
o Iterative deepening search
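As an illustration of the first strategy in the list, here is a hedged sketch of breadth-first search in C applied to the water jug problem. The jug capacities (4 and 3) and the goal (2 units in the first jug) are assumptions chosen for the example, and the function names are hypothetical.

```c
#include <string.h>

#define CAP_A 4  /* capacity of the first jug (assumed) */
#define CAP_B 3  /* capacity of the second jug (assumed) */
#define GOAL  2  /* amount wanted in the first jug (assumed) */

static int min_int(int a, int b) { return a < b ? a : b; }

/* Writes the six successor states of (a, b) into out. */
static void successors(int a, int b, int out[6][2])
{
    int t;
    out[0][0] = CAP_A; out[0][1] = b;      /* fill A */
    out[1][0] = a;     out[1][1] = CAP_B;  /* fill B */
    out[2][0] = 0;     out[2][1] = b;      /* empty A */
    out[3][0] = a;     out[3][1] = 0;      /* empty B */
    t = min_int(a, CAP_B - b);             /* pour A into B */
    out[4][0] = a - t; out[4][1] = b + t;
    t = min_int(b, CAP_A - a);             /* pour B into A */
    out[5][0] = a + t; out[5][1] = b - t;
}

/* Breadth-first search from (0, 0). Returns the smallest number of moves
 * that leaves GOAL units in jug A, or -1 if no such state is reachable. */
int water_jug_bfs(void)
{
    int dist[CAP_A + 1][CAP_B + 1];
    int queue[(CAP_A + 1) * (CAP_B + 1)][2];
    int head = 0, tail = 0;

    memset(dist, -1, sizeof dist);  /* -1 marks an unvisited state */
    dist[0][0] = 0;
    queue[tail][0] = 0; queue[tail][1] = 0; tail++;

    while (head < tail) {
        int a = queue[head][0], b = queue[head][1];
        int next[6][2];
        int i;
        head++;
        if (a == GOAL)
            return dist[a][b];
        successors(a, b, next);
        for (i = 0; i < 6; i++) {
            int na = next[i][0], nb = next[i][1];
            if (dist[na][nb] == -1) {  /* each state is enqueued only once */
                dist[na][nb] = dist[a][b] + 1;
                queue[tail][0] = na; queue[tail][1] = nb; tail++;
            }
        }
    }
    return -1;
}
```

Because states are expanded in order of their distance from the start, the first goal state taken from the queue is reached by a shortest sequence of moves. The search uses only the problem definition, which is what makes it uninformed.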
Problem characteristics
Analyze each of the following problems with respect to the seven problem characteristics: chess, water jug, 8-puzzle, travelling salesman, missionaries and cannibals, and Tower of Hanoi.
1. Chess
Problem characteristic | Satisfied | Reason
Is the problem decomposable? | No | One game has a single solution.
Can solution steps be ignored or undone? | No | In an actual game (not on a PC) we can't undo previous steps.
Is the problem universe predictable? | No | The problem universe is not predictable, as we are not sure about the move of the other player (second player).
Is a good solution absolute or relative? | Absolute | Absolute solution: once you get one solution, you do not need to bother about other possible solutions. Relative solution: once you get one solution, you have to find other possible solutions to check which one is best (i.e. lowest cost). By this criterion, chess is absolute.
Is the solution a state or a path? | Path | In some problems, such as natural language understanding, where words have different interpretations and a sentence may be ambiguous, only the final interpretation is needed and the workings are not (i.e. the path to the solution is not necessary). In chess, the winning state (goal state) is described by the path to that state.
What is the role of knowledge? | | A lot of knowledge helps to constrain the search for a solution.
Does the task require human interaction? | No | A conversational task is one in which there is intermediate communication between a person and the computer, either to provide additional assistance to the computer or to provide additional information to the user, or both. In chess, additional assistance is not required.
2. Water jug
Problem characteristic | Satisfied | Reason
Is the problem decomposable? | No | There is one single solution.
Can solution steps be ignored or undone? | Yes |
Is the problem universe predictable? | Yes | The problem universe is predictable because solving this problem requires only one person; we can predict what will happen in the next step.
Is a good solution absolute or relative? | Absolute | The water jug problem may have a number of solutions, but once we have found one solution there is no need to bother about the others, because it does not affect the cost.
Is the solution a state or a path? | Path | The solution is the path to the goal state.
What is the role of knowledge? | | A lot of knowledge helps to constrain the search for a solution.
Does the task require human interaction? | Yes | Additional assistance is required, for example to get the jugs or the pump.
3. 8 puzzle
Problem characteristic | Satisfied | Reason
Is the problem decomposable? | No | One game has a single solution.
Can solution steps be ignored or undone? | Yes | We can undo the previous move.
Is the problem universe predictable? | Yes | The problem universe is predictable because solving this problem requires only one person; we can predict what the position of the blocks will be after the next move.
Is a good solution absolute or relative? | Absolute | Absolute solution: once you get one solution, you do not need to bother about other possible solutions. Relative solution: once you get one solution, you have to find other possible solutions to check which one is best (i.e. lowest cost). By this criterion, the 8 puzzle is absolute.
Is the solution a state or a path? | Path | In some problems, such as natural language understanding, only the final interpretation is needed and the workings are not (i.e. the path to the solution is not necessary). In the 8 puzzle, the winning state (goal state) is described by the path to that state.
What is the role of knowledge? | | A lot of knowledge helps to constrain the search for a solution.
Does the task require human interaction? | No | A conversational task is one in which there is intermediate communication between a person and the computer, either to provide additional assistance to the computer or to provide additional information to the user, or both. In the 8 puzzle, additional assistance is not required.
4. Travelling Salesman (TSP)
Problem characteristic | Satisfied | Reason
Is the problem decomposable? | No | One problem has a single solution.
Can solution steps be ignored or undone? | Yes |
Is the problem universe predictable? | Yes |
Is a good solution absolute or relative? | Relative | Absolute solution: once you get one solution, you do not need to bother about other possible solutions. Relative solution: once you get one solution, you have to find other possible solutions to check which one is best (i.e. lowest cost). TSP asks for the lowest-cost tour, so a tour can only be judged against the other tours; by this criterion, TSP is relative.
Is the solution a state or a path? | Path | In some problems, such as natural language understanding, only the final interpretation is needed and the workings are not (i.e. the path to the solution is not necessary). In TSP, the goal state is described by the path to that state.
What is the role of knowledge? | | A lot of knowledge helps to constrain the search for a solution.
Does the task require human interaction? | No | A conversational task is one in which there is intermediate communication between a person and the computer, either to provide additional assistance to the computer or to provide additional information to the user, or both. In TSP, additional assistance is not required.
5. Missionaries and cannibals
Problem characteristic | Satisfied | Reason
Is the problem decomposable? | No | One game has a single solution.
Can solution steps be ignored or undone? | Yes |
Is the problem universe predictable? | Yes | The problem universe is predictable: there is no opposing player, so the result of each move is known in advance.
Is a good solution absolute or relative? | Absolute | Absolute solution: once you get one solution, you do not need to bother about other possible solutions. Relative solution: once you get one solution, you have to find other possible solutions to check which one is best (i.e. lowest cost). By this criterion, missionaries and cannibals is absolute.
Is the solution a state or a path? | Path | In some problems, such as natural language understanding, only the final interpretation is needed and the workings are not (i.e. the path to the solution is not necessary). In missionaries and cannibals, the goal state is described by the path to that state.
What is the role of knowledge? | | A lot of knowledge helps to constrain the search for a solution.
Does the task require human interaction? | Yes | A conversational task is one in which there is intermediate communication between a person and the computer, either to provide additional assistance to the computer or to provide additional information to the user, or both. In missionaries and cannibals, additional assistance is required to move the missionaries to the other side of the river.
6. Tower of Hanoi
Problem characteristic | Satisfied | Reason
Is the problem decomposable? | No | One game has a single solution.
Can solution steps be ignored or undone? | Yes |
Is the problem universe predictable? | Yes |
Is a good solution absolute or relative? | Absolute | Absolute solution: once you get one solution, you do not need to bother about other possible solutions. Relative solution: once you get one solution, you have to find other possible solutions to check which one is best (i.e. lowest cost). By this criterion, Tower of Hanoi is absolute.
Is the solution a state or a path? | Path | In some problems, such as natural language understanding, only the final interpretation is needed and the workings are not (i.e. the path to the solution is not necessary). In Tower of Hanoi, the winning state (goal state) is described by the path to that state.
What is the role of knowledge? | | A lot of knowledge helps to constrain the search for a solution.
Does the task require human interaction? | No | A conversational task is one in which there is intermediate communication between a person and the computer, either to provide additional assistance to the computer or to provide additional information to the user, or both. In Tower of Hanoi, additional assistance is not required.
Measuring Performance of Algorithms
There are two aspects of algorithmic performance:
• Time
- Instructions take time.
- How fast does the algorithm perform?
- What affects its runtime?
• Space
- Data structures take space.
- What kind of data structures can be used?
- How does the choice of data structure affect the runtime?
Algorithms cannot be compared meaningfully by running them on computers: run time is system dependent, and even on the same computer it would depend on the language. Real time units like microseconds are therefore not used. We are generally concerned with how the amount of work varies with the data.
Measuring Time Complexity
Count the number of operations the algorithm performs to handle n items. The comparison is meaningful for very large values of n.
Complexity of Linear Search
Consider the task of searching a list to see if it contains a particular value.
• A useful search algorithm should be general.
• Work done varies with the size of the list.
• What can we say about the work done for a list of any length?
    i = 0;
    while (i < MAX && this_array[i] != target)
        i = i + 1;
    if (i < MAX)
        printf("Yes, target is there\n");
    else
        printf("No, target isn't there\n");
The work involved: checking the target value against each of the n elements.
Number of operations:
1 (best case)
n (worst case)
n/2 (average case)
Computer scientists tend to be concerned about the worst-case complexity. The worst case guarantees that the performance of the algorithm will be at least as good as the analysis indicates.
Average-case complexity: It is the best statistical estimate of actual performance, and tells us how well an algorithm performs if you average the behavior over all possible sets of input data. However, it requires considerable mathematical sophistication to do the average-case analysis.
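The best and worst cases can be observed directly by counting comparisons. The sketch below wraps the linear search in a function that reports how many elements it examined; the name linear_search and the comparisons out-parameter are illustrative additions, not part of the original fragment.

```c
#include <stddef.h>

/* Searches a[0..n-1] for target. Returns the index of the first match,
 * or -1 if the target is absent. *comparisons receives the number of
 * elements that were compared against the target. */
int linear_search(const int *a, size_t n, int target, size_t *comparisons)
{
    size_t i;
    *comparisons = 0;
    for (i = 0; i < n; i++) {
        (*comparisons)++;
        if (a[i] == target)
            return (int)i;
    }
    return -1;
}
```

On a five-element list, a target in the first position costs 1 comparison (best case), while a target in the last position or a missing target costs 5 (worst case, n).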
Algorithm Analysis: Loops
Consider an n x n two-dimensional array. Write a loop to store the row sums in a one-dimensional array rows and the overall total in grandTotal.
LOOP 1:
    grandTotal = 0;
    for (k = 0; k < n; ++k) {
        rows[k] = 0;
        for (j = 0; j < n; ++j) {
            rows[k] = rows[k] + matrix[k][j];
            grandTotal = grandTotal + matrix[k][j];
        }
    }
It takes 2n^2 addition operations.
LOOP 2:
    grandTotal = 0;
    for (k = 0; k < n; ++k) {
        rows[k] = 0;
        for (j = 0; j < n; ++j)
            rows[k] = rows[k] + matrix[k][j];
        grandTotal = grandTotal + rows[k];
    }
This one takes n^2 + n addition operations.
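The two operation counts can be checked by instrumenting both loops. The sketch below assumes the loops run over all n rows and columns (indices 0 to n-1) and that the grand-total update in the second loop sits outside the inner loop; N_MAX and the function names are hypothetical.

```c
#define N_MAX 8  /* largest matrix size supported by this sketch (assumed) */

/* Loop 1: updates the row sum and the grand total for every element.
 * Returns the number of additions performed. */
long row_sums_v1(int n, int matrix[N_MAX][N_MAX], int rows[N_MAX],
                 int *grand_total)
{
    long additions = 0;
    int k, j;
    *grand_total = 0;
    for (k = 0; k < n; ++k) {
        rows[k] = 0;
        for (j = 0; j < n; ++j) {
            rows[k] = rows[k] + matrix[k][j];
            *grand_total = *grand_total + matrix[k][j];
            additions += 2;
        }
    }
    return additions;
}

/* Loop 2: adds each finished row sum to the grand total once per row.
 * Returns the number of additions performed. */
long row_sums_v2(int n, int matrix[N_MAX][N_MAX], int rows[N_MAX],
                 int *grand_total)
{
    long additions = 0;
    int k, j;
    *grand_total = 0;
    for (k = 0; k < n; ++k) {
        rows[k] = 0;
        for (j = 0; j < n; ++j) {
            rows[k] = rows[k] + matrix[k][j];
            additions += 1;
        }
        *grand_total = *grand_total + rows[k];
        additions += 1;
    }
    return additions;
}
```

For n = 4 the first version performs 32 additions (2n^2) and the second performs 20 (n^2 + n), while both produce the same row sums and grand total.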
Big-O Notation
We want to understand how the performance of an algorithm responds to changes in problem size. Basically the goal is to provide a qualitative insight. The Big-O notation is a way of measuring the order of magnitude of a mathematical expression. O(n) means "on the order of n".
Consider f(n) = n^4 + 31n^2 + 10.
The idea is to reduce the formula in the parentheses so that it captures the qualitative behavior in the simplest possible terms. We eliminate any term whose contribution to the total ceases to be significant as n becomes large. We also eliminate any constant factors, as these have no effect on the overall pattern as n increases. Thus we may approximate f(n) above as O(n^4 + 31n^2 + 10) = O(n^4).
Let g(n) = n^4. Then the order of f(n) is O[g(n)].
Definition: f(n) is O(g(n)) if there exist positive numbers c and N such that f(n) <= c g(n) for all n >= N. That is, f is big-O of g if there is a c such that f is not larger than c g for sufficiently large values of n (greater than or equal to N).
c g(n) is an upper bound on the value of f(n). That is, the number of operations is at worst proportional to g(n) for all large values of n.
How does one determine c and N?
Let f(n) = 2n^2 + 3n + 1 = O(n^2).
Now 2n^2 + 3n + 1 <= c n^2, or 2 + (3/n) + (1/n^2) <= c.
You want to find c such that a term in f becomes the largest and stays the largest. Compare the first and second terms: the first overtakes the second at N = 2. So for N = 2, c >= 3.75; for N = 5, c >= 2.64; and for very large values of n, c is almost 2.
g is almost always >= f if it is multiplied by a constant c.
Look at it another way: suppose you want to find the weight of the elephants, cats and ants in a jungle. Irrespective of how many of each there are, the net weight would be proportional to the weight of an elephant.
Incidentally, we can also say f is big-O not only of n^2 but also of n^3, n^4, n^5, etc. (HOW?)
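A candidate pair (c, N) for f(n) = 2n^2 + 3n + 1 and g(n) = n^2 can be checked numerically over a finite range. The sketch below is only a sanity check, not a proof: the helper name bound_holds and the limit parameter are assumptions made for illustration.

```c
/* f(n) = 2n^2 + 3n + 1, the function from the text. */
static double f(double n) { return 2.0 * n * n + 3.0 * n + 1.0; }

/* g(n) = n^2, the proposed order of f. */
static double g(double n) { return n * n; }

/* Returns 1 if f(n) <= c * g(n) for every integer n in [N, limit],
 * and 0 as soon as one value of n violates the bound. */
int bound_holds(double c, long N, long limit)
{
    long n;
    for (n = N; n <= limit; n++) {
        if (f((double)n) > c * g((double)n))
            return 0;
    }
    return 1;
}
```

With c = 3.75 the bound holds from N = 2 onwards (f(2) = 15 = 3.75 * 4) but fails at n = 1, where f(1) = 6. With c = 2 it never holds, because 3n + 1 is always positive.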
Loop 1 and Loop 2 are both in the same big-O category: O(n^2).
Properties of Big-O notation:
O(n) + O(m) = O(n) if n >= m.
The function log n to base a is O(log n to base b) for any values of a and b (you can show that any log values are multiples of each other).
Linear search algorithm:
Best case - it's the first value: "order 1", O(1)
Worst case - it's the last value, n: "order n", O(n)
Average - n/2 (if the value is present): "order n", O(n)
Example 1: Use big-O notation to analyze the time efficiency of the following fragment of C code:
    for (k = 1; k <= n/2; k++) {
        .
        .
        for (j = 1; j <= n*n; j++) {
            .
            .
        }
    }
Since these loops are nested, the efficiency is n^3/2, or O(n^3) in big-O terms. Thus, for two loops with O[f1(n)] and O[f2(n)] efficiencies, the efficiency of the nesting of these two loops is O[f1(n) * f2(n)].
Example 2: Use big-O notation to analyze the time efficiency of the following fragment of C code:
    for (k = 1; k <= n/2; k++) {
        .
        .
    }
    for (j = 1; j <= n*n; j++) {
        .
        .
    }
The number of operations executed by these loops is the sum of the individual loop efficiencies. Hence, the efficiency is n/2 + n^2, or O(n^2) in big-O terms.
Thus, for two loops with O[f1(n)] and O[f2(n)] efficiencies, the efficiency of the sequencing of these two loops is O[fD(n)], where fD(n) is the dominant of the functions f1(n) and f2(n).
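The nesting and sequencing rules can be seen by counting loop-body executions. In this sketch each elided loop body is assumed to cost one operation, and the function names are hypothetical.

```c
/* Nested loops: the inner body runs (n/2) * n^2 times. */
long nested_count(long n)
{
    long k, j, count = 0;
    for (k = 1; k <= n / 2; k++)
        for (j = 1; j <= n * n; j++)
            count++;
    return count;
}

/* Sequenced loops: the bodies run n/2 + n^2 times in total. */
long sequential_count(long n)
{
    long k, j, count = 0;
    for (k = 1; k <= n / 2; k++)
        count++;
    for (j = 1; j <= n * n; j++)
        count++;
    return count;
}
```

For n = 10 the nested version performs 500 operations (n^3/2) while the sequenced version performs 105 (n/2 + n^2), which is dominated by the n^2 term.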
Complexity of Linear Search
In measuring performance, we are generally concerned with how the amount of work varies with the data. Consider, for example, the task of searching a list to see if it contains a particular value.
• A useful search algorithm should be general.
• Work done varies with the size of the list.
• What can we say about the work done for a list of any length?
    i = 0;
    while (i < MAX && this_array[i] != target)
        i = i + 1;
    if (i < MAX)
        printf("Yes, target is there\n");
    else
        printf("No, target isn't there\n");
Order Notation
How much work is needed to find the target in a list containing N elements?
Note: we care here only about the growth rate of work. Thus, we toss out all constant values.
Best case: work is constant; it does not grow with the size of the list.
Worst and average cases: work is proportional to the size of the list, N.
O(1) or "Order One": Constant time
- does not mean that it takes only one operation
- does mean that the work doesn't change as N changes
- is a notation for "constant work"
O(n) or "Order n": Linear time
- does not mean that it takes N operations
- does mean that the work changes in a way that is proportional to N
- is a notation for "work grows at a linear rate"
O(n^2) or "Order n^2": Quadratic time
O(n^3) or "Order n^3": Cubic time
Algorithms whose efficiency can be expressed in terms of a polynomial of the form a_m n^m + a_{m-1} n^{m-1} + ... + a_2 n^2 + a_1 n + a_0 are called polynomial algorithms, of order O(n^m).
Some algorithms even take less time than the number of elements in the problem. There is a notion of logarithmic time algorithms.
We know 10^3 = 1000, so we can write it as log_10 1000 = 3.
Similarly, suppose we have 2^6 = 64; then we can write log_2 64 = 6.
If the work of an algorithm can be reduced by half in one step, and in k steps we are able to solve the problem, then 2^k = n, or in other words log_2 n = k. Such an algorithm has logarithmic time complexity, usually written as O(log n).
Because log_a n increases much more slowly than n itself, logarithmic algorithms are generally very efficient. It can also be shown that it does not matter what base value is chosen.
Example 3: Use big-O notation to analyze the time efficiency of the following fragment of C code:
    k = n;
    while (k > 1) {
        .
        .
        k = k/2;
    }
Since the loop variable is cut in half each time through the loop, the number of times the statements inside the loop will be executed is log_2 n. Thus, an algorithm that halves the data remaining to be processed on each iteration of a loop will be an O(log_2 n) algorithm.
There are a large number of algorithms whose complexity is O(n log_2 n).
Finally, there are algorithms whose efficiency is dominated by a term of the form a^n. These are called exponential algorithms. They are of more theoretical than practical interest, because they cannot reasonably run on typical computers for moderate values of n.
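The logarithmic behaviour of the halving loop in Example 3 can be confirmed by counting its iterations. The function name halving_steps is a hypothetical wrapper around that loop, with the elided body replaced by a counter.

```c
/* Counts how many times the halving loop runs for a given n.
 * With integer division this equals floor(log_2 n) for n >= 1. */
int halving_steps(unsigned long n)
{
    int steps = 0;
    unsigned long k = n;
    while (k > 1) {
        k = k / 2;
        steps++;
    }
    return steps;
}
```

For n = 16, 64 and 1,024 the loop runs 4, 6 and 10 times, so the work grows by only a few steps while n grows by a factor of 64.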
Comparison of N, log_2 N and N^2
N | log_2 N | N^2
16 | 4 | 256
64 | 6 | 4K
256 | 8 | 64K
1,024 | 10 | 1M
16,384 | 14 | 256M
131,072 | 17 | 16G
262,144 | 18 | 6.87E+10
524,288 | 19 | 2.75E+11
1,048,576 | 20 | 1.10E+12
1,073,741,824 | 30 | 1.15E+18
Constraint Satisfaction
Constraint satisfaction is the process of finding a solution to a set of constraints that impose conditions that the variables must satisfy. A solution is therefore a set of values for the variables that satisfies all constraints—that is, a point in the feasible region.
The techniques used in constraint satisfaction depend on the kind of constraints being considered. Often used are constraints on a finite domain, to the point that constraint satisfaction problems are typically identified with problems based on constraints on a finite domain. Such problems are usually solved via search, in particular a form of backtracking or local search. Constraint propagation methods are also used on such problems; most of them are incomplete in general, that is, they may solve the problem or prove it unsatisfiable, but not always. Constraint propagation methods are also used in conjunction with search to make a given problem simpler to solve. Other kinds of constraints considered are those on real or rational numbers; problems on these constraints are solved via variable elimination or the simplex algorithm.
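Backtracking search on a finite domain can be illustrated with a small graph-colouring problem. This is a sketch under stated assumptions rather than a general solver: there are four variables, each constraint says that two adjacent variables must take different values, and the names N_VARS, consistent and color_backtrack are hypothetical.

```c
#define N_VARS 4  /* number of variables in this sketch (assumed) */

/* Returns 1 if giving `value` to variable `var` violates no constraint
 * with the variables 0..var-1 that are already assigned. */
static int consistent(int adjacent[N_VARS][N_VARS], const int *assignment,
                      int var, int value)
{
    int other;
    for (other = 0; other < var; other++) {
        if (adjacent[var][other] && assignment[other] == value)
            return 0;
    }
    return 1;
}

/* Assigns variables var..N_VARS-1 by backtracking search over the finite
 * domain {0, ..., n_colors-1}. Returns 1 and fills `assignment` if every
 * constraint can be satisfied, otherwise returns 0. */
int color_backtrack(int adjacent[N_VARS][N_VARS], int *assignment,
                    int var, int n_colors)
{
    int value;
    if (var == N_VARS)
        return 1;  /* every variable has a consistent value */
    for (value = 0; value < n_colors; value++) {
        if (consistent(adjacent, assignment, var, value)) {
            assignment[var] = value;
            if (color_backtrack(adjacent, assignment, var + 1, n_colors))
                return 1;
        }
    }
    return 0;  /* no value works here: backtrack to the previous variable */
}
```

A ring of four variables can be solved with two values, whereas four mutually adjacent variables cannot be solved with three values, and the search reports this only after exhausting the alternatives.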
Complexity
Solving a constraint satisfaction problem on a finite domain is an NP-complete problem with respect to the domain size. Research has shown a number of tractable subcases, some limiting the allowed constraint relations, some requiring the scopes of constraints to form a tree, possibly in a reformulated version of the problem. Research has also established a relationship between the constraint satisfaction problem and problems in other areas such as finite model theory.
UNIT 02
GAME PLAYING
Introduction
Game playing has been a major topic of AI since the very beginning. Besides the attraction of the topic to people, this is also because of its close relation to "intelligence" and its well-defined states and rules.
The most commonly used AI technique in games is search. In some other problem-solving activities, state change is solely caused by the action of the system itself. However, in multi-player games, states also depend on the actions of other players (systems), who usually have different goals.
A special situation that has been studied most is the two-person zero-sum game, where the two players have exactly opposite goals; that is, each state can be evaluated by a score from one player's viewpoint, and the other's viewpoint is exactly the opposite. This type of game is common and easy to analyze, though not all competitions are zero-sum!
There are perfect information games (such as Chess and Go) and imperfect information games (such as Bridge and games where dice are used). Given sufficient time and space, an optimum solution can usually be obtained for the former by exhaustive search, though not for the latter. However, for most interesting games, such a solution is usually too inefficient to be practically used.
Minimax Procedure
For a two-person zero-sum perfect-information game, if the two players take turns to move, the minimax procedure can solve the problem given sufficient computational resources. This algorithm assumes each player takes the best move in each step.
First, we distinguish two types of nodes, MAX and MIN, in the state graph, determined by the depth of the search tree.
Minimax procedure: start from the leaves of the tree (with final scores with respect to one player, MAX), and go backwards towards the root (the starting state).
At each step, one player (MAX) takes the action that leads to the highest score, while the other player (MIN) takes the action that leads to the lowest score.
All nodes in the tree will be scored, and the path from the root to the actual result is the one on which all nodes have the same score.
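The backing-up of scores described above can be sketched recursively. For brevity this example assumes a complete binary game tree whose leaf scores (from MAX's viewpoint) are stored left to right in an array; the function name and the array layout are assumptions made for illustration.

```c
/* Minimax value of a node in a complete binary game tree.
 * leaves: scores of the 2^depth leaves below the root, left to right,
 *         from MAX's viewpoint.
 * depth:  number of levels between this node and the leaves.
 * index:  position of this node within its own level (root is 0).
 * is_max: 1 if MAX moves at this node, 0 if MIN moves. */
int minimax(const int *leaves, int depth, int index, int is_max)
{
    int left, right;
    if (depth == 0)
        return leaves[index];  /* leaf: final score */
    /* Players alternate, so the children are nodes of the other type. */
    left  = minimax(leaves, depth - 1, 2 * index,     !is_max);
    right = minimax(leaves, depth - 1, 2 * index + 1, !is_max);
    if (is_max)
        return left > right ? left : right;  /* MAX takes the highest score */
    return left < right ? left : right;      /* MIN takes the lowest score */
}
```

With leaves 3, 5, 2, 9 and MAX at the root, the two MIN nodes score 3 and 2, so the root scores 3 and the chosen path runs through the nodes that share that score.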
Example:
Because of limited computational resources, the search depth is usually restricted, and estimated scores generated by a heuristic function are used in place of the actual scores in the above procedure.
Example: Tic-tac-toe, with the difference of possible win paths as the heuristic function.
Alpha-Beta Pruning
Very often, the game graph does not need to be fully explored using Minimax.
Based on the scores of explored nodes, inequalities can be set up for nodes whose children haven't been exhaustively explored. Under certain conditions, some branches of the tree can be ignored without changing the final score of the root.
In Alpha-Beta Pruning, each MAX node has an alpha value, which never decreases; each MIN node has a beta value, which never increases. These values are set and updated when the value of a child is obtained. Search is depth-first, and stops at any MIN node whose beta value is smaller than or equal to the alpha value of its parent, as well as at any MAX node whose alpha value is greater than or equal to the beta value of its parent.
Examples: in the following partial trees, the other children of node (5) do not need to be generated.
(1)MAX[>=3] ----- (2)MIN[==3] ----- (3)MAX[==5]
| |------------ (4)MAX[==3]
|
|------------ (5)MIN[<=0] ----- (6)MAX[==0]
| ---------- X
| ---------- X
(1)MIN[<=5] ----- (2)MAX[==5] ----- (3)MIN[==5]
| |------------ (4)MIN[==3]
|
|------------ (5)MAX[>=8] ----- (6)MIN[==8]
| ---------- X
| ---------- X
This method is used in a Prolog program that plays Tic-tac-toe.
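As a complement, here is a hedged C sketch of alpha-beta pruning on a complete binary game tree with leaf scores stored left to right in an array. The tree layout, the function name and the visited counter are assumptions made for illustration; the counter shows how many leaves are actually evaluated.

```c
#include <limits.h>

/* Alpha-beta search on a complete binary game tree.
 * leaves, depth, index and is_max describe the node as in plain minimax;
 * alpha and beta are the bounds inherited from the ancestors;
 * *visited counts the leaves that are actually evaluated. */
int alphabeta(const int *leaves, int depth, int index,
              int alpha, int beta, int is_max, int *visited)
{
    int child, value;
    if (depth == 0) {
        (*visited)++;
        return leaves[index];
    }
    if (is_max) {
        value = INT_MIN;
        for (child = 0; child < 2; child++) {
            int v = alphabeta(leaves, depth - 1, 2 * index + child,
                              alpha, beta, 0, visited);
            if (v > value) value = v;
            if (value > alpha) alpha = value;  /* alpha never decreases */
            if (alpha >= beta) break;          /* MIN above will avoid this node */
        }
    } else {
        value = INT_MAX;
        for (child = 0; child < 2; child++) {
            int v = alphabeta(leaves, depth - 1, 2 * index + child,
                              alpha, beta, 1, visited);
            if (v < value) value = v;
            if (value < beta) beta = value;    /* beta never increases */
            if (beta <= alpha) break;          /* MAX above will avoid this node */
        }
    }
    return value;
}
```

With leaves 5, 3, 0, 7 and MAX at the root, the first MIN node scores 3; the second MIN node sees 0, which is already no better than 3 for MAX, so the leaf 7 is never evaluated. This mirrors the first partial tree above.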
ITERATIVE DEEPENING
While still an unintelligent algorithm, the iterative deepening search combines the positive elements of breadth-first and depth-first searching to create an algorithm which is often an improvement over each method individually.
An iterative deepening search operates like a depth-first search, except slightly more constrained: there is a maximum depth which defines how many levels deep the algorithm can look for solutions. A node at the maximum level of depth is treated as terminal, even if it would ordinarily have successor nodes. If a search "fails," then the maximum level is increased by one and the process repeats. The value for the maximum depth is initially set at 0 (i.e., only the initial node).
The initial node is checked for a goal state; then, since the search cannot go any deeper, it "fails."
The maximum level is increased to 1; then the search restarts. The search (in its most basic implementation) does not remember testing the initial node already. This time, since the initial node is not at the maximum level, it can be expanded.
Its successors, however, cannot; they are checked, and if they fail, they are treated as terminal nodes and deleted. The search "fails," and the search once again restarts, with maximum level 2.
This continues until a solution is found.
An interesting observation is that the nodes in this search are first checked in the same order they would be checked in a breadth-first search; however, since nodes are deleted as the search progresses, much less memory is used at any given time.
The drawback to the iterative deepening search is clear from the walkthrough: it can be painfully redundant, rechecking every node it has already checked with each new iteration. The algorithm can be enhanced to remember what nodes it has already seen, but this sacrifices most of the memory efficiency that made the algorithm worthwhile in the first place, and nodes at the maximum level for one iteration will still need to be re-accessed and expanded in the following iteration. Still, when memory is at a premium, iterative deepening is preferable to a plain depth-first search when there is danger of looping or the most efficient solution is desired.
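The walkthrough above can be sketched in C. To keep the example short it assumes an implicit binary tree in which node i has children 2i+1 and 2i+2 and the goal is identified by its node number; the function names and the checked counter, which makes the redundant re-checking visible, are hypothetical.

```c
/* Depth-limited depth-first search on an implicit binary tree in which
 * node i has children 2i+1 and 2i+2 (an assumption of this sketch).
 * Returns 1 if the goal is found within `limit` levels below `node`. */
static int depth_limited(int node, int goal, int n_nodes, int limit,
                         long *checked)
{
    int child;
    if (node >= n_nodes)
        return 0;  /* no such node in the tree */
    (*checked)++;
    if (node == goal)
        return 1;
    if (limit == 0)
        return 0;  /* node at the maximum level is treated as terminal */
    for (child = 2 * node + 1; child <= 2 * node + 2; child++) {
        if (depth_limited(child, goal, n_nodes, limit - 1, checked))
            return 1;
    }
    return 0;
}

/* Restarts the depth-limited search with limits 0, 1, 2, ... max_depth.
 * Returns the depth at which the goal was found, or -1 if it was not.
 * *checked receives the total number of goal tests over all iterations. */
int iterative_deepening(int goal, int n_nodes, int max_depth, long *checked)
{
    int limit;
    *checked = 0;
    for (limit = 0; limit <= max_depth; limit++) {
        if (depth_limited(0, goal, n_nodes, limit, checked))
            return limit;
    }
    return -1;
}
```

On a seven-node tree with the goal at node 5, the search fails at limits 0 and 1 (1 and 3 checks) and succeeds at limit 2 after 6 more checks, so 10 goal tests are spent on a tree of 7 nodes. Only the current path is held on the call stack at any time.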
Knowledge Representation
Typically, a problem to solve or a task to carry out, as well as what constitutes a solution, is only given informally, such as "deliver parcels promptly when they arrive" or "fix whatever is wrong with the electrical system of the house."
The role of representations in solving problems
To solve a problem, the designer of a system must
• flesh out the task and determine what constitutes a solution;
• represent the problem in a language with which a computer can reason;
• use the computer to compute an output, which is an answer presented to a user or a sequence of actions to be carried out in the environment; and
• interpret the output as a solution to the problem.
Knowledge is the information about a domain that can be used to solve problems in that domain. To solve many problems requires much knowledge, and this knowledge must be represented in the computer. As part of designing a program to solve problems, we must define how the knowledge will be represented. A representation scheme is the form of the knowledge that is used in an agent. A representation of some piece of knowledge is the internal representation of the knowledge. A representation scheme specifies the form of the knowledge. A knowledge base is the representation of all of the knowledge that is stored by an agent.
A good representation scheme is a compromise among many competing objectives. A representation should be
• rich enough to express the knowledge needed to solve the problem.
• as close to the problem as possible; it should be compact, natural, and maintainable. It should be easy to see the relationship between the representation and the domain being represented, so that it is easy to determine whether the knowledge represented is correct. A small change in the problem should result in a small change in the representation of the problem.
• amenable to efficient computation, which usually means that it is able to express features of the problem that can be exploited for computational gain and able to trade off accuracy and computation time.
• able to be acquired from people, data and past experiences.
Many different representation schemes have been designed. Many of these start with some of these objectives and are then expanded to include the other objectives. For example, some are designed for learning and then expanded to allow richer problem solving and inference abilities. Some representation schemes are designed with expressiveness in mind, and then inference and learning are added on. Some schemes start from tractable inference and then are made more natural, and more able to be acquired.
Some of the questions that must be considered when given a problem or a task are the following:
• What is a solution to the problem? How good must a solution be?
• How can the problem be represented? What distinctions in the world are needed to solve the problem? What specific knowledge about the world is required? How can an agent acquire the knowledge from experts or from experience? How can the knowledge be debugged, maintained, and improved?
• How can the agent compute an output that can be interpreted as a solution to the problem? Is worst-case performance or average-case performance the critical time to minimize? Is it important for a human to understand how the answer was derived?
Predicate Calculus
First-order logic
• Whereas propositional logic assumes the world contains facts, first-order logic (like natural language) assumes the world contains:
– Objects: people, houses, numbers, colors, baseball games, wars, …
– Relations: red, round, prime, brother of, bigger than, part of, comes between, …
Syntax of FOL: Basic elements
• Constants TaoiseachJohn, 2, DIT,...
• Predicates Brother, >,...
• Functions Sqrt, LeftLegOf,...
• Variables x, y, a, b,...
• Connectives ¬, ⇒, ∧, ∨, ⇔
• Equality =
• Quantifiers ∀, ∃
Atomic sentences
Atomic sentence = predicate(term1,...,termn) or term1 = term2
Term = function(term1,...,termn) or constant or variable
• E.g., Brother(TaoiseachJohn,RichardTheLionheart)
• E.g., >(Length(LeftLegOf(Richard)), Length(LeftLegOf(TaoiseachJohn)))
Complex sentences
• Complex sentences are made from atomic sentences using connectives:
¬S, S1 ∧ S2, S1 ∨ S2, S1 ⇒ S2, S1 ⇔ S2
• E.g. Sibling(TaoiseachJohn,Richard) ⇒ Sibling(Richard,TaoiseachJohn)
>(1,2) ∨ ≤(1,2)
>(1,2) ∧ ¬>(1,2)
Truth in first-order logic
• Sentences are true with respect to a model and an interpretation
• Model contains objects (domain elements) and relations among them
• Interpretation specifies referents for
– constant symbols → objects
– predicate symbols → relations
– function symbols → functional relations
• An atomic sentence predicate(term1,...,termn) is true iff the objects referred to by term1,...,termn are in the relation referred to by predicate
Universal quantification
• ∀<variables> <sentence>
• Everyone at DIT is smart:
∀x At(x,DIT) ⇒ Smart(x)
• ∀x P is true in a model m iff P is true with x being each possible object in the model
• Roughly speaking, equivalent to the conjunction of instantiations of P:
(At(TaoiseachJohn,DIT) ⇒ Smart(TaoiseachJohn))
∧ (At(Richard,DIT) ⇒ Smart(Richard))
∧ (At(DIT,DIT) ⇒ Smart(DIT))
∧ ...
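The "conjunction of instantiations" reading can be tried out on a finite model. The sketch below is only an illustration: the three-object domain and the sets AT_DIT and SMART are assumptions made up for the example, not part of the slides.

```python
# Hypothetical finite model: only the three objects named on the slide.
DOMAIN = ["TaoiseachJohn", "Richard", "DIT"]
AT_DIT = {"TaoiseachJohn", "Richard"}   # assumed: who is at DIT
SMART = {"TaoiseachJohn", "Richard"}    # assumed: who is smart


def implies(p: bool, q: bool) -> bool:
    """Material implication: false only when p is true and q is false."""
    return (not p) or q


def forall(domain, sentence) -> bool:
    """Evaluate a universally quantified sentence as a conjunction over the domain."""
    return all(sentence(x) for x in domain)


# Everyone at DIT is smart: for all x, At(x,DIT) implies Smart(x).
everyone_at_dit_is_smart = forall(
    DOMAIN, lambda x: implies(x in AT_DIT, x in SMART)
)
print(everyone_at_dit_is_smart)  # True; the instance for DIT itself holds vacuously
```

Because the object DIT is not at DIT, its instance of the implication is true regardless of whether DIT is smart, which is why ⇒ is the natural main connective here.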
A common mistake to avoid
• Typically, ⇒ is the main connective with ∀
• Common mistake: using ∧ as the main connective with ∀:
∀x At(x,DIT) ∧ Smart(x)
means “Everyone is at DIT and everyone is smart”
Existential quantification
• ∃<variables> <sentence>
• Someone at DIT is smart:
∃x At(x,DIT) ∧ Smart(x)
• ∃x P is true in a model m iff P is true with x being some possible object in the model
• Roughly speaking, equivalent to the disjunction of instantiations of P:
(At(TaoiseachJohn,DIT) ∧ Smart(TaoiseachJohn))
∨ (At(Richard,DIT) ∧ Smart(Richard))
∨ (At(DIT,DIT) ∧ Smart(DIT))
∨ ...
Another common mistake to avoid
• Typically, ∧ is the main connective with ∃
• Common mistake: using ⇒ as the main connective with ∃:
∃x At(x,DIT) ⇒ Smart(x)
is true if there is anyone who is not at DIT!
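A small model makes the existential pitfall concrete. In this sketch the sets AT_DIT and SMART are assumptions chosen so that nobody at DIT is smart; the intended sentence is then false, while the mistaken one with ⇒ is still true.

```python
# Hypothetical model in which nobody at DIT is smart.
DOMAIN = ["TaoiseachJohn", "Richard", "DIT"]
AT_DIT = {"Richard"}          # assumed: only Richard is at DIT
SMART = {"TaoiseachJohn"}     # assumed: only TaoiseachJohn is smart


def implies(p: bool, q: bool) -> bool:
    """Material implication: false only when p is true and q is false."""
    return (not p) or q


def exists(domain, sentence) -> bool:
    """Evaluate an existentially quantified sentence as a disjunction over the domain."""
    return any(sentence(x) for x in domain)


# Intended: there is an x with At(x,DIT) and Smart(x).
intended = exists(DOMAIN, lambda x: (x in AT_DIT) and (x in SMART))

# Mistaken: there is an x with At(x,DIT) implies Smart(x).
mistaken = exists(DOMAIN, lambda x: implies(x in AT_DIT, x in SMART))

print(intended, mistaken)  # False True
```

The mistaken sentence is satisfied by TaoiseachJohn alone, simply because he is not at DIT.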
Properties of quantifiers
• ∀x ∀y is the same as ∀y ∀x
• ∃x ∃y is the same as ∃y ∃x
• ∃x ∀y is not the same as ∀y ∃x
– ∃x ∀y Loves(x,y): “There is a person who loves everyone in the world”
– ∀y ∃x Loves(x,y): “Everyone in the world is loved by at least one person”
• Quantifier duality: each can be expressed using the other
– ∀x Likes(x,IceCream) is equivalent to ¬∃x ¬Likes(x,IceCream)
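The difference between the two quantifier orders, and the duality rule, can be checked on a toy model. The people and the LOVES relation below are invented for illustration only.

```python
# Hypothetical world: everyone is loved by someone, but nobody loves everyone.
PEOPLE = ["a", "b", "c"]
LOVES = {("a", "b"), ("b", "c"), ("c", "a")}


def loves(x, y) -> bool:
    return (x, y) in LOVES


# There is a person who loves everyone in the world.
exists_forall = any(all(loves(x, y) for y in PEOPLE) for x in PEOPLE)

# Everyone in the world is loved by at least one person.
forall_exists = all(any(loves(x, y) for x in PEOPLE) for y in PEOPLE)


def duality_holds(likes_ice_cream) -> bool:
    """Check that 'for all x, P(x)' agrees with 'not exists x, not P(x)'."""
    lhs = all(x in likes_ice_cream for x in PEOPLE)
    rhs = not any(x not in likes_ice_cream for x in PEOPLE)
    return lhs == rhs


print(exists_forall, forall_exists)  # False True
```

In this model the second sentence holds and the first does not, so the two orders cannot mean the same thing.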
Equality
• term1 = term2 is true under a given interpretation if and only if term1 and term2 refer to the same object
• E.g., definition of Sibling in terms of Parent:
∀x,y Sibling(x,y) ⇔ [¬(x = y) ∧ ∃m,f ¬(m = f) ∧ Parent(m,x) ∧ Parent(f,x) ∧ Parent(m,y) ∧ Parent(f,y)]
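As a rough illustration, the Sibling definition can be evaluated directly over a finite Parent relation. The family below (Ann, Bob, Carl, Dora, Eve) is a made-up example, not taken from the slides.

```python
# Hypothetical Parent facts, written as (parent, child) pairs.
PARENT = {
    ("Ann", "Carl"), ("Bob", "Carl"),
    ("Ann", "Dora"), ("Bob", "Dora"),
    ("Ann", "Eve"),
}
PEOPLE = sorted({person for pair in PARENT for person in pair})


def sibling(x, y) -> bool:
    """x and y are distinct and share two distinct parents m and f."""
    return x != y and any(
        m != f
        and (m, x) in PARENT and (f, x) in PARENT
        and (m, y) in PARENT and (f, y) in PARENT
        for m in PEOPLE
        for f in PEOPLE
    )


print(sibling("Carl", "Dora"))  # True: both have Ann and Bob as parents
print(sibling("Carl", "Eve"))   # False: they share only one parent
print(sibling("Carl", "Carl"))  # False: ruled out by the inequality x != y
```

The two inequality tests are what the equality symbol contributes: without them, every person with a parent would count as their own sibling.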
Using FOL
The kinship domain:
• Brothers are siblings
∀x,y Brother(x,y) ⇒ Sibling(x,y)
• One's mother is one's female parent
∀m,c Mother(c) = m ⇔ (Female(m) ∧ Parent(m,c))
• “Sibling” is symmetric
∀x,y Sibling(x,y) ⇔ Sibling(y,x)
Knowledge engineering in FOL
1. Identify the task
2. Assemble the relevant knowledge
3. Decide on a vocabulary of predicates, functions, and constants
4. Encode general knowledge about the domain
5. Encode a description of the specific problem instance
6. Pose queries to the inference procedure and get answers
7. Debug the knowledge base
Summary
• First-order logic:
– objects and relations are semantic primitives
– syntax: constants, functions, predicates, equality, quantifiers
Semantics for Predicate Calculus
• An interpretation over D is an assignment of the entities of D to each of the constant, variable, predicate and function symbols of a predicate calculus expression such that:
1. Each constant is assigned an element of D
2. Each variable is assigned a non-empty subset of D (these are the allowable substitutions for that variable)
3. Each predicate of arity n is defined on n arguments from D and defines a mapping from D^n into {T, F}
4. Each function of arity n is defined on n arguments from D and defines a mapping from D^n into D
The meaning of an expression
• Given an interpretation, the meaning of an
expression is a truth value assignment
over the interpretation.
Truth Value of Predicate Calculus expressions
• Assume an expression E and an interpretation I for E over a non-empty domain D. The truth value for E is determined by:
• The value of a constant is the element of D assigned to it by I
• The value of a variable is the set of elements assigned to it by I
More truth values
• The value of a function expression is that
element of D obtained by evaluating the
function for the argument values assigned
by the interpretation
• The value of the truth symbol “true” is T
• The value of the symbol “false” is F
• The value of an atomic sentence is either
T or F as determined by the interpretation I
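A minimal sketch of these rules, assuming a tiny domain D = {0, 1, 2, 3} and invented symbols (zero, two, succ, Even, Less) that do not come from the slides. Variables are left out to keep the example short.

```python
# Hypothetical interpretation I over the domain D.
D = {0, 1, 2, 3}
CONSTANTS = {"zero": 0, "two": 2}                 # constant symbol -> element of D
FUNCTIONS = {"succ": lambda n: min(n + 1, 3)}     # function symbol -> mapping D -> D
PREDICATES = {
    "Even": lambda n: n % 2 == 0,                 # predicate symbol -> mapping D -> {T, F}
    "Less": lambda a, b: a < b,                   # mapping D^2 -> {T, F}
}


def term_value(term):
    """A term is a constant name, or a tuple (function_name, arg1, ..., argn)."""
    if isinstance(term, str):
        return CONSTANTS[term]
    name, *args = term
    return FUNCTIONS[name](*(term_value(a) for a in args))


def atom_value(atom) -> bool:
    """An atomic sentence is a tuple (predicate_name, term1, ..., termn)."""
    name, *args = atom
    return PREDICATES[name](*(term_value(a) for a in args))


print(term_value(("succ", "two")))                  # 3
print(atom_value(("Even", ("succ", "two"))))        # False
print(atom_value(("Less", "zero", ("succ", "zero"))))  # True
```

Function expressions evaluate to elements of D, and only atomic sentences evaluate to a truth value, which mirrors the split between terms and sentences in the syntax.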
Similarity with Propositional logic truth values
• The value of the negation of a sentence is F if the value of the sentence is T, and T otherwise
• The values for conjunction, disjunction, implication and equivalence are analogous to their propositional logic counterparts
Universal Quantifier
• The value for ∀X S is T if S is T for all assignments to X under I, and F otherwise
Existential Quantifier
• The value for ∃X S is T if S is T for at least one assignment to X under I, and F otherwise
Some Definitions
• A predicate calculus expression S1 is satisfied:
– Definition: If there exists an interpretation I and a variable assignment under I which returns a value T for S1, then S1 is said to be satisfied under I.
• S is satisfiable if there exists an interpretation and variable assignment that satisfies it; otherwise it is unsatisfiable.
Some Definitions
• A set of predicate calculus expressions S is satisfied:
– Definition: For any interpretation I and variable assignment where a value T is returned for every element in S, the set S is said to be satisfied.
• A set of expressions is satisfiable if and only if there exist an interpretation and variable assignment that satisfy every element
• If a set of expressions is not satisfiable, it is said to be inconsistent
• If S has a value T for all possible interpretations, it is said to be valid
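For a feel of these definitions, one can enumerate every interpretation of a single unary predicate P over a fixed two-element domain. This is a deliberate simplification and an assumption of the sketch: real validity ranges over all domains and all symbols, not just this one.

```python
from itertools import combinations

D = [0, 1]  # assumed fixed finite domain


def interpretations():
    """Yield every possible extension of a unary predicate P over D."""
    for size in range(len(D) + 1):
        for subset in combinations(D, size):
            yield set(subset)


def satisfiable(sentence) -> bool:
    """True if at least one interpretation makes the sentence true."""
    return any(sentence(P) for P in interpretations())


def valid(sentence) -> bool:
    """True if every interpretation (over this domain) makes the sentence true."""
    return all(sentence(P) for P in interpretations())


def excluded_middle(P):   # for all x: P(x) or not P(x)
    return all((x in P) or (x not in P) for x in D)


def some_p(P):            # exists x: P(x)
    return any(x in P for x in D)


def contradiction(P):     # exists x: P(x) and not P(x)
    return any((x in P) and (x not in P) for x in D)


print(valid(excluded_middle), satisfiable(some_p), satisfiable(contradiction))
```

Here the first sentence is valid, the second is satisfiable but not valid (it fails when P is empty), and the third is unsatisfiable.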
Some Definitions
• An inference rule is complete:
– Definition: If all predicate calculus expressions X that logically follow from a set of expressions S can be produced using the inference rule, then the inference rule is said to be complete.
• A predicate calculus expression X logically follows from a set S of predicate calculus expressions:
– If every interpretation I and variable assignment that satisfies S also satisfies X, then X logically follows from S.
• “Logically follows” is sometimes called entailment
Soundness
• An inference rule is sound:
– If all predicate calculus expressions X produced using the inference rule from a set of expressions S logically follow from S, then the inference rule is said to be sound.
Completeness
• An inference rule is complete if, given a set S of predicate calculus expressions, it can infer every expression that logically follows from S
Equivalence
• Recall that :
• See attached word
document
Predicate Logic
The first of these, predicate logic, involves using standard forms of logical
symbolism which have been familiar to philosophers and mathematicians for many
decades. Most simple sentences, for example, “Peter is generous” or “Jane gives a
painting to Sam,” can be represented in terms of logical formulae in which a
predicate is applied to one or more arguments (the term ‘argument’ as used in
predicate logic is similar to, but not identical with, its use to refer to the inputs to a
procedure in POP-11):
PREDICATE   ARGUMENTS
generous    (peter)
gives       (jane, painting, sam)
Consider the following sentence: “Every respectable villager worships a deity.” A
moment's reflection will reveal that this is ambiguous. Is it saying that there is one
single deity to which each respectable villager offers worship? Or does each
worshipper have his or her own deity, to which a fellow respectable villager may
or may not be also praying? With predicate logic it is easy to reveal the nature of
the ambiguity, by a device known as quantification. Quantification allows one to
talk in a general way about all things of a certain class or about some particular but
unspecified thing of a certain class. We can, for instance, express the proposition
“All of Jane's friends are generous” in terms of the following formula:
For any X: IF friend(X,jane) THEN generous(X)
while the sentence “Jane has at least one friend who is generous” can be expressed
as follows:
For some X: friend(X,jane) AND generous(X)
The expressions ‘For any X’ and ‘For some X’ are known as quantifiers. We can
now use quantification to exhibit the ambiguity of the sentence about the
respectable villagers. The first reading of it can be represented as
For some X: for any Y: deity(X)
AND IF (villager(Y) AND respectable(Y)) THEN worships(Y,X)
while the second can be represented as
For any Y: (IF villager(Y) AND respectable(Y) THEN
For some X: deity(X) AND worships(Y,X))
It is thus possible to show in a clear way that the original sentence can express (at
least) two quite distinct propositions. It is possible to infer from the first, but not
from the second, that if Margaret and Neil are two respectable villagers, then they
both worship the same entity. (In the interests of ecumenical peace, however, it is
sometimes better to refrain from letting such ambiguities come out into the open!)
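The two readings can be told apart by evaluating them against a small model. The deities d1 and d2 and the worship relations below are invented for this sketch; Margaret and Neil are the two respectable villagers mentioned in the text.

```python
# Hypothetical model with two deities and two respectable villagers.
DEITIES = {"d1", "d2"}
RESPECTABLE_VILLAGERS = {"margaret", "neil"}
THINGS = sorted(DEITIES | RESPECTABLE_VILLAGERS)


def implies(p: bool, q: bool) -> bool:
    return (not p) or q


def reading_one(worships) -> bool:
    """For some X: for any Y: deity(X) AND IF respectable villager(Y) THEN worships(Y,X)."""
    return any(
        all(
            (x in DEITIES) and implies(y in RESPECTABLE_VILLAGERS, (y, x) in worships)
            for y in THINGS
        )
        for x in THINGS
    )


def reading_two(worships) -> bool:
    """For any Y: IF respectable villager(Y) THEN for some X: deity(X) AND worships(Y,X)."""
    return all(
        implies(
            y in RESPECTABLE_VILLAGERS,
            any((x in DEITIES) and ((y, x) in worships) for x in THINGS),
        )
        for y in THINGS
    )


separate = {("margaret", "d1"), ("neil", "d2")}   # each villager has their own deity
shared = {("margaret", "d1"), ("neil", "d1")}     # both worship d1

print(reading_one(separate), reading_two(separate))  # False True
print(reading_one(shared), reading_two(shared))      # True True
```

When each villager has a separate deity only the second reading holds, which is the sense in which the first reading is the stronger claim.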
Predicate logic has a long pedigree. Its roots go back at least as far as Aristotle,
although in its current form it was developed starting in the late nineteenth century.
Associated with it are techniques for the analysis of many conceptual structures in
our common thought. Because these analytical techniques are well-understood, and
because it is relatively easy to express the formulae of predicate logic in AI
languages such as LISP or POP-11, it has been a very popular knowledge
representation symbolism within AI. Predicate logic also embodies a set of
systematic procedures for proving that certain formulae can or cannot be logically
derived from others and such logical inference procedures have been used as the
backbone for problem-solving systems in AI. Predicate logic is in itself an
extremely formal kind of representation mechanism. Its supporters believe,
however, that it can be used to fashion conceptual tools which reproduce much of
the subtlety and nuance of ordinary informal thinking.
A popular method for incorporating predicate logic in AI programs has involved a
machine-based inference procedure called resolution, first proposed by J. A.
Robinson (1965). This makes it relatively easy to represent expert, or
commonsense, knowledge in terms of a set of axioms expressed in a special form
of predicate calculus formulae and then derive consequences from these axioms.
Indeed an AI programming language has been developed called Prolog
(PROgramming in LOGic) which employs a resolution inference mechanism
together with a restricted form of predicate logic (Clocksin and Mellish, 1981) and
its proponents claim that it is a powerful tool for building knowledge-based
systems.
RESOLUTION
Problem Definition
Input:
1. Database containing formally represented facts: first-order logic sentences converted into clause form.
2. Inference rule: the resolution principle (which generalizes modus ponens, MP, and modus tollens, MT)
Goal: An inference procedure
Requirements:
1. Soundness – every sentence produced by the procedure will be “true”.
2. Completeness – every “true” sentence can be produced by the procedure.
Definitions
• Terms:
– Constants (e.g. “c1”, “c2”)
– Variables (e.g. “x1”, “x2”)
– Functions (e.g. “f(x1, x2)”)
• Predicate – Indicator function on terms.
– e.g. EVEN(t) : Numbers → {TRUE, FALSE}
• Atom – the application of a predicate to terms.
– e.g. EVEN(t)
• Literal – An atom or its negation
– e.g. EVEN(t), ¬EVEN(t)
Definitions
• Formulae – Recursively defined:
– Every atom is a formula
– If w1, w2 are formulae, then so are:
¬w1, w1 ∨ w2, w1 ∧ w2, w1 ⇒ w2, ∀x w1, ∃x w1
• Clause – Disjunction (or) of literals.
– e.g. L1 ∨ L2 ∨ ¬L3 (can be written as: L1, L2, ¬L3)
The Resolution Principle
• Given:
– A clause Φ containing the literal: φ
– A clause Ψ containing the literal: ¬φ
• We can conclude:
– (Φ – {φ}) ∪ (Ψ – {¬φ})
• Or in the generalized version:
• Given:
– A clause Φ containing the literal: φ
– A clause Ψ containing the literal: ¬ψ
– A most general unifier g of φ and ψ
• We can conclude:
– ((Φ – {φ}) ∪ (Ψ – {¬ψ})) | g, that is, the resolvent with the substitution g applied
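A minimal sketch of the ground (variable-free) version of the rule. The representation is an assumption of this example: a clause is a frozenset of literal strings, and a leading "~" marks negation. Unification, which the generalized version needs, is not implemented here.

```python
def negate(literal: str) -> str:
    """Return the complementary literal, using '~' as the negation marker."""
    return literal[1:] if literal.startswith("~") else "~" + literal


def resolve(clause_a: frozenset, clause_b: frozenset) -> list:
    """Return every resolvent of two ground clauses.

    For each literal in clause_a whose complement is in clause_b, the
    resolvent is the union of both clauses minus that complementary pair.
    """
    resolvents = []
    for literal in clause_a:
        if negate(literal) in clause_b:
            resolvents.append((clause_a - {literal}) | (clause_b - {negate(literal)}))
    return resolvents


# General case: {P, Q} and {~P, R} resolve on P to give {Q, R}.
print(resolve(frozenset({"P", "Q"}), frozenset({"~P", "R"})))

# Modus ponens as a special case: P and (P => Q), i.e. {P} and {~P, Q}, give {Q}.
print(resolve(frozenset({"P"}), frozenset({"~P", "Q"})))

# Modus tollens as a special case: ~Q and (P => Q) give {~P}.
print(resolve(frozenset({"~Q"}), frozenset({"~P", "Q"})))
```

An empty resolvent, as produced by resolving {P} with {~P}, represents a contradiction.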
The Resolution Procedure
• Let DB be a set of true sentences without contradictions, and C be a sentence we want to prove.
The Idea - proof by refutation:
• Assume ¬C and try to find a contradiction.
Intuition
• If all DB sentences are true, and assuming ¬C creates a contradiction, then C must follow from DB.
Steps:
1. Convert DB ∪ {¬C} to clause form.
2. If the current DB contains a contradiction (the empty clause), C is proved. Terminate.
3. Select two clauses and add their resolvents to the current DB. If there are no resolvable clauses, the procedure fails; terminate. Else, go to step 2.
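The loop can be sketched for ground clauses as follows. This is an illustration under stated assumptions, not the procedure from any particular system: clauses are frozensets of literal strings with "~" for negation, every pair of clauses is tried on each pass rather than a selected pair, and there is no unification.

```python
from itertools import combinations


def negate(literal: str) -> str:
    return literal[1:] if literal.startswith("~") else "~" + literal


def resolve(clause_a: frozenset, clause_b: frozenset) -> list:
    """All resolvents of two ground clauses."""
    return [
        (clause_a - {lit}) | (clause_b - {negate(lit)})
        for lit in clause_a
        if negate(lit) in clause_b
    ]


def refutes(clauses) -> bool:
    """Return True if the empty clause can be derived from the given clauses."""
    known = set(clauses)
    while True:
        new = set()
        for clause_a, clause_b in combinations(known, 2):
            for resolvent in resolve(clause_a, clause_b):
                if not resolvent:        # empty clause: contradiction found
                    return True
                new.add(resolvent)
        if new <= known:                 # nothing new can be derived: fail
            return False
        known |= new


# DB: P, and P => Q (clause {~P, Q}). To prove Q, add the negated goal {~Q}.
db = [frozenset({"P"}), frozenset({"~P", "Q"})]
print(refutes(db + [frozenset({"~Q"})]))  # True: Q follows from DB
print(refutes(db + [frozenset({"~R"})]))  # False: R cannot be derived
```

For ground clauses the loop always stops, because only finitely many clauses can be built from a finite set of literals.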
Conversion to Clause Form
1. Eliminate all ⇒:
– Replace A ⇒ B with ¬A ∨ B
2. Distribute negations:
– Replace ¬¬A with A
– ¬(A ∨ B) with ¬A ∧ ¬B
– …
3. Eliminate existential quantifiers by replacing with Skolem constants or functions:
– e.g. ∀x ∃y [P1(x,y) ∨ P2(x,y)] becomes ∀x [P1(x,f(x)) ∨ P2(x,f(x))]
4. Rename variables to avoid duplicates between different quantifiers.
5. Drop all universal quantifiers.
6. Put expression into conjunctive normal form (CNF).
7. Convert to clauses (sets of literals).
8. Rename variables to avoid duplicates between different clauses.
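Steps 1 and 2 can be sketched for quantifier-free formulae. The tuple encoding used here is an assumption of the example: an atom is a string, and compound formulae are ("not", f), ("and", f, g), ("or", f, g) or ("imp", f, g). Skolemization and the CNF distribution step are not covered.

```python
def eliminate_implications(formula):
    """Step 1: rewrite every A => B as (not A) or B."""
    if isinstance(formula, str):
        return formula
    op, *args = formula
    args = [eliminate_implications(a) for a in args]
    if op == "imp":
        return ("or", ("not", args[0]), args[1])
    return (op, *args)


def push_negations(formula):
    """Step 2: move negations inwards until they apply only to atoms."""
    if isinstance(formula, str):
        return formula
    op, *args = formula
    if op == "not":
        inner = args[0]
        if isinstance(inner, str):
            return formula                       # negated atom: already a literal
        inner_op, *inner_args = inner
        if inner_op == "not":                    # double negation
            return push_negations(inner_args[0])
        if inner_op == "and":                    # De Morgan: not (A and B)
            return ("or", *(push_negations(("not", a)) for a in inner_args))
        if inner_op == "or":                     # De Morgan: not (A or B)
            return ("and", *(push_negations(("not", a)) for a in inner_args))
    return (op, *(push_negations(a) for a in args))


# (A and B) => C
example = ("imp", ("and", "A", "B"), "C")
step1 = eliminate_implications(example)
step2 = push_negations(step1)
print(step1)  # ('or', ('not', ('and', 'A', 'B')), 'C')
print(step2)  # ('or', ('or', ('not', 'A'), ('not', 'B')), 'C')
```

The result is a disjunction of literals, so in this case it already corresponds to a single clause.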
Conversion to Clauses - Example
• Initial expression:
∀x [brick(x) ⇒ (∃y [on(x,y) ∧ bigger(y,x)]
  ∧ ¬∃y [on(y,x) ∧ brick(y)]
  ∧ ∀y,z [(on(x,y) ∧ on(x,z)) ⇒ equal(y,z)])]
• Remove implications:
∀x [¬brick(x) ∨ (∃y [on(x,y) ∧ bigger(y,x)]
  ∧ ¬∃y [on(y,x) ∧ brick(y)]
  ∧ ∀y,z [¬(on(x,y) ∧ on(x,z)) ∨ equal(y,z)])]
• Move negations inwards:
∀x [¬brick(x) ∨ (∃y [on(x,y) ∧ bigger(y,x)]
  ∧ ∀y [¬on(y,x) ∨ ¬brick(y)]
  ∧ ∀y,z [¬on(x,y) ∨ ¬on(x,z) ∨ equal(y,z)])]
• Remove existential quantifiers:
∀x [¬brick(x) ∨ ((on(x,support(x)) ∧ bigger(support(x),x))
  ∧ ∀y [¬on(y,x) ∨ ¬brick(y)]
  ∧ ∀y,z [¬on(x,y) ∨ ¬on(x,z) ∨ equal(y,z)])]
• Rename variables:
∀x [¬brick(x) ∨ ((on(x,support(x)) ∧ bigger(support(x),x))
  ∧ ∀y [¬on(y,x) ∨ ¬brick(y)]
  ∧ ∀w,z [¬on(x,w) ∨ ¬on(x,z) ∨ equal(w,z)])]
• Remove universal quantifiers:
¬brick(x) ∨ ((on(x,support(x)) ∧ bigger(support(x),x))
  ∧ (¬on(y,x) ∨ ¬brick(y))
  ∧ (¬on(x,w) ∨ ¬on(x,z) ∨ equal(w,z)))
• Convert to CNF:
(¬brick(x) ∨ on(x,support(x)))
∧ (¬brick(x) ∨ bigger(support(x),x))
∧ (¬brick(x) ∨ ¬on(y,x) ∨ ¬brick(y))
∧ (¬brick(x) ∨ ¬on(x,w) ∨ ¬on(x,z) ∨ equal(w,z))
• Convert to clauses:
{¬brick(x), on(x,support(x))}
{¬brick(x), bigger(support(x),x)}
{¬brick(x), ¬on(y,x), ¬brick(y)}
{¬brick(x), ¬on(x,w), ¬on(x,z), equal(w,z)}
• Rename variables:
{¬brick(x1), on(x1,support(x1))}
{¬brick(x2), bigger(support(x2),x2)}
{¬brick(x3), ¬on(y,x3), ¬brick(y)}
{¬brick(x4), ¬on(x4,w), ¬on(x4,z), equal(w,z)}
Simple Example
• The problem:
– “Heads I win, tails you lose.”
– Use resolution to show I always win.
• Facts representation:
1. H → Win(Me)
2. T → Lose(You)
3. ¬H → T
4. Lose(You) → Win(Me)
Goal: Win(Me)
• Proof:
1. {¬H, Win(Me)}
2. {¬T, Lose(You)}
3. {H, T}
4. {¬Lose(You), Win(Me)}
5. {¬Win(Me)} (negated goal)
6. {¬T, Win(Me)} (from 2, 4)
7. {T, Win(Me)} (from 1, 3)
8. {Win(Me)} (from 6, 7)
9. {} (empty clause, from 5, 8)
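The refutation above can be checked mechanically. The following is a minimal sketch of propositional resolution by saturation, assuming a string encoding of literals in which a leading "~" marks negation; the names negate, resolvents and refutes are illustrative and do not come from the slides.

```python
from itertools import combinations

def negate(lit):
    """Return the complementary literal ("~H" <-> "H")."""
    return lit[1:] if lit.startswith("~") else "~" + lit

def resolvents(c1, c2):
    """All clauses obtained by resolving c1 and c2 on one complementary pair."""
    out = set()
    for lit in c1:
        if negate(lit) in c2:
            out.add(frozenset((c1 - {lit}) | (c2 - {negate(lit)})))
    return out

def refutes(clauses):
    """True if the empty clause can be derived from the clause set."""
    clauses = set(clauses)
    while True:
        new = set()
        for c1, c2 in combinations(clauses, 2):
            for r in resolvents(c1, c2):
                if not r:             # empty clause: contradiction found
                    return True
                new.add(r)
        if new <= clauses:            # nothing new: no refutation exists
            return False
        clauses |= new

kb = [frozenset(c) for c in (
    {"~H", "WinMe"},        # 1. H -> Win(Me)
    {"~T", "LoseYou"},      # 2. T -> Lose(You)
    {"H", "T"},             # 3. not H -> T
    {"~LoseYou", "WinMe"},  # 4. Lose(You) -> Win(Me)
)]
negated_goal = frozenset({"~WinMe"})
proved = refutes(kb + [negated_goal])
```

Adding the negated goal makes the clause set unsatisfiable, so proved is True; the four facts on their own are consistent, so refutes(kb) returns False.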
STRUCTURED REPRESENTATION OF KNOWLEDGE
Representing knowledge using a logical formalism, such as predicate logic, has several advantages.
Logical formalisms can be combined with powerful inference mechanisms like resolution, which makes reasoning with facts easy. However, complex structures of the world (objects and their relationships, events, sequences of events, etc.) cannot be described easily using a logical formalism.
A good system for the representation of structured knowledge in a particular domain should possess the following four properties:
(i) Representational Adequacy: The ability to represent all kinds of knowledge that are needed in that domain.
(ii) Inferential Adequacy: The ability to manipulate the represented structures and infer new structures.
(iii) Inferential Efficiency: The ability to incorporate additional information into the knowledge structure that will aid the inference mechanisms.
(iv) Acquisitional Efficiency: The ability to acquire new information easily, either by direct insertion or by program control.
The techniques that have been developed in AI systems to accomplish these objectives fall under two categories:
1. Declarative Methods: Knowledge is represented as a static collection of facts that are manipulated by general procedures. The facts need to be stored only once and can be used in any number of ways. Facts can easily be added to declarative systems without changing the general procedures.
2. Procedural Methods: Knowledge is represented as procedures. Default reasoning and probabilistic reasoning are examples of procedural methods. In these, heuristic knowledge of “how to do things efficiently” can be easily represented.
In practice, most knowledge representation systems employ a combination of both. Most knowledge representation structures have been developed for programs that handle natural language input. One of the reasons that knowledge structures are so important is that they provide a way to represent information about commonly occurring patterns of things. Such descriptions are sometimes called schemas. One definition of schema is
“Schema refers to an active organization of the past reactions, or of past experience, which must always be supposed to be operating in any well adapted organic response”.
By using schemas, people as well as programs can exploit the fact that the real world is not random. Several types of schemas have proved useful in AI programs. They include:
(i) Frames: Used to describe a collection of attributes that a given object possesses (e.g. a description of a chair).
(ii) Scripts: Used to describe common sequences of events (e.g. a restaurant scene).
(iii) Stereotypes: Used to describe characteristics of people.
(iv) Rule models: Used to describe common features shared among a set of rules in a production system.
Frames and scripts are used very extensively in a variety of AI programs. Before selecting any specific knowledge representation structure, the following issues have to be considered.
(i) The basic properties of objects, if any, that are common to every problem domain must be identified and handled appropriately.
(ii) The entire knowledge should be represented in terms of a good set of primitives.
(iii) Mechanisms must be devised to access the relevant parts of a large knowledge base.
UNIT 03 KNOWLEDGE REPRESENTATION
• Knowledge is a general term.
An answer to the question "how to represent knowledge" requires an analysis that distinguishes between knowing “how” and knowing “that”.
− Knowing "how to do something", e.g. "how to drive a car", is procedural knowledge.
− Knowing "that something is true or false", e.g. "what the speed limit is for a car on a motorway", is declarative knowledge.
• Knowledge and representation are two distinct entities. They play central but distinguishable roles in an intelligent system.
− Knowledge is a description of the world. It determines a system's competence by what it knows.
− Representation is the way knowledge is encoded. It defines a system's performance in doing something.
• Different types of knowledge require different kinds of representation. Knowledge representation models/mechanisms are often based on:
◊ Logic ◊ Rules
◊ Frames ◊ Semantic Net
• Different types of knowledge require different kinds of reasoning.
KR - Introduction
1. Introduction
Knowledge is a general term.
Knowledge is a progression that starts with data, which is of limited utility.
By organizing or analyzing the data, we understand what the data means, and this becomes information.
The interpretation or evaluation of information yields knowledge.
An understanding of the principles embodied within the knowledge is wisdom.
• Knowledge Progression
Data → (organizing, analyzing) → Information → (interpretation, evaluation) → Knowledge → (understanding principles) → Wisdom
Fig 1 Knowledge Progression
◊ Data is viewed as a collection of disconnected facts.
Example: It is raining.
◊ Information emerges when relationships among facts are established and understood; it provides answers to "who", "what", "where", and "when".
Example: The temperature dropped 15 degrees and then it started raining.
◊ Knowledge emerges when relationships among patterns are identified and understood; it provides answers to "how".
Example: If the humidity is very high and the temperature drops substantially, then the atmosphere is unlikely to hold the moisture, so it rains.
◊ Wisdom is the pinnacle of understanding; it uncovers the principles of relationships that describe patterns and provides answers to "why".
Example: Encompasses understanding of all the interactions that happen between raining, evaporation, air currents, temperature gradients and changes.
• Knowledge Model (Bellinger 1980)
A knowledge model tells us that as the degree of “connectedness” and “understanding” increases, we progress from data through information and knowledge to wisdom.
Fig. Knowledge Model (axes: degree of connectedness and degree of understanding; stages: Data, then Information by understanding relations, then Knowledge by understanding patterns, then Wisdom by understanding principles)
The model represents transitions and understanding:
− the transitions are from data, to information, to knowledge, and finally to wisdom;
− the understanding supports the transitions from one stage to the next.
The distinctions between data, information, knowledge, and wisdom are not very discrete. They are more like shades of gray than black and white (Shedroff, 2001).
− "Data" and "information" deal with the past; they are based on the gathering of facts and adding context.
− "Knowledge" deals with the present; it enables us to perform.
− "Wisdom" deals with the future; it acquires vision for what will be, rather than for what is or was.
• Knowledge Category
Knowledge is categorized into two major types: Tacit and Explicit.
− The term “Tacit” corresponds to the "informal" or "implicit" type of knowledge.
− The term “Explicit” corresponds to the "formal" type of knowledge.
Tacit knowledge
◊ Exists within a human being; it is embodied.
◊ Difficult to articulate formally.
◊ Difficult to communicate or share.
◊ Hard to steal or copy.
◊ Drawn from experience, action, subjective insight.
Explicit knowledge
◊ Exists outside a human being; it is embedded.
◊ Can be articulated formally.
◊ Can be shared, copied, processed and stored.
◊ Easy to steal or copy.
◊ Drawn from an artifact of some type, such as a principle, procedure, process, or concept.
(The next slide explains more about tacit and explicit knowledge.)
• Knowledge Typology Map
The map shows two types of knowledge: tacit and explicit.
− Tacit knowledge comes from "experience", "action", and "subjective insight".
− Explicit knowledge comes from "principle", "procedure", "process", and "concepts", via transcribed content or an artifact of some type.
Fig. Knowledge Typology Map (node labels: Experience, Doing (action), Subjective Insight, Tacit Knowledge, Explicit Knowledge, Knowledge, Principles, Procedure, Process, Concept, Context, Information, Data, Facts)
◊ Facts: data or instances that are specific and unique.
◊ Concepts: classes of items, words, or ideas that are known by a common name and share common features.
◊ Processes: flows of events or activities that describe how things work rather than how to do things.
◊ Procedures: series of step-by-step actions and decisions that result in the achievement of a task.
◊ Principles: guidelines, rules, and parameters that govern; principles make it possible to make predictions and draw implications.
These artifacts are used in the knowledge creation process to create two types of knowledge, declarative and procedural, explained below.
• Knowledge Type
Cognitive psychologists sort knowledge into Declarative and Procedural categories, and some researchers add Strategic as a third category.
‡ About procedural knowledge, there is some disparity in views.
− One view is that it is close to tacit knowledge: it manifests itself in the doing of something yet cannot be expressed in words; e.g., we read faces and moods.
− Another view is that it is close to declarative knowledge; the difference is that a task or method is described instead of facts or things.
‡ All declarative knowledge is explicit knowledge; it is knowledge that can be and has been articulated.
‡ Strategic knowledge is thought of as a subset of declarative knowledge.
Procedural knowledge
◊ Knowledge about "how to do something"; e.g., to determine if Peter or Robert is older, first find their ages.
◊ Focuses on tasks that must be performed to reach a particular objective or goal.
◊ Examples: procedures, rules, strategies, agendas, models.
Declarative knowledge
◊ Knowledge about "that something is true or false"; e.g., a car has four tyres; Peter is older than Robert.
◊ Refers to representations of objects and events; knowledge about facts and relationships.
◊ Examples: concepts, objects, facts, propositions, assertions, semantic nets, logic and descriptive models.
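The Peter and Robert example can be written both ways. In this minimal sketch the ages and the names known_older and compute_older are hypothetical, chosen only to contrast a stored fact with a procedure that derives the answer.

```python
# Declarative: the relationship is stored as a fact and simply looked up.
declarative_facts = {("older", "Peter", "Robert")}

def known_older(a, b):
    return ("older", a, b) in declarative_facts

# Procedural: the answer is computed by first finding the ages.
ages = {"Peter": 34, "Robert": 29}   # hypothetical ages, for illustration only

def compute_older(a, b):
    return ages[a] > ages[b]
```

The declarative version can answer only what has been asserted, while the procedural version works for any pair of people whose ages are known.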
• Relationship among Knowledge Types
The relationships among explicit, implicit, tacit, declarative and procedural knowledge are illustrated below.
Fig. Relationship among types of knowledge (a flowchart starting from Knowledge, with Yes/No decisions "Has been articulated" and "Can not be articulated" leading to Explicit, Implicit and Tacit; further labels: Declarative, Describing, Facts and things, Tasks and methods, Procedural, Doing, Motor Skill (Manual), Mental Skill)
The figure shows:
− Declarative knowledge is tied to "describing" and procedural knowledge is tied to "doing".
− The vertical arrows connecting explicit with declarative and tacit with procedural indicate the strong relationships that exist among them.
− The horizontal arrow connecting declarative and procedural indicates that we often develop procedural knowledge as a result of starting with declarative knowledge; i.e., we often "know about" before we "know how".
Therefore, we may view:
− all procedural knowledge as tacit knowledge, and
− all declarative knowledge as explicit knowledge.
KR - framework
1.1 Framework of Knowledge Representation (Poole 1998)
A computer requires a well-defined problem description to process and to provide a well-defined, acceptable solution.
To collect fragments of knowledge, we first need to formulate a description in our spoken language and then represent it in a formal language so that the computer can understand it. The computer can then use an algorithm to compute an answer. This process is illustrated below.
Fig. Knowledge Representation Framework (labels: Problem, Solve, Solution, Represent, Interpret, Informal, Formal, Compute, Representation, Output)
The steps are:
− An informal formulation of the problem takes place first.
− It is then represented formally, and the computer produces an output.
− This output can then be interpreted as an informally described solution that the user understands or checks for consistency.
Note: Problem solving requires
− formal knowledge representation, and
− conversion of informal knowledge to formal knowledge, that is, conversion of implicit knowledge to explicit knowledge.
• Knowledge and Representation
Problem solving requires a large amount of knowledge and some mechanism for manipulating that knowledge.
Knowledge and representation are distinct entities that play central but distinguishable roles in an intelligent system.
− Knowledge is a description of the world; it determines a system's competence by what it knows.
− Representation is the way knowledge is encoded; it defines the system's performance in doing something.
In simple words, we:
− need to know about the things we want to represent, and
− need some means by which we can manipulate these things.
◊ Things we need to know in order to represent:
‡ Objects - facts about objects in the domain.
‡ Events - actions that occur in the domain.
‡ Performance - knowledge about how to do things.
‡ Meta-knowledge - knowledge about what we know.
◊ Means to manipulate what we represent:
‡ Requires some formalism.
Thus, knowledge representation can be considered at two levels:
(a) the knowledge level, at which facts are described, and
(b) the symbol level, at which the representations of the objects, defined in terms of symbols, can be manipulated in programs.
Note: A good representation enables fast and accurate access to knowledge and understanding of the content.
• Mapping between Facts and Representation
Knowledge is a collection of “facts” from some domain.
We need a representation of "facts" that can be manipulated by a program. Normal English is insufficient: it is currently too hard for a computer program to draw inferences in natural languages.
Thus some symbolic representation is necessary.
Therefore, we must be able to map "facts to symbols" and "symbols to facts" using forward and backward representation mapping.
Fig. Facts and Representations (labels: Facts, Internal Representation, Reasoning programs, English Representation, English understanding, English generation)
Example: Consider an English sentence
◊ Spot is a dog : a fact represented as an English sentence.
◊ dog(Spot) : the same fact represented in logic, using the forward mapping function.
◊ ∀ x : dog(x) → hastail(x) : a logical representation of the fact that "all dogs have tails".
Now, using a deductive mechanism, we can generate a new representation object:
◊ hastail(Spot) : a new object representation [it is new knowledge].
◊ Spot has a tail : the English sentence generated using the backward mapping function.
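The Spot example can be traced end to end in a few lines. This is a minimal sketch that handles only sentences of the form "X is a Y"; the function names forward_map, deduce and backward_map, the RULES table and the TEMPLATES table are assumptions made for illustration.

```python
import re

def forward_map(sentence):
    """English -> internal representation, e.g. "Spot is a dog" -> ("dog", "Spot")."""
    m = re.fullmatch(r"(\w+) is an? (\w+)", sentence)
    if m is None:
        raise ValueError("unsupported sentence: " + sentence)
    return (m.group(2), m.group(1))

# One rule, read as: for all x, dog(x) -> hastail(x)
RULES = {"dog": "hastail"}

def deduce(fact):
    """Apply the rule table once; return the new fact or None."""
    predicate, argument = fact
    if predicate in RULES:
        return (RULES[predicate], argument)
    return None

# Sentence templates used by the backward mapping (illustrative)
TEMPLATES = {"hastail": "{} has a tail"}

def backward_map(fact):
    """Internal representation -> English."""
    predicate, argument = fact
    return TEMPLATES[predicate].format(argument)

fact = forward_map("Spot is a dog")        # ("dog", "Spot")
new_fact = deduce(fact)                    # ("hastail", "Spot")
sentence = backward_map(new_fact)          # "Spot has a tail"
```

The reasoning happens entirely on the internal tuples; English appears only at the two mapping steps.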
Forward and Backward Representation
The forward and backward representations are elaborated below:
Fig. Forward and backward representation mapping (labels: Initial Facts, Desired real reasoning, Final Facts, Forward representation mapping, Backward representation mapping, Internal Representation, English Representation, Operated by program)
‡ The dotted line on top indicates the abstract reasoning process that a program is intended to model.
‡ The solid lines on the bottom indicate the concrete reasoning process that the program performs.
• KR System Requirements
A good knowledge representation enables fast and accurate access to knowledge and understanding of the content.
A knowledge representation system should have the following properties.
◊ Representational Adequacy: the ability to represent all kinds of knowledge that are needed in that domain.
◊ Inferential Adequacy: the ability to manipulate the representational structures to derive new structures corresponding to new knowledge inferred from old.
◊ Inferential Efficiency: the ability to incorporate additional information into the knowledge structure that can be used to focus the attention of the inference mechanisms in the most promising direction.
◊ Acquisitional Efficiency: the ability to acquire new knowledge using automatic methods wherever possible, rather than relying on human intervention.
Note: To date, no single system optimizes all of the above properties.
KR - schemes
1.2 Knowledge Representation Schemes
There are four types of knowledge representation:
Relational, Inheritable, Inferential, and Declarative/Procedural.
◊ Relational Knowledge
− provides a framework to compare two objects based on equivalent attributes.
− any instance in which two different objects are compared is a relational type of knowledge.
◊ Inheritable Knowledge
− is obtained from associated objects.
− prescribes a structure in which new objects are created that may inherit all or a subset of attributes from existing objects.
◊ Inferential Knowledge
− is inferred from objects through relations among objects.
− e.g., a word alone is simple syntax, but with the help of the other words in a phrase the reader may infer more from the word; in linguistics this inference is called semantics.
◊ Declarative Knowledge
− a statement in which knowledge is specified, but the use to which that knowledge is to be put is not given.
− e.g. laws, people's names; these are facts that can stand alone and do not depend on other knowledge.
◊ Procedural Knowledge
− a representation in which the control information needed to use the knowledge is embedded in the knowledge itself.
− e.g. computer programs, directions, and recipes; these indicate a specific use or implementation.
These KR schemes are detailed in the next few slides.
• Relational Knowledge :
This knowledge associates elements of one domain with another domain.
− Relational knowledge is made up of objects consisting of attributes and their corresponding associated values.
− The result of this knowledge type is a mapping of elements among different domains.
The table below shows a simple way to store facts.
− The facts about a set of objects are put systematically in columns.
− This representation provides little opportunity for inference.
Table - Simple Relational Knowledge
Player     Height   Weight   Bats - Throws
Aaron      6-0      180      Right - Right
Mays       5-10     170      Right - Right
Ruth       6-2      215      Left - Left
Williams   6-3      205      Left - Right
‡ Given the facts alone, it is not possible to answer a simple question such as "Who is the heaviest player?". But if a procedure for finding the heaviest player is provided, then these facts will enable that procedure to compute an answer.
‡ We can ask things like who "bats - left" and "throws - right".
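The table and the two kinds of question can be sketched as follows. The rows come from the table above; the list-of-dictionaries layout and the helper names heaviest and select are assumptions made for illustration.

```python
# The relational facts from the table, one dictionary per row.
players = [
    {"player": "Aaron",    "height": "6-0",  "weight": 180, "bats": "Right", "throws": "Right"},
    {"player": "Mays",     "height": "5-10", "weight": 170, "bats": "Right", "throws": "Right"},
    {"player": "Ruth",     "height": "6-2",  "weight": 215, "bats": "Left",  "throws": "Left"},
    {"player": "Williams", "height": "6-3",  "weight": 205, "bats": "Left",  "throws": "Right"},
]

def heaviest(rows):
    """The extra procedure the facts need: find the player with maximum weight."""
    return max(rows, key=lambda row: row["weight"])["player"]

def select(rows, **conditions):
    """Plain lookup: names of players whose attributes match all conditions."""
    return [row["player"] for row in rows
            if all(row[key] == value for key, value in conditions.items())]

heaviest_player = heaviest(players)                        # "Ruth"
left_right = select(players, bats="Left", throws="Right")  # ["Williams"]
```

The lookup in select needs nothing beyond the stored facts, whereas heaviest is a separate procedure supplied on top of them.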
• Inheritable Knowledge :
Here the knowledge elements inherit attributes from their parents.
The knowledge is embodied in the design hierarchies found in the functional, physical and process domains. Within the hierarchy, elements inherit attributes from their parents, but in many cases not all attributes of the parent elements are prescribed to the child elements.
Inheritance is a powerful form of inference, but it is not adequate on its own. The basic KR needs to be augmented with an inference mechanism.
The KR in hierarchical structure, shown below, is called a “semantic network”, a collection of “frames”, or a “slot-and-filler structure". The structure shows property inheritance and a way to insert additional knowledge.
Property inheritance: The objects or elements of specific classes inherit attributes and values from more general classes. The classes are organized in a generalized hierarchy.
Baseball knowledge
− isa : shows class inclusion
− instance : shows class membership
Fig. Inheritable knowledge representation (KR)
Person : handed = Right
Adult-Male : isa = Person, height = 5.10
Baseball-Player : isa = Adult-Male, height = 6.1, bats = EQUAL handed, batting-average = 0.252
Pitcher : isa = Baseball-Player, batting-average = 0.106
Fielder : isa = Baseball-Player, batting-average = 0.262
Three-Finger-Brown : instance = Pitcher, team = Chicago-Cubs
Pee-Wee-Reese : instance = Fielder, team = Brooklyn-Dodgers
‡ The directed arrows represent attributes (isa, instance, team); each originates at the object being described and terminates at the object or its value.
‡ The box nodes represent objects and values of the attributes.
[Continued in the next slide]
[from previous slide - example]
◊ Viewing a node as a frame
Example : Baseball-Player
isa : Adult-Male
bats : EQUAL handed
height : 6.1
batting-average : 0.252
◊ Algorithm : Property Inheritance
Retrieve a value V for an attribute A of an instance object O.
Steps to follow:
1. Find object O in the knowledge base.
2. If there is a value for the attribute A, then report that value.
3. Else, see if there is a value for the attribute “instance”; if not, then fail.
4. Else, move to the node corresponding to that value and look for a value for the attribute A; if one is found, report it.
5. Else, do until there is no value for the “isa” attribute or until an answer is found:
(a) Get the value of the “isa” attribute and move to that node.
(b) See if there is a value for the attribute A; if yes, report it.
This algorithm is simple. It describes the basic mechanism of inheritance. It does not say what to do if there is more than one value of the “instance” or “isa” attribute.
It can be applied to the example knowledge base illustrated in the previous slide to derive answers to the following queries:
− team (Pee-Wee-Reese) = Brooklyn-Dodgers
− batting-average (Three-Finger-Brown) = 0.106
− height (Pee-Wee-Reese) = 6.1
− bats (Three-Finger-Brown) = Right
[For explanation - refer book on AI by Elaine Rich & Kevin Knight, page 112]
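The property inheritance algorithm can be sketched against the baseball knowledge base. The dictionary layout and the name get_value are assumptions made for illustration; the handling of the EQUAL marker (look up the named attribute on the same object) is also an assumption, since the basic algorithm above does not cover it.

```python
# Frames from the baseball figure; values are kept as written on the slide.
KB = {
    "Person":             {"handed": "Right"},
    "Adult-Male":         {"isa": "Person", "height": "5.10"},
    "Baseball-Player":    {"isa": "Adult-Male", "height": "6.1",
                           "bats": ("EQUAL", "handed"), "batting-average": 0.252},
    "Pitcher":            {"isa": "Baseball-Player", "batting-average": 0.106},
    "Fielder":            {"isa": "Baseball-Player", "batting-average": 0.262},
    "Three-Finger-Brown": {"instance": "Pitcher", "team": "Chicago-Cubs"},
    "Pee-Wee-Reese":      {"instance": "Fielder", "team": "Brooklyn-Dodgers"},
}

def get_value(obj, attr):
    """Retrieve attribute attr of instance obj, or None on failure."""
    frame = KB.get(obj)                      # step 1
    if frame is None:
        return None
    value = frame.get(attr)                  # step 2
    if value is None:
        node = frame.get("instance")         # step 3 (None means fail)
        while node is not None and value is None:
            value = KB[node].get(attr)       # steps 4 and 5(b)
            node = KB[node].get("isa")       # step 5(a)
        if value is None:
            return None
    # Assumed extension: ("EQUAL", other) means "same value as attribute other".
    if isinstance(value, tuple) and value[0] == "EQUAL":
        return get_value(obj, value[1])
    return value
```

With this knowledge base, get_value("Pee-Wee-Reese", "height") climbs from Fielder to Baseball-Player and reports 6.1, and get_value("Three-Finger-Brown", "bats") follows the EQUAL marker to handed and reports Right.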
• Inferential Knowledge :
This knowledge generates new information from the given information.
This new information does not require further data gathering from the source, but does require analysis of the given information to generate new knowledge.
− Given a set of relations and values, one may infer other values or relations.
− Predicate logic (mathematical deduction) is used to infer from a set of attributes.
− Inference through predicate logic uses a set of logical operations to relate individual data.
− The symbols used for the logic operations are:
" → " (implication), " ¬ " (not), " ∨ " (or), " ∧ " (and),
" ∀ " (for all), " ∃ " (there exists).
Examples of predicate logic statements:
1. "Wonder" is the name of a dog : dog(wonder)
2. All dogs belong to the class of animals : ∀ x : dog(x) → animal(x)
3. All animals live either on land or in water : ∀ x : animal(x) → live(x, land) ∨ live(x, water)
From these three statements we can infer that:
"Wonder lives either on land or in water."
Note: If more information is made available about these objects and their relations, then more knowledge can be inferred.
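The inference about Wonder can be reproduced with a few lines of forward chaining. This is a minimal sketch: the fact tuples, the rule table IMPLICATIONS and the function names infer and possible_habitats are assumptions made for illustration, and the disjunction in statement 3 is represented simply as a set of alternatives.

```python
facts = {("dog", "wonder")}                 # statement 1
IMPLICATIONS = [("dog", "animal")]          # statement 2: dog(x) -> animal(x)

def infer(known):
    """Apply the one-place implication rules until nothing new is derived."""
    derived = set(known)
    changed = True
    while changed:
        changed = False
        for premise, conclusion in IMPLICATIONS:
            for predicate, x in list(derived):
                if predicate == premise and (conclusion, x) not in derived:
                    derived.add((conclusion, x))
                    changed = True
    return derived

def possible_habitats(x, known):
    """Statement 3: animal(x) -> live(x, land) or live(x, water)."""
    if ("animal", x) in infer(known):
        return {"land", "water"}
    return set()

habitats = possible_habitats("wonder", facts)   # {"land", "water"}
```

The result is a set of alternatives rather than a single fact, which mirrors the "either ... or" in the conclusion.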
• Declarative/Procedural Knowledge
The difference between declarative and procedural knowledge is not very clear.
Declarative knowledge:
Here, the knowledge is based on declarative facts about axioms and domains.
− Axioms are assumed to be true unless a counterexample is found to invalidate them.
− Domains represent the physical world and the perceived functionality.
− Axioms and domains thus simply exist and serve as declarative statements that can stand alone.
Procedural knowledge:
Here, the knowledge is a mapping process between domains that specifies “what to do when”, and the representation is of “how to make it” rather than “what it is”. Procedural knowledge:
− may have inferential efficiency, but no inferential adequacy or acquisitional efficiency.
− is represented as small programs that know how to do specific things and how to proceed.
Example: A parser for a natural language has the knowledge that a noun phrase may contain articles, adjectives and nouns. It accordingly calls routines that know how to process articles, adjectives and nouns.
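The parser example can be made concrete with a toy recognizer. This is a minimal sketch, assuming a noun phrase is an optional article, any number of adjectives, and one noun; the word lists and routine names are hypothetical.

```python
ARTICLES = {"a", "an", "the"}
ADJECTIVES = {"big", "red", "old"}      # tiny illustrative lexicon
NOUNS = {"dog", "ball", "car"}

def parse_article(words, i):
    """Consume one optional article; return the next position."""
    return i + 1 if i < len(words) and words[i] in ARTICLES else i

def parse_adjectives(words, i):
    """Consume zero or more adjectives; return the next position."""
    while i < len(words) and words[i] in ADJECTIVES:
        i += 1
    return i

def parse_noun(words, i):
    """Consume exactly one noun; return the next position or None."""
    return i + 1 if i < len(words) and words[i] in NOUNS else None

def is_noun_phrase(text):
    """The knowledge "how to parse a noun phrase" lives in the call order."""
    words = text.lower().split()
    i = parse_article(words, 0)
    i = parse_adjectives(words, i)
    end = parse_noun(words, i)
    return end is not None and end == len(words)
```

The grammar of a noun phrase is not stored anywhere as a fact; it is embedded in the order in which is_noun_phrase calls the routines, which is what makes the knowledge procedural.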
KR - issues
1.3 Issues in Knowledge Representation
The fundamental goal of knowledge representation is to facilitate inferencing (drawing conclusions) from knowledge.
Many issues arise while using KR techniques. Some of these are explained below.
◊ Important Attributes :
Are there any attributes of objects so basic that they occur in almost every problem domain?
◊ Relationship among attributes :
Are there any important relationships that exist among object attributes?
◊ Choosing Granularity :
At what level of detail should the knowledge be represented?
◊ Set of objects :
How should sets of objects be represented?
◊ Finding Right structure :
Given a large amount of stored knowledge, how can the relevant parts be accessed?
Note: These issues are briefly explained with reference to the previous example, Fig. Inheritable KR. For details, readers may refer to the book on AI by Elaine Rich & Kevin Knight, pages 115 - 126.
• Important Attributes : (Ref. Example - Fig. Inheritable KR)
Some attributes are of general significance. Two attributes, "instance" and "isa", are of general importance because they support property inheritance.
• Relationship among Attributes : (Ref. Example - Fig. Inheritable KR)
The attributes used to describe objects are themselves entities that we represent.
The relationships between the attributes of an object, independent of the specific knowledge they encode, may involve properties such as: inverses, existence in an isa hierarchy, techniques for reasoning about values, and single-valued attributes.
◊ Inverses :
This is about consistency checking when a value is added to one attribute. Entities are related to each other in many different ways.
The figure shows attributes (isa, instance, and team), each with a directed arrow originating at the object being described and terminating either at the object or at its value.
There are two ways of realizing this:
‡ first, represent both relationships in a single representation; e.g., a logical representation, team(Pee-Wee-Reese, Brooklyn-Dodgers), that can be interpreted as a statement about Pee-Wee-Reese or about the Brooklyn-Dodgers.
‡ second, use attributes that focus on a single entity but use them in pairs, one the inverse of the other; e.g., one, team = Brooklyn-Dodgers, and the other, team-members = Pee-Wee-Reese, . . . .
This second approach is followed in semantic net and frame-based systems, accompanied by a knowledge acquisition tool that guarantees the consistency of inverse slots by checking that, each time a value is added to one attribute, the corresponding value is added to the inverse.
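The consistency check for inverse slots can be sketched as a single update routine. This is a minimal illustration, assuming frames are stored as dictionaries of sets; the slot name team-members, the INVERSE table and the function add_value are assumptions made for this sketch.

```python
# Each slot is paired with its inverse (illustrative slot names).
INVERSE = {"team": "team-members", "team-members": "team"}

frames = {}   # object name -> {slot name -> set of values}

def add_value(obj, slot, value):
    """Add a value to a slot and keep the inverse slot consistent."""
    frames.setdefault(obj, {}).setdefault(slot, set()).add(value)
    inverse = INVERSE.get(slot)
    if inverse is not None:
        # Record the same relationship from the other entity's side.
        frames.setdefault(value, {}).setdefault(inverse, set()).add(obj)

add_value("Pee-Wee-Reese", "team", "Brooklyn-Dodgers")
```

After the single call, the Brooklyn-Dodgers frame lists Pee-Wee-Reese under team-members, so both directions of the relationship can be queried.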
◊ Existence in an "isa" hierarchy :
This is about generalization-specialization, such as classes of objects and specialized subsets of those classes. In the same way, there are attributes and specializations of attributes.
Example: the attribute "height" is a specialization of the general attribute "physical-size", which is, in turn, a specialization of "physical-attribute".
These generalization-specialization relationships for attributes are important because they support inheritance.
◊ Techniques for reasoning about values :
This is about reasoning about values of attributes that are not given explicitly.
Several kinds of information are used in such reasoning, for example:
height : must be in a unit of length,
age : a person's age cannot be greater than the age of the person's parents.
These constraints are often specified when a knowledge base is created.
◊ Single valued attributes :
This is about a specific attribute that is guaranteed to take a unique value.
Example: A baseball player can, at any one time, have only a single height and be a member of only one team. KR systems take different approaches to providing support for single-valued attributes.
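One possible approach to single-valued attributes is to let a new value replace the old one. The sketch below illustrates only that approach and is not a description of any particular KR system; the SINGLE_VALUED set, the function set_value and the multi-valued slot positions are hypothetical.

```python
# Slots assumed, for this sketch, to hold exactly one value at a time.
SINGLE_VALUED = {"height", "team"}

def set_value(frame, slot, value):
    """Replace the value of a single-valued slot; append to any other slot."""
    if slot in SINGLE_VALUED:
        frame[slot] = [value]
    else:
        frame.setdefault(slot, []).append(value)
    return frame

player = {}
set_value(player, "team", "Brooklyn-Dodgers")
set_value(player, "team", "Chicago-Cubs")        # replaces the earlier team
set_value(player, "positions", "shortstop")      # hypothetical multi-valued slot
set_value(player, "positions", "third-base")
```

After these calls the team slot holds only the latest value, while positions keeps both entries.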
KR - issues
• Choosing Granularity
At what level should the knowledge be represented and
what are the primitives ?
Should there be a small or a large number of primitives, and should
they be low-level primitives or High-level facts ?
High-level facts may not be adequate for inference while Low-level
primitives may require a lot of storage.
Example of Granularity :
− Suppose we are interested in the following fact
John spotted Sue.
− This could be represented as
Spotted (agent(John), object (Sue))
− Such a representation would make it easy to answer questions such as
Who spotted Sue ?
− Suppose we want to know
Did John see Sue ?
− Given only this one fact, we cannot discover that answer.
− We can add other facts, such as
Spotted (x , y) → saw (x , y)
− We can now infer the answer to the question.
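The granularity example can be sketched in code. This is a toy illustration under stated assumptions: facts are stored as (predicate, agent, object) triples, and the names facts, IMPLIES, holds and who are invented for this sketch.

```python
# Known facts, stored as (predicate, agent, object) triples.
facts = {("spotted", "John", "Sue")}

# Rule from the slide: Spotted(x, y) -> saw(x, y).
IMPLIES = {"spotted": "saw"}

def holds(predicate, agent, obj):
    """True if the fact is stored or follows from one implication rule."""
    if (predicate, agent, obj) in facts:
        return True
    return any(
        (specific, agent, obj) in facts
        for specific, general in IMPLIES.items()
        if general == predicate
    )

def who(predicate, obj):
    """Answer 'Who <predicate> <obj>?' from stored facts only."""
    return {a for (p, a, o) in facts if p == predicate and o == obj}
```

Here who("spotted", "Sue") is answered directly from the stored fact, while holds("saw", "John", "Sue") succeeds only because of the added rule.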
KR - issues
• Set of Objects
Certain properties of objects are true of them as members of a set but
not as individuals;
Example : Consider the assertions made in the sentences
"there are more sheep than people in Australia", and
"English speakers can be found all over the world."
To describe these facts, the only way is to attach assertions to the sets
representing people, sheep, and English speakers.
The reason to represent sets of objects is :
If a property is true for all or most elements of a set,
then it is more efficient to associate it once with the set
rather than to associate it explicitly with every element of the set.
This is done in different ways :
− in logical representation through the use of the universal quantifier, and
− in hierarchical structures where nodes represent sets, inheritance
propagates set-level assertions down to individuals.
Example: assert large (elephant);
Remember to make a clear distinction between,
− whether we are asserting some property of the set itself,
which means, the set of elephants is large, or
− asserting some property that holds for individual elements of the set,
which means, anything that is an elephant is large.
There are three ways in which sets may be represented :
(a) Name, as in the example – Ref Fig. Inheritable KR, the node Baseball-Player,
and the predicates such as Ball and Batter in logical representation.
(b) Extensional definition is to list the members, and
(c) Intensional definition is to provide a rule that returns true or
false depending on whether the object is in the set or not.
[Readers may refer book on AI by Elaine Rich & Kevin Knight- page 122 - 123]
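The difference between listing a set's members and giving a membership rule, and the idea of storing a property once with the set, can be sketched as below. All names and data here (the roster, is_even_natural, "Clyde") are hypothetical and chosen only for illustration.

```python
# Extensional definition: list the members (hypothetical roster).
dodgers_extensional = {"Pee-Wee-Reese", "Player-2"}

# Intensional definition: a rule that returns True or False.
def is_even_natural(n):
    return isinstance(n, int) and n >= 0 and n % 2 == 0

# A property of the individual elements is stored once with the set.
set_properties = {"elephant": {"large": True}}
instance_of = {"Clyde": "elephant"}

def has_property(individual, prop):
    """Inherit a property from the set the individual belongs to."""
    return set_properties[instance_of[individual]].get(prop, False)
```

The intensional form can describe sets that are too big to list, while the extensional form needs no rule but must be kept up to date by hand.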
KR - issues
• Finding Right Structure
This is about access to the right structure for describing a particular situation.
It requires selecting an initial structure and then revising the
choice. While doing so, it is necessary to solve the following problems :
− how to perform an initial selection of the most appropriate structure.
− how to fill in appropriate details from the current situation.
− how to find a better structure if the one chosen initially turns out not
to be appropriate.
− what to do if none of the available structures is appropriate.
− when to create and remember a new structure.
There is no good, general purpose method for solving all these problems.
Some knowledge representation techniques solve some of them.
[Readers may refer book on AI by Elaine Rich & Kevin Knight- page 124 - 126]
KR – using logic
2. KR Using Predicate Logic
In the previous section much has been illustrated about knowledge and KR
related issues. This section illustrates :
How knowledge can be represented as “symbol structures” that characterize
bits of knowledge about objects, concepts, facts, rules, strategies;
Examples : “red” represents colour red;
“car1” represents my car ;
"red(car1)" represents the fact that my car is red.
Assumptions about KR :
− Intelligent Behavior can be achieved by manipulation of symbol structures.
− KR languages are designed to facilitate operations over symbol structures,
and have precise syntax and semantics;
Syntax tells which expressions are legal,
e.g., which of red1(car1), red1 car1, car1(red1), red1(car1 & car2) ? and
Semantics tells what an expression means,
e.g., property “dark red” applies to my car.
− Inferences can be made, drawing new conclusions from existing facts.
To satisfy these assumptions about KR, we need a formal notation that allows
automated inference and problem solving. One popular choice is the use of logic.
KR – Logic
• Logic
Logic is concerned with the truth of statements about the world.
Generally each statement is either TRUE or FALSE.
Logic includes : Syntax , Semantics and Inference Procedure.
◊ Syntax :
Specifies the symbols in the language and how they can be
combined to form sentences. The facts about the world are
represented as sentences in logic.
◊ Semantics :
Specifies how to assign a truth value to a sentence based on its
meaning in the world. It specifies what facts a sentence refers to.
A fact is a claim about the world, and it may be TRUE or FALSE.
◊ Inference Procedure :
Specifies methods for computing new sentences from the existing
sentences.
Note
Facts : are claims about the world that are True or False.
Representation : is an expression (sentence) that stands for the objects and
relations.
Sentences : can be encoded in a computer program.
KR - Logic
• Logic as a KR Language
Logic is a language for reasoning, a collection of rules used while doing
logical reasoning. Logic is studied as a KR language in artificial intelligence.
◊ Logic is a formal system in which the formulas or sentences have true or
false values.
◊ The problem of designing a KR language is a tradeoff between a language that is
(a) Expressive enough to represent important objects and relations in
a problem domain, and
(b) Efficient enough in reasoning and answering questions about
implicit information in a reasonable amount of time.
◊ Logics are of different types : Propositional logic, Predicate logic,
Temporal logic, Modal logic, Description logic etc;
They represent things and allow more or less efficient inference.
◊ Propositional logic and Predicate logic are fundamental to all logic.
Propositional Logic is the study of statements and how they are connected.
Predicate Logic is the study of individuals and their properties.
KR – Logic
2.1 Logic Representation
Logic can be used to represent simple facts.
The facts are claims about the world that are True or False.
To build a Logic-based representation :
◊ User defines a set of primitive symbols and the associated semantics.
◊ Logic defines ways of putting symbols together so that user can define
legal sentences in the language that represent TRUE facts.
◊ Logic defines ways of inferring new sentences from existing ones.
◊ Sentences that are either TRUE or FALSE, but not both, are called propositions.
◊ A declarative sentence expresses a statement with a proposition as
content; example:
the declarative "snow is white" expresses that snow is white;
further, "snow is white" expresses that "snow is white" is TRUE.
In this section, first Propositional Logic (PL) is briefly explained and then
the Predicate logic is illustrated in detail.
KR - Propositional Logic
• Propositional Logic (PL)
A proposition is a statement, which in English would be a declarative
sentence. Every proposition is either TRUE or FALSE.
Examples: (a) The sky is blue., (b) Snow is cold. , (c) 12 * 12=144
‡ Propositions are “sentences” , either true or false but not both.
‡ A sentence is the smallest unit in propositional logic.
‡ If a proposition is true, then its truth value is "true" .
If a proposition is false, then its truth value is "false" .
Example :
Sentence                      Truth value   Proposition (Y/N)
"Grass is green"              "true"        Yes
"2 + 5 = 5"                   "false"       Yes
"Close the door"              -             No
"Is it hot outside ?"         -             No
"x > 2" where x is variable   -             No (since x is not defined)
"x = x"                       -             No (neither "x" nor "=" is defined;
                                            it could stand for "3 = 3", "air is equal
                                            to air" or "water is equal to water")
− Propositional logic is fundamental to all logic.
− Propositional logic is also called Propositional calculus or Sentential
calculus, and is closely related to Boolean algebra.
− Propositional logic tells the ways of joining and/or modifying entire
propositions, statements or sentences to form more complicated
propositions, statements or sentences, as well as the logical
relationships and properties that are derived from the methods of
combining or altering statements.
KR - Propositional Logic
Statement, Variables and Symbols
These and a few more related terms, such as connective, truth value,
contingencies, tautologies, contradictions, antecedent, consequent,
argument are explained below.
◊ Statement
Simple statements (sentences), TRUE or FALSE, that do not
contain any other statement as a part, are basic propositions;
lower-case letters, p, q, r, are symbols for simple statements.
Large, compound or complex statements are constructed from basic
propositions by combining them with connectives.
◊ Connective or Operator
The connectives join simple statements into compounds, and join
compounds into larger compounds.
Table below indicates the basic connectives and their symbols :
− listed in decreasing order of operation priority;
− operations with higher priority are evaluated first.
Example of a formula : (((a ∧ ¬b) ∨ c → d) ↔ ¬ (a ∨ c))
Connectives and Symbols in decreasing order of operation priority
Connective    Symbols                          Read as
assertion     p                                "p is true"
negation      ¬p   ~   !   NOT                 "p is false"
conjunction   p ∧ q   ·   &&   &   AND         "both p and q are true"
disjunction   p ∨ q   ||   |   OR              "either p is true, or q is true, or both"
implication   p → q   ⊃   ⇒   if ..then        "if p is true, then q is true" ; "p implies q"
equivalence   p ↔ q   ≡   ⇔   if and only if   "p and q are either both true or both false"
Note : The propositions and connectives are the basic elements of
propositional logic.
KR - Propositional Logic
◊ Truth Value
The truth value of a statement is its TRUTH or FALSITY ,
Example : p is either TRUE or FALSE,
~p is either TRUE or FALSE,
p v q is either TRUE or FALSE, and so on.
use " T " or " 1 " to mean TRUE.
use " F " or " 0 " to mean FALSE
Truth table defining the basic connectives :
p   q   ¬p   ¬q   p ∧ q   p v q   p → q   p ↔ q   q → p
T   T   F    F    T       T       T       T       T
T   F   F    T    F       T       F       F       T
F   T   T    F    F       T       T       F       F
F   F   T    T    F       F       T       T       T
[The next slide explains a group of terms, called
tautology, contradiction, contingency, antecedent, consequent. They lead to the
argument, where one proposition is claimed to follow logically from other
propositions]
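The truth table for the basic connectives can be generated mechanically. The following is a small sketch; the helper names implies and truth_table are assumptions of this example, and the column order follows the table on the slide (p, q, ¬p, ¬q, p ∧ q, p v q, p → q, p ↔ q, q → p).

```python
from itertools import product

def implies(p, q):
    # p -> q is false only when p is true and q is false.
    return (not p) or q

def truth_table():
    rows = []
    for p, q in product([True, False], repeat=2):
        rows.append((p, q, not p, not q, p and q, p or q,
                     implies(p, q), p == q, implies(q, p)))
    return rows

for row in truth_table():
    print(" ".join("T" if v else "F" for v in row))
```

Equivalence is written as p == q here, since two truth values are equivalent exactly when they are equal.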
KR - Propositional Logic
◊ Tautologies
A proposition that is always true is called a "tautology".
e.g., (P v ¬P) is always true regardless of the truth value of the
proposition P.
◊ Contradictions
A proposition that is always false is called a "contradiction".
e.g., (P ∧ ¬P) is always false regardless of the truth value of
the proposition P.
◊ Contingencies
A proposition is called a "contingency" , if that proposition is
neither a tautology nor a contradiction .
e.g., (P v Q) is a contingency.
◊ Antecedent, Consequent
These two are parts of conditional statements.
In the conditional statements, p → q , the
1st statement or "if - clause" (here p) is called antecedent ,
2nd statement or "then - clause" (here q) is called consequent.
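Whether a proposition is a tautology, a contradiction or a contingency can be decided by trying every truth assignment. The sketch below assumes the proposition is given as a Python function of Boolean arguments; the name classify is hypothetical.

```python
from itertools import product

def classify(formula, n_vars):
    """Classify a Boolean function of n_vars arguments."""
    results = [formula(*vals)
               for vals in product([True, False], repeat=n_vars)]
    if all(results):
        return "tautology"
    if not any(results):
        return "contradiction"
    return "contingency"

print(classify(lambda p: p or not p, 1))
print(classify(lambda p: p and not p, 1))
print(classify(lambda p, q: p or q, 2))
```

The number of assignments doubles with each added variable, so this brute-force check is only practical for small formulas.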
KR - Propositional Logic
◊ Argument
An argument is a demonstration or a proof of some statement.
Example : "That bird is a crow; therefore, it's black."
Any argument can be expressed as a compound statement.
In logic, an argument is a set of one or more meaningful
declarative sentences (or "propositions") known as the premises
along with another meaningful declarative sentence (or
"proposition") known as the conclusion.
Premise is a proposition which gives reasons, grounds, or evidence
for accepting some other proposition, called the conclusion.
Conclusion is a proposition, which is purported to be established on
the basis of other propositions.
Take all the premises, conjoin them, and make that conjunction
the antecedent of a conditional and make the conclusion the
consequent. This implication statement is called the corresponding
conditional of the argument.
Note : Every argument has a corresponding conditional, and every
implication statement has a corresponding argument. Because the
corresponding conditional of an argument is a statement, it is therefore
either a tautology, or a contradiction, or a contingency.
‡ An argument is valid
"if and only if" its corresponding conditional is a tautology.
‡ Two statements are consistent
"if and only if" their conjunction is not a contradiction.
‡ Two statements are logically equivalent
"if and only if" their truth table columns are identical;
"if and only if" the statement of their equivalence using " ≡ " is a
tautology.
Note : The truth tables are adequate to test validity, tautology,
contradiction, contingency, consistency, and equivalence.
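The test "an argument is valid if and only if its corresponding conditional is a tautology" can be sketched with a truth-table check. The function names below are assumptions of this example, and premises and conclusion are given as Python functions over the same Boolean variables.

```python
from itertools import product

def implies(p, q):
    return (not p) or q

def is_valid(premises, conclusion, n_vars):
    """Valid iff (premise_1 and ... and premise_k) -> conclusion is a tautology."""
    for vals in product([True, False], repeat=n_vars):
        antecedent = all(prem(*vals) for prem in premises)
        if not implies(antecedent, conclusion(*vals)):
            return False
    return True

# Modus ponens: from p -> q and p, conclude q.
modus_ponens = is_valid([lambda p, q: implies(p, q), lambda p, q: p],
                        lambda p, q: q, 2)
# Affirming the consequent: from p -> q and q, conclude p.
affirming = is_valid([lambda p, q: implies(p, q), lambda p, q: q],
                     lambda p, q: p, 2)
```

The second argument fails because the assignment p = F, q = T makes both premises true and the conclusion false.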
KR - Predicate Logic
• Predicate Logic
Propositional logic is not powerful enough for all types of assertions;
Example : The assertion "x > 1", where x is a variable, is not a proposition
because it is neither true nor false unless the value of x is defined.
For x > 1 to be a proposition ,
− either we substitute a specific number for x ;
− or change it to something like
"There is a number x for which x > 1 holds";
− or "For every number x, x > 1 holds".
Consider example :
“ All men are mortal.
Socrates is a man.
Then Socrates is mortal” ,
This argument cannot be expressed in propositional logic as a finite and logically
valid argument (formula).
We need a language that allows us to describe properties ( predicates ) of
objects, or a relationship among objects represented by the variables .
Predicate logic satisfies the requirements of such a language.
− Predicate logic is powerful enough for expression and reasoning.
− Predicate logic is built upon the ideas of propositional logic.
KR - Predicate Logic
Predicate :
Every complete "sentence" contains two parts : a "subject" and a
"predicate".
The subject is what (or whom) the sentence is about.
The predicate tells something about the subject;
Example :
A sentence "Judy runs".
The subject is Judy and the predicate is runs .
A predicate always includes a verb and tells something about the subject.
A predicate is a verb phrase template that describes a property of
objects, or a relation among objects represented by the variables.
Example:
"The car Tom is driving is blue" ;
"The sky is blue" ;
"The cover of this book is blue"
The predicate is "is blue" ; it describes a property.
Predicates are given names; Let 'B' be the name for predicate "is_blue".
A sentence is represented as "B(x)" , read as "x is blue";
Symbol "x" represents an arbitrary Object .
KR - Predicate Logic
Predicate Logic Expressions :
The propositional operators combine predicates, like
If ( p(....) && ( !q(....) || r (....) ) )
Logic operators :
Examples of disjunction (OR) and conjunction (AND).
Consider the expression with the respective logic symbols || and &&
x < y || ( y < z && z < x)
Each comparison yields a truth value, so the expression takes a form
such as true || ( true && true ) ;
applying the truth table to that form gives True.
The value of the expression can be FALSE or TRUE, depending on the
values assigned to x, y, z. Assigning 3, 2, 1 to x, y, z gives
3 < 2 || ( 2 < 1 && 1 < 3)
that is, false || ( false && true ) ; it is False
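The same expression can be checked in code. In this short sketch Python's `or` and `and` play the roles of || and &&, and the function name expr is an assumption of the example.

```python
def expr(x, y, z):
    # x < y || (y < z && z < x)
    return x < y or (y < z and z < x)

value_321 = expr(3, 2, 1)   # false || (false && true)
value_123 = expr(1, 2, 3)   # true  || (...), decided by the first operand
```

With 3, 2, 1 assigned to x, y, z the result is False; with 1, 2, 3 it is True, because x < y already holds.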
KR - Predicate Logic
Predicate Logic Quantifiers
As said before, x > 1 is not a proposition, because its truth depends on
the value of x. What is required for x > 1 to become a proposition ?
Generally, a predicate with variables (called an atomic formula)
can be made a proposition by applying one of the following two operations
to each of its variables :
1. Assign a value to the variable; e.g., x > 1, if 3 is assigned to x
becomes 3 > 1 , and it then becomes a true statement, hence a
proposition.
2. Quantify the variable using a quantifier; formulas of predicate
logic (called wff, well-formed formulas), such as x > 1 or P(x), become
propositions by using Quantifiers on their variables.
Apply Quantifiers on Variables
‡ Variable x
* x > 5 is not a proposition, its truth depends upon the value of
variable x
* to reason about such statements, x needs to be declared
‡ Declaration x : a
* x : a declares variable x
* x : a read as “x is an element of set a”
‡ Statement p is a statement about x
* Q x : a • p is quantification of statement, where
Q is the quantifier,
x : a is the declaration of variable x as element of set a, and
p is the statement
* Quantifiers are of two types :
universal quantifiers , denoted by symbol ∀ and
existential quantifiers , denoted by symbol ∃
Note : The next few slides tell more on these two Quantifiers.
KR - Predicate Logic
Universe of Discourse
The universe of discourse is also called domain of discourse or universe.
This indicates :
− a set of entities that the quantifiers deal with.
− entities can be set of real numbers, set of integers, set of all cars
on a parking lot, the set of all students in a classroom etc.
− universe is thus the domain of the (individual) variables.
− propositions in the predicate logic are statements on objects of a
universe.
The universe is often left implicit in practice, but it should be obvious
from the context.
Examples:
− About "natural numbers", forAll x, y (x < y or x = y or x > y); there is
no need to be more precise and say forAll x, y in N, because N is
implicit, being the universe of discourse.
− About a property that holds for natural numbers but not for real
numbers, it is necessary to qualify what the allowable values of x
and y are.
Apply Universal Quantifier " For All "
Universal Quantification allows us to make a statement about a
collection of objects.
‡ Universal quantification: ∀ x : a • p
* read “ for all x in a , p holds ”
* a is universe of discourse
* x is a member of the domain of discourse.
* p is a statement about x
‡ In propositional form it is written as : ∀ x P(x)
* read “ for all x, P(x) holds ”
“ for each x, P(x) holds ” or
“ for every x, P(x) holds ”
* where P(x) is predicate,
∀ x means all the objects x in the universe
P(x) is true for every object x in the universe
‡ Example : English language to Propositional form
* "All cars have wheels"
∀ x : car • x has wheels
* ∀ x P(x)
where P (x) is predicate tells : 'x has wheels'
x is variable for object 'cars' that populate universe of discourse
KR - Predicate Logic
Apply Existential Quantifier " There Exists "
Existential Quantification allows us to state that an object does exist
without naming it.
‡ Existential quantification: ∃ x : a • p
* read “ there exists an x such that p holds ”
* a is universe of discourse
* x is a member of the domain of discourse.
* p is a statement about x
‡ In propositional form it is written as : ∃ x P(x)
* read “ there exists an x such that P(x) ” or
“ there exists at least one x such that P(x) ”
* Where P(x) is predicate
∃ x means at least one object x in the universe
P(x) is true for at least one object x in the universe
‡ Example : English language to Propositional form
* “ Someone loves you ”
∃ x : Someone • x loves you
* ∃ x P(x)
where P(x) is predicate tells : ' x loves you '
x is variable for object ' someone ' that populate universe of discourse
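Over a finite universe of discourse, both quantifiers can be evaluated directly. The sketch below assumes a small hypothetical universe and the predicate P(x) : x > 1; Python's built-in all and any stand in for "for all" and "there exists".

```python
# Hypothetical finite universe of discourse.
universe = [1, 2, 3, 4, 5]

def P(x):
    return x > 1

forall = all(P(x) for x in universe)   # for all x, P(x)
exists = any(P(x) for x in universe)   # there exists x, P(x)

# The two quantifiers are duals: "for all x, P(x)" holds exactly when
# there is no x for which P(x) fails.
dual = not any(not P(x) for x in universe)
```

Here the universal statement is false because P(1) fails, while the existential statement is true because, for example, P(2) holds.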
KR - Predicate Logic
Formula
In mathematical logic, a formula is a type of abstract object.
A token of a formula is a symbol or string of symbols which may be
interpreted as any meaningful unit in a formal language.
‡ Terms
Defined recursively as variables, or constants, or functions like
f(t1, . . . , tn), where f is an n-ary function symbol, and t1, . . . , tn
are terms. Applying predicates to terms produces atomic formulas.
‡ Atomic formulas
An atomic formula (or simply atom) is a formula with no deeper
propositional structure, i.e., a formula that contains no logical
connectives or a formula that has no strict sub-formulas.
− Atoms are thus the simplest well-formed formulas of the logic.
− Compound formulas are formed by combining the atomic formulas
using the logical connectives.
− Well-formed formula ("wff") is a symbol or string of symbols (a
formula) generated by the formal grammar of a formal language.
An atomic formula is one of the form :
− t1 = t2, where t1 and t2 are terms, or
− R(t1, . . . , tn), where R is an n-ary relation symbol, and
t1, . . . , tn are terms.
Compound formulas are built from these :
− ¬ a is a formula when a is a formula.
− (a ∧ b) and (a v b) are formulas when a and b are formulas.
‡ Compound formula : example
((((a ∧ b ) ∧ c) ∨ ((¬ a ∧ b) ∧ c)) ∨ ((a ∧ ¬ b) ∧ c))
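The recursive structure of compound formulas suggests a recursive evaluator. In this sketch, which is only one possible encoding, a formula is a nested tuple, atoms are treated as opaque symbols with given truth values, and the names evaluate and example are assumptions of the illustration.

```python
def evaluate(formula, env):
    """Evaluate a nested-tuple formula under the truth assignment env."""
    if isinstance(formula, str):          # atom
        return env[formula]
    op = formula[0]
    if op == "not":
        return not evaluate(formula[1], env)
    if op == "and":
        return evaluate(formula[1], env) and evaluate(formula[2], env)
    if op == "or":
        return evaluate(formula[1], env) or evaluate(formula[2], env)
    raise ValueError(f"unknown connective: {op}")

# ((((a ∧ b) ∧ c) ∨ ((¬a ∧ b) ∧ c)) ∨ ((a ∧ ¬b) ∧ c))
example = ("or",
           ("or",
            ("and", ("and", "a", "b"), "c"),
            ("and", ("and", ("not", "a"), "b"), "c")),
           ("and", ("and", "a", ("not", "b")), "c"))
```

Each connective case mirrors one clause of the definition: a negation has one sub-formula, a conjunction or disjunction has two.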
KR – logic relation
2.2 Representing “ IsA ” and “ Instance ” Relationships
Logic statements, containing subject, predicate, and object, were
explained. Also stated were two important attributes, "instance" and "isa", in a
hierarchical structure (Ref. Fig. Inheritable KR).
Attributes “ IsA ” and “ Instance ” support property inheritance and play
an important role in knowledge representation.
The ways these two attributes, "instance" and "isa", are logically expressed
are shown in the example below :
Example : A simple sentence like "Joe is a musician"
◊ Here "is a" (called IsA) is a way of expressing what logically is
called a class-instance relationship between the subjects
represented by the terms "Joe" and "musician".
◊ "Joe" is an instance of the class of things called "musician".
"Joe" plays the role of instance,
"musician" plays the role of class in that sentence.
◊ Note : In such a sentence, while for a human there is no confusion,
for computers each relationship has to be defined explicitly.
This is specified as: [Joe] IsA [Musician]
i.e., [Instance] IsA [Class]
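One way to make these relationships explicit for a computer is to store them as separate tables and follow the links. The sketch below is an assumption-laden illustration: it separates an "instance" link (individual to class) from an "isa" link (class to superclass), and the superclass Person is hypothetical, added only to show inheritance along the chain.

```python
# Only "Joe IsA Musician" comes from the slide; the rest is assumed.
instance = {"Joe": "Musician"}      # instance(Joe, Musician)
isa = {"Musician": "Person"}        # isa(Musician, Person), hypothetical

def belongs_to(individual, cls):
    """Follow the instance link, then isa links upward, to test membership."""
    current = instance.get(individual)
    while current is not None:
        if current == cls:
            return True
        current = isa.get(current)
    return False
```

With these tables, belongs_to("Joe", "Person") succeeds by inheritance even though no such fact was stored directly.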
KR – functions & predicates
2.3 Computable Functions and Predicates
The objective is to define the class of functions C computable in terms of F.
This is expressed as C { F } and is explained below using two examples :
(1) "evaluate factorial n" and (2) "expression for triangular functions".
Example 1 : A conditional expression to define factorial n ie n!
◊ Expression “ if p1 then e1 else if p2 then e2 . . . else if pn then en” .
ie. (p1 → e1, p2 → e2, . . . . . . pn → en )
Here p1, p2, . . . . pn are propositional expressions taking the
values T or F for true and false respectively.
◊ The value of ( p1 → e1, p2 → e2, . . . . . .pn → en ) is the value of the
e corresponding to the first p that has value T.
◊ The expressions defining n! recursively, with n = 5 as an example, are :
n! = n x (n-1)! for n ≥ 1
0! = 1
5! = 1 x 2 x 3 x 4 x 5 = 120
The above definition takes 0! = 1, the product of no numbers;
only then does the recursive relation (n + 1)! = (n+1) x n! work for n = 0
◊ Use of the above conditional expressions to define function n!
recursively is n! = ( n = 0 → 1, n ≠ 0 → n . (n – 1 ) ! )
◊ Example: Evaluate 2! according to above definition.
2! = ( 2 = 0 → 1, 2 ≠ 0 → 2 . ( 2 – 1 )! )
= 2 x 1!
= 2 x ( 1 = 0 → 1, 1 ≠ 0 → 1 . ( 1 – 1 )! )
= 2 x 1 x 0!
= 2 x 1 x ( 0 = 0 → 1, 0 ≠ 0 → 0 . ( 0 – 1 )! )
= 2 x 1 x 1
= 2
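The conditional-expression form (p1 → e1, . . . , pn → en) can be mimicked in code. In this sketch the helper cond is an assumption of the example; each clause is a pair of zero-argument functions so that only the selected branch is evaluated, which is what lets the recursion stop at n = 0.

```python
def cond(*clauses):
    """Return the value of the first e whose p is true."""
    for p, e in clauses:
        if p():
            return e()
    raise ValueError("no condition is true")

def factorial(n):
    # n! = (n = 0 -> 1, n != 0 -> n . (n - 1)!)
    return cond(
        (lambda: n == 0, lambda: 1),
        (lambda: n != 0, lambda: n * factorial(n - 1)),
    )

print(factorial(2), factorial(5))
```

If both branches were evaluated eagerly, the term 0 . (0 – 1)! in the last step of the worked example would recurse without end; delaying evaluation avoids that.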
KR – functions & predicates
Example 2 : A conditional expression for triangular functions
◊ The graph of a well known triangular function is shown below
[Fig. A Triangular Function : graph on X-Y axes through the points (-1,0), (0,1) and (1,0)]
A simple conditional expression of this kind is the absolute value
|x| = (x < 0 → -x , x ≥ 0 → x)
◊ the triangular function of the above graph is represented by the
conditional expression
tri (x) = (x ≤ -1 → 0, x ≤ 0 → x + 1, x ≤ 1 → 1 - x, x > 1 → 0)
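A triangular function through the points (-1,0), (0,1) and (1,0) of the figure can be written as a chain of conditions. This is a sketch: the branches x + 1 and 1 - x are chosen here so that the function passes through those three points, and the conditions are tested in order, as in a conditional expression.

```python
def tri(x):
    # (x <= -1 -> 0, x <= 0 -> x + 1, x <= 1 -> 1 - x, x > 1 -> 0)
    if x <= -1:
        return 0
    if x <= 0:
        return x + 1
    if x <= 1:
        return 1 - x
    return 0
```

Because the first true condition wins, the test x <= 0 is only reached when x > -1, so each branch covers exactly one segment of the graph.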
KR - Predicate Logic – resolution
2.4 Resolution
Resolution is a procedure used in proving that arguments which are
expressible in predicate logic are correct.
Resolution is a procedure that produces proofs by refutation or
contradiction.
Resolution leads to a refutation theorem-proving technique for sentences in
propositional logic and first-order logic.
− Resolution is a rule of inference.
− Resolution is the basis of computerized theorem provers.
− Resolution is so far only defined for Propositional Logic. The strategy is
that the Resolution techniques of Propositional logic be adapted to
Predicate Logic.
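Proof by refutation with propositional resolution can be sketched as follows. This is a naive illustration, not an efficient prover: clauses are frozensets of literal strings, negation is marked with a leading "~", and the names resolve and refutes are assumptions of this example.

```python
from itertools import combinations

def resolve(c1, c2):
    """Return all resolvents of two clauses (frozensets of literals)."""
    resolvents = []
    for lit in c1:
        complement = lit[1:] if lit.startswith("~") else "~" + lit
        if complement in c2:
            resolvents.append(frozenset((c1 - {lit}) | (c2 - {complement})))
    return resolvents

def refutes(clauses):
    """True if the empty clause can be derived (the set is unsatisfiable)."""
    clauses = set(clauses)
    while True:
        new = set()
        for c1, c2 in combinations(clauses, 2):
            for r in resolve(c1, c2):
                if not r:
                    return True       # empty clause: contradiction found
                new.add(r)
        if new <= clauses:
            return False              # nothing new can be derived
        clauses |= new

# To prove q from (p -> q) and p, add the negated goal ~q.
kb = [frozenset({"~p", "q"}), frozenset({"p"}), frozenset({"~q"})]
```

The clause {~p, q} is the clause form of p → q. Deriving the empty clause from the premises plus the negated goal shows that the goal follows from the premises.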
KR Using Rules
3. KR Using Rules
In the earlier slides, the Knowledge representations using predicate logic
have been illustrated. The other popular approaches to Knowledge
representation are called production rules , semantic net and frames.
Production rules, sometimes called IF-THEN rules, are the most popular KR.
Production rules are simple but powerful forms of KR.
Production rules provide the flexibility of combining declarative and
procedural representation for using them in a unified form.
Examples of production rules :
− IF condition THEN action
− IF premise THEN conclusion
− IF proposition p1 and proposition p2 are true THEN proposition p3 is true
Advantages of production rules :
− they are modular,
− each rule defines a small and independent piece of knowledge.
− new rules may be added and old ones deleted
− rules are usually independent of other rules.
The production rules as knowledge representation mechanism are used in the
design of many "Rule-based systems" also called "Production systems" .
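How a production system uses IF-THEN rules can be sketched with a simple forward-chaining loop. This is a minimal illustration; the rule format (a tuple of premises and one conclusion) and the name forward_chain are assumptions of this example.

```python
def forward_chain(rules, facts):
    """Fire IF-THEN rules until no new fact can be added."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if set(premises) <= facts and conclusion not in facts:
                facts.add(conclusion)
                changed = True
    return facts

# Hypothetical rules: IF p1 and p2 THEN p3; IF p3 THEN p4.
rules = [(("p1", "p2"), "p3"), (("p3",), "p4")]
print(sorted(forward_chain(rules, {"p1", "p2"})))
```

Each rule is an independent entry in the list, so a rule can be added or deleted without touching the others, which is the modularity listed among the advantages.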
KR Using Rules
• Types of Rules
Three types of rules are mostly used in the Rule-based production systems.
Knowledge Declarative Rules :
These rules state all the facts and relationships about a problem.
Example :
IF inflation rate declines
THEN the price of gold goes down.
These rules are a part of the knowledge base.
Inference Procedural Rules :
These rules advise on how to solve a problem, while certain facts are
known.
Example :
IF the data needed is not in the system
THEN request it from the user.
These rules are part of the inference engine.
Meta rules :
These are rules for making rules. Meta-rules reason about which rules
should be considered for firing.
Example :
IF there are rules which do not mention the current goal in their premise, AND
there are rules which do mention the current goal in their premise, THEN
the former rules should be used in preference to the latter.
− Meta-rules direct reasoning rather than actually performing
reasoning.
− Meta-rules specify which rules should be considered and in which
order they should be invoked.
KR – procedural & declarative
3.1 Procedural versus Declarative Knowledge
These two types of knowledge were defined in earlier slides.
Procedural Knowledge : knowing 'how to do'
Includes : rules, strategies, agendas, procedures, models.
These explain what to do in order to reach a certain conclusion.
Example
Rule: To determine if Peter or Robert is older, first find their ages.
It is knowledge about 'how to do' something. It manifests itself in the
doing of something, e.g., manual or mental skills that cannot be reduced to
words. It is held by individuals in a way which does not allow it to be
communicated directly to other individuals.
It accepts a description of the steps of a task or procedure. It looks
similar to declarative knowledge, except that tasks or methods are
being described instead of facts or things.
Declarative Knowledge : knowing 'what', knowing 'that'
Includes : concepts, objects, facts, propositions, assertions, models.
It is knowledge about facts and relationships, that
− can be expressed in simple and clear statements,
− can be added and modified without difficulty.
Examples : A car has four tyres; Peter is older than Robert.
Declarative knowledge and explicit knowledge are articulated
knowledge and may be treated as synonyms for most practical
purposes. Declarative knowledge is represented in a format that can
be manipulated, decomposed and analyzed independent of its content.
KR – procedural & declarative
Comparison between Procedural and Declarative Knowledge :
Procedural Knowledge                 Declarative Knowledge
• Hard to debug                      • Easy to validate
• Black box                          • White box
• Obscure                            • Explicit
• Process oriented                   • Data - oriented
• Extension may affect stability     • Extension is easy
• Fast , direct execution            • Slow (requires interpretation)
• Simple data type can be used       • May require high level data type
• Representations in the form of     • Representations in the form of
  sets of rules, organized into        production system, the entire set
  routines and subroutines.            of rules for executing the task.
KR – procedural & declarative
Comparison between Procedural and Declarative Language :
Procedural Language                       Declarative Language
• Basic, C++, Cobol, etc.                 • SQL
• Most work is done by interpreter of     • Most work done by Data Engine
  the languages                             within the DBMS
• For one task many lines of code         • For one task one SQL statement
• Programmer must be skilled in           • Programmer must be skilled in
  translating the objective into lines      clearly stating the objective as a
  of procedural code                        SQL statement
• Requires minimum of management          • Relies on SQL-enabled DBMS to
  around the actual data                    hold the data and execute the SQL
                                            statement
• Programmer understands and has          • Programmer has no interaction
  access to each step of the code           with the execution of the SQL
                                            statement
• Data exposed to programmer              • Programmer receives data at end
  during execution of the code              as an entire set
• More susceptible to failure due to      • More resistant to changes in the
  changes in the data structure             data structure
• Traditionally faster, but that is       • Originally slower, but now setting
  changing                                  speed records
• Code of procedure tightly linked to     • Same SQL statements will work
  front end                                 with most front ends; code loosely
                                            linked to front end
• Code tightly integrated with            • Code loosely linked to structure of
  structure of the data store               data; DBMS handles structural
                                            issues
• Programmer works with a pointer         • Programmer not concerned with
  or cursor                                 positioning
• Knowledge of coding tricks              • Knowledge of SQL tricks applies
  applies only to one language              to any language using SQL
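The contrast in the table can be seen on a small task: finding the oldest person. The sketch below uses Python's standard sqlite3 module with an in-memory database; the data, the table name person and both function names are hypothetical.

```python
import sqlite3

# Hypothetical data: (name, age) rows.
people = [("Peter", 40), ("Robert", 35), ("Sue", 52)]

# Procedural: the program spells out how to walk the data.
def oldest_procedural(rows):
    best = None
    for name, age in rows:
        if best is None or age > best[1]:
            best = (name, age)
    return best[0]

# Declarative: one SQL statement says what is wanted; the engine decides how.
def oldest_declarative(rows):
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE person (name TEXT, age INTEGER)")
    conn.executemany("INSERT INTO person VALUES (?, ?)", rows)
    (name,) = conn.execute(
        "SELECT name FROM person ORDER BY age DESC LIMIT 1"
    ).fetchone()
    conn.close()
    return name
```

In the procedural version the programmer sees each row in turn; in the declarative version the programmer only receives the final result from the DBMS.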
3.2 Logic Programming
Logic programming offers a formalism for specifying a computation in
terms of logical relations between entities.
− A logic program is a collection of logic statements.
− The programmer describes all relevant logical relationships between the
various entities.
− The computation determines whether or not a particular conclusion follows
from those logical statements.
• Characteristics of Logic program
A logic program is characterized by a set of relations and inferences.
− The program consists of a set of axioms and a goal statement.
− The rules of inference determine whether the axioms are sufficient to ensure
the truth of the goal statement.
− Execution of a logic program corresponds to the construction of a
proof of the goal statement from the axioms.
− The programmer specifies the basic logical relationships, but does not specify the
manner in which inference rules are applied.
Thus Logic + Control = Algorithms
• Examples of Logic Statements
− Statement : A grand-parent is a parent of a parent.
− Statement expressed in more closely related logic terms :
A person is a grand-parent if she/he has a child and
that child is a parent.
− Statement expressed in first order logic as
(for all) x, y, z : grandparent (x, y) :- parent (x, z), parent (z, y)
read as : x is the grandparent of y
if x is a parent of z and z is a parent of y
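The grandparent statement above can be tried out with a minimal Python sketch. The names in PARENT are hypothetical facts invented for illustration; the function simply joins the parent relation with itself, which is one way (not the only way) to realise the rule.

```python
# Hypothetical parent facts as (parent, child) pairs, invented for illustration.
PARENT = {("ann", "bob"), ("bob", "carl"), ("bob", "dina")}

def grandparents(parent_facts):
    """Return all (x, y) such that parent(x, z) and parent(z, y) hold for some z."""
    return {
        (x, y)
        for (x, z1) in parent_facts
        for (z2, y) in parent_facts
        if z1 == z2
    }

GRANDPARENT = grandparents(PARENT)
print(sorted(GRANDPARENT))  # [('ann', 'carl'), ('ann', 'dina')]
```

Only the relationships are stated here; the order in which pairs are examined is left to the set comprehension, which mirrors the idea that the programmer describes logic and not control.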
• Logic Programming Language
A programming language includes :
− the syntax
− the semantics of programs and
− the computational model.
There are many ways of organizing computations. The most familiar
paradigm is procedural. The program specifies a computation by saying
"how" it is to be performed. FORTRAN, C, and Object-oriented languages
fall under this general approach.
Another paradigm is declarative. The program specifies a computation by
giving the properties of a correct answer. Prolog and logic data language
(LDL) are examples of declarative languages; they emphasize the logical
properties of a computation.
Prolog and LDL are called logic programming languages.
PROLOG (PROgramming in LOGic) is the most popular logic programming
language; it arose within the realm of Artificial Intelligence (AI). It became
popular with AI researchers, who know more about "what" and "how"
intelligent behavior is achieved.
• Syntax and Terminology (relevant to Prolog programs)
In any language, the formation of components (expressions, statements,
etc.) is guided by syntactic rules.
The components are divided into two parts:
(A) data components and (B) program components.
(A) Data components :
Data components are collections of data objects that follow a hierarchy.
Data Objects (terms)
  Simple
    Constants
      Atoms
      Numbers
    Variables
  Structured
A data object of any kind is also called a term. A term is a constant, a
variable or a compound term.
A simple data object is not decomposable; e.g. atoms, numbers,
constants, variables.
Structured data objects are made of several components.
Syntax distinguishes the data objects, hence there is no need for declaring them.
All these data components are explained next.
(a) Data Objects :
A data object of any kind is called a term.
◊ Term : Examples
‡ Constants:
Denote elements such as integers, floating point numbers, atoms.
‡ Variables:
Denote a single but unspecified element; symbols for variables
begin with an uppercase letter or an underscore.
‡ Compound terms:
Comprise a functor and a sequence of one or more terms
called arguments.
Functor: is characterized by its name and number of
arguments; the name is an atom, and the number of arguments is
the arity.
ƒ/n = ƒ(t1, t2, . . . , tn)
where ƒ is the name of the functor, n is its arity, and the ti's are the arguments;
ƒ/n denotes functor ƒ of arity n.
Functors with the same name but different arities are distinct.
‡ Ground and non-ground:
Terms are ground if they contain no variables (only constant
signs); otherwise they are non-ground.
Goals are atoms or compound terms, and are generally non-ground.
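The notions of functor, arity and ground term can be illustrated with a small Python sketch. The class names Var and Compound and the helper names indicator and is_ground are assumptions made for this illustration; they are not part of Prolog itself.

```python
from dataclasses import dataclass
from typing import Tuple, Union

@dataclass(frozen=True)
class Var:
    """A variable, e.g. Var("X")."""
    name: str

@dataclass(frozen=True)
class Compound:
    """A compound term: a functor name plus a tuple of argument terms."""
    functor: str
    args: Tuple["Term", ...]

# In this sketch a constant is a plain Python str, int or float.
Term = Union[str, int, float, Var, Compound]

def indicator(term):
    """Return the name/arity notation of a compound term, e.g. 'plus/3'."""
    return f"{term.functor}/{len(term.args)}"

def is_ground(term):
    """A term is ground if it contains no variables anywhere inside it."""
    if isinstance(term, Var):
        return False
    if isinstance(term, Compound):
        return all(is_ground(arg) for arg in term.args)
    return True

plus = Compound("plus", (2, 3, 5))
goal = Compound("parent", (Var("X"), "bob"))
print(indicator(plus), is_ground(plus))   # plus/3 True
print(indicator(goal), is_ground(goal))   # parent/2 False
```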
(b) Simple Data Objects : Atoms, Numbers, Variables
◊ Atoms
‡ a lower-case letter, possibly followed by other letters of either
case, digits, and underscore character.
e.g. a greaterThan two_B_or_not_2_b
‡ a string of special characters such as: + - * / \ = ^ < > : ~ # $ &
e.g. <> ##&& ::=
‡ a string of any characters enclosed within single quotes.
e.g. 'ABC' '1234' 'a<>b'
‡ following are also atoms ! ; []
◊ Numbers
‡ applications involving heavy numerical calculations are rarely
written in Prolog.
‡ integer representation: e.g. 0 -16 33 +100
‡ real numbers written in standard or scientific notation,
e.g. 0.5 -3.1416 6.23e+23 11.0e-3 -2.6e-2
◊ Variables
‡ begins with a capital letter or an underscore, possibly followed by other
letters of either case, digits, and underscore characters.
e.g. X25 List Noun_Phrase
(c) Structured Data Objects : General Structures , Special Structures .
◊ General Structures
‡ a structured term is syntactically formed by a functor and a list of
arguments.
‡ functor is an atom.
‡ the list of arguments appears between parentheses.
‡ arguments are separated by a comma.
‡ each argument is a term (i.e., any Prolog data object).
‡ the number of arguments of a structured term is called its arity.
‡ e.g. greaterThan(9, 6) f(a, g(b, c), h(d)) plus(2, 3, 5)
Note : a structure in Prolog is a mechanism for combining terms
together, like integers 2, 3, 5 are combined with the functor plus.
◊ Special Structures
‡ In Prolog an ordered collection of terms is called a list .
‡ Lists are structured terms and Prolog offers a convenient
notation to represent them:
* Empty list is denoted by the atom [ ].
* Non-empty list carries element(s) between square brackets,
separating elements by comma.
e.g. [bach, bee] [apples, oranges, grapes]
(B) Program Components
A Prolog program is a collection of predicates or rules.
A predicate establishes a relationship between objects.
(a) Clause, Predicate, Sentence, Subject
‡ Clause is a collection of grammatically-related words .
‡ Predicate is composed of one or more clauses.
‡ Clauses are the building blocks of sentences;
every sentence contains one or more clauses.
‡ A Complete Sentence has two parts: subject and predicate.
o subject is what (or whom) the sentence is about.
o predicate tells something about the subject.
‡ Example 1 : "cows eat grass".
It is a clause, because it contains
the subject "cows" and the
predicate "eat grass."
‡ Example 2 : "cows eating grass are visible from the highway"
This is a complete clause.
The subject "cows eating grass" and
the predicate "are visible from the highway" make a complete
thought.
(b) Predicates & Clause
Syntactically a predicate is composed of one or more clauses.
‡ The general form of clauses is
<left-hand-side> :- <right-hand-side>.
where LHS is a single goal called "goal" and RHS is composed of
one or more goals, separated by commas, called "sub-goals" of
the goal on left-hand side.
The symbol " :- " is read as "if" (also pronounced "it is the case" or "such that").
‡ The structure of a clause in a logic program :
pred(functor(var1, var2)) :- pred(var1), pred(var2)
Here pred(functor(var1, var2)) is the head and pred(var1), pred(var2)
is the body; each pred(...) is a literal, and head and body together
form the clause.
Literals represent the possible choices in primitive types of the
particular language. Some of the choices of types of literals are
often integers, floating point, Booleans and character strings.
‡ Example : grand_parent(X, Z) :- parent(X, Y), parent(Y, Z).
parent(X, Y) :- mother(X, Y).
parent(X, Y) :- father(X, Y).
The second clause is read as : if X is the mother of Y then X is a parent of Y.
‡ Interpretation:
* A clause specifies the conditional truth of the goal on the LHS;
the goal on the LHS is assumed to be true if the sub-goals on the RHS are
all true. A predicate is true if at least one of its clauses is true.
* An individual "X" is the grand-parent of "Z" if "X" is a parent of
some "Y" and that "Y" is a parent of "Z".
X --(parent of)--> Y --(parent of)--> Z, so X is grand-parent of Z
* An individual "X" is a parent of "Y" if "X" is the mother of "Y".
X --(mother of)--> Y, so X is parent of Y
* An individual "X" is a parent of "Y" if "X" is the father of "Y".
X --(father of)--> Y, so X is parent of Y
(c) Unit Clause - a special Case
Unlike the previous example of conditional truth, one often encounters
relationships that hold unconditionally.
‡ In Prolog the clauses that are unconditionally true are called
unit clauses or facts.
‡ Example : Suppose the relationship 'X' is
the father of 'Y' is unconditionally true.
This relationship as a Prolog clause is
father(X, Y) :- true.
Interpreted as : the relationship of father between X and Y is always
true; or simply stated as X is father of Y .
‡ Goal true is built-in in Prolog and always holds.
‡ Prolog offers a simpler syntax to express a unit clause or fact :
father(X, Y).
i.e. the " :- true " part is simply omitted.
(d) Queries
In Prolog, directives are a kind of statement, and queries are a
special case of directives.
‡ Syntactically, directives are clauses with an empty left-hand side.
Example : ?- grandparent(Q, Z).
This query is interpreted as : Who is a grandparent of Z ?
By issuing queries, Prolog tries to establish the validity of
specific relationships.
The answer from the previous slides is (X is grand parent of Z).
‡ The result of executing a query is either success or failure.
Success means the goals specified in the query hold according to the facts
and rules of the program.
Failure means the goals specified in the query do not hold
according to the facts and rules of the program.
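Success and failure of a query can be sketched in Python as follows. This is a simplified illustration, not how a Prolog engine is actually implemented: the facts are hypothetical, rules are not supported (the grandparent query is expanded by hand into its two parent sub-goals), and capitalised strings are assumed to be variables.

```python
# Hypothetical facts, invented for illustration.
FACTS = {("parent", "ann", "bob"), ("parent", "bob", "carl")}

def is_variable(symbol):
    """Assumption of this sketch: variables start with an uppercase letter or '_'."""
    return isinstance(symbol, str) and (symbol[:1].isupper() or symbol[:1] == "_")

def match(pattern, fact, bindings):
    """Extend bindings so that pattern equals fact; return None if impossible."""
    if len(pattern) != len(fact):
        return None
    result = dict(bindings)
    for p, f in zip(pattern, fact):
        if is_variable(p):
            if result.setdefault(p, f) != f:
                return None
        elif p != f:
            return None
    return result

def solve(goals, bindings=None):
    """Yield every set of bindings that satisfies all goals against FACTS."""
    bindings = {} if bindings is None else bindings
    if not goals:
        yield bindings
        return
    first, rest = goals[0], goals[1:]
    for fact in FACTS:
        extended = match(first, fact, bindings)
        if extended is not None:
            yield from solve(rest, extended)

# ?- grandparent(Q, Z).   expanded into the body parent(Q, Y), parent(Y, Z)
answers = list(solve([("parent", "Q", "Y"), ("parent", "Y", "Z")]))
print(answers)                                   # success: one set of bindings
print(list(solve([("parent", "carl", "X")])))    # failure: []
```

A non-empty list of bindings corresponds to success; an empty list corresponds to failure.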
• Programming Paradigms : Models of Computation
There are three basic computational models :
(a) Imperative, (b) Functional, and (c) Logic.
In addition to these, there are two programming paradigms :
(a) concurrent and (b) object-oriented programming.
While these two are not models of computation, they rank in
importance with computational models.
Models of Computation :
A computational model is a collection of values and operations, while
computation is the application of a sequence of operations to a value to
yield another value.
A complete description of a programming language includes the
computational model, syntax, semantics, and pragmatic considerations that
shape the language.
(a) Imperative Model
The Imperative model of computation, consists of a state and an
operation of assignment which is used to modify the state.
Programs consist of sequences of commands.
Computations are changes in the state.
Example : Linear function
A linear function y = 2x + 3 can be written as
Y := 2 ∗ X + 3
The implementation must determine the value of X in the state
and then create a new state which differs from the old state.
New State: X = 3, Y = 9
The imperative model is closest to the hardware model on which
programs are executed, which makes it the most efficient model in terms of
execution time.
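The state change in the example can be made explicit with a short Python sketch. Modelling the state as a dictionary and the helper name assign are assumptions of this illustration.

```python
# Old state: only X has a value.
state = {"X": 3}

def assign(state, name, value):
    """Assignment: return the new state obtained by setting name to value."""
    new_state = dict(state)
    new_state[name] = value
    return new_state

# Command  Y := 2 * X + 3  reads X from the old state and produces a new state.
new_state = assign(state, "Y", 2 * state["X"] + 3)
print(new_state)  # {'X': 3, 'Y': 9}
```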
(b) Functional model
The Functional model of computation consists of a set of values,
functions, and the operation of function application. The functions may be named
and composed with other functions. A function can take other functions as
arguments and return results.
Programs consist of definitions of functions.
Computations are applications of functions to values.
‡ Example 1 : Linear function
A linear function y = 2x + 3 can be defined as :
f (x) = 2 ∗ x + 3
‡ Example 2 : Determine a value for Circumference.
Assign a value to Radius, that determines a value for Circumference.
Circumference = 2 × pi × radius , where pi = 3.14
Generalize Circumference with the variable "radius", i.e.
Circumference(radius) = 2 × pi × radius , where pi = 3.14
Functional models have been developed over many years. The notations and
methods form the base upon which problem solving methodologies rest.
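Both examples, together with function composition, can be sketched in Python. The helper compose and the name g are assumptions added for illustration; PI uses the value 3.14 given in the text.

```python
PI = 3.14  # value of pi used in the text

def f(x):
    """Linear function y = 2x + 3."""
    return 2 * x + 3

def circumference(radius):
    """Circumference as a function of the radius."""
    return 2 * PI * radius

def compose(outer, inner):
    """Take two functions as arguments and return the function x -> outer(inner(x))."""
    return lambda x: outer(inner(x))

g = compose(f, f)          # g(x) = f(f(x))
print(f(3))                # 9
print(g(3))                # 21
print(circumference(1))
```

No state is modified here; each computation is only the application of a function to a value.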
(c) Logic Model
The logic model of computation is based on relations and logical
inference.
Programs consist of definitions of relations.
Computations are inferences (a computation is a proof).
‡ Example 1 : Linear function
A linear function y = 2x + 3 can be represented as :
f (X , Y) if Y is 2 ∗ X + 3.
Here the function is represented as the relation between X and Y.
‡ Example 2: Determine a value for Circumference.
The circumference computation can be represented
as:
circle (R , C) if Pi = 3.14 and C = 2 ∗ Pi ∗ R.
Here the function is represented as the relation between radius
R and circumference C.
‡ Example 3: Determine the mortality of Socrates and Penelope.
The program is to determine the mortality of Socrates and Penelope.
The facts given are that Socrates and Penelope are human.
The rule is that all humans are mortal, that is
for all X, if X is human then X is mortal.
To determine the mortality of Socrates or Penelope, make the
assumption that there are no mortals, that is ¬ mortal (Y)
‡ The equivalent form of the facts and rules stated before is
human (Socrates)
mortal (X) if human (X)
‡ To determine the mortality of Socrates and Penelope, we made the
assumption that there are no mortals, i.e. ¬ mortal (Y)
‡ Computation (proof) that Socrates is mortal
1. (a) human(Socrates) : Fact
2. mortal(X) if human(X) : Rule
3. ¬mortal(Y) : Assumption
4. (a) X = Y : from 2 & 3 by unification
4. (b) ¬human(Y) : and modus tollens
5. Y = Socrates : from 1 and 4 by unification
6. Contradiction : from 5, 4b, and 1
‡ Explanation :
* The 1st line is the statement "Socrates is a man."
* The 2nd line translates the phrase "all humans are mortal"
into the equivalent "for all X, if X is a man then X is mortal".
* The 3rd line is added to the set to determine the mortality of Socrates.
* The 4th line is the deduction from lines 2 and 3. It is justified by the
inference rule modus tollens which states that if the conclusion of a
rule is known to be false, then so is the hypothesis.
* Variables X and Y are unified because they have the same value.
* By unification, lines 5, 4b, and 1 produce a contradiction and identify
Socrates as mortal.
* Note that resolution is an inference rule which looks for a
contradiction, and it is facilitated by unification, which determines if
there is a substitution which makes two terms the same.
The logic model formalizes the reasoning process. It is related to relational
data bases and expert systems.
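The unification steps of the proof (lines 4a and 5) can be illustrated with a minimal Python sketch. The representation is an assumption of this illustration: variables are capitalised strings, constants are lowercase strings (so Socrates is written socrates, following Prolog convention), compound terms are tuples, and no occurs check is performed.

```python
def is_var(term):
    """Assumption of this sketch: a variable is a string starting with an uppercase letter."""
    return isinstance(term, str) and term[:1].isupper()

def walk(term, subst):
    """Follow variable bindings in subst until an unbound variable or non-variable is reached."""
    while is_var(term) and term in subst:
        term = subst[term]
    return term

def unify(a, b, subst=None):
    """Return a substitution that makes a and b the same, or None if there is none."""
    subst = {} if subst is None else subst
    a, b = walk(a, subst), walk(b, subst)
    if a == b:
        return subst
    if is_var(a):
        return {**subst, a: b}
    if is_var(b):
        return {**subst, b: a}
    if isinstance(a, tuple) and isinstance(b, tuple) and len(a) == len(b):
        for x, y in zip(a, b):
            subst = unify(x, y, subst)
            if subst is None:
                return None
        return subst
    return None

# Line 4a: mortal(X) from the rule unifies with mortal(Y) from the assumption.
step4 = unify(("mortal", "X"), ("mortal", "Y"))
# Line 5: human(Y) unifies with the fact human(socrates).
step5 = unify(("human", "Y"), ("human", "socrates"), step4)
print(step4)               # {'X': 'Y'}
print(walk("X", step5))    # socrates
```

With Y bound to socrates, the derived ¬human(Y) clashes with the fact human(socrates), which is the contradiction on line 6.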
3.3 Forward versus Backward Reasoning
A rule-based system architecture consists of a set of rules, a set of facts, and
an inference engine. The need is to find what new facts can be derived.
Given a set of rules, there are essentially two ways to generate new
knowledge: one, forward chaining and the other, backward chaining.
Forward chaining : also called data driven.
It starts with the facts, and sees what rules apply.
Backward chaining : also called goal driven.
It starts with something to find out, and looks for rules that will help in
answering it.
Example 1
Rule R1 : IF hot AND smoky THEN fire
Rule R2 : IF alarm_beeps THEN smoky
Rule R3 : IF fire THEN switch_on_sprinklers
Fact F1 : alarm_beeps [Given]
Fact F2 : hot [Given]
Example 2
Rule R1 : IF hot AND smoky THEN ADD fire
Rule R2 : IF alarm_beeps THEN ADD smoky
Rule R3 : IF fire THEN ADD switch_on_sprinklers
Fact F1 : alarm_beeps [Given]
Fact F2 : hot [Given]
Example 3 : A typical Forward Chaining
Rule R1 : IF hot AND smoky THEN ADD fire
Rule R2 : IF alarm_beeps THEN ADD smoky
Rule R3 : IF fire THEN ADD switch_on_sprinklers
Fact F1 : alarm_beeps [Given]
Fact F2 : hot [Given]
Fact F3 : smoky [from F1 by R2]
Fact F4 : fire [from F2, F3 by R1]
Fact F5 : switch_on_sprinklers [from F4 by R3]
Example 4 : A typical Backward Chaining
Rule R1 : IF hot AND smoky THEN fire
Rule R2 : IF alarm_beeps THEN smoky
Rule R3 : IF fire THEN switch_on_sprinklers
Fact F1 : hot [Given]
Fact F2 : alarm_beeps [Given]
Goal : Should I switch sprinklers on?
• Forward Chaining
The Forward chaining system, properties , algorithms, and conflict
resolution strategy are illustrated.
Forward chaining system
[Figure: forward chaining system, showing the User, Working Memory, Inference Engine and Rule Base, linked by flows of facts and rules.]
‡ facts are held in a working memory
‡ condition-action rules represent actions to be taken when
specified facts occur in working memory.
‡ typically, actions involve adding or deleting facts from the working
memory.
Properties of Forward Chaining
‡ all rules which can fire do fire.
‡ can be inefficient - can lead to spurious rules firing and unfocused problem
solving.
‡ the set of rules that can fire is known as the conflict set.
‡ the decision about which rule to fire is conflict resolution.
Forward chaining algorithm - I
Repeat
‡ Collect the rule whose condition matches a fact in WM.
‡ Do actions indicated by the rule.
(add facts to WM or delete facts from WM)
Until problem is solved or no condition matches
Apply to Example 2 extended (adding 2 more rules and 1 fact)
Rule R1 : IF hot AND smoky THEN ADD fire
Rule R2 : IF alarm_beeps THEN ADD smoky
Rule R3 : IF fire THEN ADD switch_on_sprinklers
Rule R4 : IF dry THEN ADD switch_on_humidifier
Rule R5 : IF sprinklers_on THEN DELETE dry
Fact F1 : alarm_beeps [Given]
Fact F2 : hot [Given]
Fact F3 : dry [Given]
Now, two rules can fire (R2 and R4) :
Rule R4 : ADD switch_on_humidifier [from F3]
Rule R2 : ADD smoky [from F1]
followed by the sequence of actions :
ADD fire [from F2 by R1]
ADD switch_on_sprinklers [by R3]
DELETE dry [by R5], i.e. humidifier is off : a conflict !
Forward chaining algorithm - II (applied to Example 2 above)
Repeat
‡ Collect the rules whose conditions match facts in WM.
‡ If more than one rule matches as stated above then
◊ Use conflict resolution strategy to eliminate all but one
‡ Do actions indicated by the rules
(add facts to WM or delete facts from WM)
Until problem is solved or no condition matches
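Algorithm II can be sketched in Python on the extended example. Several choices here are assumptions of the sketch and not part of the notes: rules are tuples, each rule is allowed to fire at most once (a crude form of refractoriness), the default conflict resolution simply picks the first rule in rule order, and the condition of R5 is taken to be the fact switch_on_sprinklers so that it chains from R3.

```python
# (name, conditions, action, fact)
RULES = [
    ("R1", {"hot", "smoky"}, "ADD", "fire"),
    ("R2", {"alarm_beeps"}, "ADD", "smoky"),
    ("R3", {"fire"}, "ADD", "switch_on_sprinklers"),
    ("R4", {"dry"}, "ADD", "switch_on_humidifier"),
    ("R5", {"switch_on_sprinklers"}, "DELETE", "dry"),  # assumed condition name
]

def forward_chain(facts, rules, choose=lambda conflict_set: conflict_set[0]):
    """Fire one rule per cycle until no unfired rule matches the working memory."""
    wm = set(facts)          # working memory
    fired = []               # names of the rules fired, in order
    while True:
        # Conflict set: rules whose conditions all hold and that have not fired yet.
        conflict_set = [r for r in rules if r[1] <= wm and r[0] not in fired]
        if not conflict_set:
            return wm, fired
        name, _, action, fact = choose(conflict_set)   # conflict resolution
        if action == "ADD":
            wm.add(fact)
        else:
            wm.discard(fact)
        fired.append(name)

wm, fired = forward_chain({"alarm_beeps", "hot", "dry"}, RULES)
print(fired)        # ['R2', 'R1', 'R3', 'R4', 'R5']
print(sorted(wm))
```

In this run the humidifier is switched on by R4 and dry is then deleted by R5, which reproduces the conflict noted in the trace; a different choose function would change the firing order.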
Conflict Resolution Strategy
Conflict set is the set of rules that have their conditions satisfied by
working memory elements.
Conflict resolution normally selects a single rule to fire.
The popular conflict resolution mechanisms are :
Refractory, Recency, Specificity.
◊ Refractory
‡ a rule should not be allowed to fire more than once on the
same data.
‡ discard executed rules from the conflict set.
‡ prevents undesired loops.
◊ Recency
‡ rank instantiations in terms of the recency of the elements in
the premise of the rule.
‡ rules which use more recent data are preferred.
‡ working memory elements are time-tagged indicating at what
cycle each fact was added to working memory.
◊ Specificity
‡ rules which have a greater number of conditions and are
therefore more difficult to satisfy, are preferred to more
general rules with fewer conditions.
‡ more specific rules are "better" because they take more of the
data into account.
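The three mechanisms can each be written as a small selection function over a conflict set. This is only a sketch: the rule dictionaries, the time tags and the way fired instantiations are recorded (rule name plus matched facts) are assumptions chosen for illustration, using rules R1 and R4 from the earlier example.

```python
# A conflict set with two rules from the earlier example.
CONFLICT_SET = [
    {"name": "R1", "conditions": {"hot", "smoky"}},
    {"name": "R4", "conditions": {"dry"}},
]
# Cycle at which each fact was added to working memory (hypothetical values).
TIME_TAGS = {"alarm_beeps": 0, "hot": 0, "dry": 0, "smoky": 1}

def refractory(conflict_set, fired):
    """Discard rules that have already fired on the same data."""
    return [r for r in conflict_set
            if (r["name"], frozenset(r["conditions"])) not in fired]

def by_recency(conflict_set, time_tags):
    """Prefer the rule whose newest matched fact was added most recently."""
    return max(conflict_set,
               key=lambda r: max(time_tags[c] for c in r["conditions"]))

def by_specificity(conflict_set):
    """Prefer the rule with the greater number of conditions."""
    return max(conflict_set, key=lambda r: len(r["conditions"]))

print(by_recency(CONFLICT_SET, TIME_TAGS)["name"])   # R1 (uses smoky, added at cycle 1)
print(by_specificity(CONFLICT_SET)["name"])          # R1 (two conditions)
already = {("R1", frozenset({"hot", "smoky"}))}
print([r["name"] for r in refractory(CONFLICT_SET, already)])  # ['R4']
```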
Alternative to Conflict Resolution – Use Meta Knowledge
Instead of conflict resolution strategies, sometimes we want to use
knowledge in deciding which rules to fire. Meta-rules reason about
which rules should be considered for firing. They direct reasoning rather
than actually performing reasoning.
‡ Meta-knowledge : knowledge about knowledge
to guide search.
‡ Example of meta-knowledge
IF conflict set contains any rule (c , a) such that
a = "animal is mammal"
THEN fire (c , a)
‡ This example shows how meta-knowledge encodes knowledge about how
to guide the search for a solution.
‡ Meta-knowledge is explicitly coded in the form of rules, as with "object
level" knowledge.
• Backward Chaining
Backward chaining system and the algorithm are illustrated.
Backward chaining system
‡ Backward chaining means reasoning from goals back to
facts. The idea is to focus the search.
‡ Rules and facts are processed using a backward chaining interpreter.
‡ Checks hypothesis, e.g. "should I switch the sprinklers on?"
Backward chaining algorithm
‡ Prove goal G
If G is in the initial facts, it is proven.
Otherwise, find a rule which can be used to conclude G, and
try to prove each of that rule's conditions.
[Figure: goal tree for the example. switch_on_sprinklers is concluded from fire; fire from hot and smoky; smoky from alarm_beeps.]
Encoding of rules
Rule R1 : IF hot AND smoky THEN fire
Rule R2 : IF alarm_beeps THEN smoky
Rule R3 : IF fire THEN switch_on_sprinklers
Fact F1 : hot [Given]
Fact F2 : alarm_beeps [Given]
Goal : Should I switch sprinklers on?
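The backward chaining algorithm can be sketched as a short recursive function in Python. The rule encoding as (conditions, conclusion) pairs and the seen parameter, which guards against circular rules, are assumptions of this sketch.

```python
RULES = [
    ({"hot", "smoky"}, "fire"),             # R1
    ({"alarm_beeps"}, "smoky"),             # R2
    ({"fire"}, "switch_on_sprinklers"),     # R3
]
FACTS = {"hot", "alarm_beeps"}

def prove(goal, facts, rules, seen=frozenset()):
    """Prove goal: it is a known fact, or some rule concludes it and all
    of that rule's conditions can themselves be proven."""
    if goal in facts:
        return True
    if goal in seen:                 # already trying to prove this goal: give up
        return False
    for conditions, conclusion in rules:
        if conclusion == goal and all(
            prove(c, facts, rules, seen | {goal}) for c in conditions
        ):
            return True
    return False

print(prove("switch_on_sprinklers", FACTS, RULES))    # True
print(prove("switch_on_sprinklers", {"hot"}, RULES))  # False: smoky cannot be proven
```

Only rules relevant to the goal are ever examined, which is the sense in which backward chaining focuses the search.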
• Forward vs Backward Chaining
‡ The choice depends on the problem, and on properties of the rule set.
‡ Backward chaining is likely to be better if there is a clear hypothesis.
Examples : Diagnostic problems or classification problems, Medical
expert systems
‡ Forward chaining may be better if there is a less clear hypothesis and
we want to see what can be concluded from the current situation;
Examples : Synthesis systems - design / configuration.
3.4 Control Knowledge
An algorithm consists of : logic component, that specifies the knowledge
to be used in solving problems, and control component, that determines
the problem-solving strategies by means of which that knowledge is used.
Thus Algorithm = Logic + Control .
The logic component determines the meaning of the algorithm whereas
the control component only affects its efficiency.
An algorithm may be formulated in different ways, producing the same
behavior. One formulation may have a clear statement in the logic
component but employ a sophisticated problem-solving strategy in the
control component. The other formulation may have a complicated
logic component but employ a simple problem-solving strategy.
The efficiency of an algorithm can often be improved by improving the
control component without changing the logic of the algorithm and
therefore without changing the meaning of the algorithm.
The trend in databases is towards the separation of logic and control.
The programming languages today do not distinguish between them.
The programmer specifies both logic and control in a single language.
The execution mechanism exercises only the most rudimentary
problem-solving capabilities.
Computer programs will be more often correct, more easily improved,
and more readily adapted to new problems when programming
languages separate logic and control, and when execution mechanisms
provide more powerful problem-solving facilities of the kind provided by
intelligent theorem-proving systems.
UNIT 04
PLANNING AND MACHINE LEARNING
• Reasoning is the act of deriving a conclusion from certain premises using a
given methodology.
• Reasoning is a process of thinking; reasoning is logically arguing;
reasoning is drawing inference.
• When a system is required to do something, that it has not been explicitly
told how to do, it must reason. It must figure out what it needs to know
from what it already knows.
• Many types of Reasoning have long been identified and recognized, but
many questions regarding their logical and computational properties still
remain controversial.
• The popular methods of Reasoning include abduction, induction, model-based,
explanation and confirmation. All of them are intimately related to
problems of belief revision and theory development, knowledge
assimilation, discovery and learning.
1. Reasoning
Any knowledge system, if it has not been explicitly told how to do
something, must reason in order to do it.
The system must figure out what it needs to know from what it already knows.
Example
If we know : Robins are birds. All birds have wings.
Then if we ask : Do robins have wings?
Some reasoning (although very simple) has to go on in answering the question.
1.1 Definitions :
• Reasoning is the act of deriving a conclusion from certain premises using
a given methodology.
Any knowledge system must reason, if it is required to do something
which it has not been told explicitly how to do.
For reasoning, the system must find out what it needs to know from
what it already knows.
Example :
If we know : Robins are birds.
All birds have wings.
Then if we ask : Do robins have wings?
To answer this question, some reasoning must go on.
Human reasoning capabilities are divided into three areas:
‡ Mathematical Reasoning – axioms, definitions, theorems, proofs
‡ Logical Reasoning – deductive, inductive, abductive
‡ Non-Logical Reasoning – linguistic, language
These three areas of reasoning are in every human being, but the
ability level depends on education, environment and genetics.
The IQ (Intelligence Quotient) is the summation of mathematical
reasoning skill and logical reasoning.
The EQ (Emotional Quotient) depends mostly on non-logical reasoning
capabilities.
Note : Logical Reasoning is our concern in AI.
• Logical Reasoning
Logic is a language for reasoning. It is a collection of rules, called
logic arguments, that we use when doing logical reasoning.
Logic reasoning is the process of drawing conclusions from premises using
rules of inference. The study of logic is divided into formal and informal logic.
Formal logic is sometimes called symbolic logic.
Symbolic logic is the study of symbolic abstractions (constructs) that
capture the formal features of logical inference by a formal system.
A formal system consists of two components, a formal language plus a set of
inference rules. The formal system has axioms.
An axiom is a sentence that is always true within the system.
Sentences that are derived using the system's axioms and rules of derivation are
called theorems.
Formal Logic
Formal logic is the study of inference with purely formal content,
i.e. where content is made explicit.
Examples - Propositional logic and Predicate logic.
‡ Here the logical arguments are a set of rules for manipulating
symbols. The rules are of two types :
◊ Syntax rules : say how to build meaningful expressions.
◊ Inference rules : say how to obtain true formulas from other
true formulas.
‡ Logic also needs semantics, which says how to assign meaning to
expressions.
Informal Logic
Informal logic is the study of natural language arguments.
‡ The analysis of the argument structures in ordinary language is
part of informal logic.
‡ The focus lies in distinguishing good arguments (valid) from bad
arguments or fallacies (invalid).
Formal Systems
Formal systems can have the following three properties :
‡ Consistency : System's theorems do not contradict.
‡ Soundness : System's rules of derivation will never infer
anything false, so long as we start with only true premises.
‡ Completeness : There are no true sentences in the system that
cannot be proved using the derivation rules of the system.
System Elements
Formal systems consist of the following elements :
‡ A finite set of symbols for constructing formulae.
‡ A grammar, which is a way of constructing well-formed formulae (wff).
‡ A set of axioms; each axiom has to be a wff.
‡ A set of inference rules.
‡ A set of theorems.
A well-formed formula (wff) is any string generated by a grammar.
e.g., the sequence of symbols ((α → β ) → (¬ β → ¬ α )) is a wff
because it is grammatically correct in propositional logic.
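What "generated by a grammar" means can be made concrete with a small recognizer in Python. The grammar used is an assumption of this sketch, chosen to cover the example only: a wff is an atom (α or β), or ¬ followed by a wff, or two wffs joined by → inside parentheses.

```python
def is_wff(text, atoms=("α", "β")):
    """Check text against the assumed grammar:
       wff ::= atom | '¬' wff | '(' wff '→' wff ')'"""
    tokens = [ch for ch in text if not ch.isspace()]
    pos = 0

    def parse():
        """Consume one wff starting at tokens[pos]; return True on success."""
        nonlocal pos
        if pos >= len(tokens):
            return False
        token = tokens[pos]
        if token in atoms:
            pos += 1
            return True
        if token == "¬":
            pos += 1
            return parse()
        if token == "(":
            pos += 1
            if not parse():
                return False
            if pos >= len(tokens) or tokens[pos] != "→":
                return False
            pos += 1
            if not parse():
                return False
            if pos >= len(tokens) or tokens[pos] != ")":
                return False
            pos += 1
            return True
        return False

    # The whole input must be consumed by exactly one wff.
    return parse() and pos == len(tokens)

print(is_wff("((α → β ) → (¬ β → ¬ α ))"))  # True
print(is_wff("(α → )"))                      # False
```

The check is purely syntactic: it says nothing about whether the formula is true, only whether it is well formed.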
Formal Language
A formal language may be viewed as being analogous to a collection
of words or a collection of sentences.
‡ In computer science, a formal language is defined by precise
mathematical or machine-processable formulas.
‡ A formal language L is characterized as a set F of finite-length
sequences of elements drawn from a specified finite set A of
symbols.
‡ Mathematically, it is an unordered pair L = { A, F }
‡ If the elements of F are regarded as words,
then the set A is called the alphabet of L, and
the elements of F are called words.
‡ If the elements of F are regarded as sentences,
then the set A is called the lexicon or vocabulary of F, and
the elements of F are then called sentences.
‡ The mathematical theory that treats formal languages in general
is known as formal language theory.
• Uncertainty in Reasoning
The world is an uncertain place; often the knowledge is imperfect,
which causes uncertainty. Therefore reasoning must be able to
operate under uncertainty.
AI systems must have the ability to reason under conditions of uncertainty.
Uncertainties : Desired action
‡ Incomplete knowledge : Compensate for lack of knowledge
‡ Inconsistent knowledge : Resolve ambiguities and contradictions
‡ Changing knowledge : Update the knowledge base over time
Fatima Michael College of Engineering & Technology
Fatima Michael College of Engineering & Technology
AI - Reasoning
• Monotonic Logic
Formal logic is a set of rules for making deductions that seem
self-evident. Mathematical logic formalizes such deductions with rules
precise enough to program a computer to decide whether an argument is
valid, representing objects and relationships symbolically.
Example : Predicate logic and the inferences we perform on it.
All humans are mortal. Socrates is a human. Therefore Socrates is mortal.
In monotonic reasoning, if we enlarge the set of axioms we cannot retract
any existing assertions or axioms.
‡ Most formal logics have a monotonic consequence relation, meaning
that adding a formula to a theory never reduces its set
of consequences. In other words, a logic is monotonic if the truth
of a proposition does not change when new information (axioms)
is added. Traditional logic is monotonic.
‡ In the mid-1970s, Marvin Minsky and John McCarthy pointed out that
pure classical logic is not adequate to represent the commonsense
nature of human reasoning. The reason is that human reasoning is
non-monotonic in nature: we reach conclusions from
certain premises that we would not reach if certain other sentences were
included in our premises.
‡ Non-monotonic human reasoning arises because our
knowledge about the world is always incomplete, so we are
forced to reason in the absence of complete information. We therefore
often revise our conclusions when new information becomes available.
‡ Thus the need for non-monotonic reasoning in AI was recognized, and
several formalizations of non-monotonic reasoning were developed.
Only non-monotonic logic is presented in the next few slides.
• Non-Monotonic Logic
The inadequacy of monotonic logic for reasoning was noted in the previous slide.
A monotonic logic cannot handle :
Reasoning by default : because consequences may be derived
only because of lack of evidence to the contrary.
Abductive reasoning : because consequences are only deduced as
most likely explanations.
Belief revision : because new knowledge may contradict old beliefs.
A non-monotonic logic is a formal logic whose consequence relation
is not monotonic. A logic is non-monotonic if the truth of a proposition
may change when new information (axioms) is added.
‡ Allows a statement to be retracted.
‡ Used to formalize plausible (believable) reasoning.
Example 1 :
Birds typically fly.
Tweety is a bird.
--------------------------
Tweety (presumably) flies.
‡ The conclusion of a non-monotonic argument may not be correct.
Example 2 : (Ref. Example 1)
If Tweety is a penguin, it is incorrect to conclude that Tweety flies.
(Incorrect because, in Example 1, default rules were applied when
case-specific information was not available.)
‡ All non-monotonic reasoning is concerned with consistency.
Inconsistency is resolved by removing the relevant conclusion(s)
derived by default rules, as shown in the example below.
Example 3 :
A proposition such as "Tweety is a bird" is taken as true, and it
accepts a default that is normally true, such as "Birds typically fly".
The conclusion derived was "Tweety flies". When an inconsistency is
recognized, only the truth value of the last type (the derived conclusion) is changed.
1.2 Different Methods of Reasoning
There are mainly three kinds of logical reasoning: Deduction, Induction, Abduction.
Deduction
‡ Example: "When it rains, the grass gets wet. It rains. Thus, the
grass is wet."
Deduction determines the conclusion: it uses the rule and its
precondition to reach the conclusion.
‡ Applying a general principle to a special case.
‡ Using theory to make predictions.
‡ Usage: Inference engines, Theorem provers, Planning.
Induction
‡ Example: "The grass has been wet every time it has rained. Thus,
when it rains, the grass gets wet."
Induction determines the rule: it learns the rule after
numerous examples of the conclusion following the precondition.
‡ Deriving a general principle from special cases.
‡ From observations to generalizations to knowledge.
‡ Usage: Neural nets, Bayesian nets, Pattern recognition.
Abduction
‡ Example: "When it rains, the grass gets wet. The grass is wet, so it
must have rained."
Abduction determines the precondition: it uses the conclusion and
the rule to argue that the precondition could explain the conclusion.
‡ Guessing that some general principle can relate a given pattern of
cases.
‡ Extracting hypotheses to form a tentative theory.
‡ Usage: Knowledge discovery, Statistical methods, Data mining.
Analogy
‡ Example: "An atom, with its nucleus and electrons, is like the solar
system, with its sun and planets."
Analogy illustrates an idea by means of a more
familiar idea that is similar to it in some significant features and
is thus said to be analogous to it.
‡ Finding a common pattern in different cases.
‡ Usage: Matching labels, Matching sub-graphs, Matching
transformations.
Note: Deductive reasoning and Inductive reasoning are the two most
commonly used explicit methods of reasoning to reach a conclusion.
• More about different methods of Reasoning
Deduction Example
Reason from facts and general principles to other facts.
Guarantees that the conclusion is true, provided the premises are true.
‡ Modus Ponens : a valid form of argument affirming the antecedent.
◊ If it is rainy, John carries an umbrella
It is rainy
----------------- (dotted line read as "therefore")
John carries an umbrella.
◊ If p then q
p
-------
q
‡ Modus Tollens : a valid form of argument denying the consequent.
◊ If it is rainy, John carries an umbrella
John does not carry an umbrella
----------------- (dotted line read as "therefore")
It is not rainy
◊ If p then q
not q
-------
not p
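The validity of these two argument forms can be checked mechanically with a truth table. The sketch below is illustrative only; the helper names `implies` and `is_valid` are hypothetical. It also checks the abductive pattern ("if p then q; q; therefore p"), which is not deductively valid.

```python
from itertools import product


def implies(p, q):
    """Material implication: 'if p then q'."""
    return (not p) or q


def is_valid(premises, conclusion):
    """An argument form is valid iff the conclusion is true in every
    truth assignment that makes all premises true."""
    return all(
        conclusion(p, q)
        for p, q in product([True, False], repeat=2)
        if all(premise(p, q) for premise in premises)
    )


modus_ponens = is_valid([implies, lambda p, q: p], lambda p, q: q)
modus_tollens = is_valid([implies, lambda p, q: not q], lambda p, q: not p)
affirming_consequent = is_valid([implies, lambda p, q: q], lambda p, q: p)

print(modus_ponens, modus_tollens, affirming_consequent)
```

The third form fails on the assignment p = false, q = true, which is why abduction only suggests an explanation and does not guarantee it.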
Induction Example
Reasoning from many instances to all instances.
‡ Good Movie
Fact: You have liked all movies starring Mery.
Inference: You will like her next movie.
‡ Birds
Facts: Woodpeckers, swifts, eagles, finches have four
toes on each foot.
Inductive inference: All birds have 4 toes on each foot.
(Note: partridges have only 3.)
‡ Objects
Facts: Cars, bottles, blocks fall if not held up.
Inductive inference: If not supported, an object will fall.
(Note: an unsupported helium balloon will rise.)
‡ Medicine
Noted: People who had cowpox did not get smallpox.
Induction: Cowpox prevents smallpox.
Problem : Sometimes the inference is correct, sometimes not.
Advantage : Inductive inference may be useful even if not correct. It
generates a proposition which may be validated deductively.
Abduction Example
A common form of human reasoning: "Inference to the best explanation".
In abductive reasoning you make an assumption which, if true,
together with your general knowledge, will explain the facts.
‡ Dating
Fact: Mary asks John to a party.
Abductive inferences: Mary likes John.
John is Mary's last choice.
Mary wants to make someone else jealous.
‡ Smoking house
Fact: A large amount of black smoke is coming
from a home.
Abduction 1: the house is on fire.
Abduction 2: bad cook.
‡ Diagnosis
Facts: A thirteen-year-old boy has a sharp pain
in his right side, a fever, and a high white
blood count.
Abductive inference: Appendicitis.
Problem : Not always correct; many explanations possible.
Advantage : Understandable conclusions.
Analogy Example
Analogical reasoning yields conjectures, possibilities.
If A is like B in some ways, then infer A is like B in other ways.
‡ Atom and Solar System
Statement: An atom, with its nucleus and electrons, is like the
solar system, with its sun and planets.
Inferences: Electrons travel around the nucleus.
Orbits are circular.
? Orbits are all in one plane.
? Electrons have little people living on them.
Idea: Transfer information from known (source)
to unknown (target).
‡ Sun and Girl
Statement: She is like the sun to me.
Inferences: She lights up my life.
She gives me warmth.
? She is gaseous.
? She is spherical.
‡ Salesman Logic
Statement: John has a fancy car and a pretty girlfriend.
Inferences: If Peter buys a fancy car,
then Peter will have a pretty girlfriend.
Problem : Few analogical inferences are correct.
Advantage : Suggests novel possibilities. Helps to organize information.
1.3 Sources of Uncertainty in Reasoning
In many problem domains it is not possible to create complete, consistent
models of the world. Therefore agents (and people) must act in uncertain
worlds (which the real world is). We want an agent to make rational
decisions even when there is not enough information to prove that an
action will work.
Uncertainty is omnipresent because of
‡ Incompleteness
‡ Incorrectness
Uncertainty in Data or Expert Knowledge
‡ Data derived from defaults/assumptions
‡ Inconsistency between knowledge from different experts.
‡ “Best Guesses”
Uncertainty in Knowledge Representation
‡ Restricted model of the real system.
‡ Limited expressiveness of the representation mechanism.
Uncertainty in Rules or Inference Process
‡ Incomplete because too many conditions to be explicitly enumerated
‡ Incomplete because some conditions are unknown
‡ Conflict Resolution
1.4 Reasoning and KR
To a certain extent, reasoning depends on the way the knowledge is
represented or chosen.
A good knowledge representation scheme allows easy, natural and
plausible (credible) reasoning.
Reasoning methods are broadly identified as :
‡ Formal reasoning : Uses basic rules of inference with logic
knowledge representations.
‡ Procedural reasoning : Uses procedures that specify how to
perhaps solve sub-problems.
‡ Reasoning by analogy : This is what humans do, but it is more difficult
for AI systems.
‡ Generalization and abstraction : This is also what humans do; these are basically
learning and understanding methods.
‡ Meta-level reasoning : Uses knowledge about what we know
and orders it as per importance.
Note : Whatever the reasoning method, the AI model must be
able to reason under the conditions of uncertainty mentioned before.
1.5 Approaches to Reasoning
There are three different approaches to reasoning under uncertainties.
‡ Symbolic reasoning
‡ Statistical reasoning
‡ Fuzzy logic reasoning
The first two approaches are presented in the subsequent slides.
2. Symbolic Reasoning
The basis for intelligent mathematical software is the integration of the "power
of symbolic mathematical tools" with suitable "proof technology".
Mathematical reasoning enjoys a property called monotonicity, which says,
"If a conclusion follows from given premises A, B, C, …
then it also follows from any larger set of premises, as long as
the original premises A, B, C, … are included."
Human reasoning is not monotonic.
People arrive at conclusions only tentatively, based on partial or incomplete
information, and reserve the right to retract those conclusions as they learn new
facts. Such reasoning is non-monotonic precisely because the set of accepted
conclusions can become smaller when the set of premises is expanded.
2.1 Non-Monotonic Reasoning
Non-monotonic reasoning is a generic name for a class or a specific theory
of reasoning. Non-monotonic reasoning attempts to formalize reasoning
with incomplete information by classical logic systems.
Non-monotonic reasoning is of the following types :
Default reasoning
Circumscription
Truth Maintenance Systems
• Default Reasoning
This is a very common form of non-monotonic reasoning. The conclusions
are drawn based on what is most likely to be true.
There are two approaches to default reasoning, both logic-based :
one is Non-monotonic logic and the other is Default logic.
Non-monotonic logic
It has already been defined. It says, "the truth of a proposition may
change when new information (axioms) is added, and a logic may be
built to allow the statement to be retracted."
Non-monotonic logic is predicate logic with one extension, the modal
operator M, which means "consistent with everything we know". The
purpose of M is to allow consistency checks.
A way to define consistency, in PROLOG style, is :
To show that fact P is consistent, we attempt to prove ¬P.
If we fail, we may say that P is consistent, since ¬P cannot be proved.
Example :
∀ x : plays_instrument(x) ∧ M manage(x) → jazz_musician(x)
States that, for all x, if x plays an instrument and the fact
that x can manage is consistent with all other knowledge, then we
can conclude that x is a jazz musician.
Default Logic
Default logic introduces a new inference rule:
$$\frac{A : B}{C}$$
where
A is known as the prerequisite,
B as the justification, and
C as the consequent.
‡ Read the above inference rule as:
" if A, and if it is consistent with the rest of what is known to
assume that B, then conclude that C ".
‡ The rule says that given the prerequisite, the consequent can be
inferred, provided it is consistent with the rest of the data.
‡ Example : The rule that "birds typically fly" would be represented as
$$\frac{\text{bird}(x) : \text{flies}(x)}{\text{flies}(x)}$$
which says
" If x is a bird and the claim that x flies is consistent with
what we know, then infer that x flies".
‡ Note : Since all we know about Tweety is that
Tweety is a bird, we therefore infer that Tweety flies.
‡ The idea behind non-monotonic reasoning is to reason with first
order logic, and if an inference cannot be obtained then use the set
of default rules available within the first order formulation.
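The default rule "A : B / C" can be sketched in a few lines. This is a deliberately simplified illustration, not a full default-logic engine: beliefs are plain strings, "~" marks negation, and "consistent" is approximated as "the negated literal is not already believed". The names `Default`, `negate` and `apply_defaults` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Default:
    prerequisite: str   # A
    justification: str  # B
    consequent: str     # C


def negate(literal):
    """'~p' <-> 'p' (assumed string encoding of negation)."""
    return literal[1:] if literal.startswith("~") else "~" + literal


def apply_defaults(facts, defaults):
    """Fire defaults until nothing changes (greedy, order-dependent sketch)."""
    beliefs = set(facts)
    changed = True
    while changed:
        changed = False
        for d in defaults:
            applicable = (
                d.prerequisite in beliefs
                and negate(d.justification) not in beliefs  # B is consistent
                and negate(d.consequent) not in beliefs     # C is consistent
                and d.consequent not in beliefs
            )
            if applicable:
                beliefs.add(d.consequent)
                changed = True
    return beliefs


birds_fly = Default("bird(tweety)", "flies(tweety)", "flies(tweety)")

only_bird = apply_defaults({"bird(tweety)"}, [birds_fly])
penguin = apply_defaults({"bird(tweety)", "~flies(tweety)"}, [birds_fly])
print(only_bird)
print(penguin)
```

With only "bird(tweety)" known, the default fires and "flies(tweety)" is added; once "~flies(tweety)" is known, the justification is no longer consistent and the conclusion is not drawn.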
[continuing default logic]
‡ Applying Default Rules :
While applying default rules, it is necessary to check their
justifications for consistency, not only with the initial data, but also with
the consequents of any other default rules that may be applied. The
application of one rule may thus block the application of another.
To solve this problem, the concept of default theory was extended.
‡ Default Theory
It consists of a set of premises W and a set of default rules D.
An extension for a default theory is a set of sentences E which can
be derived from W by applying as many rules of D as possible
(together with the rules of deductive inference) without generating
inconsistency.
Note : D, the set of default rules, has a unique syntax of the form
$$\frac{\alpha(x) : M\,\beta(x)}{\gamma(x)}$$
where
α (x) is the prerequisite of the default rule
M β (x) is the consistency test of the default rule
γ (x) is the consequent of the default rule
The rule can be read as
For all individuals x1 . . . . xm
If α (x) is believed and
If each of β (x) is consistent with our beliefs,
Then γ (x) may be believed.
[continuing default logic]
Example : A default rule says "Typically an American adult owns a car".
$$\frac{\text{American}(x) \land \text{Adult}(x) : M((\exists y)\,.\,\text{car}(y) \land \text{owns}(x,y))}{(\exists y)\,.\,\text{car}(y) \land \text{owns}(x,y)}$$
The rule is explained below :
The rule is only accessed if we wish to know whether or not John
owns a car and an answer cannot be deduced from our current
beliefs.
This default rule is applicable if we can prove from our beliefs that
John is an American and an adult, and believing that there is some
car that is owned by John does not lead to an inconsistency.
If these two sets of premises are satisfied, then the rule states that
we can conclude that John owns a car.
• Circumscription
Circumscription is a non-monotonic logic to formalize the common sense
assumption. Circumscription is a formalized rule of conjecture (guess) that
can be used along with the rules of inference of first order logic.
Circumscription involves formulating rules of thumb with "abnormality"
predicates and then restricting the extension of these predicates,
circumscribing them, so that they apply only to those things to which they
are currently known to apply.
Example : Take the case of the bird Tweety
The rule of thumb that "birds typically fly" is conditional. The
predicate "Abnormal" signifies abnormality with respect to flying ability.
Observe that the rule ∀ x (Bird(x) & ¬ Abnormal(x) → Flies(x)) does not
allow us to infer that "Tweety flies", since we do not know that he is
not abnormal with respect to flying ability.
But if we add axioms which circumscribe the abnormality predicate to
the things currently known to be abnormal, then for the bird Tweety,
who is not among them, the inference can be drawn. This inference is non-monotonic.
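The effect of circumscribing the Abnormal predicate can be illustrated with sets. This is only a sketch under a strong simplifying assumption: the minimal extension of Abnormal is taken to be exactly the individuals explicitly listed as abnormal. The function name `circumscribed_fliers` and the second bird "opus" are hypothetical.

```python
def circumscribed_fliers(birds, known_abnormal):
    """Apply Bird(x) & not Abnormal(x) -> Flies(x), with Abnormal restricted
    to the individuals currently known to be abnormal."""
    abnormal = set(known_abnormal)  # circumscribed extension of Abnormal
    return {x for x in birds if x not in abnormal}


birds = {"tweety", "opus"}

before = circumscribed_fliers(birds, set())       # nothing known to be abnormal
after = circumscribed_fliers(birds, {"tweety"})   # later: Tweety is abnormal
print(before, after)
```

Adding the fact that Tweety is abnormal removes "Tweety flies" from the conclusions, which is the non-monotonic behaviour described above.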
• Truth Maintenance Systems
A Reasoning Maintenance System (RMS) is a critical part of a reasoning
system. Its purpose is to assure that inferences made by the reasoning
system (RS) are valid.
The RS provides the RMS with information about each inference it
performs, and in return the RMS provides the RS with information about
the whole set of inferences.
Several implementations of RMS have been proposed for non-monotonic
reasoning. The important ones are the :
Truth Maintenance Systems (TMS) and
Assumption-based Truth Maintenance Systems (ATMS).
The TMS maintains the consistency of a knowledge base as soon as new
knowledge is added. It considers only one state at a time, so it is not
possible to manipulate multiple environments.
The ATMS is intended to maintain multiple environments.
The typical functions of TMS are presented in the next slide.
[continuing Truth Maintenance Systems]
Truth Maintenance Systems (TMS)
A truth maintenance system maintains consistency in knowledge
representation of a knowledge base.
The functions of TMS are to :
Provide justifications for conclusions
When a problem solving system gives an answer to a user's query, an
explanation of that answer is required.
Example : Advice to a stockbroker is supported by an explanation of
the reasons for that advice. This is constructed by the Inference Engine
(IE) by tracing the justification of the assertion.
Recognize inconsistencies
The Inference Engine (IE) may tell the TMS that some sentences are
contradictory. Then, TMS may find that all those sentences are believed
true, and reports to the IE which can eliminate the inconsistencies by
determining the assumptions used and changing them appropriately.
Example : A statement that either Abbott, or Babbitt, or Cabot is guilty
together with other statements that Abbott is not guilty, Babbitt is not
guilty, and Cabot is not guilty, form a contradiction.
Support default reasoning
In the absence of any firm knowledge, in many situations we want to
reason from default assumptions.
Example : If "Tweety is a bird", then until told otherwise, assume that
"Tweety flies" and for justification use the fact that "Tweety is a bird"
and the assumption that "birds fly".
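Two of the TMS functions listed above, providing justifications and withdrawing conclusions whose support has gone, can be sketched with a toy justification store. This is a minimal illustration, not any particular published TMS algorithm; the class `SimpleTMS` and its methods are hypothetical, and non-monotonic justifications (beliefs that depend on something being absent) are not modelled.

```python
class SimpleTMS:
    """Toy justification-based TMS: a node is believed if it is a premise
    or if some justification has all of its antecedents believed."""

    def __init__(self):
        self.premises = set()
        self.justifications = {}  # node -> list of antecedent sets

    def add_premise(self, node):
        self.premises.add(node)

    def retract_premise(self, node):
        self.premises.discard(node)

    def justify(self, node, antecedents):
        self.justifications.setdefault(node, []).append(frozenset(antecedents))

    def believed(self):
        """Recompute the belief set from the current premises."""
        beliefs = set(self.premises)
        changed = True
        while changed:
            changed = False
            for node, justs in self.justifications.items():
                if node not in beliefs and any(j <= beliefs for j in justs):
                    beliefs.add(node)
                    changed = True
        return beliefs

    def explain(self, node):
        """Return the supporting antecedents of a believed node, if any."""
        if node in self.premises:
            return "premise"
        beliefs = self.believed()
        for j in self.justifications.get(node, []):
            if j <= beliefs:
                return sorted(j)
        return None


tms = SimpleTMS()
tms.add_premise("Tweety is a bird")
tms.add_premise("birds fly")  # default assumption
tms.justify("Tweety flies", ["Tweety is a bird", "birds fly"])

believed_before = tms.believed()
explanation = tms.explain("Tweety flies")
tms.retract_premise("birds fly")  # told otherwise
believed_after = tms.believed()
```

Because beliefs are recomputed from the justifications, retracting the assumption "birds fly" automatically withdraws "Tweety flies" without the problem solver having to track the dependency itself.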
2.2 Implementation Issues
The issues and weaknesses related to the implementation of non-monotonic
reasoning in problem solving are :
How to derive exactly those non-monotonic conclusions that are relevant
to solving the problem at hand while not wasting time on those that
are not necessary.
How to update our knowledge incrementally as problem solving
progresses.
How to overcome the problem where more than one interpretation of the
known facts is qualified or approved by the available inference rules.
In general the theories are not computationally effective, decidable or
semi-decidable.
The solution offered is to divide the reasoning process into two parts :
one, a problem solver that uses whatever mechanism it happens to
have to draw conclusions as necessary, and
second, a truth maintenance system whose job is to maintain
consistency in knowledge representation of a knowledge base.
3. Statistical Reasoning :
In the logic-based approaches described, we have assumed that everything is
either believed false or believed true.
However, it is often useful to represent the fact that we believe that
something is probably true, or true with probability (say) 0.65.
This is useful for dealing with problems where there is randomness and
unpredictability (such as in games of chance) and also for dealing with problems
where we could, if we had sufficient information, work out exactly what is true.
To do all this in a principled way requires techniques for probabilistic reasoning.
In this section, Bayesian probability theory is first described, followed by
a discussion of how uncertainties are treated.
• Recall glossary of terms .
Probabilities :
Usually, are descriptions of the likelihood of some event occurring
(ranging from 0 to 1).
Event :
One or more outcomes of a probability experiment .
Probability Experiment :
Process which leads to well-defined results called outcomes.
Sample Space :
Set of all possible outcomes of a probability experiment.
Independent Events :
Two events, E1 and E2, are independent if the fact that E1 occurs
does not affect the probability of E2 occurring.
Mutually Exclusive Events :
Events E1, E2, ..., En are said to be mutually exclusive if
the occurrence of any one of them automatically implies the
non-occurrence of the remaining n − 1 events.
Disjoint Events :
Another name for mutually exclusive events.
Classical Probability :
Also called the a priori theory of probability.
The probability of event A = number of favorable outcomes f divided by the
total number of possible outcomes n ; i.e., P(A) = f / n.
Assumption: All possible outcomes are equally likely.
Empirical Probability :
Determined through actual experimentation or observation, as the relative
frequency of the event, rather than analytically.
Conditional Probability :
The probability of some event A, given the occurrence of some other
event B. Conditional probability is written P(A|B), and read as "the
probability of A, given B ".
Joint probability :
The probability of two events in conjunction. It is the probability of
both events together. The joint probability of A and B is written P(A ∩
B) ; also written as P(A, B).
Marginal Probability :
The probability of one event, regardless of the other event. The
marginal probability of A is written P(A), and the marginal probability
of B is written P(B).
• Examples
Example 1
Sample Space - Rolling two dice
The sums can be 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12.
Note that these sums are not equally likely. The only way to get a
sum of 2 is to roll a 1 on both dice, but a sum of 4 can be obtained from the
outcomes (1,3), (2,2), or (3,1).
The table below illustrates the sample space for the sums obtained.
                Second die
First die    1   2   3   4   5   6
    1        2   3   4   5   6   7
    2        3   4   5   6   7   8
    3        4   5   6   7   8   9
    4        5   6   7   8   9  10
    5        6   7   8   9  10  11
    6        7   8   9  10  11  12
Classical Probability
The table below illustrates the frequency distribution for the above sums.
Sum                 2     3     4     5     6     7     8     9     10    11    12
Frequency           1     2     3     4     5     6     5     4     3     2     1
Relative frequency  1/36  2/36  3/36  4/36  5/36  6/36  5/36  4/36  3/36  2/36  1/36
The classical probability is the relative frequency of each event.
Classical probability P(E) = n(E) / n(S); P(6) = 5 / 36, P(8) = 5 / 36
Empirical Probability
The empirical probability of an event is the relative frequency of a
frequency distribution based upon observation, P(E) = f / n
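The frequency table for the sum of two dice can be reproduced by enumerating the sample space. The sketch below uses only the Python standard library; the variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Sample space: all 36 ordered pairs (first die, second die).
rolls = list(product(range(1, 7), repeat=2))

# Frequency of each sum, then classical probability P(E) = n(E) / n(S).
freq = Counter(a + b for a, b in rolls)
prob = {s: Fraction(n, len(rolls)) for s, n in sorted(freq.items())}

for s in prob:
    # Fraction prints in lowest terms, e.g. 6/36 is shown as 1/6.
    print(s, freq[s], prob[s])
```

The probabilities over all sums add up to 1, as required for a complete sample space.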
Example 2
Mutually Exclusive Events (disjoint) : means nothing in common
Two events are mutually exclusive if they cannot occur at the
same time.
(a) If two events are mutually exclusive,
then the probability of both occurring at the same time is P(A and B) = 0
(b) If two events are mutually exclusive,
then the probability of either occurring is P(A or B) = P(A) + P(B)
Given P(A) = 0.20, P(B) = 0.70, where A and B are disjoint,
then P(A and B) = 0
The table below gives the intersections, i.e. "and", of each pair of
events. "Marginal" means total; P(A), P(B) and P(A and B) are given; the
rest of the values are obtained by addition and subtraction.
            B      B'     Marginal
A           0.00   0.20   0.20
A'          0.70   0.10   0.80
Marginal    0.70   0.30   1.00
Non-Mutually Exclusive Events
Non-mutually exclusive events have some overlap.
When P(A) and P(B) are added, the probability of the intersection (i.e.
"and") is added twice, so subtract it once.
P(A or B) = P(A) + P(B) - P(A and B)
Given : P(A) = 0.20, P(B) = 0.70, P(A and B) = 0.15
            B      B'     Marginal
A           0.15   0.05   0.20
A'          0.55   0.25   0.80
Marginal    0.70   0.30   1.00
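Both tables follow from P(A), P(B) and P(A and B) by addition and subtraction, which can be sketched as below. Exact fractions are used to avoid floating-point rounding; the function names `joint_table` and `p_or` are hypothetical.

```python
from fractions import Fraction


def joint_table(p_a, p_b, p_a_and_b):
    """Fill the 2x2 table of intersections from P(A), P(B), P(A and B)."""
    return {
        ("A", "B"): p_a_and_b,
        ("A", "B'"): p_a - p_a_and_b,
        ("A'", "B"): p_b - p_a_and_b,
        ("A'", "B'"): 1 - p_a - p_b + p_a_and_b,
    }


def p_or(p_a, p_b, p_a_and_b):
    """Addition rule: P(A or B) = P(A) + P(B) - P(A and B)."""
    return p_a + p_b - p_a_and_b


p_a, p_b = Fraction(20, 100), Fraction(70, 100)

disjoint = joint_table(p_a, p_b, Fraction(0))
overlap = joint_table(p_a, p_b, Fraction(15, 100))

print(p_or(p_a, p_b, Fraction(0)))        # mutually exclusive case
print(p_or(p_a, p_b, Fraction(15, 100)))  # overlapping case
```

In the mutually exclusive case the subtraction term is zero, so the general addition rule reduces to P(A) + P(B).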
Example 3
Factorial , Permutations and Combinations
Factorial
The factorial of an integer n ≥ 0 is written as n! .
n! = n × (n-1) × . . . × 2 × 1, and in particular, 0! = 1.
It is the number of permutations of n distinct objects;
e.g., the number of ways to arrange the 5 letters A, B, C, D and E into a word is 5!
5! = 5 x 4 x 3 x 2 x 1 = 120
N! = (N) x (N-1) x (N-2) x . . . x (1)
n! = n (n - 1)! , 0! = 1
Permutation
A permutation is an arrangement of elements (objects or symbols) into
distinguishable sequences. The ordering of the elements is important.
Each unique ordering is a permutation.
The number of permutations of 'n' different things taken 'r' at a time is
given by
$$ {}^{n}P_{r} = \frac{n!}{(n-r)!} $$
(for convenience in writing, hereafter the symbol ${}^{n}P_{r}$ is written as
nPr or P(n,r) )
Example 1
Consider a total of 10 elements, say the integers 1, 2, ..., 10.
A permutation of 3 elements from this set is (5, 3, 4).
Here n = 10 and r = 3.
The number of such unique sequences is calculated as P(10,3) = 720.
Example 2
Find the number of ways to arrange the three letters in the word
CAT into two-letter groups like CA or AC, with no repeated letters.
This means permutations of size r = 2 taken from a set of
size n = 3, so P(n, r) = P(3,2) = 6.
The ways are listed as CA CT AC AT TC TA.
Similarly, for permutations of size r = 4 taken from a set of size n = 10,
$$ P(10,4) = \frac{10!}{(10-4)!} = \frac{10!}{6!} = \frac{10 \times 9 \times 8 \times 7 \times 6 \times 5 \times 4 \times 3 \times 2 \times 1}{6 \times 5 \times 4 \times 3 \times 2 \times 1} = 5040 $$
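The factorial and permutation counts above can be checked with the Python standard library (`math.perm` requires Python 3.8 or later). The variable names are illustrative.

```python
import math
from itertools import permutations

# Factorial: 5! = 120 arrangements of the letters A, B, C, D, E.
five_factorial = math.factorial(5)

# P(n, r) = n! / (n - r)!
p_10_3 = math.perm(10, 3)
p_10_4 = math.factorial(10) // math.factorial(10 - 4)

# Listing the two-letter arrangements of CAT; order matters.
cat_arrangements = ["".join(p) for p in permutations("CAT", 2)]

print(five_factorial, p_10_3, p_10_4)
print(cat_arrangements)
```

Dividing 10! by 6! cancels the trailing factors, leaving 10 × 9 × 8 × 7, which is what `math.perm(10, 4)` computes directly.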
[continuing example 3]
Combinations
Combination means selection of elements (objects or
symbols). The ordering of the elements has no importance.
The number of combinations of 'n' different things taken 'r' at a time is
$$ {}^{n}C_{r} = \binom{n}{r} = \frac{n!}{r!\,(n-r)!} $$
where
r is the size of each combination of elements,
n is the total number of elements from which elements are selected,
! is the factorial operator.
(for convenience in writing, hereafter the symbol ${}^{n}C_{r}$ is written as nCr
or C(n,r) )
Example : Find the number of combinations of size 2 without repeated letters that
can be made from the three letters in the word CAT; order doesn't matter, so AT is the same as TA.
This means combinations of size r = 2 taken from a set of size n = 3,
so C(n, r) = C(3, 2) = 3. The ways are listed as CA CT AT.
Using the formula for the number of combinations of
r objects from a set of n objects:
$$ C(3,2) = \frac{3!}{2!\,(3-2)!} = \frac{3 \times 2 \times 1}{(2 \times 1) \times 1!} = \frac{6}{2} = 3 $$
If n is large then finding n! becomes difficult. An alternate way is
given below.
Find combinations of size r = 4, taken from a set of size n = 10,
$$ C(10,4) = \frac{P(10,4)}{4!} = \frac{10!}{4!\,(10-4)!} = \frac{10!}{4!\,6!} = \frac{10 \times 9 \times 8 \times 7}{4 \times 3 \times 2 \times 1} = 210 $$
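The combination counts can be checked in the same way. The sketch uses `math.comb` and `math.perm` (Python 3.8 or later) and shows the relation C(n, r) = P(n, r) / r!; the variable names are illustrative.

```python
import math
from itertools import combinations

# C(3, 2): two-letter selections from CAT; order does not matter.
c_3_2 = math.comb(3, 2)
cat_selections = ["".join(c) for c in combinations("CAT", 2)]

# C(10, 4) via the permutation count: each selection of 4 elements
# corresponds to 4! orderings, so divide P(10, 4) by 4!.
c_10_4 = math.perm(10, 4) // math.factorial(4)

print(c_3_2, cat_selections)
print(c_10_4)
```

Computing C(n, r) from P(n, r) avoids evaluating the full n!, which is the point of the alternate method described above.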
3.1 Probability and Bayes’ Theorem
In probability theory, Bayes' theorem relates the conditional and
marginal probabilities of two random events.
• Probability : Probabilities are numeric values between 0 and 1 (both
inclusive) that represent ideal uncertainties (not beliefs).
Probability of event A is P(A)
$$ P(A) = \frac{\text{instances of the event } A}{\text{total instances}} $$
P(A) = 0 indicates that A is certain not to occur,
P(A) = 1 indicates total certainty that A occurs, and
0 < P(A) < 1 values in between tell the degree of uncertainty
Probability Rules :
‡ All probabilities are between 0 and 1 inclusive 0 <= P(E) <= 1.
‡ The sum of all the probabilities in the sample space is 1.
‡ The probability of an event which must occur is 1.
‡ The probability of the sample space is 1.
‡ The probability of any event which is not in the sample space is zero.
‡ The probability of an event not occurring is P(E') = 1 - P(E)
Example 1 : A single 6-sided die is rolled.
What is the probability of each outcome?
What is the probability of rolling an even number?
What is the probability of rolling an odd number?
The possible outcomes of this experiment are 1, 2, 3, 4, 5, 6.
The Probabilities are :
P(1) = No of ways to roll 1 / total no of sides = 1/6
P(2) = No of ways to roll 2 / total no of sides = 1/6
P(3) = No of ways to roll 3 / total no of sides = 1/6
P(4) = No of ways to roll 4 / total no of sides = 1/6
P(5) = No of ways to roll 5 / total no of sides = 1/6
P(6) = No of ways to roll 6 / total no of sides = 1/6
P(even) = ways to roll even no / total no of sides = 3/6 = 1/2
P(odd) = ways to roll odd no / total no of sides = 3/6 = 1/2
Example 2 : Roll two dice.
Each die shows one of 6 possible numbers;
the total number of distinct ordered rolls is 6 x 6 = 36.
The joint possibilities for the two dice are:
(1, 1) (1, 2) (1, 3) (1, 4) (1, 5) (1, 6)
(2, 1) (2, 2) (2, 3) (2, 4) (2, 5) (2, 6)
(3, 1) (3, 2) (3, 3) (3, 4) (3, 5) (3, 6)
(4, 1) (4, 2) (4, 3) (4, 4) (4, 5) (4, 6)
(5, 1) (5, 2) (5, 3) (5, 4) (5, 5) (5, 6)
(6, 1) (6, 2) (6, 3) (6, 4) (6, 5) (6, 6)
The rolls that add up to 4 are (1,3), (2,2), (3,1).
The probability of rolling a total of 4 is therefore 3/36 = 1/12,
that is, a chance of (1/12) x 100 = 8.3%.
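The counting argument in this example can be checked by enumerating the sample space. The following is a minimal Python sketch; the variable names are illustrative and not part of the original notes.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely ordered outcomes of rolling two six-sided dice.
outcomes = list(product(range(1, 7), repeat=2))

# Outcomes whose two faces add up to 4.
favourable = [roll for roll in outcomes if sum(roll) == 4]

p_total_4 = Fraction(len(favourable), len(outcomes))
print(favourable)  # [(1, 3), (2, 2), (3, 1)]
print(p_total_4)   # 1/12
```

Using Fraction keeps the result exact, so it can be compared directly with the hand calculation 3/36 = 1/12.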
Conditional probability P(A|B)
A conditional probability is the probability of an event given that
another event has occurred.
Example : Roll two dice. What is the probability that the total of the two dice will be greater than 8, given that the first die is a 6 ?
First, list the joint possibilities for the two dice:
(1, 1) (1, 2) (1, 3) (1, 4) (1, 5) (1, 6)
(2, 1) (2, 2) (2, 3) (2, 4) (2, 5) (2, 6)
(3, 1) (3, 2) (3, 3) (3, 4) (3, 5) (3, 6)
(4, 1) (4, 2) (4, 3) (4, 4) (4, 5) (4, 6)
(5, 1) (5, 2) (5, 3) (5, 4) (5, 5) (5, 6)
(6, 1) (6, 2) (6, 3) (6, 4) (6, 5) (6, 6)
There are 6 outcomes for which the first die is a 6, and of these,
4 outcomes total more than 8: (6,3), (6,4), (6,5), (6,6).
The probability of a total > 8 given that the first die is 6 is
therefore 4/6 = 2/3 .
This probability is written as: P(total>8 | 1st die = 6) = 2/3,
where "total>8" is the event and "1st die = 6" is the condition.
Read as "The probability that the total is > 8 given that die one
is 6 is 2/3."
In general, P(A|B) is the probability of event A given that
event B has occurred.
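The same conditional probability can be obtained by restricting the sample space to the outcomes where the condition holds and counting the event inside that restricted set. The sketch below assumes equally likely outcomes; the helper name conditional_probability is illustrative only.

```python
from fractions import Fraction
from itertools import product

outcomes = list(product(range(1, 7), repeat=2))

def conditional_probability(event, condition, space):
    """P(event | condition) for equally likely outcomes in `space`."""
    given = [o for o in space if condition(o)]   # outcomes where B holds
    both = [o for o in given if event(o)]        # of those, where A also holds
    return Fraction(len(both), len(given))

p = conditional_probability(
    event=lambda roll: sum(roll) > 8,
    condition=lambda roll: roll[0] == 6,
    space=outcomes,
)
print(p)  # 2/3
```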
Probability of A and B is P(A and B)
The probability that events A and B both occur.
Note : Two events are independent if the occurrence of one does not
affect the probability of the occurrence of the other.
‡ If A and B are independent
then the probability that events A and B both occur is:
P(A and B) = P(A) x P(B)
ie the product of the probability of A and the probability of B.
‡ If A and B are not independent
then the probability that events A and B both occur is:
P(A and B) = P(A) x P(B|A) where
P(B|A) is the conditional probability of B given A.
Example 1: P(A and B) if events A and B are independent
Draw a card from a deck , then replace it, draw another card.
Find probability that 1st card is Ace of clubs (event A) and 2nd
card is any Club (event B).
Since there is only one Ace of Clubs, therefore
probability P(A) = 1/52.
Since there are 13 Clubs, the probability P(B) = 13/52 = 1/4.
Therefore, P(A and B) = p(A) x p(B) = 1/52 x 1/4 = 1/208.
Example 2: P(A and B) if events A and B are not independent
Draw a card from a deck, not replacing it, draw another card.
Find probability that both cards are Aces ie the 1st card is Ace
(event A) and the 2nd card is also Ace (event B).
Since 4 of 52 cards are Aces, therefore probability P(A) = 4/52
= 1/13.
Of the 51 remaining cards, 3 are Aces, so the probability that the 2nd
card is also an Ace (event B) is P(B|A) = 3/51 = 1/17.
Therefore, P(A and B) = P(A) x P(B|A) = 1/13 x 1/17 = 1/221.
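Both card examples are instances of the multiplication rule, and the arithmetic can be reproduced exactly with rational numbers. This is a small sketch with illustrative variable names.

```python
from fractions import Fraction

# Example 1: the card is replaced, so the two draws are independent.
p_ace_of_clubs = Fraction(1, 52)
p_any_club = Fraction(13, 52)
p_independent = p_ace_of_clubs * p_any_club
print(p_independent)  # 1/208

# Example 2: no replacement, so the second draw depends on the first.
p_first_ace = Fraction(4, 52)
p_second_ace_given_first = Fraction(3, 51)
p_dependent = p_first_ace * p_second_ace_given_first
print(p_dependent)  # 1/221
```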
Probability of A or B is P(A or B)
The probability that either event A or event B occurs.
Two events are mutually exclusive if they cannot occur at the same time.
‡ If A and B are mutually exclusive
then the probability that event A or event B occurs is:
P(A or B) = P(A) + P(B)
ie the sum of the probability of A and the probability of B.
‡ If A and B are not mutually exclusive
then the probability that event A or event B occurs is:
P(A or B) = P(A) + P(B) – P(A and B) where
P(A and B) is the probability that events A and B both occur.
In general P(A and B) = P(A) x P(B|A), which reduces to P(A) x P(B)
when A and B are independent; P(B|A) is the conditional probability of B
given A.
Example 1: P(A or B) if events A or B are mutually exclusive
Rolling a die.
Find probability of getting either, event A as 1 or event B as 6?
Since it is impossible to get both, the event A as 1 and event B
as 6 in same roll, these two events are mutually exclusive.
The probability P(A) = P(1) = 1/6 and P(B) = P(6) = 1/6
Hence probability of either event A or event B is :
P(A or B) = p(A) + p(B) = 1/6 + 1/6 = 1/3
Example 2: P(A or B) if events A or B are not mutually exclusive
Find probability that a card from a deck will be either an
Ace or a Spade?
probability P(A) is P(Ace) = 4/52 and P(B) is P(spade) = 13/52.
The only way for a single draw to be both an Ace and a Spade is the Ace of
Spades, of which there is only one, so the probability P(A and B) is
P(Ace and Spade) = 1/52.
Therefore, the probability of event A or B is :
P(A or B) = P(A) + P(B) – P(A and B)
= P(ace) + P(spade) - P(Ace and Spade)
= 4/52 + 13/52 - 1/52 = 16/52 = 4/13
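The addition rule for events that are not mutually exclusive can be cross-checked against a direct count over a 52-card deck. The deck representation and the helper prob below are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["clubs", "diamonds", "hearts", "spades"]
deck = list(product(ranks, suits))  # 52 (rank, suit) pairs

def prob(predicate):
    """Probability that a single card drawn at random satisfies `predicate`."""
    return Fraction(sum(1 for card in deck if predicate(card)), len(deck))

p_ace = prob(lambda c: c[0] == "A")
p_spade = prob(lambda c: c[1] == "spades")
p_both = prob(lambda c: c == ("A", "spades"))

# Addition rule: subtract the overlap so the Ace of Spades is not counted twice.
p_either = p_ace + p_spade - p_both
print(p_either)  # 4/13
```

Counting the cards that are an Ace or a Spade directly gives the same 16 out of 52.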
Summary of symbols & notations
A ∪ B (A union B)                          'Either A or B occurs or both occur'
A ∩ B (A intersection B)                   'Both A and B occur'
A ⊆ B (A is a subset of B)                 'If A occurs, so does B'
A' or Ā                                    'Event A does not occur'
Φ (the empty set)                          An impossible event
S (the sample space)                       An event that is certain to occur
A ∩ B = Φ                                  Mutually exclusive events
P(A)                                       Probability that event A occurs
P(B)                                       Probability that event B occurs
P(A ∪ B)                                   Probability that event A or event B occurs
P(A ∩ B)                                   Probability that event A and event B both occur
P(A ∩ B) = P(A) . P(B)                     Independent events
P(A ∩ B) = 0                               Mutually exclusive events
P(A ∪ B) = P(A) + P(B) – P(A ∩ B)          Addition rule
P(A ∪ B) = P(A) + P(B) – P(B|A) . P(A)     Addition rule, with P(A ∩ B) written as P(B|A) . P(A)
P(A ∪ B) = P(A) + P(B) – P(A) . P(B)       Addition rule; independent events
P(A ∪ B) = P(A) + P(B)                     Addition rule; mutually exclusive events
A|B (A given B)                            "Event A will occur given that event B has occurred"
P(A|B)                                     Conditional probability that event A will occur given that event B has occurred already
P(B|A)                                     Conditional probability that event B will occur given that event A has occurred already
P(A ∩ B) = P(A|B) . P(B)                   Multiplication rule
P(A ∩ B) = P(B|A) . P(A)                   Multiplication rule (equivalent form)
P(A ∩ B) = P(A) . P(B)                     Multiplication rule; independent events; ie probability of joint events A and B
P(A|B) = P(A ∩ B) / P(B)                   Rule to determine a conditional probability from unconditional probabilities
• Bayes’ Theorem
Bayesian view of probability is related to degree of belief.
It is a measure of the plausibility of an event given incomplete knowledge.
Bayes' theorem is also known as Bayes' rule or Bayes' law, or called
Bayesian reasoning.
The probability of an event A conditional on another event B ie P(A|B) is
generally different from probability of B conditional on A ie P(B|A).
There is a definite relationship between the two, P(A|B) and P(B|A),
and Bayes' theorem is the statement of that relationship.
Bayes theorem is a way to calculate P(A|B) from a knowledge of P(B|A).
Bayes' Theorem is a result that allows new information to be used to
update the conditional probability of an event.
Let S be a sample space.
Let A1, A2, ... , An be a set of mutually exclusive events that partition S.
Let B be any event from the same S, such that P(B) > 0.
Then Bayes' Theorem gives the following two equivalent forms :
P(Ak|B) = P(Ak ∩ B) / [ P(A1 ∩ B) + P(A2 ∩ B) + . . . + P(An ∩ B) ]
and, by invoking the fact P(Ak ∩ B) = P(Ak).P(B|Ak),
P(Ak|B) = P(Ak).P(B|Ak) / [ P(A1).P(B|A1) + P(A2).P(B|A2) + . . . + P(An).P(B|An) ]
Applying Bayes' Theorem :
Bayes' theorem is applied when the following conditions hold.
‡ the sample space S is partitioned into a set of mutually exclusive
events A1, A2, . . . . . , An .
‡ within S, there exists an event B, for which P(B) > 0.
‡ the goal is to compute a conditional probability of the form :
P(Ak|B).
‡ you know at least one of the two sets of probabilities
described below
◊ P(Ak ∩ B) for each Ak
◊ P(Ak) and P(B|Ak) for each Ak
Bayes' theorem is best understood through the examples below.
Example 1: Applying Bayes' Theorem
Problem : Marie's wedding is tomorrow.
In recent years, it has rained on only 5 days each year.
The weatherman has predicted rain for tomorrow.
When it actually rains, the weatherman correctly forecasts rain 90% of
the time.
When it doesn't rain, the weatherman incorrectly forecasts rain 10% of
the time.
The question : What is the probability that it will rain on the day of
Marie's wedding?
Solution : The sample space is defined by two mutually exclusive events,
"it rains" or "it does not rain". Additionally, a third event occurs when
the "weatherman predicts rain".
The events and probabilities are stated below.
◊ Event A1 : it rains on Marie's wedding.
◊ Event A2 : it does not rain on Marie's wedding.
◊ Event B : the weatherman predicts rain.
◊ P(A1) = 5/365 = 0.0136986 [Rains 5 days in a year.]
◊ P(A2) = 360/365 = 0.9863014 [Does not rain 360 days in a year.]
◊ P(B|A1) = 0.9 [When it rains, the weatherman predicts rain 90% of the time.]
◊ P(B|A2) = 0.1 [When it does not rain, the weatherman predicts rain 10% of the time.]
We want to know P(A1|B), the probability that it will rain on the day of
Marie's wedding, given a forecast for rain by the weatherman.
The answer can be determined from Bayes' theorem, shown below.
P(A1|B) = P(A1).P(B|A1) / [ P(A1).P(B|A1) + P(A2).P(B|A2) ]
= (0.0137)(0.9) / [ (0.0137)(0.9) + (0.9863)(0.1) ]
= 0.111
So, despite the weatherman's prediction, there is a good chance that
Marie will not get rain at her wedding. Thus Bayes' theorem is used to calculate conditional probabilities.
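The wedding calculation follows directly from the second form of Bayes' theorem. The sketch below wraps that formula in a small helper; the name bayes_posterior and its argument layout are illustrative assumptions, not part of the original notes.

```python
def bayes_posterior(priors, likelihoods, k):
    """Return P(A_k | B) from P(A_i) and P(B | A_i) over a partition A_1..A_n."""
    # Denominator: P(B) = sum of P(A_i) * P(B | A_i).
    evidence = sum(p * l for p, l in zip(priors, likelihoods))
    return priors[k] * likelihoods[k] / evidence

priors = [5 / 365, 360 / 365]   # P(A1): rain, P(A2): no rain
likelihoods = [0.9, 0.1]        # P(B|A1), P(B|A2): forecast of rain

p_rain_given_forecast = bayes_posterior(priors, likelihoods, k=0)
print(round(p_rain_given_forecast, 3))  # 0.111
```

The prior of about 1.4% dominates: even a fairly reliable forecast only lifts the probability of rain to about 11%.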
Example 2: Applying Bayes' Theorem
‡ Let S be a sample space.
‡ Let E1 and E2 be two mutually exclusive events forming a partition
of the sample space S
‡ Let E be any event of the sample space such that P(E) ≠ 0.
Recall from Conditional Probability
The notation P(E1 | E) means "the probability of the event E1 given that
E has already occurred".
‡ The sample space S is described as "the integers 1 to 15" and is
partitioned into :
E1 = "the integers 1 to 8" and
E2 = "the integers 9 to 15".
‡ If E is the event "even number" then the probabilities for the
situation described by Bayes' Theorem can be calculated in two
ways, both giving the same result.
P(E1|E) = P(E1 ∩ E) / [ P(E1 ∩ E) + P(E2 ∩ E) ]
= (4/15) / [ (4/15) + (3/15) ] = 4/7
P(E1|E) = P(E1).P(E|E1) / [ P(E1).P(E|E1) + P(E2).P(E|E2) ]
= (8/15 x 4/8) / [ (8/15 x 4/8) + (7/15 x 3/7) ] = 4/7
Thus Bayes' Theorem can be extended to k mutually exclusive events E1, . . . , Ek as :
P(Ei | E) = P(Ei ∩ E) / [ P(E1 ∩ E) + P(E2 ∩ E) + . . . + P(Ek ∩ E) ]
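Both routes to P(E1|E) can be verified with Python sets, treating each integer from 1 to 15 as equally likely. The helper names P and likelihood are illustrative.

```python
from fractions import Fraction

S = set(range(1, 16))                 # the integers 1 to 15
E1 = {n for n in S if n <= 8}         # the integers 1 to 8
E2 = S - E1                           # the integers 9 to 15
E = {n for n in S if n % 2 == 0}      # even numbers

def P(event):
    return Fraction(len(event), len(S))

# Way 1: joint probabilities P(Ei ∩ E).
way1 = P(E1 & E) / (P(E1 & E) + P(E2 & E))

# Way 2: priors times likelihoods, with P(E|Ei) = |Ei ∩ E| / |Ei|.
def likelihood(Ei):
    return Fraction(len(Ei & E), len(Ei))

way2 = P(E1) * likelihood(E1) / (P(E1) * likelihood(E1) + P(E2) * likelihood(E2))
print(way1, way2)  # 4/7 4/7
```

Note that P(E|E2) is 3/7, since three of the seven integers from 9 to 15 are even.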
Example 3 : Clinic Trial
In a clinic, the probability of the patients having HIV virus is 0.15.
A blood test done on patients :
If patient has virus, then the test is +ve with probability 0.95.
If the patient does not have the virus, then the test is +ve with probability 0.02.
Assign labels to events : H = patient has virus; P = test +ve
Given : P(H) = 0.15 ; P(P|H) = 0.95 ; P(P|¬H) = 0.02
Find :
If the test is +ve what are the probabilities that the patient
i) has the virus ie P(H|P) ; ii) does not have virus ie P(¬H|P) ;
If the test is -ve what are the probabilities that the patient
iii) has the virus ie P(H|¬P) ; iv) does not have virus ie P(¬H|¬P) ;
Calculations :
i) For P(H|P) we can write down Bayes' Theorem as
P(H|P) = [ P(P|H) P(H) ] / P(P)
We know P(P|H) and P(H) but not P(P), which is the probability of a +ve result.
There are two cases in which a patient could have a +ve result, stated below :
1. Patient has the virus and gets a +ve result : H ∩ P
2. Patient does not have the virus and gets a +ve result : ¬H ∩ P
Find the probabilities for the above two cases and then add them,
ie P(P) = P(H ∩ P) + P(¬H ∩ P).
But from the multiplication rule of probability we have :
P(H ∩ P) = P(P|H) P(H) and P(¬H ∩ P) = P(P|¬H) P(¬H).
Therefore, substituting these, we get :
P(P) = P(P|H) P(H) + P(P|¬H) P(¬H) = 0.95 × 0.15 + 0.02 × 0.85 = 0.1595
Now substitute this into Bayes' Theorem and obtain P(H|P) :
P(H|P) = P(P|H) P(H) / [ P(P|H) P(H) + P(P|¬H) P(¬H) ] = 0.95 × 0.15 / 0.1595 = 0.8934
ii) Next, work out P(¬H|P) :
P(¬H|P) = 1 - P(H|P) = 1 – 0.8934 = 0.1066
iii) Next, work out P(H|¬P); again we write down Bayes' Theorem :
P(H|¬P) = P(¬P|H) P(H) / P(¬P)
Here we need P(¬P), which is 1 – P(P), and P(¬P|H) = 1 – P(P|H) = 0.05, so
P(H|¬P) = (0.05 × 0.15) / (1 - 0.1595) = 0.008923
iv) Finally, work out P(¬H|¬P). It is just 1 - P(H|¬P) = 1 - 0.008923 = 0.991077
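The four posterior probabilities of the clinic example can be reproduced in a few lines. This is a sketch using the given values P(H), P(P|H) and P(P|¬H); the variable names are illustrative.

```python
p_h = 0.15                 # P(H): patient has the virus
p_pos_given_h = 0.95       # P(P|H)
p_pos_given_not_h = 0.02   # P(P|not H)

# Total probability of a positive test.
p_pos = p_pos_given_h * p_h + p_pos_given_not_h * (1 - p_h)

p_h_given_pos = p_pos_given_h * p_h / p_pos                 # i)
p_not_h_given_pos = 1 - p_h_given_pos                       # ii)
p_h_given_neg = (1 - p_pos_given_h) * p_h / (1 - p_pos)     # iii)
p_not_h_given_neg = 1 - p_h_given_neg                       # iv)

print(round(p_pos, 4))               # 0.1595
print(round(p_h_given_pos, 4))       # 0.8934
print(round(p_not_h_given_pos, 4))   # 0.1066
print(round(p_h_given_neg, 6))       # 0.008923
print(round(p_not_h_given_neg, 6))   # 0.991077
```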
3.2 Certainty Factors in Rule-Based Systems
The certainty-factor model was one of the most popular models for
the representation and manipulation of uncertain knowledge in the
early (1980s) rule-based expert systems.
The model was criticized by researchers in artificial intelligence
and statistics as being ad hoc in nature. Researchers and developers
have stopped using the model.
Its place has been taken by the more expressive formalism of
Bayesian belief networks for the representation and manipulation
of uncertain knowledge.
The manipulation of uncertain knowledge in rule-based expert
systems is illustrated in the next three slides before moving on to Bayesian
networks.
• Rule Based Systems
Rule based systems have been discussed in previous lectures.
Here it is recalled to explain uncertainty.
A rule is an expression of the form "if A then B" where
A is an assertion and B can be either an action or
another assertion.
Example : Trouble shooting of water pumps
1. If pump failure then the pressure is low
2. If pump failure then check oil level
3. If power failure then pump failure
Rule based system consists of a library of such rules.
Rules reflect essential relationships within the domain.
Rules reflect ways to reason about the domain.
Rules draw conclusions and point to actions when specific
information about the domain comes in. This is called inference.
The inference is a kind of chain reaction like :
If there is a power failure then (see rules 1, 2, 3 mentioned above)
Rule 3 states that there is a pump failure, and
Rule 1 tells that the pressure is low, and
Rule 2 gives a (useless) recommendation to check the oil level.
It is very difficult to control such a mixture of inference back and forth
in the same session and resolve such uncertainties.
How to deal with uncertainties in a rule-based system?
A problem with rule-based systems is that often the connections
reflected by the rules are not absolutely certain (i.e. deterministic),
and the gathered information is often subject to uncertainty.
In such cases, a certainty measure is added to the premises as well
as the conclusions in the rules of the system.
A rule then provides a function that describes : how much a change
in the certainty of the premise will change the certainty of the
conclusion.
In its simplest form, this looks like :
If A (with certainty x) then B (with certainty f(x))
This is a new rule, say rule 4, added to earlier three rules.
There are many schemes for treating uncertainty in rule based systems.
The most common are :
‡ Adding certainty factors.
‡ Adoption of Dempster-Shafer belief functions.
‡ Inclusion of fuzzy logic.
In these schemes, uncertainty is treated locally, which means that an action is
connected directly to the incoming rules and the uncertainty of their elements.
Example : In addition to rule 4 above, we have the rule
If C (with certainty x) then B (with certainty g(x))
Now if the information is that A holds with certainty a and C holds
with certainty c, then what is the certainty of B ?
Note : Depending on the scheme, there are different algebras for such
a combination of uncertainty. But all these algebras in many cases
come to incorrect conclusions because combination of uncertainty is
not a local phenomenon, but it is strongly dependent on the entire
situation (in principle a global matter).
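To make the question concrete, here is a sketch of one such local algebra, the MYCIN-style certainty-factor combination for two rules that both support B. The notes do not specify f, g, or the combination scheme, so everything below is an assumption for illustration: f and g are taken to scale the premise certainty by a hypothetical rule strength, and the numbers are made up.

```python
def rule_cf(premise_cf, rule_strength):
    """Certainty passed to the conclusion: premise certainty scaled by rule strength."""
    return premise_cf * rule_strength

def combine_positive(cf1, cf2):
    """MYCIN-style parallel combination of two positive certainty factors."""
    return cf1 + cf2 * (1 - cf1)

# Hypothetical inputs: A holds with certainty a, C holds with certainty c.
a, c = 0.9, 0.5
cf_b_from_a = rule_cf(a, 0.8)   # plays the role of f(a); 0.8 is an assumed strength
cf_b_from_c = rule_cf(c, 0.6)   # plays the role of g(c); 0.6 is an assumed strength

cf_b = combine_positive(cf_b_from_a, cf_b_from_c)
print(round(cf_b, 3))  # 0.804
```

The combination looks only at the two incoming certainties, which is exactly the local treatment that the note above warns can lead to incorrect conclusions when the two pieces of evidence are not independent.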
3.3 Bayesian Networks and Certainty Factors
A Bayesian network (or a belief network) is a probabilistic graphical model
that represents a set of variables and their probabilistic independencies.
For example, a Bayesian network could represent the probabilistic
relationships between diseases and symptoms. Given symptoms, the network
can be used to compute the probabilities of the presence of various diseases.
Bayesian Networks are also called : Bayes nets, Bayesian Belief Networks
(BBNs), simply Belief Networks, or Causal Probabilistic Networks (CPNs).
A Bayesian network consists of :
a set of nodes and a set of directed edges between nodes.
The edges reflect cause-effect relations within the domain.
The effects are not completely deterministic (e.g. disease -> symptom).
The strength of an effect is modeled as a probability.
• Bayesian Networks
We have applied Bayesian probability theory, in the earlier three examples
(examples 1, 2, and 3), to relate two or more events. But this can be
used to relate many events by tying them together in a network.
Consider the previous example 3 - Clinic trial.
The trial says that the probability of a patient having the HIV virus is 0.15.
A blood test is done on patients :
If the patient has the virus, the test is +ve with probability 0.95.
If the patient does not have the virus, the test is +ve with probability 0.02.
This means we are given : P(H) = 0.15 ; P(P|H) = 0.95 ; P(P|¬H) = 0.02
Imagine that the patient is given a second test independently of the first;
that is, the second test is done at a later date by a different person
using different equipment. So, an error on the first test does not affect the probability of an error on the second test.
In other words, the two tests are independent given the patient's true status H. This is depicted using the
diagram below :
[Figure: a simple example of a Bayesian network, with arrows from node H to nodes P1 and P2.]
Event H is the cause of the two events P1 and P2.
The arrows represent the fact that H is driving P1 and P2.
The network contains 3 nodes.
If both P1 and P2 are +ve, what is the probability that the patient has the virus ? In
other words, we are asked to find P(H|P1 ∩ P2) .
How to find it ?
Bayes' Theorem (ref. Example 3 above) gives
P(H|P1 ∩ P2) = P(P1 ∩ P2|H) . P(H) / P(P1 ∩ P2)
Here there are two quantities which we do not know. The first is P(P1 ∩ P2|H) and the second is P(P1 ∩ P2).
‡ Find P(P1 ∩ P2|H)
Since the two tests are independent given H,
P(P1 ∩ P2|H) = P(P1|H) P(P2|H)
‡ Find P(P1 ∩ P2)
As worked before for P(P), which is the probability of a +ve
result, here again break this into two separate cases:
◊ patient has the virus and both tests are +ve
◊ patient does not have the virus and both tests are +ve
‡ As before, use the multiplication rule of probability :
P(P1 ∩ P2) = P(P1 ∩ P2 |H) P(H) + P(P1 ∩ P2 |¬H) P(¬H)
‡ Because the two tests are independent given H (and given ¬H) we can write :
P(P1 ∩ P2) = P(P1|H) P(P2|H) P(H) + P(P1|¬H) P(P2|¬H) P(¬H)
= 0.95 × 0.95 × 0.15 + 0.02 × 0.02 × 0.85
= 0.135715
‡ Substitute this into Bayes' Theorem above and obtain
P(H|P1 ∩ P2) = P(P1 ∩ P2|H) . P(H) / P(P1 ∩ P2)
= (0.95 x 0.95 x 0.15) / 0.135715 = 0.99749
‡ Note : The result when two independent HIV tests are performed.
Previously we calculated the probability that the patient had HIV,
given one +ve test, as 0.8934.
Later a second HIV test was performed. After two +ve tests, we
see that the probability has gone up to 0.99749.
So after two +ve tests it is more certain that the patient does
have the HIV virus.
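The two-test calculation can be reproduced numerically. The sketch below relies on the stated assumption that the tests are conditionally independent given H, so the joint likelihood is a product; variable names are illustrative.

```python
p_h = 0.15
p_pos_given_h = 0.95
p_pos_given_not_h = 0.02

# Conditional independence given H lets the joint likelihood factorise.
joint_given_h = p_pos_given_h * p_pos_given_h
joint_given_not_h = p_pos_given_not_h * p_pos_given_not_h

# P(P1 ∩ P2), summing over the two cases H and not H.
p_both_pos = joint_given_h * p_h + joint_given_not_h * (1 - p_h)

p_h_given_both_pos = joint_given_h * p_h / p_both_pos
print(round(p_both_pos, 6))          # 0.135715
print(round(p_h_given_both_pos, 5))  # 0.99749
```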
Case where one test is +ve and the other is -ve.
This means there is an error on one of the tests but we don't know which one;
it may be either.
The issue is whether the patient has the HIV virus or not.
‡ We need to calculate P(H| P1 ∩ ¬P2).
Following the same steps as for the case of two +ve tests,
write Bayes' Theorem
P(H| P1 ∩ ¬P2) = P(P1 ∩ ¬P2 |H) P(H) / P(P1 ∩ ¬P2)
‡ Now work out P(P1 ∩ ¬P2 |H) and P(P1 ∩ ¬P2) using the fact that
P1 and P2 are independent given H ,
P(P1 ∩ ¬P2 |H) = P(P1|H) P(¬P2|H) and
P(P1 ∩ ¬P2) = P(P1 ∩ ¬P2 |H) P(H) + P(P1 ∩ ¬P2 |¬H) P(¬H)
= P(P1|H) P(¬P2 |H) P(H) + P(P1|¬H) P(¬P2|¬H) P(¬H)
= 0.95 × 0.05 × 0.15 + 0.02 × 0.98 × 0.85
= 0.023785
‡ Substituting these values into Bayes' Theorem, we obtain
P(H| P1 ∩ ¬P2) = (0.95 x 0.05 x 0.15) / 0.023785 = 0.2996
‡ Note :
Belief in H, the event that the patient has the virus, has increased.
The prior belief was 0.15 but it has now gone up to 0.2996.
This appears strange because we have been given two
contradictory pieces of data. But looking closely we see that
the probability of an error in each case is not equal.
‡ The probability of a +ve test when the patient is actually -ve is 0.02.
The probability of a -ve test when the patient is actually +ve is 0.05.
A false -ve is therefore more likely than a false +ve, so we are more
inclined to believe that the error is on the second (-ve) test, and this
increases our belief that the patient is +ve.
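The same reasoning extends to any sequence of test results. The following sketch generalises the calculation; the function name posterior_after_tests and its parameter names are assumptions for illustration, and it relies on the tests being conditionally independent given the patient's true status.

```python
def posterior_after_tests(results, prior=0.15, p_pos_h=0.95, p_pos_not_h=0.02):
    """P(H | results) for a list of booleans (True = +ve test).

    Assumes the tests are conditionally independent given H and given not H.
    """
    like_h, like_not_h = 1.0, 1.0
    for positive in results:
        like_h *= p_pos_h if positive else 1 - p_pos_h
        like_not_h *= p_pos_not_h if positive else 1 - p_pos_not_h
    numerator = like_h * prior
    return numerator / (numerator + like_not_h * (1 - prior))

print(round(posterior_after_tests([True, False]), 4))  # 0.2996
print(round(posterior_after_tests([True, True]), 5))   # 0.99749
```

Because the likelihoods simply multiply, the order of the results does not matter: one +ve followed by one -ve gives the same posterior as the reverse.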
• More Complicated Bayesian Networks
The previous network was simple and contained only three nodes. Let us look at a
slightly more complicated one in the context of heart disease.
Given the following facts about heart disease.
Either smoking or bad diet or both can make heart disease more likely.
Heart disease can produce either or both of the following two
symptoms:
‡ high blood pressure
‡ an abnormal electrocardiogram
Here smoking and bad diet are regarded as causes of heart disease.
The heart disease in turn is a cause of high blood pressure and an
abnormal electrocardiogram.
An appropriate network for heart disease is represented as follows.
[Figure: Bayesian network with arrows S -> H, D -> H, H -> B and H -> E.]
The symbols are defined as :
S = smoking,
D = bad diet,
H = heart disease,
B = high blood pressure,
E = abnormal electrocardiogram
Here H has two causes S and D.
Find probability of H, given each of the four possible
combinations of S and D.
A medical survey gives us the following data :
P(S) = 0.3 P(D) = 0.4
P(H| S ∩ D) = 0.8
P(H| ¬S ∩ D) = 0.5
P(H| S ∩ ¬D) = 0.4
P(H| ¬S ∩ ¬D) = 0.1
P(B|H) = 0.7 P(B|¬H) = 0.1
P(E|H) = 0.8 P(E|¬H) = 0.1
Given this information, one question concerning this
network is :
what is the probability of heart disease ?
[Note : Interested students may try to find the answer.]
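One way to approach this exercise is to marginalise over the four combinations of S and D. The sketch below assumes that S and D are independent, which is what the network structure implies since neither node has a parent; the variable names are illustrative.

```python
from itertools import product

p_s, p_d = 0.3, 0.4          # P(S), P(D)
p_h_given = {                # P(H | S, D), keyed by (smoking, bad diet)
    (True, True): 0.8,
    (False, True): 0.5,
    (True, False): 0.4,
    (False, False): 0.1,
}

# P(H) = sum over s, d of P(H | s, d) * P(s) * P(d)
p_h = 0.0
for s, d in product([True, False], repeat=2):
    p_sd = (p_s if s else 1 - p_s) * (p_d if d else 1 - p_d)
    p_h += p_h_given[(s, d)] * p_sd

print(round(p_h, 4))  # 0.35
```

The four terms are 0.096, 0.14, 0.072 and 0.042, which add up to 0.35.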
3.4 Dempster – Shafer Theory (DST)
DST is a mathematical theory of evidence based on belief functions and
plausible reasoning. It is used to combine separate pieces of information
(evidence) to calculate the probability of an event.
DST offers an alternative to traditional probabilistic theory for the
mathematical representation of uncertainty.
DST can be regarded as, a more general approach to represent
uncertainty than the Bayesian approach.
Bayesian methods are sometimes inappropriate
Example :
Let A represent the proposition "Moore is attractive". Then
the axioms of probability insist that P(A) + P(¬A) = 1.
Now suppose that Andrew does not even know who "Moore" is, then
‡ We cannot say that Andrew believes the proposition if he has no
idea what it means.
‡ Also, it is not fair to say that he disbelieves the proposition.
‡ It would therefore be meaningful to denote Andrew's belief B of
B(A) and B(¬A) as both being 0.
‡ Certainty factors do not allow this.
• Dempster-Shafer Model
The idea is to allocate a number between 0 and 1 to indicate a
degree of belief in a proposition, as in the probability framework.
However, it is not considered a probability but a belief mass.
The distribution of masses is called the basic belief assignment.
In other words, in this formalism a degree of belief (referred to as mass) is
represented as a belief function rather than a Bayesian probability
distribution.
Example: Belief assignment
Suppose a system has five members, say five independent states,
exactly one of which is actual. If the original set is called S, | S | = 5, then
the set of all subsets (the power set) is called 2^S.
If each possible subset is written as a binary vector (describing whether each member is
present or not by writing 1 or 0), then 2^5 = 32 subsets are possible, ranging
from the empty subset ( 0, 0, 0, 0, 0 ) to the "everything" subset ( 1, 1,
1, 1, 1 ).
The "empty" subset represents a "contradiction", which is not true in
any state, and is thus assigned a mass of zero;
the remaining masses are normalized so that their total is 1.
The "everything" subset is labeled as "unknown"; it represents the
state where all elements are present, in the sense that you cannot
tell which one is actual.
Note : Given a set S, the power set of S, written 2^S, is the set of all subsets of S,
including the empty set and S.
• Belief and Plausibility
Shafer's framework allows for belief about propositions to be represented
as intervals, bounded by two values, belief (or support) and plausibility:
belief ≤ plausibility
Belief in a hypothesis is constituted by the sum of the masses of all
sets enclosed by it (i.e. the sum of the masses of all subsets of the
hypothesis). It is the amount of belief that directly supports a given
hypothesis at least in part, forming a lower bound.
Plausibility is 1 minus the sum of the masses of all sets whose intersection
with the hypothesis is empty. It is an upper bound on the possibility that the
hypothesis could be true: the hypothesis is possible up to that value, because
there is only so much evidence that contradicts it.
Example :
A proposition say "the cat in the box is dead."
Suppose we have belief of 0.5 and plausibility of 0.8 for the proposition.
This means we have evidence that allows us to state strongly that the
proposition is true, with confidence 0.5.
Evidence contrary to the hypothesis ("the cat is alive") has confidence 0.2.
The remaining mass of 0.3 (the gap between the 0.5 supporting evidence
and the 0.2 contrary evidence) is "indeterminate," meaning that the
cat could be either dead or alive. This interval represents the level of
uncertainty based on the evidence in the system.
Hypothesis                      Mass   Belief   Plausibility
Null (neither alive nor dead)   0      0        0
Alive                           0.2    0.2      0.5
Dead                            0.5    0.5      0.8
Either (alive or dead)          0.3    1.0      1.0
The Null hypothesis is set to zero by definition; it corresponds to "no solution".
The orthogonal hypotheses "Alive" and "Dead" have probabilities of 0.2 and
0.5, respectively. This could correspond to "Live/Dead Cat Detector"
signals, which have respective reliabilities of 0.2 and 0.5.
The all-encompassing "Either" hypothesis (which simply acknowledges there is a
cat in the box) picks up the slack so that the sum of the masses is 1.
Belief for the "Alive" and "Dead" hypotheses matches their
corresponding masses because they have no subsets.
Belief for "Either" consists of the sum of all three masses (Either, Alive,
and Dead) because "Alive" and "Dead" are each subsets of "Either".
"Alive" plausibility is 1 - m(Dead) and "Dead" plausibility is 1 - m(Alive).
"Either" plausibility sums m(Alive) + m(Dead) + m(Either).
The universal hypothesis ("Either") will always have 100% belief and
plausibility; it acts as a checksum of sorts.
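The belief and plausibility columns of the cat example follow mechanically from the masses. The sketch below represents each hypothesis as a frozenset and applies the two definitions given earlier; the function names belief and plausibility and the dictionary layout are illustrative assumptions.

```python
from fractions import Fraction

ALIVE = frozenset({"alive"})
DEAD = frozenset({"dead"})
EITHER = ALIVE | DEAD

# Basic belief assignment from the example (the empty set has mass 0).
masses = {ALIVE: Fraction(1, 5), DEAD: Fraction(1, 2), EITHER: Fraction(3, 10)}

def belief(hypothesis, m):
    """Sum of the masses of all subsets of the hypothesis."""
    return sum(v for s, v in m.items() if s <= hypothesis)

def plausibility(hypothesis, m):
    """1 minus the masses of all sets that do not intersect the hypothesis."""
    return 1 - sum(v for s, v in m.items() if not (s & hypothesis))

for name, h in [("alive", ALIVE), ("dead", DEAD), ("either", EITHER)]:
    print(name, float(belief(h, masses)), float(plausibility(h, masses)))
# alive 0.2 0.5
# dead 0.5 0.8
# either 1.0 1.0
```

Fractions are used so that the belief interval [0.5, 0.8] for "dead" comes out exactly, without floating-point rounding.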
• Dempster-Shafer Calculus
In the previous slides, two specific examples of Belief and plausibility have
been stated. It would now be easy to understand their generalization.
The Dempster-Shafer (DS) Theory, requires a Universe of Discourse U
(or Frame of Judgment) consisting of mutually exclusive alternatives,
corresponding to an attribute value domain. For instance, in satellite image
classification the set U may consist of all possible classes of interest.
Each subset S ⊆ U is assigned a basic probability m(S), a belief Bel(S),
and a plausibility Pls(S) so that
m(S), Bel(S), Pls(S) ∈ [0, 1] and Pls(S) ≥ Bel(S) where
m represents the strength of an evidence, is the basic probability;
e.g., a group of pixels belong to certain class, may be assigned value m.
Bel(S) summarizes all the reasons to believe S.
Pls(S) expresses how much one should believe in S if all currently
unknown facts were to support S.
The true belief in S is somewhere in the belief interval [Bel(S), Pls(S)].
The basic probability assignment m is defined as a function
m : 2^U → [0,1] , where m(Ø) = 0 and the sum of m over all subsets of
U is 1 (i.e., ∑ S ⊆ U m(S) = 1 ).
For a given basic probability assignment m, the belief Bel of a
subset A of U is the sum of m(B) for all subsets B of A , and
the plausibility Pls of a subset A of U is Pls(A) = 1 - Bel(A'),
where A' is the complement of A in U.
Summarize :
The confidence interval is that interval of probabilities within which
the true probability lies with a certain confidence based on the
belief "B" and plausibility "PL" provided by some evidence "E" for a
proposition "P".
The belief brings together all the evidence that would lead us to
believe in the proposition P with some certainty.
The plausibility brings together the evidence that is compatible with
the proposition P and is not inconsistent with it.
If "Ω" is the set of possible outcomes, then a mass probability "M"
is defined for each member of the set 2Ω
and takes values in
the range [0,1] . The Null set, "ф", is also a member of 2Ω
.
Example
If Ω is the set { Flu (F), Cold (C), Pneumonia (P) }
then 2^Ω is the set { ф, {F}, {C}, {P}, {F, C}, {F, P}, {C, P}, {F, C, P} }
The confidence interval is then defined as [ B(E), PL(E) ] where
B(E) = ∑_{A ⊆ E} M(A), i.e., all the evidence that makes us believe
in the correctness of P, and
PL(E) = 1 – B(¬E) = 1 – ∑_{A ⊆ ¬E} M(A), where B(¬E) collects all the evidence
that contradicts P.
• Combining Beliefs
The Dempster-Shafer calculus combines the available evidences
resulting in a belief and a plausibility in the combined evidence that
represents a consensus on the correspondence. The model maximizes
the belief in the combined evidences.
The rule of combination states that two basic probability assignments
M1 and M2 are combined into a third basic probability assignment
by the normalized orthogonal sum M1 ⊕ M2 stated below.
Suppose M1 and M2 are two belief functions.
Let X be the set of subsets of Ω to which M1 assigns a nonzero
value and let Y be a similar set for M2 ,
then a new belief function M3 from the combination of beliefs in M1
and M2 is obtained as
M3(Z) = [ ∑_{X ∩ Y = Z} M1(X) M2(Y) ] / (1 – K), for Z ≠ ф,
where K = ∑_{X ∩ Y = ф} M1(X) M2(Y) is the mass assigned to conflicting evidence.
M3(ф) is defined to be 0 so that the orthogonal sum remains
a basic probability assignment.
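The rule of combination can be sketched in a few lines of Python. This is an illustration only: the outcome set reuses Flu (F), Cold (C) and Pneumonia (P) from the earlier example, but the mass values and the function name `combine` are assumptions made for the sketch.

```python
# Illustrative sketch of Dempster's rule of combination.
# Mass values are hypothetical; subsets are represented as frozensets.

OMEGA = frozenset({"F", "C", "P"})

def combine(m1, m2):
    """Return (M3, K): the normalized orthogonal sum of m1 and m2,
    and the conflict mass K assigned to empty intersections."""
    raw = {}
    conflict = 0.0
    for x, mass_x in m1.items():
        for y, mass_y in m2.items():
            z = x & y
            if z:
                raw[z] = raw.get(z, 0.0) + mass_x * mass_y
            else:
                conflict += mass_x * mass_y  # X ∩ Y is empty
    if 1.0 - conflict <= 1e-12:
        raise ValueError("total conflict: the two assignments cannot be combined")
    return {z: v / (1.0 - conflict) for z, v in raw.items()}, conflict

m1 = {frozenset({"F"}): 0.6, OMEGA: 0.4}   # evidence 1 points to Flu
m2 = {frozenset({"C"}): 0.5, OMEGA: 0.5}   # evidence 2 points to Cold

m3, k = combine(m1, m2)
for subset, mass in m3.items():
    print(sorted(subset), round(mass, 4))
```

Here the pair ({F}, {C}) has an empty intersection, so its product 0.3 becomes K, and the remaining masses are divided by 1 – K = 0.7 so that M3 again sums to 1.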
3.5 Fuzzy Logic
So far we have discussed only binary valued logic and classical set theory, for example :
A person belongs to the set of all human beings, and if given a
specific subset, say all males, then one can say whether or not
the particular person belongs to this set.
This is acceptable since it is the way humans reason, e.g.,
IF person is male AND a parent THEN person is a father. The
rules are formed using operators.
Here, it is the intersection operator "AND" which manipulates the sets.
However, not everything can be described using binary valued sets.
The grouping of persons into "male" or "female" is
easy, but grouping them as "tall" or "not tall" is problematic.
A set of "tall" people is difficult to define, because there is no distinct
cut-off point at which tall begins.
Fuzzy logic was suggested by Zadeh as a method for mimicking the ability of
human reasoning using a small number of rules and still producing a smooth
output via a process of interpolation.
• Description of Fuzzy Logic
With fuzzy logic an element can partially belong to a set, as expressed by
its degree of set membership. For example, a person of height 1.79 m would belong to
both the tall and the not tall sets, each with a particular degree of membership.
[Figure: Difference between binary logic and fuzzy logic. Two panels plot the grade of truth (0 to 1) against height x, with the regions "Not tall" and "Tall" separated at 1.8 m. Binary valued logic takes values in {0, 1}; fuzzy logic takes values in [0, 1].]
A fuzzy logic system is one that has at least one system component
that uses fuzzy logic for its internal knowledge representation.
Fuzzy systems communicate information using fuzzy sets.
Fuzzy logic is used purely for internal knowledge representation and
externally it can be considered as any other system component.
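The contrast between the two panels can be sketched in Python. The 1.8 m cut-off is taken from the figure; the linear ramp between 1.6 m and 2.0 m is an assumption made for illustration, since the slide does not give the exact shape of the fuzzy membership curve.

```python
# Illustrative sketch: crisp versus fuzzy membership in the set "tall".
# The ramp limits LOW and HIGH are hypothetical; only 1.8 m is from the slide.

CUTOFF = 1.8           # binary logic: tall if height >= 1.8 m
LOW, HIGH = 1.6, 2.0   # assumed ramp for the fuzzy set

def tall_crisp(height):
    """Binary valued logic: grade of truth is 0 or 1."""
    return 1 if height >= CUTOFF else 0

def tall_fuzzy(height):
    """Fuzzy logic: grade of truth rises linearly from 0 to 1."""
    if height <= LOW:
        return 0.0
    if height >= HIGH:
        return 1.0
    return (height - LOW) / (HIGH - LOW)

def not_tall_fuzzy(height):
    """Standard fuzzy complement: 1 minus the membership in tall."""
    return 1.0 - tall_fuzzy(height)

h = 1.79
print(tall_crisp(h), round(tall_fuzzy(h), 3), round(not_tall_fuzzy(h), 3))
```

Under these assumptions a person of height 1.79 m is "not tall" in binary logic, but belongs to both fuzzy sets with nearly equal degrees.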
• Fuzzy Membership
Example : Five tumblers
Consider two sets: F and E.
F is the set of all tumblers belonging to the class full, and
E is the set of all tumblers belonging to the class empty.
Definition of the sets F and E
Tumbler                           1      2      3      4      5
Grade of membership to set F    100%    75%    50%    25%     0%
Grade of membership to set E      0%    25%    50%    75%   100%
[Figure: Graphical representation of sets F and E, plotting the grade of membership (0 to 1.0) of each tumbler.]
The sets F and E have some elements with partial membership.
Such non-crisp sets are called fuzzy sets.
The set "all tumblers", which is the basis of the fuzzy sets F and E,
is called the base set.
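The tumbler table can be written directly as two membership lists. The sketch below also applies the standard fuzzy complement, intersection (min) and union (max); those operators are not defined on this slide and are included here as an assumption, to show how fuzzy sets over one base set are usually combined.

```python
# Illustrative sketch: the fuzzy sets F (full) and E (empty) over five tumblers.
# Membership grades are taken from the table above, as fractions of 1.

F = [1.00, 0.75, 0.50, 0.25, 0.00]
E = [0.00, 0.25, 0.50, 0.75, 1.00]

# Standard fuzzy operators (assumed, not given in the slide).
complement_of_F = [1.0 - grade for grade in F]
full_and_empty = [min(f, e) for f, e in zip(F, E)]
full_or_empty = [max(f, e) for f, e in zip(F, E)]

print(complement_of_F)   # equals E for this example
print(full_and_empty)
print(full_or_empty)
```

Unlike crisp sets, the intersection of F and its complement is not empty: the half-filled tumbler belongs to both sets with grade 0.5.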
UNIT 05 EXPERT SYSTEM
1. Introduction
Expert systems are computer applications which embody some
non-algorithmic expertise for solving certain types of problems.
1.1 Expert System Components and Human Interfaces
Expert systems have a number of major system components and interface
with individuals who interact with the system in various roles. These are
illustrated below.
[Figure: Components of Expert System, showing the User, the Domain Expert (expertise), the Knowledge Engineer (encoded expertise) and the System Engineer around the User Interface, the Inference Engine, the Knowledge Base and the Working Storage.]
The individual components and their roles are explained in the next slides.
AI – Expert system - Introduction
Components and Interfaces
‡ Knowledge base : A declarative representation of the expertise;
often in IF THEN rules ;
‡ Working storage : The data which is specific to a problem being
solved;
‡ Inference engine : The code at the core of the system which derives
recommendations from the knowledge base and problem-specific data
in working storage;
‡ User interface : The code that controls the dialog between the user
and the system.
Roles of Individuals who interact with the system
‡ Domain expert : The individual or individuals who are currently experts in
solving the problems that the system is intended to solve;
‡ Knowledge engineer : The individual who encodes the expert's
knowledge in a declarative form that can be used by the expert
system;
‡ User : The individual who will be consulting with the system to get
advice which would have been provided by the expert.
Expert System Shells
Many expert systems are built with products called expert
system shells. A shell is a piece of software which contains the
user interface, a format for declarative knowledge in the knowledge
base, and an inference engine. The knowledge and system engineers
use these shells in making expert systems.
‡ Knowledge engineer : uses the shell to build a system for a
particular problem domain.
‡ System engineer : builds the user interface, designs the declarative
format of the knowledge base, and implements the inference engine.
Depending on the size of the system, the knowledge engineer and the
system engineer might be the same person.
1.2 Expert System Characteristics
An expert system operates as an interactive system that responds to
questions, asks for clarifications, makes recommendations and generally
aids the decision-making process.
Expert systems have many characteristics :
Operates as an interactive system
This means an expert system :
‡ Responds to questions
‡ Asks for clarifications
‡ Makes recommendations
‡ Aids the decision-making process.
Tools have ability to sift (filter) knowledge
‡ Storage and retrieval of knowledge
‡ Mechanisms to expand and update knowledge base on a continuing
basis.
Make logical inferences based on knowledge stored
‡ Simple reasoning mechanisms are used
‡ Knowledge base must have means of exploiting the knowledge
stored, else it is useless; e.g., learning all the words in a language,
without knowing how to combine those words to form a meaningful sentence.
Ability to Explain Reasoning
‡ Remembers logical chain of reasoning; therefore user may ask
◊ for explanation of a recommendation
◊ factors considered in recommendation
‡ Enhances user confidence in recommendation and acceptance of
expert system
Domain-Specific
‡ A particular system caters to a narrow area of specialization;
e.g., a medical expert system cannot be used to find faults in an
electrical circuit.
‡ Quality of advice offered by an expert system is dependent on the
amount of knowledge stored.
Capability to assign Confidence Values
‡ Can deliver quantitative information
‡ Can interpret qualitatively derived values
‡ Can address imprecise and incomplete data through assignment of
confidence values.
Applications
‡ Best suited for those dealing with expert heuristics for solving
problems.
‡ Not a suitable choice for those problems that can be solved using
purely numerical techniques.
Cost-Effective alternative to Human Expert
‡ Expert systems have become increasingly popular because of their
specialization, albeit in a narrow field.
‡ Encoding and storing the domain-specific knowledge is an economical
process due to its small size.
‡ Specialists in many areas are rare and the cost of consulting
them is high; an expert system for those areas can be a useful
and cost-effective alternative in the long run.
1.3 Expert System Features
The features which commonly exist in expert systems are :
Goal Driven Reasoning or Backward Chaining
An inference technique which uses IF-THEN rules to repetitively
break a goal into smaller sub-goals which are easier to prove;
Coping with Uncertainty
The ability of the system to reason with rules and data which are
not precisely known;
Data Driven Reasoning or Forward Chaining
An inference technique which uses IF-THEN rules to deduce a
problem solution from initial data;
Data Representation
The way in which the problem specific data in the system is stored and
accessed;
User Interface
That portion of the code which creates an easy to use system;
Explanations
The ability of the system to explain the reasoning process that it used
to reach a recommendation.
Each of these features was discussed in detail in previous lectures on AI.
However, for completeness and ease of recall, they are mentioned briefly here.
• Goal-Driven Reasoning
Goal-driven reasoning, or backward chaining, is an efficient way to solve
problems. The algorithm proceeds from the desired goal, adding new
assertions found.
Data       Rules                            Conclusion
a = 1      if a = 1 & b = 2 then c = 3,
b = 2      if c = 3 then d = 4              d = 4
The knowledge is structured in rules which describe how each of the
possibilities might be selected.
The rules break the problem into sub-problems.
Example :
KB contains Rule set :
Rule 1: If A and C Then F
Rule 2: If A and E Then G
Rule 3: If B Then E
Rule 4: If G Then D
Problem : prove
If A and B true Then D is true
• Uncertainty
Often the knowledge is imperfect, which causes uncertainty.
To work in the real world, expert systems must be able to deal with
uncertainty.
One simple way is to associate a numeric value with each piece of
information in the system.
The numeric value represents the certainty with which the information
is known.
There are different ways in which these numbers can be defined, and in which
they are combined during the inference process.
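One widely used scheme of this kind is the MYCIN-style certainty factor. The slide does not name a particular scheme, so the Python below is a sketch under that assumption, restricted to positive certainty factors in [0, 1]; the function names are hypothetical.

```python
# Illustrative sketch of MYCIN-style certainty factors (an assumed scheme,
# not the one prescribed by the lecture). Only values in [0, 1] are handled.

def rule_certainty(premise_cfs, rule_cf):
    """Certainty of a conclusion: the weakest premise scaled by the rule's own CF."""
    return min(premise_cfs) * rule_cf

def combine_certainty(cf1, cf2):
    """Combine two independent positive CFs supporting the same conclusion."""
    return cf1 + cf2 * (1.0 - cf1)

# Two rules support the same conclusion with different strengths.
cf_from_rule_a = rule_certainty([0.9, 0.8], rule_cf=0.5)   # 0.8 * 0.5
cf_from_rule_b = rule_certainty([0.7], rule_cf=1.0)        # 0.7 * 1.0
combined = combine_certainty(cf_from_rule_a, cf_from_rule_b)
print(round(cf_from_rule_a, 2), round(cf_from_rule_b, 2), round(combined, 2))
```

With this combination rule, additional supporting evidence raises the certainty but never pushes it above 1.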
• Data Driven Reasoning
The data driven approach, or forward chaining, uses rules similar to those
used for backward chaining. However, the inference process is different.
The system keeps track of the current state of problem solution and looks
for rules which will move that state closer to a final solution. The
algorithm proceeds from a given situation to a desired goal, adding new
assertions found.
Data       Rules                            Conclusion
a = 1      if a = 1 & b = 2 then c = 3,
b = 2      if c = 3 then d = 4              d = 4
The knowledge is structured in rules which describe how each of the
possibilities might be selected. The rules break the problem into sub-problems.
Example :
KB contains Rule set :
Rule 1: If A and C Then F
Rule 2: If A and E Then G
Rule 3: If B Then E
Rule 4: If G Then D
Problem : prove
If A and B true Then D is true
• Data Representation
An expert system is built around a knowledge base module.
Knowledge acquisition is the transfer of knowledge from the human expert
to the computer.
Knowledge representation is a faithful representation of what the expert
knows.
No single knowledge representation system is optimal for all applications.
The success of an expert system depends on choosing the knowledge encoding
scheme best suited to the kind of knowledge the system is based on.
The IF-THEN rules, Semantic networks, and Frames are the most
commonly used representation schemes.
• User Interface
The acceptability of an expert system depends largely on the quality of the
user interface.
Scrolling dialog interface : It is easiest to implement and communicate
with the user.
Pop-up menus, windows, mice are more advanced interfaces and
powerful tools for communicating with the user; they require graphics
support.
• Explanations
An important feature of expert systems is their ability to explain
themselves.
Given that the system knows which rules were used during the
inference process, the system can provide those rules to the user
as means for explaining the results.
By looking at explanations, the knowledge engineer can see how the
system is behaving, and how the rules and data are interacting.
This is a very valuable diagnostic tool during development.
2. Knowledge Acquisition
AI – Expert system - Knowledge acquisition
Knowledge acquisition includes the elicitation, collection, analysis, modeling
and validation of knowledge.
2.1 Issues in Knowledge Acquisition
The important issues in knowledge acquisition are:
Knowledge is in the heads of experts
Experts have vast amounts of knowledge
Experts have a lot of tacit knowledge
‡ They do not know all that they know and use
‡ Tacit knowledge is hard (impossible) to describe
Experts are very busy and valuable people
One expert does not know everything
Knowledge has a "shelf life"
2.2 Techniques for Knowledge Acquisition
The techniques for acquiring, analyzing and modeling knowledge are :
Protocol-generation techniques, Protocol analysis techniques,
Hierarchy-generation techniques, Matrix-based techniques, Sorting techniques,
Limited-information and constrained-processing tasks, Diagram-based
techniques. Each of these is briefly stated in the next few slides.
Protocol-generation techniques
Include many types of interviews (unstructured, semi-structured
and structured), reporting and observational techniques.
Protocol analysis techniques
Used with transcripts of interviews or text-based information to
identify basic knowledge objects within a protocol, such as goals,
decisions, relationships and attributes. These act as a bridge between
the use of protocol-based techniques and knowledge modeling
techniques.
Hierarchy-generation techniques
Involve creation, reviewing and modification of hierarchical knowledge.
Hierarchy-generation techniques, such as laddering, are used to
build taxonomies or other hierarchical structures such as goal trees
and decision networks. The ladders are of various forms, like concept
ladders, attribute ladders and composition ladders.
Matrix-based techniques
Involve the construction and filling-in of a 2-D matrix (grid, table)
indicating relationships, for example between concepts and
properties (attributes and values), between problems and solutions,
or between tasks and resources. The elements within the
matrix can contain symbols (ticks, crosses, question marks), colors,
numbers or text.
Sorting techniques
Used for capturing the way people compare and order concepts; it
may reveal knowledge about classes, properties and priorities.
Limited-information and constrained-processing tasks
Techniques that limit the time and/or information available to
the expert when performing tasks. For example, a twenty-questions
technique provides an efficient way of accessing the key information in
a domain in a prioritized order.
Diagram-based techniques
Include generation and use of concept maps, state transition networks,
event diagrams and process maps. These are particularly
important in capturing the "what, how, when, who and why" of
tasks and events.
3. Knowledge Base (Representing and Using Domain Knowledge)
AI – Expert system - Knowledge base
An expert system is built around a knowledge base module. It
contains a formal representation of the information provided by the domain
expert. This information may be in the form of problem-solving rules,
procedures, or data intrinsic to the domain. To incorporate this information
into the system, it is necessary to make use of one or more knowledge
representation methods. Some of these methods are described here.
Transferring knowledge from the human expert to a computer is often the most
difficult part of building an expert system.
The knowledge acquired from the human expert must be encoded in such a
way that it remains a faithful representation of what the expert knows and
can be manipulated by a computer.
Three common methods of knowledge representation that have evolved over the years
are IF-THEN rules, Semantic networks and Frames.
The first two methods were illustrated in the earlier lecture slides on knowledge
representation and are therefore only mentioned here. The frame-based representation
is described in more detail.
3.1 IF-THEN rules
Human experts usually tend to think along the lines of :
condition ⇒ action or situation ⇒ conclusion
IF-THEN rules are the predominant form of encoding knowledge in
expert systems. These are of the form :
If a1 , a2 , . . . . . , an
Then b1 , b2 , . . . . . , bn where
each ai is a condition or situation, and
each bi is an action or a conclusion.
3.2 Semantic Networks
In this scheme, knowledge is represented in terms of objects and
relationships between objects.
The objects are denoted as nodes of a graph. The relationship between
two objects is denoted as a link between the corresponding two nodes.
The most common form of semantic networks uses the links between
nodes to represent IS-A and HAS relationships between objects.
Example of Semantic Network
The Fig. below shows a car IS-A vehicle; a vehicle HAS wheels.
This kind of relationship establishes an inheritance hierarchy in the
network, with the objects lower down in the network inheriting
properties from the objects higher up.
[Figure: Semantic network with the nodes Vehicle, Wheels, Car, Engine, Battery, Honda Civic, Nissan Sentra and Power Steering, connected by Is-A and HAS links.]
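Inheritance along IS-A links can be sketched with two dictionaries. The links for car, vehicle and wheels come from the text; the links for the engine, the battery and the two car models are read from the figure labels and should be treated as assumptions, as are the identifier names.

```python
# Illustrative sketch of a semantic network with IS-A and HAS links.
# Node names are lower-cased labels; some links are assumed from the figure.

IS_A = {
    "car": "vehicle",
    "honda_civic": "car",      # assumed from the figure
    "nissan_sentra": "car",    # assumed from the figure
}

HAS = {
    "vehicle": {"wheels"},
    "car": {"engine", "battery"},   # assumed from the figure
}

def properties(node):
    """Collect HAS properties of a node and of every ancestor on its IS-A chain."""
    found = set()
    while node is not None:
        found |= HAS.get(node, set())
        node = IS_A.get(node)   # move one level up; None at the top
    return found

print(sorted(properties("honda_civic")))
```

A node lower down, such as the Honda Civic, needs no HAS links of its own to answer that it has wheels; the property is found by following the IS-A chain upward.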
3.3 Frames
In this technique, knowledge is decomposed into highly modular
pieces called frames, which are generalized record structures.
Knowledge consists of concepts, situations, attributes of concepts,
relationships between concepts, and procedures to handle relationships
as well as attribute values.
‡ Each concept may be represented as a separate frame.
‡ The attributes, the relationships between concepts, and the
procedures are allotted to slots in a frame.
‡ The contents of a slot may be of any data type - numbers,
strings, functions or procedures and so on.
‡ The frames may be linked to other frames, providing the same
kind of inheritance as that provided by a semantic network.
A frame-based representation is ideally suited for object-oriented
programming techniques. An example of Frame-based representation of
knowledge is shown in the next slide.
Example : Frame-based Representation of Knowledge.
Two frames, their slots and the slots filled with data type are shown.

Frame               Car
Inheritance Slot    Is-A
  Value             Vehicle
Attribute Slot      Engine
  Value             1
Attribute Slot      Cylinders
  Value             4, 6, 8
Attribute Slot      Doors
  Value             2, 5, 4

Frame               Car
Inheritance Slot    Is-A
  Value             Car
Attribute Slot      Make
  Value             Honda
Attribute Slot      Year
  Value             1989
Attribute Slot      (empty)
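The two frames can be sketched as small Python objects whose slot lookup falls back to the parent frame. The class `Frame`, the method `get`, and the frame names `vehicle` and `my_car` are hypothetical; the slot names and values follow the example above.

```python
# Illustrative sketch of frames with slot inheritance over Is-A links.

class Frame:
    def __init__(self, name, is_a=None, **slots):
        self.name = name
        self.is_a = is_a          # inheritance slot: parent frame or None
        self.slots = dict(slots)  # attribute slots: any data type

    def get(self, slot):
        """Return the slot value, searching parent frames if it is missing here."""
        frame = self
        while frame is not None:
            if slot in frame.slots:
                return frame.slots[slot]
            frame = frame.is_a
        raise KeyError(slot)

vehicle = Frame("vehicle")
car = Frame("car", is_a=vehicle, engine=1, cylinders=[4, 6, 8], doors=[2, 5, 4])
my_car = Frame("my_car", is_a=car, make="Honda", year=1989)

print(my_car.get("make"), my_car.get("year"), my_car.get("cylinders"))
```

The specific frame stores only what distinguishes it (make and year); the permitted cylinder and door values are inherited from the more general Car frame.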
4. Working Memory
AI – Expert system - Working memory
Working memory refers to task-specific data for a problem. The contents
of the working memory change with each problem situation. Consequently, it
is the most dynamic component of an expert system, assuming that it is kept
current.
‡ Every problem in a domain has some unique data associated with it.
‡ Data may consist of the set of conditions leading to the problem,
its parameters and so on.
‡ Data specific to the problem needs to be input by the user at the
time of using, that is, consulting, the expert system. The working memory
is therefore related to the user interface.
‡ The Fig. below shows how working memory is closely related to the user interface
of the expert system.
[Figure: User, User Interface, Working Memory (task specific data), Inference Engine and Knowledge Base, drawn as connected blocks in that order.]
AI – Expert system – Inference Engine
5. Inference Engine
The inference engine is a generic control mechanism for navigating through
and manipulating knowledge and deducing results in an organized manner.
The inference engine's generic control mechanism applies the axiomatic
(self-evident) knowledge present in the knowledge base to the task-specific
data to arrive at some conclusion.
‡ The inference engine is the other key component of all expert systems.
‡ A knowledge base alone is not of much use if there are no facilities for
navigating through and manipulating the knowledge to deduce something
from it.
‡ Since a knowledge base is usually very large, it is necessary to have inferencing
mechanisms that search through the database and deduce results in an
organized manner.
Forward chaining, backward chaining and tree searches are some
of the techniques used for drawing inferences from the knowledge base.
These techniques were discussed in the earlier lectures on Problem Solving :
Search and Control Strategies, and Knowledge Representation. However, they are
revisited here in the context of expert systems.
5.1 Forward Chaining Algorithm
Forward chaining is a technique for drawing inferences from a rule
base. Forward-chaining inference is often called data driven.
‡ The algorithm proceeds from a given situation to a desired goal, adding
new assertions (facts) found.
‡ A forward-chaining system compares data in the working memory
against the conditions in the IF parts of the rules and determines which
rule to fire.
‡ Data Driven
Data       Rules                            Conclusion
a = 1      if a = 1 & b = 2 then c = 3,
b = 2      if c = 3 then d = 4              d = 4
‡ Example : Forward Chaining
Given : A rule base contains the following rule set
Rule 1: If A and C Then F
Rule 2: If A and E Then G
Rule 3: If B Then E
Rule 4: If G Then D
Problem : Prove
If A and B true Then D is true
Solution :
(i) ‡ Start with the given input, A and B are true, and then
‡ start at Rule 1 and go forward/down till a rule that "fires" is found.
First iteration :
(ii) ‡ Rule 3 fires : conclusion E is true
‡ new knowledge found
(iii) ‡ No other rule fires;
‡ end of first iteration.
(iv) ‡ Goal not found;
‡ new knowledge found at (ii);
‡ go for second iteration
Second iteration :
(v) ‡ Rule 2 fires : conclusion G is true
‡ new knowledge found
(vi) ‡ Rule 4 fires : conclusion D is true
‡ Goal found;
‡ Proved
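The iterations above can be reproduced with a short Python sketch. The rule set is the one given in the example; the tuple representation of a rule and the function name `forward_chain` are assumptions made for illustration, not a prescribed implementation.

```python
# Illustrative sketch of forward chaining over the example rule base.
# Each rule is (name, set of IF conditions, THEN conclusion).

RULES = [
    ("Rule 1", {"A", "C"}, "F"),
    ("Rule 2", {"A", "E"}, "G"),
    ("Rule 3", {"B"}, "E"),
    ("Rule 4", {"G"}, "D"),
]

def forward_chain(rules, facts, goal):
    """Scan the rules top to bottom, firing any rule whose conditions are
    all known, until the goal is reached or a full pass adds nothing new."""
    known = set(facts)
    fired = []
    changed = True
    while changed and goal not in known:
        changed = False
        for name, conditions, conclusion in rules:
            if conclusion not in known and conditions <= known:
                known.add(conclusion)   # new knowledge found
                fired.append(name)
                changed = True
    return goal in known, fired

proved, fired = forward_chain(RULES, {"A", "B"}, "D")
print(proved, fired)
```

Starting from A and B, the first pass fires only Rule 3, and the second pass fires Rule 2 and then Rule 4, which matches the two iterations of the worked solution.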
5.2 Backward Chaining Algorithm
Backward chaining is a technique for drawing inferences from a rule
base. Backward-chaining inference is often called goal driven.
‡ The algorithm proceeds from the desired goal, adding new assertions found.
‡ A backward-chaining system looks for the action in the THEN clause of
the rules that matches the specified goal.
‡ Goal Driven
Data       Rules                            Conclusion
a = 1      if a = 1 & b = 2 then c = 3,
b = 2      if c = 3 then d = 4              d = 4
‡ Example : Backward Chaining
Given : A rule base contains the following rule set
Rule 1: If A and C Then F
Rule 2: If A and E Then G
Rule 3: If B Then E
Rule 4: If G Then D
Problem : Prove
If A and B true Then D is true
Solution :
(i) ‡ Start with the goal, i.e., D is true
‡ go backward/up till a rule that "fires" is found.
First iteration :
(ii) ‡ Rule 4 fires :
‡ new sub goal to prove G is true
‡ go backward
(iii) ‡ Rule 2 fires : conclusion A is true (1st input found)
‡ new sub goal to prove E is true
‡ go backward;
(iv) ‡ no other rule fires; end of first iteration.
‡ new sub goal found at (iii);
‡ go for second iteration
Second iteration :
(v) ‡ Rule 3 fires :
‡ conclusion B is true (2nd input found)
‡ both inputs A and B ascertained
‡ Proved
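The same problem can be solved goal-first with a small recursive Python sketch. The rule set is the one from the example; the function name `prove`, the trace list and the loop guard `pending` are assumptions introduced for illustration.

```python
# Illustrative sketch of backward chaining over the example rule base.
# Each rule is (name, set of IF conditions, THEN conclusion).

RULES = [
    ("Rule 1", {"A", "C"}, "F"),
    ("Rule 2", {"A", "E"}, "G"),
    ("Rule 3", {"B"}, "E"),
    ("Rule 4", {"G"}, "D"),
]

def prove(goal, facts, rules, trace, pending=frozenset()):
    """Prove the goal by finding a rule whose THEN part matches it and
    recursively proving each IF condition as a sub-goal."""
    if goal in facts:
        return True                      # input found
    if goal in pending:
        return False                     # guard against circular rules
    for name, conditions, conclusion in rules:
        if conclusion != goal:
            continue
        mark = len(trace)
        trace.append(name)               # rule selected for this sub-goal
        if all(prove(c, facts, rules, trace, pending | {goal})
               for c in sorted(conditions)):
            return True
        del trace[mark:]                 # rule failed: undo its trace
    return False

trace = []
proved = prove("D", {"A", "B"}, RULES, trace)
print(proved, trace)
```

The trace lists the rules in the order they are selected: Rule 4 for the goal D, Rule 2 for the sub-goal G, and Rule 3 for the sub-goal E, after which both inputs A and B are found among the given facts.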
5.3 Tree Searches
Often a knowledge base is represented as a branching network or tree.
Many tree searching algorithms exist, but two basic approaches are
depth-first search and breadth-first search.
Note : Here these two searches are only briefly mentioned since they were
described with examples in the previous lectures.
Depth-First Search
‡ The algorithm begins at the initial node.
‡ Check to see if the left-most node below the initial node (call it node A)
is a goal node.
‡ If not, include node A on a list of sub-goals outstanding.
‡ Then start with node A and look at the first node below it,
and so on.
‡ If there are no more lower level nodes, and the goal node is not reached,
then start from the last node on the outstanding list and follow the next
route of descent to the right.
Breadth-First Search
‡ The algorithm starts by expanding all the nodes one level below
the initial node.
‡ Expand all nodes till a solution is reached or the tree is completely
expanded.
‡ Find the shortest path from the initial assertion to a solution.
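Both searches can be sketched over a small tree stored as a dictionary of children. The tree, its node names and the function names are hypothetical; the sketch records the order in which nodes are examined so that the two strategies can be compared.

```python
# Illustrative sketch of depth-first and breadth-first tree search.
# TREE is a hypothetical example: each node maps to its children, left to right.

from collections import deque

TREE = {"S": ["A", "B"], "A": ["C", "D"], "B": ["E"], "C": [], "D": [], "E": []}

def depth_first(tree, start, goal):
    """Follow the left-most branch first; back up to the last outstanding node."""
    stack, visited = [[start]], []
    while stack:
        path = stack.pop()
        visited.append(path[-1])
        if path[-1] == goal:
            return path, visited
        for child in reversed(tree.get(path[-1], [])):
            stack.append(path + [child])   # left-most child ends on top
    return None, visited

def breadth_first(tree, start, goal):
    """Expand all nodes of one level before moving to the next level."""
    queue, visited = deque([[start]]), []
    while queue:
        path = queue.popleft()
        visited.append(path[-1])
        if path[-1] == goal:
            return path, visited
        for child in tree.get(path[-1], []):
            queue.append(path + [child])
    return None, visited

dfs_path, dfs_order = depth_first(TREE, "S", "E")
bfs_path, bfs_order = breadth_first(TREE, "S", "E")
print(dfs_order, bfs_order)
```

Both searches return the same path to the goal in this tree, but they examine the nodes in a different order: depth-first exhausts the left subtree before trying the right one, while breadth-first works level by level.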
AI – Expert system – Shells
6. Expert System Shells
An expert system shell is a software development environment. It
contains the basic components of expert systems. A shell is associated
with a prescribed method for building applications by configuring and
instantiating these components.
6.1 Shell components and description
The generic components of a shell : the knowledge acquisition, the
knowledge Base, the reasoning, the explanation and the user interface are
shown below. The knowledge base and reasoning engine are the core
components.
[Figure: Expert System Shell, with the blocks Knowledge Acquisition Subsystem, Knowledge Base (Facts, Heuristics), Inference Mechanism (Reasoning with Uncertainty), Explanation Subsystem and User Interface, and the roles Expert, Knowledge Engineer and User.]
All these components are described in the next slide.
Knowledge Base
A store of factual and heuristic knowledge. An expert system tool
provides one or more knowledge representation schemes for
expressing knowledge about the application domain. Some tools
use both Frames (objects) and IF-THEN rules. In PROLOG the
knowledge is represented as logical statements.
Reasoning Engine
Inference mechanisms for manipulating the symbolic information and
knowledge in the knowledge base to form a line of reasoning in
solving a problem. The inference mechanism can range from
simple modus ponens backward chaining of IF-THEN rules to
Case-Based reasoning.
Knowledge Acquisition subsystem
A subsystem to help experts build knowledge bases. However,
collecting the knowledge needed to solve problems and build the
knowledge base is the biggest bottleneck in building expert systems.
Explanation subsystem
A subsystem that explains the system's actions. The explanation can
range from how the final or intermediate solutions were arrived at
to justifying the need for additional data.
User Interface
A means of communication with the user. The user interface is
generally not a part of the expert system technology. It was not given
much attention in the past. However, the user interface can make a
critical difference in the perceived utility of an expert system.
AI – Expert system – Explanation
7. Explanation
Most expert systems have explanation facilities that allow the user to ask questions - why and how it reached some conclusion.
The questions are answered by referring to the system goals, the rules
being used, and existing problem solving. The rules typically reflect
empirical, or "compiled", knowledge. They encode an expert's
rules of thumb, not the expert's deeper understanding.
Example :
Dialog with an expert system designed to give advice on car problems.
System Does the car start ?
User No.
System Does the engine turn over ?
User Yes.
System Do you smell gas ?
User Yes
System Recommendation - Wait 5 minutes and try again.
User Why ?
System I used the rule:
If not start, and engine_turn_over, and smell_gas
Then recommendation is 'Wait 5 minutes and try again'
Note : The rule gives the correct advice for a flooded car, and knows the
questions to be asked to determine if the car is flooded, but it does not contain
the knowledge of what a flooded car is and why waiting will help.
Types of Explanation
There are four types of explanations commonly used in expert systems.
‡ Rule trace reports on the progress of a consultation;
‡ Explanation of how the system reached the given conclusion;
‡ Explanation of why the system did not give any conclusion.
‡ Explanation of why the system is asking a question;
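The four explanation types can all be derived from one recorded trace of the consultation. The sketch below assumes a very simple engine that tries rules in order. The "dead_battery" rule, the trace labels ("ask", "how", "why_not"), and the scripted answers are assumptions made for illustration and are not part of the original example.

```python
# Hypothetical rule base: (name, required facts, conclusion).
# "dead_battery" is invented here so that one rule can be rejected.
RULES = [
    ("dead_battery", {"start": False, "engine_turn_over": False},
     "Check the battery"),
    ("flooded", {"start": False, "engine_turn_over": True, "smell_gas": True},
     "Wait 5 minutes and try again"),
]


def consult(rules, answers):
    """Try each rule in order; return (conclusion, trace).

    `answers` is a scripted stand-in for the user's replies.
    """
    facts, trace = {}, []
    for name, conditions, conclusion in rules:
        failed = None
        for fact, wanted in conditions.items():
            if fact not in facts:
                # Why the system asks: a rule under test needs this fact.
                trace.append(("ask", f"{fact}? (needed by rule {name})"))
                facts[fact] = answers[fact]
            if facts[fact] != wanted:
                failed = fact
                break
        if failed is None:
            trace.append(("how", f"rule {name} fired: {conclusion}"))
            return conclusion, trace
        trace.append(
            ("why_not", f"rule {name} rejected: {failed} is {facts[failed]}"))
    return None, trace


def explain(trace, kind):
    """Filter the rule trace down to one type of explanation."""
    return [text for entry_kind, text in trace if entry_kind == kind]


conclusion, trace = consult(
    RULES, {"start": False, "engine_turn_over": True, "smell_gas": True})
for kind, text in trace:  # the full rule trace
    print(kind, "-", text)
```

Here the full list is the rule trace, explain(trace, "how") gives the rule that produced the conclusion, explain(trace, "why_not") lists the rejected rules, and explain(trace, "ask") records why each question was asked.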
8. Application of Expert Systems
Expert systems have found their way into most areas of knowledge
work. Applications of expert system technology have proliferated widely
across industrial and commercial problems, and have even helped NASA plan
the maintenance of a space shuttle for its next flight. The main applications
are described below.
‡ Diagnosis and Troubleshooting of Devices and Systems
Medical diagnosis was one of the first knowledge areas to which expert
system technology was applied, in 1976. However, the diagnosis of
engineering systems quickly surpassed medical diagnosis.
‡ Planning and Scheduling
The commercial potential of expert systems in planning and scheduling
has been recognized as very large. Examples are airlines scheduling their
flights, personnel, and gates, and manufacturing process planning and job
scheduling.
‡ Configuration of Manufactured Objects from sub-assemblies
In configuration problems, a solution is synthesized from a given set of
elements related by a set of constraints. Expert systems have been very
useful in finding such solutions. Examples include modular home building
and manufacturing involving complex engineering design.
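Configuration as "elements related by constraints" can be sketched as a small search. The example below is only an illustration: the modular-home options and the two constraints are invented, and exhaustive enumeration is used for clarity rather than because real configuration systems work this way.

```python
from itertools import product

# Hypothetical elements: each part of a modular home and its allowed choices.
OPTIONS = {
    "foundation": ["slab", "basement"],
    "frame": ["timber", "steel"],
    "roof": ["flat", "pitched"],
}

# Hypothetical constraints: each returns True when a candidate is acceptable.
CONSTRAINTS = [
    lambda c: not (c["frame"] == "timber" and c["roof"] == "flat"),
    lambda c: not (c["foundation"] == "slab" and c["frame"] == "steel"),
]


def configurations(options, constraints):
    """Yield every combination of elements that satisfies all constraints."""
    names = list(options)
    for values in product(*(options[name] for name in names)):
        candidate = dict(zip(names, values))
        if all(check(candidate) for check in constraints):
            yield candidate


valid = list(configurations(OPTIONS, CONSTRAINTS))
for config in valid:
    print(config)
```

Of the eight possible combinations, the two constraints leave four valid configurations. With many elements the number of combinations grows quickly, which is one reason encoded expert knowledge is valuable for pruning the search.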
‡ Financial Decision Making
The financial services industry is a vigorous user of expert system
techniques. Advisory programs have been created to assist bankers
in determining whether to make loans to businesses and
individuals. Insurance companies use them to assess the risk presented
by a customer and to determine a price for the insurance. Expert systems
are also used in financial markets, for example in foreign exchange trading.
‡ Knowledge Publishing
This is a relatively new, but also potentially explosive, area. Here the
primary function of the expert system is to deliver knowledge that
is relevant to the user's problem. The two most widely known
expert systems in this area are an advisor on appropriate grammatical
usage in a text and a tax advisor on tax strategy,
tactics, and individual tax policy.
‡ Process Monitoring and Control
Here the expert system analyzes real-time data from physical devices,
looking for anomalies, predicting trends, and controlling for optimality and
failure correction. Examples of real-time systems that actively monitor
processes are found in the steel making and oil refining industries.
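A minimal sketch of the monitoring idea, assuming a single stream of sensor readings: flag values outside an allowed band (anomalies) and warn when a steady rise would cross the upper limit at the next reading (trend prediction). The function name, the window size, and the furnace temperature figures are hypothetical, and a real monitoring system would use far richer rules.

```python
def monitor(readings, low, high, window=3):
    """Return a list of (index, kind, value) alerts for a series of readings.

    kind is "anomaly" when the value is outside [low, high], or "trend" when
    the last `window` steps all rose and the average step, projected one
    reading ahead, would exceed `high`.
    """
    alerts = []
    for i, value in enumerate(readings):
        if not low <= value <= high:
            alerts.append((i, "anomaly", value))
        elif i >= window:
            recent = readings[i - window:i + 1]
            rising = all(a < b for a, b in zip(recent, recent[1:]))
            average_step = (value - recent[0]) / window
            if rising and value + average_step > high:
                alerts.append((i, "trend", value))
    return alerts


# Hypothetical furnace temperatures in degrees Celsius.
temperatures = [1500, 1505, 1515, 1535, 1565, 1600]
alerts = monitor(temperatures, low=1400, high=1580)
for index, kind, value in alerts:
    print(f"reading {index}: {kind} at {value}")
```

The trend alert fires one reading before the limit is actually exceeded, which is the point of predicting trends rather than only detecting anomalies.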
‡ Design and Manufacturing
Here expert systems assist in the design of physical devices and
processes, ranging from high-level conceptual design of abstract entities
all the way to factory floor configuration of manufacturing processes.