
UNIT I – INTRODUCTION TO AI AND PRODUCTION SYSTEMS

What is AI?

Artificial intelligence (AI) is the intelligence exhibited by machines or software. It is also the name of the academic field that studies how to create computers and computer software capable of intelligent behavior. The central problems (or goals) of AI research include reasoning, knowledge, planning, learning, natural language processing (communication), perception and the ability to move and manipulate objects.

Example

Cleverbot

Cleverbot is a chatterbot that is modeled after human behavior and is able to hold a conversation. It does so by remembering words from past conversations. Its responses are not programmed. Instead, it finds keywords or phrases that match the input and searches through its saved conversations to decide how it should respond.

MySong

MySong is an application that helps people who have no experience in writing songs, or who cannot play any instrument, to create original music by themselves. It automatically chooses chords to accompany a vocal melody that the user has just sung into a microphone. MySong can also help songwriters record their new ideas and melodies wherever and whenever they occur. However, MySong is not a professional application for producing or editing a song, so other tools or software are needed to develop a song fully.

Artificial Intelligence in Video Games

The AI components used in video games are often a slimmed-down version of a true AI implementation, because the scope of a video game is often limited (e.g. by console memory capacity). The most innovative uses of AI are found on personal computers, whose memory can be expanded beyond the capacity of modern gaming consoles. Some examples of AI components typically used in video games are path finding, adaptiveness (learning), perception, and planning (decision making). Present-day video games offer a variety of "worlds" in which AI concepts can be tested, such as static or dynamic environments, deterministic or non-deterministic transitions, and fully or partially known game worlds. The real-time performance constraint on AI processing in video games must also be considered. It is another reason why video games may implement a "simple" AI, such as a finite state machine, which may not even be considered artificial intelligence at heart.

Fatima Michael College of Engineering & Technology

Artificial Intelligence in Mobile Systems

As smartphones have come into our daily lives, we need to make our devices even more clever. Recently, researchers have been trying to apply traditional AI techniques in the mobile environment. Those techniques, including speech recognition, machine learning, classification and natural language processing, give us more powerful applications, such as Siri on iOS and Kinect from Microsoft. AI on mobile devices introduces some new challenges, such as limited computational resources and energy consumption.

History of AI

o Cybernetics and early neural networks
o Turing's test
o Game AI
o Symbolic reasoning and the Logic Theorist AI

Problem Formulation

o states: possible world states
o accessibility: the agent can determine via its sensors which state it is in
o consequences of actions: the agent knows the results of its actions
o levels: problems and actions can be specified at various levels
o constraints: conditions that influence the problem-solving process
o performance: measures to be applied
o costs: utilization of resources

Problem Types

o single-state problem
o multiple-state problem
o contingency problem
o exploration problem

Single-State Problem

Exact prediction is possible.

o state: known exactly after any sequence of actions
o accessibility of the world: all essential information can be obtained through sensors
o consequences of actions: known to the agent
o goal: for each known initial state, there is a unique goal state that is guaranteed to be reachable via an action sequence

This is the simplest case, but it is severely restricted.

Multiple-State Problem

Semi-exact prediction is possible.

o state: not known exactly, but limited to a set of possible states after each action
o accessibility of the world: not all essential information can be obtained through sensors; reasoning can be used to determine the set of possible states
o consequences of actions: not always or completely known to the agent; actions or the environment might exhibit randomness
o goal: due to ignorance, there may be no fixed action sequence that leads to the goal

This case is less restricted, but more complex.

Contingency Problem

Exact prediction is impossible.

o state: unknown in advance; may depend on the outcome of actions and changes in the environment
o accessibility of the world: some essential information may be obtained through sensors only at execution time
o consequences of actions: may not be known at planning time
o goal: instead of single action sequences, there are trees of actions
o contingency: a branching point in the tree of actions
o agent design: different from the previous two cases, since the agent must act on incomplete plans

Exploration Problem

The effects of actions are unknown.

o state: the set of possible states may be unknown
o accessibility of the world: some essential information may be obtained through sensors only at execution time
o consequences of actions: may not be known at planning time
o goal: can't be completely formulated in advance because states and consequences may not be known at planning time
o discovery: what states exist
o experimentation: what are the outcomes of actions
o learning: remember and evaluate experiments



MATCHING

Clever search involves choosing from among the rules that can be applied at a particular point, the ones that are most likely to lead to a solution. We need to extract from the entire collection of rules, those that can be applied at a given point. To do so requires some kind of matching between the current state and the preconditions of the rules.

How should this be done?

One way to select applicable rules is to do a simple search through all the rules, comparing each rule's preconditions to the current state and extracting all the ones that match. This requires examining every rule. But there are two problems with this simple solution:

A. Interesting problems require a large number of rules. Scanning through all of them at every step would be hopelessly inefficient.

B. It is not always immediately obvious whether a rule’s preconditions are satisfied by a particular state.

Sometimes, instead of searching through the rules, we can use the current state as an index into the rules and select the matching ones immediately. In spite of its limitations, indexing in some form is very important for the efficient operation of rule-based systems.

A more complex matching process is required when the preconditions of a rule specify required properties that are not stated explicitly in the description of the current state. In this case, a separate set of rules must be used to describe how some properties can be inferred from others. An even more complex matching process is required if rules should be applied when their preconditions only approximately match the current situation. This is often the case in situations involving physical descriptions of the world.
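The difference between scanning every rule and indexing by the current state can be sketched as follows. This is only an illustration under simple assumptions: a state is a set of facts, a rule matches when all of its precondition facts are in the state, and the rule names and facts (`boil_water`, `have_kettle`, and so on) are hypothetical.

```python
from collections import defaultdict

# Hypothetical rules: (name, set of precondition facts).
RULES = [
    ("boil_water", {"have_kettle", "have_water"}),
    ("make_tea", {"have_hot_water", "have_tea_bag"}),
    ("make_toast", {"have_bread", "have_toaster"}),
]


def match_by_scan(rules, state):
    """Simple search: compare every rule's preconditions to the state."""
    return [name for name, pre in rules if pre <= state]


def build_index(rules):
    """Map each fact to the rules whose preconditions mention it."""
    index = defaultdict(list)
    for name, pre in rules:
        for fact in pre:
            index[fact].append((name, pre))
    return index


def match_by_index(index, state):
    """Use the facts of the state as keys, so unrelated rules are never examined."""
    matched = set()
    for fact in state:
        for name, pre in index.get(fact, []):
            if pre <= state:
                matched.add(name)
    return sorted(matched)


state = {"have_kettle", "have_water", "have_bread"}
print(match_by_scan(RULES, state))
print(match_by_index(build_index(RULES), state))
```

Both functions select the same rules; the indexed version only looks at rules that share at least one fact with the current state. Neither handles the harder cases described above, where properties must be inferred or matches are approximate.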

LEARNING

Learning is the improvement of performance with experience over time.

Learning element is the portion of a learning AI system that decides how to modify the performance element and implements those modifications.

We all learn new knowledge through different methods, depending on the type of material to be learned, the amount of relevant knowledge we already possess, and the environment in which the learning takes place. There are five methods of learning:

1. Memorization (rote learning)

2. Direct instruction (by being told)

3. Analogy

4. Induction

5. Deduction

Learning by memorization is the simplest form of learning. It requires the least amount of inference and is accomplished by simply copying the knowledge, in the same form in which it will be used, directly into the knowledge base.

Example:- Memorizing multiplication tables, formulae, etc.

Direct instruction is a slightly more complex form of learning. This type of learning requires more inference than rote learning, since the knowledge must be transformed into an operational form before it is integrated into the knowledge base. We use this type of learning when a teacher presents a number of facts directly to us in a well-organized manner.

Analogical learning is the process of learning a new concept or solution through the use of similar known concepts or solutions. We use this type of learning when solving problems on an exam, where previously learned examples serve as a guide. We make frequent use of analogical learning. This form of learning requires still more inferring than either of the previous forms, since difficult transformations must be made between the known and unknown situations.

Learning by induction is also used frequently by humans. It is a powerful form of learning which, like analogical learning, requires more inferring than the first two methods. This form of learning requires the use of inductive inference, a form of invalid but useful inference. We use inductive learning when we formulate a general concept after seeing a number of instances or examples of the concept. For example, we learn the concepts of color or sweet taste after experiencing the sensations associated with several examples of colored objects or sweet foods.

Deductive learning is accomplished through a sequence of deductive inference steps using known facts. From the known facts, new facts or relationships are logically derived. Deductive learning usually requires more inference than the other methods.

Review Questions:-

1. What is perception?

2. How do we overcome perceptual problems?

3. Explain in detail the Waltz algorithm for constraint satisfaction.

4. What is learning?

5. What is a learning element?

6. List and explain the methods of learning.

Types of learning:- A classification or taxonomy of learning types serves as a guide in studying or comparing the differences among them. One can develop learning taxonomies based on the type of knowledge representation used (predicate calculus, rules, frames), the type of knowledge learned (concepts, game playing, problem solving), or the area of application (medical diagnosis, scheduling, prediction and so on).

The classification used here is intuitively more appealing and has become popular among machine learning researchers. It is independent of the knowledge domain and of the representation scheme used. It is based on the type of inference strategy employed or the methods used in the learning process. The five different learning methods under this taxonomy are:

Memorization (rote learning)



Direct instruction(by being told)

Analogy

Induction

Deduction

Learning by memorization is the simplest form of learning. It requires the least amount of inference and is accomplished by simply copying the knowledge, in the same form in which it will be used, directly into the knowledge base. We use this type of learning when we memorize multiplication tables, for example.

A slightly more complex form of learning is by direct instruction. This type of learning requires more understanding and inference than rote learning, since the knowledge must be transformed into an operational form before being integrated into the knowledge base. We use this type of learning when a teacher presents a number of facts directly to us in a well-organized manner.

The third type listed, analogical learning, is the process of learning a new concept or solution through the use of similar known concepts or solutions. We use this type of learning when solving problems on an examination where previously learned examples serve as a guide, or when we learn to drive a truck using our knowledge of car driving. We make frequent use of analogical learning. This form of learning requires still more inferring than either of the previous forms, since difficult transformations must be made between the known and unknown situations. This is a kind of application of knowledge in a new situation.

The fourth type of learning is also one that is used frequently by humans. It is a powerful form of learning which, like analogical learning, requires more inferring than the first two methods. This form of learning requires the use of inductive inference, a form of invalid but useful inference. We use inductive learning when we formulate a general concept after seeing a number of instances or examples of the concept. For example, we learn the concepts of color or sweet taste after experiencing the sensations associated with several examples of colored objects or sweet foods.

The final type of acquisition is deductive learning. It is accomplished through a sequence of deductive inference steps using known facts. From the known facts, new facts or relationships are logically derived. Deductive learning usually requires more inference than the other methods. The inference method used is, of course, a deductive type, which is a valid form of inference.

In addition to the above classification, we will sometimes refer to learning methods as either weak methods or knowledge-rich methods. Weak methods are general-purpose methods in which little or no initial knowledge is available. These methods are more mechanical than the classical AI knowledge-rich methods. They often rely on a form of heuristic search in the learning process.



Heuristic Search

All of the search methods in the preceding section are uninformed in that they did not take into account the goal. They do not use any information about where they are trying to get to unless they happen to stumble on a goal. One form of heuristic information about which nodes seem the most promising is a heuristic function h(n), which takes a node n and returns a non-negative real number that is an estimate of the path cost from node n to a goal node. The function h(n) is an underestimate if h(n) is less than or equal to the actual cost of a lowest-cost path from node n to a goal.

The heuristic function is a way to inform the search about the direction to a goal. It provides an informed way to guess which neighbor of a node will lead to a goal.

There is nothing magical about a heuristic function. It must use only information that can be readily obtained about a node. Typically a trade-off exists between the amount of work it takes to derive a heuristic value for a node and how accurately the heuristic value of a node measures the actual path cost from the node to a goal.

A standard way to derive a heuristic function is to solve a simpler problem and to use the actual cost in the simplified problem as the heuristic function of the original problem.
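As a small illustration of deriving a heuristic from a simpler problem, the sketch below ignores all obstacles and arcs and uses the straight-line distance to the goal as the heuristic value. The node names and coordinates are hypothetical and are not taken from the delivery robot domain.

```python
import math

# Hypothetical (x, y) positions; assumed for illustration only.
POSITIONS = {
    "a": (0.0, 0.0),
    "b": (3.0, 0.0),
    "goal": (3.0, 4.0),
}


def straight_line_h(node, goal="goal", positions=POSITIONS):
    """Heuristic from the relaxed problem: cost if the agent could move in a straight line."""
    return math.dist(positions[node], positions[goal])


for node in POSITIONS:
    print(node, straight_line_h(node))
```

If arc costs are actual travel distances, no real path can be shorter than the straight line, so this relaxed cost never exceeds the true cost.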

The straight-line distance in the world between the node and the goal position can be used as the heuristic function. The examples that follow assume the following heuristic function:

h(mail) = 26    h(ts) = 23    h(o103) = 21
h(o109) = 24    h(o111) = 27    h(o119) = 11
h(o123) = 4    h(o125) = 6    h(r123) = 0
h(b1) = 13    h(b2) = 15    h(b3) = 17
h(b4) = 18    h(c1) = 6    h(c2) = 10
h(c3) = 12    h(storage) = 12

This h function is an underestimate because the h value is less than or equal to the exact cost of a lowest-cost path from the node to a goal. It is the exact cost for node o123. It is very much an underestimate for node b1, which seems to be close, but there is only a long route to the goal. It is very misleading for c1, which also seems close to the goal, but no path exists from that node to the goal.

As another example, consider a delivery robot whose state space includes the parcels to be delivered. Suppose the cost function is the total distance traveled by the robot to deliver all of the parcels. One possible heuristic function is the largest distance of a parcel from its destination. If the robot could only carry one parcel, a possible heuristic function is the sum of the distances that the parcels must be carried. If the robot could carry multiple parcels at once, this sum may not be an underestimate of the actual cost.

The h function can be extended to be applicable to (non-empty) paths. The heuristic value of a path is the heuristic value of the node at the end of the path. That is:

h(⟨n0,...,nk⟩) = h(nk)

A simple use of a heuristic function is to order the neighbors that are added to the stack representing the frontier in depth-first search. The neighbors can be added to the frontier so that the best neighbor is selected first. This is known as heuristic depth-first search. This search chooses the locally best path, but it explores all paths from the selected path before it selects another path. Although it is often used, it suffers from the problems of depth-first search.

Another way to use a heuristic function is to always select a path on the frontier with the lowest heuristic value. This is called best-first search. It usually does not work very well; it can follow paths that look promising because they are close to the goal, but the costs of the paths may keep increasing.
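Best-first search as described here can be sketched with a priority queue ordered by heuristic value. The graph and h values below are hypothetical, chosen so that the path that looks best at first turns out to be a dead end. The check that skips nodes already on the current path is an added assumption to keep the sketch from looping on cyclic graphs.

```python
import heapq

# Hypothetical graph (node -> list of neighbors) and heuristic values.
GRAPH = {
    "s": ["a", "b"],
    "a": ["c"],
    "b": ["g"],
    "c": [],
    "g": [],
}
H = {"s": 5, "a": 2, "b": 3, "c": 1, "g": 0}


def best_first_search(graph, h, start, is_goal):
    """Always expand the frontier path whose end node has the lowest h value."""
    frontier = [(h[start], [start])]
    while frontier:
        _, path = heapq.heappop(frontier)
        node = path[-1]
        if is_goal(node):
            return path
        for nbr in graph[node]:
            if nbr not in path:  # assumption: prune cycles along the current path
                heapq.heappush(frontier, (h[nbr], path + [nbr]))
    return None


print(best_first_search(GRAPH, H, "s", lambda n: n == "g"))
```

On this graph the search first follows s, a, c because those nodes have the lowest heuristic values, and only then returns to b to find the goal. Arc costs play no part in the choice, which is why the paths it follows can become expensive.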

Figure 3.8: A graph that is bad for best-first search



PROBLEM CHARACTERISTICS

Heuristic search is a very general method applicable to a large class of problems. It includes a variety of techniques. In order to choose an appropriate method, it is necessary to analyze the problem with respect to the following considerations.

1. Is the problem decomposable?

A very large, composite problem can be solved more easily if it can be broken into smaller problems that are solved recursively. Suppose we want to solve

Ex:- ∫ (x² + 3x + sin²x cos²x) dx

This can be done by breaking it into three smaller problems and solving each by applying specific rules. Adding the results gives the complete solution.

2. Can solution steps be ignored or undone?

Problems fall into three classes: ignorable, recoverable and irrecoverable. This classification refers to the steps of the solution to a problem.

Consider theorem proving. We may proceed by first proving a lemma, and later find that it is of no help. We can still proceed further, since nothing is lost by this redundant step. This is an example of ignorable solution steps.

Now consider the 8 puzzle problem, in which tiles in a tray must be arranged in a specified order. While moving from the start state towards the goal state, we may make a poor move, but we can backtrack and undo the unwanted move. This only involves additional steps, so the solution steps are recoverable.

Lastly, consider the game of chess. If a wrong move is made, it can neither be ignored nor recovered. The only thing to do is to make the best use of the current situation and proceed. This is an example of irrecoverable solution steps.

1. Ignorable problems, in which solution steps can be ignored. Ex:- theorem proving
2. Recoverable problems, in which solution steps can be undone. Ex:- 8 puzzle
3. Irrecoverable problems, in which solution steps can't be undone. Ex:- chess

Knowledge of these classes helps in determining the control structure.
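The decomposition of the integral in the decomposability example can be written out as a worked derivation. The sketch assumes the integrand is x² + 3x + sin²x cos²x and uses the identity sin²x cos²x = (1 − cos 4x)/8 for the third subproblem.

```latex
\begin{aligned}
\int \left(x^2 + 3x + \sin^2 x \cos^2 x\right) dx
  &= \int x^2 \, dx + \int 3x \, dx + \int \sin^2 x \cos^2 x \, dx \\
  &= \frac{x^3}{3} + \frac{3x^2}{2} + \int \frac{1 - \cos 4x}{8} \, dx \\
  &= \frac{x^3}{3} + \frac{3x^2}{2} + \frac{x}{8} - \frac{\sin 4x}{32} + C
\end{aligned}
```

Each of the three subproblems is solved independently by its own rule, and the results are simply added.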



3. Is the universe predictable?

Problems can be classified into those with a certain outcome (the 8 puzzle and water jug problems) and those with an uncertain outcome (playing cards). In certain-outcome problems, planning can be used to generate a sequence of operators that is guaranteed to lead to a solution. Planning helps to avoid unwanted solution steps. For uncertain-outcome problems, planning can at best generate a sequence of operators that has a good probability of leading to a solution. Uncertain-outcome problems do not guarantee a solution, and planning for them is often very expensive, since the number of solution paths to be explored increases exponentially with the number of points at which the outcome cannot be predicted. Thus one of the hardest types of problems to solve is the irrecoverable, uncertain-outcome problem (Ex:- playing cards).

4. Is a good solution absolute or relative? (Is the solution a state or a path?)

There are two categories of problems. In one, like the water jug and 8 puzzle problems, we are satisfied with any solution, regardless of the solution path taken. In the other category, not just any solution is acceptable. We want the best one, as in the traveling salesman problem, where the solution is the shortest path. In any-path problems, we obtain a solution by heuristic methods and do not explore alternatives. For best-path problems, all possible paths are explored using an exhaustive search until the best path is obtained.

5. Is the knowledge base consistent?

In some problems the knowledge base is consistent and in some it is not. For example, consider the case when a Boolean expression is evaluated. The knowledge base then contains theorems and laws of Boolean algebra, which are always true. In contrast, consider a knowledge base that contains facts about production and cost. These keep varying with time. Hence many reasoning schemes that work well in consistent domains are not appropriate in inconsistent domains. Ex:- Boolean expression evaluation.

6. What is the role of knowledge?

Even if one had unlimited computing power, the size of the knowledge base available for solving the problem would still matter in arriving at a good solution. Take, for example, the game of chess: just the rules for determining legal moves and some simple control mechanism are sufficient to arrive at a solution. But additional knowledge about good strategy and tactics can help to constrain the search and speed up the execution of the program. The solution would then be realistic. Now consider the case of predicting a political trend. This would require an enormous amount of knowledge even to be able to recognize a solution, let alone the best one.



Ex:- 1. Playing chess 2. Newspaper understanding

7. Does the task require interaction with a person?

Problems can again be categorized under two heads.

1. Solitary, in which the computer is given a problem description and produces an answer, with no intermediate communication and with no demand for an explanation of the reasoning process. Simple theorem proving falls under this category. Given the basic rules and laws, the theorem can be proved, if a proof exists. Ex:- theorem proving (give basic rules and laws to the computer)

2. Conversational, in which there is intermediate communication between a person and the computer, either to provide additional assistance to the computer or to provide additional information to the user, or both. Problems such as medical diagnosis fall under this category, where people will be unwilling to accept the verdict of the program if they cannot follow its reasoning. Ex:- problems such as medical diagnosis

8. Problem classification

When actual problems are examined from the point of view of all these questions, they fall into broad classes. In classification problems, the task is to examine an input and decide which of a set of known classes it belongs to. Ex:- problems such as medical diagnosis, engineering design



FORMALIZING GRAPH SEARCHING

A directed graph consists of

a set N of nodes and a set A of ordered pairs of nodes called arcs.

In this definition, a node can be anything. All this definition does is constrain arcs to be ordered pairs of nodes. There can be infinitely many nodes and arcs. We do not assume that the graph is represented explicitly; we require only a procedure that can generate nodes and arcs as needed.

The arc ⟨n1,n2⟩ is an outgoing arc from n1 and an incoming arc to n2.

A node n2 is a neighbor of n1 if there is an arc from n1 to n2; that is, if ⟨n1,n2⟩∈A. Note that being a neighbor does not imply symmetry; just because n2 is a neighbor of n1 does not mean that n1 is necessarily a neighbor of n2. Arcs may be labeled, for example, with the action that will take the agent from one state to another.

A path from node s to node g is a sequence of nodes ⟨n0, n1,..., nk⟩ such that s=n0, g=nk, and ⟨ni-1,ni⟩∈A; that is, there is an arc from ni-1 to ni for each i. Sometimes it is useful to view a path as the sequence of arcs, ⟨n0,n1⟩, ⟨n1,n2⟩,..., ⟨nk-1,nk⟩, or a sequence of labels of these arcs.

A cycle is a nonempty path such that the end node is the same as the start node - that is, a cycle is a path ⟨n0, n1,..., nk⟩ such that n0=nk and k≠0. A directed graph without any cycles is called a directed acyclic graph (DAG). This should probably be an acyclic directed graph, because it is a directed graph that happens to be acyclic, not an acyclic graph that happens to be directed, but DAG sounds better than ADG!

A tree is a DAG where there is one node with no incoming arcs and every other node has exactly one incoming arc. The node with no incoming arcs is called the root of the tree and nodes with no outgoing arcs are called leaves.

To encode problems as graphs, one set of nodes is referred to as the start nodes and another set is called the goal nodes. A solution is a path from a start node to a goal node.

Sometimes there is a cost - a positive number - associated with arcs. We write the cost of arc ⟨ni,nj⟩ as cost(⟨ni,nj⟩). The costs of arcs induce a cost of paths.

Given a path p = ⟨n0, n1,..., nk⟩, the cost of path p is the sum of the costs of the arcs in the path:

cost(p) = cost(⟨n0,n1⟩) + ...+ cost(⟨nk-1,nk⟩)

An optimal solution is one of the least-cost solutions; that is, it is a path p from a start node to a goal node such that there is no path p' from a start node to a goal node where cost(p')<cost(p).
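The definitions above (arcs as ordered pairs, neighbors, path cost, optimal solution) can be sketched directly. The arc set and costs here are hypothetical, and the brute-force enumeration of paths is only practical for a tiny finite graph; it is meant to mirror the definition of an optimal solution, not to be an efficient search method.

```python
# Hypothetical directed graph: arcs are ordered pairs with positive costs.
ARCS = {("a", "b"): 2, ("b", "c"): 3, ("a", "c"): 7}


def neighbors(node, arcs=ARCS):
    """n2 is a neighbor of n1 if the arc (n1, n2) is in the arc set."""
    return [n2 for (n1, n2) in arcs if n1 == node]


def path_cost(path, arcs=ARCS):
    """Sum of the costs of the arcs (n_{i-1}, n_i) along the path."""
    return sum(arcs[(path[i - 1], path[i])] for i in range(1, len(path)))


def all_paths(start, goal, arcs=ARCS, path=None):
    """Yield every path from start to goal that does not repeat a node."""
    path = path or [start]
    if path[-1] == goal:
        yield path
        return
    for nxt in neighbors(path[-1], arcs):
        if nxt not in path:
            yield from all_paths(start, goal, arcs, path + [nxt])


def optimal_solution(start, goal, arcs=ARCS):
    """A least-cost path from start to goal, or None if no path exists."""
    return min(all_paths(start, goal, arcs),
               key=lambda p: path_cost(p, arcs), default=None)


print(path_cost(["a", "b", "c"]), path_cost(["a", "c"]))
print(optimal_solution("a", "c"))
```

Because all arc costs are positive, a least-cost path never needs to revisit a node, so restricting the enumeration to paths without repeated nodes does not miss an optimal solution.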



Figure 3.2: A graph with arc costs for the delivery robot domain



Production System Types of Production Systems.

A production system is a knowledge representation formalism consisting of a collection of condition-action rules (production rules or operators), a database which is modified in accordance with the rules, and a production system interpreter which controls the operation of the rules. The interpreter is the control mechanism of a production system: it determines the order in which production rules are fired. A system that uses this form of knowledge representation is called a production system.

A production system consists of rules and facts. Knowledge is encoded in a declarative form which comprises a set of rules of the form

Situation → Action

that is, SITUATION implies ACTION.

Example:- IF the initial state is a goal state THEN quit.

The major components of an AI production system are

i. A global database
ii. A set of production rules
iii. A control system

The global database is the central data structure used by an AI production system. The production rules operate on the global database. Each rule has a precondition that is either satisfied or not by the database. If the precondition is satisfied, the rule can be applied. Application of the rule changes the database. The control system chooses which applicable rule should be applied and ceases computation when a termination condition on the database is satisfied. If several rules could fire at the same time, the control system resolves the conflict.

Four classes of production systems:-

1. A monotonic production system
2. A non-monotonic production system
3. A partially commutative production system
4. A commutative production system

Advantages of production systems:-

1. Production systems provide an excellent tool for structuring AI programs.

2. Production systems are highly modular because the individual rules can be added, removed or modified independently.

3. The production rules are expressed in a natural form, so the statements contained in the knowledge base should read like a recording of an expert thinking out loud.

Disadvantages of production systems:-

One important disadvantage is that it may be very difficult to analyse the flow of control within a production system, because the individual rules don't call each other.

Production systems describe the operations that can be performed in a search for a solution to the problem. They can be classified as follows.

Monotonic production system:- A system in which the application of a rule never prevents the later application of another rule that could also have been applied at the time the first rule was selected. A non-monotonic production system is one in which this is not true.

Partially commutative production system:- A production system with the property that if the application of a particular sequence of rules transforms state X into state Y, then any permutation of those rules that is allowable also transforms state X into state Y. A commutative production system is one that is both monotonic and partially commutative.

Theorem proving falls under monotonic, partially commutative systems. Blocks world and 8 puzzle problems come under non-monotonic, partially commutative systems. Problems like chemical analysis and synthesis come under monotonic, not partially commutative systems. Playing the game of bridge comes under non-monotonic, not partially commutative systems.

For any problem, several production systems exist. Some will be more efficient than others. Though it may seem that there is no relationship between kinds of problems and kinds of production systems, in practice there is a definite relationship.

Partially commutative, monotonic production systems are useful for solving ignorable problems. These systems are important from an implementation standpoint because they can be implemented without the ability to backtrack to previous states when it is discovered that an incorrect path was followed. Such systems increase efficiency, since it is not necessary to keep track of the changes made in the search process.

Non-monotonic, partially commutative systems are useful for problems in which changes occur but can be reversed and in which the order of operations is not critical (ex: 8 puzzle problem).

Production systems that are not partially commutative are useful for many problems in which irreversible changes occur, such as chemical analysis. When dealing with such systems, the order in which operations are performed is very important, and hence correct decisions have to be made the first time.



Control or Search Strategy :

Selecting rules; keeping track of those sequences of rules that have already been tried and the states produced by them.

Goal state provides a basis for the termination of the problem solving task.

1- PATTERN MATCHING STAGE

Execution of a rule requires a match:

preconditions of a rule <== match ==> content of the working memory

When a match is found, the rule is applicable. Several rules may be applicable.

2- CONFLICT RESOLUTION (SELECTION STRATEGY) STAGE

Selecting one rule to execute.

3- ACTION STAGE

Applying the action part of the rule => changing the content of the workspace => new patterns, new matches => new set of rules eligible for execution.

These three stages form the recognize-act control cycle.
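The three stages can be sketched as a small loop. This is a minimal illustration, not a full production system interpreter: working memory is assumed to be a set of facts, each hypothetical rule is a triple (name, preconditions, facts to add), and conflict resolution is assumed to be "first applicable rule in listing order".

```python
# Hypothetical rules: (name, precondition facts, facts added by the action).
RULES = [
    ("boil", {"kettle", "water"}, {"hot_water"}),
    ("brew", {"hot_water", "tea_bag"}, {"tea"}),
]


def recognize_act(memory, rules, goal):
    """Run the recognize-act cycle until the goal fact appears or no rule applies."""
    memory = set(memory)
    fired = []
    while goal not in memory:
        # 1. Pattern matching: rules whose preconditions hold and whose
        #    action would still change the working memory.
        conflict_set = [r for r in rules
                        if r[1] <= memory and not r[2] <= memory]
        if not conflict_set:
            return None, fired  # no applicable rule: give up
        # 2. Conflict resolution (assumed strategy): take the first rule.
        name, _, additions = conflict_set[0]
        # 3. Action: update the working memory, enabling new matches.
        memory |= additions
        fired.append(name)
    return memory, fired


final_memory, fired = recognize_act({"kettle", "water", "tea_bag"}, RULES, "tea")
print(fired)
```

Firing "boil" adds a fact that makes "brew" applicable on the next cycle, which is the "new patterns, new matches" effect described in the action stage. Real systems use richer conflict resolution strategies than listing order, such as rule priority or recency.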



Search Strategies

Uninformed search strategies have no additional information about states beyond that provided in the problem definition.

Strategies that know whether one non-goal state is "more promising" than another are called informed search or heuristic search strategies.

There are five uninformed search strategies as given below.

o Breadth-first search

o Uniform-cost search

o Depth-first search

o Depth-limited search

o Iterative deepening search
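Three of these strategies can be sketched on a small graph: breadth-first search, depth-limited search, and iterative deepening built on top of depth-limited search. The graph is hypothetical, and skipping nodes already on the current path is an added assumption to avoid looping on cycles.

```python
from collections import deque

# Hypothetical graph: node -> list of neighbors.
GRAPH = {"s": ["a", "b"], "a": ["c"], "b": ["g"], "c": ["g"], "g": []}


def breadth_first_search(graph, start, goal):
    """Expand the shallowest unexpanded path first (FIFO frontier)."""
    frontier = deque([[start]])
    while frontier:
        path = frontier.popleft()
        if path[-1] == goal:
            return path
        for nbr in graph[path[-1]]:
            if nbr not in path:
                frontier.append(path + [nbr])
    return None


def depth_limited_search(graph, path, goal, limit):
    """Depth-first search that never extends a path beyond `limit` more arcs."""
    if path[-1] == goal:
        return path
    if limit == 0:
        return None
    for nbr in graph[path[-1]]:
        if nbr not in path:
            found = depth_limited_search(graph, path + [nbr], goal, limit - 1)
            if found:
                return found
    return None


def iterative_deepening_search(graph, start, goal, max_depth=10):
    """Repeat depth-limited search with limits 0, 1, 2, ... up to max_depth."""
    for limit in range(max_depth + 1):
        found = depth_limited_search(graph, [start], goal, limit)
        if found:
            return found
    return None


print(breadth_first_search(GRAPH, "s", "g"))
print(iterative_deepening_search(GRAPH, "s", "g"))
```

Neither function looks at anything except the graph structure and the goal test, which is what makes them uninformed. Uniform-cost search would additionally order the frontier by path cost, which this graph does not define.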

Problem characteristics

Analyze each of the following problems with respect to the seven problem characteristics:

o Chess
o Water jug
o 8-puzzle
o Traveling salesman
o Missionaries and cannibals
o Tower of Hanoi

1. Chess

Problem characteristic | Satisfied | Reason
Is the problem decomposable? | No | One game has a single solution.
Can solution steps be ignored or undone? | No | In an actual game (not on a PC) we can't undo previous moves.
Is the problem universe predictable? | No | The problem universe is not predictable, as we are not sure about the moves of the other (second) player.
Is a good solution absolute or relative? | Absolute | Absolute solution: once you get one solution, you do not need to bother about other possible solutions. Relative solution: once you get one solution, you have to find other possible solutions to check which one is best (i.e. lowest cost). By this definition, chess is absolute.
Is the solution a state or a path? | Path | In natural language understanding, some words have several interpretations, so a sentence may be ambiguous. To solve that problem we need only find the interpretation; the workings (i.e. the path to the solution) are not needed. In chess, by contrast, the winning state (goal state) is described by the path to that state.
What is the role of knowledge? | | A lot of knowledge helps to constrain the search for a solution.
Does the task require human interaction? | No | A conversational task is one in which there is intermediate communication between a person and the computer, either to provide additional assistance to the computer or to provide additional information to the user, or both. In chess, additional assistance is not required.

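2. Water jug

Problem characteristic | Satisfied | Reason

Is the problem decomposable? | No | There is one single solution.

Can solution steps be ignored or undone? | Yes |

Is the problem universe predictable? | Yes | The problem universe is predictable because solving this problem requires only one person; we can predict what will happen in the next step.

Is a good solution absolute or relative? | Absolute | The water jug problem may have a number of solutions, but once we have found one solution there is no need to bother about the others, because it does not affect the cost.

Is the solution a state or a path? | Path | The solution is the path to the goal state.

What is the role of knowledge? | - | A lot of knowledge helps to constrain the search for a solution.

Does the task require human interaction? | Yes | Additional assistance is required, for example to get the jugs or the pump.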

3. 8 puzzle

Problem characteristic | Satisfied | Reason

Can solution steps be ignored or undone? | Yes | We can undo the previous move.

Is the problem universe predictable? | Yes | The problem universe is predictable because solving this problem requires only one person; we can predict what the position of the blocks will be after the next move.

Is a good solution absolute or relative? | Absolute | Absolute solution: once you get one solution, you do not need to bother about other possible solutions. Relative solution: once you get one solution, you have to find the other possible solutions to check which one is best (i.e. lowest cost). By this criterion the 8 puzzle is absolute.

Is the solution a state or a path? | Path | Unlike natural language understanding, where only the interpretation is needed and the path to it is not, in the 8 puzzle the solution is the path that leads to the goal state.

Does the task require human interaction? | No | A task is conversational when there is intermediate communication between a person and the computer, either to provide additional assistance to the computer or to provide additional information to the user, or both. In the 8 puzzle, additional assistance is not required.

4. Travelling Salesman (TSP)

Problem characteristic | Satisfied | Reason

Is the problem decomposable? | No | There is a single solution.

Can solution steps be ignored or undone? | Yes |

Is the problem universe predictable? | Yes |

Is a good solution absolute or relative? | Relative | Absolute solution: once you get one solution, you do not need to bother about other possible solutions. Relative solution: once you get one solution, you have to find the other possible solutions to check which one is best (i.e. lowest cost). TSP asks for the lowest-cost tour, so by this criterion TSP is relative.

Is the solution a state or a path? | Path | Unlike natural language understanding, where only the interpretation is needed and the path to it is not, in TSP the solution is the path (the tour) itself.

What is the role of knowledge? | - | A lot of knowledge helps to constrain the search for a solution.

Does the task require human interaction? | No | A task is conversational when there is intermediate communication between a person and the computer, either to provide additional assistance to the computer or to provide additional information to the user, or both. In TSP, additional assistance is not required.

5. Missionaries and cannibals

Problem characteristic | Satisfied | Reason

Can solution steps be ignored or undone? | Yes |

Is the problem universe predictable? | Yes | The problem universe is predictable: there is no second player, so we can predict the result of each move.

Is a good solution absolute or relative? | Absolute | Absolute solution: once you get one solution, you do not need to bother about other possible solutions. Relative solution: once you get one solution, you have to find the other possible solutions to check which one is best (i.e. lowest cost). By this criterion the problem is absolute.

Is the solution a state or a path? | Path | Unlike natural language understanding, where only the interpretation is needed and the path to it is not, here the solution is the path that leads to the goal state.

Does the task require human interaction? | Yes | A task is conversational when there is intermediate communication between a person and the computer, either to provide additional assistance to the computer or to provide additional information to the user, or both. Here additional assistance is required to move the missionaries to the other side of the river.

6. Tower of Hanoi

Problem characteristic | Satisfied | Reason

Can solution steps be ignored or undone? | Yes |

Is the problem universe predictable? | Yes |

Is a good solution absolute or relative? | Absolute | Absolute solution: once you get one solution, you do not need to bother about other possible solutions. Relative solution: once you get one solution, you have to find the other possible solutions to check which one is best (i.e. lowest cost). By this criterion Tower of Hanoi is absolute.

Is the solution a state or a path? | Path | Unlike natural language understanding, where only the interpretation is needed and the path to it is not, in Tower of Hanoi the solution is the path that leads to the goal state.

Does the task require human interaction? | No | A task is conversational when there is intermediate communication between a person and the computer, either to provide additional assistance to the computer or to provide additional information to the user, or both. In Tower of Hanoi, additional assistance is not required.



Measuring Performance of Algorithms

There are two aspects of algorithmic performance:

• Time
- Instructions take time.
- How fast does the algorithm perform?
- What affects its runtime?

• Space
- Data structures take space.
- What kind of data structures can be used?
- How does the choice of data structure affect the runtime?

Algorithms cannot be compared simply by running them on computers: run time is system dependent, and even on the same computer it depends on the language used. Real time units such as microseconds are therefore not used. We are generally concerned with how the amount of work varies with the data.



Measuring Time Complexity

Time complexity is measured by counting the number of operations the algorithm needs to handle n items. The comparison is meaningful for very large values of n.

Complexity of Linear Search

Consider the task of searching a list to see if it contains a particular value.

• A useful search algorithm should be general.
• Work done varies with the size of the list.
• What can we say about the work done for a list of any length?

    i = 0;
    while (i < MAX && this_array[i] != target)
        i = i + 1;
    if (i < MAX)
        printf("Yes, target is there\n");
    else
        printf("No, target isn't there\n");

The work involved: checking the target value against each of the n elements.

Number of operations: 1 (best case), n (worst case), n/2 (average case).

Computer scientists tend to be concerned about the worst case complexity. The worst case guarantees that the performance of the algorithm will be at least as good as the analysis indicates.

Average case complexity: it is the best statistical estimate of actual performance, and tells us how well an algorithm performs if you average the behavior over all possible sets of input data. However, it requires considerable mathematical sophistication to do the average case analysis.
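The best, worst and average counts can be checked by instrumenting the search. The sketch below is illustrative only: the function names are made up for this example, and the average is taken over successful searches in a list of distinct values, which gives (n + 1)/2, i.e. roughly n/2.

```c
#include <assert.h>
#include <stddef.h>

/* Returns how many element comparisons linear search performs
   while looking for target in a[0..n-1]. */
size_t linear_search_comparisons(const int *a, size_t n, int target)
{
    size_t comparisons = 0;
    assert(a != NULL || n == 0);
    for (size_t i = 0; i < n; ++i) {
        ++comparisons;
        if (a[i] == target)
            break;              /* found: stop early */
    }
    return comparisons;
}

/* Average number of comparisons over one successful search
   for each element of a (assumes the values are distinct). */
double average_successful(const int *a, size_t n)
{
    size_t total = 0;
    for (size_t k = 0; k < n; ++k)
        total += linear_search_comparisons(a, n, a[k]);
    return n ? (double)total / (double)n : 0.0;
}
```

For a four-element list the best case costs 1 comparison, the worst case (last element or absent value) costs 4, and the average over successful searches is 2.5.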



Algorithm Analysis: Loops

Consider an n x n two-dimensional array. Write a loop to store the row sums in a one-dimensional array rows and the overall total in grandTotal.

LOOP 1:

    grandTotal = 0;
    for (k = 0; k < n; ++k) {
        rows[k] = 0;
        for (j = 0; j < n; ++j) {
            rows[k] = rows[k] + matrix[k][j];
            grandTotal = grandTotal + matrix[k][j];
        }
    }

It takes 2n^2 addition operations.

LOOP 2:

    grandTotal = 0;
    for (k = 0; k < n; ++k) {
        rows[k] = 0;
        for (j = 0; j < n; ++j)
            rows[k] = rows[k] + matrix[k][j];
        grandTotal = grandTotal + rows[k];
    }

This one takes n^2 + n addition operations.
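The two operation counts can be confirmed by counting additions while the loops run. The following sketch assumes, for illustration only, that the matrix is stored row by row in a flat array and that both loops visit all n rows and n columns; the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Loop 1: every element is added to its row sum and to the grand total.
   matrix holds n*n values row by row. Returns the number of additions. */
size_t row_sums_loop1(const int *matrix, size_t n, long *rows, long *grand_total)
{
    size_t additions = 0;
    assert(matrix != NULL && rows != NULL && grand_total != NULL);
    *grand_total = 0;
    for (size_t k = 0; k < n; ++k) {
        rows[k] = 0;
        for (size_t j = 0; j < n; ++j) {
            rows[k] += matrix[k * n + j];
            *grand_total += matrix[k * n + j];
            additions += 2;
        }
    }
    return additions;
}

/* Loop 2: the grand total is built from the finished row sums. */
size_t row_sums_loop2(const int *matrix, size_t n, long *rows, long *grand_total)
{
    size_t additions = 0;
    assert(matrix != NULL && rows != NULL && grand_total != NULL);
    *grand_total = 0;
    for (size_t k = 0; k < n; ++k) {
        rows[k] = 0;
        for (size_t j = 0; j < n; ++j) {
            rows[k] += matrix[k * n + j];
            ++additions;
        }
        *grand_total += rows[k];
        ++additions;
    }
    return additions;
}
```

For n = 3 the first version performs 18 additions (2n^2) and the second 12 (n^2 + n), while both produce the same row sums and grand total.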



Big-O Notation

We want to understand how the performance of an algorithm responds to changes in problem size. Basically the goal is to provide a qualitative insight. The Big-O notation is a way of measuring the order of magnitude of a mathematical expression: O(n) means "on the order of n".

Consider f(n) = n^4 + 31n^2 + 10. The idea is to reduce the formula in the parentheses so that it captures the qualitative behavior in the simplest possible terms. We eliminate any term whose contribution to the total ceases to be significant as n becomes large. We also eliminate any constant factors, as these have no effect on the overall pattern as n increases. Thus we may approximate f(n) above as

O(n^4 + 31n^2 + 10) = O(n^4)

Let g(n) = n^4. Then the order of f(n) is O[g(n)].

Definition: f(n) is O(g(n)) if there exist positive numbers c and N such that f(n) <= c g(n) for all n >= N. That is, f is big-O of g if there is a c such that f is not larger than c g for sufficiently large values of n (greater than N).

c g(n) is an upper bound on the value of f(n). That is, the number of operations is at worst proportional to g(n) for all large values of n.

How does one determine c and N? Let f(n) = 2n^2 + 3n + 1 = O(n^2). Now 2n^2 + 3n + 1 <= c n^2, or 2 + (3/n) + (1/n^2) <= c. You want to find c such that a term in f becomes the largest and stays the largest. Compare the first and second terms: the first overtakes the second at N = 2, so for N = 2, c >= 3.75; for N = 5, c >= 2.64; and for very large values of n, c is almost 2. g is almost always >= f if it is multiplied by a constant c.

Look at it another way: suppose you want to find the weight of the elephants, cats and ants in a jungle. Irrespective of how many of each there are, the net weight would be proportional to the weight of an elephant.

Incidentally, we can also say f is big-O not only of n^2 but also of n^3, n^4, n^5, etc. (How?)
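The choice of c and N for f(n) = 2n^2 + 3n + 1 can be checked numerically. The sketch below is only a finite spot check over a range of n, not a proof, and the helper names `min_c_at` and `bound_holds` are made up for this example.

```c
#include <assert.h>

/* f(n) = 2n^2 + 3n + 1, the example function from the text. */
static double f(double n) { return 2.0 * n * n + 3.0 * n + 1.0; }

/* Smallest c for which f(n) <= c * n^2 holds at this particular n,
   i.e. 2 + 3/n + 1/n^2. */
double min_c_at(double n)
{
    assert(n > 0.0);
    return f(n) / (n * n);
}

/* Returns 1 if f(n) <= c * n^2 for every integer n in [N, limit]. */
int bound_holds(double c, long N, long limit)
{
    assert(N >= 1 && limit >= N);
    for (long n = N; n <= limit; ++n)
        if (f((double)n) > c * (double)n * (double)n)
            return 0;
    return 1;
}
```

With c = 3.75 the bound holds from N = 2 onward but fails at n = 1, and no N works for c = 2, because 2 + 3/n + 1/n^2 only approaches 2 from above.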



Loop 1 and Loop 2 are both in the same big-O category: O(n^2).

Properties of Big-O notation:

• O(n) + O(m) = O(n) if n >= m.
• The function log n to base a is of order O(log n to base b) for any values of a and b (you can show that logarithms to different bases are constant multiples of each other).

Linear search algorithm:

• Best case: it's the first value. "Order 1", O(1).
• Worst case: it's the last value, n. "Order n", O(n).
• Average: n/2 (if the value is present). "Order n", O(n).

Example 1: Use big-O notation to analyze the time efficiency of the following fragment of C code:

    for (k = 1; k <= n/2; k++) {
        .
        .
        for (j = 1; j <= n*n; j++) {
            .
            .
        }
    }

Since these loops are nested, the efficiency is n^3/2, or O(n^3) in big-O terms. Thus, for two loops with O[f1(n)] and O[f2(n)] efficiencies, the efficiency of the nesting of these two loops is O[f1(n) * f2(n)].

Example 2: Use big-O notation to analyze the time efficiency of the following fragment of C code:

    for (k = 1; k <= n/2; k++) {
        .
        .
    }
    for (j = 1; j <= n*n; j++) {
        .
        .
    }

The number of operations executed by these loops is the sum of the individual loop efficiencies. Hence, the efficiency is n/2 + n^2, or O(n^2) in big-O terms.

Thus, for two loops with O[f1(n)] and O[f2(n)] efficiencies, the efficiency of the sequencing of these two loops is O[fD(n)], where fD(n) is the dominant of the functions f1(n) and f2(n).
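The difference between nesting and sequencing can be made concrete by counting loop-body executions. This is a small sketch with hypothetical function names; the elided loop bodies are replaced by a counter.

```c
#include <assert.h>

/* Counts inner-body executions when the two loops are nested. */
unsigned long nested_iterations(unsigned long n)
{
    unsigned long count = 0;
    assert(n <= 1000);                       /* keeps n*n and the count small */
    for (unsigned long k = 1; k <= n / 2; k++)
        for (unsigned long j = 1; j <= n * n; j++)
            ++count;
    return count;                            /* (n/2) * n^2 */
}

/* Counts body executions when the same two loops run one after the other. */
unsigned long sequential_iterations(unsigned long n)
{
    unsigned long count = 0;
    assert(n <= 1000);
    for (unsigned long k = 1; k <= n / 2; k++)
        ++count;
    for (unsigned long j = 1; j <= n * n; j++)
        ++count;
    return count;                            /* n/2 + n^2 */
}
```

For n = 10 the nested version executes its body 500 times and the sequential version 105 times, matching n^3/2 and n/2 + n^2.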






Order Notation

How much work does it take to find the target in a list containing N elements? Note: we care here only about the growth rate of work. Thus, we toss out all constant values.

• Best case: work is constant; it does not grow with the size of the list.
• Worst and average cases: work is proportional to the size of the list, N.

O(1) or "Order One": constant time
- does not mean that it takes only one operation
- does mean that the work doesn't change as N changes
- is a notation for "constant work"

O(n) or "Order n": linear time
- does not mean that it takes N operations
- does mean that the work changes in a way that is proportional to N
- is a notation for "work grows at a linear rate"

O(n^2) or "Order n^2": quadratic time

O(n^3) or "Order n^3": cubic time

Algorithms whose efficiency can be expressed in terms of a polynomial of the form

a_m n^m + a_(m-1) n^(m-1) + ... + a_2 n^2 + a_1 n + a_0

are called polynomial algorithms, of order O(n^m).

Some algorithms even take less time than the number of elements in the problem. There is a notion of logarithmic time algorithms. We know 10^3 = 1000, so we can write log10(1000) = 3.



Similarly, since 2^6 = 64, we can write log2(64) = 6. If the work of an algorithm can be reduced by half in one step, and in k steps we are able to solve the problem, then 2^k = n, or in other words log2(n) = k. Such an algorithm has logarithmic time complexity, usually written as O(log n). Because log_a(n) increases much more slowly than n itself, logarithmic algorithms are generally very efficient. It can also be shown that it does not matter which base is chosen.

Example 3: Use big-O notation to analyze the time efficiency of the following fragment of C code:

    k = n;
    while (k > 1) {
        .
        .
        k = k/2;
    }

Since the loop variable is cut in half each time through the loop, the number of times the statements inside the loop will be executed is log2(n) (rounded down when n is not a power of 2). Thus, an algorithm that halves the data remaining to be processed on each iteration of a loop will be an O(log2 n) algorithm.

There are a large number of algorithms whose complexity is O(n log2 n).

Finally, there are algorithms whose efficiency is dominated by a term of the form a^n. These are called exponential algorithms. They are of more theoretical than practical interest because they cannot reasonably run on typical computers for moderate values of n.
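The halving loop of Example 3 can be run with a step counter to see the logarithmic growth. In this sketch the function name `halving_steps` is an illustrative choice and the elided loop body is replaced by the counter.

```c
#include <assert.h>

/* Counts how many times the body of the halving loop runs for a given n. */
unsigned halving_steps(unsigned long n)
{
    unsigned steps = 0;
    unsigned long k = n;
    assert(n >= 1);
    while (k > 1) {
        k = k / 2;      /* the remaining work is cut in half */
        ++steps;
    }
    return steps;       /* floor(log2(n)) */
}
```

A list of 64 items needs 6 steps and one of 1,048,576 items needs only 20, which is why logarithmic algorithms scale so well.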



Comparison of N, log N and N^2

N                O(log N)   O(N^2)
16               4          256
64               6          4K
256              8          64K
1,024            10         1M
16,384           14         256M
131,072          17         16G
262,144          18         6.87E+10
524,288          19         2.75E+11
1,048,576        20         1.10E+12
1,073,741,824    30         1.15E+18



Constraint Satisfaction

Constraint satisfaction is the process of finding a solution to a set of constraints that impose conditions that the variables must satisfy. A solution is therefore a set of values for the variables that satisfies all constraints—that is, a point in the feasible region.

The techniques used in constraint satisfaction depend on the kind of constraints being considered. Constraints on a finite domain are used so often that constraint satisfaction problems are typically identified with problems based on constraints on a finite domain. Such problems are usually solved via search, in particular a form of backtracking or local search. Constraint propagation methods are also used on such problems; most of them are incomplete in general, that is, they may solve the problem or prove it unsatisfiable, but not always. Constraint propagation methods are also used in conjunction with search to make a given problem simpler to solve. Other kinds of constraints considered are those on real or rational numbers; problems on these constraints are solved via variable elimination or the simplex algorithm.

Complexity

Solving a constraint satisfaction problem on a finite domain is an NP-complete problem with respect to the domain size. Research has shown a number of tractable subcases, some limiting the allowed constraint relations, some requiring the scopes of constraints to form a tree, possibly in a reformulated version of the problem. Research has also established the relationship of the constraint satisfaction problem with problems in other areas such as finite model theory.
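Backtracking search over a finite domain can be sketched with a tiny map-colouring problem. Everything specific here is an assumption made for illustration: the four regions, the adjacency table and the three colours are hypothetical, and variables are simply assigned in index order.

```c
#include <assert.h>

#define N_VARS    4   /* hypothetical problem: 4 regions to colour */
#define N_COLOURS 3   /* finite domain {0, 1, 2} */

/* adjacent[i][j] == 1 means regions i and j must get different colours. */
static const int adjacent[N_VARS][N_VARS] = {
    {0, 1, 1, 0},
    {1, 0, 1, 1},
    {1, 1, 0, 1},
    {0, 1, 1, 0}
};

/* Checks the constraints between var and the variables assigned before it. */
static int consistent(const int *assignment, int var, int colour)
{
    for (int other = 0; other < var; ++other)
        if (adjacent[var][other] && assignment[other] == colour)
            return 0;
    return 1;
}

/* Assigns variables var..N_VARS-1 depth-first; returns 1 on success. */
int backtrack(int *assignment, int var)
{
    assert(assignment != 0 && var >= 0 && var <= N_VARS);
    if (var == N_VARS)
        return 1;                          /* every variable has a value */
    for (int colour = 0; colour < N_COLOURS; ++colour) {
        if (consistent(assignment, var, colour)) {
            assignment[var] = colour;
            if (backtrack(assignment, var + 1))
                return 1;
        }
    }
    return 0;                              /* dead end: caller tries its next value */
}

/* Verifies that a complete assignment satisfies every constraint. */
int solution_is_valid(const int *assignment)
{
    for (int i = 0; i < N_VARS; ++i)
        for (int j = 0; j < i; ++j)
            if (adjacent[i][j] && assignment[i] == assignment[j])
                return 0;
    return 1;
}
```

Calling `backtrack(assignment, 0)` on this instance finds the colouring 0, 1, 2, 0 without ever having to undo a choice; harder instances would exercise the dead-end branch.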



UNIT 02

GAME PLAYING

Introduction

Game playing has been a major topic of AI since the very beginning. Besides the attraction of the topic to people, this is because of its close relation to "intelligence" and its well-defined states and rules.

The most commonly used AI technique in games is search. In some other problem-solving activities, state change is solely caused by the action of the system itself. However, in multi-player games, states also depend on the actions of other players (systems), who usually have different goals.

A special situation that has been studied most is the two-person zero-sum game, where the two players have exactly opposite goals; that is, each state can be evaluated by a score from one player's viewpoint, and the other's viewpoint is exactly the opposite. This type of game is common and easy to analyze, though not all competitions are zero-sum!

There are perfect information games (such as Chess and Go) and imperfect information games (such as Bridge and games where dice are used). Given sufficient time and space, an optimum solution can usually be obtained for the former by exhaustive search, though not for the latter. However, for most interesting games, such a solution is usually too inefficient to be practically used.

Minimax Procedure

For a two-person zero-sum perfect-information game, if the two players take turns to move, the minimax procedure can solve the problem given sufficient computational resources. This algorithm assumes each player takes the best move in each step.

First, we distinguish two types of nodes, MAX and MIN, in the state graph, determined by the depth of the search tree.

Minimax procedure: start from the leaves of the tree (with final scores with respect to one player, MAX), and go backwards towards the root (the starting state).

At each step, one player (MAX) takes the action that leads to the highest score, while the other player (MIN) takes the action that leads to the lowest score.

All nodes in the tree will be scored, and the path from the root to the actual result is the one on which all nodes have the same score.



Example:

Because of limited computational resources, the search depth is usually restricted, and estimated scores generated by a heuristic function are used in place of the actual scores in the above procedure.

Example: Tic-tac-toe, with the difference of possible win paths as the heuristic function.
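The backing-up of scores can be sketched in C for a complete game tree whose leaf scores are given in an array. The tree layout (fixed branching factor, leaves listed left to right) is an assumption chosen to keep the example short; a depth-limited player would fill the array with heuristic estimates instead of final scores.

```c
#include <assert.h>
#include <stddef.h>

/* Minimax value of a complete tree with the given branching factor and depth.
   leaves points at the leaf scores (from MAX's viewpoint) of this subtree,
   listed left to right; is_max says whose turn it is at the subtree root. */
int minimax(const int *leaves, int branching, int depth, int is_max)
{
    assert(leaves != NULL && branching >= 1 && depth >= 0);
    if (depth == 0)
        return leaves[0];                       /* leaf: its score */

    int subtree = 1;                            /* leaves under each child */
    for (int i = 1; i < depth; ++i)
        subtree *= branching;

    int best = minimax(leaves, branching, depth - 1, !is_max);
    for (int c = 1; c < branching; ++c) {
        int v = minimax(leaves + c * subtree, branching, depth - 1, !is_max);
        if (is_max ? (v > best) : (v < best))   /* MAX keeps highest, MIN lowest */
            best = v;
    }
    return best;
}
```

For the leaves 3, 12, 8, 2, 4, 6, 14, 5, 2 with branching factor 3 and depth 2, the three MIN nodes score 3, 2 and 2, so the MAX root scores 3.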

Alpha-Beta Pruning

Very often, the game graph does not need to be fully explored using Minimax.

Based on the scores of explored nodes, inequalities can be set up for nodes whose children haven't been exhaustively explored. Under certain conditions, some branches of the tree can be ignored without changing the final score of the root.

In Alpha-Beta Pruning, each MAX node has an alpha value, which never decreases; each MIN node has a beta value, which never increases. These values are set and updated when the value of a child is obtained. Search is depth-first, and stops at any MIN node whose beta value is smaller than or equal to the alpha value of its parent, as well as at any MAX node whose alpha value is greater than or equal to the beta value of its parent.

Examples: in the following partial trees, the other children of node (5) do not need to be generated.

(1)MAX[>=3] ----- (2)MIN[==3] ----- (3)MAX[==5]
     |                 |----------- (4)MAX[==3]
     |
     |----------- (5)MIN[<=0] ----- (6)MAX[==0]
                       |----------- X
                       |----------- X

(1)MIN[<=5] ----- (2)MAX[==5] ----- (3)MIN[==5]
     |                 |----------- (4)MIN[==3]
     |
     |----------- (5)MAX[>=8] ----- (6)MIN[==8]
                       |----------- X
                       |----------- X

This method is used in a Prolog program that plays Tic-tac-toe.
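The pruning rule can be sketched in C on a complete tree stored as an array of leaf scores. This is not the Prolog program mentioned above; the array layout, the `visited` counter and the function name are assumptions made for illustration, and the example trees are two-child versions of the partial trees shown earlier.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Alpha-beta search on a complete tree with the given branching factor.
   leaves holds this subtree's leaf scores left to right; *visited counts
   how many leaves are actually evaluated. */
int alphabeta(const int *leaves, int branching, int depth,
              int alpha, int beta, int is_max, int *visited)
{
    assert(leaves != NULL && visited != NULL && branching >= 1 && depth >= 0);
    if (depth == 0) {
        ++*visited;
        return leaves[0];
    }

    int subtree = 1;                         /* leaves under each child */
    for (int i = 1; i < depth; ++i)
        subtree *= branching;

    for (int c = 0; c < branching; ++c) {
        int v = alphabeta(leaves + c * subtree, branching, depth - 1,
                          alpha, beta, !is_max, visited);
        if (is_max) {
            if (v > alpha) alpha = v;        /* alpha never decreases */
        } else {
            if (v < beta) beta = v;          /* beta never increases */
        }
        if (alpha >= beta)
            break;                           /* remaining children are pruned */
    }
    return is_max ? alpha : beta;
}
```

With leaves 5, 3, 0, 7 under a MAX root, the first MIN node scores 3; the second MIN node sees 0, which is already at most 3, so the leaf 7 is never evaluated and the root scores 3 after visiting only three leaves.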



ITERATIVE DEEPENING

While still an unintelligent algorithm, the iterative deepening search combines the positive elements of breadth-first and depth-first searching to create an algorithm which is often an improvement over each method individually.

An iterative deepening search operates like a depth-first search, except that it is slightly more constrained: there is a maximum depth which defines how many levels deep the algorithm can look for solutions. A node at the maximum level of depth is treated as terminal, even if it would ordinarily have successor nodes. If a search "fails," then the maximum level is increased by one and the process repeats. The value for the maximum depth is initially set at 0 (i.e., only the initial node).

The initial node is checked for a goal state; then, since the search cannot go any deeper, it "fails."

The maximum level is increased to 1; then the search restarts. The search (in its most basic implementation) does not remember testing the initial node already. This time, since the initial node is not at the maximum level, it can be expanded.

Its successors, however, cannot; they are checked, and if they fail, they are treated as terminal nodes and deleted. The search "fails," and the search once again restarts, with maximum level 2.

This continues until a solution is found.

An interesting observation is that the nodes in this search are first checked in the same order they would be checked in a breadth-first search; however, since nodes are deleted as the search progresses, much less memory is used at any given time.

The drawback to the iterative deepening search is clear from the walkthrough: it can be painfully redundant, rechecking every node it has already checked with each new iteration. The algorithm can be enhanced to remember what nodes it has already seen, but this sacrifices most of the memory efficiency that made the algorithm worthwhile in the first place, and nodes at the maximum level for one iteration will still need to be re-accessed and expanded in the following iteration. Still, when memory is at a premium, iterative deepening is preferable to a plain depth-first search when there is danger of looping or the most efficient solution is desired.
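The walkthrough above can be sketched in C. To keep the example self-contained it searches an implicit binary tree in which node i has children 2i + 1 and 2i + 2; that tree shape, the `checked` counter and the function names are assumptions for illustration, not part of the original description.

```c
#include <assert.h>

/* Depth-limited depth-first search. A node at the depth limit is
   treated as terminal. *checked counts goal tests. */
static int depth_limited(int node, int goal, int n_nodes, int limit, int *checked)
{
    ++*checked;
    if (node == goal)
        return 1;
    if (limit == 0)
        return 0;                                 /* cannot go any deeper */
    for (int child = 2 * node + 1; child <= 2 * node + 2; ++child)
        if (child < n_nodes &&
            depth_limited(child, goal, n_nodes, limit - 1, checked))
            return 1;
    return 0;
}

/* Restarts the search with maximum depth 0, 1, 2, ... up to max_depth.
   Returns the depth at which the goal was found, or -1 if it never was. */
int iterative_deepening(int goal, int n_nodes, int max_depth, int *checked)
{
    assert(checked != 0 && n_nodes > 0 && max_depth >= 0);
    *checked = 0;
    for (int limit = 0; limit <= max_depth; ++limit)
        if (depth_limited(0, goal, n_nodes, limit, checked))
            return limit;                         /* shallowest depth containing the goal */
    return -1;
}
```

Searching a 7-node tree for node 5 performs 1 goal test at depth limit 0, 3 at limit 1 and 6 at limit 2, so 10 tests in total for a goal at depth 2; the repeated tests are the redundancy described above.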



Knowledge Representation

Typically, a problem to solve or a task to carry out, as well as what constitutes a solution, is only given informally, such as "deliver parcels promptly when they arrive" or "fix whatever is wrong with the electrical system of the house."

The role of representations in solving problems

To solve a problem, the designer of a system must

• flesh out the task and determine what constitutes a solution;
• represent the problem in a language with which a computer can reason;
• use the computer to compute an output, which is an answer presented to a user or a sequence of actions to be carried out in the environment; and
• interpret the output as a solution to the problem.

Knowledge is the information about a domain that can be used to solve problems in that domain. To solve many problems requires much knowledge, and this knowledge must be represented in the computer. As part of designing a program to solve problems, we must define how the knowledge will be represented. A representation scheme is the form of the knowledge that is used in an agent. A representation of some piece of knowledge is the internal representation of the knowledge. A representation scheme specifies the form of the knowledge. A knowledge base is the representation of all of the knowledge that is stored by an agent.

A good representation scheme is a compromise among many competing objectives. A representation should be

• rich enough to express the knowledge needed to solve the problem.
• as close to the problem as possible; it should be compact, natural, and maintainable. It should be easy to see the relationship between the representation and the domain being represented, so that it is easy to determine whether the knowledge represented is correct. A small change in the problem should result in a small change in the representation of the problem.
• amenable to efficient computation, which usually means that it is able to express features of the problem that can be exploited for computational gain and able to trade off accuracy and computation time.
• able to be acquired from people, data and past experiences.

Many different representation schemes have been designed. Many of these start with some of these objectives and are then expanded to include the other objectives. For example, some are designed for learning and then expanded to allow richer problem solving and inference abilities. Some representation schemes are designed with expressiveness in mind, and then inference and learning are added on. Some schemes start from tractable inference and then are made more natural, and more able to be acquired.

Some of the questions that must be considered when given a problem or a task are the following:

• What is a solution to the problem? How good must a solution be?
• How can the problem be represented? What distinctions in the world are needed to solve the problem? What specific knowledge about the world is required? How can an agent acquire the knowledge from experts or from experience? How can the knowledge be debugged, maintained, and improved?
• How can the agent compute an output that can be interpreted as a solution to the problem? Is worst-case performance or average-case performance the critical time to minimize? Is it important for a human to understand how the answer was derived?



Predicate Calculus



First-order logic

• Whereas propositional logic assumes the world contains facts, first-order logic (like natural language) assumes the world contains
- Objects: people, houses, numbers, colors, baseball games, wars, …
- Relations: red, round, prime, brother of, bigger than, part of, comes between, …



Syntax of FOL: Basic elements

• Constants TaoiseachJohn, 2, DIT,...

• Predicates Brother, >,...

• Functions Sqrt, LeftLegOf,...

• Variables x, y, a, b,...

• Connectives ¬, ⇒, ∧, ∨, ⇔

• Equality =

• Quantifiers ∀, ∃



Atomic sentences

Atomic sentence = predicate(term1,...,termn) or term1 = term2

Term = function(term1,...,termn) or constant or variable

• E.g., Brother(TaoiseachJohn,RichardTheLionheart)

  >(Length(LeftLegOf(Richard)), Length(LeftLegOf(TaoiseachJohn)))



Complex sentences

• Complex sentences are made from atomic sentences using connectives:

  ¬S, S1 ∧ S2, S1 ∨ S2, S1 ⇒ S2, S1 ⇔ S2

• E.g. Sibling(TaoiseachJohn,Richard) ⇒ Sibling(Richard,TaoiseachJohn)

  >(1,2) ∨ ≤(1,2)

  >(1,2) ∧ ¬>(1,2)



Truth in first-order logic

• Sentences are true with respect to a model and an interpretation

• Model contains objects (domain elements) and relations among them

• Interpretation specifies referents for
  constant symbols → objects
  predicate symbols → relations
  function symbols → functional relations

• An atomic sentence predicate(term1,...,termn) is true iff the objects referred to by term1,...,termn are in the relation referred to by predicate



Universal quantification

• ∀<variables> <sentence>

• Everyone at DIT is smart: ∀x At(x,DIT) ⇒ Smart(x)

• ∀x P is true in a model m iff P is true with x being each possible object in the model

• Roughly speaking, equivalent to the conjunction of instantiations of P

  At(TaoiseachJohn,DIT) ⇒ Smart(TaoiseachJohn)
  ∧ At(Richard,DIT) ⇒ Smart(Richard)
  ∧ At(DIT,DIT) ⇒ Smart(DIT)
  ∧ ...



A common mistake to avoid

• Typically, ⇒ is the main connective with ∀

• Common mistake: using ∧ as the main connective with ∀:

  ∀x At(x,DIT) ∧ Smart(x)

  means "Everyone is at DIT and everyone is smart"



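Existential quantification

• ∃<variables> <sentence>

• Someone at DIT is smart: ∃x At(x,DIT) ∧ Smart(x)

• ∃x P is true in a model m iff P is true with x being some possible object in the model

• Roughly speaking, equivalent to the disjunction of instantiations of P

  At(TaoiseachJohn,DIT) ∧ Smart(TaoiseachJohn)
  ∨ At(Richard,DIT) ∧ Smart(Richard)
  ∨ At(DIT,DIT) ∧ Smart(DIT)
  ∨ ...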



Another common mistake to avoid

• Typically, ∧ is the main connective with ∃

• Common mistake: using ⇒ as the main connective with ∃:

  ∃x At(x,DIT) ⇒ Smart(x)

  is true if there is anyone who is not at DIT!
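Both mistakes can be seen by evaluating the four sentences over a small finite model. The sketch below is illustrative only: the `Object` struct and the function names are hypothetical, and each object simply records whether At(x,DIT) and Smart(x) hold for it.

```c
#include <assert.h>
#include <stddef.h>

/* One object of a small hypothetical model. */
typedef struct { int at_dit; int smart; } Object;

static int implies(int p, int q) { return !p || q; }

/* forall x: At(x,DIT) => Smart(x) */
int forall_at_implies_smart(const Object *d, size_t n)
{
    assert(d != NULL || n == 0);
    for (size_t i = 0; i < n; ++i)
        if (!implies(d[i].at_dit, d[i].smart)) return 0;
    return 1;
}

/* forall x: At(x,DIT) and Smart(x)   (the common mistake) */
int forall_at_and_smart(const Object *d, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        if (!(d[i].at_dit && d[i].smart)) return 0;
    return 1;
}

/* exists x: At(x,DIT) and Smart(x) */
int exists_at_and_smart(const Object *d, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        if (d[i].at_dit && d[i].smart) return 1;
    return 0;
}

/* exists x: At(x,DIT) => Smart(x)    (the other common mistake) */
int exists_at_implies_smart(const Object *d, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        if (implies(d[i].at_dit, d[i].smart)) return 1;
    return 0;
}
```

In a model with one smart person at DIT and one person elsewhere, the implication form of the universal sentence is true while the conjunction form is false. In a model where nobody at DIT is smart, the implication form of the existential sentence is still true, merely because someone is not at DIT.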



Properties of quantifiers

• ∀x ∀y is the same as ∀y ∀x

• ∃x ∃y is the same as ∃y ∃x

• ∃x ∀y is not the same as ∀y ∃x

• ∃x ∀y Loves(x,y) – "There is a person who loves everyone in the world"

• ∀y ∃x Loves(x,y) – "Everyone in the world is loved by at least one person"

• Quantifier duality: each can be expressed using the other

• ∀x Likes(x,IceCream) is equivalent to ¬∃x ¬Likes(x,IceCream)
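That the order of mixed quantifiers matters can be checked on a finite domain. In the sketch below the Loves relation is stored as a flat n x n table; the representation and the function names are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>

/* loves[x * n + y] != 0 encodes Loves(x, y) over a domain of n people. */

/* exists x forall y Loves(x,y): some person loves everyone. */
int exists_x_forall_y(const int *loves, size_t n)
{
    assert(loves != NULL || n == 0);
    for (size_t x = 0; x < n; ++x) {
        int loves_all = 1;
        for (size_t y = 0; y < n; ++y)
            if (!loves[x * n + y]) { loves_all = 0; break; }
        if (loves_all) return 1;
    }
    return 0;
}

/* forall y exists x Loves(x,y): everyone is loved by at least one person. */
int forall_y_exists_x(const int *loves, size_t n)
{
    assert(loves != NULL || n == 0);
    for (size_t y = 0; y < n; ++y) {
        int loved = 0;
        for (size_t x = 0; x < n; ++x)
            if (loves[x * n + y]) { loved = 1; break; }
        if (!loved) return 0;
    }
    return 1;
}
```

If three people love each other in a ring (0 loves 1, 1 loves 2, 2 loves 0), everyone is loved by someone, yet no single person loves everyone, so the two sentences have different truth values in that model.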



Equality

• term1 = term2 is true under a given interpretation if and only if term1 and term2 refer to the same object

• E.g., definition of Sibling in terms of Parent:

  ∀x,y Sibling(x,y) ⇔ [¬(x = y) ∧ ∃m,f ¬(m = f) ∧ Parent(m,x) ∧ Parent(f,x) ∧ Parent(m,y) ∧ Parent(f,y)]



Using FOL

The kinship domain:

• Brothers are siblings

  ∀x,y Brother(x,y) ⇒ Sibling(x,y)

• One's mother is one's female parent

  ∀m,c Mother(c) = m ⇔ (Female(m) ∧ Parent(m,c))

• "Sibling" is symmetric

  ∀x,y Sibling(x,y) ⇔ Sibling(y,x)



Knowledge engineering in FOL

1. Identify the task

2. Assemble the relevant knowledge

3. Decide on a vocabulary of predicates, functions, and constants

4. Encode general knowledge about the domain

5. Encode a description of the specific problem instance

6. Pose queries to the inference procedure and get answers

7. Debug the knowledge base



Summary

• First-order logic:

– objects and relations are semantic primitives

– syntax: constants, functions, predicates, equality, quantifiers



Semantics for Predicate Calculus

• An interpretation over D is an assignment of the entities of D to each of the constant, variable, predicate and function symbols of a predicate calculus expression such that:

• 1: Each constant is assigned an element of D

• 2: Each variable is assigned a non-empty subset of D (these are the allowable substitutions for that variable)

• 3: Each predicate of arity n is defined on n arguments from D and defines a mapping from D^n into {T, F}

• 4: Each function of arity n is defined on n arguments from D and defines a mapping from D^n into D

The meaning of an expression

• Given an interpretation, the meaning of an expression is a truth value assignment over the interpretation.



Truth Value of Predicate Calculus expressions

• Assume an expression E and an interpretation I for E over a non-empty domain D. The truth value for E is determined by:

• The value of a constant is the element of D assigned to it by I

• The value of a variable is the set of elements assigned to it by I

• The value of a function expression is that element of D obtained by evaluating the function for the argument values assigned by the interpretation

• The value of the truth symbol "true" is T

• The value of the symbol "false" is F

• The value of an atomic sentence is either T or F as determined by the interpretation I



Similarity with Propositional logic truth values

• The value of the negation of a sentence is F if the value of the sentence is T, and T otherwise

• The values for conjunction, disjunction, implication and equivalence are analogous to their propositional logic counterparts

Universal Quantifier

• The value for ∀X S is T if S is T for all assignments to X under I, and F otherwise

Existential Quantifier

• The value for ∃X S is T if S is T for any assignment to X under I, and F otherwise



Some Definitions

• A predicate calculus expressions S1 is satisfied.

• Definition If there exists an Interpretation I and a variable assignment under I which returns a value T for S1 then S1 is said to be satisfied under I.

• S is Satisfiable if there exists an interpretation

and variable assignment that satisfies it: Otherwise it is unsatisfiable



Some Definitions

• A set of predicate calculus expressions

S is satisfied.

• Definition For any interpretation I and

variable assignment where a value T is

returned for every element in S the the

set S is said to be satisfied,



• A set of expressions is satisfiable if and only if there exist an intrepretation and variable assignment that satisfy every element

• If a set of expressions is not satisfiable, it is said to be inconsistent

• If S has a value T for all possible

interpretations , it is said to be valid



Some Definitions



• A set of predicate calculus expressions S is satisfied.

•

• Definition For any interpretation I and variable assignment where a value T is returned for every element in S the the set S is said to be satisfied,

• An inference rule is complete:

• Definition: If all predicate calculus expressions X that logically follow from a set of expressions S can be produced using the inference rule, then the inference rule is said to be complete.



• A predicate calculus expression X logically follows from a set S of predicate calculus expressions:

• Definition: If X is satisfied under every interpretation I and variable assignment under which S is satisfied, then X logically follows from S.

• “Logically follows” is sometimes called entailment
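Entailment can be checked by brute force when the domain is small and fixed. The sketch below assumes two hypothetical unary predicates P and Q over a two-element domain; it illustrates the definition and is not a general first-order procedure.

```python
from itertools import product

DOMAIN = ["a", "b"]  # hypothetical two-element domain

def interpretations():
    """Yield every interpretation of two unary predicates P and Q over DOMAIN."""
    n = len(DOMAIN)
    for bits in product([False, True], repeat=2 * n):
        p = dict(zip(DOMAIN, bits[:n]))
        q = dict(zip(DOMAIN, bits[n:]))
        yield p, q

def entails(premises, conclusion):
    """X logically follows from S if X is T wherever every element of S is T."""
    return all(conclusion(p, q)
               for p, q in interpretations()
               if all(s(p, q) for s in premises))

all_p_imply_q = lambda p, q: all((not p[x]) or q[x] for x in DOMAIN)  # ∀X (P(X) ⇒ Q(X))
p_of_a = lambda p, q: p["a"]  # P(a)
q_of_a = lambda p, q: q["a"]  # Q(a)
q_of_b = lambda p, q: q["b"]  # Q(b)

print(entails([all_p_imply_q, p_of_a], q_of_a))  # True
print(entails([all_p_imply_q, p_of_a], q_of_b))  # False
```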



Soundness

• An inference rule is sound:

• Definition: If every predicate calculus expression X produced using the inference rule from a set of expressions S logically follows from S, then the inference rule is said to be sound.
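As a small illustration of soundness, the sketch below tests two rules on propositional instances only (an assumption made to keep the check finite): modus ponens, and the unsound pattern often called affirming the consequent. The helper name `rule_is_sound` is hypothetical.

```python
from itertools import product

def implies(a, b):
    """Truth value of a ⇒ b."""
    return (not a) or b

def rule_is_sound(premises, conclusion):
    """True if the conclusion is T under every assignment that makes all premises T."""
    return all(conclusion(p, q)
               for p, q in product([False, True], repeat=2)
               if all(prem(p, q) for prem in premises))

# Modus ponens: from P and P ⇒ Q, infer Q.
modus_ponens = rule_is_sound(
    [lambda p, q: p, lambda p, q: implies(p, q)],
    lambda p, q: q,
)

# Affirming the consequent: from Q and P ⇒ Q, infer P.
affirming_consequent = rule_is_sound(
    [lambda p, q: q, lambda p, q: implies(p, q)],
    lambda p, q: p,
)

print(modus_ponens)          # True
print(affirming_consequent)  # False: P = F, Q = T satisfies the premises but not P
```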



Completeness

• An inference rule is complete if, given a set S of predicate calculus expressions, it can infer every expression that logically follows from S



Equivalence

• Recall that :

• See attached word

document



Predicate Calculus



First-order logic

• Whereas propositional logic assumes the world contains facts, first-order logic (like natural language) assumes the world contains

• Objects: people, houses, numbers, colors, baseball games, wars, …

• Relations: red, round, prime, brother of, bigger than, part of, comes between, …



Syntax of FOL: Basic elements

• Constants TaoiseachJohn, 2, DIT,...

• Predicates Brother, >,...

• Functions Sqrt, LeftLegOf,...

• Variables x, y, a, b,...

• Connectives ¬, ⇒, ∧, ∨, ⇔

• Equality =

• Quantifiers ∀, ∃



Atomic sentences

Atomic sentence = predicate(term1,...,termn) or term1 = term2

Term = function(term1,...,termn) or constant or variable

• E.g., Brother(TaoiseachJohn,RichardTheLionheart)

>(Length(LeftLegOf(Richard)), Length(LeftLegOf(TaoiseachJohn)))
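The grammar above maps naturally onto a small recursive data structure. The following is a minimal sketch; the class names `Constant`, `Variable`, `Function`, `Atom` and the helper `show` are illustrative choices, not part of the slides.

```python
from dataclasses import dataclass
from typing import Tuple, Union

@dataclass(frozen=True)
class Constant:
    name: str

@dataclass(frozen=True)
class Variable:
    name: str

@dataclass(frozen=True)
class Function:
    name: str
    args: Tuple["Term", ...]

# Term = function(term1,...,termn) or constant or variable
Term = Union[Constant, Variable, Function]

@dataclass(frozen=True)
class Atom:
    """Atomic sentence = predicate(term1,...,termn)."""
    predicate: str
    args: Tuple[Term, ...]

def show(node):
    """Render a term or atomic sentence in prefix notation."""
    if isinstance(node, (Constant, Variable)):
        return node.name
    head = node.name if isinstance(node, Function) else node.predicate
    return f"{head}({', '.join(show(a) for a in node.args)})"

def left_leg(who):
    return Function("LeftLegOf", (Constant(who),))

brother = Atom("Brother", (Constant("TaoiseachJohn"), Constant("RichardTheLionheart")))
longer = Atom(">", (Function("Length", (left_leg("Richard"),)),
                    Function("Length", (left_leg("TaoiseachJohn"),))))

print(show(brother))
print(show(longer))
```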



Complex sentences

• Complex sentences are made from atomic sentences using connectives

¬S, S1 ∧ S2, S1 ∨ S2, S1 ⇒ S2, S1 ⇔ S2

E.g. Sibling(TaoiseachJohn,Richard) ⇒ Sibling(Richard,TaoiseachJohn)

>(1,2) ∨ ≤(1,2)

>(1,2) ∧ ¬>(1,2)



Truth in first-order logic

• Sentences are true with respect to a model and an interpretation

• Model contains objects (domain elements) and relations among them

• Interpretation specifies referents for

constant symbols → objects

predicate symbols → relations

function symbols → functional relations

• An atomic sentence predicate(term1,...,termn) is true iff the objects referred to by term1,...,termn are in the relation referred to by predicate
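The truth condition for atomic sentences can be sketched with plain dictionaries and sets. The model below is hypothetical (the object names and the contents of the Brother relation are made up for illustration), and terms are encoded as strings for constants or tuples for function applications.

```python
# Hypothetical model: objects (domain elements) and relations among them.
objects = {"john", "richard", "johns_left_leg", "richards_left_leg"}

# Interpretation: referents for constant, predicate, and function symbols.
constants = {"TaoiseachJohn": "john", "Richard": "richard"}
predicates = {"Brother": {("john", "richard"), ("richard", "john")}}
functions = {"LeftLegOf": {("john",): "johns_left_leg",
                           ("richard",): "richards_left_leg"}}

def denote(term):
    """Return the object a ground term refers to.

    A term is a constant name (str) or a tuple (function_symbol, arg1, ...).
    """
    if isinstance(term, str):
        return constants[term]
    symbol, *args = term
    return functions[symbol][tuple(denote(a) for a in args)]

def holds(predicate, *terms):
    """An atomic sentence is true iff the referents are in the predicate's relation."""
    return tuple(denote(t) for t in terms) in predicates[predicate]

print(holds("Brother", "TaoiseachJohn", "Richard"))                 # True
print(holds("Brother", "Richard", ("LeftLegOf", "TaoiseachJohn")))  # False
```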



Universal quantification


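• ∀ <variables> <sentence>

• Everyone at DIT is smart: ∀x At(x,DIT) ⇒ Smart(x)

• ∀x P is true in a model m iff P is true with x being each possible object in the model

• Roughly speaking, equivalent to the conjunction of instantiations of P

(At(TaoiseachJohn,DIT) ⇒ Smart(TaoiseachJohn)) ∧ (At(Richard,DIT) ⇒ Smart(Richard)) ∧ ...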



A common mistake to avoid

• Typically, ⇒ is the main connective with ∀

• Common mistake: using ∧ as the main connective with ∀:

∀x At(x,DIT) ∧ Smart(x)

means “Everyone is at DIT and everyone is smart”
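The difference between the two readings shows up as soon as the model contains someone who is not at DIT. A minimal sketch, assuming a hypothetical three-person model:

```python
# Hypothetical model: carol is neither at DIT nor smart.
people = ["alice", "bob", "carol"]
at_dit = {"alice", "bob"}
smart = {"alice", "bob"}

def implies(a, b):
    """Truth value of a ⇒ b."""
    return (not a) or b

# ∀x At(x,DIT) ⇒ Smart(x): everyone who is at DIT is smart.
with_implication = all(implies(x in at_dit, x in smart) for x in people)

# ∀x At(x,DIT) ∧ Smart(x): everyone is at DIT and everyone is smart.
with_conjunction = all((x in at_dit) and (x in smart) for x in people)

print(with_implication)  # True: carol satisfies the implication vacuously
print(with_conjunction)  # False: carol is not at DIT
```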



Existential quantification


• Someone at DIT is smart: ∃x At(x,DIT) ∧ Smart(x)

• ∃x P is true in a model m iff P is true with x being some possible object in the model

• Roughly speaking, equivalent to the disjunction of instantiations of P

(At(TaoiseachJohn,DIT) ∧ Smart(TaoiseachJohn)) ∨ (At(Richard,DIT) ∧ Smart(Richard)) ∨ ...



Another common mistake to avoid

• Typically, ∧ is the main connective with ∃

• Common mistake: using ⇒ as the main connective with ∃:

∃x At(x,DIT) ⇒ Smart(x)

is true if there is anyone who is not at DIT!
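A minimal sketch of why the implication form is too weak, assuming a hypothetical model in which nobody is smart but one person is not at DIT:

```python
# Hypothetical model: nobody is smart, and carol is not at DIT.
people = ["alice", "bob", "carol"]
at_dit = {"alice", "bob"}
smart = set()

# ∃x At(x,DIT) ∧ Smart(x): someone at DIT is smart.
with_conjunction = any((x in at_dit) and (x in smart) for x in people)

# ∃x At(x,DIT) ⇒ Smart(x): satisfied by anyone who is not at DIT.
with_implication = any((x not in at_dit) or (x in smart) for x in people)

print(with_conjunction)  # False: no one at DIT is smart
print(with_implication)  # True: carol makes the implication vacuously true
```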



Properties of quantifiers


• ∀x ∀y is the same as ∀y ∀x

• ∃x ∃y is the same as ∃y ∃x

• ∃x ∀y is not the same as ∀y ∃x

• ∃x ∀y Loves(x,y) – “There is a person who loves everyone in the world”

• ∀y ∃x Loves(x,y) – “Everyone in the world is loved by at least one person”

• Quantifier duality: each can be expressed using the other

• ∀x Likes(x,IceCream) ≡ ¬∃x ¬Likes(x,IceCream)

•