Machine Learning: A Quick Look

Sources: Artificial Intelligence - Russell & Norvig; Artificial Intelligence - Luger
By: Héctor Muñoz-Avila


What Is Machine Learning?

"Logic is not the end of wisdom, it is just the beginning" --- Spock

[Figure: timeline of a System playing a Game. With its initial Knowledge the System takes Action 1; after its Knowledge has changed, it takes Action 2.]

Learning: The Big Picture

Two forms of learning:

Supervised: the input and output of the learning component can be perceived (for example, an experienced player acting as a friendly teacher)

Unsupervised: there is no hint about the correct answers of the learning component (for example, finding clusters of data)

Offline vs. Online Learning

Online (during gameplay):
  Adapt to player tactics
  Avoid repetition of mistakes
  Requirements: computationally cheap, effective, robust, fast learning (Spronck 2004)

Offline (between the end of a game and the next):
  Devise new tactics
  Discover exploits

Slide notes (11-Feb-10):

There are two different types of learning in games: online and offline.

Online learning takes place during gameplay against human players. The main applications for online learning are to adapt to player tactics and playing styles and to avoid the repetition of mistakes. However, online learning has rarely been applied in commercial games, because most game companies are not comfortable shipping their games with an online learning technique included. What if the NPC learns something stupid? What if the learning takes too long or is too computationally expensive? Therefore we set a few requirements for online learning:

  Computationally cheap: should not disturb the flow of gameplay.
  Effective: should not generate too many inferior opponents.
  Robust: should be able to deal with the randomness incorporated in games. For instance, you don't want to permanently unlearn a specific behavior because it performed badly at some point; the bad performance could be attributed to chance.
  Fast learning: should lead to results quickly. We don't want to bore the players with a slow-learning AI.

Offline learning can be applied before a game is released. Typically, offline learning is used to explore new game tactics or to find exploits.

Classification (according to the language representation)

Symbolic:
  Version Space
  Decision Trees
  Explanation-Based Learning

Sub-symbolic:
  Reinforcement Learning
  Connectionist
  Evolutionary

Version Space

Idea: learn a concept from a group of instances, some positive and some negative.

Example:
  target: obj(Size, Color, Shape)
  Size = {large, small}
  Color = {red, white, blue}
  Shape = {ball, brick, cube}

Instances:
  +: obj(large,white,ball), obj(small,blue,ball)

  -: obj(small,red,brick), obj(large,blue,cube)

Two extreme (tentative) solutions:
  obj(X,Y,Z): too general
  obj(large,white,ball) or obj(small,blue,ball): too specific

[Figure: the concept space between these two extremes, containing obj(X,Y,ball), obj(large,Y,ball), and obj(small,Y,ball).]

How Version Space Works

[Figure: two panels of positive (+) instances, one labeled "If we consider only positives" and one labeled "If we consider positive and negatives".]

What is the role of the negative instances? To help prevent over-generalization.
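As a rough illustration of the idea, the sketch below enumerates every conjunctive hypothesis over obj(Size, Color, Shape) and keeps those consistent with the four instances. This is brute-force enumeration, not the incremental version space algorithm, and the names (ANY, covers, version_space) are made up for this example.

```python
from itertools import product

# Attribute domains from the obj(Size, Color, Shape) example.
DOMAINS = [("large", "small"), ("red", "white", "blue"), ("ball", "brick", "cube")]
ANY = "?"  # stands for an unbound variable such as X, Y or Z

POSITIVES = [("large", "white", "ball"), ("small", "blue", "ball")]
NEGATIVES = [("small", "red", "brick"), ("large", "blue", "cube")]


def covers(hypothesis, instance):
    """A hypothesis covers an instance if every bound slot matches it."""
    return all(h == ANY or h == v for h, v in zip(hypothesis, instance))


def version_space(positives, negatives):
    """Return all hypotheses that cover every positive and no negative."""
    candidates = product(*[values + (ANY,) for values in DOMAINS])
    return [
        h for h in candidates
        if all(covers(h, p) for p in positives)
        and not any(covers(h, n) for n in negatives)
    ]


if __name__ == "__main__":
    # Only positives: the fully general obj(X,Y,Z) is still allowed.
    print(version_space(POSITIVES, []))
    # With negatives: only obj(X,Y,ball) survives.
    print(version_space(POSITIVES, NEGATIVES))
```

With only the positives, both obj(X,Y,ball) and obj(X,Y,Z) remain; adding the negatives rules out obj(X,Y,Z), which is the over-generalization the negative instances are there to prevent.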

Explanation-Based Learning

[Figure: a sequence of blocks-world configurations of blocks A, B, and C.]

Can we avoid making this error again?

Explanation-Based Learning (2)

[Figure: blocks-world configurations of blocks A, B, and C, with one step marked "???".]

Possible rule: if the initial state is this and the final state is this, don't do that.

More sensible rule: don't stack anything above a block if the block has to be free in the final state.

Motivation #1: Analysis Tool

Suppose that a gaming company has a database of runs with a beta version of the game: lots of data.

How can that company's developers use this data to figure out good strategies for their AI?

Motivation #1: Analysis Tool (cont'd)

Games data:

  Example  Bar  Fri  Hun  Pat   Type    Res  Wait
  x1       no   no   yes  some  french  yes  yes
  x4       no   yes  yes  full  thai    no   yes
  x5       no   yes  no   full  french  yes  no
  x6, x7, x8, x9, x10, x11 (no values listed)

[Figure: the games data goes through induction to produce a decision tree, which yields rules such as: if built center hall & has built 4 workers then build defense tower.]

The Knowledge Base in Expert Systems

A knowledge base consists of a collection of IF-THEN rules:

  if built center hall & has built 4 workers then build defense tower

  if built center hall & mine then upgrade center hall

Knowledge bases of expert systems contain hundreds and sometimes even thousands of such rules. Frequently rules are contradictory and/or overlap.

Sample Expert System in Games: Age of Empires

  (defrule
    (current-age == dark-age)
    (building-type-count-total mining-camp > 0)
    (not (research-available feudal-age))
  =>
    (set-strategic-number sn-food-gatherer-percentage 52)
    (set-strategic-number sn-wood-gatherer-percentage 35)
    (set-strategic-number sn-gold-gatherer-percentage 13)
    (set-strategic-number sn-stone-gatherer-percentage 0)
    (disable-self))

http://www.youtube.com/watch?v=GEbnqc82lew
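The two IF-THEN rules above could be represented roughly as follows. This is only a sketch: the state keys (center_hall, workers, mine), the Rule class, and the reading of "has built 4 workers" as "at least 4" are assumptions made for this example, not part of any game's scripting API.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

State = Dict[str, int]  # hypothetical game state: counts of things built


@dataclass
class Rule:
    condition: Callable[[State], bool]  # the IF part
    action: str                         # the THEN part


RULES: List[Rule] = [
    Rule(lambda s: s["center_hall"] >= 1 and s["workers"] >= 4,
         "build defense tower"),
    Rule(lambda s: s["center_hall"] >= 1 and s["mine"] >= 1,
         "upgrade center hall"),
]


def fire(rules: List[Rule], state: State) -> List[str]:
    """Return the actions of every rule whose condition holds in this state."""
    return [rule.action for rule in rules if rule.condition(state)]


if __name__ == "__main__":
    print(fire(RULES, {"center_hall": 1, "workers": 4, "mine": 0}))
    # Both rules match here, which is the kind of overlap a rule base must resolve.
    print(fire(RULES, {"center_hall": 1, "workers": 4, "mine": 1}))
```

Even with two rules, a single state can trigger both, which hints at why large hand-written rule bases end up with overlapping or contradictory rules.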

Main Drawback of Expert Systems: The Knowledge Acquisition Bottleneck

The main problem of expert systems is that acquiring knowledge from human specialists is a difficult, cumbersome, and long activity.

  Name   KB  #Rules  Const. time (man-years)  Maint. time (man-years)
  MYCIN  KA  500     10                       N/A
  XCON   KA  2500    18                       3

  KB = Knowledge Base
  KA = Knowledge Acquisition

Motivation #2: Avoid Knowledge Acquisition Bottleneck

GASOIL is an expert system for designing gas/oil separation systems stationed offshore.

The design depends on multiple factors, including proportions of gas, oil, and water, flow rate, pressure, density, viscosity, temperature, and others.

Building that system by hand would have taken 10 person-years.

It took only 3 person-months by using inductive learning!

GASOIL saved BP millions of dollars.

Motivation #2: Avoid Knowledge Acquisition Bottleneck

  Name    KB        #Rules  Const. time (man-years)  Maint. time (man-months)
  MYCIN   KA        500     10                       N/A
  XCON    KA        2500    18                       3
  GASOIL  IDT       2800    1                        0.1
  BMT     KA (IDT)  30000+  9 (0.3)                  2 (0.1)

  KB = Knowledge Base
  KA = Knowledge Acquisition
  IDT = Induced Decision Trees

Example of a Decision Tree

[Figure: decision tree for the restaurant waiting problem, rooted at Patrons? (none / some / full), with WaitEstimate? (0-10, 10-30, 30-60, >60) under full and further tests on Alternate?, Reservation?, Bar?, Fri/Sat?, Hungry?, and Raining?. Leaves are labeled Yes or No.]

Definition of a Decision Tree

A decision tree is a tree where:
  The leaves are labeled with classifications (if the classification is "yes" or "no", the tree is called a Boolean tree)
  The non-leaf nodes are labeled with attributes
  The arcs out of a node labeled with an attribute A are labeled with the possible values of the attribute A

Induction

Data:

  Example  Bar  Fri  Hun  Pat   Type    Res  Wait
  x1       no   no   yes  some  french  yes  yes
  x4       no   yes  yes  full  thai    no   yes
  x5       no   yes  no   full  french  yes  no
  x6, x7, x8, x9, x10, x11 (no values listed)

Databases: what are the data that match this pattern? (pattern -> database -> data)
Induction: what is the pattern that matches these data? (data -> induction -> pattern)

Induction of Decision Trees

Objective: find a concise decision tree that agrees with the examples.

The guiding principle we are going to use is Ockham's razor: the most likely hypothesis is the simplest one that is consistent with the examples.

Problem: finding the smallest decision tree is NP-complete.

However, with simple heuristics we can find a small decision tree (an approximation).

Induction of Decision Trees: Algorithm

Algorithm:

Initially all examples are in the same group.

1. Select the attribute that makes the most difference (i.e., for each of the values of the attribute, most of the examples are either positive or negative).

2. Group the examples according to each value for the selected attribute.

3. Repeat 1 within each group (recursive call).

Example

  Example  Bar  Fri  Hun  Pat   Alt  Type     Wait
  x1       no   no   yes  some  yes  French   yes
  x4       no   yes  yes  full  yes  Thai     yes
  x5       no   yes  no   full  yes  French   no
  x6       yes  no   yes  some  no   Italian  yes
  x7       yes  no   no   none  no   Burger   no
  x8       no   no   yes  some  no   Thai     yes
  x9       yes  yes  no   full  no   Burger   no
  x10      yes  yes  yes  full  yes  Italian  no
  x11      no   no   no   none  no   Thai     no

IDT: Example

Let's compare two candidate attributes: Patrons and Type. Which is a better attribute?

Patrons?
  none: x7(-), x11(-)
  some: x1(+), x3(+), x6(+), x8(+)
  full: x4(+), x12(+), x2(-), x5(-), x9(-), x10(-)

Type?
  french: x1(+), x5(-)
  italian: x6(+), x10(-)
  burger: x3(+), x12(+), x7(-), x9(-)
  thai: x4(+), x8(+), x2(-), x11(-)

Decision Trees in Gaming

http://www.youtube.com/watch?v=HMdOyUp5Rvk
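Returning to the Patrons versus Type comparison in the IDT example: one common way to measure which attribute "makes the most difference" is information gain. The slides do not name a measure, so treat this as an assumed choice. The counts below are (positives, negatives) per attribute value, read off the two splits, with x8 counted as a Thai example as in the example table.

```python
from math import log2

# (positive, negative) example counts for each attribute value.
PATRONS = {"none": (0, 2), "some": (4, 0), "full": (2, 4)}
TYPE = {"french": (1, 1), "italian": (1, 1), "burger": (2, 2), "thai": (2, 2)}


def entropy(pos, neg):
    """Entropy in bits of a group with pos positive and neg negative examples."""
    total = pos + neg
    result = 0.0
    for count in (pos, neg):
        if count:  # skip empty classes: 0 * log2(0) is treated as 0
            p = count / total
            result -= p * log2(p)
    return result


def information_gain(split):
    """Entropy before the split minus the weighted entropy after it."""
    pos = sum(p for p, _ in split.values())
    neg = sum(n for _, n in split.values())
    total = pos + neg
    remainder = sum((p + n) / total * entropy(p, n) for p, n in split.values())
    return entropy(pos, neg) - remainder


if __name__ == "__main__":
    print("Patrons:", round(information_gain(PATRONS), 3))
    print("Type:   ", round(information_gain(TYPE), 3))
```

Under this measure Patrons scores about 0.541 bits and Type scores 0, because every Type group is still half positive and half negative. That matches the intuition that Patrons is the better attribute to test first.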

Black & White, developed by Lionhead Studios and released in 2001, used decision trees to predict a player's reaction to a certain creature's action.

In this model, a greater feedback value means the creature should attack.

This is done by inducing a decision tree.

Decision Trees in Black & White

Should your creature attack a town?

  Example  Allegiance  Defense  Tribe   Feedback
  D1       Friendly    Weak     Celtic  -1.0
  D2       Enemy       Weak     Celtic   0.4
  D3       Friendly    Strong   Norse   -1.0
  D4       Enemy       Strong   Norse   -0.2
  D5       Friendly    Weak     Greek   -1.0
  D6       Enemy       Medium   Greek    0.2
  D7       Enemy       Strong   Greek   -0.4
  D8       Enemy       Medium   Aztec    0.0
  D9       Friendly    Weak     Aztec   -1.0

  (Attributes: Allegiance, Defense, Tribe. Target: Feedback.)

Decision Trees in Black & White (cont'd)

  Allegiance
    Friendly: -1.0
    Enemy: Defense
      Weak: 0.4
      Medium: 0.1
      Strong: -0.3

Note that this decision tree does not even use the tribe attribute.

Decision Trees in Black & White (cont'd)

Now suppose we don't want the entire decision tree, but we just want the 2 highest feedback values. We can create a Boolean expression, such as

  ((Allegiance = Enemy) ^ (Defense = Weak)) v ((Allegiance = Enemy) ^ (Defense = Medium))
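The leaf values in the Black & White tree can be reproduced by averaging the feedback of the examples that reach each leaf. The sketch below assumes that this averaging is how the leaves were obtained (it does reproduce 0.4, 0.1, -0.3 and -1.0) and takes the tree shape as given rather than inducing it. The function names are hypothetical.

```python
from collections import defaultdict

# (Allegiance, Defense, Tribe, Feedback) for examples D1..D9.
EXAMPLES = [
    ("Friendly", "Weak", "Celtic", -1.0),
    ("Enemy", "Weak", "Celtic", 0.4),
    ("Friendly", "Strong", "Norse", -1.0),
    ("Enemy", "Strong", "Norse", -0.2),
    ("Friendly", "Weak", "Greek", -1.0),
    ("Enemy", "Medium", "Greek", 0.2),
    ("Enemy", "Strong", "Greek", -0.4),
    ("Enemy", "Medium", "Aztec", 0.0),
    ("Friendly", "Weak", "Aztec", -1.0),
]


def leaf_of(allegiance, defense):
    """Friendly examples share one leaf; Enemy examples split on Defense."""
    return (allegiance,) if allegiance == "Friendly" else (allegiance, defense)


def leaf_averages(examples):
    """Average feedback of the examples that fall into each leaf."""
    groups = defaultdict(list)
    for allegiance, defense, _tribe, feedback in examples:
        groups[leaf_of(allegiance, defense)].append(feedback)
    return {leaf: round(sum(v) / len(v), 2) for leaf, v in groups.items()}


def top_leaves(averages, k=2):
    """The k leaves with the highest average feedback."""
    return sorted(averages, key=averages.get, reverse=True)[:k]


if __name__ == "__main__":
    averages = leaf_averages(EXAMPLES)
    print(averages)
    print(top_leaves(averages))
```

The two leaves returned by top_leaves are (Enemy, Weak) and (Enemy, Medium), which are the two conjunctions in the Boolean expression above.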

Next class