CS6665 9 Genetic Alg


  • 8/6/2019 CS6665 9 Genetic Alg


    Golden Rule 1
    • Every EA solution should be viewed as based on the following generic hypothesis:
      • The optimum value(s) of ----- can be found using a ----- evolutionary approach.


    One of the most important parts of an assignment will be your analysis and presentation of your results.

    EAs are stochastic algorithms, so your results with the same approach may differ from someone else's. Always analyze your results clearly and carefully.


    Golden Rule 2
    • There must be a reason for every choice made in your EA:
      • Parameter value choices
      • Stopping criteria
      • Decision to re-run the EA
      • Decision to mutate and re-run the EA


    Parameters
    • If you have chosen a value for a parameter, you must have a reason for that choice: state it.
    • When in doubt as to a good reason or method for choosing a parameter value, try a few different values and use the best, as determined empirically, as the reason. Here is an example of the wrong way to do it:
      • "I set the normal mutation rate to 0.1."
      • What is normal? Is there an abnormal value? Why 0.1? Is 0.1 a probability per bit or per chromosome for mutation?


    Stopping Criteria
    • In many (most) cases, when executing an EA, you will not know the optimum solution. Thus, the question arises as to why you stopped the EA. Always give your stopping criterion and why you used it. An example of a bad choice is:
      • "Stopped at 2000 generations and the convergence rate was 0.99." Say what?


    Bottom Line: Summary
    • This is a computer SCIENCE class. Many of your homework assignments will be experiments, so act like scientists, i.e.,
      • Be systematic: have a plan (hypothesis), follow it, and document what you did (keep a record for the write-up)
      • A picture (plot or table) is worth a thousand words (abstraction)
      • Everything that happens has a reason. If the occurrence is of significance, then why did it happen?


    BASICS
    • Evolutionary Algorithm (EA) vs. Evolutionary Computation (EC)
      • EA and EC are somewhat synonymous
      • EC is simply a computer-based implementation of an EA


    Evolutionary
    • What makes an algorithm evolutionary?
      • It has a population
      • It generates a new population from the old population (generations, as in biology)
      • The more fit a solution (individual), the more likely it will be used in, or influence, the generation of the new population
        • "Fit" implies the presence of a fitness measure
      • The search process, while influenced by fitness, has a component of randomness (stochastic, e.g. mutation)


    REMEMBER!!
    • Whenever a stochastic process is used to solve a problem, if the correct or optimal solution is unknown, you should not report the result of a single run. Always report results as an average of several runs, or the best of several runs.


    AI & Evolutionary Computation
    • Evolutionary Computation
      • Machine learning optimization and classification paradigms based on evolutionary (biological) mechanisms


    AI & Evolutionary Computation
    • EC is about self-organization
      • Simple processes that lead to complex results, e.g. a genetic algorithm (GA) is simple
      • The whole is > the sum of its parts
      • "There is no conservation of simplicity" (Wolfram)


    Applications

    • Consider the following two examples of optimization problems. By "consider", I mean think of
      • Writing a computer program to solve each
      • The time for the program to run
      • How you will know when you have the optimum solution
    • Optimization of some function of n variables that is not everywhere differentiable, with some set of constraints on those n variables
    • The traveling salesperson problem


    Application: Function Optimization
    • Let's say that we have some function

      $F(x,y,z) = 2x + \frac{3y}{x^2} - \frac{2z^2}{(x-y)^2}$

      and there are bounds on the ranges of allowed values for x, y, and z.

    An optimization problem might be to find the values of x, y, and z which give the minimum value of F.


    How
    • do I write the program?
    • long will it take to execute?
    • will I know when I have the optimum?
    • do I determine the relative quality of a solution?
      • If one set of values for x, y, and z gives a value of F less than all of the other values that I have tested, is it the optimum?


    Application: TSP (Optimal Path)
    • For a given table of costs, find the minimum-cost transit that visits every city once and only once
      • NP-complete
      • Exhaustive search grows as O(n!)
      • Can be solved using dynamic programming in time $O(n^2 2^n)$ and space $O(n 2^n)$ (better in time but worse in space)
    • For 30 cities there are $\sim 2.7 \times 10^{32}$ possibilities
      • Exhaustive search, anyone?


    How
    • do I write the program?
    • long will it take to execute?
    • will I know when I have the optimum?
    • do I determine the relative quality of a solution?


    Examples
    • The preceding 2 examples illustrate some of the problems associated with computerizing an algorithm
      • Time to solve (TSP)
      • Fitness (F(x,y,z) and TSP)
        • What is an optimum?
        • How do we compare one solution with another?
      • Illegal solutions: how do we know if a solution is illegal (parameter out of bounds or multiple visits to the same city), and what next?
      • How do we code it?
        • Are there parts of one coded solution that can be used in another solution? IMPORTANT!


    Robustness and Reusability
    • One of the positives of EAs is their robustness.
      • An EA used to solve one type of problem may be useful in solving a very different problem with little change to the algorithm
      • e.g. MATLAB's Optimtool


    Searching the Search Space

    • Optimization problems can be thought of as search problems.
    • Each solution occupies a position in the search space.
      • The parameters and functions of the solution represent the search space
    • The real question is: how do we search or travel in this space?
    • Thought of in another way, one can ask: given a current location in the space, how do we decide
      • Are we at the solution?
      • If we're not there, how do we decide where to go next?


    Searching the Search Space

    Are we at the solution?

    If we're not there, how do we decide where to go next?

    • There are many approaches to answering these questions. We will emphasize the EC approaches.
    • In the search process, we would like a technique such that each time we try a solution, we gain useful information that can be used in the choice of the next solution
      • This eliminates a completely random approach


    Example

    • For the problem 4 + x = 10, what does a value of x = 10 tell us?
      • The error = ?
      • The sign of the error tells us which way to go
      • The magnitude of the error tells us how far to go


    SGA
    • An SGA (Simple Genetic Algorithm) uses a biological or evolutionary approach to traverse the search space.
      • It was developed in the '70s
      • Chromosomes are encoded as binary strings


    [Flow chart: SGA]
    1. Define chromosome encoding and fitness schemes
    2. Generate initial population of chromosomes
    3. Compute chromosome fitness
    4. Mate chromosomes
    5. Perform crossover
    6. Apply mutation
    7. Perform replacement
    8. Stopping criteria met? No: return to step 3. Yes: done.


    Warning!
    • In order to get people started, we must fix the techniques used. This is only a very small snapshot of GAs, let alone EC. However, this will give you sufficient background to complete the assignment.


    The Problem

    • In order to have something tangible to refer to, we will deal with the following optimization problem
      • The problem is to optimize the following function: (-1


    Chromosomes and Encoding
    • GAs work from an encoding of the parameter space.
    • They also work with a population of solutions rather than a single solution


    Define the Chromosome Encoding Scheme

    • Chromosome: A sequence of genes in which each gene represents an encoding for a particular parameter value
    • Gene: (Often) a binary string (of bits) which encodes a particular parameter value
    • Encoding Scheme (must be reversible):
      • It is a translation (formula) such that, given a gene (string of bits), one can determine the represented parameter value.


    Chromosome Encoding Scheme

    • For this problem, since there is only one parameter, the chromosome is made up of a single gene.
      • The gene value range is given in the problem statement as -1


    Chromosome Encoding Scheme
    • Defining an encoding scheme also defines a decoding scheme

      Value(gene) = -1 + (gene value) * (3/511)

      e.g.
      • if gene = 000000111
      • gene value = 7 (i.e. the decimal equivalent of binary 000000111)
      • Value(gene) = -1 + 7*3/511 = -0.9589041
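The decoding formula above can be sketched in a few lines of Python. The function name `decode_gene` and its parameters `lo` and `span` are illustrative assumptions, not part of the original slides; the constants -1, 3, and 511 come from the formula itself (511 being the largest value a 9-bit gene can hold).

```python
def decode_gene(bits, lo=-1.0, span=3.0):
    """Map a binary gene string to its parameter value.

    Assumes the slide's formula: Value = lo + gene_value * span / (2^n - 1).
    """
    gene_value = int(bits, 2)          # decimal equivalent of the bit string
    max_value = 2 ** len(bits) - 1     # 511 for a 9-bit gene
    return lo + gene_value * span / max_value


print(round(decode_gene("000000111"), 7))  # the slide's worked example
```

With this scheme the all-zeros gene decodes to -1 and the all-ones gene decodes to -1 + 3 = 2.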


    Generate initial population of chromosomes

    • An initial population is simply a randomly generated set of chromosomes (possible solutions)
    • Since a chromosome can be viewed as a binary string, generating a random chromosome simply means generating a random string of 1s and 0s for each chromosome of the population


    Generate initial population of chromosomes

    • IMPORTANT NOTE
      • In the flow chart, nothing is said about the size of the population. There is no algorithm for determining the size of the population, only a few rules of thumb
    • Generally
      • As the search space size increases, the population size will (should) increase
      • As the complexity of the fitness calculation function increases, the population size will (should) decrease


    Generate initial population of chromosomes

    • For binary chromosomes of fixed length n, the size of the search space is simply $2^n$
    • In this problem, the search space size is relatively small
      • Since there are only 9 bits, $2^9 = 512$.
      • Thus we will use a population size of 10
      • The initial population will be generated as 10 random binary strings, each of length 9
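A minimal sketch of generating such an initial population, assuming chromosomes are represented as Python strings of "0" and "1". The helper names `random_chromosome` and `initial_population` and the `seed` parameter are illustrative choices, not from the slides.

```python
import random


def random_chromosome(n_bits, rng):
    """Return a random binary string of length n_bits."""
    return "".join(rng.choice("01") for _ in range(n_bits))


def initial_population(pop_size, n_bits, seed=None):
    """Return pop_size random chromosomes; seed makes a run repeatable."""
    rng = random.Random(seed)
    return [random_chromosome(n_bits, rng) for _ in range(pop_size)]


# 10 random 9-bit chromosomes, as in the running example
population = initial_population(pop_size=10, n_bits=9, seed=0)
for chromosome in population:
    print(chromosome)
```

Fixing the seed is useful when documenting an experiment; reporting results over several different seeds is still needed because the algorithm is stochastic.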


    Compute Chromosome Fitness

    • If the first chromosome in the population was 000000111
      • X = 000000111 => 7

        Value(X) = -1 + 7*3/511 = -0.9589041


    [Equation: definition of the fitness function F(X); not recoverable from this transcript]


    Mate Chromosomes
    • There are several schemes used to determine which chromosomes to mate.
      • Generally, some stochastic process is used to determine which two chromosomes to mate
    • A technique called Roulette Wheel is fairly common
      • Its positive is that it is simple to implement
      • Its negative is that it may not maintain diversity in the population


    Maintaining Population Diversity
    • With roulette wheel mating, over time, as some chromosomes get more fit, they will tend to overwhelm the less fit chromosomes and thus be the only ones selected for mating.
    • In a worst-case scenario, you may end up with a single chromosome duplicated across the entire population. When this happens, the only searching that occurs will be through mutation.


    Roulette Wheel Selection
    • Compute the relative fitness of each chromosome as a fraction of the total fitness.
      • e.g. if there are 3 chromosomes of fitness C1 = 10, C2 = 20, and C3 = 30:
      • Relative fitness R1 = 10/60 = 0.167, R2 = 20/60 = 0.333, R3 = 30/60 = 0.5
      • Note this is sort of like a probability density function: the relative fitness values sum to 1
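The relative-fitness calculation, and one common way to turn it into a selection step, can be sketched as follows. This is an illustrative sketch under stated assumptions: fitness values are non-negative and larger is better, the wheel is laid out as cumulative sums of relative fitness, and the names `relative_fitness` and `roulette_select` are hypothetical.

```python
import random
from itertools import accumulate


def relative_fitness(fitnesses):
    """Each fitness as a fraction of the total fitness."""
    total = sum(fitnesses)
    return [f / total for f in fitnesses]


def roulette_select(fitnesses, rng):
    """Return the index of one chromosome, chosen with probability
    proportional to its fitness (assumes non-negative fitness values)."""
    edges = list(accumulate(relative_fitness(fitnesses)))  # slot boundaries
    spin = rng.random()                                    # point on the wheel
    for index, edge in enumerate(edges):
        if spin < edge:
            return index
    return len(fitnesses) - 1  # guard against floating-point round-off


rel = relative_fitness([10, 20, 30])
print([round(r, 3) for r in rel])

rng = random.Random(1)
print(roulette_select([10, 20, 30], rng))
```

Over many spins the third chromosome should be picked about half the time, which is exactly the pressure that can reduce diversity.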


    Roulette Wheel Selection
    • Using these three relative fitness values, now picture a roulette wheel with a 1 unit circumference which is divided as follows
      • 0


    Roulette Wheel Selection
    • With a population of 10, we need 5 mating pairs.
      • Each mating pair will generate 2 offspring
    • What if both mates are the same?
      • Select another mate for one of the duplicates
      • OR
      • Mate these two; mutation may make changes.


    Apply Mutation
    • For every bit of every chromosome, generate a random number from 0 to 1
    • If the random value is < mutation rate, then flip the bit; otherwise, leave it unchanged
    • Note that for a long chromosome (several bits), even a small mutation rate will generally change at least one bit
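The bit-by-bit mutation rule above can be sketched as follows, assuming chromosomes are strings of "0" and "1" and the mutation rate is a per-bit probability. The function name `mutate` is an illustrative choice.

```python
import random


def mutate(chromosome, rate, rng):
    """Flip each bit independently with probability `rate`."""
    mutated = []
    for bit in chromosome:
        if rng.random() < rate:                 # random value in [0, 1)
            mutated.append("1" if bit == "0" else "0")
        else:
            mutated.append(bit)                 # leave the bit unchanged
    return "".join(mutated)


rng = random.Random(42)
print(mutate("000000111", rate=0.02, rng=rng))
```

Note that one random number is drawn for every bit, which is the cost that motivates cheaper schemes for long chromosomes.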


    Question
    • If a chromosome is 40 bits in length, and the mutation rate is 0.02, what is the probability that at least one bit will change under mutation?
      • P(at least 1 bit change) = 1 - P(no changes)
      • P(no changes) = $0.98^{40} = 0.446$
      • Thus, the probability of at least one bit change is 0.554
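The same arithmetic can be checked numerically. The helper name `p_at_least_one_flip` is hypothetical; it assumes each bit mutates independently with the given per-bit rate.

```python
def p_at_least_one_flip(n_bits, rate):
    """P(at least one bit changes) = 1 - (1 - rate)^n_bits,
    assuming independent per-bit mutation."""
    return 1.0 - (1.0 - rate) ** n_bits


print(round(p_at_least_one_flip(40, 0.02), 3))
```

Trying other lengths and rates with this helper is a quick way to justify a mutation-rate choice in a write-up.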


    Perform Replacement
    • At this point we have two populations
      • Population A is the original population, i.e. the parents
      • Population B is the offspring population, i.e. the children
    • The goal is to come up with a population that
      • Is diverse
      • Is an improvement
      • At least keeps the best of population


    Perform Replacement
    • Some possible approaches
      • Combine (sort) populations A and B and then keep only the best half
      • Replace parent(s) only if child is better
      • Simply replace parents with children, but keep the best of population to date as a separate chromosome
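Two of these approaches can be sketched briefly. The sketch assumes a maximization problem and uses a stand-in fitness (the number of 1 bits) purely for illustration; the names `count_ones`, `keep_best_half`, and `replace_if_better` are hypothetical.

```python
def count_ones(chromosome):
    """Stand-in fitness for illustration only: number of 1 bits."""
    return chromosome.count("1")


def keep_best_half(parents, children, fitness):
    """Combine both populations, sort by fitness, keep the best half."""
    combined = sorted(parents + children, key=fitness, reverse=True)
    return combined[:len(parents)]


def replace_if_better(parent, child, fitness):
    """Keep the child only if it is more fit than the parent."""
    return child if fitness(child) > fitness(parent) else parent


parents = ["000", "011"]
children = ["111", "001"]
print(keep_best_half(parents, children, count_ones))
print(replace_if_better("011", "001", count_ones))
```

Both variants never lose the best chromosome, but sorting the combined population tends to fill it with near-duplicates, so diversity should be monitored.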


    Continue
    • Now simply continue by going back and computing the fitness of each chromosome (it's probably already computed, since we did that for mate selection)
    • With each new generation, continue until the stopping criteria have been met
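Putting the steps together, a whole run can be sketched as a single loop. This is a minimal sketch under several assumptions that are not from the slides: chromosomes are lists of 0/1 integers, the fitness is a stand-in (`one_max`, the number of 1 bits), mates are drawn by fitness-proportional selection via `random.choices`, crossover is single-point, replacement is generational with the best-to-date kept separately, and the stopping criterion is a fixed number of generations.

```python
import random


def one_max(chromosome):
    """Stand-in fitness for illustration only: number of 1 bits."""
    return sum(chromosome)


def run_sga(fitness, n_bits=9, pop_size=10, p_mut=0.02, generations=50, seed=0):
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    best = max(pop, key=fitness)                      # best of population to date
    for _ in range(generations):                      # stopping criterion: fixed budget
        weights = [fitness(c) + 1e-9 for c in pop]    # tiny offset avoids all-zero weights
        children = []
        while len(children) < pop_size:
            mom, dad = rng.choices(pop, weights=weights, k=2)   # roulette wheel mating
            cut = rng.randint(1, n_bits - 1)                    # single-point crossover
            for child in (mom[:cut] + dad[cut:], dad[:cut] + mom[cut:]):
                # bit-by-bit mutation: flip with probability p_mut
                children.append([b ^ 1 if rng.random() < p_mut else b for b in child])
        pop = children[:pop_size]                     # generational replacement
        best = max(pop + [best], key=fitness)         # keep best to date separately
    return best


best = run_sga(one_max)
print(best, one_max(best))
```

Every parameter default here is a placeholder; in an assignment each one needs a stated reason, and results should be reported over several seeds rather than a single run.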


    Terminology

    • A chromosome is made up of (can be divided into) genes. One can think of a gene as encoding a trait. The different possible settings for a gene are called alleles. Each gene is located (generally) at a particular locus (position) on the chromosome. The complete collection of genetic material (all chromosomes for a biological entity taken together) is called the genome. Two individuals that have the same genomes are said to be of the same genotype.


    Terminology
    • Search space: Collection of candidate solutions
    • Similarity or distance between two chromosomes
      • For binary, it's often the number of changes that must occur in A to make it the same as B (Hamming distance)
      • For non-binary, Euclidean distance or Euclidean distance squared


    $D = \sqrt{(a_1 - b_1)^2 + (a_2 - b_2)^2 + \cdots + (a_n - b_n)^2}$
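Both distance measures are short to compute. The sketch below assumes binary chromosomes are equal-length strings and non-binary chromosomes are equal-length sequences of numbers; the function names are illustrative.

```python
import math


def hamming(a, b):
    """Number of positions at which two equal-length chromosomes differ."""
    if len(a) != len(b):
        raise ValueError("chromosomes must have equal length")
    return sum(x != y for x, y in zip(a, b))


def euclidean_squared(a, b):
    """Sum of squared differences between corresponding elements."""
    if len(a) != len(b):
        raise ValueError("chromosomes must have equal length")
    return sum((x - y) ** 2 for x, y in zip(a, b))


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric chromosomes."""
    return math.sqrt(euclidean_squared(a, b))


print(hamming("000000111", "000000100"))
print(euclidean([0.0, 0.0], [3.0, 4.0]))
```

The squared form avoids the square root and gives the same ordering of distances, which is often all a diversity measure needs.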


    What's best?
    • A GA consists of
      • Encoding
      • Initial population:
        • Population generation technique
        • Population size
      • Mate selection algorithm
      • Crossover technique (single-point, two-point, uniform)
      • Mutation scheme and rate
      • Replacement scheme
      • Fitness function (max or min)
      • Stopping criteria


    What's best? Encoding
    • Generally
      • A scheme that does not allow for illegal solutions
      • A scheme that minimizes the search space size
      • Binary


    • It's all about the encoding scheme and the fitness function!


    GA Example

    Coloring a grid with a GA

    • For this problem, there are $\sim 8.47 \times 10^{11}$ (i.e. $3^{25}$) possible colorings, of which about 1000 are correct.
    • What's the probability of randomly generating a legal coloring?
      • $\sim 10^3 / 10^{11} = 10^{-8}$ => like rolling the same number on a roll of the die 10 times in a row.
    • What encoding scheme should we use?
    • What fitness function should we use?


    Other Mate Selection Schemes
    • Most of the mate selection schemes try to:
      • Reduce stochastic errors, or
      • Retain best solution(s) to date (elitist), or
      • Maintain a diverse population of solutions, or
      • Speed the mate selection process


    Mate Selection: Reduce Stochastic Errors
    • Are stochastic errors inevitable?
      • Consider 4 chromosomes with relative fitness of .25, .3, .4, and .05
      • Probabilistically, chromosome 2 (0.3 relative fitness) should be chosen 1.2 times out of the 4 choices.
      • Obviously, this can't be: how can you get .2 selections?


    Elitist Schemes

    • For some schemes, one could fail to choose the best chromosome in the population
      • How could this happen with roulette wheel selection?
      • How could this happen with remainder stochastic sampling?
        • Could not happen


    Elitist Schemes

    • Take the best of population to date and copy it into the next generation
      • Means that over time the population size will grow
    • Only replace a parent with a child if the child is more fit
      • This is hill climbing, because the average fitness of the population always increases or stays the same


    Tournament Selection for Diversity

    • A popular reproduction scheme to maintain diversity is called tournament selection. It works as follows
      • Randomly select two (or n) solutions, and the best goes into the mating pool.
      • Selection is systematic, i.e. allow each solution to be selected exactly twice
        • Best solution in population will win twice, etc.
        • Any solution will have 0, 1, or 2 copies in the pool
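The systematic two-solution tournament described above can be sketched as two shuffled passes over the population, so that every solution competes exactly twice. The sketch assumes an even population size, a maximization problem, and a stand-in fitness (number of 1 bits); the names `tournament_pool` and `count_ones` are hypothetical.

```python
import random


def count_ones(chromosome):
    """Stand-in fitness for illustration only: number of 1 bits."""
    return chromosome.count("1")


def tournament_pool(population, fitness, rng):
    """Build a mating pool with binary tournaments in which every
    solution takes part in exactly two tournaments."""
    pool = []
    for _ in range(2):                          # two passes over the population
        order = list(range(len(population)))
        rng.shuffle(order)                      # random pairing for this pass
        for i in range(0, len(order) - 1, 2):
            a = population[order[i]]
            b = population[order[i + 1]]
            pool.append(a if fitness(a) >= fitness(b) else b)
    return pool


population = ["000", "001", "011", "111"]
pool = tournament_pool(population, count_ones, random.Random(3))
print(pool)
```

With distinct fitness values the best solution wins both of its tournaments and the worst wins none, so the pool has the same size as the population with at most two copies of any solution.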


    Tournament Selection

    • Once the pool is created
      • Randomly select a pair to mate and continue until all pairs have been selected
    • Deb and Goldberg claim this technique is at least as good as any other method in terms of computational complexity and speed of convergence


    Crossover
    • Mate selection doesn't create any new solutions
      • Mate selection only creates duplicates
    • Creation of new solutions (search of solution space) is done by crossover and mutation


    Crossover

    • Single Point Crossover (SPC) with equal length chromosomes

      For a binary chromosome of length n bits, generate a random number x such that 1 < x ≤ n

      For SPC, swap bits x through n of the two chromosomes (bits are numbered 1 through n)
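A minimal sketch of single-point crossover on string chromosomes, assuming the crossover point x is an integer with 1 < x ≤ n and that bits are numbered from 1 as on the slide. The helper names `crossover_at` and `single_point_crossover` are illustrative.

```python
import random


def crossover_at(parent1, parent2, x):
    """Swap bits x through n (1-based, inclusive) of two equal-length parents."""
    cut = x - 1                                  # 0-based index of bit x
    child1 = parent1[:cut] + parent2[cut:]
    child2 = parent2[:cut] + parent1[cut:]
    return child1, child2


def single_point_crossover(parent1, parent2, rng):
    """Pick a random point x with 1 < x <= n, then swap the tails."""
    x = rng.randint(2, len(parent1))             # randint is inclusive at both ends
    return crossover_at(parent1, parent2, x)


print(crossover_at("111111", "000000", 4))
print(single_point_crossover("111111", "000000", random.Random(5)))
```

Excluding x = 1 matters: swapping bits 1 through n would simply exchange the two parents and create nothing new.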


    2-Point Crossover
    • Like single point, except select two points in each chromosome, e.g. bits 10 and 25. Then swap bits 10-25 of the two chromosomes
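The two-point variant can be sketched the same way. The sketch assumes string chromosomes, 1-based bit numbering, and that the swapped segment includes both end points; the name `two_point_crossover_at` is hypothetical.

```python
def two_point_crossover_at(parent1, parent2, first, last):
    """Swap bits first..last (1-based, inclusive) of two equal-length parents."""
    lo, hi = first - 1, last                     # slice [lo:hi] covers bits first..last
    child1 = parent1[:lo] + parent2[lo:hi] + parent1[hi:]
    child2 = parent2[:lo] + parent1[lo:hi] + parent2[hi:]
    return child1, child2


child1, child2 = two_point_crossover_at("1" * 30, "0" * 30, 10, 25)
print(child1)
print(child2)
```

In practice the two points would be drawn at random for each mating pair, in the same spirit as the single-point case.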


    Uniform Crossover
    • In uniform crossover, each bit of the children is a random sample from the parents

      For k = 1 to number of bits in chromosome
        For each bit k of the two offspring, generate a random number r
        if r


    Uniform Crossover
    • Pro
      • Better maintains diversity
        • Generally some elitist scheme is used for child replacement of parent
    • Con
      • Can tend to lose good chromosomes
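Uniform crossover can be sketched as a per-bit coin flip. The comparison threshold is not recoverable from the slides, so the 0.5 default for `p_swap` below is an assumption (a common choice), and the function name `uniform_crossover` is illustrative.

```python
import random


def uniform_crossover(parent1, parent2, rng, p_swap=0.5):
    """For each bit position, swap the parents' bits with probability p_swap.

    p_swap = 0.5 is an assumed default, not taken from the slides.
    """
    child1, child2 = [], []
    for a, b in zip(parent1, parent2):
        if rng.random() < p_swap:
            a, b = b, a                          # children exchange this bit
        child1.append(a)
        child2.append(b)
    return "".join(child1), "".join(child2)


print(uniform_crossover("111111111", "000000000", random.Random(7)))
```

Because every position is decided independently, long runs of bits from a good parent are easily broken up, which is the "con" noted above.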


    Mutation
    • Crossover does most of the searching, even though mutation does some
      • Mutation's main job is to maintain or introduce diversity
      • Sometimes the mutation rate is increased when population diversity gets low ("stir things up")


    Mutation

    • Mutation is generally performed after crossover, bit-by-bit, on each of the new chromosomes.

      Given: chromosome length = n and probability of mutation p_m

      For I = 1 to n
        generate a random number X
        if X


    Mutation

    • A problem with the bit-by-bit mutation method is that it requires the generation of a random number for every bit of every chromosome
    • Goldberg (1989) suggested a mutation clock operator to reduce the complexity of the mutation operation
      • After the first bit is mutated, the location of the next bit to mutate is created as a random function, i.e. skip some number of bits
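One way to realize the "skip some number of bits" idea is to draw the gap to the next mutated bit from a geometric distribution, which gives the same per-bit flip probability as the bit-by-bit method while drawing only about one random number per mutation. The slides do not specify the random function, so the geometric gap below is an assumption and may differ in detail from Goldberg's formulation; the name `mutation_clock` is hypothetical, and chromosomes are assumed to be lists of 0/1 integers.

```python
import math
import random


def mutation_clock(bits, rate, rng):
    """Flip bits of a 0/1 list, jumping directly from one mutated bit to
    the next; gaps are geometric with parameter `rate` (0 < rate < 1)."""
    if not 0.0 < rate < 1.0:
        raise ValueError("rate must be strictly between 0 and 1")
    mutated = list(bits)
    position = -1
    while True:
        u = rng.random()                                   # u in [0, 1)
        gap = int(math.log(1.0 - u) / math.log(1.0 - rate)) + 1   # gap >= 1
        position += gap
        if position >= len(mutated):
            break
        mutated[position] ^= 1                             # flip this bit
    return mutated


result = mutation_clock([0] * 100000, rate=0.02, rng=random.Random(11))
print(sum(result) / len(result))   # fraction of flipped bits, close to the rate
```

For a whole population, the same clock could be run over the chromosomes laid end to end, carrying the leftover gap from one chromosome into the next.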


    Mutation
    • A question that sometimes arises is what happens to mutation if, rather than a binary chromosome, we have, say, a ternary chromosome?
      • Each chromosome element is a 0, 1, or 2
      • We can't simply complement
      • Can we mutate, say, a 2 to a 2 now?
      • What's the difference between randomly choosing any one of the three possible values versus only one of the two different values?


    Next Generation

    • Once a complete generation has been created, the old population must in some way be merged with the new.
    • To replace all of the old with the new is called generational replacement
      • Remember, this replacement is not done until a new population is created. Don't replace old chromosomes as new ones are generated.
    • The other replacement schemes utilize some form of elitism, as we have already discussed