1 Reasons for parallelization Can we make GA faster? One of the most promising choices is to use...

Post on 16-Jan-2016



1

Reasons for parallelization

Can we make GA faster? One of the most promising choices is to use parallel implementations.

The reasons for parallelization:
1) The nature of the problem
2) The nature of GA itself

2

A classification of parallel GA

Basic idea: divide-and-conquer.

1) Global parallelization: only one population; the behavior of the algorithm remains unchanged; easy to implement

2) Coarse-grained parallel GA: the population is divided into multiple subpopulations; each subpopulation evolves in isolation and exchanges individuals occasionally

3) Fine-grained parallel GA: the ideal case is to have just one individual for every processing element

4) Hybrid parallel GA

3

Global parallelization

1) Initialization

2) Repeat the following steps

2.1) Selection

2.2) Crossover

2.3) Mutation

2.4) Calculate the fitness

4

Is there any difference?

1) Initialization
2) Repeat the following steps
2.1) Selection
2.2) Crossover
2.3) Mutation
2.4) Calculate the fitness:
     for i=1 to N par_do
         calculate the fitness of the i-th individual
     endfor
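The only change from the serial GA is step 2.4. A minimal Python sketch of that step, assuming a OneMax-style fitness and using a thread pool to stand in for the processors (a CPU-heavy fitness would use processes or MPI in practice):

```python
import random
from concurrent.futures import ThreadPoolExecutor

random.seed(0)

def fitness(ind):
    # Hypothetical fitness: count of 1-bits (the OneMax problem)
    return sum(ind)

population = [[random.randint(0, 1) for _ in range(20)] for _ in range(8)]

# Serial loop: for i = 1 to N, calculate the fitness of the i-th individual
serial = [fitness(ind) for ind in population]

# Parallel version of the same loop; map() preserves the input order,
# so the algorithm's behaviour is unchanged
with ThreadPoolExecutor(max_workers=4) as pool:
    parallel = list(pool.map(fitness, population))
```

Because `map` keeps the results in population order, `serial` and `parallel` are identical lists; only the wall-clock time of the evaluation step changes.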

5

The basic characteristics

This method maintains a single population and the evaluation of the individuals is done in parallel.

Each individual competes with all the other chromosomes and also has a chance to mate with any other individual.

The genetic operations are still global.

Communication occurs only as each processor receives its subset of individuals to evaluate and when the processors return the fitness values.

6

Implementation

The model does not assume anything about the underlying computer architecture.

On a shared memory multiprocessor, the population can be stored in shared memory and each processor can read the individuals assigned to it and write the evaluation results back without any conflicts. It may be necessary to balance the computational load among the processors.

On a distributed memory computer, the population can be stored in one processor. This “master” processor will be responsible for sending the individuals to the other processors (the slaves) for evaluation, collecting the results, and applying the genetic operators to produce the next generation.

7

The genetic operators

Crossover and mutation can be parallelized using the same idea of partitioning the population and distributing the work among multiple processors.

However, these operators are so simple that it is very likely that the time required to send individuals back and forth will offset any performance gains.

The communication overhead is also a problem when selection is parallelized, because most forms of selection need information about the entire population and thus require some communication.

8

Conclusion

Global parallel GA is easy to implement and it can be a very efficient method of parallelization when the evaluation needs considerable computations.

This method also has the advantage of preserving the search behavior of the GA, so all the theory for the simple GA can be applied directly.

Its main drawback is a potentially unbalanced load among the processors.

9

Coarse grained parallel GA

1) Initialization; divide all the individuals into p subpopulations

2) for i=1 to p par-do
   2.1) for j=1 to n do
        selection, crossover, mutation; calculate the fitness
   2.2) select some individuals as emigrants
   2.3) send emigrants and receive immigrants
3) Go to 2)
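The steps above can be sketched as a sequential simulation of the island model. All constants and the OneMax fitness are illustrative assumptions; in a real coarse-grained PGA each island would evolve on its own processor:

```python
import random

random.seed(1)
GENOME, POP, ISLANDS, MIGRANTS = 16, 10, 4, 2

def fitness(ind):
    # Hypothetical fitness: count of 1-bits (OneMax)
    return sum(ind)

def evolve(pop):
    # One generation of a toy serial GA: binary tournament + bit-flip mutation
    new = []
    for _ in range(len(pop)):
        a, b = random.sample(pop, 2)
        child = list(max(a, b, key=fitness))
        child[random.randrange(GENOME)] ^= 1
        new.append(child)
    return new

islands = [[[random.randint(0, 1) for _ in range(GENOME)]
            for _ in range(POP)] for _ in range(ISLANDS)]

for gen in range(20):
    islands = [evolve(pop) for pop in islands]        # step 2.1, par-do in a real PGA
    if gen % 5 == 4:                                  # steps 2.2-2.3: migrate occasionally
        emigrants = [sorted(pop, key=fitness)[-MIGRANTS:] for pop in islands]
        for i, pop in enumerate(islands):             # ring topology: receive from the left
            incoming = [list(m) for m in emigrants[(i - 1) % ISLANDS]]
            pop.sort(key=fitness)
            pop[:MIGRANTS] = incoming                 # immigrants replace the worst
```

The ring topology, the migration interval of five generations, and the "best emigrate, worst are replaced" policies are all design choices discussed on the following slides, not requirements of the model.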

10

The basic characteristics

Coarse-grained GA seems like a simple extension of the serial GA. The recipe is simple: take a few conventional (serial) GAs, run each of them on a node of a parallel computer, and at some predetermined times exchange a few individuals.

Coarse-grain parallel computers are easily available, and even if no parallel computer is available it is easy to simulate one with a network of workstations or even on a single-processor machine.

There is relatively little extra effort needed to convert a serial GA into a coarse-grained parallel GA. Most of the program of the serial GA remains the same and only a few subroutines need to be added to implement migration.

11

The basic characteristics

A strong capability for avoiding premature convergence while still exploiting good individuals, provided the migration rates and patterns are well chosen.

12

Migrant Selection Policy

Who should migrate?
- The best individual?
- One random individual?
- The best plus some random individuals?
- An individual very different from the best of the receiving subpopulation ("similarity reduction")?

If a large percentage of the population migrates each generation, the system acts like one big population with extra replacements, which could actually SPEED premature convergence.
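The four policies can be sketched in a few lines. The 8-bit genome, the OneMax fitness, and the receiving island's best individual are all illustrative assumptions:

```python
import random

random.seed(2)

def fitness(ind):
    # Hypothetical fitness: count of 1-bits
    return sum(ind)

def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))

pop = [[random.randint(0, 1) for _ in range(8)] for _ in range(10)]

best = max(pop, key=fitness)                 # policy 1: the best individual
rand_one = random.choice(pop)                # policy 2: one random individual
party = [best] + random.sample(pop, 2)       # policy 3: best plus some random ones

# Policy 4 ("similarity reduction"): the emigrant most different, in the
# Hamming sense, from the receiving subpopulation's best (assumed here)
recv_best = [1] * 8
most_different = max(pop, key=lambda ind: hamming(ind, recv_best))
```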

13

Migrant Replacement Policy

Who should a migrant replace?

Random individual?

Worst individual?

Most similar individual (in the Hamming sense)?

Similar individual via crowding?
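A sketch of how the first three replacement policies pick their victim, with an assumed incoming migrant and OneMax-style fitness:

```python
import random

random.seed(3)

def fitness(ind):
    # Hypothetical fitness: count of 1-bits
    return sum(ind)

def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))

pop = [[random.randint(0, 1) for _ in range(8)] for _ in range(6)]
migrant = [1, 1, 1, 1, 0, 0, 0, 0]           # assumed incoming individual

victim = random.randrange(len(pop))                                     # random individual
worst = min(range(len(pop)), key=lambda i: fitness(pop[i]))             # worst individual
similar = min(range(len(pop)), key=lambda i: hamming(pop[i], migrant))  # most similar

pop[worst] = migrant                         # apply the "replace worst" policy
```

Replacing the worst maximizes short-term selection pressure, while replacing the most similar individual tends to preserve more diversity in the receiving subpopulation.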

14

How Many Subpopulations?

How many total evaluations can you afford? Total population size, number of generations, and the "generation gap" determine run time.

What should the minimum subpopulation size be? Smaller than 40-50 USUALLY spells trouble (rapid convergence of the subpopulation); 100-200+ is better for some problems.

Divide to get how many subpopulations you can afford.
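The arithmetic behind "divide to get how many subpopulations you can afford", using purely illustrative numbers:

```python
# All numbers below are illustrative assumptions, not values from the slides.
budget = 100_000                       # total fitness evaluations we can afford
generations = 100                      # planned number of generations
total_pop = budget // generations      # overall population size: 1000

min_subpop = 100                       # below ~40-50 usually converges too fast
n_subpops = total_pop // min_subpop    # -> 10 subpopulations
```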

15

Fine-grained parallel GA

1) Partition the initial population of N individuals onto N processors;

2) for i=1 to N par-do
   2.1) Each processor selects one individual from itself and its neighbours
   2.2) Crossover with one individual from a neighbour; retain one offspring
   2.3) Mutation
   2.4) Calculate the fitness
3) Go to 2)
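A sequential simulation of these steps on a small 2-D grid. The grid size, the OneMax fitness, and the torus neighbourhood are illustrative assumptions; in a real fine-grained PGA each cell would be a processing element:

```python
import random

random.seed(4)
SIDE, GENOME = 4, 12      # a 4x4 grid: one individual per processing element

def fitness(ind):
    # Hypothetical fitness: count of 1-bits (OneMax)
    return sum(ind)

grid = [[[random.randint(0, 1) for _ in range(GENOME)]
         for _ in range(SIDE)] for _ in range(SIDE)]

def neighbours(r, c):
    # 4-neighbourhood on a torus, mirroring a 2-D processor mesh
    return [grid[(r - 1) % SIDE][c], grid[(r + 1) % SIDE][c],
            grid[r][(c - 1) % SIDE], grid[r][(c + 1) % SIDE]]

def step():
    new = [[None] * SIDE for _ in range(SIDE)]
    for r in range(SIDE):
        for c in range(SIDE):                          # par-do across cells in a real PGA
            mate = max(neighbours(r, c), key=fitness)  # 2.1: select from the neighbourhood
            cut = random.randrange(1, GENOME)
            child = grid[r][c][:cut] + mate[cut:]      # 2.2: crossover, keep one offspring
            child[random.randrange(GENOME)] ^= 1       # 2.3: mutation
            new[r][c] = child
    return new

for _ in range(5):
    grid = step()
```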

16

The basic characteristics

The largest possibility of parallelization.

There is intensive communication between the processors.

It is common to place the individuals in a 2-D grid, because in many massively parallel computers the processing elements are connected using this topology.

17

Hybrid parallel algorithms

Combining these methods of parallelizing GAs results in hybrid parallel GAs.

18

Some examples

This hybrid GA combines a coarse-grained GA (at the high level) and a fine-grained GA (at the low level)

19

Some examples

This hybrid GA combines a coarse-grained GA at the high level where each node is a global parallel GA

20

Some examples

This hybrid uses coarse-grained GAs at both the high and low levels. At the low level the migration rate is higher and the communication topology is much denser than at the high level.

21

Network model

Here, k independent GAs run with independent memories, operators and function evaluations. At each generation, the best individuals discovered are broadcast to all the sub-populations.

22

Community model

Here, the GA is mapped to a set of interconnected communities, consisting of a set of homes connected to a centralised town.

Reproduction and function evaluations take place at home. Offspring are sent to town to find mates. After mating, "new couples" are assigned a home either in their existing community or in another community.

23

Why introduce parallel GAs (PGAs)?

They allow a more extensive coverage of the search space and an increased probability of finding the global optimum.

They could also be used for multi-objective optimisation, with each sub-population responsible for a specific objective, and co-evolution, with each sub-population responsible for a specific trait.