Upper Bounds on the Time and Space Complexity of Optimizing Additively Separable Functions

Post on 14-Jan-2016


Matthew J. Streeter

Carnegie Mellon University

Pittsburgh, PA

matts@cs.cmu.edu

Outline

• Introduction

• Definitions & notation

• Detecting linkage

• Algorithm, analysis & performance

• Conclusions

Introduction

• An additively separable function f of order k is one that can be expressed as:

$$f(s) = \sum_{i} f_i(s)$$

where each $f_i$ depends on at most k characters of s, and each character contributes to at most one $f_i$.

• Studied extensively in the EC literature, particularly in relation to competent GAs.

Introduction

David Goldberg, The Design of Innovation (p. 51-2)

“[W]e would like a procedure that scales polynomially, as $O(j^b)$ with b as small a number as possible (current estimates of b suggest that subquadratic—b≤2—solutions are possible).”

Introduction

• Previous bound: time $O(j^2)$; space $O(1)$ (Munetomo & Goldberg 1999)

• New bound: time $O(j \ln j)$; space $O(1)$

Definitions & notation

• $s_i$ = the i-th character of binary string s

• $s[i \leftarrow c]$ = a copy of s with $s_i$ set to c

• $\Delta f_i(s) = f(s[i \leftarrow \neg s_i]) - f(s)$ = the effect on fitness of flipping the i-th bit (Munetomo & Goldberg 1999)
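The notation above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: strings are tuples of 0/1, positions are 0-based (the slides count from 1), and the names `with_char`, `delta`, and `onemax` are hypothetical helpers, not taken from the slides.

```python
def with_char(s, i, c):
    """Return s[i <- c]: a copy of s with position i set to c."""
    return s[:i] + (c,) + s[i + 1:]


def delta(f, s, i):
    """Return Delta f_i(s): the change in f caused by flipping bit i of s."""
    return f(with_char(s, i, 1 - s[i])) - f(s)


def onemax(s):
    # Hypothetical fitness function used only for illustration: count of ones.
    return sum(s)


s = (0, 1, 1, 0)
print(with_char(s, 0, 1))   # (1, 1, 1, 0)
print(delta(onemax, s, 0))  # 1  (flipping a 0 to 1 adds one)
print(delta(onemax, s, 1))  # -1 (flipping a 1 to 0 removes one)
```

Each call to `delta` costs two evaluations of f, which is why the pairwise linkage test discussed later needs four.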

Definitions & notation

• Linkage: positions i and j are linked, written $\lambda(i, j)$, if there is some string s such that:

$\Delta f_i(s[j \leftarrow 0]) \neq \Delta f_i(s[j \leftarrow 1])$

• Grouping: i and j are grouped, written $\gamma(i, j)$, if i = j or if there is some sequence $i_0, i_1, \ldots, i_n$ such that:

$i_0 = i$

$i_n = j$

$\lambda(i_m, i_{m+1})$ for $0 \leq m < n$

Definitions & notation

• Linkage group: a non-empty set g such that if $i \in g$ then $j \in g$ if and only if i and j are grouped, $\gamma(i, j)$

• Linkage group partition: the unique set $\pi_f = \{g_1, g_2, \ldots, g_n\}$ of linkage groups of f.

Example

$f(s) = s_1 s_2 + s_2 s_3 + s_4 s_5$

• Linkage: $\lambda(1, 2)$, $\lambda(2, 3)$, $\lambda(4, 5)$, and symmetrically $\lambda(2, 1)$, $\lambda(3, 2)$, $\lambda(5, 4)$

• Linkage groups: {1,2,3} and {4,5}, so $\pi_f$ = {{1,2,3}, {4,5}}

• f is an additively separable function of order 3
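For a function this small, the linkage groups can be checked by brute force. The sketch below assumes 0-based indices internally (reported 1-based to match the slides) and enumerates all 2^5 strings, so it is a check of the definitions rather than the efficient method of the talk; `linked` and `linkage_partition` are illustrative names.

```python
from itertools import product


def f(s):
    # The example function, with s_1..s_5 stored at indices 0..4.
    return s[0] * s[1] + s[1] * s[2] + s[3] * s[4]


def delta(f, s, i):
    flipped = s[:i] + (1 - s[i],) + s[i + 1:]
    return f(flipped) - f(s)


def linked(f, n, i, j):
    """True if some string s has delta_i(s[j<-0]) != delta_i(s[j<-1])."""
    for s in product((0, 1), repeat=n):
        s0 = s[:j] + (0,) + s[j + 1:]
        s1 = s[:j] + (1,) + s[j + 1:]
        if delta(f, s0, i) != delta(f, s1, i):
            return True
    return False


def linkage_partition(f, n):
    """Group positions by the transitive closure of the linkage relation."""
    label = list(range(n))  # label[i] identifies the group of position i
    for i in range(n):
        for j in range(n):
            if i != j and label[i] != label[j] and linked(f, n, i, j):
                old, new = label[j], label[i]
                label = [new if x == old else x for x in label]
    groups = {}
    for pos, x in enumerate(label):
        groups.setdefault(x, []).append(pos + 1)  # report 1-based positions
    return sorted(groups.values())


print(linkage_partition(f, 5))  # [[1, 2, 3], [4, 5]]
```

Positions 1 and 3 are not directly linked (flipping bit 1 changes f by an amount that depends only on bit 2), but they end up in the same group through the chain 1, 2, 3.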

Relationship to additively separable functions

• If $\pi_f = \{g_1, g_2, \ldots, g_n\}$ then f can be written as:

$$f(s) = \sum_{i=1}^{n} f_i(s)$$

where each $f_i$ depends only on the positions in $g_i$. So f is additively separable of order $k = \max_{1 \leq i \leq n} |g_i|$.

• So once we know $\pi_f$, we can find the global optimum of f in time $O(2^k j)$ by local search.
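One way to read the local-search step is: for each linkage group, try all 2^|g| settings of its positions while holding the rest fixed, and keep the best. The sketch below assumes maximization, 0-based positions, and a partition that is already known; `optimize_by_groups` is a hypothetical name.

```python
from itertools import product


def optimize_by_groups(f, n, partition):
    """Maximize f over {0,1}^n given its linkage group partition.

    partition is a list of lists of 0-based positions. Each group is
    optimized independently, which is valid because the groups do not
    interact. Total cost is sum(2^|g|) <= 2^k * n evaluations of f.
    """
    s = [0] * n  # arbitrary starting string
    for g in partition:
        best_setting, best_value = None, None
        for setting in product((0, 1), repeat=len(g)):
            for pos, bit in zip(g, setting):
                s[pos] = bit
            value = f(tuple(s))
            if best_value is None or value > best_value:
                best_setting, best_value = setting, value
        for pos, bit in zip(g, best_setting):
            s[pos] = bit
    return tuple(s)


def f(s):
    return s[0] * s[1] + s[1] * s[2] + s[3] * s[4]


best = optimize_by_groups(f, 5, [[0, 1, 2], [3, 4]])
print(best, f(best))  # (1, 1, 1, 1, 1) 3
```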

Algorithm overview

• Start with a random string s and the trivial linkage group partition $\pi$ = {{1}, {2}, ..., {j}}.

• Repeatedly perform a randomized test to detect pairs of positions that are linked.

• Every time we find a new link between positions i and j, merge the subsets of $\pi$ containing i and j to form a new subset g', and use local search to make s optimal w.r.t. g'.

• Once $\pi = \pi_f$, we will have found a globally optimal string.

Detecting linkage: $O(j^2)$ approach

• For fixed i and j, pick a random string s and check if $\Delta f_i(s[j \leftarrow 0]) \neq \Delta f_i(s[j \leftarrow 1])$.

• Test requires 4 function evaluations, and is conclusive with probability at least $2^{-k}$.

• Leads to an algorithm that requires $O(2^k j^2)$ function evaluations. (Munetomo & Goldberg 1999)
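A single pairwise test might look like the following sketch. It assumes 0-based positions and a seeded `random.Random` instance passed in as `rng`; `pairwise_test` is an illustrative name, and a `False` result means only that the test was inconclusive.

```python
import random


def delta(f, s, i):
    flipped = s[:i] + (1 - s[i],) + s[i + 1:]
    return f(flipped) - f(s)


def pairwise_test(f, n, i, j, rng):
    """One randomized test for a link between positions i and j.

    Returns True only if a link is demonstrated. Uses four evaluations
    of f (two per call to delta).
    """
    s = tuple(rng.randint(0, 1) for _ in range(n))
    s0 = s[:j] + (0,) + s[j + 1:]
    s1 = s[:j] + (1,) + s[j + 1:]
    return delta(f, s0, i) != delta(f, s1, i)


def f(s):
    return s[0] * s[1] + s[1] * s[2] + s[3] * s[4]


rng = random.Random(0)
# Positions 1 and 2 (indices 0 and 1) are linked; positions 1 and 4
# (indices 0 and 3) are not, so no test on that pair can ever succeed.
found_12 = any(pairwise_test(f, 5, 0, 1, rng) for _ in range(50))
found_14 = any(pairwise_test(f, 5, 0, 3, rng) for _ in range(50))
print(found_12, found_14)  # True False
```

Running this test over all pairs of positions is what gives the quadratic dependence on j.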

Detecting linkage: O(j*ln(j)) approach

• For fixed i, generate two random strings s and t that have the same character at i, and check whether $\Delta f_i(s) \neq \Delta f_i(t)$.

• Suppose $\Delta f_i(s) \neq \Delta f_i(t)$. Let d be the Hamming distance from s to t.

– If d = 1, call the position that differs j; then i and j are linked by definition.

– Otherwise create a string s' that lies between s and t, differing from each in about d/2 positions. We must have either $\Delta f_i(s') \neq \Delta f_i(s)$ or $\Delta f_i(s') \neq \Delta f_i(t)$, so just recurse on that pair until we get d = 1.

Example

$f(s) = s_1 s_2 + s_2 s_3 + s_4 s_5$, i = 2

Iteration 1

string    value    $\Delta f_2$
s(1)      00000    0
t(1)      10111    2
s(1)'     00011    0

Iteration 2

string    value    $\Delta f_2$
s(2)      00011    0
t(2)      10111    2
s(2)'     00111    1

Iteration 3

string    value    $\Delta f_2$
s(3)      00111    1
t(3)      10111    2

Conclusion: positions 1 and 2 are linked, $\lambda(1, 2)$
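The binary search can be sketched as below. The rule for building s' (copy the second half of the differing positions from t into s) and the rule for choosing which pair to keep are assumptions picked so that the run reproduces the trace above; other splits would work equally well. Positions are 0-based and `find_link` is a hypothetical name.

```python
def delta(f, s, i):
    flipped = s[:i] + (1 - s[i],) + s[i + 1:]
    return f(flipped) - f(s)


def find_link(f, i, s, t):
    """Given s[i] == t[i] and delta_i(s) != delta_i(t), return a position j
    linked to i by repeatedly halving the Hamming distance between s and t."""
    assert s[i] == t[i] and delta(f, s, i) != delta(f, t, i)
    while True:
        diff = [p for p in range(len(s)) if s[p] != t[p]]
        if len(diff) == 1:
            return diff[0]
        # Build s' by copying the second half of the differing positions
        # from t into s.
        mid = list(s)
        for p in diff[len(diff) // 2:]:
            mid[p] = t[p]
        mid = tuple(mid)
        if delta(f, mid, i) != delta(f, t, i):
            s = mid  # s' and t still disagree on delta_i
        else:
            t = mid  # otherwise s and s' must disagree


def f(s):
    return s[0] * s[1] + s[1] * s[2] + s[3] * s[4]


# The worked example: i = 2 (index 1), s = 00000, t = 10111.
j = find_link(f, 1, (0, 0, 0, 0, 0), (1, 0, 1, 1, 1))
print(j + 1)  # 1, i.e. position 1 is linked to position 2
```

The Hamming distance shrinks by roughly half per iteration, so a conclusive test costs a number of evaluations logarithmic in the string length.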

How to use this?

• Some links are more “obvious” than others

• Ideally we would like to only discover novel links (those that let us update the current partition $\pi$)

• We would like the test to be conclusive with probability at least $2^{-k}$

Discovering novel links

• Binary search starting at s and t will always return a position j where s and t disagree ($s_j \neq t_j$)

• So, when looking for a link from i, just make sure s and t agree on all the positions in whichever subset in the current partition $\pi$ contains i
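A minimal sketch of how such a pair of strings could be drawn, assuming 0-based positions and a subset g given as a Python set; `sample_pair` is an illustrative name.

```python
import random


def sample_pair(n, g, rng):
    """Return random strings s, t of length n that agree on every position in g."""
    s = [rng.randint(0, 1) for _ in range(n)]
    t = [s[p] if p in g else rng.randint(0, 1) for p in range(n)]
    return tuple(s), tuple(t)


rng = random.Random(1)
g = {0, 1}  # hypothetical current subset containing the position of interest
s, t = sample_pair(5, g, rng)
print(all(s[p] == t[p] for p in g))  # True
```

Because the position i itself belongs to g, the two strings automatically share the same character at i, and any position the binary search returns lies outside g, so the link it reports always enlarges the subset.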

Probability that test is conclusive

• Let g be the subset in $\pi$ that contains i, and let $g_f$ be i's true linkage group ($g \subseteq g_f$)

• Can show that with probability at least $2^{-|g|}$, we will choose an s such that for some t, $\Delta f_i(t) \neq \Delta f_i(s)$

• Because there are only $|g_f| - |g|$ positions left in t that affect $\Delta f_i(t)$, the test will be conclusive with probability at least:

$$2^{-|g|} \cdot 2^{-(|g_f| - |g|)} = 2^{-|g_f|} \geq 2^{-k}$$

Finding $\pi_f$

• On each iteration, let i run from 1 to j and perform test for link out of i
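Putting the pieces together, the overall loop might be sketched as follows. This is a reading of the slides under several assumptions, not the author's implementation: f is maximized, positions are 0-based, the number of rounds is supplied by the caller, and an initial pass optimizes each singleton subset so that positions never merged are still set optimally. It also stores the partition explicitly for clarity, so it does not attempt to meet the O(1) space bound. All function names are hypothetical.

```python
import random
from itertools import product


def delta(f, s, i):
    flipped = s[:i] + (1 - s[i],) + s[i + 1:]
    return f(flipped) - f(s)


def find_link(f, i, s, t):
    # Binary search for a position linked to i; needs delta_i(s) != delta_i(t).
    while True:
        diff = [p for p in range(len(s)) if s[p] != t[p]]
        if len(diff) == 1:
            return diff[0]
        mid = list(s)
        for p in diff[len(diff) // 2:]:
            mid[p] = t[p]
        mid = tuple(mid)
        if delta(f, mid, i) != delta(f, t, i):
            s = mid
        else:
            t = mid


def optimize_group(f, s, g):
    """Return s with the positions in g set to their jointly best values."""
    g = sorted(g)
    cand, top, top_value = list(s), None, None
    for setting in product((0, 1), repeat=len(g)):
        for p, bit in zip(g, setting):
            cand[p] = bit
        value = f(tuple(cand))
        if top is None or value > top_value:
            top, top_value = tuple(cand), value
    return top


def optimize(f, n, rounds, rng):
    group = {i: {i} for i in range(n)}  # position -> its current subset
    best = tuple(rng.randint(0, 1) for _ in range(n))
    for i in range(n):
        best = optimize_group(f, best, group[i])
    for _ in range(rounds):
        for i in range(n):
            g = group[i]
            s = tuple(rng.randint(0, 1) for _ in range(n))
            t = tuple(s[p] if p in g else rng.randint(0, 1) for p in range(n))
            if delta(f, s, i) == delta(f, t, i):
                continue  # inconclusive test
            merged = g | group[find_link(f, i, s, t)]
            for p in merged:
                group[p] = merged
            best = optimize_group(f, best, merged)
    partition = sorted(sorted(g) for g in {frozenset(g) for g in group.values()})
    return best, partition


def f(s):
    return s[0] * s[1] + s[1] * s[2] + s[3] * s[4]


best, partition = optimize(f, 5, rounds=200, rng=random.Random(0))
print(best, partition)
```

On the running example, 200 rounds is far more than needed, so the call should return the all-ones string together with the groups {1,2,3} and {4,5} (printed 0-based) except with negligible probability.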

Analysis

• Each conclusive test requires time O(ln(j)) and we do at most j-1 of them, so total time is O(j*ln(j))

• Time for local search is $O(2^k j)$

• Only thing left is time for inconclusive tests, each of which is $O(1)$.

• Need to know the number t of rounds needed to discover the correct partition $\pi_f$ with probability ≥ p

Calculating number of rounds

• To discover $\pi_f$ it is sufficient that each position participate in 1 conclusive test

• If this hasn't happened yet for a given position, it happens with probability at least $2^{-k}$

• This means we get a lower bound on the probability of success by analyzing the following algorithm:

    for n from 1 to t do:
        for i from 1 to j do:
            with probability 2^{-k}, mark i
        if all positions are marked, return 'success'
    return 'failure'
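The simplified marking process is easy to simulate. The sketch below is a runnable version of the pseudocode, with a Monte Carlo estimate compared against the closed form $(1 - (1 - 2^{-k})^t)^j$; the parameter values (t = 10, j = 20, k = 2) and the name `marking_process` are chosen only for illustration.

```python
import random


def marking_process(t, j, k, rng):
    """Run t rounds; in each round every position is marked independently
    with probability 2^-k. Return True if all j positions end up marked."""
    marked = [False] * j
    for _ in range(t):
        for i in range(j):
            if rng.random() < 2.0 ** -k:
                marked[i] = True
        if all(marked):
            return True
    return False


rng = random.Random(0)
trials = 2000
wins = sum(marking_process(t=10, j=20, k=2, rng=rng) for _ in range(trials))
exact = (1 - (1 - 2.0 ** -2) ** 10) ** 20
print(round(exact, 3))   # 0.314
print(wins / trials)     # an estimate that should be close to the value above
```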

Calculating number of rounds

• The probability that this algorithm succeeds is $p = \left(1 - (1 - 2^{-k})^t\right)^j$

• To succeed with probability p, we must set

$$t \geq \frac{\ln\left(1 - p^{1/j}\right)}{\ln\left(1 - 2^{-k}\right)}$$

• Some calculus shows that this is $O(2^k \ln j)$
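The bound on t can be evaluated numerically to see the logarithmic growth in j. The following sketch assumes p = 0.99 and k = 5 purely as sample values; `rounds_needed` and `success_probability` are illustrative names.

```python
import math


def rounds_needed(p, j, k):
    """Smallest integer t (up to floating-point rounding) such that
    (1 - (1 - 2^-k)^t)^j >= p."""
    q = 2.0 ** -k
    return math.ceil(math.log(1.0 - p ** (1.0 / j)) / math.log(1.0 - q))


def success_probability(t, j, k):
    return (1.0 - (1.0 - 2.0 ** -k) ** t) ** j


for j in (10, 100, 1000, 10000):
    t = rounds_needed(0.99, j, k=5)
    print(j, t, success_probability(t, j, 5) >= 0.99)
```

Each tenfold increase in j adds a roughly constant number of rounds, which is the behavior the $O(2^k \ln j)$ estimate predicts.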

Performance

• Near-linear scaling as expected

• Have solved problems of up to 100,000 bits with this algorithm

• Ran on folded trap functions with k=5, 5 ≤ j ≤ 1000


Limitations

• Function must be strictly additively separable

• On any real problem, this algorithm will become an exhaustive search

• Can start to address this using averaging and thresholding (Munetomo & Goldberg 1999)

• I believe these limitations can be overcome

Conclusions

• New upper bounds on time complexity ($O(2^k j \ln j)$) and space complexity ($O(1)$) of optimizing additively separable functions

• Algorithm not practical as-is

• Linkage detection algorithm presented here could be used to construct more powerful competent GAs