Last class
• Multi-objective optimization
• NSGA-II, SMS-EMOA, MOEA/D (popular variants of MOEAs)

Heuristic Search and Evolutionary Algorithms
Chao Qian (钱超)
Associate Professor, Nanjing University, China
Email: [email protected]
Homepage: http://www.lamda.nju.edu.cn/qianc/
Lecture 12: Evolutionary Algorithms for Constrained Optimization
Constrained optimization

General formulation:

$\min_{x \in \mathcal{X}} f(x)$
s.t. $g_i(x) = 0, \ 1 \le i \le q$;
$h_i(x) \le 0, \ q + 1 \le i \le m$

where $f$ is the objective function, the $g_i$ are the equality constraints, and the $h_i$ are the inequality constraints.

The goal: find a feasible solution minimizing the objective $f$.
A solution is (in)feasible if it does (not) satisfy the constraints.
Example – knapsack

Knapsack problem: given $n$ items, each with a weight $w_i$ and a value $v_i$, select a subset of items maximizing the sum of values while keeping the summed weights within some capacity $W_{\max}$:

$\arg\max_{x \in \{0,1\}^n} \sum_{i=1}^n v_i x_i \quad \text{s.t.} \quad \sum_{i=1}^n w_i x_i \le W_{\max}$

where $x_i = 1$ means the $i$-th item is included; the sum of values is the objective function, and the capacity condition is the constraint.
Example – minimum spanning tree

Minimum spanning tree problem: given an undirected connected graph $G = (V, E)$ on $n$ vertices and $m$ edges with positive weights $w: E \to \mathbb{R}^+$, find a connected subgraph $E' \subseteq E$ with the minimum weight:

$\arg\min_{x \in \{0,1\}^m} \sum_{i=1}^m w_i x_i \quad \text{s.t.} \quad c(x) = 1$

where $x_i = 1$ means the $i$-th edge is selected, and $c(x)$ is the number of connected components; the weight sum is the objective function, and $c(x) = 1$ is the constraint.
Example – traveling salesman

Traveling salesman problem: given an undirected connected graph $G = (V, E)$ on $n$ vertices and $m$ edges with positive weights $w: E \to \mathbb{R}^+$, find a Hamiltonian cycle with the minimum weight:

$\arg\min_x w(x) \quad \text{s.t.} \quad x \text{ is a Hamiltonian cycle}$

Objective function: the sum of the edge weights on the cycle.
Constraint: visit each vertex exactly once, starting and ending at the same vertex.
EAs for constrained optimization

The optimization problems in real-world applications often come with constraints. How to deal with constraints when EAs are used for constrained optimization?
Constraint handling strategies

The final output solution must satisfy the constraints.

Common constraint handling strategies:
• Penalty functions
• Repair functions
• Restricting search to the feasible region
• Decoder functions
Penalty functions

Penalty functions: add penalties on the fitness of infeasible solutions, turning the constrained problem
$\min f(x)$ s.t. $g_i(x) = 0,\ 1 \le i \le q$; $h_i(x) \le 0,\ q + 1 \le i \le m$
into the unconstrained problem
$\min f(x) + \sum_{i=1}^m \lambda_i \cdot f_i(x)$
where $f_i(x)$ is the violation degree of the $i$-th constraint:
$f_i(x) = |g_i(x)|$ for $1 \le i \le q$, and $f_i(x) = \max\{0, h_i(x)\}$ for $q + 1 \le i \le m$.
Penalty functions

Requirement: the optimal solutions of the original and the transformed problem should be consistent.
• e.g., take all $\lambda_i$ equal and large enough: this amounts to first comparing the constraint violation degrees and, only if they are equal, comparing the objective values $f$.
Penalty functions

Minimum spanning tree problem: given an undirected connected graph $G = (V, E)$ on $n$ vertices and $m$ edges with positive weights $w: E \to \mathbb{R}^+$, find a connected subgraph $E' \subseteq E$ with the minimum weight:

$\arg\min_{x \in \{0,1\}^m} \sum_{i=1}^m w_i x_i \quad \text{s.t.} \quad c(x) = 1$

Fitness function to minimize:
$(c(x) - 1) \cdot w_{ub} + \sum_{i: x_i = 1} w_i$

where $\sum_{i: x_i = 1} w_i$ is the original objective function and $(c(x) - 1) \cdot w_{ub}$ penalizes the constraint violation degree, with $w_{ub} = n^2 \cdot w_{\max}$ large enough to dominate the weight of any subgraph.
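To make this concrete, here is a minimal Python sketch of the penalty fitness (the edge-list representation and the union-find helper are our own choices for illustration, not prescribed by the slides):

```python
def num_components(n, edges, x):
    """Number of connected components c(x) of the subgraph selected by
    the bit string x, computed with union-find over the n vertices."""
    parent = list(range(n))

    def find(u):
        while parent[u] != u:
            parent[u] = parent[parent[u]]  # path halving
            u = parent[u]
        return u

    comps = n
    for bit, (u, v) in zip(x, edges):
        if bit:
            ru, rv = find(u), find(v)
            if ru != rv:
                parent[ru] = rv
                comps -= 1
    return comps


def penalty_fitness(n, edges, weights, x):
    """(c(x) - 1) * w_ub + sum of selected edge weights, to be minimized.
    w_ub = n^2 * w_max dominates the weight of any subgraph, so reducing
    the number of components always pays off before reducing the weight."""
    w_ub = n * n * max(weights)
    return (num_components(n, edges, x) - 1) * w_ub + sum(
        w for b, w in zip(x, weights) if b)
```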
Repair functions

Repair functions: repair infeasible solutions into feasible ones.

Example – knapsack: given $n$ items, each with a weight $w_i$ and a value $v_i$, select a subset of items maximizing the sum of values while keeping the summed weights within some capacity $W_{\max}$:

$\arg\max_{x \in \{0,1\}^n} \sum_{i=1}^n v_i x_i \quad \text{s.t.} \quad \sum_{i=1}^n w_i x_i \le W_{\max}$

Repairing: scan from left to right, and keep the value 1 only if the summed weight does not exceed $W_{\max}$.

With $v_i$: 4, 2, 6, 10, 4, 3, 7, 2; $w_i$: 2, 3, 3, 8, 6, 5, 7, 1; $W_{\max} = 25$:
1 1 0 1 1 0 1 1 (infeasible, weight 27) is repaired to 1 1 0 1 1 0 0 1 (feasible, weight 20).
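A minimal sketch of this repair scan in Python (function and variable names are ours), checked on the instance above:

```python
def repair(x, weights, w_max):
    """Left-to-right repair: keep a selected item only if it still fits."""
    total, repaired = 0, []
    for bit, w in zip(x, weights):
        if bit and total + w <= w_max:
            repaired.append(1)
            total += w
        else:
            repaired.append(0)
    return repaired


w = [2, 3, 3, 8, 6, 5, 7, 1]  # the weights from the slide
print(repair([1, 1, 0, 1, 1, 0, 1, 1], w, 25))  # -> [1, 1, 0, 1, 1, 0, 0, 1]
```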
Restricting search to the feasible region

Restricting search to the feasible region: preserve feasibility by special initialization and reproduction operators.

Example – traveling salesman: given an undirected connected graph $G = (V, E)$ on $n$ vertices and $m$ edges with positive weights $w: E \to \mathbb{R}^+$, find a Hamiltonian cycle with the minimum weight:

$\arg\min_x w(x) \quad \text{s.t.} \quad x \text{ is a Hamiltonian cycle}$

Integer vector representation: the order of visiting vertices, e.g., 1 6 2 5 7 4 8 3.

Initialize with a random permutation, and apply mutation and recombination operators designed for the permutation representation; every permutation is feasible. A sketch of such operators follows.
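The slides do not prescribe concrete operators; as an illustrative sketch, random initialization, swap mutation, and one standard recombination for permutations (order crossover) all keep the tour a valid permutation:

```python
import random


def init_tour(n):
    """A uniformly random permutation of the n vertices: always feasible."""
    tour = list(range(n))
    random.shuffle(tour)
    return tour


def swap_mutation(tour):
    """Exchange two random positions; the result is again a permutation."""
    i, j = random.sample(range(len(tour)), 2)
    child = tour[:]
    child[i], child[j] = child[j], child[i]
    return child


def order_crossover(p1, p2):
    """Copy a random slice from p1, fill the remaining positions in p2's order."""
    n = len(p1)
    i, j = sorted(random.sample(range(n), 2))
    slice_ = p1[i:j]
    rest = [v for v in p2 if v not in slice_]
    return rest[:i] + slice_ + rest[i:]
```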
Decoder functions

Decoder functions: map each genotype to a feasible phenotype.

Example – knapsack: given $n$ items, each with a weight $w_i$ and a value $v_i$, select a subset of items maximizing the sum of values while keeping the summed weights within some capacity $W_{\max}$:

$\arg\max_{x \in \{0,1\}^n} \sum_{i=1}^n v_i x_i \quad \text{s.t.} \quad \sum_{i=1}^n w_i x_i \le W_{\max}$

Decoding: scan from left to right, and keep the value 1 only if the summed weight does not exceed $W_{\max}$.

With $v_i$: 4, 2, 6, 10, 4, 3, 7, 2; $w_i$: 2, 3, 3, 8, 6, 5, 7, 1; $W_{\max} = 25$: the genotype 1 1 0 1 1 0 1 1 is decoded to the phenotype 1 1 0 1 1 0 0 1. Unlike repair, the stored genotype is left unchanged; only its evaluation goes through the decoder.
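A minimal sketch of decoder-based evaluation in Python (identifiers ours):

```python
def decode(genotype, weights, w_max):
    """Map a genotype to a feasible phenotype by the left-to-right scan."""
    total, phenotype = 0, []
    for bit, w in zip(genotype, weights):
        take = bit == 1 and total + w <= w_max
        phenotype.append(1 if take else 0)
        if take:
            total += w
    return phenotype


def fitness(genotype, values, weights, w_max):
    """Evaluate the decoded phenotype; the genotype itself is never changed."""
    phenotype = decode(genotype, weights, w_max)
    return sum(v for b, v in zip(phenotype, values) if b)
```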
Constraint handling strategies

The final output solution must satisfy the constraints.

Common constraint handling strategies:
• Penalty functions
• Repair functions
• Restricting search to the feasible region
• Decoder functions

Other effective constraint handling strategies?
(1+1)-EA for MST

Minimum spanning tree (MST):
• Given: an undirected connected graph $G = (V, E)$ on $n$ vertices and $m$ edges with positive integer weights $w: E \to \mathbb{N}^+$
• Goal: find a connected subgraph $E' \subseteq E$ with the minimum weight

Formulation:
$\arg\min_{x \in \{0,1\}^m} \sum_{i=1}^m w_i x_i \quad \text{s.t.} \quad c(x) = 1$
where $x_i = 1$ means that edge $e_i$ is selected, so each $x \in \{0,1\}^m$ corresponds to a subgraph.
(1+1)-EA: given a pseudo-Boolean function $f$:
1. $x \leftarrow$ randomly selected from $\{0,1\}^n$.
2. Repeat until some termination criterion is met:
3. $\quad x' \leftarrow$ flip each bit of $x$ with probability $1/n$.
4. $\quad$ if $f(x') \le f(x)$:
5. $\qquad x = x'$.

Using the strategy of penalty functions, the fitness function to minimize is
$(c(x) - 1) \cdot w_{ub} + \sum_{i: x_i = 1} w_i$
i.e., the original objective function plus a penalty on the constraint violation degree, with $w_{ub} = n^2 \cdot w_{\max}$.

Theorem. [Neumann & Wegener, TCS'07; Doerr et al., Algorithmica'12] The expected running time of the (1+1)-EA solving the MST problem is $O(m^2(\log n + \log w_{\max}))$.
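A minimal Python sketch of the (1+1)-EA (the fitness is passed as a callable to be minimized; identifiers are ours, and the usage comment reuses the penalty_fitness sketch from earlier):

```python
import random


def one_plus_one_ea(f, length, max_iters):
    """(1+1)-EA minimizing a pseudo-Boolean function f over {0,1}^length."""
    x = [random.randint(0, 1) for _ in range(length)]
    fx = f(x)
    for _ in range(max_iters):
        x2 = [b ^ (random.random() < 1.0 / length) for b in x]
        fx2 = f(x2)
        if fx2 <= fx:  # accept the offspring if it is not worse
            x, fx = x2, fx2
    return x, fx


# e.g., for MST with the penalty fitness sketched earlier:
# best, _ = one_plus_one_ea(lambda x: penalty_fitness(n, edges, w, x), m, 100000)
```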
MST by MOEAs

Bi-objective reformulation of $\arg\min_{x \in \{0,1\}^m} \sum_{i=1}^m w_i x_i$ s.t. $c(x) = 1$:
$\min\,\big(c(x),\ \sum_{i: x_i = 1} w_i\big)$

Theorem. [Neumann & Wegener, Natural Computing'06] The expected running time of the GSEMO solving the MST problem is $O(mn(n + \log w_{\max}))$.

Penalty functions: $O(m^2(\log n + \log w_{\max}))$
Bi-objective reformulation: $O(mn(n + \log w_{\max}))$
The bi-objective reformulation is better for dense graphs, e.g., $m \in \Theta(n^2)$.
MST by MOEAs

Bi-objective reformulation: $\min\,\big(c(x),\ \sum_{i: x_i = 1} w_i\big)$

GSEMO: given a pseudo-Boolean function vector $f$:
1. $x \leftarrow$ randomly selected from $\{0,1\}^n$.
2. $P \leftarrow \{x\}$.
3. Repeat until some termination criterion is met:
4. $\quad$ Choose $x$ from $P$ uniformly at random.
5. $\quad x' \leftarrow$ flip each bit of $x$ with probability $1/n$.
6. $\quad$ if $\nexists z \in P$ such that $z \prec x'$:
7. $\qquad P := \big(P \setminus \{z \in P \mid x' \preceq z\}\big) \cup \{x'\}$.

Lines 6–7 keep the non-dominated solutions generated so far.
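A minimal Python sketch of GSEMO for minimizing a vector of objectives (identifiers ours; the domination test follows the ≺ / ≼ convention above):

```python
import random


def gsemo(objs, length, max_iters):
    """GSEMO minimizing the objective vector objs(x) over {0,1}^length."""
    def weakly_dominates(a, b):
        return all(u <= v for u, v in zip(a, b))

    x = tuple(random.randint(0, 1) for _ in range(length))
    pop = {x: objs(x)}  # the non-dominated solutions generated so far
    for _ in range(max_iters):
        x = random.choice(list(pop))
        x2 = tuple(b ^ (random.random() < 1.0 / length) for b in x)
        f2 = objs(x2)
        # accept x2 unless some kept solution strictly dominates it
        if not any(weakly_dominates(g, f2) and g != f2 for g in pop.values()):
            pop = {y: g for y, g in pop.items() if not weakly_dominates(f2, g)}
            pop[x2] = f2
    return pop


# e.g., for the MST reformulation, reusing num_components from earlier:
# pop = gsemo(lambda x: (num_components(n, edges, x),
#                        sum(w for b, w in zip(x, weights) if b)), m, 100000)
```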
Proof

Main idea:
(1) obtain the empty subgraph $0^m$;
(2) obtain a minimum spanning tree.

The analysis of phase (1), with the objectives $\min\,(c(x),\ w(x) = \sum_{i: x_i = 1} w_i)$, uses multiplicative drift analysis:
• design the distance function $V(P) = \min\{w(x) \mid x \in P\}$, the minimum weight in the population;
• analyze the expected drift: let $x^*$ be the solution with the smallest $w(x)$ value; selecting $x^*$ and flipping only one 1-bit yields a resulting solution $x_i$, so
$\mathbb{E}[V(\xi_t) - V(\xi_{t+1}) \mid \xi_t = x] \ge \frac{1}{n-1} \cdot \frac{1}{m}\left(1 - \frac{1}{m}\right)^{m-1} \cdot \sum_{i=1}^{|x^*|}\left(w(x^*) - w(x_i)\right).$
Proof (continued)

Since $w(x^*) - w(x_i)$ is exactly the weight of the flipped edge, the sum over all $|x^*|$ one-bit flips equals $w(x^*)$, which is the minimum weight in the population:
$\mathbb{E}[V(\xi_t) - V(\xi_{t+1}) \mid \xi_t = x] \ge \frac{1}{n-1} \cdot \frac{1}{m}\left(1 - \frac{1}{m}\right)^{m-1} \cdot w(x^*) = \frac{1}{n-1} \cdot \frac{1}{m}\left(1 - \frac{1}{m}\right)^{m-1} \cdot V(\xi_t).$
Proof (continued)

Since $(1 - 1/m)^{m-1} \ge 1/e$ and $n - 1 \le n$, the expected drift satisfies
$\mathbb{E}[V(\xi_t) - V(\xi_{t+1}) \mid \xi_t = x] \ge \frac{1}{emn} \cdot V(\xi_t).$

By the multiplicative drift theorem, the expected number of iterations of phase (1) is at most
$\frac{1 + \ln\left(V(P_0)/V_{\min}\right)}{\delta} \le emn\left(1 + \ln(m\,w_{\max})\right)$
using $\delta = 1/(emn)$, $V(P) \le m\,w_{\max}$, and $V_{\min} \ge 1$ (the weights are positive integers).
Proof (continued)

Hence the expected running time of phase (1) is
$emn\left(1 + \ln(m\,w_{\max})\right) \in O(mn(\log n + \log w_{\max}))$
since $m \le n^2$ implies $\ln m \in O(\log n)$.
Proof

The analysis of phase (2), with the objectives $\min\,(c(x),\ w(x) = \sum_{i: x_i = 1} w_i)$:
• the Pareto optimal solutions found so far will always be kept;
• follow the path $x_n \to x_{n-1} \to \cdots \to x_2 \to x_1$, where $x_k$ denotes the Pareto optimal solution with $k$ connected components, from $0^m$ ($n$ components) to a minimum spanning tree (one component);
• each step of the path has probability at least $\frac{1}{n-1} \cdot \frac{1}{m}\left(1 - \frac{1}{m}\right)^{m-1} \ge \frac{1}{emn}$ in one iteration.

The expected running time of phase (2) is at most $(n - 1) \cdot emn \in O(mn^2)$.
The expected running time of phase (1) is $O(mn(\log n + \log w_{\max}))$.
The total expected running time is $O(mn(n + \log w_{\max}))$.
More examples

Problem                      | Penalty functions | Bi-objective reformulation
-----------------------------|-------------------|---------------------------
Set cover                    | exponential       | $O(mn(\log c_{\max} + \log n))$ [Friedrich et al., ECJ'10]
Minimum cut                  | exponential       | $O(Fm(\log c_{\max} + \log n))$ [Neumann et al., Algorithmica'11]
Minimum label spanning tree  | $\Omega(n^{up})$  | $O(m^2 \log n)$ [Lai et al., TEC'14]
Minimum cost coverage        | exponential       | $O(mn(\log n + \log w_{\max} + s))$ [Qian et al., IJCAI'15]

In each case, the bi-objective reformulation is better.
Bi-objective reformulation

Main idea:
1. Transform the original constrained optimization problem
$\min f(x)$ s.t. $g_i(x) = 0,\ 1 \le i \le q$; $h_i(x) \le 0,\ q + 1 \le i \le m$
into the bi-objective optimization problem
$\min\,\big(f(x),\ \sum_{i=1}^m f_i(x)\big)$
where $\sum_{i=1}^m f_i(x)$ is the constraint violation degree, with $f_i(x) = |g_i(x)|$ for $1 \le i \le q$ and $f_i(x) = \max\{0, h_i(x)\}$ for $q + 1 \le i \le m$.
Bi-objective reformulation

Main idea:
1. Transform the original constrained optimization problem into the bi-objective optimization problem $\min\,(f(x), \sum_{i=1}^m f_i(x))$.
2. Employ a multi-objective EA to solve the transformed problem.
3. Output the feasible solution, i.e., one with constraint violation degree 0, from the generated non-dominated solution set.
Constraint handling strategies

The final output solution must satisfy the constraints.

Common constraint handling strategies:
• Penalty functions
• Repair functions
• Restricting search to the feasible region
• Decoder functions
• Bi-objective reformulation

Repair functions, restricting search to the feasible region, and decoder functions search only in the feasible region, while penalty functions and bi-objective reformulation allow infeasible solutions in the search.

Better algorithms?
Subset selection

Subset selection is to select a subset of size at most $b$ from a total set of $n$ items to optimize some objective function.

Formally stated: given all items $V = \{v_1, \ldots, v_n\}$, an objective function $f: 2^V \to \mathbb{R}$ and a budget $b$, find a subset $X \subseteq V$ such that
$\max_{X \subseteq V} f(X) \quad \text{s.t.} \quad |X| \le b.$
Application – sparse regression

Sparse regression [Tropp, TIT'04]: select a few observation variables to best approximate the predictor variable $z$ by linear regression.

Item $v_i$: an observation variable.
Objective $f$: the squared multiple correlation of a subset $X$ of observation variables,
$R^2_{z,X} = \frac{\mathrm{Var}(z) - \mathrm{MSE}_{z,X}}{\mathrm{Var}(z)}$
where $\mathrm{Var}(z)$ is the variance of $z$ and $\mathrm{MSE}_{z,X}$ is the mean squared error of predicting $z$ from the variables in $X$.
Application – influence maximization

Influence maximization [Kempe et al., KDD'03]: select a subset of (influential) users from a social network to maximize its influence spread.

Item $v_i$: a social network user.
Objective $f$: influence spread, measured by the expected number of social network users activated by diffusion.
Application – document summarization

Document summarization [Lin & Bilmes, ACL'11]: select a few sentences to best summarize the documents.

Item $v_i$: a sentence.
Objective $f$: summary quality.
Application – sensor placement

Sensor placement [Krause & Guestrin, IJCAI'09 tutorial]: select a few places to install sensors such that the information gathered is maximized, e.g., for water contamination detection or fire detection.

Item $v_i$: a place to install a sensor.
Objective $f$: entropy.
Subset selection

Subset selection spans machine learning (sparse regression), data mining (influence maximization), natural language processing (document summarization), and networks (sensor placement).

When $f$ is monotone and submodular, the greedy algorithm achieves a $(1 - 1/e)$-approximation [Mathematical Programming 1978]; George Nemhauser received the John von Neumann Theory Prize.

Best Paper / Test of Time Awards: [Kempe et al., KDD'03], [Das & Kempe, ICML'11], [Iyer & Bilmes, NIPS'13].
Subset representation

A subset $X \subseteq V$ can be naturally represented by a Boolean vector $x \in \{0,1\}^n$:
• the $i$-th bit $x_i = 1$ if the item $v_i \in X$; $x_i = 0$ otherwise
• $X = \{v_i \mid x_i = 1\}$

For $V = \{v_1, v_2, v_3, v_4, v_5\}$:

subset $X \subseteq V$                 Boolean vector $x \in \{0,1\}^5$
$\emptyset$                            00000
$\{v_1\}$                              10000
$\{v_2, v_3, v_5\}$                    01101
$\{v_1, v_2, v_3, v_4, v_5\}$          11111
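The encoding is a few lines of Python; a tiny sketch (identifiers ours):

```python
def to_vector(S, items):
    """Boolean-vector encoding of a subset S of the ground set."""
    return [1 if v in S else 0 for v in items]


def to_subset(x, items):
    """The inverse mapping: X = {v_i | x_i = 1}."""
    return {v for v, bit in zip(items, x) if bit}


items = ["v1", "v2", "v3", "v4", "v5"]
assert to_vector({"v2", "v3", "v5"}, items) == [0, 1, 1, 0, 1]
```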
POSS algorithm

POSS algorithm [Qian, Yu and Zhou, NIPS'15]

Transformation: $\max_{X \subseteq V} f(X)$ s.t. $|X| \le b$ (original) $\rightarrow$ $\min_{X \subseteq V} \big(-f(X),\ |X|\big)$ (bi-objective).

Initialization: put the special solution $0^n$ (the empty set) into the population $P$.
Parent selection & reproduction: pick a solution $x$ randomly from $P$, and flip each bit of $x$ with probability $1/n$ to generate a new solution.
Evaluation & survivor selection: if the new solution is not dominated, put it into $P$ and weed out the newly dominated solutions.
Output: select the best feasible solution from $P$.
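A compact Python sketch of POSS under the setting above (f is a callable on 0/1 tuples to be maximized; the cut-off at size 2b anticipates the analysis later in this lecture; all identifiers are ours):

```python
import random


def poss(f, n, b, max_iters):
    """POSS sketch: maximize f(X) s.t. |X| <= b via min(-f(x), |x|)."""
    x0 = (0,) * n  # the special all-zeros (empty-set) initial solution
    pop = {x0: (-f(x0), 0)}
    for _ in range(max_iters):
        x = random.choice(list(pop))
        x2 = tuple(bit ^ (random.random() < 1.0 / n) for bit in x)
        size2 = sum(x2)
        if size2 >= 2 * b:  # exclude overly large solutions
            continue
        f2 = (-f(x2), size2)
        # accept x2 unless some kept solution strictly dominates it
        if not any(g[0] <= f2[0] and g[1] <= f2[1] and g != f2
                   for g in pop.values()):
            pop = {y: g for y, g in pop.items()
                   if not (f2[0] <= g[0] and f2[1] <= g[1])}
            pop[x2] = f2
    # output: the best feasible solution in the population
    return max((x for x in pop if sum(x) <= b), key=f)
```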
Sparse regression

Sparse regression: given all observation variables $V = \{v_1, \ldots, v_n\}$, a predictor variable $z$ and a budget $b$, find a subset $X \subseteq V$ such that

$\max_{X \subseteq V} R^2_{z,X} = \frac{\mathrm{Var}(z) - \mathrm{MSE}_{z,X}}{\mathrm{Var}(z)} \quad \text{s.t.} \quad |X| \le b$

where $\mathrm{Var}(z)$ is the variance of $z$ and $\mathrm{MSE}_{z,X}$ is the mean squared error of predicting $z$ using the observation variables in $X$.
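As an illustration of the objective, a short NumPy sketch computing $R^2_{z,X}$ (assuming the predictor and the observation columns are centered, so no intercept term is needed; identifiers ours):

```python
import numpy as np


def r_squared(Z, z, subset):
    """R^2_{z,X} = (Var(z) - MSE_{z,X}) / Var(z), where Z holds one
    observation variable per column and `subset` indexes the chosen columns."""
    if not subset:
        return 0.0
    X = Z[:, sorted(subset)]
    beta, *_ = np.linalg.lstsq(X, z, rcond=None)  # least-squares coefficients
    mse = np.mean((z - X @ beta) ** 2)
    return (np.var(z) - mse) / np.var(z)
```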
Experimental results

[Table: results on the data sets, comparing POSS with greedy algorithms, relaxation methods, and exhaustive search; the size constraint is $b$, and the number of iterations of POSS is $2eb^2n$. The symbol • denotes that POSS is significantly better by the $t$-test with confidence level 0.05.]

POSS is significantly better than all the compared state-of-the-art algorithms on all data sets.
Theoretical analysis

Theorem 1. For subset selection with monotone objective functions, POSS using $\mathbb{E}[T] \le 2eb^2n$ iterations finds a solution $X$ with $|X| \le b$ and $f(X) \ge (1 - e^{-\gamma}) \cdot \mathrm{OPT}$.

POSS thus achieves the optimal polynomial-time approximation guarantee for monotone $f$ [Harshaw et al., ICML'19].
Proof

Lemma 1. For any $X \subseteq V$, there exists one item $\hat{v} \in V \setminus X$ such that
$f(X \cup \{\hat{v}\}) - f(X) \ge \frac{\gamma}{b}\,\big(\mathrm{OPT} - f(X)\big)$
where $\mathrm{OPT}$ denotes the optimal function value and $\gamma$ is the submodularity ratio [Das & Kempe, ICML'11].

Roughly speaking, the improvement from adding a specific item is proportional to the current distance to the optimum.
Proof (continued)

Main idea: consider, for $i = 0, 1, \ldots, b$, a solution $X$ with $|X| \le i$ and
$f(X) \ge \left(1 - \left(1 - \frac{\gamma}{b}\right)^i\right) \cdot \mathrm{OPT}.$

For $i = 0$, the initial solution $00\ldots0$ satisfies this, since $f(00\ldots0) = 0$ and $|00\ldots0| = 0$.

For $i = b$, letting $t = b/\gamma$ and using $(1 - 1/t)^t \le 1/e$:
$1 - \left(1 - \frac{\gamma}{b}\right)^b = 1 - \left(\left(1 - \frac{1}{t}\right)^{t}\right)^{\gamma} \ge 1 - e^{-\gamma}.$
Proof (continued)

So a solution $X$ with $|X| \le b$ and $f(X) \ge (1 - e^{-\gamma}) \cdot \mathrm{OPT}$ gives the desired approximation guarantee; it remains to show how POSS moves from $i$ to $i + 1$ and how long this takes.
Proof (continued)

In each iteration of POSS:
• select $X$ from the population $P$;
• flip one specific 0-bit of $X$ to 1, i.e., add the specific item $\hat{v}$ from Lemma 1;
• the offspring $X'$ then satisfies $|X'| = |X| + 1 \le i + 1$ and $f(X') \ge \left(1 - \left(1 - \frac{\gamma}{b}\right)^{i+1}\right) \cdot \mathrm{OPT}$.
Proof (continued)

The bound on $f(X')$ follows from Lemma 1 and the induction hypothesis: combining
$f(X') - f(X) \ge \frac{\gamma}{b}\left(\mathrm{OPT} - f(X)\right)$ with $f(X) \ge \left(1 - \left(1 - \frac{\gamma}{b}\right)^i\right) \cdot \mathrm{OPT}$ gives
$f(X') \ge \left(1 - \frac{\gamma}{b}\right) f(X) + \frac{\gamma}{b}\,\mathrm{OPT} \ge \left(1 - \frac{\gamma}{b}\right)\left(1 - \left(1 - \frac{\gamma}{b}\right)^i\right)\mathrm{OPT} + \frac{\gamma}{b}\,\mathrm{OPT} = \left(1 - \left(1 - \frac{\gamma}{b}\right)^{i+1}\right)\mathrm{OPT}.$
Proof (continued)

The step $i \to i + 1$ (select $X$ from $P$, then flip exactly the specific 0-bit) occurs in one iteration with probability at least
$\frac{1}{|P|} \cdot \frac{1}{n}\left(1 - \frac{1}{n}\right)^{n-1} \ge \frac{1}{|P|} \cdot \frac{1}{en}$
since $X$ is selected with probability $1/|P|$ and the specific single-bit flip has probability $\frac{1}{n}\left(1 - \frac{1}{n}\right)^{n-1} \ge \frac{1}{en}$.
Proof (continued)

Bounding the population size: solutions with size at least $2b$ are excluded, and the solutions in $P$ are always incomparable, so for each size in $\{0, 1, \ldots, 2b - 1\}$ there exists at most one solution in $P$. Hence $|P| \le 2b$, and the step $i \to i + 1$ occurs with probability at least $\frac{1}{2ebn}$.
Proof (continued)

The step $i \to i + 1$ thus needs at most $2ebn$ expected iterations, so reaching $i = b$ from $i = 0$ needs at most $b \cdot 2ebn = 2eb^2n$ expected iterations, which proves Theorem 1.
Theoretical analysis: the advantage of bi-objective reformulation for handling constraints.
Algorithm design: the POSS algorithm for subset selection, with applications to sparse regression, influence maximization, document summarization, and sensor placement.
POSS algorithm

POSS algorithm [Qian, Yu and Zhou, NIPS'15]

Transformation: $\max_{X \subseteq V} f(X)$ s.t. $|X| \le b$ (original) $\rightarrow$ $\min_{X \subseteq V} \big(-f(X),\ |X|\big)$ (bi-objective).

Parent selection & reproduction: pick a solution $x$ randomly from $P$, and flip each bit of $x$ with probability $1/n$ to generate a new solution, i.e., POSS uses bit-wise mutation only.
PORSS algorithm

PORSS algorithm [Qian, Bian and Feng, AAAI'20]

Transformation: $\max_{X \subseteq V} f(X)$ s.t. $|X| \le b$ (original) $\rightarrow$ $\min_{X \subseteq V} \big(-f(X),\ |X|\big)$ (bi-objective).

Parent selection & reproduction:
• pick two solutions randomly from $P$
• apply a recombination operator
• apply the bit-wise mutation operator
PORSS algorithm

One-point crossover (cut both parents at a random position, here after position 6, and swap the tails):
Parent1:   1 0 1 1 1 0 | 0 0      Offspring1: 1 0 1 1 1 0 | 1 0
Parent2:   0 0 1 0 1 0 | 1 0      Offspring2: 0 0 1 0 1 0 | 0 0

Uniform crossover (each position is swapped independently with probability 1/2; here only position 7 happens to be swapped):
Parent1:   1 0 1 1 1 0 0 0        Offspring1: 1 0 1 1 1 0 1 0
Parent2:   0 0 1 0 1 0 1 0        Offspring2: 0 0 1 0 1 0 0 0

A sketch of both operators follows.
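Both operators are a few lines in Python; a sketch (identifiers ours):

```python
import random


def one_point_crossover(p1, p2):
    """Cut both parents at one random position and swap the tails."""
    c = random.randrange(1, len(p1))
    return p1[:c] + p2[c:], p2[:c] + p1[c:]


def uniform_crossover(p1, p2):
    """Swap each position independently with probability 1/2."""
    o1, o2 = list(p1), list(p2)
    for i in range(len(p1)):
        if random.random() < 0.5:
            o1[i], o2[i] = o2[i], o1[i]
    return o1, o2
```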
PORSS algorithm

[Table: results under the size constraint $b$, comparing PORSS using one-point crossover and PORSS using uniform crossover against the state-of-the-art algorithms.]

PORSS performs the best.
Summary

• Constrained optimization
• Constraint handling strategies
  – Penalty functions
  – Repair functions
  – Restricting search to the feasible region
  – Decoder functions
  – Bi-objective reformulation: an example of algorithm design guided by theoretical analysis
References

• A. E. Eiben and J. E. Smith. Introduction to Evolutionary Computing. Chapter 13.
• T. Friedrich, J. He, N. Hebbinghaus, F. Neumann and C. Witt. Approximating covering problems by randomized search heuristics using multi-objective models. Evolutionary Computation, 2010, 18(4): 617-633.
• B. Doerr, D. Johannsen and C. Winzen. Multiplicative drift analysis. Algorithmica, 2012, 64: 673-697.
• X. Lai, Y. Zhou, J. He and J. Zhang. Performance analysis of evolutionary algorithms for the minimum label spanning tree problem. IEEE Transactions on Evolutionary Computation, 2014, 18(6): 860-872.
• F. Neumann and I. Wegener. Minimum spanning trees made easier via multi-objective optimization. Natural Computing, 2006, 5(3): 305-319.
• F. Neumann and I. Wegener. Randomized local search, evolutionary algorithms, and the minimum spanning tree problem. Theoretical Computer Science, 2007, 378(1): 32-40.
• F. Neumann, J. Reichel and M. Skutella. Computing minimum cuts by randomized search heuristics. Algorithmica, 2011, 59(3): 323-342.
• C. Qian, Y. Yu and Z.-H. Zhou. On constrained Boolean Pareto optimization. In: Proceedings of the 24th International Joint Conference on Artificial Intelligence (IJCAI'15), 2015, pages 389-395, Buenos Aires, Argentina.
• C. Qian, Y. Yu and Z.-H. Zhou. Subset selection by Pareto optimization. In: Advances in Neural Information Processing Systems 28 (NIPS'15), 2015, pages 1765-1773, Montreal, Canada.
• C. Qian, C. Bian and C. Feng. Subset selection by Pareto optimization with recombination. In: Proceedings of the 34th AAAI Conference on Artificial Intelligence (AAAI'20), 2020, pages 2408-2415, New York, NY.