COT 5405
Advanced Algorithms
Chris Lacher
Notes GA: Introduction to Genetic Algorithms
begin with random population
repeat
selection
crossover
mutation
==> new population
until fitness of best individual is optimized
Population
- Alphabet aka alleles
DNA: {A,T,C,G} [bases A (adenine), T (thymine), C (cytosine), G (guanine)]
Protein: {A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y} [the 20 standard amino acid one-letter codes]
Genetic Code
Generic: {0,1}
- Chromosome = fixed-length sequence from alphabet
Protein example: IHCCAASASDMIKPQFHFOSEBBDCBDBABIBIABDMKPKICBEBHVGGGS
Generic example: 01101100010101000101
- Population = set of n chromosomes of the same length l
Generic example: {00000010,11101110,00100000,00110100}
- length l = 50 - 1000 (typical)
- size n = 50 - 1000 (typical)
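As a concrete illustration of this representation, here is a minimal C++ sketch (names such as random_population are illustrative, not from the notes) that builds a random population of n binary chromosomes of length l:

#include <cstddef>
#include <random>
#include <string>
#include <vector>

// Build a random population of n binary chromosomes, each of length l.
std::vector<std::string> random_population(std::size_t n, std::size_t l,
                                            std::mt19937& gen)
{
  std::bernoulli_distribution bit(0.5);          // each locus is 0 or 1, equally likely
  std::vector<std::string> population(n, std::string(l, '0'));
  for (std::string& chromosome : population)
    for (char& locus : chromosome)
      locus = bit(gen) ? '1' : '0';
  return population;
}

With the generic alphabet {0,1} a chromosome is just a bit string; a different alphabet only changes the set of characters drawn from.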
Fitness Function
- Function F defined for every chromosome (of length l)
- Interpretation: F(c1) < F(c2) implies c1 is less fit than c2
- Example: F(c) = number of 1s in c
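For the bit-counting example above, a fitness function is a one-line C++ sketch (the function name is illustrative):

#include <algorithm>
#include <string>

// F(c) = number of 1s in c; a higher count means a fitter chromosome.
int fitness(const std::string& c)
{
  return static_cast<int>(std::count(c.begin(), c.end(), '1'));
}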
Fitness-Proportionate Selection
- Called viability selection in biology
- Implementation: roulette wheel sampling
- Each individual in the population is assigned a sector of the wheel proportional to its fitness
- Spin the wheel to select an individual (a sampling sketch follows)
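One common way to implement roulette wheel sampling is to treat the fitness values as sector sizes and "spin" by drawing a uniform number in [0, total fitness). The sketch below (illustrative names; integer fitness values assumed) returns the index of the selected individual:

#include <cstddef>
#include <numeric>
#include <random>
#include <vector>

// Roulette wheel sampling: each individual owns a sector whose size is its
// fitness, so a uniform draw in [0, total fitness) selects an individual
// with probability proportional to its fitness.
std::size_t roulette_select(const std::vector<int>& fitness_values,
                            std::mt19937& gen)
{
  double total = std::accumulate(fitness_values.begin(), fitness_values.end(), 0.0);
  std::uniform_real_distribution<double> spin(0.0, total > 0.0 ? total : 1.0);
  double point = spin(gen);
  double running = 0.0;
  for (std::size_t i = 0; i < fitness_values.size(); ++i)
  {
    running += fitness_values[i];
    if (point < running)
      return i;
  }
  return fitness_values.size() - 1;   // guards against floating-point round-off
}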
Crossover
- With probability p_c, cross a selected pair of parents: pick a random locus and exchange the parents' tails beyond it (one-point crossover; see the sketch below and the pseudocode in the Algorithm section)
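A minimal C++ sketch of one-point crossover on bit-string chromosomes (the function name and random-engine parameter are illustrative); the tail swap is the same operation used in the worked example later in these notes:

#include <cstddef>
#include <random>
#include <string>
#include <utility>

// One-point crossover: pick a random locus and swap the two parents' tails
// from that locus onward, producing two children in place.
void one_point_crossover(std::string& c1, std::string& c2, std::mt19937& gen)
{
  std::uniform_int_distribution<std::size_t> pick(0, c1.size() - 1);
  for (std::size_t i = pick(gen); i < c1.size(); ++i)
    std::swap(c1[i], c2[i]);
}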
Mutation
- With probability p_m, mutate at each locus
- For the binary alphabet, mutation is a bit flip
- p_m = 0.001 (typical)
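A sketch of per-locus bit-flip mutation on the bit-string representation (illustrative names; p_m as above):

#include <random>
#include <string>

// With probability p_m, independently flip each locus of the chromosome.
void mutate(std::string& c, double p_m, std::mt19937& gen)
{
  std::bernoulli_distribution flip(p_m);
  for (char& locus : c)
    if (flip(gen))
      locus = (locus == '0') ? '1' : '0';
}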
Fitness Optimization Criteria
- Stop when "close" to perfect
- Weak link: cannot know how close or far away "perfect" is
- Practice: take many runs, vary parameters, and keep a record of the best (c, F(c)) for each run
Algorithm
main
{
input
{
L, // chromosome length
N, // population size // make this an even number for convenience
P, // randomly generated set of N chromosomes of length L
p_c, // crossover probability
p_m, // mutation probability
F, // fitness function
fitness_goal // stopping threshold for F, used in the while loop below
}
max_fitness = maximum value of F(c) for c in P;
c_max = chromosome in P where maximum is achieved;
while (max_fitness < fitness_goal)
{
P1 = generate_new_population(P);
max_fitness = maximum value of F(c) for c in P1;
c_max = chromosome in P1 where maximum is achieved;
P = P1;
}
output
{
P,
c_max,
F(c_max)
}
}
generate_new_population(P)
{
P1 = empty population;
for (i = 0; i < N/2; ++i)
{
choose two elements c1, c2 of P using roulette wheel sampling;
with probability p_c, crossover(c1,c2);
mutate(c1); mutate(c2);
insert c1 and c2 in P1;
}
return P1;
}
crossover (&c1, &c2) // passed by reference
{
loc = random [0,L);
for (i = loc; i < L; ++i)
swap(c1[i], c2[i]); // exchange the parents' tails after the crossover point
}
mutate (&c) // passed by reference
{
for (i = 0; i < L; ++i)
with probability p_m flip c[i];
}
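Putting the pieces together, here is one self-contained, runnable C++ rendering of the pseudocode above, a sketch rather than a definitive implementation: it assumes the generic alphabet {0,1}, the bit-counting fitness, and the parameter values from the example that follows; helper names and structure are illustrative.

#include <algorithm>
#include <iostream>
#include <numeric>
#include <random>
#include <string>
#include <utility>
#include <vector>

namespace {

std::mt19937 gen(std::random_device{}());

// F(c) = number of 1s in c
int fitness(const std::string& c)
{
  return static_cast<int>(std::count(c.begin(), c.end(), '1'));
}

// A random chromosome of length L over {0,1}
std::string random_chromosome(std::size_t L)
{
  std::bernoulli_distribution bit(0.5);
  std::string c(L, '0');
  for (char& locus : c) locus = bit(gen) ? '1' : '0';
  return c;
}

// Fitness-proportionate (roulette wheel) selection; returns an index into P
std::size_t roulette(const std::vector<int>& fit)
{
  double total = std::accumulate(fit.begin(), fit.end(), 0.0);
  std::uniform_real_distribution<double> spin(0.0, total > 0.0 ? total : 1.0);
  double point = spin(gen), running = 0.0;
  for (std::size_t i = 0; i < fit.size(); ++i)
    if ((running += fit[i]) > point) return i;
  return fit.size() - 1;
}

// One-point crossover: swap the parents' tails after a random locus
void crossover(std::string& c1, std::string& c2)
{
  std::uniform_int_distribution<std::size_t> pick(0, c1.size() - 1);
  for (std::size_t i = pick(gen); i < c1.size(); ++i) std::swap(c1[i], c2[i]);
}

// With probability p_m, independently flip each locus
void mutate(std::string& c, double p_m)
{
  std::bernoulli_distribution flip(p_m);
  for (char& locus : c)
    if (flip(gen)) locus = (locus == '0') ? '1' : '0';
}

}  // namespace

int main()
{
  const std::size_t L = 8, N = 4;                // parameters from the example below
  const double p_c = 0.6, p_m = 0.1;
  const int fitness_goal = static_cast<int>(L);  // stop at an all-1s chromosome

  std::vector<std::string> P(N);
  for (auto& c : P) c = random_chromosome(L);

  std::bernoulli_distribution do_crossover(p_c);
  int best = 0;
  std::string c_max;

  for (;;)
  {
    std::vector<int> fit(N);
    for (std::size_t i = 0; i < N; ++i) fit[i] = fitness(P[i]);
    auto it = std::max_element(fit.begin(), fit.end());
    best = *it;
    c_max = P[static_cast<std::size_t>(it - fit.begin())];
    if (best >= fitness_goal) break;             // fitness of best individual optimized

    std::vector<std::string> P1;                 // ==> new population
    while (P1.size() < N)
    {
      std::string c1 = P[roulette(fit)], c2 = P[roulette(fit)];
      if (do_crossover(gen)) crossover(c1, c2);
      mutate(c1, p_m);  mutate(c2, p_m);
      P1.push_back(c1);  P1.push_back(c2);
    }
    P = P1;
  }

  std::cout << "c_max = " << c_max << "  F(c_max) = " << best << '\n';
  return 0;
}

Because N is even, each pass of the inner loop adds exactly two children, matching the N/2 iterations in the pseudocode.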
Example
L = 8
N = 4
p_c = 0.6
p_m = 0.1
P = { 00000110 , 11101110 , 00100000 , 00110100 }
F(c) = number of 1s in c
(ROULETTE = pair of parents chosen by roulette wheel sampling; XC = crossover applied to the pair? y plus the crossover point, or n)

P[0]      F  ROULETTE  XC  P[1]      F  ROULETTE  XC  P[2]      F  ROULETTE  XC  P[3]      F
--------  -  --------  --  --------  -  --------  --  --------  -  --------  --  --------  -
00000110  2  11101110  y3  11110100  5  11101110  y5  11101100  5  11110110  y6  11110110  6
11101110  6  00110100      00101110  4  11110100      11110110  6  11101110      11101110  6
00100000  1  11101110  n   11101110  6  11101110  y2  11101110  6  00101110  y4  00100110  3
00110100  3  00000110      00000110  2  00101110      00101110  4  11110110      11111110  7

ROULETTE  XC  P[5]      F  ROULETTE  XC  P[6]      F  ROULETTE  XC  P[7]      F
--------  --  --------  -  --------  --  --------  -  --------  --  --------  -
11111110  y2  11110110  6  11111110  n   11111110  7  11111110  y3  11111110  7
11110110      11111110  7  11111110      11111110  7  11111110      11111110  7
11111110  n   11111110  7  11101110  y7  11111110  7  11111110  y7  11111110  7
11101110      11101110  6  11111110      11101110  6  11101110      11101110  6
- Note that P[7] = P[6]
- From P[7], two "bad" genes are propagated with no possibility of change by crossover: sites 4 and 8
- The bad gene at site 4 can be removed only if the individual carrying it fails to propagate, which is likely to happen eventually because of roulette wheel sampling. (Possible, but much less likely, is that the other genotype dies out instead.)
- Add mutation: eventually a mutation will occur at site 8, producing an optimally fit individual in the population (assuming the unlikely death of 11111110 has not occurred).
Applying GA
- Problem Coding
- Find an alphabet and length that represent problem instances
- Reversible mapping {problem instances} <--> {chromosomes}
- Sometimes this is straightforward, sometimes it is a leap, and sometimes the
straightforward approach is not the best
- Fitness Function
- Evaluate representative chromosome
- Function F:{chromosomes} --> {numbers}
- F(c) high value ==> instance represented by c is good
Theory
- Schemata: chromosome patterns with "wild cards", e.g., 10*0*1110*
- Order of a schema H: o(H) = number of fixed positions
o(10*0*1110*) = 7
- Defining length of a schema H: d(H) = distance between the first and last fixed positions
d(10*0*1110*) = 8
- m(H,t) = number of representatives of H in population at time t
- f(H) = average fitness of chromosomes in H
- f(P) = average fitness over entire population P
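To make the schema bookkeeping concrete, here is a small C++ sketch (illustrative names; '*' as the wild card) computing o(H), d(H), and m(H,t) for a population; for H = 10*0*1110* it returns o = 7 and d = 8, matching the definitions above:

#include <cstddef>
#include <string>
#include <vector>

// o(H): number of fixed (non-'*') positions in the schema.
std::size_t order(const std::string& H)
{
  std::size_t fixed = 0;
  for (char s : H)
    if (s != '*') ++fixed;
  return fixed;
}

// d(H): distance between the first and last fixed positions.
std::size_t defining_length(const std::string& H)
{
  std::size_t first = H.find_first_not_of('*');
  std::size_t last = H.find_last_not_of('*');
  return (first == std::string::npos) ? 0 : last - first;
}

// m(H,t): number of chromosomes in the population that agree with H
// at every fixed position.
std::size_t representatives(const std::string& H,
                            const std::vector<std::string>& population)
{
  std::size_t m = 0;
  for (const std::string& c : population)
  {
    bool match = (c.size() == H.size());
    for (std::size_t i = 0; match && i < H.size(); ++i)
      if (H[i] != '*' && H[i] != c[i]) match = false;
    if (match) ++m;
  }
  return m;
}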
Schema Theorem (Holland, 1975)
m(H, t+1) >= m(H, t) * [f(H) / f(P)] * [1 - p_c * d(H)/(l - 1) - o(H) * p_m]
Standard Interpretation: the number of representatives of a schema grows (or shrinks) in proportion to the schema's average fitness relative to the population average. This interpretation has come into question in recent years; the theory of GAs in particular, and of evolutionary computation in general, is an active field of research.
Variations
- "Fitness" may be inversely related to F(c) (low is good) - rescale
for Roulette
- Crossover often needs coding-specific definition or exceptions
- Multipoint crossover (a two-point sketch appears after this list)
- Variable mutation rate (inversely proportional to Hamming distance between parents)
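As an illustration of the multipoint crossover variation above, a two-point sketch (illustrative name; bit-string chromosomes assumed) exchanges the segment between two random loci:

#include <cstddef>
#include <random>
#include <string>
#include <utility>

// Two-point crossover: choose two loci and exchange the segment between them.
void two_point_crossover(std::string& c1, std::string& c2, std::mt19937& gen)
{
  std::uniform_int_distribution<std::size_t> pick(0, c1.size() - 1);
  std::size_t a = pick(gen), b = pick(gen);
  if (a > b) std::swap(a, b);
  for (std::size_t i = a; i <= b; ++i)      // swap the inclusive segment [a, b]
    std::swap(c1[i], c2[i]);
}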
Representative Application Areas
- Artificial Life
- Lisp Programs
- Cellular Automata
- Real Life
- Learning and survival: models of Baldwin Effect
- Population biology simulations
- Ecosystems
- Science
- Protein Structure
- Nonlinear Optimization
- Operations Research, Prediction
- AI
- Evolving Learning Algorithms: backprop parameters; reinforcement learning, others
- Neural Network Architecture
Evolving Neural Architectures
- Direct Encoding
- Problem Mapping:
adjacency (connection) matrix <==> network topology
feed-forward network <==> lower-diagonal matrix
alphabet = {0,1}
chromosome = rows of adjacency matrix concatenated
- Crossover:
exchange rows (see the sketch after this list)
row = input to a single node
- Mutation: standard
- Fitness: for a given problem,
backprop for a fixed number of epochs using training set
evaluate on test set
sum square error = fitness (low is good)
- Montana & Davis
- Grammatical Encoding
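A C++ sketch of the direct encoding described above (illustrative names; this is not Montana & Davis's exact scheme): the chromosome is the rows of an n x n adjacency matrix concatenated into one bit string, and one way to realize "exchange rows" is one-point crossover restricted to row boundaries, so each child inherits whole rows (the inputs to a single node).

#include <cstddef>
#include <random>
#include <string>
#include <utility>
#include <vector>

// Rebuild the n x n adjacency matrix from a chromosome of length n*n.
std::vector<std::vector<int>> decode_topology(const std::string& chrom, std::size_t n)
{
  std::vector<std::vector<int>> adj(n, std::vector<int>(n, 0));
  for (std::size_t i = 0; i < n; ++i)
    for (std::size_t j = 0; j < n; ++j)
      adj[i][j] = (chrom[i * n + j] == '1') ? 1 : 0;
  return adj;
}

// Row-exchange crossover: swap material only from a row boundary onward,
// so whole rows move between parents.
void row_crossover(std::string& c1, std::string& c2, std::size_t n, std::mt19937& gen)
{
  std::uniform_int_distribution<std::size_t> pick_row(0, n - 1);
  std::size_t start = pick_row(gen) * n;    // crossover point on a row boundary
  for (std::size_t i = start; i < n * n; ++i)
    std::swap(c1[i], c2[i]);
}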
Grammatical Encoding - Detailed Example
Alphabet = {S,A,B,C,D,a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p}
Chromosome = SXXXX XXXXX XXXXX XXXXX XXXXX ...
(X = any alphabet character) (spaces for visual convenience only)
Grammar Terminals (fixed):
a -> 0 0   b -> 0 0   c -> 0 0   d -> 0 0   e -> 0 1
     0 0        0 1        1 0        1 1        0 0
f -> 0 1   g -> 0 1   h -> 0 1   i -> 1 0   j -> 1 0
     1 0        0 1        1 1        0 0        0 1
k -> 1 0   l -> 1 0   m -> 1 1   n -> 1 1   o -> 1 1
     1 0        1 1        0 0        0 1        1 0
p -> 1 1
     1 1
Example non-terminals (variable):
S -> A B   A -> c a   B -> a a   C -> f g   D -> c a
     C D        f a        a a        o k        p c
represented by the chromosome SABCDAcafaBaaaaCfgokDcapc
Example chromosome SABCD Aaaca Baaaa Cokab Daaoa
decodes as follows:
S -> A B  ->  a a a a  ->  0 0 0 0 0 0 0 0
     C D      c a a a      0 0 0 0 0 0 0 0
              o k a a      0 0 0 0 0 0 0 0
              a b o a      1 0 0 0 0 0 0 0
                           1 1 1 0 0 0 0 0
                           1 0 1 0 0 0 0 0
                           0 0 0 0 1 1 0 0
                           0 0 0 1 1 0 0 0   = adjacency matrix
Need ad hoc rules such as:
- Take first production for each non-terminal, discard the rest
- Discard backward connections (make matrix lower diagonal)
- Discard intra-layer connections
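For concreteness, here is a C++ sketch of the decoding just illustrated (illustrative names). It hard-codes the terminal table above, reads the chromosome in groups of five symbols, keeps only the first production for each non-terminal (the first ad hoc rule), and expands S two levels down to the 8 x 8 bit matrix; the backward-connection and intra-layer pruning rules are omitted.

#include <array>
#include <cstddef>
#include <iostream>
#include <map>
#include <string>
#include <vector>

// Terminals 'a'..'p' expand to the fixed 2x2 bit blocks from the table above,
// stored row-major as "top-left top-right bottom-left bottom-right".
const std::array<std::string, 16> kTerminal = {
  "0000", "0001", "0010", "0011", "0100", "0110", "0101", "0111",
  "1000", "1001", "1010", "1011", "1100", "1101", "1110", "1111"};

std::vector<std::string> decode(const std::string& chromosome)
{
  // First production for each non-terminal; emplace keeps only the first.
  std::map<char, std::string> rule;
  for (std::size_t i = 0; i + 5 <= chromosome.size(); i += 5)
    rule.emplace(chromosome[i], chromosome.substr(i + 1, 4));

  // Expand S -> 2x2 of non-terminals -> 4x4 of terminals -> 8x8 of bits.
  std::vector<std::string> matrix(8, std::string(8, '0'));
  const std::string& top = rule.at('S');
  for (int q = 0; q < 4; ++q)                   // quadrant of S's expansion
  {
    const std::string& mid = rule.at(top[q]);   // 2x2 block of terminals
    for (int t = 0; t < 4; ++t)
    {
      const std::string& block = kTerminal[mid[t] - 'a'];  // 2x2 block of bits
      for (int b = 0; b < 4; ++b)
      {
        int row = (q / 2) * 4 + (t / 2) * 2 + (b / 2);
        int col = (q % 2) * 4 + (t % 2) * 2 + (b % 2);
        matrix[row][col] = block[b];
      }
    }
  }
  return matrix;
}

int main()
{
  for (const std::string& row : decode("SABCDAaacaBaaaaCokabDaaoa"))
    std::cout << row << '\n';                   // prints the 8x8 adjacency matrix
}

Running it on the example chromosome prints the 8 x 8 adjacency matrix derived above.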
Research Area: Applying GA to Acyclic Architectures
- Terminals - use with known interesting subnets (e.g., Expert Net Nodes)
- Matrix Blocks
- Applicable to Neural or Computational Nets
- Fitness - possibly penalize for number of connections, selecting for sparseness
References
- Goldberg, David E., Genetic Algorithms in Search, Optimization, and Machine Learning, Addison-Wesley, 1989.
- Mitchell, Melanie, An Introduction to Genetic Algorithms, MIT Press, 1996.
- Artificial Life Home Page
- Lacher, R.C., Expert networks: Paradigmatic conflict, technological rapprochement, Minds and Machines 3 (1993) 53-71.