xueyuediana 2018-07-16
本教程将基于一个简单的示例在Python中实现遗传算法优化技术,在这个示例中,我们试图最大化一个方程的输出。本教程使用十进制表示的基因,单点交叉,和均匀变异。
遗传算法(GA)的流程图如图1所示。遗传算法中涉及的每个步骤都有一些变化。
图1.遗传算法流程图
例如,基因的表示有不同类型,例如二进制,十进制,整数等。每种类型都有不同的对待。存在不同类型的突变,例如位翻转,交换,反转,均匀,非均匀,高斯,收缩等。此外,交叉具有不同的类型,例如混合,单点,两点,均匀等。本教程不会实现所有这些,只是实现了GA中涉及的每个步骤的一种类型。本教程使用基因的十进制表示,单点交叉和均匀变异。
本教程首先介绍我们将要实现的等式。等式如下所示:
Y = w1x1 + w2x2 + w3x3 + w4x4 + w5x5 + w6x6
该方程有6个输入(x1到x6)和6个权重(w1到w6),如图所示,输入值为(x1,x2,x3,x4,x5,x6)=(4,-2,7,5,11, 1)。我们正在寻找最大化这种方程的参数(权重)。最大化这种方程的想法似乎很简单。正输入乘以最大可能的正数,负数乘以最小可能的负数。但我们希望实现的想法是如何让GA做到这一点,以便知道使用正输入和负输入的负权重更好。让我们开始实施GA。
首先,让我们创建一个包含6个输入的列表和一个用于保存权重数量的变量,如下所示:
# Inputs of the equation.
equation_inputs = [4,-2,3.5,5,-11,-4.7]
# Number of the weights we are looking to optimize.
num_weights = 6
下一步是定义初始总体。根据权重的数量,每个染色体(溶液或个体)肯定有6个基因,每个权重对应一个基因。但是问题是每个群体有多少个解决方案?没有固定的值,我们可以选择适合我们问题的值。但是我们可以保留它的通用性,以便在代码中修改它。接下来,我们创建一个变量来表示每个群体的解决方案数量,另一个变量来表示群体的大小,最后,一个变量来表示实际的初始群体数量:
import numpy
sol_per_pop = 8
# Defining the population size.
pop_size = (sol_per_pop,num_weights) # The population will have sol_per_pop chromosome where each chromosome has num_weights genes.
#Creating the initial population.
new_population = numpy.random.uniform(low=-4.0, high=4.0, size=pop_size)
导入numpy库后,我们可以使用numpy.random.uniform函数随机创建初始种群。根据所选参数,它将具有形状(8,6)。这是8条染色体,每条染色体有6个基因,每个权重一个。运行此代码后,群体如下:
[[-2.19134006 -2.88907857 2.02365737 -3.97346034 3.45160502 2.05773249]
[ 2.12480298 2.97122243 3.60375452 3.78571392 0.28776565 3.5170347 ]
[ 1.81098962 0.35130155 1.03049548 -0.33163294 3.52586421 2.53845644]
[-0.63698911 -2.8638447 2.93392615 -1.40103767 -1.20313655 0.30567304]
[-1.48998583 -1.53845766 1.11905299 -3.67541087 1.33225142 2.86073836]
[ 1.14159503 2.88160332 1.74877772 -3.45854293 0.96125878 2.99178241]
[ 1.96561297 0.51030292 0.52852716 -1.56909315 -2.35855588 2.29682254]
[ 3.00912373 -2.745417 3.27131287 -0.72163167 0.7516408 0.00677938]]
请注意,它是随机生成的,因此在再次运行时肯定会发生变化。
在准备群体之后,接下来将遵循图1中的流程图。基于适应度函数,我们将选择当前群体中的最佳个体作为交配的父母。接下来是应用GA变体(交叉和变异)来产生下一代的后代,通过附加父母和后代来创建新的群体,并且重复这些步骤进行多次迭代/生成。下一个Python代码应用以下步骤:
import GA
num_generations = 5
num_parents_mating = 4
for generation in range(num_generations):
# Measuring the fitness of each chromosome in the population.
fitness = GA.cal_pop_fitness(equation_inputs, new_population)
# Selecting the best parents in the population for mating.
parents = GA.select_mating_pool(new_population, fitness,
num_parents_mating)
# Generating next generation using crossover.
offspring_crossover = GA.crossover(parents,
offspring_size=(pop_size[0]-parents.shape[0], num_weights))
# Adding some variations to the offsrping using mutation.
offspring_mutation = GA.mutation(offspring_crossover)
# Creating the new population based on the parents and offspring.
new_population[0:parents.shape[0], :] = parents
new_population[parents.shape[0]:, :] = offspring_mutation
现在的代数是5。在本教程中,它被选为小以显示所有代的结果。
第一步是使用GA求总体中的每个解的适应度值。cal_pop_fitness函数。该函数在GA模块内的实现如下:
def cal_pop_fitness(equation_inputs, pop):
# Calculating the fitness value of each solution in the current population.
# The fitness function calculates the sum of products between each input and its corresponding weight.
fitness = numpy.sum(pop*equation_inputs, axis=1)
return fitness
除了总体之外,适应度函数还接受方程输入值(x1到x6)。根据我们的函数,将适应度值计算为每个输入与其对应基因(权重)之间的乘积(SOP)之和。根据每个群体的解决方案数量,将有一些SOP。正如我们之前在变量命名sol_per_pop中将解决方案的数量设置为8 ,将有8个SOP,如下所示:
[-63.41070188 14.40299221 -42.22532674 18.24112489 -45.44363278 -37.00404311 15.99527402 17.0688537 ]
注意,适应度值越高,解决方案就越好。
在计算所有解决方案的适应度值之后,下一步是根据下一个函数GA.select_mating_pool,在匹配池中选择其中最好的作为父类。该函数接受群体、适合度值和所需的父母数量。它返回所选的父类。其在GA模块内的实现如下:
def select_mating_pool(pop, fitness, num_parents):
# Selecting the best individuals in the current generation as parents for producing the offspring of the next generation.
parents = numpy.empty((num_parents, pop.shape[1]))
for parent_num in range(num_parents):
max_fitness_idx = numpy.where(fitness == numpy.max(fitness))
max_fitness_idx = max_fitness_idx[0][0]
parents[parent_num, :] = pop[max_fitness_idx, :]
fitness[max_fitness_idx] = -99999999999
return parents
根据变量num_parents_match中定义的父类的数量,该函数创建一个空数组来保存它们,就像在这一行中一样:
parents = numpy.empty((num_parents,pop.shape [1]))
循环遍历当前群体,该函数获得最高适应度值的索引,因为它是根据此行选择的最佳解决方案:
max_fitness_idx = numpy.where(fitness == numpy.max(fitness))
此索引用于使用以下行检索与此适合度值对应的解决方案:
parent [parent_num,:] = pop [max_fitness_idx,:]
为了避免再次选择这种解决方案,它的适应度值被设置为一个非常小的值,这个值很可能不会被再次选中,即- 99999999999999999。最后返回父数组,根据我们的示例如下所示:
[[-0.63698911 -2.8638447 2.93392615 -1.40103767 -1.20313655 0.30567304]
[ 3.00912373 -2.745417 3.27131287 -0.72163167 0.7516408 0.00677938]
[ 1.96561297 0.51030292 0.52852716 -1.56909315 -2.35855588 2.29682254]
[ 2.12480298 2.97122243 3.60375452 3.78571392 0.28776565 3.5170347 ]]
注意,这三个父类是当前群体中最好的个体,基于他们的适合度值分别为18.24112489,17.0688537,15.99527402和14.40299221。
下一步是使用这样选择的亲本进行交配以产生后代。根据功能,交配从交叉操作开始GA.crossover。此函数接受父母和后代的大小。它使用后代大小来了解从这些父母那里生产的后代的数量。这样的功能在GA模块内部实现如下:
def crossover(parents, offspring_size):
offspring = numpy.empty(offspring_size)
# The point at which crossover takes place between two parents. Usually, it is at the center.
crossover_point = numpy.uint8(offspring_size[1]/2)
for k in range(offspring_size[0]):
# Index of the first parent to mate.
parent1_idx = k%parents.shape[0]
# Index of the second parent to mate.
parent2_idx = (k+1)%parents.shape[0]
# The new offspring will have its first half of its genes taken from the first parent.
offspring[k, 0:crossover_point] = parents[parent1_idx, 0:crossover_point]
# The new offspring will have its second half of its genes taken from the second parent.
offspring[k, crossover_point:] = parents[parent2_idx, crossover_point:]
return offspring
该函数首先根据后代大小创建一个空数组,如下所示:
offspring = numpy.empty(offspring_size)
因为我们使用单点交叉,所以我们需要指定交叉发生的点。选择该点以根据此行将解决方案划分为两个相等的一半:
crossover_point = numpy.uint8(offspring_size [1] / 2)
然后我们需要选择两个父母进行交叉。根据这两行选择这些父母的指数:
parent1_idx = k%parents.shape [0]
parent2_idx =(k + 1)%parents.shape [0]
父母的选择方式类似于戒指。第一个指标为0和1的物种首先被选择产生两个后代。如果仍有剩余的后代要繁殖,那么我们选择双亲1和双亲2的后代。如果我们需要更多的后代,那么我们选择第二个有索引2和3的父母。通过索引3,我们到达了最后一个父元素。如果我们需要生成更多的后代,那么我们选择带有索引3的父节点,然后返回索引0的父节点,以此类推。
将交叉操作应用于父项后的解决方案存储在offspring 变量中,它们如下所示:
[[-0.63698911 -2.8638447 2.93392615 -0.72163167 0.7516408 0.00677938]
[ 3.00912373 -2.745417 3.27131287 -1.56909315 -2.35855588 2.29682254]
[ 1.96561297 0.51030292 0.52852716 3.78571392 0.28776565 3.5170347 ]
[ 2.12480298 2.97122243 3.60375452 -1.40103767 -1.20313655 0.30567304]]
接下来是offspring 使用GA.mutationGA模块内部的函数将第二个GA变体mut变量应用于存储在变量中的交叉结果。这种函数接受交叉后代并在应用均匀突变后返回它们。该功能实现如下:
def mutation(offspring_crossover):
# Mutation changes a single gene in each offspring randomly.
for idx in range(offspring_crossover.shape[0]):
# The random value to be added to the gene.
random_value = numpy.random.uniform(-1.0, 1.0, 1)
offspring_crossover[idx, 4] = offspring_crossover[idx, 4] + random_value
return offspring_crossover
它循环遍历每个后代,并根据以下行添加一个均匀生成的随机数,范围从-1到1:
random_value = numpy.random.uniform(-1.0,1.0,1)
然后根据此行将这样的随机数添加到后代的索引4的基因中:
offspring_crossover [idx,4] = offspring_crossover [idx,4] + random_value
请注意,索引可以更改为任何其他索引。应用突变后的后代如下:
[[-0.63698911 -2.8638447 2.93392615 -0.72163167 1.66083721 0.00677938]
[3.00912373 -2.745417 3.27131287 -1.56909315 -1.94513681 2.29682254]
[1.96561297 0.51030292 0.52852716 3.78571392 0.45337472 3.5170347]
[2.12480298 2.97122243 3.60375452 -1.40103767 -1.5781162 0.30567304]]
这些结果被添加到变量中offspring_crossover并由函数返回。
在这一点上,我们成功地从4个选定的父母中产生了4个后代,我们准备创造下一代的新种群。
请注意,GA是一种基于随机的优化技术。它试图通过对它们应用一些随机变化来增强当前的解决方案。因为这些变化是随机的,我们不确定它们会产生更好的解决方案。出于这个原因,优选在新的群体中保留先前的最佳解决方案(父母)。在最糟糕的情况下,当所有新的后代都比这样的父母更糟糕时,我们将继续使用这样的父母。因此,我们保证新一代将至少保留以前的良好结果,并且不会变得更糟。新人口将拥有前父母的前4个解决方案。最后4个解决方案来自应用交叉和变异后创建的后代:
new_population [0:parents.shape [0]
,:] = parent new_population [parents.shape [0]:,:] = offspring_mutation
通过计算第一代所有解决方案(父母和后代)的适应度,他们的适应度如下:
[18.24112489 17.0688537 15.99527402 14.40299221 -8.46075629 31.73289712 6.10307563 24.08733441]
之前的最高适应度是18.24112489,但现在是31.7328971158。这意味着随机变化趋向于更好的解决方案。这很棒。但这种结果可以通过更多代来增强。以下是另外4代的每个步骤的结果:
Generation : 1
Fitness values:
[ 18.24112489 17.0688537 15.99527402 14.40299221 -8.46075629 31.73289712 6.10307563 24.08733441]
Selected parents:
[[ 3.00912373 -2.745417 3.27131287 -1.56909315 -1.94513681 2.29682254]
[ 2.12480298 2.97122243 3.60375452 -1.40103767 -1.5781162 0.30567304]
[-0.63698911 -2.8638447 2.93392615 -1.40103767 -1.20313655 0.30567304]
[ 3.00912373 -2.745417 3.27131287 -0.72163167 0.7516408 0.00677938]]
Crossover result:
[[ 3.00912373 -2.745417 3.27131287 -1.40103767 -1.5781162 0.30567304]
[ 2.12480298 2.97122243 3.60375452 -1.40103767 -1.20313655 0.30567304]
[-0.63698911 -2.8638447 2.93392615 -0.72163167 0.7516408 0.00677938]
[ 3.00912373 -2.745417 3.27131287 -1.56909315 -1.94513681 2.29682254]]
Mutation result:
[[ 3.00912373 -2.745417 3.27131287 -1.40103767 -1.2392086 0.30567304]
[ 2.12480298 2.97122243 3.60375452 -1.40103767 -0.38610586 0.30567304]
[-0.63698911 -2.8638447 2.93392615 -0.72163167 1.33639943 0.00677938]
[ 3.00912373 -2.745417 3.27131287 -1.56909315 -1.13941727 2.29682254]]
Best result after generation 1 : 34.1663669207
Generation : 2
Fitness values:
[ 31.73289712 24.08733441 18.24112489 17.0688537 34.16636692 10.97522073 -4.89194068 22.86998223]
Selected Parents:
[[ 3.00912373 -2.745417 3.27131287 -1.40103767 -1.2392086 0.30567304]
[ 3.00912373 -2.745417 3.27131287 -1.56909315 -1.94513681 2.29682254]
[ 2.12480298 2.97122243 3.60375452 -1.40103767 -1.5781162 0.30567304]
[ 3.00912373 -2.745417 3.27131287 -1.56909315 -1.13941727 2.29682254]]
Crossover result:
[[ 3.00912373 -2.745417 3.27131287 -1.56909315 -1.94513681 2.29682254]
[ 3.00912373 -2.745417 3.27131287 -1.40103767 -1.5781162 0.30567304]
[ 2.12480298 2.97122243 3.60375452 -1.56909315 -1.13941727 2.29682254]
[ 3.00912373 -2.745417 3.27131287 -1.40103767 -1.2392086 0.30567304]]
Mutation result:
[[ 3.00912373 -2.745417 3.27131287 -1.56909315 -2.20515009 2.29682254]
[ 3.00912373 -2.745417 3.27131287 -1.40103767 -0.73543721 0.30567304]
[ 2.12480298 2.97122243 3.60375452 -1.56909315 -0.50581509 2.29682254]
[ 3.00912373 -2.745417 3.27131287 -1.40103767 -1.20089639 0.30567304]]
Best result after generation 2: 34.5930432629
Generation : 3
Fitness values:
[ 34.16636692 31.73289712 24.08733441 22.86998223 34.59304326 28.6248816 2.09334217 33.7449326 ]
Selected parents:
[[ 3.00912373 -2.745417 3.27131287 -1.56909315 -2.20515009 2.29682254]
[ 3.00912373 -2.745417 3.27131287 -1.40103767 -1.2392086 0.30567304]
[ 3.00912373 -2.745417 3.27131287 -1.40103767 -1.20089639 0.30567304]
[ 3.00912373 -2.745417 3.27131287 -1.56909315 -1.94513681 2.29682254]]
Crossover result:
[[ 3.00912373 -2.745417 3.27131287 -1.40103767 -1.2392086 0.30567304]
[ 3.00912373 -2.745417 3.27131287 -1.40103767 -1.20089639 0.30567304]
[ 3.00912373 -2.745417 3.27131287 -1.56909315 -1.94513681 2.29682254]
[ 3.00912373 -2.745417 3.27131287 -1.56909315 -2.20515009 2.29682254]]
Mutation result:
[[ 3.00912373 -2.745417 3.27131287 -1.40103767 -2.20744102 0.30567304]
[ 3.00912373 -2.745417 3.27131287 -1.40103767 -1.16589294 0.30567304]
[ 3.00912373 -2.745417 3.27131287 -1.56909315 -2.37553107 2.29682254]
[ 3.00912373 -2.745417 3.27131287 -1.56909315 -2.44124005 2.29682254]]
Best result after generation 3: 44.8169235189
Generation : 4
Fitness values
[ 34.59304326 34.16636692 33.7449326 31.73289712 44.81692352
33.35989464 36.46723397 37.19003273]
Selected parents:
[[ 3.00912373 -2.745417 3.27131287 -1.40103767 -2.20744102 0.30567304]
[ 3.00912373 -2.745417 3.27131287 -1.56909315 -2.44124005 2.29682254]
[ 3.00912373 -2.745417 3.27131287 -1.56909315 -2.37553107 2.29682254]
[ 3.00912373 -2.745417 3.27131287 -1.56909315 -2.20515009 2.29682254]]
Crossover result:
[[ 3.00912373 -2.745417 3.27131287 -1.56909315 -2.37553107 2.29682254]
[ 3.00912373 -2.745417 3.27131287 -1.56909315 -2.20515009 2.29682254]
[ 3.00912373 -2.745417 3.27131287 -1.40103767 -2.20744102 0.30567304]]
Mutation result:
[[ 3.00912373 -2.745417 3.27131287 -1.56909315 -2.13382082 2.29682254]
[ 3.00912373 -2.745417 3.27131287 -1.56909315 -2.98105233 2.29682254]
[ 3.00912373 -2.745417 3.27131287 -1.56909315 -2.27638584 2.29682254]
[ 3.00912373 -2.745417 3.27131287 -1.40103767 -1.70558545 0.30567304]]
Best result after generation 4: 44.8169235189
在上述5代之后,最佳结果现在具有等于44.8169235189的适应值,而第一代之后的最佳结果是18.24112489。
最佳解决方案具有以下权重:
[3.00912373 -2.745417 3.27131287 -1.40103767 -2.20744102 0.30567304]
实现GA的完整代码如下:
import numpy
import GA
"""
The y=target is to maximize this equation ASAP:
y = w1x1+w2x2+w3x3+w4x4+w5x5+6wx6
where (x1,x2,x3,x4,x5,x6)=(4,-2,3.5,5,-11,-4.7)
What are the best values for the 6 weights w1 to w6?
We are going to use the genetic algorithm for the best possible values after a number of generations.
"""
# Inputs of the equation.
equation_inputs = [4,-2,3.5,5,-11,-4.7]
# Number of the weights we are looking to optimize.
num_weights = 6
"""
Genetic algorithm parameters:
Mating pool size
Population size
"""
sol_per_pop = 8
num_parents_mating = 4
# Defining the population size.
pop_size = (sol_per_pop,num_weights) # The population will have sol_per_pop chromosome where each chromosome has num_weights genes.
#Creating the initial population.
new_population = numpy.random.uniform(low=-4.0, high=4.0, size=pop_size)
print(new_population)
num_generations = 5
for generation in range(num_generations):
print("Generation : ", generation)
# Measing the fitness of each chromosome in the population.
fitness = GA.cal_pop_fitness(equation_inputs, new_population)
# Selecting the best parents in the population for mating.
parents = GA.select_mating_pool(new_population, fitness,
num_parents_mating)
# Generating next generation using crossover.
offspring_crossover = GA.crossover(parents,
offspring_size=(pop_size[0]-parents.shape[0], num_weights))
# Adding some variations to the offsrping using mutation.
offspring_mutation = GA.mutation(offspring_crossover)
# Creating the new population based on the parents and offspring.
new_population[0:parents.shape[0], :] = parents
new_population[parents.shape[0]:, :] = offspring_mutation
# The best result in the current iteration.
print("Best result : ", numpy.max(numpy.sum(new_population*equation_inputs, axis=1)))
# Getting the best solution after iterating finishing all generations.
#At first, the fitness is calculated for each solution in the final generation.
fitness = GA.cal_pop_fitness(equation_inputs, new_population)
# Then return the index of that solution corresponding to the best fitness.
best_match_idx = numpy.where(fitness == numpy.max(fitness))
print("Best solution : ", new_population[best_match_idx, :])
print("Best solution fitness : ", fitness[best_match_idx])
GA模块如下:
import numpy
def cal_pop_fitness(equation_inputs, pop):
# Calculating the fitness value of each solution in the current population.
# The fitness function caulcuates the sum of products between each input and its corresponding weight.
fitness = numpy.sum(pop*equation_inputs, axis=1)
return fitness
def select_mating_pool(pop, fitness, num_parents):
# Selecting the best individuals in the current generation as parents for producing the offspring of the next generation.
parents = numpy.empty((num_parents, pop.shape[1]))
for parent_num in range(num_parents):
max_fitness_idx = numpy.where(fitness == numpy.max(fitness))
max_fitness_idx = max_fitness_idx[0][0]
parents[parent_num, :] = pop[max_fitness_idx, :]
fitness[max_fitness_idx] = -99999999999
return parents
def crossover(parents, offspring_size):
offspring = numpy.empty(offspring_size)
# The point at which crossover takes place between two parents. Usually it is at the center.
crossover_point = numpy.uint8(offspring_size[1]/2)
for k in range(offspring_size[0]):
# Index of the first parent to mate.
parent1_idx = k%parents.shape[0]
# Index of the second parent to mate.
parent2_idx = (k+1)%parents.shape[0]
# The new offspring will have its first half of its genes taken from the first parent.
offspring[k, 0:crossover_point] = parents[parent1_idx, 0:crossover_point]
# The new offspring will have its second half of its genes taken from the second parent.
offspring[k, crossover_point:] = parents[parent2_idx, crossover_point:]
return offspring
def mutation(offspring_crossover):
# Mutation changes a single gene in each offspring randomly.
for idx in range(offspring_crossover.shape[0]):
# The random value to be added to the gene.
random_value = numpy.random.uniform(-1.0, 1.0, 1)
offspring_crossover[idx, 4] = offspring_crossover[idx, 4] + random_value
return offspring_crossover