Python中的遗传算法完整实现

xueyuediana 2018-07-16

本教程将基于一个简单的示例在Python中实现遗传算法优化技术,在这个示例中,我们试图最大化一个方程的输出。本教程使用十进制表示的基因,单点交叉,和均匀变异。

Python中的遗传算法完整实现

遗传算法概述

遗传算法(GA)的流程图如图1所示。遗传算法中涉及的每个步骤都有一些变化。

Python中的遗传算法完整实现

图1.遗传算法流程图

例如,基因的表示有不同类型,例如二进制,十进制,整数等。每种类型都有不同的对待。存在不同类型的突变,例如位翻转,交换,反转,均匀,非均匀,高斯,收缩等。此外,交叉具有不同的类型,例如混合,单点,两点,均匀等。本教程不会实现所有这些,只是实现了GA中涉及的每个步骤的一种类型。本教程使用基因的十进制表示,单点交叉和均匀变异。

示例

本教程首先介绍我们将要实现的等式。等式如下所示:

Y = w1x1 + w2x2 + w3x3 + w4x4 + w5x5 + w6x6

该方程有6个输入(x1到x6)和6个权重(w1到w6),如图所示,输入值为(x1,x2,x3,x4,x5,x6)=(4,-2,7,5,11, 1)。我们正在寻找最大化这种方程的参数(权重)。最大化这种方程的想法似乎很简单。正输入乘以最大可能的正数,负数乘以最小可能的负数。但我们希望实现的想法是如何让GA做到这一点,以便知道使用正输入和负输入的负权重更好。让我们开始实施GA。

首先,让我们创建一个包含6个输入的列表和一个用于保存权重数量的变量,如下所示:

# Inputs of the equation.

equation_inputs = [4,-2,3.5,5,-11,-4.7]

# Number of the weights we are looking to optimize.

num_weights = 6

下一步是定义初始总体。根据权重的数量,每个染色体(溶液或个体)肯定有6个基因,每个权重对应一个基因。但是问题是每个群体有多少个解决方案?没有固定的值,我们可以选择适合我们问题的值。但是我们可以保留它的通用性,以便在代码中修改它。接下来,我们创建一个变量来表示每个群体的解决方案数量,另一个变量来表示群体的大小,最后,一个变量来表示实际的初始群体数量:

import numpy

sol_per_pop = 8

# Defining the population size.

pop_size = (sol_per_pop,num_weights) # The population will have sol_per_pop chromosome where each chromosome has num_weights genes.

#Creating the initial population.

new_population = numpy.random.uniform(low=-4.0, high=4.0, size=pop_size)

导入numpy库后,我们可以使用numpy.random.uniform函数随机创建初始种群。根据所选参数,它将具有形状(8,6)。这是8条染色体,每条染色体有6个基因,每个权重一个。运行此代码后,群体如下:

[[-2.19134006 -2.88907857 2.02365737 -3.97346034 3.45160502 2.05773249]

[ 2.12480298 2.97122243 3.60375452 3.78571392 0.28776565 3.5170347 ]

[ 1.81098962 0.35130155 1.03049548 -0.33163294 3.52586421 2.53845644]

[-0.63698911 -2.8638447 2.93392615 -1.40103767 -1.20313655 0.30567304]

[-1.48998583 -1.53845766 1.11905299 -3.67541087 1.33225142 2.86073836]

[ 1.14159503 2.88160332 1.74877772 -3.45854293 0.96125878 2.99178241]

[ 1.96561297 0.51030292 0.52852716 -1.56909315 -2.35855588 2.29682254]

[ 3.00912373 -2.745417 3.27131287 -0.72163167 0.7516408 0.00677938]]

请注意,它是随机生成的,因此在再次运行时肯定会发生变化。

在准备群体之后,接下来将遵循图1中的流程图。基于适应度函数,我们将选择当前群体中的最佳个体作为交配的父母。接下来是应用GA变体(交叉和变异)来产生下一代的后代,通过附加父母和后代来创建新的群体,并且重复这些步骤进行多次迭代/生成。下一个Python代码应用以下步骤:

import GA

num_generations = 5

num_parents_mating = 4

for generation in range(num_generations):

# Measuring the fitness of each chromosome in the population.

fitness = GA.cal_pop_fitness(equation_inputs, new_population)

# Selecting the best parents in the population for mating.

parents = GA.select_mating_pool(new_population, fitness,

num_parents_mating)

# Generating next generation using crossover.

offspring_crossover = GA.crossover(parents,

offspring_size=(pop_size[0]-parents.shape[0], num_weights))

# Adding some variations to the offsrping using mutation.

offspring_mutation = GA.mutation(offspring_crossover)

# Creating the new population based on the parents and offspring.

new_population[0:parents.shape[0], :] = parents

new_population[parents.shape[0]:, :] = offspring_mutation

现在的代数是5。在本教程中,它被选为小以显示所有代的结果。

第一步是使用GA求总体中的每个解的适应度值。cal_pop_fitness函数。该函数在GA模块内的实现如下:

def cal_pop_fitness(equation_inputs, pop):

# Calculating the fitness value of each solution in the current population.

# The fitness function calculates the sum of products between each input and its corresponding weight.

fitness = numpy.sum(pop*equation_inputs, axis=1)

return fitness

除了总体之外,适应度函数还接受方程输入值(x1到x6)。根据我们的函数,将适应度值计算为每个输入与其对应基因(权重)之间的乘积(SOP)之和。根据每个群体的解决方案数量,将有一些SOP。正如我们之前在变量命名sol_per_pop中将解决方案的数量设置为8 ,将有8个SOP,如下所示:

[-63.41070188 14.40299221 -42.22532674 18.24112489 -45.44363278 -37.00404311 15.99527402 17.0688537 ]

注意,适应度值越高,解决方案就越好。

在计算所有解决方案的适应度值之后,下一步是根据下一个函数GA.select_mating_pool,在匹配池中选择其中最好的作为父类。该函数接受群体、适合度值和所需的父母数量。它返回所选的父类。其在GA模块内的实现如下:

def select_mating_pool(pop, fitness, num_parents):

# Selecting the best individuals in the current generation as parents for producing the offspring of the next generation.

parents = numpy.empty((num_parents, pop.shape[1]))

for parent_num in range(num_parents):

max_fitness_idx = numpy.where(fitness == numpy.max(fitness))

max_fitness_idx = max_fitness_idx[0][0]

parents[parent_num, :] = pop[max_fitness_idx, :]

fitness[max_fitness_idx] = -99999999999

return parents

根据变量num_parents_match中定义的父类的数量,该函数创建一个空数组来保存它们,就像在这一行中一样:

parents = numpy.empty((num_parents,pop.shape [1]))

循环遍历当前群体,该函数获得最高适应度值的索引,因为它是根据此行选择的最佳解决方案:

max_fitness_idx = numpy.where(fitness == numpy.max(fitness))

此索引用于使用以下行检索与此适合度值对应的解决方案:

parent [parent_num,:] = pop [max_fitness_idx,:]

为了避免再次选择这种解决方案,它的适应度值被设置为一个非常小的值,这个值很可能不会被再次选中,即- 99999999999999999。最后返回父数组,根据我们的示例如下所示:

[[-0.63698911 -2.8638447 2.93392615 -1.40103767 -1.20313655 0.30567304]

[ 3.00912373 -2.745417 3.27131287 -0.72163167 0.7516408 0.00677938]

[ 1.96561297 0.51030292 0.52852716 -1.56909315 -2.35855588 2.29682254]

[ 2.12480298 2.97122243 3.60375452 3.78571392 0.28776565 3.5170347 ]]

注意,这三个父类是当前群体中最好的个体,基于他们的适合度值分别为18.24112489,17.0688537,15.99527402和14.40299221。

下一步是使用这样选择的亲本进行交配以产生后代。根据功能,交配从交叉操作开始GA.crossover。此函数接受父母和后代的大小。它使用后代大小来了解从这些父母那里生产的后代的数量。这样的功能在GA模块内部实现如下:

def crossover(parents, offspring_size):

offspring = numpy.empty(offspring_size)

# The point at which crossover takes place between two parents. Usually, it is at the center.

crossover_point = numpy.uint8(offspring_size[1]/2)

for k in range(offspring_size[0]):

# Index of the first parent to mate.

parent1_idx = k%parents.shape[0]

# Index of the second parent to mate.

parent2_idx = (k+1)%parents.shape[0]

# The new offspring will have its first half of its genes taken from the first parent.

offspring[k, 0:crossover_point] = parents[parent1_idx, 0:crossover_point]

# The new offspring will have its second half of its genes taken from the second parent.

offspring[k, crossover_point:] = parents[parent2_idx, crossover_point:]

return offspring

该函数首先根据后代大小创建一个空数组,如下所示:

offspring = numpy.empty(offspring_size)

因为我们使用单点交叉,所以我们需要指定交叉发生的点。选择该点以根据此行将解决方案划分为两个相等的一半:

crossover_point = numpy.uint8(offspring_size [1] / 2)

然后我们需要选择两个父母进行交叉。根据这两行选择这些父母的指数:

parent1_idx = k%parents.shape [0]

parent2_idx =(k + 1)%parents.shape [0]

父母的选择方式类似于戒指。第一个指标为0和1的物种首先被选择产生两个后代。如果仍有剩余的后代要繁殖,那么我们选择双亲1和双亲2的后代。如果我们需要更多的后代,那么我们选择第二个有索引2和3的父母。通过索引3,我们到达了最后一个父元素。如果我们需要生成更多的后代,那么我们选择带有索引3的父节点,然后返回索引0的父节点,以此类推。

将交叉操作应用于父项后的解决方案存储在offspring 变量中,它们如下所示:

[[-0.63698911 -2.8638447 2.93392615 -0.72163167 0.7516408 0.00677938]

[ 3.00912373 -2.745417 3.27131287 -1.56909315 -2.35855588 2.29682254]

[ 1.96561297 0.51030292 0.52852716 3.78571392 0.28776565 3.5170347 ]

[ 2.12480298 2.97122243 3.60375452 -1.40103767 -1.20313655 0.30567304]]

接下来是offspring 使用GA.mutationGA模块内部的函数将第二个GA变体mut变量应用于存储在变量中的交叉结果。这种函数接受交叉后代并在应用均匀突变后返回它们。该功能实现如下:

def mutation(offspring_crossover):

# Mutation changes a single gene in each offspring randomly.

for idx in range(offspring_crossover.shape[0]):

# The random value to be added to the gene.

random_value = numpy.random.uniform(-1.0, 1.0, 1)

offspring_crossover[idx, 4] = offspring_crossover[idx, 4] + random_value

return offspring_crossover

它循环遍历每个后代,并根据以下行添加一个均匀生成的随机数,范围从-1到1:

random_value = numpy.random.uniform(-1.0,1.0,1)

然后根据此行将这样的随机数添加到后代的索引4的基因中:

offspring_crossover [idx,4] = offspring_crossover [idx,4] + random_value

请注意,索引可以更改为任何其他索引。应用突变后的后代如下:

[[-0.63698911 -2.8638447 2.93392615 -0.72163167 1.66083721 0.00677938]

[3.00912373 -2.745417 3.27131287 -1.56909315 -1.94513681 2.29682254]

[1.96561297 0.51030292 0.52852716 3.78571392 0.45337472 3.5170347]

[2.12480298 2.97122243 3.60375452 -1.40103767 -1.5781162 0.30567304]]

这些结果被添加到变量中offspring_crossover并由函数返回。

在这一点上,我们成功地从4个选定的父母中产生了4个后代,我们准备创造下一代的新种群。

请注意,GA是一种基于随机的优化技术。它试图通过对它们应用一些随机变化来增强当前的解决方案。因为这些变化是随机的,我们不确定它们会产生更好的解决方案。出于这个原因,优选在新的群体中保留先前的最佳解决方案(父母)。在最糟糕的情况下,当所有新的后代都比这样的父母更糟糕时,我们将继续使用这样的父母。因此,我们保证新一代将至少保留以前的良好结果,并且不会变得更糟。新人口将拥有前父母的前4个解决方案。最后4个解决方案来自应用交叉和变异后创建的后代:

new_population [0:parents.shape [0]

,:] = parent new_population [parents.shape [0]:,:] = offspring_mutation

通过计算第一代所有解决方案(父母和后代)的适应度,他们的适应度如下:

[18.24112489 17.0688537 15.99527402 14.40299221 -8.46075629 31.73289712 6.10307563 24.08733441]

之前的最高适应度是18.24112489,但现在是31.7328971158。这意味着随机变化趋向于更好的解决方案。这很棒。但这种结果可以通过更多代来增强。以下是另外4代的每个步骤的结果:

Generation : 1

Fitness values:

[ 18.24112489 17.0688537 15.99527402 14.40299221 -8.46075629 31.73289712 6.10307563 24.08733441]

Selected parents:

[[ 3.00912373 -2.745417 3.27131287 -1.56909315 -1.94513681 2.29682254]

[ 2.12480298 2.97122243 3.60375452 -1.40103767 -1.5781162 0.30567304]

[-0.63698911 -2.8638447 2.93392615 -1.40103767 -1.20313655 0.30567304]

[ 3.00912373 -2.745417 3.27131287 -0.72163167 0.7516408 0.00677938]]

Crossover result:

[[ 3.00912373 -2.745417 3.27131287 -1.40103767 -1.5781162 0.30567304]

[ 2.12480298 2.97122243 3.60375452 -1.40103767 -1.20313655 0.30567304]

[-0.63698911 -2.8638447 2.93392615 -0.72163167 0.7516408 0.00677938]

[ 3.00912373 -2.745417 3.27131287 -1.56909315 -1.94513681 2.29682254]]

Mutation result:

[[ 3.00912373 -2.745417 3.27131287 -1.40103767 -1.2392086 0.30567304]

[ 2.12480298 2.97122243 3.60375452 -1.40103767 -0.38610586 0.30567304]

[-0.63698911 -2.8638447 2.93392615 -0.72163167 1.33639943 0.00677938]

[ 3.00912373 -2.745417 3.27131287 -1.56909315 -1.13941727 2.29682254]]

Best result after generation 1 : 34.1663669207

Generation : 2

Fitness values:

[ 31.73289712 24.08733441 18.24112489 17.0688537 34.16636692 10.97522073 -4.89194068 22.86998223]

Selected Parents:

[[ 3.00912373 -2.745417 3.27131287 -1.40103767 -1.2392086 0.30567304]

[ 3.00912373 -2.745417 3.27131287 -1.56909315 -1.94513681 2.29682254]

[ 2.12480298 2.97122243 3.60375452 -1.40103767 -1.5781162 0.30567304]

[ 3.00912373 -2.745417 3.27131287 -1.56909315 -1.13941727 2.29682254]]

Crossover result:

[[ 3.00912373 -2.745417 3.27131287 -1.56909315 -1.94513681 2.29682254]

[ 3.00912373 -2.745417 3.27131287 -1.40103767 -1.5781162 0.30567304]

[ 2.12480298 2.97122243 3.60375452 -1.56909315 -1.13941727 2.29682254]

[ 3.00912373 -2.745417 3.27131287 -1.40103767 -1.2392086 0.30567304]]

Mutation result:

[[ 3.00912373 -2.745417 3.27131287 -1.56909315 -2.20515009 2.29682254]

[ 3.00912373 -2.745417 3.27131287 -1.40103767 -0.73543721 0.30567304]

[ 2.12480298 2.97122243 3.60375452 -1.56909315 -0.50581509 2.29682254]

[ 3.00912373 -2.745417 3.27131287 -1.40103767 -1.20089639 0.30567304]]

Best result after generation 2: 34.5930432629

Generation : 3

Fitness values:

[ 34.16636692 31.73289712 24.08733441 22.86998223 34.59304326 28.6248816 2.09334217 33.7449326 ]

Selected parents:

[[ 3.00912373 -2.745417 3.27131287 -1.56909315 -2.20515009 2.29682254]

[ 3.00912373 -2.745417 3.27131287 -1.40103767 -1.2392086 0.30567304]

[ 3.00912373 -2.745417 3.27131287 -1.40103767 -1.20089639 0.30567304]

[ 3.00912373 -2.745417 3.27131287 -1.56909315 -1.94513681 2.29682254]]

Crossover result:

[[ 3.00912373 -2.745417 3.27131287 -1.40103767 -1.2392086 0.30567304]

[ 3.00912373 -2.745417 3.27131287 -1.40103767 -1.20089639 0.30567304]

[ 3.00912373 -2.745417 3.27131287 -1.56909315 -1.94513681 2.29682254]

[ 3.00912373 -2.745417 3.27131287 -1.56909315 -2.20515009 2.29682254]]

Mutation result:

[[ 3.00912373 -2.745417 3.27131287 -1.40103767 -2.20744102 0.30567304]

[ 3.00912373 -2.745417 3.27131287 -1.40103767 -1.16589294 0.30567304]

[ 3.00912373 -2.745417 3.27131287 -1.56909315 -2.37553107 2.29682254]

[ 3.00912373 -2.745417 3.27131287 -1.56909315 -2.44124005 2.29682254]]

Best result after generation 3: 44.8169235189

Generation : 4

Fitness values

[ 34.59304326 34.16636692 33.7449326 31.73289712 44.81692352

33.35989464 36.46723397 37.19003273]

Selected parents:

[[ 3.00912373 -2.745417 3.27131287 -1.40103767 -2.20744102 0.30567304]

[ 3.00912373 -2.745417 3.27131287 -1.56909315 -2.44124005 2.29682254]

[ 3.00912373 -2.745417 3.27131287 -1.56909315 -2.37553107 2.29682254]

[ 3.00912373 -2.745417 3.27131287 -1.56909315 -2.20515009 2.29682254]]

Crossover result:

[[ 3.00912373 -2.745417 3.27131287 -1.56909315 -2.37553107 2.29682254]

[ 3.00912373 -2.745417 3.27131287 -1.56909315 -2.20515009 2.29682254]

[ 3.00912373 -2.745417 3.27131287 -1.40103767 -2.20744102 0.30567304]]

Mutation result:

[[ 3.00912373 -2.745417 3.27131287 -1.56909315 -2.13382082 2.29682254]

[ 3.00912373 -2.745417 3.27131287 -1.56909315 -2.98105233 2.29682254]

[ 3.00912373 -2.745417 3.27131287 -1.56909315 -2.27638584 2.29682254]

[ 3.00912373 -2.745417 3.27131287 -1.40103767 -1.70558545 0.30567304]]

Best result after generation 4: 44.8169235189

在上述5代之后,最佳结果现在具有等于44.8169235189的适应值,而第一代之后的最佳结果是18.24112489。

最佳解决方案具有以下权重:

[3.00912373 -2.745417 3.27131287 -1.40103767 -2.20744102 0.30567304]

完整的Python实现

实现GA的完整代码如下:

import numpy

import GA

"""

The y=target is to maximize this equation ASAP:

y = w1x1+w2x2+w3x3+w4x4+w5x5+6wx6

where (x1,x2,x3,x4,x5,x6)=(4,-2,3.5,5,-11,-4.7)

What are the best values for the 6 weights w1 to w6?

We are going to use the genetic algorithm for the best possible values after a number of generations.

"""

# Inputs of the equation.

equation_inputs = [4,-2,3.5,5,-11,-4.7]

# Number of the weights we are looking to optimize.

num_weights = 6

"""

Genetic algorithm parameters:

Mating pool size

Population size

"""

sol_per_pop = 8

num_parents_mating = 4

# Defining the population size.

pop_size = (sol_per_pop,num_weights) # The population will have sol_per_pop chromosome where each chromosome has num_weights genes.

#Creating the initial population.

new_population = numpy.random.uniform(low=-4.0, high=4.0, size=pop_size)

print(new_population)

num_generations = 5

for generation in range(num_generations):

print("Generation : ", generation)

# Measing the fitness of each chromosome in the population.

fitness = GA.cal_pop_fitness(equation_inputs, new_population)

# Selecting the best parents in the population for mating.

parents = GA.select_mating_pool(new_population, fitness,

num_parents_mating)

# Generating next generation using crossover.

offspring_crossover = GA.crossover(parents,

offspring_size=(pop_size[0]-parents.shape[0], num_weights))

# Adding some variations to the offsrping using mutation.

offspring_mutation = GA.mutation(offspring_crossover)

# Creating the new population based on the parents and offspring.

new_population[0:parents.shape[0], :] = parents

new_population[parents.shape[0]:, :] = offspring_mutation

# The best result in the current iteration.

print("Best result : ", numpy.max(numpy.sum(new_population*equation_inputs, axis=1)))

# Getting the best solution after iterating finishing all generations.

#At first, the fitness is calculated for each solution in the final generation.

fitness = GA.cal_pop_fitness(equation_inputs, new_population)

# Then return the index of that solution corresponding to the best fitness.

best_match_idx = numpy.where(fitness == numpy.max(fitness))

print("Best solution : ", new_population[best_match_idx, :])

print("Best solution fitness : ", fitness[best_match_idx])

GA模块如下:

import numpy

def cal_pop_fitness(equation_inputs, pop):

# Calculating the fitness value of each solution in the current population.

# The fitness function caulcuates the sum of products between each input and its corresponding weight.

fitness = numpy.sum(pop*equation_inputs, axis=1)

return fitness

def select_mating_pool(pop, fitness, num_parents):

# Selecting the best individuals in the current generation as parents for producing the offspring of the next generation.

parents = numpy.empty((num_parents, pop.shape[1]))

for parent_num in range(num_parents):

max_fitness_idx = numpy.where(fitness == numpy.max(fitness))

max_fitness_idx = max_fitness_idx[0][0]

parents[parent_num, :] = pop[max_fitness_idx, :]

fitness[max_fitness_idx] = -99999999999

return parents

def crossover(parents, offspring_size):

offspring = numpy.empty(offspring_size)

# The point at which crossover takes place between two parents. Usually it is at the center.

crossover_point = numpy.uint8(offspring_size[1]/2)

for k in range(offspring_size[0]):

# Index of the first parent to mate.

parent1_idx = k%parents.shape[0]

# Index of the second parent to mate.

parent2_idx = (k+1)%parents.shape[0]

# The new offspring will have its first half of its genes taken from the first parent.

offspring[k, 0:crossover_point] = parents[parent1_idx, 0:crossover_point]

# The new offspring will have its second half of its genes taken from the second parent.

offspring[k, crossover_point:] = parents[parent2_idx, crossover_point:]

return offspring

def mutation(offspring_crossover):

# Mutation changes a single gene in each offspring randomly.

for idx in range(offspring_crossover.shape[0]):

# The random value to be added to the gene.

random_value = numpy.random.uniform(-1.0, 1.0, 1)

offspring_crossover[idx, 4] = offspring_crossover[idx, 4] + random_value

return offspring_crossover

相关推荐