Many years ago I used to read articles at a site called Generation5
, and one of the articles was a Hello World program for Genetic Algorithms
. I have since then seen people solving this with different programming languages and I myself have tried the example in Python and F#, but now it’s time to do it in Go.
If you haven’t read about Genetic Algorithms (GA) before then you will get a brief introduction below.
The idea is to mimic the biological evolution to evolve an answer to a problem, i.e. survival of the fittest
. This is done by creating a population
of chromosomes
. Each chromosome is representing a possible answer to the problem. The population will be creating a new generation by producing new offsprings
, which is done with the help of a fitness selection function
and a crossover
function. This shall hopefully create a new generation with chromosomes that are closer to the answer of a problem. There can also be some random mutation
of new offsprings.
This might be a little bit abstract, but I hope some code might make things more clear.
I will implement a genetic algorithm that will evolve an answer that will represent the string “Hello Go”. The first step therefore is to decide how to represent the chromosome in the code.
type gene byte
type individual struct {
chromosome []gene
fitness int
}
First I’m creating a type definition for the gene
, that will be the building block for the chromosome, and a struct to represent the individual
in a population. With these in place we can define target
which can be be implemented like this:
target := []gene{'H', 'e', 'l', 'l', 'o', ' ', 'G', 'o'}
So the goal is to evolve an answer that looks like the target above. Each character in our chromosome is representing a gene, each gene being an instance of a particular allele
(value). We will only use visible characters as our alleles.
Now when we know how our chromosomes should look like, lets see how we can determine its fitness in a population. The fitness calculation function is the main step in the natural selection of new offsprings.
func calculateFitness(target []gene, chromosome []gene) int {
fitness := 0
for i := 0; i < len(target); i++ {
diff := int16(target[i]) - int16(chromosome[i])
fitness += int(abs(diff))
}
return fitness
}
Here will we use the target to calculate the fitness of a chromosome. We are calculating the delta value between each gene (character’s ascii value) with the target one and then sum all the deltas to a fitness value.
The following chromosome, “Heklo Fo”, would then have the fitness value of 2, since both k and F are just one step away from its right value.
func TestCalculateFitness(t *testing.T) {
target := []gene{'H', 'e', 'l', 'l', 'o', ' ', 'G', 'o'}
chromosome := []gene{'H', 'e', 'k', 'l', 'o', ' ', 'F', 'o'}
fitness := calculateFitness(target, chromosome)
if fitness != 2 {
t.Errorf("Expected fitness to be 0, but got %d", fitness)
}
}
Next step to look at is how we can create new offsprings, which is called crossover
or recombination
in a GA. This is done by taking two chromosomes (parents) and mix theirs genes with each other, see the example below:
Mom: Heklo Fo
Dad: Ghjkn E$
Child: Hekln E$
The locus
(index) where the crossover should occur between dad and mom is decided completely random. The selection of who will be parents is also done randomly, however we will only pick parents with a fitness value among the 50% best chromosomes. This could be expressed like this in our code:
func crossover(population []individual, target []gene) individual {
chromosomeLength := len(population[0].chromosome)
topHalf := population[:len(population)/2]
parent1 := topHalf[rand.Intn(len(topHalf))]
parent2 := topHalf[rand.Intn(len(topHalf))]
locus := rand.Intn(chromosomeLength)
child := make([]gene, 0, chromosomeLength)
child = append(child, parent1.chromosome[:locus]...)
child = append(child, parent2.chromosome[locus:]...)
fitness := calculateFitness(target, child)
return individual{chromosome: child, fitness: fitness}
}
One more thing that we should do in our selection of new offsprings is to select an elite of chromosomes. This is done by keeping the 10% best chromosomes of the current population. By doing this we will make sure that the next generation have chromosomes that are at least as good as its parents generation.
func evolveGeneration(population []individual, target []gene) []individual {
eliteSize := len(population) / 10 // 10% of the population
sort.Slice(population, func(i, j int) bool {
return population[i].fitness < population[j].fitness
})
newPopulation := make([]individual, 0, len(population))
newPopulation = append(newPopulation, population[:eliteSize]...)
for i := eliteSize - 1; i < len(population); i++ {
newPopulation = append(newPopulation, crossover(population, target))
}
return newPopulation
}
We could also add mutation to the mix when creating a new generation of chromosomes. This is done to avoid that the GA will get stuck in a local maximum in the search space of the problem. However I haven’t added this to the solution.
The complete solution for the Hello Go
problem will look like this:
package main
import (
"fmt"
"math/rand"
"sort"
)
type gene byte
type individual struct {
chromosome []gene
fitness int
}
func abs(x int16) int16 {
if x < 0 {
return -x
}
return x
}
func calculateFitness(target []gene, chromosome []gene) int {
fitness := 0
for i := 0; i < len(target); i++ {
diff := int16(target[i]) - int16(chromosome[i])
fitness += int(abs(diff))
}
return fitness
}
func newChromosome(size int) []gene {
min := 32 // from ASCII space
max := 123 // until ASCII z
chromosome := make([]gene, size)
for i := 0; i < size; i++ {
chromosome[i] = gene(rand.Intn(max-min) + min)
}
return chromosome
}
func NewPopulation(size int, target []gene) []individual {
population := make([]individual, size)
for i := 0; i < size; i++ {
chromosome := newChromosome(len(target))
fitness := calculateFitness(target, chromosome)
population[i] = individual{chromosome: chromosome, fitness: fitness}
}
return population
}
func crossover(population []individual, target []gene) individual {
chromosomeLength := len(population[0].chromosome)
topHalf := population[:len(population)/2]
parent1 := topHalf[rand.Intn(len(topHalf))]
parent2 := topHalf[rand.Intn(len(topHalf))]
locus := rand.Intn(chromosomeLength)
child := make([]gene, 0, chromosomeLength)
child = append(child, parent1.chromosome[:locus]...)
child = append(child, parent2.chromosome[locus:]...)
fitness := calculateFitness(target, child)
return individual{chromosome: child, fitness: fitness}
}
func evolveGeneration(population []individual, target []gene) []individual {
eliteSize := len(population) / 10 // 10% of the population
sort.Slice(population, func(i, j int) bool {
return population[i].fitness < population[j].fitness
})
newPopulation := make([]individual, 0, len(population))
newPopulation = append(newPopulation, population[:eliteSize]...)
for i := eliteSize - 1; i < len(population); i++ {
newPopulation = append(newPopulation, crossover(population, target))
}
return newPopulation
}
func evolve(population []individual, target []gene, generations int) {
newPopulation := make([]individual, len(population))
copy(newPopulation, population)
for i := range generations {
newPopulation = evolveGeneration(newPopulation, target)
bestIndividual := newPopulation[0]
fmt.Println("Best individual:", string(bestIndividual.chromosome), "Fitness:", bestIndividual.fitness)
if bestIndividual.fitness == 0 {
fmt.Println("Generation:", i)
fmt.Println("Found solution:", string(bestIndividual.chromosome))
return
}
}
}
func main() {
fmt.Println("Evolve...")
target := []gene{'H', 'e', 'l', 'l', 'o', ' ', 'G', 'o'}
populationSize := 2048
generations := 1000
population := NewPopulation(populationSize, target)
evolve(population, target, generations)
}
One part I haven’t mention earlier is how we are initializing our population. This is done with the help of the function NewPopulation
which is using the NewChromosome
function to randomly create a chromosome. The gene is randomly selected by using a random number between 32 and 123 which is the span for "visible" ascii characters.
Here is the output of the program.
The code has some room for improvements and there are some optimizations we could do to this program. Perhaps some mutation would be interesting to add, or the size of the population could be adjusted, or another size of the elite group, or perhaps we should use another representation of the chromosome (genes as bits instead).
However, the best way to get a deeper understanding of genetic algorithms is to start experimenting with these attributes to see how it affects the end result.
Good luck :-)