Genetic Algorithm In Solving Equations

Tuesday, August 2, 2016

Genetic algorithms is another model of machine learning that has been inspired by nature. Although it is not that much used in modern days because much has been overshadowed by deep learning and neural networks.

In nature and in theory of evolution there is only the survival of the fittest, where only the fittest genes are brought forwards in generations and crossing over of these genes in reproduction and mutations make the subsequent generations stronger and the weaker genes die off.

The same is applied in a genetic algorithm where they are used to solve for np hard problems. However from what I understood in a genetic algorithm you need to give it a target to achieve, and unlike a neural network it can't predict and it is not good for pattern recognition.

But like I said it's very good at solving np hard problems, for example lets say

$$x + y + z + a + b = 40 $$

and you need to find the values for $x,y,x,a,b$ you can implement a genetic algorithm for this purpose. And not only this simple equation you use it for any complex equation.

In every genetic algorithm there are few basic steps,

First you start with a population of a given number of chromosomes, in our example each chromosome will have four values which are generated randomly, which represent $x,y,x,a,b$ respectively.

So a single chromosome can be like this, we can also represent them in binary form,

chromosome1 = [5,12,2,1]

And then we evaluated the fitness of each chromosome by using a fitness function/cost function, which gives us a value in how far each chromosome is from the target, in a genetic algorithm it is the most important part is coming up with the best fitness function. So the closer the chromosome is to the target the fitter the chromosome is and fitness gets reduced as a chromosome gets further away from the target.

In this example we add up the four values in a chromosome and deduct it from the target to see how far it is from the target.

chromosome1 = [5,12,2,1]
5+12+2+1 = 20
40-20 = 20

Just like that calculate the fitness of all the chromosomes and then we select the fittest chromosome from them (the ones that are closest to the target).

In selecting the fittest chromosomes we commonly use a roulette wheel algorithm to pick the ones with the highest probability.

Then just like what happens in real DNA we splice two fit chromosomes, we splice the arrays at given points. And we join them to form a new chromosome.

chromosome1 = [5,12,2,1] = 20
chromosome2 = [8,7,3,1] = 18
chromosome1 >< chromosome2

[5,12] ><[8,7]
New chromosome = [5,12,8,7] = 32 

So you can see the child chromosome is fitter and more closer to the target than its parent chromosomes.

Just like that a selected chromosomes are crossed and child generation is formed, and the weaker chromosomes are removed from the population.

Also we can add some mutations to the chromosomes, adding random values in between them, we can set a given value to the number of mutations. Like 0.3 percent of  chromosomes should be mutated.

And this cycle repeated until the target is achieved or a given number of generations are spawned.

It's pretty basic and very simple to implement. But there are some limitations,

  • There should be a target, or result. Because we are calculating the fitness of the  population chromosomes it's difficult to implement one without knowing the end result which means it is not good on predicting or recognizing patterns unless you use it together with a neural network.
  • The genetic algorithms are very inefficient, as the number of chromosomes increase the time it is taken for come to a result also increase, imagine having 10,000 chromosomes and going through 10,000 of them mutating, crossing over and assessing fitness and repeating the cycle for anther 1000 generations takes a lot of time.
But it's also good to learn them and they are mostly used in research areas like protein folding, and other medical research work.

I used the following research paper in implementing this, https://arxiv.org/pdf/1308.467

The following genetic algorithm was used for matching attacks on clash of clans game, where the attackers strengths were given a value and the defenders defense was given a value. And the algorithm was used to decide who should attack who.

I will not go into detail about the Clash of Clans problem because it is still an undergoing interesting research for a later time.

However this algorithm is not complete, because it tends to give duplicate attacks, and also matches some absurd attacks that are not practical. Just like in our above example the final answer can be

answer = [40,0,0,0] = 40
x = 40
y = 0
z = 0
a = 0
b = 0

This answer is also correct but it is not the answer that we are looking for, because the genetic algorithm doesn't have a common sense to whether the answer is practical or not, it is only interesting in achieving the target target no matter what.

The code for the genetic algorithm is available on GitHub : https://gist.github.com/rukshn/f5cc571aeb2ca149d472b7701bf75734

No comments :

Post a Comment