SIMPLE GENETIC ALGORITHM IMPLEMENTATION IN VB.NET AND IT'S APPLICATION TO LINEAR LEAST SQUARES APPROXIMATION PROBLEM.

Genetic algorithm is widely used for complex optimization problems solving. It is based on idea of natural selection. I'll not spend time describing theory. Just follow links below:

Generation5 website on Artificial Intelligence

Marek Obitko site on Genetic algorithms

When I became interested in GA (genetic algorithm) I spent some time looking for detailed description o GA. In presented here article I follow simple genetic algorithm given in Melanie Mitchell book "An Introduction to Genetic Algorithms". I present this algorithm implementation in VB.NET and widely known optimization problem solution using this implementation. Let me start from algorithm pseudocode:

Input:
N - size of initial population;
Pc - probability of crossover;
Pm - probability of mutation;

Pseudocode of Simple Genetic Algorithm

population = GenerateInitialPopulation(N)
Loop:  
Calculate fitness for each chromosome in population.
Repeat until size(newpopulation) < N
{
  Select a pair of parents from the current population according to their fitness (larger fitness - larger selection probability).
Selected parents are not removed from the current population.
With probability Pc apply crossover operation to form two offsprings.
if no crossover take place - form two offsprings that are copies of their parents.
Mutate the two offsprings at each locus (element of chromosome) with probability Pm.
Place the resulting offsprings in the new population
}  
If N is odd - remove random chromosome.

Replace the current population with the new population.
Goto Loop.

I've implemented this algorithm with one but really important change - the best object always survive (so called elitism).


Chromosome representation


We'll start with some abstract chromosome representation. All we basically need from chromosome - is ability:

 

- to reproduce itself (make exact copy)

 

- to perform crossover operation with other chromosome

 

- to mutate at some point (locus)

One of possible way to declare such requirements is to use interface.

 
Namespace GeneticAlgorithm

    Public Interface IChromosome

        Sub RandomInit()
        Sub CrossOver(ByVal mother As IChromosome, ByVal offspring1 As IChromosome, ByVal offspring2 As IChromosome)
        Sub Copy(ByVal dest As IChromosome)
        Sub MutateAt(ByVal locus As Integer)
        Function ValidLocus(ByVal locus As Integer) As Boolean

    End Interface


End Namespace

 

Fitness evaluation function intentionally is not included into IChromosome interface. The best explanation for this is presented below (in LSQ problem example). This is because in some cases chromosome fitness value depends not only on chromosome itself. So fitness evaluation function is included into BaseGeneticAlgorithm class. In cases when it's possible to evaluate fitness processing only chromosome object data we simply delegate this work to chromosome object itself.

 The next step is abstract genetic algorithm implementation. We need this implementation to be flexible as possible (to be more exact - as we can do that). So all we need from GA implementation is:

 

- to make all GA stuff

 

- to be able to create chromosomes (factory)

 

- to be able to compute chromosome fitness. (in some cases GA may delegate this function to chromosome itself but in certain cases it is impossible).

   
 
'
'  BaseGA.vb - Base class for simple genetic algorithm implementation.
'
'  Project: Framework
'  Author: S.Zabinskis
'  December, 2006
'
Imports System.Math
Imports System.Collections.Generic


Namespace GeneticAlgorithm

    Public MustInherit Class BaseGeneticAlgorithm(Of T As IChromosome)

#Region "Local Types"
        Public Class ValuedObject(Of X)
            Public _value As Double
            Public _rndval As Double
            Public _object As X
            Public Sub New(ByVal obj As X, ByVal value As Double)
                _object = obj
                _value = value
            End Sub
        End Class

        Private Class Mixer : Implements IComparer(Of ValuedObject(Of T))

            Public Function Compare(ByVal x As ValuedObject(Of T), ByVal y As ValuedObject(Of T)) As Integer Implements System.Collections.Generic.IComparer(Of ValuedObject(Of T)).Compare
                If x._rndval = y._rndval Then
                    Return 0
                End If
                Return IIf(x._rndval > y._rndval, 1, -1)
            End Function
        End Class

#End Region

#Region "Local Variables"
        Protected Shared _random As Random = New Random((DateTime.Now.Hour * 3600 + DateTime.Now.Minute * 60 + DateTime.Now.Second) * 1000 + DateTime.Now.Millisecond)
        Protected _population As New List(Of ValuedObject(Of T))
        Protected _iteration As Integer = 0
#End Region

#Region "Abstract methods"
        Public MustOverride Function EvalFitness(ByVal obj As T) As Double
        Protected MustOverride Function CanStop(ByVal iteration As Integer, ByVal population As List(Of T)) As Boolean
        Protected MustOverride Function PerformMutate() As Boolean
        Protected MustOverride Function PerformCrossover() As Boolean
        Public MustOverride Function CreateChromosome() As T
#End Region

#Region "Methods"

        Protected Function GetFitness(ByVal obj As T) As Double
            For Each vo As ValuedObject(Of T) In _population
                If vo._object.Equals(obj) Then
                    Return vo._value
                End If
            Next
            Return EvalFitness(obj)
        End Function

        Private Function NormalizePopulation(ByVal population As IEnumerable(Of T), ByVal reverse_flag As Boolean) As Boolean
            Dim minftn As Double = Double.MaxValue
            Dim maxftn As Double = Double.MinValue
            Dim sumftn As Double = 0
            Dim fitness As Double = Double.NaN

            Dim N As Integer = 0
            _population.Clear()

            For Each obj As T In population
                If reverse_flag Then
                    fitness = -EvalFitness(obj)
                Else
                    fitness = EvalFitness(obj)
                End If
                If maxftn < fitness Then
                    maxftn = fitness
                End If
                If minftn > fitness Then
                    minftn = fitness
                End If
                sumftn += fitness
                Dim vo As New ValuedObject(Of T)(obj, fitness)
                vo._rndval = BaseGeneticAlgorithm(Of T)._random.NextDouble()
                _population.Add(vo)
                N += 1
            Next
            sumftn -= minftn * N

            Dim normf As Double = 1.0 / sumftn
            For Each vo As ValuedObject(Of T) In _population
                vo._value = (vo._value - minftn) * normf
            Next
            'Debug.WriteLine("end of NormalizePopulation")
            Return N > 1
        End Function

        Protected Overridable Sub ShufflePopulation()
            _population.Sort(New Mixer())
        End Sub


        Private Function Pick(ByVal population As List(Of ValuedObject(Of T))) As T
            Dim p As Double = _random.NextDouble
            Dim sum As Double = 0
            Dim o As T = Nothing

            For Each obj As ValuedObject(Of T) In population
                sum += obj._value
                If sum >= p Then
                    Return obj._object
                End If
                o = obj._object
            Next
            Return o
        End Function


        Protected Sub Mutate(ByVal obj As T)
            Dim locus As Integer = 0
            While obj.ValidLocus(locus)
                If PerformMutate() Then
                    obj.MutateAt(locus)
                End If
                locus += 1
            End While
        End Sub

        Private Function PickBest(ByVal population As List(Of ValuedObject(Of T))) As T
            Dim maxf As Double = Double.MinValue
            Dim o As T = Nothing
            For Each obj As ValuedObject(Of T) In population
                If obj._value > maxf Then
                    maxf = obj._value
                    o = obj._object
                End If
            Next
            Return o
        End Function

        Private Function NextPopulation(ByVal reverse_flag As Boolean) As List(Of T)

            Dim newpopulation As New List(Of T)
            Dim N As Integer = _population.Count
            Dim counter As Integer = 0

            ShufflePopulation()

            ' the best survives always
            Dim bestObj As T = PickBest(_population)
            newpopulation.Add(bestObj)
            Dim offspring1 As T = Nothing
            Dim offspring2 As T = Nothing


            While counter < N

                Dim parent1 As T = Pick(_population)
                Dim parent2 As T = Pick(_population)
                While parent1.Equals(parent2)
                    parent2 = Pick(_population)
                End While

                offspring1 = CreateChromosome()
                offspring2 = CreateChromosome()

                If PerformCrossover() Then
                    parent1.CrossOver(parent2, offspring1, offspring2)
                Else
                    parent1.Copy(offspring1)
                    parent1.Copy(offspring2)
                End If
                Mutate(offspring1)
                Mutate(offspring2)
                newpopulation.Add(offspring1)
                newpopulation.Add(offspring2)
                counter += 2

                While counter > N
                    newpopulation.RemoveAt(counter - 1)
                    counter -= 1
                End While
            End While

            If NormalizePopulation(newpopulation, reverse_flag) Then
                Return newpopulation
            End If
            Return Nothing
        End Function


        Public Sub Run(ByVal initial_population As IEnumerable(Of T), Optional ByVal reverse_flag As Boolean = False)
            If NormalizePopulation(initial_population, reverse_flag) Then
                _iteration = 0
                Dim population As List(Of T) = Nothing
                Do
                    population = NextPopulation(reverse_flag)
                    _iteration += 1
                Loop Until CanStop(_iteration, population)
            End If
        End Sub

#End Region

    End Class


End Namespace
   

As you can see, there are few "degrees of freedom" here:
- generic IChromosome-like parameter T
- virtual functions
Virtual functions are for:

Public Function CreateChromosome() As T

The most important function that may create chromosomes of type T

Public Function EvalFitness(ByVal obj As T) As Double Also important function that evaluates fitness of chromosome
Protected Function Function PerformMutate() As Boolean Decide if we are going to perform mutation
Protected Function PerformCrossover() As Boolean Decide if we are going to perform crossover
Protected Function CanStop(ByVal iteration As Integer, ByVal population As List(Of T)) As Boolean Decide whether it is time to stop

Please realize that it is the very first iteration to something like GA framework. Now it may be used only for learning purposes. And of course VB.NET really isn't right choice for intensive computations. Download source code of base classes here.

Application of GA to well known numerical math problem (Linear Least Squares approximation of set of points)

Lets generate set of points with polynomial y=x^2 - 2*x + 1 on interval [0,1]. We will try to use genetic algorithm to find least squares approximation of this point set with 2nd degree polynomial: a(0)*x^2 + a(1)*x + a(2). Ideally we should get a(2)=1.0, a(1)=-2.0, a(2)=1.0. First, let us assume that all approximating polynomial coefficients belong to interval [-4.0, 4.0]. Now, follow classical bit-string approach - represent approximating polynomial as one long bit string, containing 3 shorter bit strings for every coefficient. Each shorter bit string, representing polynomial coefficient is mapped to [-4.0, 4.0] range using expression:

value = -4.0 + (8.0 * b ) / (2^m-1)


where b is 'integer' value of bit-string and m is number of bits. b changes from 0 to 2^m-1, so corresponding coefficient will cover [-4.0, 4.0] range. For 8-bit representation we have for each coefficient:

value = - 4.0 + (8.0 * b) / 255

where b = 0,1,2, ...., 255.
I did very simple application to demonstrate how "bit-string" GA solves this problem.

User may set :
- population size;
- max.number of iterations;
- probabilities of crossover and mutation;
- range for approximation polynomial coefficients:
- size of bit string.

Download video clip illustrating algorithm in progress.