Introduction
In The Blind Watchmaker, Richard Dawkins discusses a very simple genetic algorithm (GA) to evolve the phrase "METHINKS IT IS LIKE A WEASEL" starting from randomly generated strings. Despite the fact that Dr. Dawkins was very careful to point out the limitations of this simulation:
Although the monkey/Shakespeare model is useful for explaining the distinction between single-step selection and cumulative selection, it is misleading in important ways. One of these is that, in each generation of selective 'breeding', the mutant 'progeny' phrases were judged according to the criterion of resemblance to a distant ideal target, the phrase METHINKS IT IS LIKE A WEASEL. Life isn't like that. Evolution has no long-term goal. There is no long-distance target, no final perfection to serve as a criterion for selection, although human vanity cherishes the absurd notion that our species is the final goal of evolution. In real life, the criterion for selection is always short-term, either simple survival or, more generally, reproductive success.
Intelligent Design proponents and other creationists have made a lot of noise about how it doesn't prove anything about evolution. Several have made factually incorrect claims regarding the implementation of the simulation in their attempts to discredit it.
I've implemented a variant of Dr. Dawkins' simulation to make it clear that, while simplistic, the Weasel Program does demonstrate exactly what he says it does.
The Goal
The purpose of the Weasel Program is to show that cumulative selection can search a solution space much faster than single-step selection. The biological analogy, of course, is that differential reproductive success operating on small variations leads to significant changes in populations over generations.
A secondary goal of the Weasel Program is to address the common misconception of evolution as "random chance" promulgated by many creationists. The cumulative selection demonstrated by this simple simulation shows how just one of many observed mechanisms results in evolution.
Variations
My version of the Weasel Program uses the GA Engine I wrote to solve Dave Thomas' Steiner Network design challenge. This engine operates on bit strings, so I use bits instead of characters. Each character in the string is represented by the seven bits of its ASCII representation. This significantly increases the search space — there are 128 possible values for each character rather than the 27 used by Dr. Dawkins.
Another significant variation is that reproduction in this GA engine is stochastic. The most fit individual in a population is not guaranteed to be selected as a parent of the next generation population. In the tournament selection model used, each parent is generated by picking two genomes at random from the population and selecting the most fit of those two.
Finally, it is important to note that, contrary to the claims of some of Dr. Dawkins' detractors, no bits or characters are "locked in" at any time. All bits are subject to mutation. Any character may be disrupted either by mutation or crossover.
Implementation
As described in the GA engine description, applying the GA engine to a problem requires following a few simple steps:
- Create a class to represent the characteristics of the problem The problem class
- Implement a method to create instances of the problem class The
- Implement the required generic functions for the problem:
genome-length
The genome length is simply the number of characters in the
target string times seven.
fitness
The fitness of a genome is the number of bits that match the
target genome. A higher value means the genome is more fit.
A value equal to the length of the genome indicates the
solution has been found.
fitness-comparator
This problem can use the - Implement a
terminatorfunction
This problem can use the - Run
solve
weasel-problem contains only the
target string and the target string's bit vector genome.
make-weasel-problem function creates an instance
of weasel-problem from a target string and ensures
that the bit vector genome is also set.
greater-comparator
provided by the GA engine.
fitness-terminator provided
by the GA engine.
Results
My first simulation used a population of 1000 and a mutation rate of
2%:
The fitness of the best individual rapidly climbed to above 180.
After hovering betweeen 190 and 194 for several dozen generations, an
optimal genome was found at generation 458.
(solve-weasel (make-weasel-problem "Methinks it is like a weasel") 1000 0.02)
Summary
This variation of the Weasel Program searches a much larger solution space than the original. Dr. Dawkins' version used a set of 27 characters (the uppercase alphabet plus a space). There are 27^28 unique strings of the correct length using that set of characters, or roughly 10^40. When using 7 bits for each character, the number of unique strings of the correct length is 2^196, which is around 10^59. The power of differential reproductive success is shown by the fact that at most 229,500 different strings out of that enormous space were evaluated before the optimal solution was found.
Dr. Dawkins' Weasel Program does exactly what he claims: it demonstrates the power of cumulative search over single-step search. Further, it provides a gentle introduction to one of the important mechanisms of biological evolution, addressing the all too common misconception that evolution is based on "blind chance."
What the Weasel Program does not do, or claim to do, is demonstrate the capabilities of general purpose genetic algorithms. The use of an explicit target helps simplify the program, making it more useful as a teaching aid. The capabilities of more general GAs are better demonstrated in domains such as the Steiner Problem.
Source Code
The code is available in two files:
This requires the GA Engine code:Please contact me by email if you have any questions, comments, or suggestions.