Fun with Markov Network Brains

An introduction for dummy programmers

Over the course of about 1000 generations, a Markov Network Brain evolves dumb bugs (left) into something capable of finding food immediately (right). The “bugs” achieve this with no awareness of their environment beyond their physical bodies and antennae.

Click here for the demo (known to work with FF 68.0.1, not 100% cross-browser)

A couple of months ago, I was at a wedding and (naturally) the conversation turned toward the capabilities and limitations of Artificial Intelligence.

I am not an AI expert, and my exposure to it has been sketchy at best. I have implemented algorithms but have never really done a deep dive into any of them to understand the mechanics. So when the conversation brought up the concept of genetic, or evolutionary, algorithms, the other fellow and I both had to confess that we really didn’t know that much about them.

What little we did discuss got me curious, and on the plane ride home I came across an article on Markov Network Brains (MNBs). When I first read about MNBs, something resonated deeply with my past experience in agriculture and medicine, and I wanted to dig deeper into the mechanics behind them.

A gut feeling of familiarity was only enough to whet the appetite; what I really needed was a simple implementation that I could step through, to observe the changes as they happened. I am a fan of browser-side JavaScript for solving problems (there is always a compiler and debugger handy), so I was fortunate to stumble across an incomplete JavaScript implementation on GitHub.

My initial intention was simply to step through the code to understand the process; however, after addressing a couple of minor visual bugs, I found myself doing some heavy lifting to get it to work. A deep tear-down was necessary, and it resulted in a fun little simulation.

What better way to really understand what was going on?

A Layman’s Description

Markov Network Brains are evolutionary algorithms based on modern models of genetics and evolution. The same natural processes that allow bacteria to become drug-resistant can be used to breed animals for a specific purpose, or to breed a computer program well suited to solving a specific problem.

The point of any software algorithm is to solve a problem. In traditional programming, we do this by having very clever humans write a computer program that inspects some set of input values and creates a new set of output values (technically: a function).

MNBs (really, Machine Learning algorithms in general) are no different: we have a problem that needs solving, a process for solving it, and some inputs to base it on. What differentiates an MNB is that we do not write the program directly; we allow it to be randomly generated, and then slowly bring it closer to solving the problem by automatically testing small random changes. (Actually, when phrased that way, it doesn’t sound so different at all.)

There are three inter-related but independent components that are important to understand with MNBs: Genome, Brain, and Breeder. The ability of the algorithm to learn (become better able to solve the problem) is tied to the way these three components work together:

Three main components
  1. Genome: this is like the programmer’s uncompiled code.
  2. Brain: one could think of this as compiled, executable software. It also has memory allocated for storing information.
  3. Breeder: the developer, judging whether the code is successful or not.

Like any software, these three parts are distinct, but strongly inter-related. Also, like any software, the interesting parts happen at the transitions between them.

Genome Creation

The Genome is initially created as nothing more than a random array [genome.js:23]. Like a DNA genome, it is a list that will be used to create and recreate actors. In our case, it will be compiled into The Brain.
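As a minimal sketch of that first step (the function name and genome length here are my own assumptions, not the names used in genome.js), the genome is nothing more than an array filled with random values:

```javascript
// Hypothetical sketch of genome creation: just a flat array of
// random byte-sized values. Nothing here "means" anything yet;
// meaning only emerges when the brain compiler interprets it.
function createGenome(length = 5000) {
  const genome = [];
  for (let i = 0; i < length; i++) {
    genome.push(Math.floor(Math.random() * 256)); // values 0..255
  }
  return genome;
}
```

Every value is drawn independently, so two freshly created genomes share nothing but their length.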

Genome → Brain

A Genome is compiled into a Brain by reading the genome and using the data as the basis for allocating a quantity of memory, initialising the memory values, and allocating transforms for the memory [brain.js:18].

The transforms, or gates, are predefined functions such as or, and, and xor (other transforms are possible; use your imagination) [gates.js:5]. Lastly, the genome is used to map memory elements as inputs and outputs for each transform [brain.js:104].

Remember that these values were randomly generated, so (at least on the first pass) these transforms, as well as the quantity of memory, are selected randomly. Basically, you have generated a completely random program acting on random memory elements.
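To make the compile step concrete, here is a toy version of the idea (the structure, field names, and four-values-per-gate encoding are my own simplifications, not the actual brain.js layout). Successive genome values select a gate type and wire its inputs and output to memory cells:

```javascript
// Hypothetical sketch of compiling a genome into a brain. Each group
// of four genome values becomes one gate: [type, inputA, inputB, output].
const GATES = {
  0: (a, b) => a & b, // and
  1: (a, b) => a | b, // or
  2: (a, b) => a ^ b, // xor
};

function compileBrain(genome, memorySize = 50) {
  const brain = { memory: new Array(memorySize).fill(0), gates: [] };
  for (let i = 0; i + 3 < genome.length; i += 4) {
    brain.gates.push({
      fn: GATES[genome[i] % 3],        // which transform to apply
      inA: genome[i + 1] % memorySize, // first input memory cell
      inB: genome[i + 2] % memorySize, // second input memory cell
      out: genome[i + 3] % memorySize, // memory cell to write to
    });
  }
  return brain;
}

function step(brain) {
  // One "thought": run every gate once, writing results back to memory
  for (const g of brain.gates) {
    brain.memory[g.out] = g.fn(brain.memory[g.inA], brain.memory[g.inB]);
  }
}
```

Feed it a random genome and you get exactly the situation described above: random gates acting on random memory cells.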

An infinite number of monkeys typing on an infinite number of typewriters…

Brain → Breeding

Once The Brain has executed, it will have generated some outputs. It is up to The Breeder to judge whether the outputs were of any value, or more importantly, which of these executions were most valuable. In a more complex environment, it is also reasonable for The Breeder to observe The Brain in action.

The score distribution of the bugs starts at a mean of -0.02 (σ = 0.2) and improves to 0.25 (σ = 0.1) after 24 generations. A plateau was reached at about generation 17.

To achieve this, we need to run several different Brains many times. Out of these many runs, most will be useless, but some will actually be useful. Like an animal breeder, we can select the most useful of these genomes and use them as the basis for better genomes, discarding the rest [evolve.js:55].
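The selection step can be sketched in a few lines (the function name, the population shape, and the 20% cut-off here are my own assumptions; evolve.js may well do this differently):

```javascript
// Hypothetical sketch of the breeder's cull: rank the population by
// score and keep only the top fraction as breeding stock.
function selectFittest(population, keepFraction = 0.2) {
  const ranked = [...population].sort((a, b) => b.score - a.score);
  const keep = Math.max(1, Math.floor(ranked.length * keepFraction));
  return ranked.slice(0, keep); // the survivors; the rest are discarded
}
```

The `Math.max(1, …)` guard simply ensures at least one genome always survives to seed the next generation.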

Breeding → Genome

Once The Breeder has selected the most successful of the programs, these programs can be used as the basis for trying new variations.

This is done through a reproduction process in which genomes are randomly intermixed with one another (sexual reproduction) to produce a new algorithm with a new mix of decision-making processes [evolve.js:93].

The key is that each of these newly created genomes is a little less random than its predecessors; good decision-making structures are kept, and bad ones are culled. Genomes identified as “good” are recombined with one another to see if they result in something even better. Over time, this process will result in progressively improving programs that get closer to solving the problem.
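A simple way to picture the intermixing (this single-point crossover plus point mutation is a common textbook scheme and my own assumption about the mechanics, not necessarily what evolve.js does):

```javascript
// Hypothetical sketch of sexual reproduction between two genomes:
// splice the parents at a random crossover point, then occasionally
// flip individual values as random point mutations.
function breed(mother, father, mutationRate = 0.01) {
  const cut = Math.floor(Math.random() * mother.length);
  const child = mother.slice(0, cut).concat(father.slice(cut));
  for (let i = 0; i < child.length; i++) {
    if (Math.random() < mutationRate) {
      child[i] = Math.floor(Math.random() * 256); // point mutation
    }
  }
  return child;
}
```

With a mutation rate of zero, every value in the child comes verbatim from one parent or the other; the small non-zero rate is what keeps injecting novelty into the gene pool.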

A second plateau was reached around generation 120, before the last improvement at around generation 230 (left), where the score reached 0.85 (σ = 0.2). Due to the element of luck, a different run (right) took nearly 10 times longer to discover the second plateau.


As with any system that has randomness involved, debugging can be painful. There is always a question of whether an observed behaviour is a result of luck (good or bad), or faulty programming.

The problem is most observable when bugs are randomly placed right on top of food and then do nothing. These bugs receive a high reward purely based on luck, while bugs actively moving in search of food end up being culled for not being as successful. This element of luck is undesirable.

A lucky bug: the blue bug has no neural activity but was randomly populated right on top of a food source

As a result of pondering this problematic element of luck, I made two additions worth noting: culling of useless brains, and reuse of successful genetics. Both are based on animal husbandry practices.

Early Culling

If the randomly generated program does not result in any output, it is of no value to us.

The brains created are constrained to an array of approximately 50 elements of memory. Given 3 outputs, any randomly chosen output location has only a 6% (3 in 50) chance of landing on a meaningful output. A larger memory makes these odds even worse.

To overcome this, I ended up creating a check in the generation routine. Immediately after creating a new brain, its transforms are scanned to determine if they will take any action [evolve.js:161]. If none of the transforms in the brain ever write to the output segment, the bug is immediately discarded and a new bug is generated in its place [evolve.js:87].
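The check itself amounts to a single scan over the gates (the function name, brain shape, and the idea that outputs occupy a contiguous segment of memory indices are my own assumptions for illustration):

```javascript
// Hypothetical sketch of the early-culling check: if no gate ever
// writes into the output segment of memory, the brain can never act
// on the world, so it is discarded before it wastes any cycles.
function hasOutput(brain, outputStart, outputEnd) {
  return brain.gates.some(g => g.out >= outputStart && g.out < outputEnd);
}
```

Because it only inspects the wiring, not the behaviour, the check is cheap: one pass over the gate list versus thousands of simulated cycles for a bug that was never going to move.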


Karmic Scoring

While investigating the element of luck, it occurred to me that animal breeders keep track of their most successful breeding stock: animals with good parentage are likely to produce better offspring than animals with poor parentage. To simulate this, I introduced a concept of karma. Karma is a score that is attached to the genome rather than the bug itself. It is calculated at the end of a cycle by taking the bug’s score and averaging it with its genome’s score [evolve.js:32]. Newly created genomes inherit their predecessors’ karmic score by averaging the scores of the parent genomes [evolve.js:101].
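The arithmetic is deliberately simple (these function names are mine; only the two averaging rules come from the description above):

```javascript
// Hypothetical sketch of karma scoring. Karma is attached to the
// genome, not the bug: a rolling average with the latest bug's score.
function updateKarma(genomeKarma, bugScore) {
  return (genomeKarma + bugScore) / 2;
}

// A child genome starts with the average of its parents' karma.
function inheritKarma(motherKarma, fatherKarma) {
  return (motherKarma + fatherKarma) / 2;
}
```

Because each new result only counts for half, a single lucky (or unlucky) run moves karma, but cannot erase a genome’s history in one generation.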

Each brain is monitored for activity to help distinguish decision making from “luck”. From left to right: the “karma” and “bug” scores, sensors, memory state, and outputs (speed change, lean left, lean right)

When it comes time to compare genomes for effectiveness, karma is used. By evaluating the overall genome rather than the individual bug, unlucky genomes get another chance to prove themselves. Continued lack of success will see karma slowly decline (eventually resulting in a cull), while a single “bad” generation will not cause an otherwise successful genome to be lost.

Unfortunately, I have no clear evidence that either of these was useful or effective, as they were introduced to compensate for what turned out to be a defect in the brain processing itself. While logically sound, the only evidence is the long delay in the very first generation created. This long delay is partially caused by thousands of bugs being rejected; I can only imagine what running each of these useless bugs through a couple of thousand cycles of activity would cost.


At some point in my reading, I came across a statement that evolutionary programming has low value because similar results can be achieved faster using other techniques (there is a counter-argument that they can discover solutions humans cannot consider). This may be true, but I had a lot of fun building this simulation and am still fascinated by Markov Network Brains.

Writing software is enjoyable. There is beauty and elegance to the processes involved that sometimes get lost in work deadlines and customer expectations. Sometimes the act of writing software is just an expression of beauty and creativity. Playing with this simulation was just that: a bit of art for art’s sake.

From a philosophical standpoint, the act of truly understanding the evolutionary processes involved in this algorithm has given me a new perspective on complex, self-forming systems. From interpersonal relationships in the office, to black-market economics, to the way students learn, to political alliances, I now just “see it” slightly differently than I did.

It was interesting to watch my own biases in decision making. My original hypothesis was that the bugs would evolve a spiral search pattern, and what I perceived as defects in their behaviour led me to increase the punishment for touching the boundary, in an effort to force them to conform. In the end, I was surprised to wake up one morning and discover the bugs had evolved a pattern of ignoring the pain, and using the wall as a way to orient themselves in their environment. My own little Stanford Prison Experiment.

Lastly, this project has been a reminder of how useful it is to build throw-away programs. When learning something new, it is far better to build a small piece of code than to tackle a giant problem. A small, simplified model can be held in your head while you learn. When your organisation requires a reliable solution, reach for a battle-tested library; when you want to understand, build from scratch.

To quote Feynman:

What I cannot create, I do not understand

(Ironically? Poetically? I found that quote via Chris Adami’s blog while searching for the references for this article … Adami is the guy who started this whole mess for me).

Deep understanding is a funny thing, sometimes you come to learn that the battle-tested library isn’t as useful as you thought.

Your Next Steps

If you are interested in Markov Network Brains, you should

  1. Open the simulation
  2. Press F12
  3. Put a break-point somewhere in the code
  4. Start stepping through it

Stepping through running code and observing the changes is the best way to learn about a program’s behaviour.

Fork and Fix

A co-worker was asking me about this program and started suggesting all kinds of great ideas I could implement to make it more interesting. Suddenly I realised that I had taken everything I wanted from this little toy, and it was time to move on to other projects. Instead, I suggested he should make the changes!

Fork the project from the same point I did and build it up yourself, or fork my version and make a cool modification … either way, I would love to see what you come up with.

  • Change the physics (collisions, spherical world, …)
  • Make the bugs aware of one another (watch competitive behaviour evolve? co-operative behaviour?)
  • Find a more challenging problem for them to solve (randomise the food, introduce fight and flight to make them prey, …)
  • Put the brain in a separate thread (someone please do this) or on the GPU
  • Probabilistic Logic vs Binary
  • … trust me: the list could go on forever …

Further Reading

You could also read more about Markov Network Brains from people who actually know what they are talking about. I have taken some liberties with the metaphors I have used; learning the shared metaphors and terminology would be useful.

Having discovered a passion for business data analysis in my teens, I love to share the beauty of data and complex systems with other devs and clients alike.