Build a Simulation

Author

Darren Irwin

Now that we’ve learned a lot of programming methods, let’s build a more complex simulation. We could choose all sorts of things to simulate. Here I have chosen a somewhat whimsical example of a hypothetical biological process that has some resemblance to known examples (e.g. male lizards that have three mating strategies, each of which is better than one of the others). We’ll use this to explore the steps involved in building a simulation.

Rock—paper—scissors on a landscape

Our task: Imagine a population of individuals spread out on a landscape. Individuals come in three possible behavioural morphs, with each having one strategy. We’ll call these rock, paper, and scissors, via analogy to the popular game. Individuals move randomly on the landscape. When they encounter each other, they compete in such a way where:

paper defeats rock
scissors defeats paper
rock defeats scissors

The losing individual disappears, and the winning individual reproduces, causing one more individual to appear of the same winning type.

Our goal is to write a program that will carry out the simulation and allow us to see the dynamics that these simple rules produce on the landscape.

First step: consider data structures and the high-level logic of our program

There are many ways to solve any particular programming problem. Our goal is to come up with something that is reasonably efficient and easily understood by others reading our code (and ourselves in the future ;). Let’s first think of how to convert the language given by our task above into more precise concepts that we can specify in code.

“landscape”: We could conceptualize this as a square of space (bounded by 0 and 1), with locations given by an x value and a y value (both floating-point numbers).

“population of individuals”: We could view this as a list or vector, with the first element representing the first individual, the second representing the second, and so forth. Different vectors could represent different aspects of these individuals (one vector for the x values, one for the y values, etc.).

“strategies”: We can encode these as single letters (i.e. Chars), with the strategies for the whole population being a vector of these.

“move randomly on the landscape”: We can move individuals by adding a random number drawn from a normal distribution (with the standard deviation specified in a way that keeps movements small).

“encounter each other”: This is vague as stated, and there are many ways we could make this more concrete. One way is to pick a random individual that will interact, and then determine the closest individual to that first one. These two then compete.

“compete”: The strategies of the two individuals will be compared, and the winner determined by the rules above. Then, we need to replace the losing individual with a new individual with the strategy of the winner. (We realize though that in terms of the coding, this is the same as just changing the strategy of the losing individual. We will make use of this as it means we don’t have to eliminate cells from our data structure, only change them.)

“see the dynamics”: This implies we want to watch the simulation play out over a long period of time. So we need to use a loop to run the steps for many many iterations. We also need to show the results visually. We could either watch the simulation in real time or produce a movie by saving frames and then assembling them (fortunately, this is easy as we’ll see below).

Now, build the program in steps, using functions for organization

After we’ve spent some time (see above) thinking about how we will conceptualize our simulation in terms of data structures and tasks, we are ready to start writing code.

When I started programming, I tended to write long complex programs as a single long script that does everything. I’ve gradually learned that organizing programs into functions, each of which does one clear task, leads to many benefits (e.g., good logic, undertandability of code, and efficiency). In Julia, this is especially true.

So, let’s start writing!

Set up a landscape with randomly-placed individuals, each playing one of the three strategies.

N = 5000  # the number of individuals in the simulation
x_locations = rand(N)
y_locations = rand(N)
strategies = rand(['R', 'S', 'P'], N)

5000-element Vector{Char}:
 'S': ASCII/Unicode U+0053 (category Lu: Letter, uppercase)
 'S': ASCII/Unicode U+0053 (category Lu: Letter, uppercase)
 'S': ASCII/Unicode U+0053 (category Lu: Letter, uppercase)
 'R': ASCII/Unicode U+0052 (category Lu: Letter, uppercase)
 'R': ASCII/Unicode U+0052 (category Lu: Letter, uppercase)
 'P': ASCII/Unicode U+0050 (category Lu: Letter, uppercase)
 'P': ASCII/Unicode U+0050 (category Lu: Letter, uppercase)
 'P': ASCII/Unicode U+0050 (category Lu: Letter, uppercase)
 'S': ASCII/Unicode U+0053 (category Lu: Letter, uppercase)
 'R': ASCII/Unicode U+0052 (category Lu: Letter, uppercase)
 'P': ASCII/Unicode U+0050 (category Lu: Letter, uppercase)
 'P': ASCII/Unicode U+0050 (category Lu: Letter, uppercase)
 'S': ASCII/Unicode U+0053 (category Lu: Letter, uppercase)
 ⋮
 'S': ASCII/Unicode U+0053 (category Lu: Letter, uppercase)
 'R': ASCII/Unicode U+0052 (category Lu: Letter, uppercase)
 'R': ASCII/Unicode U+0052 (category Lu: Letter, uppercase)
 'R': ASCII/Unicode U+0052 (category Lu: Letter, uppercase)
 'R': ASCII/Unicode U+0052 (category Lu: Letter, uppercase)
 'P': ASCII/Unicode U+0050 (category Lu: Letter, uppercase)
 'R': ASCII/Unicode U+0052 (category Lu: Letter, uppercase)
 'R': ASCII/Unicode U+0052 (category Lu: Letter, uppercase)
 'S': ASCII/Unicode U+0053 (category Lu: Letter, uppercase)
 'P': ASCII/Unicode U+0050 (category Lu: Letter, uppercase)
 'R': ASCII/Unicode U+0052 (category Lu: Letter, uppercase)
 'S': ASCII/Unicode U+0053 (category Lu: Letter, uppercase)

The above created 3 vectors, each of length N. These three store the three pieces of info for the individuals in our simulation (x location, y location, and strategy).

Set up a visualization of the population

We envision a 2-dimensional plot representing the geographic range, with dots representing individuals at their locations. We can use 3 colors to represent the three strategies. Let’s define the color key:

plot_colors = Dict('R'=>"red", 'P'=>"green2", 'S'=>"blue")

Dict{Char, String} with 3 entries:
  'P' => "green2"
  'R' => "red"
  'S' => "blue"

Here we used an object we might not have encountered before—a Dictionary or Dict—which contains key-value pairs. You can think of a Dict as a lookup table. For example we can lookup the color that we have designated for the paper strategy P like this:

plot_colors['P']

"green2"

Now let’s write a function to plot the population:

using Plots
function show_sim(x_locations, y_locations, strategies, plot_colors)
    sim_plot = scatter([], [])  # starts an empty plot
    for i in unique(strategies)
        selection = (strategies.==i)
        scatter!(x_locations[selection], y_locations[selection]; markercolor=plot_colors[i], markersize=2, markerstrokewidth = 0, markerstrokealpha = 0, markershape = :circle, legend=false)
    end
    sim_plot  # shows the plot
end

show_sim (generic function with 1 method)

The above simply defined our plot-showing function. Now let’s call it to show the initial state of our simulated population:

plot1 = show_sim(x_locations, y_locations, strategies, plot_colors)

We can see that we have succeeded in setting up a population of individuals with random locations and strategies. Hooray!

We now have to write some code to make individuals interact:

Write a function competing pairs of individuals, and determining the winner

Write a “roshambo” function to determines the winner of two strategies

Your function should take two input arguments (named s1 and s2), with each being a strategy (either 'R', 'P', or 'S'), and then output 0 if there is no winner, 1 if s1 is the winner, or 2 if s2 is the winner.

Test the function above:

roshambo('S', 'R')

roshambo('P', 'R')

roshambo('S', 'S')

We now have a short expression that tells us which individual wins. This is an example of abstraction (a key concept in computer science), because we don’t have to think about the detailed code in the function—we know it just works.

Now let’s build code that chooses a random individual and the individual closest to it (so that we can then compete them).

Write function to get the closest individual to a given individual

function find_closest(x_locations, y_locations, ind)
    x_focal = x_locations[ind] 
    y_focal = y_locations[ind]
    dists = sqrt.((x_locations .- x_focal).^2 .+ (y_locations .- y_focal).^2)
    dists[ind] = 1000  # set this individual's distance to itself very high,
                       # so it is not chosen in the next line as closest
    findmin(dists)[2]  # gets index for the minimum value 
end

find_closest (generic function with 1 method)

Above, we have developed functions for choosing the closest individual (to a given individual) and for determining the winner when those two individuals interact. We also need to build a way to move individuals on the landscape:

Write function to disperse an individual:

sigma_disp = 0.005
function disperse_ind!(x_locations, y_locations, ind, sigma_disp)
    new_x = -9.9 # initialize with arbitrary wrong value
    new_y = -9.9 
    # try new locations, until in range (then break out of loop)
    while !(0 < new_x < 1) || !(0 < new_y < 1)  
        new_x = x_locations[ind] + (sigma_disp * randn())
        new_y = y_locations[ind] + (sigma_disp * randn())
    end
    # assign new location to individual in population
    x_locations[ind] = new_x  
    y_locations[ind] = new_y
end

disperse_ind! (generic function with 1 method)

We include the ! symbol as part of the name of the disperse_ind!() function, to convey that the function modifies some of the input objects.

Put the parts together, into one function that runs the whole simulation

function run_sim!(iterations)
    for i in 1:iterations
        rand_ind = rand(1:length(x_locations))  #choose the index of a random individual
        closest_ind = find_closest(x_locations, y_locations, rand_ind)
        winner = roshambo(strategies[rand_ind], strategies[closest_ind])
        # replace loser with winning strategy
        if winner == 1 
            strategies[closest_ind] = strategies[rand_ind]
        elseif winner == 2
            strategies[rand_ind] = strategies[closest_ind]  
        end
        # move these individuals a bit, but with boundary of 0 and 1
        disperse_ind!(x_locations, y_locations, rand_ind, sigma_disp)
        disperse_ind!(x_locations, y_locations, closest_ind, sigma_disp)
    end
    show_sim(x_locations, y_locations, strategies, plot_colors)
end

run_sim! (generic function with 1 method)

The above function runs the simulation for a given number of iterations, modifying the population vectors as it goes. At the end it returns a plot of the state of the simulation.

Let’s test the function above:

run_sim!(5000)

That shows us the result of the simulation after a given number of iterations. But we want to see the dynamics. Let’s make a movie!

Make a movie of the simulation

The Plots package, which we have already loaded, has a wonderfully simple way of writing code that generates movies:

We use the @animate macro to initialize an Animation object and store a series of plots that are frames of the animation.
We use the gif() function to convert the object to an animated gif.

Let’s use this by running our simulation for 100 small chunks of time, showing a frame after each chunk:

anim = @animate for i in 1:100
    plot_frame = run_sim!(5000) 
end
gif(anim, fps=4)

[ Info: Saved animation to /Users/darrenirwin/Documents/Github_repos_MacBookPro/JuliaProgrammingForBiologists/tmp.gif

We can now see the dynamics of the simulation much more clearly. We see the rock—paper—scissors game on a landscape with limited dispersal leads to waves of each strategy replacing other strategies over time.

Play with the above

I’ve designed this example to provide a basic structure and logic for this sort of “individuals on a landscape” sort of simulation. You can play around with any element of the above example. For example, you try the following (starting with simple and scaling up to more challenging):

Change the dispersal distance or number of individuals, and see what happens.
Make the graph fancier, with axis labels and explanation of colors, or adding the number of iterations as text somewhere.
Add a fourth strategy with its own rules, and see what happens.
Store the three population vectors in a single data structure (by using the mutable struct command to make a new type)
Change the way the population interacts with the edge of the range—for instance, what if individuals who move off the right side were to appear on the left side? This would make the range have no limits, but still finite space (the resulting geometry is called a torus—like the surface of a donut in 3D space.)
Show some different aspect of simulation behaviour—for instance, can you save the movements of a single individual over time, and show just that individual’s path at the end? Or could you show the population size of the three strategies over time?
Add something new to the simulation—for example, what if individuals were to mate when they meet, with offspring inheriting neutral alleles (meaning no selective effects) from both parents? Could you then use this framework to track allele frequencies on the landscape over time?

Think about simulating something that interests you

In this series of tutorials we’ve learned a lot of the central logic used in programming. Now it is a matter of putting it together in ways that do things that interest you. Think about some biological question that interests you and try to consider how a computer simulation might provide you with some insight regarding that question. Then, think about how to build a simple simulation. Start small and simple. Once you develop a framework, you can add complexity later.

Some simulation ideas that come to mind, to get the ball rolling:

Migration path of a bird, according to some simple rules of movement (e.g. fly southward, but with variation in exact direction and distance in each time step).
Path of growth of a cell (e.g. an axon?) according to some simple rules, perhaps in a way that avoids (or is attracted to) other cells.
Predator - prey dynamics on a landscape.
Change in allele frequencies over time, with random mating and no selection in a population of constant size (there is a name for this, which rhymes with “magnetic lift”).
Simulation of how random mutations in a certain gene sequence might affect the amino acid structor of the encoded protein.