Toss them, Coins! Home Twitter schemer.in
Random generation of events is important for modeling many statistical processes. In this post we will discuss how to simulate the tossing of an unbiased coin in Clojure. We will do this using pure functional techniques. We will also write functions to derive some interesting statistics from coin-toss experiments.
A coin toss can result in either a head or a tail. We shall assign the integer
1 to head and
0 to tail,
so that we can use the
rand-int function to simulate a single toss.
> (defn toss  (rand-int 2)) > (toss) ;;=> will output 1 or 0 with equal chance. probability=0.5
How can we simulate
N tosses? With the help
of lazy sequences.
The following function will give us as many tosses as we want!
> (defn tosses  (lazy-seq (cons (toss) (tosses)))) > (take 10 (tosses)) ;;=> (1 1 0 1 0 1 0 0 0 1)
Now we would want to answer some questions about the data gathered from all those tosses, like how many heads were generated
by the experiment? The higher-order function
can help us here:
> (reduce + 0 (take 10 (tosses))) ;;=> 5
Note You may get a different result, as the simulation includes random values.
It is well-known that the ratio of the number of heads (or tails) to the total number of tosses will approach
the number of tosses increases. To test this out, we first need a version of
reduce that returns a sequence of cumulative
results. The definition of this function follows:
(defn cumulate [f init xs] (when (seq xs) (let [r (f init (first xs))] (lazy-seq (cons r (cumulate f r (rest xs)))))))
Cumulate applies the function
init and the first element of
The result (
r) becomes the head of a lazy sequence. The tail of this sequence is constructed by
a recursive call to
r being the value of
init. This process continues
xs runs out of values. The returned sequence will contain all intermediate results of the reduction of
cumulate on the same inputs and see how the outputs differ:
> (reduce + 0 [1 2 3 4 5]) ;;=> 15 > (cumulate + 0 [1 2 3 4 5]) ;;=> (1 3 6 10 15)
The value returned by
reduce and the final value in the sequence returned by
cumulate are the same.
cumulate also captures the intermediate values of the reduction.
Cumulative summing of sequences can be expressed by the following partial application of
> (def sums (partial cumulate + 0))
We have all the tools necessary to experiment and find out if heads appear with a probability of
0.5 if a coin is
tossed for a sufficiently large number of times. The following program will conduct this experiment for us:
> (def xs (take 3000 (tosses))) > (def ss (sums xs)) > (def rs (map #(/ (float %1) %2) ss (range 1 3001))) > (drop 2500 rs) ;;=> (0.5041983206717313 0.504396482813749 0.5041949660407511 ...)
3000 tosses are made. The number of heads seen until each toss is found by calling
The higher-order function
map walks down this sequence, dividing each entry by the corresponding toss number
to produce a sequence of ratios.
As we see in the final output, the ratio of heads indeed approach
There are other interesting information that we can derive from the data returned by
For instance, can you design a function to report the excess of heads over tails for a particular toss?
This problem can be solved by finding the difference between the cumulative number of heads and the cumulative number of tails
for the specified toss. We can find the cumulative of tails by first "inverting" the tosses and then applying
the result. Here is the invert function:
> (defn inv [xs] (map #(if (zero? %) 1 0) xs)) > (def xs (take 10 (tosses))) > xs ;;=> (0 0 0 1 1 1 0 0 1 0) > (def ys (inv xs)) > ys ;;=> (1 1 1 0 0 0 1 1 0 1)
excess defined below, finds the difference between a sequence and its inverse:
> (defn excess [xs n] (let [hs (sums xs) ts (sums (inv xs))] (nth (map #(- %1 %2) hs ts) n))) > (excess xs 4) ;;=> -1 > (excess xs 5) ;;=> 0 > (excess xs 6) ;;=> -1
Exercise 1 Write a function to report the first
n excesses of tails over heads.
Exercise 2 Use Incanter to visualize the various information you gather about coin tosses. To start, you may plot the ratios obtained from the first 3000 tosses.
Note The inspiration for this post came from this lecture by Prof. Keith Smillie.