# Random Choice (случайный выбор)Bob Ippolito (@etrepum)Erlang Factory Lite 2012, Moscow

## Why this talk?

• I was rewriting an IRC bot
• He "learns" how to chat
• Brain is based on Markov chains (Це́пь Ма́ркова)
• Random choice is essential to this and many other interesting applications

## Markov text generator

• Each step in a markov chain model makes a random choice
• For the IRC bot, the number of word choices can be very large
• Previous implementation was very simple, but too inefficient
• Used too much memory, had to be retired years ago

## Really?

• (Most) ad servers also use random choice
• Mochi Media built this kind of ad server in Erlang
• Was fun to come up with an efficent way to do it
• Note that Markov models weren't used in our ad server
• … but random choice is common to both

## Erlang's random module

• Wichmann-Hill AS183 algorithm from 1982
• … designed for 16 bit computers with limited arithmetic
• It has poor results and you probably shouldn't use it
• Consider the crypto module
• … or a third party library (sfmt-erlang)
• Despite this, I will use random for the examples

## Fair coin

• uniform() returns a float 0.0 ≤ X < 1.0
• ○ (heads) when X < 0.5
• ● (tails) otherwise

## Fair coin

### X = …

● (tails), X ≥ 0.5

## Die roll

• uniform(N) returns an integer 1 ≤ X ≤ N
• uniform(N) -> 1 + trunc(N * uniform()).

## Random selection without replacement

``````choose(L) ->
Nth = random:uniform(length(L)),
{H, [Choice | T]} = lists:split(Nth, L),
{Choice, H ++ T}.``````

## Random selection without replacement

• Can be used to shuffle a list (very slowly)
• Or simulate a deck of cards
• Not used by our Markov model, but worthy of demonstration

## Weighted selection without replacement

• {sum(Weight), [{Key, Weight}, …]}
• ``````weight_split(N, L) -> weight_split(N, L, []).
weight_split(N, L, [H={K, W} | T], Acc) ->
case N - W of
N1 when N1 > 0 ->
weight_split(N1, T, [H | Acc]);
_ ->
{K, lists:reverse(Acc, T)}
end.``````

## Random selection with replacement

``````choose(L) ->
lists:nth(random:uniform(length(L)), L).
``````
• Does not change list
• Similar to rolling a die

## Weighted selection, naive

• With replacement, weight can be implemented simply
• Just add the item to the list multiple times
• [a, a, a, b, c, d]
• 50% a, ~16.6% b, …

## Optimizing for memory

• Naive solution uses a lot of memory
• Can do better by counting each unique choice
• {6, [{a, 3}, {b, 1}, {c, 1}, {d, 1}]}
• Tradeoff - it's harder to seek to Nth choice

## Alias method

• Create N coins, one for each unique choice
• Choose coin (by die roll), then flip weighted coin
• Uses O(N) memory, has O(1) selection!

## Alias method

### Coin:

a 1undefined
ba
ca
da

• For our IRC bot, the choices update for every word of text
• Alias method is O(N) to update
• High O(N) garbage, no sharing at all

## Using a tree

• Can build a tree for efficient updates
• Low O(log N) garbage, good sharing
• Slow O(N) seeks, since sort is by key

## Simple max heap

• Seems to be a good compromise between speed and memory
• Highest weights first
• Highest weights are most likely to be updated
• Worst case O(N) for every operation
• But very good common case, near head of the list

Key Weight