# Why this talk?

• I wrote an ad server in Erlang
• Randomized algorithms are useful for ad serving
• They have other fun applications too

# Erlang's `random` module

• Wichmann-Hill AS183 algorithm from 1982
• … designed for 16 bit computers with limited arithmetic
• It has poor results and you probably shouldn't use it
• Consider the `crypto` module
• … or a third party library (sfmt-erlang)
• Despite this, I will use `random` for the examples

# Fair coin

• `uniform()` returns a float
• `0.0 ≤ X < 1.0`
• heads when `X < 0.5`
• tails otherwise

# Fair coin

### X = …

● (tails), X ≥ 0.5

# Die roll

• `uniform(N)` returns an integer `1 ≤ X ≤ N`
• Also easy to implement with `uniform()`
``````uniform(N) ->
1 + trunc(N * uniform()).``````

# Random selection without replacement

• Choice removed from future selections
• Shuffle a list (very slowly)
• Simulate a deck of cards
• Choose a name from a hat

# Random selection without replacement

``````choose(L) ->
Nth = random:uniform(length(L)) - 1,
{H, [Choice | T]} = lists:split(Nth, L),
{Choice, H ++ T}.``````

# Weighted selection without replacement

• Weights are not always uniform
• Unfair dice, ad selection, sporting events…

# Weighted selection without replacement

``````%% {sum(Weight), [{Key, Weight}, …]}
wselect(N, {Sum, Pairs}) when N =< Sum ->
wselect(N, {Sum, Pairs}, []).

wselect(N, L, {S, [H={K, W} | T]}, Acc) ->
case N - W of
N1 when N1 > 0 ->
wselect(N1, T, [H | Acc]);
_ ->
{K, {S - W, lists:reverse(Acc, T)}}
end.``````

# Random selection with replacement

• Does not change list
• Similar to rolling a die
``````choose(L) ->
Nth = random:uniform(length(L)) - 1,
lists:nth(Nth, L).``````

# Weighted selection, naive

• With replacement, weight can be implemented simply
• Just add the item to the list multiple times
• `[a, a, a, b, c, d]`
• 50% `a`, ~16.6% `b`, …

# Optimizing for memory

• Naive solution uses a lot of memory
• Can do better by counting each unique choice
• Tradeoff - it's harder to seek to Nth choice
``````{6, [ {a, 3}
, {b, 1}
, {c, 1}
, {d, 1} ]}``````

# Alias method

• Create N coins, one for each unique choice
• Choose coin (by die roll), then flip weighted coin
• Uses `O(n)` memory, has `O(1)` selection!
• Can initialize in `O(n)` time (Vose)
• Algorithm doesn't fit on slide :(

# Alias method

### Coin:

a 1 undefined
b a
c a
d a

• For our IRC bot, the choices update for every word of text
• Alias method is `O(n)` to update
• High `O(n)` garbage, no sharing at all
• Hard problem to solve with Erlang's purely functional data structures

# Using a tree

• Can build a tree for efficient updates
• Fast `O(log n)` updates
• Low `O(log n)` garbage, good sharing
• Slow `O(n)` seeks, since sort is by key

# Sorted list

• Seems to be a good compromise between speed and memory (in Erlang)
• Highest weights first
• Highest weights are most likely to be updated
• Worst case `O(n)` for every operation
• But very good common case is near head of the list

Key  Weight
0 0

# Markov chain

• Finite state space
• Present, future, and past states are independent
• Graph where edges represent probability for state change

# Markov Chains for Text

• State transitions are from one word to the next
• Use special tokens for start and stop
• One weighted random selection data structure per word (and start)

# Questions?

 Slides http://bob.ippoli.to/random_choice_2013/ Source github.com/etrepum/random_choice_2013 Email [email protected] Twitter @etrepum