Why this talk?

I wrote an ad server in Erlang
Randomized algorithms are useful for ad serving
They have other fun applications too

Erlang's `random` module

Wichmann-Hill AS183 algorithm from 1982
… designed for 16 bit computers with limited arithmetic
It has poor results and you probably shouldn't use it
Consider the crypto module
… or a third party library (sfmt-erlang)
Despite this, I will use random for the examples

Fair coin

uniform() returns a float
0.0 ≤ X < 1.0
heads when X < 0.5
tails otherwise

Fair coin

X = …

○ (heads), X < 0.5

● (tails), X ≥ 0.5

Die roll

uniform(N) returns an integer 1 ≤ X ≤ N
Also easy to implement with uniform()

uniform(N) ->
    1 + trunc(N * uniform()).

Die roll

X = …

Random selection without replacement

Choice removed from future selections
Shuffle a list (very slowly)
Simulate a deck of cards
Choose a name from a hat

Random selection without replacement

choose(L) ->
  Nth = random:uniform(length(L)) - 1,
  {H, [Choice | T]} = lists:split(Nth, L),
  {Choice, H ++ T}.

Random selection without replacement

Weighted selection without replacement

Weights are not always uniform
Unfair dice, ad selection, sporting events…

Weighted selection without replacement

%% {sum(Weight), [{Key, Weight}, …]}
wselect(N, {Sum, Pairs}) when N =< Sum ->
  wselect(N, {Sum, Pairs}, []).

wselect(N, L, {S, [H={K, W} | T]}, Acc) ->
  case N - W of
    N1 when N1 > 0 ->
      wselect(N1, T, [H | Acc]);
    _ ->
      {K, {S - W, lists:reverse(Acc, T)}}
  end.

Random selection with replacement

Does not change list
Similar to rolling a die

choose(L) ->
  Nth = random:uniform(length(L)) - 1,
  lists:nth(Nth, L).

Random selection with replacement

Weighted selection, naive

With replacement, weight can be implemented simply
Just add the item to the list multiple times
[a, a, a, b, c, d]
50% a, ~16.6% b, …

Weighted selection with replacement

Optimizing for memory

Naive solution uses a lot of memory
Can do better by counting each unique choice
Tradeoff - it's harder to seek to Nth choice

{6, [ {a, 3}
    , {b, 1}
    , {c, 1}
    , {d, 1} ]}

Alias method

Create N coins, one for each unique choice
Choose coin (by die roll), then flip weighted coin
Uses O(n) memory, has O(1) selection!
Can initialize in O(n) time (Vose)
Algorithm doesn't fit on slide :(

Alias method

Die:

Coin:

head	tail
`a` 1	`undefined`
`b` ⅔	`a` ⅓
`c` ⅔	`a` ⅓
`d` ⅔	`a` ⅓

Handling updates

For our IRC bot, the choices update for every word of text
Alias method is O(n) to update
High O(n) garbage, no sharing at all
Hard problem to solve with Erlang's purely functional data structures

Using a tree

Can build a tree for efficient updates
Fast O(log n) updates
Low O(log n) garbage, good sharing
Slow O(n) seeks, since sort is by key

Sorted list

Seems to be a good compromise between speed and memory (in Erlang)
Highest weights first
Highest weights are most likely to be updated
Worst case O(n) for every operation
But very good common case is near head of the list

Sorted list

Key	Weight
0	0

Markov chain

Finite state space
Present, future, and past states are independent
Graph where edges represent probability for state change

Andrey Andreyevich Markov

Two-state Markov Chain

Markov Chain

Markov Chains for Text

State transitions are from one word to the next
Use special tokens for start and stop
One weighted random selection data structure per word (and start)

Markov Text Generator

Questions?

Slides	http://bob.ippoli.to/random_choice_2013/
Source	github.com/etrepum/random_choice_2013
Email	bob@redivi.com
Twitter	@etrepum

Random Choice

Bob Ippolito

October 2013

Why this talk?

Erlang's `random` module

Fair coin

Fair coin

X = …

Die roll

Die roll

X = …

Random selection without replacement

Random selection without replacement

Random selection without replacement

Weighted selection without replacement

Weighted selection without replacement

Random selection with replacement

Random selection with replacement

Weighted selection, naive

Weighted selection with replacement

Optimizing for memory

Alias method

Alias method

Die:

Coin:

Handling updates

Using a tree

Sorted list

Sorted list

Markov chain

Andrey Andreyevich Markov

Two-state Markov Chain

Markov Chains for Text

Markov Text Generator

Questions?

Random Choice

Bob Ippolito

October 2013

Why this talk?

Erlang's random module

Fair coin

Fair coin

X = …

Die roll

Die roll

X = …

Random selection without replacement

Random selection without replacement

Random selection without replacement

Weighted selection without replacement

Weighted selection without replacement

Random selection with replacement

Random selection with replacement

Weighted selection, naive

Weighted selection with replacement

Optimizing for memory

Alias method

Alias method

Die:

Coin:

Handling updates

Using a tree

Sorted list

Sorted list

Markov chain

Andrey Andreyevich Markov

Two-state Markov Chain

Markov Chains for Text

Markov Text Generator

Questions?

Erlang's `random` module