Random Choice

Bob Ippolito

October 2013

Why this talk?

  • I wrote an ad server in Erlang
  • Randomized algorithms are useful for ad serving
  • They have other fun applications too

Erlang's random module

  • Wichmann-Hill AS183 algorithm from 1982
  • … designed for 16 bit computers with limited arithmetic
  • It has poor results and you probably shouldn't use it
  • Consider the crypto module
  • … or a third party library (sfmt-erlang)
  • Despite this, I will use random for the examples

Fair coin

  • uniform() returns a float
  • 0.0X < 1.0
  • heads when X < 0.5
  • tails otherwise

Fair coin

0.0 0.5 1.0 0 Heads 0 Tails

X =

○ (heads), X < 0.5

● (tails), X ≥ 0.5

Die roll

  • uniform(N) returns an integer 1XN
  • Also easy to implement with uniform()
uniform(N) ->
    1 + trunc(N * uniform()).

Die roll

1 2 3 4 5 6 0 0 0 0 0 0

X =

Random selection without replacement

  • Choice removed from future selections
  • Shuffle a list (very slowly)
  • Simulate a deck of cards
  • Choose a name from a hat

Random selection without replacement

choose(L) ->
  Nth = random:uniform(length(L)) - 1,
  {H, [Choice | T]} = lists:split(Nth, L),
  {Choice, H ++ T}.

Random selection without replacement

Weighted selection without replacement

  • Weights are not always uniform
  • Unfair dice, ad selection, sporting events…

Weighted selection without replacement

%% {sum(Weight), [{Key, Weight}, …]}
wselect(N, {Sum, Pairs}) when N =< Sum ->
  wselect(N, {Sum, Pairs}, []).

wselect(N, L, {S, [H={K, W} | T]}, Acc) ->
  case N - W of
    N1 when N1 > 0 ->
      wselect(N1, T, [H | Acc]);
    _ ->
      {K, {S - W, lists:reverse(Acc, T)}}
  end.

Random selection with replacement

  • Does not change list
  • Similar to rolling a die
choose(L) ->
  Nth = random:uniform(length(L)) - 1,
  lists:nth(Nth, L).

Random selection with replacement

Weighted selection, naive

  • With replacement, weight can be implemented simply
  • Just add the item to the list multiple times
  • [a, a, a, b, c, d]
  • 50% a, ~16.6% b, …

Weighted selection with replacement

Optimizing for memory

  • Naive solution uses a lot of memory
  • Can do better by counting each unique choice
  • Tradeoff - it's harder to seek to Nth choice
{6, [ {a, 3}
    , {b, 1}
    , {c, 1}
    , {d, 1} ]}

Alias method

  • Create N coins, one for each unique choice
  • Choose coin (by die roll), then flip weighted coin
  • Uses O(n) memory, has O(1) selection!
  • Can initialize in O(n) time (Vose)
  • Algorithm doesn't fit on slide :(

Alias method

Die:

Coin:

head tail
a 1 undefined
b a
c a
d a

Handling updates

  • For our IRC bot, the choices update for every word of text
  • Alias method is O(n) to update
  • High O(n) garbage, no sharing at all
  • Hard problem to solve with Erlang's purely functional data structures

Using a tree

  • Can build a tree for efficient updates
  • Fast O(log n) updates
  • Low O(log n) garbage, good sharing
  • Slow O(n) seeks, since sort is by key

Sorted list

  • Seems to be a good compromise between speed and memory (in Erlang)
  • Highest weights first
  • Highest weights are most likely to be updated
  • Worst case O(n) for every operation
  • But very good common case is near head of the list

Sorted list

Key  Weight
0 0

Markov chain

  • Finite state space
  • Present, future, and past states are independent
  • Graph where edges represent probability for state change

Andrey Andreyevich Markov

Андре́й Андре́евич Ма́рков
Андре́й Андре́евич Ма́рков

Two-state Markov Chain

Markov Chains for Text

  • State transitions are from one word to the next
  • Use special tokens for start and stop
  • One weighted random selection data structure per word (and start)

Markov Text Generator

Questions?

Slides

http://bob.ippoli.to/random_choice_2013/

Source

github.com/etrepum/random_choice_2013

Email

bob@redivi.com

Twitter

@etrepum