(случайный выбор)

Bob Ippolito (@etrepum)

Erlang Factory Lite 2012, Moscow

- I was rewriting an IRC bot
- He "learns" how to chat
- Brain is based on Markov chains (Це́пь Ма́ркова)
- Random choice is essential to this and many other interesting applications

- Each step in a markov chain model makes a random choice
- For the IRC bot, the number of word choices can be very large
- Previous implementation was very simple, but too inefficient
- Used too much memory, had to be retired years ago

- (Most) ad servers also use random choice
- Mochi Media built this kind of ad server in Erlang
- Was fun to come up with an efficent way to do it
- Note that Markov models weren't used in our ad server
- … but random choice is common to both

- Wichmann-Hill AS183 algorithm from 1982
- … designed for 16 bit computers with limited arithmetic
- It has poor results and you probably shouldn't use it
- Consider the
`crypto`module - … or a third party library (sfmt-erlang)
- Despite this, I will use
`random`for the examples

`uniform()`returns a float`0.0 ≤ X < 1.0`- ○ (heads) when
`X < 0.5` - ● (tails) otherwise

○ (heads), X < 0.5

● (tails), X ≥ 0.5

`uniform(N)`returns an integer`1 ≤ X ≤ N``uniform(N) -> 1 + trunc(N * uniform()).`

```
choose(L) ->
Nth = random:uniform(length(L)),
{H, [Choice | T]} = lists:split(Nth, L),
{Choice, H ++ T}.
```

- Can be used to shuffle a list (very slowly)
- Or simulate a deck of cards
- Not used by our Markov model, but worthy of demonstration

`{sum(Weight), [{Key, Weight}, …]}``weight_split(N, L) -> weight_split(N, L, []). weight_split(N, L, [H={K, W} | T], Acc) -> case N - W of N1 when N1 > 0 -> weight_split(N1, T, [H | Acc]); _ -> {K, lists:reverse(Acc, T)} end.`

```
choose(L) ->
lists:nth(random:uniform(length(L)), L).
```

- Does not change list
- Similar to rolling a die

- With replacement, weight can be implemented simply
- Just add the item to the list multiple times
`[a, a, a, b, c, d]`- 50%
`a`, ~16.6%`b`, …

- Naive solution uses a lot of memory
- Can do better by counting each unique choice
`{6, [{a, 3}, {b, 1}, {c, 1}, {d, 1}]}`- Tradeoff - it's harder to seek to Nth choice

- Create N coins, one for each unique choice
- Choose coin (by die roll), then flip weighted coin
- Uses O(N) memory, has O(1) selection!

head | tail |
---|---|

a 1 | undefined |

b ⅔ | a ⅓ |

c ⅔ | a ⅓ |

d ⅔ | a ⅓ |

- For our IRC bot, the choices update for every word of text
- Alias method is
`O(N)`to update - High
`O(N)`garbage, no sharing at all

- Can build a tree for efficient updates
- Fast
`O(log N)`updates - Low
`O(log N)`garbage, good sharing - Slow
`O(N)`seeks, since sort is by key

- Seems to be a good compromise between speed and memory
- Highest weights first
- Highest weights are most likely to be updated
- Worst case
`O(N)`for every operation - But very good common case, near head of the list

Key | Weight |
---|

etrepum.github.com/rndchoice_efl_2012

github.com/etrepum/rndchoice_efl_2012

@etrepum

/

#