Eventually Consistent HTTP with Statebox and Riak

Author: Bob Ippolito (@etrepum)
Date: November 2011
Venue:QCon San Francisco 2011

Introduction

This talk isn't really about web. It's about how we model data for the web.

HTTP itself is not the interesting part of our systems. Our systems are mostly JSON over HTTP at the network boundary, nothing too clever!

Mochi's Business

Just a few years ago…

Why was this easy?

Why does this break?

Case Study: Friendwad

Friendwad Data Model

Friendwad Diagram

2011-11-14 22:16ZCanvas 1Layer 1AliceBobFollowingAliceBobFollowers

Mnesia Implementation

Why not Mnesia?

Riak

Riak Migration

Riak Migration Continued

img/facepalm.jpg

Eventual Inconsistency

Version Terminology

Adding a friend [1]

Original state o for alice, bob on read

id        | alice   | bob     |
followers | []      | []      |
following | []      | []      |
version   | o       | o       |

Adding a friend [2]

Write modified bob at version ao

id        | alice   | bob     |
followers | []      | []      |
following | []      | [alice] |
version   | o       | ao      |

Adding a friend [3]

Write modified alice at version ao

id        | alice   | bob     |
followers | [bob]   | []      |
following | []      | [alice] |
version   | ao      | ao      |

Interleaving for Fail

Concurrency Pains [1]

alice → bob (a) initial state

id        | alice   | bob     | carol |
followers | []      | []      | []    |
following | []      | []      | []    |
version   | o       | o       | o     |

Concurrency Pains [2]

bob → carol (b) initial state (all look same!)

id        | alice   | bob     | carol |
followers | []      | []      | []    |
following | []      | []      | []    |
version   | o       | o       | o     |

Concurrency Pains [3]

bob → carol writes to bob

id        | alice   | bob     | carol |
followers | []      | []      | []    |
following | []      | [carol] | []    |
version   | o       | bo      | o     |

Concurrency Pains [4]

alice → bob writes to alice

id        | alice   | bob     | carol |
followers | []      | []      | []    |
following | [bob]   | [carol] | []    |
version   | ao      | bo      | o     |

Concurrency Pains [5]

alice → bob writes to bob

id        | alice   | bob     | carol |
followers | []      | [alice] | []    |
following | [bob]   | []      | []    |
version   | ao      | ao      | o     |

Concurrency Pains [6]

bob → carol writes to carol

id        | alice   | bob     | carol |
followers | []      | [alice] | [bob] |
following | [bob]   | []      | []    |
version   | ao      | ao      | bo    |

FAIL

Concurrency ruins everything.

W     W      W
W        W  W     W
              '.  W
  .-""-._     \ \.--|
 /       "-..__) .-'
|     _         /
\'-.__,   .__.,'
 `'----'._\--'
VVVVVVVVVVVVVVVVVVVVV

Sibling Rivalry

Simple fix?

Merging ao and bo is easy! Just union over followers and following.

Simple fix? NOPE!

But edges are not insert-only! That ruins everything.

It's better, but any inconsistency is just pain waiting to happen.

Fix all of the things

Statebox Design Philosophy

What's Statebox?

Statebox Terminology

{fun ordsets:add_element/2, [kitten]}

Statebox Internals

Statebox Theory

Declarative (ordsets)

Add = fun ordsets:add_element/2,
Empty = statebox:new(fun () -> [] end),
A = statebox:modify({Add, [a]}, Empty),
B = statebox:modify({Add, [b]}, Empty),
AB = statebox:merge([A, B]),
statebox:value(AB) =:= [a, b].

Composable

Empty = statebox_orddict:from_values([]),
Union = fun statebox_orddict:f_union/2,
A = statebox:modify([Union(following, [b]),
                     Union(followers, [c])],
                    Empty),
B = statebox:modify([Union(following, [b]),
                     Union(followers, [d])],
                    Empty),
AB = statebox:merge([A, B]),
statebox:value(AB) =:= [{followers, [c, d]},
                        {following, [b]}].

Statebox Example [1]

A     :: [kitten]
[{1, Union([kitten])}]

Statebox Example [2]

A     :: [kitten]
[{1, Union([kitten])}]

B     :: [puppy]
[{2, Union([puppy])}]

Statebox Example [3]

A     :: [kitten]
[{1, Union([kitten])}]

B     :: [puppy]
[{2, Union([puppy])}]

[A,B] :: [kitten, puppy]
[{1, Union([kitten])},
 {2, Union([puppy])}]

Statebox Merge

Statebox Merge [1]

Use B's value (arbitrarily newest)

                               [puppy]

Value = [puppy]

Statebox Merge [2]

Apply ops oldest to newest (T=1)

               union([kitten], [puppy])

Value = [kitten, puppy]

Statebox Merge [3]

Apply ops oldest to newest (T=2)

union([puppy], union([kitten], [puppy]))

Value = [kitten, puppy]

statebox_riak wrapper

%% bob → alice, bob → carol
S = statebox_riak:new([{riakc_pb_socket, P},
                       {expire_ms, 5000},
                       {max_queue, 50}]),
Union = fun statebox_orddict:f_union/2,
statebox_riak:apply_bucket_ops(
    <<"users">>,
    [{[<<"alice">>, <<"carol">>],
      Union(followers, [bob])},
     {[<<"bob">>],
      Union(following, [alice, carol])}],
    S).

Restrictions

Repeatable Operations

Non-repeatable ops?

statebox_counter

counter optimizations

Other statebox usage

achievements

achievements orddict

Store oldest entry for achievement.

f_store_min(Key, New) ->
    {fun ?MODULE:op_store_min/3, [Key, New]}.

op_store_min(Key, New, D) ->
    orddict:update(
        Key,
        fun (Old) -> min(Old, New) end,
        New,
        D).

scorewad

recordset

An optionally fixed-size ordered set of complex terms.

recordset example (trivial)

Empty = recordset:new(fun erlang:'=:='/2,
                      fun erlang:'<'/2,
                      [{max_size, 2}]),
Full = lists:foldl(fun recordset:add/2,
                   Empty,
                   lists:seq(300, 400)),
[399, 400] =:= recordset:to_list(Full).

What's next?

Better than Statebox?

Why Riak could do it better

Questions?