Think Before Coding

To content | To menu | To search

Event Sourcing

Entries feed - Comments feed

Sunday, April 27, 2014

Monoidal Event Sourcing Examples

Last time we tried to imagine how we could change usual event sourcing definition$ to get a monoidal approach.

 

Here are some examples to make it work in real life.

Monoids properties

We'll try to do it properly, and test that the proposed solutions are proper monoids.

 

Let's recap what defines a monoid:

  • closure of operation
  • associativity
  • neutral element

How to test it

the closure of operation is given by the signature of the operation. In F# it should have a signature like 'a -> 'a –> 'a which represents a function that takes to arguments of type 'a and return a result of the same type 'a.

 

For associativity, we'll use the following test function:

1: 
2: 
let isAssociative (*) x y z =
   ((x * y) * z) = (x * (y * z))

It take a function that will be called '' and three values x y z. It will then test the associativity of '' using FsCheck, a F# port of QuickCheck.

 

FsCheck test the given property for a wide range of values. We'll just have to provide the operator, and FsCheck will check it the associativity holds for a lot of values.

 

For the neutral element, will use the following test function:

1: 
2: 
let isNeutralElement (*) neutral x =
    x * neutral = neutral * x

 

Here we'll have to provide the the operator - called '*' inside the function - and the neutral element.

Trivial cases

There are some obvious trivial cases.

 

Let's try to follow the number of occurrences of a specific event in state. The number of time a user did

something.

 

The map function is simply:

1: 
2: 
3: 
let map = function
    | TheUserDidSomething -> 1
    | _ -> 0

 

The combine operation is then really simple:

1: 
let combine acc v = acc + v

and the neutral element is obviously 0.

 

No need to check it with FsCheck, (N, +, 0) is a monoid..

 

Another a bit less obvious is when an event sets a value while others don't.

 

For instance, let's keep track of the last user address when a user moves.

 

For combination, we'll use the 'right' function, which always takes it's rightmost argument:

1: 
let right x y = y

 

The adress should, of course, not change on other events, and for that, we'll use an option:

1: 
2: 
3: 
let map' = function
    | UserMoved (user, newAddress) -> Some newAddress
    | _ -> None

 

The right function can then be tweaked to take the right argument only when it has a value:

1: 
2: 
3: 
4: 
let right' x y =
    match x,y with
    | x, None -> x
    | _, y -> y

right' has a signature of 'a option -> 'a option -> 'a option, so it's closed on operation.

 

It's associative since, whatever x y z, (x right' y right' z) return the last defined term, however composed.

 

None is the neutral element. Added to the left, it keeps what's on the right, added to the right, it keeps what's on the left.

 

We can check it with FsCheck:

1: 
2: 
Check.Quick (isNeutralElement right' None)
Check.Quick (isAssociative right')

A less trivial case

But what about mixing both ? Some events change the value by increments, while some other set a specific value ? Like a stopwatch that increments with a button to reset it.

 

Can we model this kind of thing as a monoid ?

 

We have an increment or a reset value, let's model it as a discriminated union:

1: 
2: 
3: 
type ChangeReset<'T> =
    | Change of 'T
    | Reset of 'T

 

 

A map function, that map events to state change, would be something like:

1: 
2: 
3: 
4: 
let map'' = function
    | TimePassed duration -> Change duration
    | ButtonPushed -> Reset 0.
    | _ -> Change 0.

 

The first two cases are a direct mapping, for other events, we use the Change 0. values that actually use the neutral element of the underlying monoid. Adding 0 will not change the value.

 

We have here an underlying monoid, here, talking about duration, we use numbers, with addition and 0.

 

But imagine we want to add items to a list, but sometime, reset the list to a specific one like the empty list.

 

we can define a high order combine operation like this:

1: 
2: 
3: 
4: 
5: 
let combine' (*) x y =
    match x,y with
    | Change xi, Change yi -> Change (xi * yi)
    | Reset xv, Change yi -> Reset (xv * yi)
    | _ , Reset yv -> Reset yv

 

It combines to changes as a new change using the underlying monoid operation - called '*' here. It combines changes as a change.

 

The second line states, that a value (Reset) combined with a change will apply the change to the value.

 

But the third line says that when a Reset value is on the right of anything, it overrides what's on it's left.

This operation is by it's signature closed on the ChangeReset<'t> type.

 

It's associative, because while combining changes, it has the associativity of the underlying monoid, and when combining Reset values it has the associativity of the right operation.

 

The neutral element is a change with the value of the neutral element of the underlying monoid.

 

We can verify it with FsCheck:

1: 
2: 
Check.Quick (isNeutralElement (combine' (+)) (Change 0))
Check.Quick (isAssociative (combine' (+)))

General structure

I had some remarks that we needed it to be a group and not only a monoid. The right' and combine function clearly don't define a group because elements don't have a inverse/opposite.

 

What would be the opposite of Reset 5 ? It set the value 5 whatever is on its left.

 

The idea is to take the set of state values S.

 

Then take the set of changes from states in S to other states in S that are represented by the events.

 

Make the union of it. SC = S U C. SC will contain more that valid aggregate states. It will also contain things that are not state, like a value that indicate that the state did not change.

 

But we will ensure the following things: the function that convert to changes will return items of C:

1: 
map: Event -> C 

 

The combine operation will be closed on SC, so it can accept States from S and change from C, and it will be closed on SC: combine:

SC -> SC -> SC

 

But it will also ensure that when a state is given on the left, it will also return a state:

combine:  S -> SC -> S

 

The right' operation ensure this. The state need a value at its start, you can add as many None on the right, it'll still have it's value, and any new value will return this value.

 

For the ChangeReset type, the State is represented by the Reset values -that are actual values- it's S, while changes are represented by the Change values that define C.

 

As long as a Reset value is given at the outmost left, the result will be a Reset value that is part of S.

With this, we don't need a group, but we can do with something that is only slightly more restraint than a monoid, only to keep the semantic of state.

But we need more that a single value !

Of course aggregate state is usually composed of more than a single value.

 

Let's start small and consider a pair.

1: 
type State<'a,'b> = 'a * 'b

 

If 'a and 'b are monoids we can easily combine them :

1: 
let combine'' (+) (*) (xa,xb) (ya,yb) = (xa + ya, xb * yb)

where + is the operation on 'a and * the operation on 'b.

 

The neutral element is simply

1: 
let neutral neutrala neutralb = (neutrala, neutralb)

You can easily check that combine is closed on operation, associative, and that neutral is the neutral element.

 

recursively, you can build a triple as a pair of a pair with a single element (('a,'b), 'c), then a quadruple and any tuple.

 

Since a structure - a record - is just a tuple with names, any struct with monoidal members is also a monoid.

 

And since in Event Sourcing, all the tricky decisions are done before emitting an event, applying the event should just be setting values, incrementing / decrementing values, adding / removing items to sets or lists, all those operation are monoidal, hence almost all aggregates should be convertible to monoidal Event Sourcing.

 

Of course it doesn't mean you should do Event Sourcing this way, but it was rather funny to explore this !

 

Have fun !

Friday, April 11, 2014

Monoidal Event Sourcing

I Could not sleep… 3am and this idea…

 

Event sourcing is about fold but there is no monoid around !

 

What’s a monoid ?

 

First, for those that didn’t had a chance to see Cyrille Martaire’s  fabulous explanation with beer glasses, or read the great series of post on F# for fun and profit, here is a quick recap:

 

We need a set.

Let’s take a simple one, positive integers.

 

And an operation, let say + which takes two positive integers.

 

It returns… a positive integer. The operation is said to be closed on the set.

 

Something interesting is that 3 + 8 + 2 = 13 = (3 + 8) + 2 = 3 + (8 + 2).

This is associativity: (x + y) + z = x + (y + z)

 

Le last interesting thing is 0, the neutral element:

x + 0 = 0 + x = x

 

(N,+,0) is a monoid.

 

Let say it again:

(S, *, ø) is a monoid if

  • * is closed on S  (* : S –> S > S)
  • * is associative ( (x * y) * z = x * (y * z) )
  • ø is the neutral element of * ( x * ø = ø * x = x )

warning: it doesn’t need to be commutative so x * y can be different from y * x !

 

Some famous monoids:

(int, +, 0)

(int, *, 1)

(lists, concat, empty list)

(strings, concat, empty string)

 

The link with Event Sourcing

 

Event sourcing is based on an application function apply : State –> Event –> State, which returns the new state based on previous state and an event.

 

Current state is then:

fold apply emptyState events

 

(for those using C# Linq, fold is the same as .Aggregate)

 

Which is great because higher level functions and all…

 

But fold is event more powerful with monoids ! For integers, fold is called sum, and the cool thing is that it’s associative !

 

With a monoid you can fold subsets, then combine them together after (still in the right order). This is what give the full power of map reduce: move code to the data. Combine in place, then combine results. As long as you have monoids, it works.

 

But apply will not be part of a monoid. It’s not closed on a set.

 

To make it closed on a set it should have the following signature:

 

apply: State –> State –> State, so we should maybe rename the function combine.

 

Let’s dream

 

Let’s imagine we have a combine operation closed on State.

 

Now, event sourcing goes from:

 

decide: Command –> State –> Event list

apply: State –> Event –> State

 

to:

decide: Command –> State –> Event list

convert: Event –> State

combine: State –> State –> State

 

the previous apply function is then just:

apply state event = combine state (convert event)

 

and fold distribution gives:

 

fold apply emptyState events = fold combine emptyState (map convert events) 

 

(where map applies the convert fonction to each item of the events list, as does .Select in Linq)

 

The great advantage here is that we map then fold which is another name for reduce !

 

Application of events can be done in parallel by chuncks and then combined !

 

Is it just a dream ?

 

Surely not.

 

Most of the tricky decisions have been taken in the decide function which didn’t change. The apply function usually just set state members to values present in the event, or increment/decrement values, or add items to a list… No business decision is taken in the apply function, and most of the primitive types in state members are already monoids under there usual operations…

 

And a group (tuple, list..) of monoid is also a monoid under a simple composition:

if (N1,*,n1) and (N2,¤,n2) are monoids then N1 * N2 is a monoid with an operator <*> ( (x1,x2) <*> (y1,y2) = (x1*y1, x2¤y2)) and a neutral element (n1,n2)…

 

To view it more easily, the convert fonction converts an event to a Delta, a difference of the State.

 

Those delta can then be aggregated/folded to make a bigger delta.

It can then be applied to a initial value to get the result !

 

 

The idea seams quite interesting and I never read anything about this.. If anyone knows prior study of the subject, I’m interested.

 

Next time we’ll see how to make monoids for some common patterns we can find in the apply function, to use them in the convert function.

Sunday, July 28, 2013

Event Sourcing vs Command Sourcing

Last month, I made a presentation about Event Sourcing - a shorter version of my DevoxxFr talk. After me, Etienne and Maher from Sfeir, did a presentation on the same subject and their architecture inspired by LMAX.

I immediately noticed their reference to this Event Sourcing page by Martin Fowler, and started to see several point of pain lurking in their long term maintenance…

I won’t make a flaming post against Martin Fowler that has written lots of interesting stuff. Even this article says nothing wrong. It just takes a way that can cause long term pains as expressed in the page itself…

Sourcing Events, but which Events ?

The article starts with a totally valid definition of Event Sourcing:

Capture all changes to an application state as a sequence of events.

The question then is… where do these events come from ?

In this case when the service is called, it finds the relevant ship and updates its location. The ship objects record the current known state of the ships.
Introducing Event Sourcing adds a step to this process. Now the service creates an event object to record the change and processes it to update the ship.

A C# sample a bit further to make it clearer:

class EventProcessor...
IList log = new ArrayList();
public void Process(DomainEvent e) {
e.Process();
log.Add(e);
}

As you can notice, the Event is produced before reaching the event processor…

Constrast this with the following version:

class Cargo
{
IList log = new List();
private State currentState;
public Cargo(IEnumberable events)
{
foreach(var @event in events)
Apply((dynamic) @event);
}
public void Arrive(Port port)
{
// logic to verify the action can be done
// based on current state and command parameters
if (IsAlreadyInPort) throw Exception();
// create an event of what happened with this action
// it should not mutate state,
// but it can capture external state when it arrives
// it's also based on current state and command parameters
var @event = new ShipArrived(id, port, DateTime.Now)
// apply change du to the event
// it should require only current state and
Apply(@event);
log.Add(@event);
// events will be published to the rest of the system
// from there.. This is where further side effect will
// occure
}
private void Apply(ShipArrived @event)
{
// no decision should happen here !
currentState.Port = @event.Port;
currentstate.LastMove = @event.Time;
}
}

From a functional point of view this pattern can be build from two pure functions:

Decide:
Command -> State -> Event list
ApplyStateChange:
State -> Event -> State

Here, stored event has been produced by the aggregate itself ! The output is stored.

Nice, but why should I care ?

After all, since Martin says the first version is ok… let’s go !

This would be without noticing several warnings in the rest of the article.

External Systems

from the same page:

One of the tricky elements to Event Sourcing is how to deal with external systems that don't follow this approach (and most don't). You get problems when you are sending modifier messages to external systems and when you are receiving queries from other systems.

Many of the advantages of Event Sourcing stem from the ability to replay events at will, but if these events cause update messages to be sent to external systems, then things will go wrong because those external systems don't know the difference between real processing and replays.

The second version doesn’t suffer this problem…

Because rebuilding the state (like done in the constructor) only use the Apply method (or the ApplyStateChange function in the functional version)..

This Apply method only works with internal state and produces no external side effects..

External Queries

Another problem arising with Martin Fowler’s proposal:

The primary problem with external queries is that the data that they return has an effect on the results on handling an event. If I ask for an exchange rate on December 5th and replay that event on December 20th, I will need the exchange rate on Dec 5 not the later one.

Here again, the second version doesn’t suffer the problem..

The data from the external system will be used to build the event. It can be directly stored in it (like the current time in the sample), but can also be used in a computation. For instance, command contains prices in USD, query an current rate from USD to EUR, compute a price in EUR and put it in the event.
The rate at the time of the computation is baked in the event ! No need to remember the rate value afterward, especially no need to complex external system gateway.

It could still be better for debugging purpose to put the used rate explicitly in the event.

But the second version intrinsically handles this issue gracefully…

External Interactions

Both queries and updates to external systems cause a lot of complication with Event Sourcing. You get the worst of both with interactions that involve both. Such an interaction might be a an external call that both returns a result (a query) but also causes a state change to the external system, such as submitting an order for delivery that return delivery information on that order. 

Problem solved by version 2….

Code Changes

So this discussion has made the assumption that the application processing the events stays the same. Clearly that's not going to be the case. Events handle changes to data, what about changes to code?
[…]
The third case is where the logic itself changes over time, a rule along the lines of "charge $10 before November 18 and $15 afterwords". This kind of stuff needs to actually go into the domain model itself. The domain model should be able to run events at any time with the correct rules for the event processing. You can do this with conditional logic, but this will get messy if you have much temporal logic. The better route is to hook strategy objects into a Temporal Property: something like chargingRules.get(aDate).process(anEvent). Take a look at Agreement Dispatcher for this kind of style.

Wooo… when I read this, it’s a red flag for me ! I never want to deal with this kind of problems !

Especially if they’re expected to happen for sure !

How does it go with the second version ?

Events are produced by the code that contains the logic. Before November 18, the events emitted where based on code that charge $10. After, the code charges $15.

When using the Apply method, it doesn’t have to know how much to charge, it’s already in saved events !

There is no need to keep an history of versions of domain logic - except in your source control !

It can even cop with changes far more complex that the one in this sample. In any case, all data needed to compute current state has been put in the event.

Correcting logic bugs

One of the advantages advanced by Martin Fowler, is that you can change how you take the decision after the fact.

But if an event is an event, it already happened, and there’s no way we can go back in time to change it. We wont be able to change external side effects anyway, so just accept it.

It’s still possible to apply compensations… like accountants. When they charged you to much, they don’t use a time machine to make has if nothing happened.. the just add a chargeback entry at the end of the ledger.

Command Sourcing ?

I call the pattern proposed by Martin Fowler Command Sourcing.

A Command is a request made to the system to do something. At this point a lot of thing can still happen. It can fail, it can be influenced by external state..

An event is something that happen and that cannot be changed.

You can protest that an Arrival Event is an event, not a command.

Sure, but for the system it’s an upstream event. Not something that happened in the system.

Where’s the difference in the second verions ?

The upstream version will go to a reactor that will produce an Arrive command (notice the present tense) inside the system.
The system will then produce a ShipArrived event (notice the passed tense). This event has been emitted by the system itself.

The Martin Fowler’s version takes a shortcut: bypassing the reactor emitting a command, but it is equivalent to sourcing commands.

Martin, this is a call to action !

Since a lot of people have read, and will read the entry on your web site, can you add something about the pattern described here to reduce the pain of people that will want to start with Event Sourcing ?