Event Sourcing and CQRS, Serialization

2009-11-05T13:00:38 / jeremie chassaing

Be sure to read the three preceding parts of the series:

Event Sourcing and CQRS, Now !
Event Sourcing and CQRS, Let’s use it
Event Sourcing and CQRS; Dispatch-options

Today, we’ll study to a required part of the event storage : Serialization/Deserialization

The easy way

The .Net framework as several serialization technologies that can be used here, Binary serialization, XML serialization or even DataContract serialization introduced with WCF.

The penalty

The particularity of Event Sourcing is that we will never delete or update stored events. They’ll be logged, insert only, once and forever.

So the log grows. grows. grows.

Event storage size will influence greatly the growth rate of the log.

Xml Serialization

If your system processes frequently lots of events, forget about XML. Far to verbose, you’ll pay the Angle Bracket Tax.

Binary Serialization

But the binary serialization still cost much, even if compact, it will contain type names and field names…

Raw Serialization

You could write serialization/deserialization code into your type.

The type can chose a format, so no extra type/field name is needed. This kind of serialization is very compact – it contains only required bits – but you cannot read data back without the deserialization code.

It can be ok if you plan to have a definite small number of well documented events. Unmanageable if your event type count will grow with time and versions.

Avoid it

Let’s consider how data are stored in a database.

A database contains tables. Tables have a schema. When storing a row, no need to repeat column names on each cell. The data layout is defined by the table schema and will be the same on each row.

We cannot do the same since events have different schemas, but we work with a limited set of events that will occur many times.

Split schema and data

We can thus store schemas aside, and specify the row data schema on each row. The event data will the be stored as raw bits corresponding to specified schema.

This way you can design tools to explore your log file with complete event representation without needing the original event class, and you got a very compact serialization. Have your cake and eat it too !

Stay tuned, the code comes tomorrow…

// thinkbeforecoding