Uniqueness validation in CQRS Architecture
By Jérémie Chassaing on Wednesday, October 28, 2009, 15:05 - Domain Driven Design - Permalink
This is a short follow-up on Bjarte’s post.
There’s an important question to ask whenever you need set validation: why?
Why do these things need to be considered together, and why can’t they just be handled separately?
We can distinguish two parameters in uniqueness: Cardinality and Scope.
Cardinality
There are mainly two types of cardinality :
1 Cardinality
Only one employee can be the boss.
The model could provide an IsBoss property on every employee… But consistency would be very hard to achieve, especially in a CQRS architecture.
We should read the preceding rule as :
The company has only one boss. The boss is an employee.
Now, we can model a Boss property on the Company Aggregate Root that will reference the employee that is the boss. Changing the boss can now be an atomic and consistent operation.
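A minimal sketch of this modelling choice, written in Python for brevity. The names used here (Company, Employee, hire, appoint_boss) are hypothetical and do not come from the original post. The only point is that the boss is a single reference held by the aggregate root, so changing it is one assignment and the company can never end up with two bosses.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass(frozen=True)
class Employee:
    employee_id: str
    name: str


@dataclass
class Company:
    """Aggregate root: owns the employees and the single boss reference."""
    employees: Dict[str, Employee] = field(default_factory=dict)
    boss_id: Optional[str] = None  # at most one boss, by construction

    def hire(self, employee: Employee) -> None:
        self.employees[employee.employee_id] = employee

    def appoint_boss(self, employee_id: str) -> None:
        # "The boss is an employee": reject ids the company does not know.
        if employee_id not in self.employees:
            raise ValueError(f"unknown employee: {employee_id}")
        # One assignment replaces the previous boss; there is no
        # per-employee IsBoss flag to reset on anyone else.
        self.boss_id = employee_id

    @property
    def boss(self) -> Optional[Employee]:
        return self.employees.get(self.boss_id) if self.boss_id else None
```

Because the rule lives inside one aggregate, a single command against Company is enough to change the boss; nothing has to be coordinated across several Employee instances.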
We can see that we had to introduce an upper level to manage it (we’ll see it in the Scope section).
n Cardinality
Employee should have different user names.
We can clearly see here that user names must be different because they’ll act as identifiers. This is the goal of almost any uniqueness constraint. The property will be used as a key in a lookup.
The 1 (or 2 or 3) cardinality also acts this way. It’s a way to tag an entity. You can ask “who is the boss?” and get the answer by a simple lookup of the Boss property, which acts like a bucket in a hash table.
Scope
There is no such thing as global scope
Even when we say, “Employee should have different user names”, there is an implicit scope, the Company.
Even when we say, “Your ID card number should be unique”, understand, “at the Country scope”.
Even when we say, “Your DNA should be unique”, understand, “at the scope of life as we understand it”.
Find the scope and see the volume of data whose uniqueness should be enforced.
As we said, properties that have a uniqueness constraint are usually used as lookup values to find those entities. As such they rarely take part in the child entity domain logic.
Instead of having a UserName property on the Employee entity, why not have a UserNames key/value collection on the Company that will give the Employee for a given user name ?
If the Employee count is expected to stay within a limited range, this is the most appropriate solution.
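As a rough illustration of the in-memory variant, here is a sketch in Python. The names (Company, register_employee, employee_for, DuplicateUserName) are hypothetical, and the case-insensitive comparison is an assumption of this example, not something the post prescribes. The user names live in a dictionary on the Company, so uniqueness follows from the dictionary's keys.

```python
from typing import Dict


class DuplicateUserName(Exception):
    """Raised when a user name is already taken within the company."""


class Company:
    def __init__(self) -> None:
        # user name -> employee id; dictionary keys are unique by nature
        self._user_names: Dict[str, str] = {}

    def register_employee(self, user_name: str, employee_id: str) -> None:
        # Assumption: "JDoe" and "jdoe" count as the same user name.
        key = user_name.casefold()
        if key in self._user_names:
            raise DuplicateUserName(user_name)
        self._user_names[key] = employee_id

    def employee_for(self, user_name: str) -> str:
        # The user name is only a lookup key; it is not Employee state.
        return self._user_names[user_name.casefold()]
```

The check and the insertion happen inside the same aggregate, so they are consistent as long as the Company is loaded and saved as a unit.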
If the number can grow, loading the whole collection into memory on each Company hydration is a bit heavy, so keep the directory on disk (using a table with a unique key in an RDBMS, as suggested by Bjarte) or use any other mechanism that provides the expected performance.
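One possible shape for the on-disk directory, sketched with Python's built-in sqlite3 module standing in for the RDBMS. The table name, column names and helper functions are assumptions made for this example. The composite primary key (company_id, user_name) expresses the scope discussed above: a user name is unique per Company, not globally.

```python
import sqlite3
from typing import Optional


def open_directory(path: str = ":memory:") -> sqlite3.Connection:
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS user_names ("
        " company_id TEXT NOT NULL,"
        " user_name TEXT NOT NULL,"
        " employee_id TEXT NOT NULL,"
        " PRIMARY KEY (company_id, user_name))"  # the unique key
    )
    return conn


def reserve_user_name(conn: sqlite3.Connection, company_id: str,
                      user_name: str, employee_id: str) -> bool:
    """Return True if the name was reserved, False if it was already taken."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO user_names VALUES (?, ?, ?)",
                (company_id, user_name, employee_id),
            )
        return True
    except sqlite3.IntegrityError:
        # The store itself rejected the duplicate key.
        return False


def employee_for(conn: sqlite3.Connection, company_id: str,
                 user_name: str) -> Optional[str]:
    row = conn.execute(
        "SELECT employee_id FROM user_names"
        " WHERE company_id = ? AND user_name = ?",
        (company_id, user_name),
    ).fetchone()
    return row[0] if row else None
```

Here the store enforces the constraint, so two concurrent registrations of the same name cannot both succeed; the caller only has to decide what to do when the reservation is refused.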
Conclusion
In every case, when a uniqueness constraint appears on a property, the property does not belong to the entity itself but should be viewed as a key to access the entity from the upper-level scope.
Do you have examples that cannot be solved this way ?
Comments
Great post :)
It's great to have this kind of blog post out there for everyone. I know that the scope issue was one that gave me some grief when I first started down the road to CQRS. Greg often has to repeat himself time and again on the DDD Yahoo Group, so having a blog post may help people's Google-fu to answer this question themselves.
Hi Jeremy,
I remember talking about this with you. Crawling over the web about CQRS, trying to understand the concepts more deeply, I re-read this post. I also read your discussion in the DDD Yahoo group about "A possible solution for uniqueness validation in CQS". (http://tech.groups.yahoo.com/group/...)
I agree with most of your arguments, but one leaves me sceptical: how do you implement "...keep the directory on disk..." (just before your conclusion) or "...May be one of these new key-value stores could help with that...." (as bodrin writes on the DDD list)? Particularly when using Event Sourcing? Is it several indexes managed by your AR repository? Does it mean that the CommandHandler calls your Repository, or does your Company have access to the Repository, or something else...? Could you be more precise on this point, please?
Note: I see a solution along the lines of Udi's comment on the DDD list: rely on the constraint violation management of the store (event or state, in fact). It seems really simple using an RDBMS, but I am wondering whether you were using something that fakes the same constraint violation management in a custom event store...
Clément
Hi Jérémie,
I've really enjoyed reading this series of posts, and this one was no exception! I particularly like the idea that there is no such thing as global scope, and the company=>boss example is perfect to demonstrate this.
This post really got me thinking.
"Instead of having a UserName property on the Employee entity, why not have a UserNames key/value collection on the Company that will give the Employee for a given user name ?"
If I've understood Udi's posts on CQRS, I think he'd probably advocate the collection of Usernames being part of the Query-side, rather than the Command side. I've heard him mention before that the query side is often used to facilitate the process of choosing a unique username - the query store checks the username as the user is filling in the "new user" form, identifying that a username already exists and suggesting alternatives.
Of course this approach isn't bullet-proof, and it will still remain the responsibility of another component to handle the enforcing of the constraint.
The choice of WHERE to put this logic is a question that is commonly debated.
Some argue that since uniqueness of usernames is required for technical reasons (identifying a specific user) rather than for business reasons, this logic falls outside of the domain to handle.
Others may argue that this logic should fall in the domain - perhaps under a bounded context of managing user accounts.
In either case, since we have a technical problem (concurrency conflicts) and several possible solutions, the decision of whether or not they are suitable will probably be constrained by the expected frequency of the problem occurring. This sounds to me like the kind of thing that would appear in an SLA.
I guess then that the solution chosen to enforce the uniqueness constraint will depend on the agreed SLA. Perhaps it is acceptable that a command may fail hard (due to the RDBMS failing) in the few cases of concurrency conflicts - it might only be 0.0001% of cases.
Alternatively we may decide that it is unacceptable to allow this to occur because of how frequently it would happen. We could choose to maintain the list of usernames in the Company aggregate, but scale out our system such that all "new user" requests in the username range A-D are handled by a specific server. If we decide to enforce this constraint outside of our domain, we can offload this work to the command handler.
What do you think?
Craig
You should check the likelihood of conflicts (you can reduce them by validating uniqueness on the client/read side before sending the command) and the cost of conflicts.
If you can structurally rule out conflicts (like the company => boss case, or when uniqueness is inside an aggregate), do it!
In other cases, uniqueness should not affect the domain, and there's a good reason for it:
your uniqueness constraint crosses aggregate boundaries. If your boundaries are well defined, a temporary breach of uniqueness should be OK, since consistency cannot be guaranteed across boundaries.
If not... move your boundaries.
Cool stuff, man. I like this "Instead of having a UserName property on the Employee entity, why not have a UserNames key/value collection on the Company that will give the Employee for a given user name ?"
Thanks for the article. I've been looking to get more into CQRS since discovering CouchDB and Rinat's (http://abdullin.com/cqrs/) write-up on it. I've implemented a DDD repository pattern that lets you inject either a Hibernate or Couch impl, and next I'm going to be writing some Commands against it.