Event Sourcing and Concurrent Updates

When we have to deal with concurrent updates, we generally think about two possibilities:

  • Pessimistic locking: an object is explicitly locked by the system before it is modified
  • Optimistic concurrency control: the system determines whether another client has changed the object before updating it. There are several possible implementations, such as using a compare-and-swap (CAS) operation or providing an expected version of the object.
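The expected-version flavor of optimistic concurrency control can be sketched as follows. This is a minimal, illustrative implementation: the `EventStore` class and its `append` signature are assumptions for the example, not a real library API, and the version is simply the number of events already persisted in the stream.

```python
class ConcurrencyError(Exception):
    """Raised when the expected version does not match the current one."""


class EventStore:
    def __init__(self):
        self._streams = {}  # stream_id -> list of persisted events

    def append(self, stream_id, event, expected_version):
        # The current version is the number of events already in the stream.
        stream = self._streams.setdefault(stream_id, [])
        current_version = len(stream)
        if current_version != expected_version:
            # Another client changed the object since we read it.
            raise ConcurrencyError(
                f"expected v{expected_version}, stream is at v{current_version}"
            )
        stream.append(event)
        return current_version + 1
```

A client that read the object at v1 and tries to append with `expected_version=1` succeeds only if no other write slipped in between the read and the write.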

The higher the chances of a concurrent update, the more it makes sense to favor a pessimistic locking approach. Indeed, if an optimistic transaction fails, retrying it may be expensive (a round trip back to the client, a delay, a new attempt, etc.).

In addition, an Event Sourcing strategy can be a good fit for highly concurrent environments.

A common strategy with Event Sourcing is to build something on top of optimistic concurrency control. For example, a client requests a change on a given object, providing an expected version. If the request fails because the current version is greater than the expected one, instead of naively throwing an error, the system can:

  • First, retrieve the event(s) persisted since the expected version
  • Then check whether these event(s) are really conflicting from a business perspective
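The two steps above can be sketched as a conflict-aware append. This is a hedged, self-contained example: `append_with_conflict_check` and the `is_conflicting` predicate are hypothetical names, and the version is again modeled as the stream length.

```python
class ConcurrencyError(Exception):
    """Raised when a persisted event is a genuine business conflict."""


def append_with_conflict_check(streams, stream_id, event, expected_version,
                               is_conflicting):
    stream = streams.setdefault(stream_id, [])
    if len(stream) != expected_version:
        # Step 1: retrieve the events persisted since the expected version.
        missed = stream[expected_version:]
        # Step 2: check whether any of them conflicts from a business
        # perspective; if none does, commit the transaction anyway.
        if any(is_conflicting(event, persisted) for persisted in missed):
            raise ConcurrencyError(f"{event} conflicts with {missed}")
    stream.append(event)
    return len(stream)
```

With a predicate that never reports a conflict, a stale write still commits; with one that always does, the stale write is rejected exactly as plain optimistic concurrency would reject it.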

Let’s take the following example:

In the diagram, the events already persisted (CustomerCreated and FamilySituationUpdated) are shown in black. At this point, the current version of the customer is v2.

Let’s now imagine the system receives two concurrent updates with the same expected version (v3). The first event, ContactInfoUpdated, will be persisted because the version matches, but what about the NewContractSubscribed one?

As described above, the system can retrieve all the events persisted since v3 (here, only ContactInfoUpdated) and determine whether they actually conflict. In this example, we can consider that subscribing to a new contract does not conflict with updating contact information, so the system will commit the transaction.

There are several strategies to determine whether an event conflicts with a set of already persisted events.

We can, for example, implement a blacklist strategy: each event type registers a set of conflicting event types. If any of these event types is found among the persisted events, the transaction is not committed.
Alternatively, we can use a custom conflict-checker function. Bear in mind, though, that one of the benefits of this strategy is being faster than simple optimistic concurrency control. If the function takes too long to execute (and the chances of concurrent updates are high), it will hurt the system's throughput.
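The blacklist strategy could look like the sketch below. The conflict table is purely illustrative: the event type names reuse those from the example above, plus a hypothetical ContractTerminated, and any pairing not listed is treated as non-conflicting.

```python
# Hypothetical blacklist: each incoming event type lists the persisted
# event types it conflicts with from a business perspective.
CONFLICTS = {
    "ContactInfoUpdated": {"ContactInfoUpdated"},
    "NewContractSubscribed": {"NewContractSubscribed", "ContractTerminated"},
}


def is_conflicting(incoming_type, persisted_type):
    # An unregistered pairing is treated as non-conflicting, so the
    # transaction is committed despite the version mismatch.
    return persisted_type in CONFLICTS.get(incoming_type, set())
```

A simple set lookup like this keeps the conflict check cheap, which matters given the throughput concern mentioned above.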

Another possible strategy to deal with concurrent updates is to vary the data model granularity. For example, instead of having a single event stream per customer, we could split its representation into several streams (e.g., GeneralInformation and ContractInformation). We then apply optimistic concurrency control to each stream independently (GeneralInformation has its own version and ContractInformation has its own).
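A minimal sketch of this split, assuming one version counter per stream (the stream naming scheme is an illustrative convention, not a prescribed one):

```python
class ConcurrencyError(Exception):
    pass


# Each fine-grained stream carries its own version counter
# (modeled here as the stream length).
streams = {
    "customer-42/GeneralInformation": [],
    "customer-42/ContractInformation": [],
}


def append(stream_id, event, expected_version):
    stream = streams[stream_id]
    if len(stream) != expected_version:
        raise ConcurrencyError(stream_id)
    stream.append(event)
    return len(stream)


# Two concurrent updates touching different streams no longer contend:
v_general = append("customer-42/GeneralInformation", "ContactInfoUpdated", 0)
v_contract = append("customer-42/ContractInformation",
                    "NewContractSubscribed", 0)
```

Both writes succeed with their own expected version, whereas a single customer-wide stream would have rejected the second one.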

Yet this solution has limitations. We must guarantee that conflict resolution is stream-autonomous, meaning the events stored in one stream are enough to guarantee the integrity of the whole object.

Conversely, a coarse-grained representation (e.g., one stream for a whole customer) may trigger the conflict resolution mechanism too often (and, once again, hurt the system's throughput). The granularity decision therefore remains an important one, regardless of our strategy.