Hey guys, I am writing this blog post for beginners like me who want to understand CRDTs or distributed systems much.
Quoting Wikipedia – “In distributed computing, a conflict-free replicated data type (CRDT) is a data structure which can be replicated across multiple computers in a network, where the replicas can be updated independently and concurrently without coordination between the replicas, and where it is always mathematically possible to resolve inconsistencies which might result.”
I know it might be difficult to grasp it all. I will try to explain it intuitively.
Before we move ahead, Let’s pause for a second to understand one of the key concept of distributed systems and that it CAP theorem.
CAP theorem states that a distributed database system can only have 2 of the 3: Consistency, Availability and Partition Tolerance.
Consistency basically means that all nodes see the same data at the same time. High Availability is a condition which states that every request gets a response of success/failure. Achieving availability in a distributed system requires that the system remains operational 100% of the time. Now, Partition Tolerance condition states that the system continues to run, despite the number of messages being delayed by the network between nodes. Partition Tolerance is a necessity. Hence, the trade-off is between Consistency and Availability.
Great! Now let’s dive a bit deeper and understand how consistency, availability and fault tolerance plays role in distributed system design.
Imagine that you have a database which spans across continents. We aim to make it fault tolerant and consistent. No matter how many times the connections drop between nodes, node restarts or nodes completely goes down, we want data to be saved and consistent. This is a difficult problem right? Yes indeed.
You know there are many flavours of consistency which you can choose for distributed systems like strong consistency, eventual consistency, strong eventual consistency, optimistic consistency etc.
In case you are wondering CRDTs guarantees you strong eventual consistency.
Now coming back to the design, We have to make sure that whatever is being written in the database is replicated across. Now, I can imagine two ways to do replication, i.e Synchronous and Asynchronous.
Synchronous replications is easy to implement right? Take the write and make sure that it is written across the node(via acknowledgements) and there is no conflicts and only then return the response. Unfortunately, this is not scalable.
we are left with Async replications now. This comes with its own nuances. Couple of problems to think of is Order Replication and Data duplication.
Let me explain both the problems.
Order Independence – Suppose user did couple of operations in order R1 and R2 at node 1. however node 2 received the orders of operation as R2 and then R1. This can happen in a situation when R1 was sent the network dropped off and when network came back R2 was received first then R1.
Data duplication – Say we solved the above problem by introducing acknowledgement. Now, node 2 received R2 but during acknowledgement time connection dropped. Then node 1 will send R2 to node 2 again and this will cause node 2 to do the operation R2 twice.
Now how do we solve the data replication problem? Consensus Algos (Conflict Resolution : diverge → Rollback → Converge)?
Yes, Algorithms like Paxos and Raft can be used but they are hard to implement. Even if we implement it, we will have to bear huge computational cost. Again not Scalable!
Don’t lose hope! CRDTs are there to rescue you!
CRDTs are objects that can be updated without expensive synchronization/consensus and they are guaranteed to converge eventually if all concurrent updates are commutative and if all updates are executed by each replica eventually.
CRDTs have two important properties – commutative and idempotency.
Commutative property introduces order independence in the system while idempotency guarantees that data is not redelivered or duplicated.
Marc Shapiro in his paper talks about two models of replication – state-based and operation-based. Based on the approach two types or CRDTs are defined – CvRDT (convergent replicated data type) and CmRDT ( commutative replicated data type).
State-based replication – it first updates its local state, and then some time later it sends its full state to another replica. So occasionally every replica is sending its full state to some other replica in the system. CvRDTs satisfy this property.
Operation-based replication – In this approach a replica doesn’t send its full state to another replica, which can be huge. Instead it just sends/broadcasts the update operation to all the other replicas in the system and expects them to replay that update. CmRDTs satisfy this property.
Although, CRDTs are addressing an interesting and a fundamental problem in distributed systems, but they too have limitations.
Since, CRDTs don’t use consensus by design they only solve updates which are commutative and not all updates can be commutative!
However CRDTs are being used by a lot of tools like soundcloud, Redis, Riak, online gaming etc.
References and Further Reading –
Above links are good to develop deeper understanding of CRDTs. I will be writing more articles on CRDTs, mostly paper reviews. So subscribe/follow the blog.
Happy coding! 🙂